The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventors, assignee, attorney firm, primary examiner, assistant examiner, CPC classifications, and abstract. The badge also links to the full patent document in PDF (Adobe Acrobat) format.

Patent No.:

US 11862146 B1

Date of Patent:

Jan. 02, 2024

Filed:

Jul. 02, 2020

Multistream acoustic models with dilations

Applicant:

Asapp, Inc., New York, NY (US);

Inventors:

Kyu Jeong Han, Pleasanton, CA (US);

Tao Ma, Mountain View, CA (US);

Daniel Povey, Redmond, WA (US);

Assignee:

ASAPP, INC., New York, NY (US);

Attorney:

GTC Law Group PC & Affiliates

Primary Examiner:

Edwin S Leland, III

Int. Cl.

CPC ...

G10L 25/24 (2013.01); G06N 3/045 (2023.01); G10L 15/16 (2006.01); G10L 15/22 (2006.01); G10L 15/06 (2013.01); G06N 3/08 (2023.01); G06N 3/048 (2023.01);

U.S. Cl.

CPC ...

G10L 15/16 (2013.01); G06N 3/045 (2023.01); G10L 25/24 (2013.01); G06N 3/048 (2023.01); G06N 3/08 (2013.01); G10L 15/063 (2013.01); G10L 2015/223 (2013.01);

Abstract

Audio signals of speech may be processed using an acoustic model. An acoustic model may be implemented with multiple streams of processing where different streams perform processing using different dilation rates. For example, a first stream may process features of the audio signal with one or more convolutional neural network layers having a first dilation rate, and a second stream may process features of the audio signal with one or more convolutional neural network layers having a second dilation rate. Each stream may compute a stream vector, and the stream vectors may be combined to a vector of speech unit scores, where the vector of speech unit scores provides information about the acoustic content of the audio signal. The vector of speech unit scores may be used for any appropriate application of speech, such as automatic speech recognition.
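
The multistream idea in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the layer count, kernel size, ReLU activation, concatenation of stream vectors, final linear projection, and all names (`dilated_conv1d`, `multistream_scores`, `random_kernel`) are assumptions chosen for illustration, and the weights are random rather than trained. The sketch only shows two streams processing the same features with different dilation rates and their per-frame outputs being combined into a vector of speech unit scores.

```python
import random


def dilated_conv1d(x, w, dilation):
    """1-D dilated convolution followed by ReLU (illustrative only).

    x: list of T frames, each a list of c_in floats.
    w: kernel indexed as w[tap][c_in][c_out], with an odd number of taps.
    Frames outside the signal are treated as zeros, so the output has T frames.
    """
    T, K = len(x), len(w)
    c_in, c_out = len(w[0]), len(w[0][0])
    assert K % 2 == 1, "an odd kernel size keeps the padding symmetric"
    half = K // 2
    y = []
    for t in range(T):
        frame = [0.0] * c_out
        for k in range(K):
            src = t + (k - half) * dilation  # taps are `dilation` frames apart
            if 0 <= src < T:
                for i in range(c_in):
                    for o in range(c_out):
                        frame[o] += x[src][i] * w[k][i][o]
        y.append([max(v, 0.0) for v in frame])  # ReLU
    return y


def multistream_scores(features, streams, w_out):
    """Run each stream at its own dilation rate, then combine per frame.

    streams: list of (layers, dilation) pairs; layers is a list of kernels.
    w_out:   matrix w_out[combined_dim][n_units] (assumed linear combiner).
    """
    outputs = []
    for layers, dilation in streams:
        h = features
        for w in layers:
            h = dilated_conv1d(h, w, dilation)
        outputs.append(h)  # one stream vector per frame

    n_units = len(w_out[0])
    scores = []
    for t in range(len(features)):
        # Assumption: stream vectors are combined by concatenation.
        combined = [v for h in outputs for v in h[t]]
        scores.append(
            [sum(c * w_out[j][u] for j, c in enumerate(combined))
             for u in range(n_units)]
        )
    return scores


def random_kernel(rng, taps, c_in, c_out):
    """Hypothetical helper: small random weights standing in for trained ones."""
    return [[[rng.gauss(0.0, 0.1) for _ in range(c_out)]
             for _ in range(c_in)] for _ in range(taps)]


rng = random.Random(0)
T, n_feats, hidden, n_units = 20, 8, 4, 5
features = [[rng.gauss(0.0, 1.0) for _ in range(n_feats)] for _ in range(T)]
streams = [
    ([random_kernel(rng, 3, n_feats, hidden),
      random_kernel(rng, 3, hidden, hidden)], 1),  # first dilation rate
    ([random_kernel(rng, 3, n_feats, hidden),
      random_kernel(rng, 3, hidden, hidden)], 3),  # second dilation rate
]
w_out = [[rng.gauss(0.0, 0.1) for _ in range(n_units)]
         for _ in range(hidden * len(streams))]

scores = multistream_scores(features, streams, w_out)
print(len(scores), len(scores[0]))  # 20 5
```

With a kernel of three taps, a stream with dilation rate 1 sees frames t-1, t, t+1, while a stream with dilation rate 3 sees frames t-3, t, t+3, so each stream covers a different temporal span of the same input. In a speech recognition setting, the per-frame score vectors would typically be passed to a decoder; that step is outside this sketch.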