The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventors, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Mar. 05, 2024

Filed:
Mar. 31, 2023

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Rohit Prakash Prabhavalkar, Palo Alto, CA (US);

Tara N. Sainath, Jersey City, NJ (US);

Yonghui Wu, Fremont, CA (US);

Patrick An Phu Nguyen, Mountain View, CA (US);

Zhifeng Chen, Sunnyvale, CA (US);

Chung-Cheng Chiu, Sunnyvale, CA (US);

Anjuli Patricia Kannan, Berkeley, CA (US);

Assignee:

Google LLC, Mountain View, CA (US);

Attorneys:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G10L 15/197 (2013.01); G10L 15/02 (2006.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G10L 15/22 (2006.01);
U.S. Cl.
CPC ...
G10L 15/197 (2013.01); G10L 15/02 (2013.01); G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G10L 2015/025 (2013.01);
Abstract

Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses a set of speech recognition hypothesis samples, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.
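The abstract mentions a loss function that uses a set of speech recognition hypothesis samples but does not say which loss that is. As an illustration only, one loss of that general shape is the expected number of word errors over the sampled hypotheses, with the model's probabilities renormalized over the sample set. The sketch below assumes that form. The function names `edit_distance` and `expected_error_loss`, and the example data, are hypothetical and do not come from the patent.

```python
import math


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def expected_error_loss(log_probs, hypotheses, reference):
    """Expected word errors over a set of hypothesis samples.

    log_probs:  model log-probability of each hypothesis (assumed input).
    hypotheses: list of token lists sampled from the model.
    reference:  ground-truth token list.
    """
    # Renormalize the probabilities over the sample set (stable softmax).
    m = max(log_probs)
    weights = [math.exp(lp - m) for lp in log_probs]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Weight each hypothesis's word error count by its renormalized probability.
    errors = [edit_distance(reference, h) for h in hypotheses]
    return sum(p * e for p, e in zip(probs, errors))


reference = ["the", "cat", "sat"]
hypotheses = [
    ["the", "cat", "sat"],        # 0 errors
    ["the", "cat", "sad"],        # 1 substitution
    ["a", "cat", "sat", "down"],  # 1 substitution + 1 insertion
]
log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]
loss = expected_error_loss(log_probs, hypotheses, reference)
```

In a real training setup the hypothesis probabilities would come from the encoder, attention module, and decoder, and the loss would be differentiated with respect to the model parameters. This sketch only computes the scalar value.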

