The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
May 17, 2022

Filed:

Dec. 17, 2019

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Wei Han, Mountain View, CA (US);

Chung-Cheng Chiu, Sunnyvale, CA (US);

Yu Zhang, Mountain View, CA (US);

Yonghui Wu, Fremont, CA (US);

Patrick Nguyen, Mountain View, CA (US);

Sergey Kishchenko, Mountain View, CA (US);

Assignee:

Google LLC, Mountain View, CA (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G10L 15/00 (2013.01); G10L 15/16 (2006.01); G10L 15/04 (2013.01); G10L 15/06 (2013.01); G10L 15/22 (2006.01); G10L 15/187 (2013.01); G10L 15/26 (2006.01); G10L 15/02 (2006.01);
U.S. Cl.
CPC ...
G10L 15/16 (2013.01); G10L 15/04 (2013.01); G10L 15/063 (2013.01); G10L 15/22 (2013.01); G10L 15/02 (2013.01); G10L 15/187 (2013.01); G10L 15/26 (2013.01);
Abstract

A method includes obtaining audio data for a long-form utterance and segmenting the audio data for the long-form utterance into a plurality of overlapping segments. The method also includes, for each overlapping segment of the plurality of overlapping segments: providing features indicative of acoustic characteristics of the long-form utterance represented by the corresponding overlapping segment as input to an encoder neural network; processing an output of the encoder neural network using an attender neural network to generate a context vector; and generating word elements using the context vector and a decoder neural network. The method also includes generating a transcription for the long-form utterance by merging the word elements from the plurality of overlapping segments and providing the transcription as an output of the automated speech recognition system.
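The segment-and-merge flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how segments are sized or how word elements are merged, so the fixed-length windows and the suffix/prefix matching used here are assumptions, and the function names `segment_with_overlap` and `merge_word_elements` are hypothetical. The encoder, attender, and decoder neural networks are not modeled; the per-segment word lists stand in for their output.

```python
from typing import List, Sequence, Tuple


def segment_with_overlap(
    num_samples: int, segment_len: int, overlap: int
) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges; consecutive ranges share `overlap` samples.

    Assumption: fixed-length windows. The abstract only says "overlapping segments".
    """
    if not 0 <= overlap < segment_len:
        raise ValueError("overlap must be non-negative and smaller than segment_len")
    step = segment_len - overlap
    bounds: List[Tuple[int, int]] = []
    start = 0
    while True:
        end = min(start + segment_len, num_samples)
        bounds.append((start, end))
        if end >= num_samples:
            break
        start += step
    return bounds


def merge_word_elements(hypotheses: Sequence[Sequence[str]]) -> List[str]:
    """Join per-segment word lists, dropping words repeated across an overlap.

    Assumption: the overlap is found as the longest suffix of the transcript
    so far that exactly equals a prefix of the next segment's words.
    """
    merged: List[str] = []
    for words in hypotheses:
        words = list(words)
        max_k = min(len(merged), len(words))
        k = next(
            (k for k in range(max_k, 0, -1) if merged[-k:] == words[:k]),
            0,
        )
        merged.extend(words[k:])
    return merged


if __name__ == "__main__":
    # Ten samples, windows of four samples, two samples shared between neighbors.
    print(segment_with_overlap(10, 4, 2))
    # Stand-in decoder output for three overlapping segments.
    per_segment = [
        ["the", "quick", "brown"],
        ["quick", "brown", "fox"],
        ["fox", "jumps"],
    ]
    print(" ".join(merge_word_elements(per_segment)))
```

Exact matching is the simplest possible merge rule. A real recognizer may transcribe the same overlapped audio differently in adjacent segments, so a practical merge would likely need a tolerant alignment rather than strict equality.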

