The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:

Sep. 15, 2020

Filed:

Nov. 13, 2018

Applicant:

Amazon Technologies, Inc., Seattle, WA (US);

Inventors:

Stefano Stefani, Issaquah, WA (US);

Pramod Gurunath, Sammamish, WA (US);

Ashish Singh, Bothell, WA (US);

Katrin Kirchoff, Seattle, WA (US);

Deepikaa Suresh, Seattle, WA (US);

Varun Sembium Varadarajan, Bothell, WA (US);

Vasanth Philomin, Seattle, WA (US);

Vikram Sathyanarayana Anbazhagan, Issaquah, WA (US);

Pu Paul Zhao, Seattle, WA (US);

Vijit Gupta, Mercer Island, WA (US);

Ruoyu Huang, Seattle, WA (US);

Assignee:

Amazon Technologies, Inc., Seattle, WA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G10L 15/00 (2013.01); G10L 15/06 (2013.01); G10L 25/78 (2013.01); G10L 15/30 (2013.01); G10L 15/04 (2013.01); G10L 15/26 (2006.01); G10L 15/183 (2013.01);
U.S. Cl.
CPC ...
G10L 15/063 (2013.01); G10L 15/04 (2013.01); G10L 15/183 (2013.01); G10L 15/26 (2013.01); G10L 15/30 (2013.01); G10L 25/78 (2013.01); G10L 2015/0631 (2013.01);
Abstract

Techniques for streaming real-time automated speech recognition (ASR) are described. A user can stream audio data to a frontend service of the ASR service. The frontend service can establish a bi-directional connection to an audio decoder host to perform ASR on the data stream. The audio decoder host may include a streaming ASR engine which can analyze chunks of the audio data stream using an acoustic model to divide the audio data into words, and a language model to identify sentences made of the words spoken in the audio file. The acoustic model can be trained using short audio sentence data (e.g., on the order of 30 seconds to a few minutes), enabling the transcription service to accurately transcribe short chunks of audio data. The results are then punctuated and normalized. The resulting transcript is then streamed back to the user over the bi-directional connection.
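The flow described in the abstract (chunked audio in, words, then sentences, then punctuated text streamed back) can be illustrated with a toy sketch. This is not the patented implementation: every name here (`chunk_stream`, `toy_acoustic_model`, `SentenceAssembler`, `punctuate_and_normalize`, `decoder_host`) is hypothetical, the "audio frames" are plain strings standing in for audio, an empty string stands in for a pause, and the bi-directional connection is modeled as a Python generator that consumes frames and yields transcripts.

```python
from typing import Iterable, Iterator, List

PAUSE = ""        # assumption: an empty frame marks silence between sentences
CHUNK_SIZE = 4    # assumption: arbitrary chunk length, in frames


def chunk_stream(frames: Iterable[str], size: int = CHUNK_SIZE) -> Iterator[List[str]]:
    """Split the incoming frame stream into fixed-size chunks."""
    buf: List[str] = []
    for frame in frames:
        buf.append(frame)
        if len(buf) == size:
            yield buf
            buf = []
    if buf:  # final partial chunk
        yield buf


def toy_acoustic_model(chunk: List[str]) -> List[str]:
    """Stand-in for an acoustic model: maps each frame to a lowercase token."""
    return [frame.lower() for frame in chunk]


class SentenceAssembler:
    """Stand-in for a language model: groups tokens into sentences at pauses."""

    def __init__(self) -> None:
        self.pending: List[str] = []

    def feed(self, tokens: List[str]) -> List[List[str]]:
        """Consume tokens from one chunk; return any sentences completed by it."""
        done: List[List[str]] = []
        for token in tokens:
            if token == PAUSE:
                if self.pending:
                    done.append(self.pending)
                    self.pending = []
            else:
                self.pending.append(token)
        return done

    def flush(self) -> List[List[str]]:
        """Return the trailing sentence, if any, when the stream ends."""
        done = [self.pending] if self.pending else []
        self.pending = []
        return done


def punctuate_and_normalize(words: List[str]) -> str:
    """Toy post-processing: capitalize the first letter and add a period."""
    text = " ".join(words)
    return text[0].upper() + text[1:] + "."


def decoder_host(frames: Iterable[str]) -> Iterator[str]:
    """Consume frames chunk by chunk and stream back finished sentences."""
    assembler = SentenceAssembler()
    for chunk in chunk_stream(frames):
        for sentence in assembler.feed(toy_acoustic_model(chunk)):
            yield punctuate_and_normalize(sentence)
    for sentence in assembler.flush():
        yield punctuate_and_normalize(sentence)


if __name__ == "__main__":
    for line in decoder_host(["hello", "world", "", "how", "are", "you"]):
        print(line)
```

Because `decoder_host` is a generator, a sentence is emitted as soon as the chunk containing its closing pause has been processed, rather than after the whole stream has arrived. That is the property the sketch is meant to show; a real system would replace both stand-in models and the transport.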

