The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Patent No.:

US 10629193 B1

Date of Patent:

Apr. 21, 2020

Filed:

Mar. 09, 2018

Title:

Advancing word-based speech recognition processing

Applicant:

Microsoft Technology Licensing, LLC, Redmond, WA (US);

Inventors:

Guoli Ye, Redmond, WA (US);

James Droppo, Carnation, WA (US);

Jinyu Li, Redmond, WA (US);

Rui Zhao, Redmond, WA (US);

Yifan Gong, Sammamish, WA (US);

Assignee:

Microsoft Technology Licensing, LLC, Redmond, WA (US);

Attorney:

Primary Examiner:

Edwin S Leland, III

Int. Cl.

CPC ...

G10L 15/187 (2013.01); G10L 15/16 (2006.01); G10L 15/06 (2013.01); G10L 15/22 (2006.01); G10L 15/08 (2006.01);

U.S. Cl.

CPC ...

G10L 15/187 (2013.01); G10L 15/063 (2013.01); G10L 15/08 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G10L 2015/0635 (2013.01); G10L 2015/223 (2013.01);

Abstract

Non-limiting examples of the present disclosure describe advancements in acoustic-to-word modeling that improve accuracy in speech recognition processing through the replacement of out-of-vocabulary (OOV) tokens. During the decoding of speech signals, better accuracy in speech recognition processing is achieved through training and implementation of multiple different solutions that present enhanced speech recognition models. In one example, a hybrid neural network model for speech recognition processing combines a word-based neural network model as a primary model and a character-based neural network model as an auxiliary model. The primary word-based model emits a word sequence, and an output of the character-based auxiliary model is consulted at a segment where the word-based model emits an OOV token. In another example, a mixed unit speech recognition model is developed and trained to generate a mixed word and character sequence during decoding of a speech signal without requiring generation of OOV tokens.
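The hybrid example in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: it assumes both models have already been decoded into time-aligned segments, and the names `replace_oov`, `Segment`, and the `<OOV>` marker string, as well as the frame numbers, are hypothetical choices made for illustration. It shows only the consultation step, where the character output is substituted at a segment in which the word model emitted an OOV token.

```python
from typing import List, Tuple

OOV = "<OOV>"  # hypothetical marker for an out-of-vocabulary token

# Hypothetical segment layout: (start_frame, end_frame, token).
Segment = Tuple[int, int, str]


def replace_oov(word_segments: List[Segment],
                char_segments: List[Segment]) -> List[str]:
    """Keep in-vocabulary words; spell each OOV segment from the characters
    the auxiliary model emitted inside the same frame span."""
    output: List[str] = []
    for start, end, word in word_segments:
        if word != OOV:
            output.append(word)
            continue
        # Consult the auxiliary output only within the OOV segment.
        chars = [c for (cs, ce, c) in char_segments
                 if cs >= start and ce <= end]
        # If no characters fall inside the span, leave the OOV token as is.
        output.append("".join(chars) if chars else OOV)
    return output


# Toy, made-up alignment: the word model fails on the middle segment.
words = [(0, 10, "play"), (10, 25, OOV), (25, 32, "songs")]
chars = [(0, 3, "p"), (10, 13, "a"), (13, 16, "d"),
         (16, 19, "e"), (19, 22, "l"), (22, 25, "e")]

print(replace_oov(words, chars))  # -> ['play', 'adele', 'songs']
```

The mixed-unit model described in the abstract's last sentence would avoid this post-processing step, since it emits words and characters in a single sequence.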
