The patent badge is an abbreviated version of the USPTO patent document. It lists the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Patent No.:

US 12482455 B1

Date of Patent:

Nov. 25, 2025

Filed:

Oct. 01, 2021

Systems and methods for training dual-mode machine-learned speech recognition models

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Jiahui Yu, Jersey City, NJ (US);

Ruoming Pang, New York, NY (US);

Wei Han, Mountain View, CA (US);

Anmol Gulati, New York, NY (US);

Chung-Cheng Chiu, Mountain View, CA (US);

Bo Li, Fremont, CA (US);

Tara N. Sainath, Jersey City, NJ (US);

Yonghui Wu, Palo Alto, CA (US);

Assignee:

GOOGLE LLC, Mountain View, CA (US);

Attorney:

DORITY & MANNING P.A.

Primary Examiner:

Mark Villena

Int. Cl.

CPC ...

G10L 15/16 (2006.01); G10L 15/22 (2006.01); G10L 15/32 (2013.01);

U.S. Cl.

CPC ...

G10L 15/16 (2013.01); G10L 15/22 (2013.01); G10L 15/32 (2013.01);

Abstract

Systems and methods of the present disclosure are directed to a computing system, including one or more processors and a machine-learned multi-mode speech recognition model configured to operate in a streaming recognition mode or a contextual recognition mode. The computing system can perform operations including obtaining speech data and a ground truth label and processing the speech data using the contextual recognition mode to obtain contextual prediction data. The operations can include evaluating a difference between the contextual prediction data and the ground truth label and processing the speech data using the streaming recognition mode to obtain streaming prediction data. The operations can include evaluating a difference between the streaming prediction data and the ground truth label and the contextual and streaming prediction data. The operations can include adjusting parameters of the speech recognition model.
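The training operations in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the toy model, the function names (`predict`, `dual_mode_loss`, `train_step`), the choice of cross-entropy for the ground-truth terms, the KL divergence for the contextual-versus-streaming term, and the finite-difference update are all assumptions made for illustration. The only points taken from the abstract are that one set of parameters serves both modes, that each mode's prediction is compared with the ground truth label, that the two predictions are also compared with each other, and that the parameters are then adjusted.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, frames, mode):
    """Toy shared-weight model (hypothetical): one class distribution per frame.

    mode="contextual": each frame sees the mean of the whole utterance.
    mode="streaming":  each frame sees only the mean of frames up to itself.
    """
    outputs = []
    for t in range(len(frames)):
        visible = frames if mode == "contextual" else frames[: t + 1]
        context = sum(visible) / len(visible)
        outputs.append(softmax([w * context for w in weights]))
    return outputs

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the ground truth labels.
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def kl_divergence(teacher, student):
    # Mean per-frame KL(teacher || student); an assumed choice of "difference".
    total = 0.0
    for p, q in zip(teacher, student):
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return total / len(teacher)

def dual_mode_loss(weights, frames, labels, distill_weight=1.0):
    contextual = predict(weights, frames, "contextual")
    streaming = predict(weights, frames, "streaming")
    loss_contextual = cross_entropy(contextual, labels)   # contextual vs. label
    loss_streaming = cross_entropy(streaming, labels)     # streaming vs. label
    loss_between = kl_divergence(contextual, streaming)   # contextual vs. streaming
    return loss_contextual + loss_streaming + distill_weight * loss_between

def train_step(weights, frames, labels, lr=0.1, eps=1e-5):
    """One parameter update using finite-difference gradients (toy only)."""
    base = dual_mode_loss(weights, frames, labels)
    grads = []
    for i in range(len(weights)):
        bumped = list(weights)
        bumped[i] += eps
        grads.append((dual_mode_loss(bumped, frames, labels) - base) / eps)
    return [w - lr * g for w, g in zip(weights, grads)], base

if __name__ == "__main__":
    weights = [0.0, 0.0]
    frames, labels = [1.0, 2.0, 3.0], [1, 1, 1]
    for step in range(3):
        weights, loss = train_step(weights, frames, labels)
        print(step, round(loss, 4))
```

In this sketch the gradient flows through both modes in the contextual-versus-streaming term. A real system might instead treat the contextual prediction as a fixed target for the streaming mode, which the abstract neither confirms nor rules out.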
