For the Inventor, By the Inventor

The patent badge is an abbreviated version of the USPTO patent document. It lists the patent number, issue date, filing date, title, applicant, inventors, assignee, attorney firm, primary examiner, assistant examiner, CPC codes, and abstract. The badge also links to the full patent document in Adobe Acrobat (PDF) format.

Patent No.:

US 12417770 B1

Date of Patent:

Sep. 16, 2025

Filed:

Mar. 13, 2023

Unified cascaded encoder ASR model for dynamic model sizes

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Shaojin Ding, Mountain View, CA (US);

Yangzhang He, Mountain View, CA (US);

Xin Wang, Mountain View, CA (US);

Weiran Wang, Palo Alto, CA (US);

Trevor Strohman, Mountain View, CA (US);

Tara N. Sainath, Jersey City, NJ (US);

Rohit Prakash Prabhavalkar, Palo Alto, CA (US);

Robert David, Mountain View, CA (US);

Rina Panigrahy, Mountain View, CA (US);

Rami Botros, Mountain View, CA (US);

Qiao Liang, Mountain View, CA (US);

Ian McGraw, Mountain View, CA (US);

Ding Zhao, Mountain View, CA (US);

Dongseong Hwang, Mountain View, CA (US);

Assignee:

Google LLC, Mountain View, CA (US);

Attorneys:

Honigman LLP

Brett A. Krueger

Grant Griffith

Primary Examiner:

Jonathan C Kim

Int. Cl.

CPC ...

G10L 15/32 (2013.01); G10L 15/16 (2006.01); G10L 15/22 (2006.01);

U.S. Cl.

CPC ...

G10L 15/32 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G10L 2015/223 (2013.01);

Abstract

An automated speech recognition (ASR) model includes a first encoder, a first decoder, a second encoder, and a second decoder. The first encoder receives, as input, a sequence of acoustic frames, and generates, at each of a plurality of output steps, a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The first decoder receives, as input, the first higher order feature representation generated by the first encoder, and generates a first probability distribution over possible speech recognition hypotheses. The second encoder receives, as input, the first higher order feature representation generated by the first encoder, and generates a second higher order feature representation for a corresponding first higher order feature frame. The second decoder receives, as input, the second higher order feature representation generated by the second encoder, and generates a second probability distribution over possible speech recognition hypotheses.
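
The data flow described in the abstract can be pictured with a small toy sketch. This is not the patented implementation: the class name `CascadedASR`, the layer sizes, the random weights, and the single-layer "encoders" and "decoders" are all assumptions made for illustration, and each decoder here simply emits a softmax over a tiny hypothetical vocabulary as a stand-in for a distribution over recognition hypotheses. The `use_second_pass` switch is likewise an assumption, loosely suggested by the "dynamic model sizes" in the title rather than stated in the abstract.

```python
import math
import random


def make_linear(in_dim, out_dim, rng):
    """Random (out_dim x in_dim) weight matrix; a stand-in for a trained layer."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, vec):
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def tanh_layer(weights, vec):
    """Linear layer followed by tanh; plays the role of an encoder here."""
    return [math.tanh(v) for v in apply_linear(weights, vec)]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class CascadedASR:
    """Toy cascade: encoder 1 -> decoder 1, and encoder 1 -> encoder 2 -> decoder 2."""

    def __init__(self, feat_dim=4, hidden_dim=6, vocab_size=5, seed=0):
        rng = random.Random(seed)  # hypothetical weights, not trained values
        self.enc1 = make_linear(feat_dim, hidden_dim, rng)
        self.dec1 = make_linear(hidden_dim, vocab_size, rng)
        self.enc2 = make_linear(hidden_dim, hidden_dim, rng)
        self.dec2 = make_linear(hidden_dim, vocab_size, rng)

    def forward(self, frames, use_second_pass=True):
        outputs = []
        for frame in frames:  # one output step per acoustic frame
            h1 = tanh_layer(self.enc1, frame)  # first higher order representation
            step = {"first": softmax(apply_linear(self.dec1, h1))}
            if use_second_pass:
                # The second encoder consumes the first encoder's output,
                # not the raw acoustic frame.
                h2 = tanh_layer(self.enc2, h1)  # second higher order representation
                step["second"] = softmax(apply_linear(self.dec2, h2))
            outputs.append(step)
        return outputs


frames = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.5, 0.2], [1.0, 0.0, 0.0, 0.0]]
model = CascadedASR()
full = model.forward(frames)
small = model.forward(frames, use_second_pass=False)
```

In this sketch, skipping the second pass leaves the first decoder's output unchanged, because the first encoder and first decoder never depend on the second pair.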
