The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in PDF format).

Date of Patent:

Jun. 17, 2025

Filed:

Nov. 18, 2019

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Olivier Siohan, Mountain View, CA (US);

Takaki Makino, Mountain View, CA (US);

Richard Rose, Mountain View, CA (US);

Otavio Braga, Mountain View, CA (US);

Hank Liao, Mountain View, CA (US);

Basilio Garcia Castillo, Mountain View, CA (US);

Assignee:

Google LLC, Mountain View, CA (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G10L 13/08 (2013.01); G06V 10/774 (2022.01); G06V 20/40 (2022.01); G06V 40/16 (2022.01); G10L 13/02 (2013.01); G10L 15/06 (2013.01); G10L 15/08 (2006.01); G10L 15/22 (2006.01); G10L 15/25 (2013.01); G10L 15/30 (2013.01); G10L 25/57 (2013.01);
U.S. Cl.
CPC ...
G10L 15/083 (2013.01); G06V 10/774 (2022.01); G06V 20/46 (2022.01); G06V 40/171 (2022.01); G10L 13/02 (2013.01); G10L 15/063 (2013.01); G10L 15/22 (2013.01); G10L 15/25 (2013.01); G10L 15/30 (2013.01); G10L 25/57 (2013.01);
Abstract

A method includes receiving audio data corresponding to an utterance spoken by a user, receiving video data representing motion of lips of the user while the user was speaking the utterance, and obtaining multiple candidate transcriptions for the utterance based on the audio data. For each candidate transcription of the multiple candidate transcriptions, the method also includes generating a synthesized speech representation of the corresponding candidate transcription and determining an agreement score indicating a likelihood that the synthesized speech representation matches the motion of the lips of the user while the user speaks the utterance. The method also includes selecting one of the multiple candidate transcriptions for the utterance as a speech recognition output based on the agreement scores determined for the multiple candidate transcriptions for the utterance.
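
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `select_transcription`, `synthesize`, and `agreement` are hypothetical, the synthesizer and scorer are passed in as plain callables, and choosing the candidate with the highest agreement score is an assumption (the abstract only says the selection is based on the scores). The toy stand-ins at the bottom exist only to make the sketch runnable.

```python
from typing import Callable, Sequence, TypeVar

Speech = TypeVar("Speech")  # whatever the synthesizer returns (hypothetical)
Video = TypeVar("Video")    # whatever represents the lip-motion video (hypothetical)


def select_transcription(
    candidates: Sequence[str],
    video: Video,
    synthesize: Callable[[str], Speech],
    agreement: Callable[[Speech, Video], float],
) -> str:
    """Return the candidate whose synthesized speech best agrees with the video.

    Assumption: "best" means the highest agreement score; ties go to the
    earliest candidate in the list.
    """
    if not candidates:
        raise ValueError("need at least one candidate transcription")

    # One score per candidate: synthesize speech, then compare it to the lip motion.
    scores = [agreement(synthesize(text), video) for text in candidates]

    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index]


# Toy stand-ins, purely illustrative: "speech" is a word count and the
# "video" is a count of observed lip movements.
def toy_synthesize(text: str) -> int:
    return len(text.split())


def toy_agreement(speech: int, video: int) -> float:
    return -abs(speech - video)


candidates = ["recognize speech", "wreck a nice beach"]
selected = select_transcription(candidates, 4, toy_synthesize, toy_agreement)
print(selected)
```

In a real system the two callables would be replaced by a speech synthesizer and a model that scores audio against lip motion; the sketch only shows how the per-candidate scores drive the final choice.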

