The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.
The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.
Patent No.:
Date of Patent:
Feb. 13, 2024
Filed:
Mar. 21, 2023
Google Llc, Mountain View, CA (US);
Otavio Braga, Mountain View, CA (US);
Google LLC, Mountain View, CA (US);
Abstract
An audio-visual automated speech recognition model for transcribing speech from audio-visual data includes an encoder frontend and a decoder. The encoder includes an attention mechanism configured to receive an audio track of the audio-visual data and a video portion of the audio-visual data. The video portion of the audio-visual data includes a plurality of video face tracks each associated with a face of a respective person. For each video face track of the plurality of video face tracks, the attention mechanism is configured to determine a confidence score indicating a likelihood that the face of the respective person associated with the video face track includes a speaking face of the audio track. The decoder is configured to process the audio track and the video face track of the plurality of video face tracks associated with the highest confidence score to determine a speech recognition result of the audio track.