The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).

Patent No.:

US 11568878 B1

Date of Patent:

Jan. 31, 2023

Filed:

Apr. 16, 2021

Voice shortcut detection with speaker verification

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Rajeev Rikhye, Fremont, CA (US);

Quan Wang, Hoboken, NJ (US);

Yanzhang He, Palo Alto, CA (US);

Qiao Liang, Redwood City, CA (US);

Ian C. McGraw, Menlo Park, CA (US);

Assignee:

GOOGLE LLC, Mountain View, CA (US);

Attorney:

Gray Ice Higdon

Primary Examiner:

Shaun Roberts

CPC:

G10L 17/24 (2013.01); G10L 17/06 (2013.01); G10L 21/028 (2013.01);

Abstract

Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance. Additionally or alternatively, the text representation of the utterance can be processed to determine whether at least a portion of the text representation of the utterance captures a particular keyphrase. When the system determines the registered and/or verified user spoke the utterance and the system determines the text representation of the utterance captures the particular keyphrase, the system can cause a computing device to perform one or more actions corresponding to the particular keyphrase.
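
The flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the class name `KeyphrasePipeline`, its fields, and the stub callables are all hypothetical, the three models are passed in as plain functions, and the keyphrase check is simplified to a case-insensitive substring match. It does show how keyphrases can be added as configuration without retraining any model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence

Audio = Sequence[float]  # assumption: audio is a flat sequence of samples


@dataclass
class KeyphrasePipeline:
    """Hypothetical sketch of the pipeline outlined in the abstract."""
    separate: Callable[[Audio], Audio]              # stand-in for a speaker separation model
    is_registered_speaker: Callable[[Audio], bool]  # stand-in for text-independent speaker ID
    transcribe: Callable[[Audio], str]              # stand-in for an ASR model
    actions: Dict[str, Callable[[], str]]           # keyphrase -> action; editable without retraining

    def process(self, audio: Audio) -> Optional[str]:
        # 1. Isolate the spoken utterance from other sounds.
        separated = self.separate(audio)
        # 2. Stop unless a registered/verified user spoke the utterance.
        if not self.is_registered_speaker(separated):
            return None
        # 3. Generate a text representation of the utterance.
        text = self.transcribe(separated).lower()
        # 4. Run the action for the first keyphrase found in the text
        #    (substring matching is a simplifying assumption).
        for keyphrase, action in self.actions.items():
            if keyphrase.lower() in text:
                return action()
        return None


# Usage with trivial stubs in place of real models.
pipeline = KeyphrasePipeline(
    separate=lambda audio: audio,
    is_registered_speaker=lambda audio: True,
    transcribe=lambda audio: "Please turn off the lights",
    actions={"turn off the lights": lambda: "lights_off"},
)
result = pipeline.process([0.0, 0.1, -0.1])
```

Both conditions must hold before an action runs: if the speaker check fails, `process` returns `None` even when the transcript contains a keyphrase.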
