The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Jan. 03, 2023

Filed:

Apr. 15, 2019

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Quan Wang, New York, NY (US);

Yash Sheth, Sunnyvale, CA (US);

Ignacio Lopez Moreno, New York, NY (US);

Li Wan, New York, NY (US);

Assignee:

GOOGLE LLC, Mountain View, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G10L 17/18 (2013.01); G10L 15/26 (2006.01); G10L 17/04 (2013.01); G10L 21/0216 (2013.01); G06K 9/62 (2022.01); G10L 15/16 (2006.01); G10L 17/00 (2013.01);
U.S. Cl.
CPC ...
G10L 17/18 (2013.01); G10L 15/26 (2013.01); G10L 17/04 (2013.01); G06K 9/6246 (2013.01); G10L 15/16 (2013.01); G10L 17/00 (2013.01); G10L 2021/02165 (2013.01);
Abstract

Techniques are described for training and/or utilizing an end-to-end speaker diarization model. In various implementations, the model is a recurrent neural network (RNN) model, such as an RNN model that includes at least one memory layer, such as a long short-term memory (LSTM) layer. Audio features of audio data can be applied as input to an end-to-end speaker diarization model trained according to implementations disclosed herein, and the model utilized to process the audio features to generate, as direct output over the model, speaker diarization results. Further, the end-to-end speaker diarization model can be a sequence-to-sequence model, where the sequence can have variable length. Accordingly, the model can be utilized to generate speaker diarization results for any of various length audio segments.
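The idea in the abstract, a recurrent model with an LSTM layer that consumes a variable-length sequence of audio features and directly emits diarization results, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the class name `TinyLSTMDiarizer`, the choice of one speaker index per frame as the output format, the omission of bias terms, and the random, untrained weights. It only shows the data flow, not the claimed method or any trained model.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class TinyLSTMDiarizer:
    """Hypothetical, untrained single-layer LSTM mapping feature frames to speaker indices."""

    def __init__(self, feature_dim, hidden_dim, max_speakers, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to [input frame ; previous hidden state].
        # Bias terms are left out to keep the sketch short.
        self.gates = {
            name: rand_matrix(hidden_dim, feature_dim + hidden_dim)
            for name in ("input", "forget", "output", "candidate")
        }
        # Output projection from the hidden state to per-speaker scores (assumed format).
        self.out = rand_matrix(max_speakers, hidden_dim)

    def diarize(self, frames):
        """Return one speaker index per input frame, for a sequence of any length."""
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # memory cell state
        labels = []
        for frame in frames:
            x = list(frame) + h
            i = [sigmoid(v) for v in matvec(self.gates["input"], x)]
            f = [sigmoid(v) for v in matvec(self.gates["forget"], x)]
            o = [sigmoid(v) for v in matvec(self.gates["output"], x)]
            g = [math.tanh(v) for v in matvec(self.gates["candidate"], x)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            scores = matvec(self.out, h)
            labels.append(max(range(len(scores)), key=scores.__getitem__))
        return labels


if __name__ == "__main__":
    model = TinyLSTMDiarizer(feature_dim=4, hidden_dim=8, max_speakers=2)
    rng = random.Random(1)
    short_clip = [[rng.random() for _ in range(4)] for _ in range(5)]
    long_clip = [[rng.random() for _ in range(4)] for _ in range(12)]
    # The same model handles both lengths and yields one label per frame.
    print(len(model.diarize(short_clip)), len(model.diarize(long_clip)))
```

Because the recurrent state is carried frame by frame, the same weights apply to audio segments of any length, which is the property the abstract attributes to the sequence-to-sequence formulation. With random weights the labels themselves are meaningless; a real system would learn the weights from labeled audio.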

