The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Patent No.:

US 11475909 B1

Date of Patent:

Oct. 18, 2022

Filed:

Feb. 08, 2021

Separating speech by source in audio recordings by predicting isolated audio signals conditioned on speaker representations

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Neil Zeghidour, Paris (FR);

David Grangier, Mountain View, CA (US);

Assignee:

Google LLC, Mountain View, CA (US);

Attorney:

Fish & Richardson P.C.

Primary Examiner:

Richa Mishra

Int. Cl.

CPC ...

G10L 21/028 (2013.01); G10L 21/0316 (2013.01); G10L 17/04 (2013.01); G10L 17/18 (2013.01); G06N 3/04 (2006.01); G06N 3/08 (2006.01);

U.S. Cl.

CPC ...

G10L 21/028 (2013.01); G06N 3/0454 (2013.01); G06N 3/08 (2013.01); G10L 17/04 (2013.01); G10L 17/18 (2013.01); G10L 21/0316 (2013.01);

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech separation. One of the methods includes obtaining a recording comprising speech from a plurality of speakers; processing the recording using a speaker neural network having speaker parameter values and configured to process the recording in accordance with the speaker parameter values to generate a plurality of per-recording speaker representations, each speaker representation representing features of a respective identified speaker in the recording; and processing the per-recording speaker representations and the recording using a separation neural network having separation parameter values and configured to process the recording and the speaker representations in accordance with the separation parameter values to generate, for each speaker representation, a respective predicted isolated audio signal that corresponds to speech of one of the speakers in the recording.
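
The two-stage data flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the patented implementation: the function names (separate_speech, toy_speaker_net, toy_separation_net), the type aliases, and the sample values are all hypothetical, and the two "networks" are simple arithmetic placeholders standing in for trained neural networks.

```python
from typing import Callable, List

Signal = List[float]      # mono audio samples (assumed representation)
Embedding = List[float]   # one per-recording speaker representation

SpeakerNet = Callable[[Signal], List[Embedding]]
SeparationNet = Callable[[Signal, List[Embedding]], List[Signal]]


def separate_speech(recording: Signal,
                    speaker_net: SpeakerNet,
                    separation_net: SeparationNet) -> List[Signal]:
    # Stage 1: one representation per identified speaker in this recording.
    speaker_reps = speaker_net(recording)
    # Stage 2: predict isolated signals conditioned on those representations.
    isolated = separation_net(recording, speaker_reps)
    if len(isolated) != len(speaker_reps):
        raise ValueError("expected one isolated signal per speaker representation")
    return isolated


def toy_speaker_net(recording: Signal, num_speakers: int = 2) -> List[Embedding]:
    # Placeholder, NOT a trained model: a 1-D "embedding" per speaker, taken
    # as the mean absolute amplitude of every num_speakers-th sample.
    reps = []
    for k in range(num_speakers):
        stream = recording[k::num_speakers]
        reps.append([sum(abs(x) for x in stream) / max(len(stream), 1)])
    return reps


def toy_separation_net(recording: Signal,
                       speaker_reps: List[Embedding]) -> List[Signal]:
    # Placeholder, NOT a trained model: scale the mixture by each speaker's
    # share of the total embedding magnitude.
    total = sum(rep[0] for rep in speaker_reps) or 1.0
    return [[(rep[0] / total) * x for x in recording] for rep in speaker_reps]


# Hypothetical six-sample "recording" used only to exercise the pipeline.
mixture = [0.5, -0.25, 0.75, 0.0, -0.5, 0.25]
signals = separate_speech(mixture, toy_speaker_net, toy_separation_net)
```

The point of the sketch is the interface: the separation step receives both the recording and the full set of speaker representations, and returns exactly one predicted signal per representation. In a real system both callables would presumably be learned models with their own parameter values.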