The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Jun. 07, 2022

Filed:
Oct. 01, 2020

Applicant:
Deepmind Technologies Limited, London, GB;

Inventors:
Yutian Chen, Cambridge, GB;
Scott Ellison Reed, New York, NY (US);
Aaron Gerard Antonius van den Oord, London, GB;
Oriol Vinyals, London, GB;
Heiga Zen, Tokyo, JP;
Ioannis Alexandros Assael, London, GB;
Brendan Shillingford, London, GB;
Joao Ferdinando Gomes de Freitas, London, GB;

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G10L 13/047 (2013.01); G10L 13/033 (2013.01); G10L 13/00 (2006.01); G06N 3/04 (2006.01); G06N 3/08 (2006.01);
U.S. Cl.
CPC ...
G10L 13/047 (2013.01); G06N 3/0445 (2013.01); G06N 3/0454 (2013.01); G06N 3/08 (2013.01); G10L 13/00 (2013.01); G10L 13/033 (2013.01);
Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an adaptive audio-generation model. One of the methods includes generating an adaptive audio-generation model including learning a plurality of embedding vectors and parameter values of a neural network using training data comprising first text and audio data representing a plurality of different individual speakers speaking portions of the first text, wherein the plurality of embedding vectors represent respective voice characteristics of the plurality of different individual speakers. The adaptive audio-generation model is adapted for a new individual speaker using adaptation data comprising second text and audio data representing the new individual speaker speaking portions of the second text, the new individual speaker being different from each of the plurality of individual speakers, wherein adapting the audio-generation model includes learning a new embedding vector for the new individual speaker.
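
The two phases described in the abstract (joint training of shared parameters and per-speaker embeddings, then adaptation by learning only a new embedding) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the model is a tiny linear map rather than a neural network, the "audio" is a two-value vector, and the names `predict`, `sgd_step`, `make_data`, `train`, and `adapt` are hypothetical. It is not the patented system.

```python
import random

def predict(w, emb, x):
    """Toy 'audio' output: shared weights times a text feature, plus the speaker embedding."""
    return [wk * x + ek for wk, ek in zip(w, emb)]

def sgd_step(w, emb, x, y, lr, update_w):
    """One squared-error gradient step. The shared weights change only if update_w is True."""
    pred = predict(w, emb, x)
    for k in range(len(w)):
        err = pred[k] - y[k]
        if update_w:
            w[k] -= lr * err * x
        emb[k] -= lr * err

def make_data(rng, w_true, voice, n):
    """Synthetic (text feature, audio) pairs for one speaker (illustrative data only)."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, [wk * x + vk for wk, vk in zip(w_true, voice)]))
    return data

def train(speaker_data, dim, epochs=200, lr=0.1):
    """Phase 1: learn shared weights and one embedding per training speaker."""
    w = [0.0] * dim
    embs = {s: [0.0] * dim for s in speaker_data}
    for _ in range(epochs):
        for s, data in speaker_data.items():
            for x, y in data:
                sgd_step(w, embs[s], x, y, lr, update_w=True)
    return w, embs

def adapt(w, data, epochs=200, lr=0.1):
    """Phase 2: keep the shared weights fixed and learn only a new speaker's embedding."""
    new_emb = [0.0] * len(w)
    for _ in range(epochs):
        for x, y in data:
            sgd_step(w, new_emb, x, y, lr, update_w=False)
    return new_emb

rng = random.Random(0)
w_true = [0.5, -1.5]  # hidden "true" shared weights used to synthesize data
voices = {"spk_a": [1.0, 0.0], "spk_b": [-0.5, 2.0], "spk_c": [0.3, -0.7]}
train_data = {s: make_data(rng, w_true, v, 20) for s, v in voices.items()}
w, embs = train(train_data, dim=2)

new_voice = [2.0, 1.0]  # a speaker not seen during training
adapt_data = make_data(rng, w_true, new_voice, 10)
new_emb = adapt(w, adapt_data)
print([round(v, 3) for v in new_emb])
```

In this sketch the adaptation phase touches only the new embedding, so a small amount of data from the new speaker is enough to fit it. The abstract states only that adaptation "includes" learning a new embedding vector, so freezing all other parameters is a simplification chosen here, not a claim about the patent.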

