The patent badge is an abbreviated version of the USPTO patent document. It lists the patent number, issue date, filing date, title, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPC codes, and abstract, and it links to the full patent document (PDF).

Date of Patent:
Jan. 11, 2022

Filed:

May 7, 2020

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Zhehuai Chen, Jersey City, NJ (US);

Andrew M. Rosenberg, Brooklyn, NY (US);

Bhuvana Ramabhadran, Mt. Kisco, NY (US);

Pedro J. Moreno Mengibar, Jersey City, NJ (US);

Assignee:

Google LLC, Mountain View, CA (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G10L 15/26 (2006.01); G10L 13/00 (2006.01); G10L 13/08 (2013.01); G10L 15/06 (2013.01);
U.S. Cl.
CPC ...
G10L 13/00 (2013.01); G10L 13/08 (2013.01); G10L 15/063 (2013.01);
Abstract

A method for training a generative adversarial network (GAN)-based text-to-speech (TTS) model and a speech recognition model in unison includes obtaining a plurality of training text utterances. At each of a plurality of output steps for each training text utterance, the method also includes generating, for output by the GAN-based TTS model, a synthetic speech representation of the corresponding training text utterance, and determining, using an adversarial discriminator of the GAN, an adversarial loss term indicative of an amount of acoustic noise disparity in one of the non-synthetic speech representations selected from the set of spoken training utterances relative to the corresponding synthetic speech representation of the corresponding training text utterance. The method also includes updating parameters of the GAN-based TTS model based on the adversarial loss term determined at each of the plurality of output steps for each training text utterance of the plurality of training text utterances.
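
The loop described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the patent: the TTS model is reduced to a single scalar weight `w`, each output step of a text utterance is a single float, the discriminator is a fixed logistic function, and the adversarial loss is the standard GAN objective log D(real) + log(1 - D(synthetic)). The names `train_tts`, `discriminator`, and `adversarial_loss` are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def discriminator(speech, a=1.0, b=0.0):
    """Toy fixed discriminator: probability that `speech` is non-synthetic."""
    return sigmoid(a * speech + b)


def adversarial_loss(real, synthetic):
    """Standard GAN objective (an assumption, not the patent's formula)."""
    return math.log(discriminator(real)) + math.log(1.0 - discriminator(synthetic))


def train_tts(text_utterances, spoken_utterances, w=0.1, lr=0.1, seed=0):
    """Update the toy TTS weight `w` at every output step of every utterance."""
    rng = random.Random(seed)
    losses = []
    for utterance in text_utterances:      # each training text utterance
        for x in utterance:                # each output step
            synthetic = w * x              # synthetic speech representation
            real = rng.choice(spoken_utterances)  # selected non-synthetic sample
            losses.append(adversarial_loss(real, synthetic))
            # Gradient of log(1 - D(w * x)) with respect to w, for a = 1:
            # -D(w * x) * x. The generator descends this gradient.
            grad_w = -discriminator(synthetic) * x
            w -= lr * grad_w
    return w, losses


if __name__ == "__main__":
    new_w, step_losses = train_tts([[1.0, 2.0], [0.5]], [1.5, 2.0])
    print(new_w, step_losses)
```

In this sketch only the TTS weight is updated. The discriminator stays fixed, and the speech recognition model that the abstract says is trained in unison is omitted entirely.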

