The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in PDF format).

Date of Patent:

Nov. 29, 2022

Filed:

Aug. 13, 2020

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Lev Finkelstein, Mountain View, CA (US);

Chun-An Chan, Mountain View, CA (US);

Byungha Chun, Tokyo, JP;

Ye Jia, Mountain View, CA (US);

Yu Zhang, Mountain View, CA (US);

Robert Andrew James Clark, Hertfordshire, GB;

Vincent Wan, London, GB;

Assignee:

Google LLC, Mountain View, CA (US);

CPC:

G10L 13/10 (2013.01); G10L 13/02 (2013.01); G10L 17/18 (2013.01);

Abstract

A method includes receiving an input text utterance to be synthesized into expressive speech having an intended prosody and a target voice and generating, using a first text-to-speech (TTS) model, an intermediate synthesized speech representation for the input text utterance. The intermediate synthesized speech representation possesses the intended prosody. The method also includes providing the intermediate synthesized speech representation to a second TTS model that includes an encoder portion and a decoder portion. The encoder portion is configured to encode the intermediate synthesized speech representation into an utterance embedding that specifies the intended prosody. The decoder portion is configured to process the input text utterance and the utterance embedding to generate an output audio signal of expressive speech that has the intended prosody specified by the utterance embedding and speaker characteristics of the target voice.
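
The data flow described in the abstract can be sketched roughly as follows. This is a toy illustration only, not the patented implementation: the class names (FirstTTSModel, SecondTTSModel), the function synthesize_expressive, and the stand-in "prosody" and "audio" values are all hypothetical and chosen purely to show how the two models hand data to each other.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntermediateSpeech:
    """Hypothetical stand-in for the first model's output: one prosody value per word."""
    prosody_frames: List[float]


class FirstTTSModel:
    """Hypothetical first TTS model; a toy rule replaces real prosody modeling."""

    def synthesize(self, text: str) -> IntermediateSpeech:
        # Toy assumption: word length plays the role of prosodic emphasis.
        return IntermediateSpeech([float(len(word)) for word in text.split()])


class SecondTTSModel:
    """Hypothetical second TTS model with an encoder portion and a decoder portion."""

    def __init__(self, target_voice: str) -> None:
        self.target_voice = target_voice

    def encode(self, speech: IntermediateSpeech) -> List[float]:
        # Toy utterance embedding: [mean, range] of the prosody frames.
        frames = speech.prosody_frames
        if not frames:
            raise ValueError("intermediate representation is empty")
        return [sum(frames) / len(frames), max(frames) - min(frames)]

    def decode(self, text: str, embedding: List[float]) -> dict:
        # Toy "audio signal": a record combining text, prosody, and voice.
        return {"text": text, "prosody": embedding, "voice": self.target_voice}


def synthesize_expressive(text: str, first: FirstTTSModel, second: SecondTTSModel) -> dict:
    intermediate = first.synthesize(text)    # carries the intended prosody
    embedding = second.encode(intermediate)  # utterance embedding from the encoder
    return second.decode(text, embedding)    # decoder adds the target voice


output = synthesize_expressive("hello big world", FirstTTSModel(), SecondTTSModel("voice_a"))
```

In this sketch the input text is used twice: once by the first model to produce the prosody-bearing intermediate representation, and again by the decoder, which combines it with the utterance embedding. That mirrors the abstract's description, in which the intermediate speech contributes only the prosody and the second model contributes the speaker characteristics of the target voice.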

