The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Nov. 19, 2024

Filed:

Apr. 05, 2021

Applicant:

Google LLC, Mountain View, CA (US);

Inventors:

Yonghui Wu, Fremont, CA (US);

Jonathan Shen, Santa Clara, CA (US);

Ruoming Pang, New York, NY (US);

Ron J. Weiss, New York, NY (US);

Michael Schuster, Saratoga, CA (US);

Navdeep Jaitly, Mountain View, CA (US);

Zongheng Yang, Berkeley, CA (US);

Zhifeng Chen, Sunnyvale, CA (US);

Yu Zhang, Mountain View, CA (US);

Yuxuan Wang, Sunnyvale, CA (US);

Russell John Wyatt Skerry-Ryan, Mountain View, CA (US);

Ryan M. Rifkin, Oakland, CA (US);

Ioannis Agiomyrgiannakis, London, GB;

Assignee:

Google LLC, Mountain View, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G10L 13/047 (2013.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01); G06N 5/046 (2023.01); G06N 7/01 (2023.01); G10L 13/08 (2013.01); G10L 25/18 (2013.01); G10L 25/30 (2013.01);
U.S. Cl.
CPC ...
G10L 25/30 (2013.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01); G06N 5/046 (2013.01); G06N 7/01 (2023.01); G10L 13/047 (2013.01); G10L 13/08 (2013.01); G10L 25/18 (2013.01);
Abstract

Methods, systems, and computer program products for generating, from an input character sequence, an output sequence of audio data representing the input character sequence. The output sequence of audio data includes a respective audio output sample for each of a number of time steps. One example method includes, for each of the time steps: generating a mel-frequency spectrogram for the time step by processing a representation of a respective portion of the input character sequence using a decoder neural network; generating a probability distribution over a plurality of possible audio output samples for the time step by processing the mel-frequency spectrogram for the time step using a vocoder neural network; and selecting the audio output sample for the time step from the possible audio output samples in accordance with the probability distribution.
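
The per-time-step loop described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: `toy_decoder` and `toy_vocoder` are hypothetical stand-ins for the decoder and vocoder neural networks, and the frame size, the number of possible sample values, and the way a "portion" of the input is chosen for each step are all assumptions made for the example.

```python
import math
import random

NUM_MEL_BINS = 8        # assumed size of one mel-frequency frame
NUM_SAMPLE_VALUES = 16  # assumed number of possible (quantized) audio samples


def toy_decoder(portion):
    """Hypothetical stand-in for the decoder neural network: maps a
    representation of a portion of the input to one mel-frequency frame."""
    return [math.sin(portion * (k + 1)) for k in range(NUM_MEL_BINS)]


def toy_vocoder(mel_frame):
    """Hypothetical stand-in for the vocoder neural network: maps a mel frame
    to a probability distribution over the possible audio output samples."""
    logits = [
        sum(m * math.cos(v + k) for k, m in enumerate(mel_frame))
        for v in range(NUM_SAMPLE_VALUES)
    ]
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def synthesize(text, num_time_steps, seed=0):
    """Produce one audio output sample per time step."""
    if not text:
        return []
    rng = random.Random(seed)
    samples = []
    for t in range(num_time_steps):
        # Representation of the input portion for this step (assumption:
        # a single scaled character code, chosen purely for illustration).
        portion = ord(text[t % len(text)]) / 128.0
        mel_frame = toy_decoder(portion)   # step 1: mel-frequency frame
        probs = toy_vocoder(mel_frame)     # step 2: distribution over samples
        # Step 3: select a sample in accordance with the distribution.
        sample = rng.choices(range(NUM_SAMPLE_VALUES), weights=probs, k=1)[0]
        samples.append(sample)
    return samples
```

Calling `synthesize("hello", 10)` returns a list of ten integer sample indices, one per time step. In a real system each index would map to a quantized audio amplitude, and both stand-in functions would be trained networks.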

