The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Patent No.:

US 11715457 B1

Date of Patent:

Aug. 01, 2023

Filed:

Dec. 19, 2022

Real time correction of accent in speech audio signals

Applicant:

Intone Inc., New York, NY (US);

Inventors:

Andrei Golman, San Francisco, CA (US);

Dmitrii Sadykov, Yerevan, AM;

Assignee:

Intone Inc., New York, NY (US);

Attorney:

Georgiy L. Khayet

Primary Examiner:

Vu B Hang

Int. Cl.

CPC ...

G10L 15/00 (2013.01); G10L 15/02 (2006.01); G10L 13/04 (2013.01); G10L 25/30 (2013.01); G10L 15/22 (2006.01); G10L 15/183 (2013.01); G10L 13/08 (2013.01); G10L 15/18 (2013.01);

U.S. Cl.

CPC ...

G10L 15/02 (2013.01); G10L 13/04 (2013.01); G10L 15/22 (2013.01); G10L 25/30 (2013.01); G10L 13/08 (2013.01); G10L 15/183 (2013.01); G10L 15/1822 (2013.01); G10L 2015/025 (2013.01);

Abstract

Systems and methods for real-time correction of an accent in a speech audio signal are provided. A method includes dividing the speech audio signal into a stream of input chunks, an input chunk from the stream of input chunks including a pre-defined number of frames of the speech audio signal; extracting, by an acoustic features extraction module from the input chunk and a context associated with the input chunk, acoustic features, the context being a pre-determined number of the frames preceding the input chunk in the stream; extracting, by a linguistic features extraction module from the input chunk and the context, linguistic features; receiving a speaker embedding for a human speaker; providing the speaker embedding, the acoustic features, and the linguistic features to a synthesis module to generate a melspectrogram with a reduced accent; and providing the melspectrogram to a vocoder to generate an output chunk of an output audio signal.
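
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. All names here (`chunks_with_context`, `correct_stream`, and the four callables passed in) are hypothetical, and the feature extraction, synthesis, and vocoder modules are treated as opaque functions supplied by the caller. The sketch also assumes that a trailing chunk shorter than the pre-defined size is still processed, which the abstract does not specify.

```python
from typing import Any, Callable, Iterator, List, Sequence, Tuple


def chunks_with_context(
    frames: Sequence[Any], chunk_size: int, context_size: int
) -> Iterator[Tuple[List[Any], List[Any]]]:
    """Yield (context, chunk) pairs over a sequence of frames.

    Each chunk holds up to `chunk_size` frames; its context is the (up to)
    `context_size` frames that immediately precede it in the stream.
    """
    if chunk_size <= 0 or context_size < 0:
        raise ValueError("chunk_size must be > 0 and context_size >= 0")
    for start in range(0, len(frames), chunk_size):
        chunk = list(frames[start:start + chunk_size])
        # The first chunk has no preceding frames, so its context is empty.
        context = list(frames[max(0, start - context_size):start])
        yield context, chunk


def correct_stream(
    frames: Sequence[Any],
    chunk_size: int,
    context_size: int,
    speaker_embedding: Any,
    acoustic: Callable[[List[Any], List[Any]], Any],    # hypothetical module
    linguistic: Callable[[List[Any], List[Any]], Any],  # hypothetical module
    synthesize: Callable[[Any, Any, Any], Any],         # hypothetical module
    vocode: Callable[[Any], Any],                       # hypothetical module
) -> Iterator[Any]:
    """Wire the stages together in the order the abstract lists them."""
    for context, chunk in chunks_with_context(frames, chunk_size, context_size):
        acoustic_features = acoustic(chunk, context)
        linguistic_features = linguistic(chunk, context)
        mel = synthesize(speaker_embedding, acoustic_features, linguistic_features)
        yield vocode(mel)  # one output chunk per input chunk
```

In this sketch, ten frames with a chunk size of four and a context size of two produce three chunks: the first with an empty context, and the next two each with the two frames that precede them.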
