The patent badge is an abbreviated version of the USPTO patent document and contains a link to the full patent document.

Int. Cl.

G10L 21/007 (2013.01); G06F 3/16 (2006.01); G10L 13/00 (2006.01); G10L 13/033 (2013.01); G10L 15/02 (2006.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G10L 15/26 (2006.01); G10L 21/003 (2013.01); G10L 21/01 (2013.01); G10L 21/013 (2013.01);

U.S. Cl.

CPC ...

G10L 21/007 (2013.01); G06F 3/162 (2013.01); G10L 13/00 (2013.01); G10L 13/033 (2013.01); G10L 15/02 (2013.01); G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/26 (2013.01); G10L 21/003 (2013.01); G10L 21/013 (2013.01); G10L 21/01 (2013.01); G10L 2021/0135 (2013.01);

Abstract

The disclosed technology relates to methods, accent conversion systems, and non-transitory computer readable media for real-time accent conversion. In some examples, a set of phonetic embedding vectors is obtained for phonetic content representing a source accent and obtained from input audio data. A trained machine learning model is applied to the set of phonetic embedding vectors to generate a set of transformed phonetic embedding vectors corresponding to phonetic characteristics of speech data in a target accent. An alignment is determined by maximizing a cosine distance between the set of phonetic embedding vectors and the set of transformed phonetic embedding vectors. The speech data is then aligned to the phonetic content based on the determined alignment to generate output audio data representing the target accent. The disclosed technology transforms phonetic characteristics of a source accent to match the target accent more closely for efficient and seamless accent conversion in real-time applications.
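
The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the function names, the toy 2-D vectors, and the per-vector best-match strategy are all assumptions made for illustration. The abstract speaks of maximizing a cosine distance; this sketch assumes the intent is to pair each source vector with its most similar transformed vector, that is, to maximize cosine similarity.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero vector is treated as similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def align(source_vectors, transformed_vectors):
    """For each source vector, pick the index of the transformed vector
    with the highest cosine similarity (hypothetical greedy alignment)."""
    alignment = []
    for src in source_vectors:
        scores = [cosine_similarity(src, tgt) for tgt in transformed_vectors]
        best_index = max(range(len(scores)), key=scores.__getitem__)
        alignment.append(best_index)
    return alignment


# Toy embeddings, invented for this example only.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
transformed = [[0.1, 0.9], [0.9, 0.1], [0.7, 0.7]]

alignment = align(source, transformed)
print(alignment)  # [1, 0, 2]
```

Each entry of `alignment` gives, for the source vector at that position, the index of its best-matching transformed vector. The abstract does not say whether the alignment must preserve temporal order; a monotonic constraint such as dynamic time warping could replace the greedy match used here.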
