The patent badge is an abbreviated version of the USPTO patent document. It covers the patent number, the date the patent was issued, the date the patent was filed, the title of the patent, the applicant, the inventors, the assignee, the attorney firm, the primary examiner, the assistant examiner, the CPC classifications, and the abstract. The patent badge also contains a link to the full patent document (in PDF format).

Date of Patent: Jan. 3, 2023

Filed: Dec. 10, 2019

Applicant: Amazon Technologies, Inc., Seattle, WA (US)

Inventors:
Marcello Federico, Mountain View, CA (US);
Robert Enyedi, Santa Clara, CA (US);
Yaser Al-Onaizan, Cortlandt Manor, NY (US);
Roberto Barra-Chicote, Cambridge, GB;
Andrew Paul Breen, Norwich, GB;
Ritwik Giri, Sunnyvale, CA (US);
Mehmet Umut Isik, Menlo Park, CA (US);
Arvindh Krishnaswamy, Palo Alto, CA (US);
Hassan Sawaf, Los Gatos, CA (US)

Assignee: Amazon Technologies, Inc., Seattle, WA (US)

Attorney:
Primary Examiner:
Int. Cl.:
G10L 13/08 (2013.01); G10L 15/22 (2006.01); G11B 20/10 (2006.01); G06F 3/16 (2006.01); G10L 13/10 (2013.01); G06F 40/47 (2020.01); G10L 25/90 (2013.01); G10L 15/06 (2013.01); G10L 13/00 (2006.01); G10L 15/26 (2006.01); G06V 40/16 (2022.01)
U.S. Cl. (CPC):
G10L 13/10 (2013.01); G06F 40/47 (2020.01); G06V 40/161 (2022.01); G10L 13/00 (2013.01); G10L 15/063 (2013.01); G10L 15/22 (2013.01); G10L 15/26 (2013.01); G10L 25/90 (2013.01)
Abstract

Techniques for the generation of dubbed audio for an audio/visual file are described. An exemplary approach is to receive a request to generate dubbed speech for an audio/visual file; and in response to the request to: extract speech segments from an audio track of the audio/visual file associated with identified speakers; translate the extracted speech segments into a target language; determine a machine learning model per identified speaker, the trained machine learning models to be used to generate a spoken version of the translated, extracted speech segments based on the identified speaker; generate, per translated, extracted speech segment, a spoken version of the translated, extracted speech segments using a trained machine learning model that corresponds to the identified speaker of the translated, extracted speech segment and prosody information for the extracted speech segments; and replace the extracted speech segments from the audio track of the audio/visual file with the spoken versions of the translated, extracted speech segments to generate a modified audio track.
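The abstract describes a per-speaker dubbing pipeline: extract speech segments, translate them, select a voice model for each identified speaker, synthesize the translation with the source segment's prosody, and splice the result back into the track. The following is a minimal Python sketch of that control flow only; the `translate` and `synthesize` placeholders, and all names here, are hypothetical stand-ins for the machine learning models the patent actually contemplates.

```python
from dataclasses import dataclass

@dataclass
class SpeechSegment:
    speaker: str   # identified speaker of this segment
    start: float   # start time within the audio track (seconds)
    end: float     # end time within the audio track (seconds)
    text: str      # transcribed source-language speech

def translate(text: str, target_lang: str) -> str:
    # Placeholder machine-translation step; a real system would
    # invoke a trained translation model here.
    return f"[{target_lang}] {text}"

def synthesize(text: str, voice_model: str, prosody: dict) -> str:
    # Placeholder text-to-speech step; a real system would run the
    # per-speaker model conditioned on prosody information (e.g. the
    # duration of the original segment) and return audio samples.
    return f"<audio voice={voice_model} prosody={prosody}>{text}</audio>"

def dub_track(segments, target_lang, voice_models):
    """Translate each extracted segment and synthesize it with the
    model that corresponds to the segment's identified speaker,
    producing (start, end, audio) spans that replace the originals."""
    dubbed = []
    for seg in segments:
        translated = translate(seg.text, target_lang)
        model = voice_models[seg.speaker]  # one trained model per speaker
        prosody = {"duration": seg.end - seg.start}
        audio = synthesize(translated, model, prosody)
        dubbed.append((seg.start, seg.end, audio))
    return dubbed
```

Keyed lookup of `voice_models` by speaker mirrors the claim's "machine learning model per identified speaker"; the returned spans carry the original timing so the caller can overwrite exactly those regions of the audio track.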

