
The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also contains a link to the full patent document in Adobe Acrobat (PDF) format, which can be downloaded or printed.

Date of Patent: Oct. 29, 2019
Filed: Mar. 29, 2017
Applicant: Tata Consultancy Services Limited, Mumbai, IN
Inventors: Chitralekha Bhat, Thane, IN; Sunil Kumar Kopparapu, Thane, IN; Ashish Panda, Thane, IN
Assignee:
Attorney:
Primary Examiner:
Int. Cl.: G10L 15/26 (2006.01); G10L 21/10 (2013.01); G10L 21/06 (2013.01); G10L 15/04 (2013.01); G10L 15/183 (2013.01); G10L 15/25 (2013.01); G10L 25/57 (2013.01); G10L 25/63 (2013.01); G11B 27/036 (2006.01); G10L 25/81 (2013.01); G10L 15/02 (2006.01)
U.S. Cl.: CPC ... G10L 15/265 (2013.01); G10L 15/04 (2013.01); G10L 15/183 (2013.01); G10L 15/25 (2013.01); G10L 15/26 (2013.01); G10L 21/06 (2013.01); G10L 21/10 (2013.01); G10L 25/57 (2013.01); G10L 25/63 (2013.01); G10L 25/81 (2013.01); G11B 27/036 (2013.01); G10L 2015/025 (2013.01); G10L 2021/065 (2013.01); G10L 2021/105 (2013.01)
Abstract

A system and method to insert visual subtitles in videos is described. The method comprises segmenting an input video signal to extract the speech segments and music segments. Next, a speaker representation is associated with each speech segment corresponding to a speaker visible in the frame. Further, the speech segments are analyzed to compute the phones and the duration of each phone. The phones are mapped to corresponding visemes, and a viseme-based language model is created with a corresponding score. The most relevant viseme is selected for each speech segment by computing a total viseme score. Further, a speaker representation sequence is created such that phones and emotions in the speech segments are represented as reconstructed lip movements and eyebrow movements. The speaker representation sequence is then integrated with the music segments and superimposed on the input video signal to create the subtitles.
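The phone-to-viseme mapping and total-score computation described in the abstract can be sketched roughly as below. This is a minimal illustration only: the phone-to-viseme table, the duration-weighted scoring scheme, and all function names (`phones_to_visemes`, `total_viseme_score`) are assumptions for exposition, not the mapping or scoring method claimed in the patent.

```python
# Illustrative sketch of a phone-to-viseme step; the table and scoring
# below are assumptions, not the patent's actual mapping or model.

# Toy many-to-one mapping from phones to viseme classes (visually
# indistinguishable phones share one viseme).
PHONE_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "aa": "open", "ae": "open",
    "iy": "spread", "ih": "spread",
}

def phones_to_visemes(phones):
    """Map (phone, duration_ms) pairs to visemes, merging consecutive
    phones that fall into the same viseme class."""
    visemes = []
    for phone, duration in phones:
        viseme = PHONE_TO_VISEME.get(phone, "neutral")
        if visemes and visemes[-1][0] == viseme:
            # Same viseme as the previous phone: extend its duration.
            visemes[-1] = (viseme, visemes[-1][1] + duration)
        else:
            visemes.append((viseme, duration))
    return visemes

def total_viseme_score(visemes, lm_scores):
    """Duration-weighted total score of a viseme sequence under a
    (hypothetical) viseme-based language model."""
    return sum(lm_scores.get(v, 0.0) * d for v, d in visemes)

# A short speech segment: phones with durations in milliseconds.
segment = [("m", 80), ("aa", 150), ("ae", 50), ("p", 70)]
sequence = phones_to_visemes(segment)
print(sequence)   # the two "open" vowels merge into one viseme
score = total_viseme_score(sequence, {"bilabial": 1.0, "open": 2.0})
print(score)
```

In this sketch, the candidate viseme sequence with the highest total score would be the one selected for the segment; the patent's actual selection criterion is given only at the level of detail in the abstract.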

