
The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also contains a link to the full patent document in Adobe Acrobat (PDF) format, which can be downloaded or printed.

Date of Patent: Oct. 29, 2019
Filed: Mar. 29, 2017
Applicant: Tata Consultancy Services Limited, Mumbai, IN
Inventors: Chitralekha Bhat, Thane, IN; Sunil Kumar Kopparapu, Thane, IN; Ashish Panda, Thane, IN
Assignee:
Attorney:
Primary Examiner:
Int. Cl.: G10L 15/26 (2006.01); G10L 21/10 (2013.01); G10L 21/06 (2013.01); G10L 15/04 (2013.01); G10L 15/183 (2013.01); G10L 15/25 (2013.01); G10L 25/57 (2013.01); G10L 25/63 (2013.01); G11B 27/036 (2006.01); G10L 25/81 (2013.01); G10L 15/02 (2006.01)
U.S. Cl.: CPC ... G10L 15/265 (2013.01); G10L 15/04 (2013.01); G10L 15/183 (2013.01); G10L 15/25 (2013.01); G10L 15/26 (2013.01); G10L 21/06 (2013.01); G10L 21/10 (2013.01); G10L 25/57 (2013.01); G10L 25/63 (2013.01); G10L 25/81 (2013.01); G11B 27/036 (2013.01); G10L 2015/025 (2013.01); G10L 2021/065 (2013.01); G10L 2021/105 (2013.01)
Abstract

A system and method to insert visual subtitles in videos is described. The method comprises segmenting an input video signal to extract the speech segments and music segments. Next, a speaker representation is associated with each speech segment corresponding to a speaker visible in the frame. Further, the speech segments are analyzed to compute the phones and the duration of each phone. The phones are mapped to corresponding visemes, and a viseme-based language model is created with a corresponding score. The most relevant viseme is selected for each speech segment by computing a total viseme score. Further, a speaker representation sequence is created such that phones and emotions in the speech segments are represented as reconstructed lip movements and eyebrow movements. The speaker representation sequence is then integrated with the music segments and superimposed on the input video signal to create the subtitles.
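The phone-to-viseme mapping and total-score computation described in the abstract can be sketched roughly as below. This is a minimal illustration only: the phone-to-viseme table, the duration-weighted scoring scheme, and all function names (`phones_to_visemes`, `total_viseme_score`) are assumptions for exposition, not the mapping or scoring method claimed in the patent.

```python
# Illustrative sketch of a phone-to-viseme step; the table and scoring
# below are assumptions, not the patent's actual mapping or model.

# Toy many-to-one mapping from phones to viseme classes (visually
# indistinguishable phones share one viseme).
PHONE_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "aa": "open", "ae": "open",
    "iy": "spread", "ih": "spread",
}

def phones_to_visemes(phones):
    """Map (phone, duration_ms) pairs to visemes, merging consecutive
    phones that fall into the same viseme class."""
    visemes = []
    for phone, duration in phones:
        viseme = PHONE_TO_VISEME.get(phone, "neutral")
        if visemes and visemes[-1][0] == viseme:
            # Same viseme as the previous phone: extend its duration.
            visemes[-1] = (viseme, visemes[-1][1] + duration)
        else:
            visemes.append((viseme, duration))
    return visemes

def total_viseme_score(visemes, lm_scores):
    """Duration-weighted total score of a viseme sequence under a
    (hypothetical) viseme-based language model."""
    return sum(lm_scores.get(v, 0.0) * d for v, d in visemes)

# A short speech segment: phones with durations in milliseconds.
segment = [("m", 80), ("aa", 150), ("ae", 50), ("p", 70)]
sequence = phones_to_visemes(segment)
print(sequence)   # the two "open" vowels merge into one viseme
score = total_viseme_score(sequence, {"bilabial": 1.0, "open": 2.0})
print(score)
```

In this sketch, the candidate viseme sequence with the highest total score would be the one selected for the segment; the patent's actual selection criterion is given only at the level of detail in the abstract.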

