The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Feb. 20, 2018

Filed:
Jul. 02, 2015

Applicant:
International Business Machines Corporation, Armonk, NY (US);

Inventors:
Jonathan H. Connell, II, Cortlandt-Manor, NY (US);
Etienne Marcheret, White Plains, NY (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G10L 15/00 (2013.01); G05B 15/00 (2006.01); H04R 25/00 (2006.01); G06K 9/62 (2006.01); G10L 15/25 (2013.01); G10L 15/07 (2013.01); G06N 3/00 (2006.01); G06K 9/00 (2006.01); G06F 3/01 (2006.01); G10L 15/24 (2013.01); G10L 15/22 (2006.01);
U.S. Cl.
CPC ...
G10L 15/25 (2013.01); G10L 15/07 (2013.01); G06F 3/016 (2013.01); G06K 9/00281 (2013.01); G06N 3/008 (2013.01); G10L 15/24 (2013.01); G10L 2015/227 (2013.01); H04R 25/407 (2013.01);
Abstract

Non-acoustic data from a vicinity of speech input is obtained. A subject speaker is identified as the source of the speech input from the obtained non-acoustic data by detecting mouth motion on one or more faces segmented from the non-acoustic data by comparing a first pixel intensity associated at a first time with a second pixel intensity at a second time, and selecting a face corresponding to the subject speaker from the one or more faces in response to a determination that a number of significantly changed pixels between the first pixel intensity and the second pixel intensity exceeds a threshold. A demographic is assigned to the subject speaker based on an analysis of one or more non-acoustic attributes of the subject speaker extracted from the non-acoustic data. The speech input is processed using a speech recognition system adjusted using a model selected based on the demographic.
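
The mouth-motion test described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. It assumes each face has already been segmented and its mouth region cropped to a small grayscale grid at two times. The names `count_changed_pixels`, `select_speaker`, `pixel_delta`, and `count_threshold` are hypothetical, as are the default values. The abstract also does not say how to choose among several faces that exceed the threshold, so picking the face with the most changed pixels is an assumption made here.

```python
from typing import Dict, List, Optional, Tuple

# Grayscale mouth region: rows of 0-255 pixel intensities (assumed representation).
Frame = List[List[int]]


def count_changed_pixels(first: Frame, second: Frame, pixel_delta: int) -> int:
    """Count pixels whose intensity differs by more than pixel_delta between two times."""
    return sum(
        1
        for row_a, row_b in zip(first, second)
        for a, b in zip(row_a, row_b)
        if abs(a - b) > pixel_delta
    )


def select_speaker(
    mouth_regions: Dict[str, Tuple[Frame, Frame]],
    pixel_delta: int = 30,      # hypothetical: what counts as a "significant" change
    count_threshold: int = 2,   # hypothetical: changed-pixel count that must be exceeded
) -> Optional[str]:
    """Return the face whose changed-pixel count exceeds the threshold, or None.

    Assumption: if several faces exceed the threshold, the one with the most
    changed pixels is chosen. The abstract does not specify this tie-break.
    """
    best_id: Optional[str] = None
    best_count = count_threshold
    for face_id, (first, second) in mouth_regions.items():
        changed = count_changed_pixels(first, second, pixel_delta)
        if changed > best_count:
            best_id, best_count = face_id, changed
    return best_id


# Usage: face_b's mouth region changes between the two times, face_a's does not.
still = [[10, 10], [10, 10]]
moving = [[10, 90], [80, 70]]
regions = {"face_a": (still, still), "face_b": (still, moving)}
print(select_speaker(regions))  # prints "face_b"
```

In a fuller pipeline, the selected face would then feed the demographic analysis and model selection steps that the abstract describes. Those steps are omitted here because the abstract gives no detail on how they are performed.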

