The patent badge is an abbreviated version of the USPTO patent document and contains a link to the full patent document.

G10L 15/22 (2006.01); G10L 15/18 (2013.01); G10L 25/51 (2013.01); G10L 25/03 (2013.01); G06F 17/27 (2006.01); G10L 15/06 (2013.01); G10L 25/18 (2013.01); G10L 25/60 (2013.01); G10L 25/87 (2013.01); G10L 25/90 (2013.01); G10L 15/183 (2013.01);

U.S. Cl.

CPC ...

G10L 15/1807 (2013.01); G06F 17/277 (2013.01); G10L 15/063 (2013.01); G10L 15/22 (2013.01); G10L 25/03 (2013.01); G10L 25/18 (2013.01); G10L 25/51 (2013.01); G10L 25/60 (2013.01); G10L 25/87 (2013.01); G10L 25/90 (2013.01); G10L 15/183 (2013.01);

Abstract

Prosodic features are used for discriminating computer-directed speech from human-directed speech. Statistics and models describing energy/intensity patterns over time, speech/pause distributions, pitch patterns, vocal effort features, and speech segment duration patterns may be used for prosodic modeling. The prosodic features for at least a portion of an utterance are monitored over a period of time to determine a shape associated with the utterance. A score may be determined to assist in classifying the current utterance as human directed or computer directed without relying on knowledge of utterances preceding or following the current utterance. Outside data may be used for training lexical addressee detection systems for the human-human-computer (H-H-C) scenario. Human-computer (H-C) training data can be obtained from a single-user H-C collection, and human-human (H-H) speech can be modeled using general conversational speech. H-C and H-H language models may also be adapted using interpolation with small amounts of matched H-H-C data.
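
The lexical part of the abstract (class language models trained on outside data, adapted by interpolation with a small amount of matched data, then used to score a single utterance) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the use of smoothed unigram models, the interpolation weight lam, the log-likelihood-ratio score, the function names, and the toy sentences.

```python
import math
from collections import Counter


def train_unigram(sentences, vocab, alpha=1.0):
    """Add-alpha smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in s.split() if w in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def interpolate(p_outside, p_matched, lam):
    """Linear interpolation: lam * matched model + (1 - lam) * outside model."""
    return {w: lam * p_matched[w] + (1.0 - lam) * p_outside[w] for w in p_outside}


def llr_score(utterance, p_hc, p_hh):
    """Per-word log-likelihood ratio; a positive value favors computer-directed."""
    words = [w for w in utterance.split() if w in p_hc]
    if not words:
        return 0.0
    return sum(math.log(p_hc[w] / p_hh[w]) for w in words) / len(words)


# Hypothetical toy data (illustrative only).
hc_outside = ["play some music", "show me the weather", "call mom"]
hh_outside = ["i think we should go", "what do you want to eat",
              "yeah that sounds good"]
hc_matched = ["show me pizza places"]  # small matched H-H-C sample, H-C side
hh_matched = ["do you want pizza"]     # small matched H-H-C sample, H-H side

vocab = {w for s in hc_outside + hh_outside + hc_matched + hh_matched
         for w in s.split()}

lam = 0.3  # assumed interpolation weight; would normally be tuned on held-out data
p_hc = interpolate(train_unigram(hc_outside, vocab),
                   train_unigram(hc_matched, vocab), lam)
p_hh = interpolate(train_unigram(hh_outside, vocab),
                   train_unigram(hh_matched, vocab), lam)

for utt in ["show me the weather", "what do you want"]:
    label = "computer directed" if llr_score(utt, p_hc, p_hh) > 0 else "human directed"
    print(f"{utt!r}: {label}")
```

The score depends only on the words of the current utterance, which mirrors the abstract's point that classification need not rely on neighboring utterances. A practical system would likely use higher-order n-gram models and combine this lexical score with the prosodic features described above.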
