The patent badge is an abbreviated version of the USPTO patent document and contains a link to the full patent document.

G10L 15/04 (2013.01); G10L 15/16 (2006.01); G10L 15/26 (2006.01); G10L 25/78 (2013.01); G10L 25/30 (2013.01); G10L 15/02 (2006.01);

U.S. Cl.

CPC ...

G10L 15/26 (2013.01); G10L 15/02 (2013.01); G10L 15/04 (2013.01); G10L 25/30 (2013.01); G10L 25/78 (2013.01); G10L 2025/783 (2013.01);

Abstract

An apparatus includes at least one processor to, in response to a request to perform speech-to-text conversion: perform a pause detection technique including analyzing speech audio to identify pauses, and analyzing lengths of the pauses to identify likely sentence pauses; perform a speaker diarization technique including dividing the speech audio into fragments, analyzing vocal characteristics of speech sounds of each fragment to identify a speaker of a set of speakers, and identifying instances of a change in speakers between each temporally consecutive pair of fragments to identify likely speaker changes; and perform speech-to-text operations including dividing the speech audio into segments based on at least the likely sentence pauses and likely speaker changes, using at least an acoustic model with each segment to identify likely speech sounds in the speech audio, and generating a transcript of the speech audio based at least on the likely speech sounds.
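The pause-detection and segmentation steps described in the abstract can be illustrated with a minimal sketch. It is not the patented method. It assumes that pauses are found by thresholding per-frame energy, that a pause counts as a likely sentence pause when it meets a minimum length, and that speaker-change times are supplied by a separate diarization step. The function names, the thresholds, and the choice to cut at the midpoint of each pause are all hypothetical.

```python
from typing import Iterable, List, Tuple

Pause = Tuple[float, float]  # (start_seconds, end_seconds)


def find_pauses(energies: List[float], frame_s: float,
                silence_thresh: float) -> List[Pause]:
    """Return one (start, end) pair per run of consecutive low-energy frames.

    Assumption: a frame is "silent" when its energy is below silence_thresh.
    """
    pauses: List[Pause] = []
    run_start = None
    for i, energy in enumerate(energies):
        if energy < silence_thresh:
            if run_start is None:
                run_start = i  # a silent run begins here
        elif run_start is not None:
            pauses.append((run_start * frame_s, i * frame_s))
            run_start = None
    if run_start is not None:  # audio ended during a silent run
        pauses.append((run_start * frame_s, len(energies) * frame_s))
    return pauses


def likely_sentence_pauses(pauses: Iterable[Pause],
                           min_len_s: float) -> List[Pause]:
    """Keep only pauses at least min_len_s long (hypothetical length rule)."""
    return [(s, e) for s, e in pauses if e - s >= min_len_s]


def segment_boundaries(sentence_pauses: Iterable[Pause],
                       speaker_changes: Iterable[float]) -> List[float]:
    """Merge pause midpoints and speaker-change times into sorted cut points."""
    cuts = {(s + e) / 2 for s, e in sentence_pauses}
    cuts.update(speaker_changes)
    return sorted(cuts)
```

With frames of 0.25 s, a one-frame dip in energy is treated as an ordinary pause, while a four-frame dip passes a 0.5 s minimum and becomes a segment boundary alongside any speaker-change times. Each resulting segment would then be passed to an acoustic model, which is outside the scope of this sketch.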
