The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Dec. 27, 2022

Filed:

Jun. 28, 2022

Applicant:

SAS Institute Inc., Cary, NC (US);

Inventors:

Xiaolong Li, Cary, NC (US);

Samuel Norris Henderson, Raleigh, NC (US);

Xiaozhuo Cheng, Cary, NC (US);

Xu Yang, Cary, NC (US);

Assignee:

SAS INSTITUTE INC., Cary, NC (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G10L 15/04 (2013.01); G10L 15/16 (2006.01); G10L 15/26 (2006.01); G10L 25/78 (2013.01); G10L 25/30 (2013.01); G10L 15/02 (2006.01);
U.S. Cl.
CPC ...
G10L 15/26 (2013.01); G10L 15/02 (2013.01); G10L 15/04 (2013.01); G10L 25/30 (2013.01); G10L 25/78 (2013.01); G10L 2025/783 (2013.01);
Abstract

An apparatus includes at least one processor to, in response to a request to perform speech-to-text conversion: perform a pause detection technique including analyzing speech audio to identify pauses, and analyzing lengths of the pauses to identify likely sentence pauses; perform a speaker diarization technique including dividing the speech audio into fragments, analyzing vocal characteristics of speech sounds of each fragment to identify a speaker of a set of speakers, and identifying instances of a change in speakers between each temporally consecutive pair of fragments to identify likely speaker changes; and perform speech-to-text operations including dividing the speech audio into segments based on at least the likely sentence pauses and likely speaker changes, using at least an acoustic model with each segment to identify likely speech sounds in the speech audio, and generating a transcript of the speech audio based at least on the likely speech sounds.
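
The segmentation step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not say how pauses are detected or how lengths are analyzed, so the sketch assumes per-frame energy values, a fixed silence threshold, and a fixed minimum pause length. Speaker-change positions are taken as given input rather than computed, and every function and parameter name is hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start_frame, end_frame), end exclusive


def find_pauses(energies: List[float], silence_threshold: float) -> List[Span]:
    """Return runs of consecutive frames whose energy is below the threshold.

    Assumption: a 'pause' is a run of low-energy frames; the source does not
    specify the detection method.
    """
    pauses: List[Span] = []
    start = None
    for i, energy in enumerate(energies):
        if energy < silence_threshold:
            if start is None:
                start = i
        elif start is not None:
            pauses.append((start, i))
            start = None
    if start is not None:  # audio ends during a pause
        pauses.append((start, len(energies)))
    return pauses


def likely_sentence_pauses(pauses: List[Span], min_frames: int) -> List[Span]:
    """Keep only pauses long enough to be treated as likely sentence pauses.

    Assumption: a fixed length cutoff stands in for the length analysis.
    """
    return [(s, e) for s, e in pauses if e - s >= min_frames]


def segment_boundaries(
    sentence_pauses: List[Span],
    speaker_changes: List[int],
    total_frames: int,
) -> List[Span]:
    """Divide the audio into segments at pause midpoints and speaker changes."""
    cuts = {(s + e) // 2 for s, e in sentence_pauses}
    cuts.update(speaker_changes)
    ordered = sorted(c for c in cuts if 0 < c < total_frames)
    edges = [0] + ordered + [total_frames]
    return list(zip(edges[:-1], edges[1:]))


# Hypothetical input: 12 frames with one short and one long pause.
energies = [5, 5, 0, 5, 5, 0, 0, 0, 0, 5, 5, 5]
pauses = find_pauses(energies, silence_threshold=1)
sentence = likely_sentence_pauses(pauses, min_frames=3)
segments = segment_boundaries(sentence, speaker_changes=[10], total_frames=len(energies))
```

In this sketch each resulting segment would then be passed to an acoustic model, which is outside the scope of the example.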
