The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:

Dec. 10, 2024

Filed:

Apr. 12, 2024

Applicant:

SAS Institute Inc., Cary, NC (US);

Inventors:

Xiaolong Li, Cary, NC (US);

Xiaozhuo Cheng, Cary, NC (US);

Xu Yang, Cary, NC (US);

Assignee:

SAS INSTITUTE INC., Cary, NC (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G10L 15/22 (2006.01); G10L 15/02 (2006.01); G10L 15/04 (2013.01); G10L 15/26 (2006.01); G10L 25/30 (2013.01); G10L 25/78 (2013.01);
U.S. Cl.
CPC ...
G10L 15/26 (2013.01); G10L 15/02 (2013.01); G10L 15/04 (2013.01); G10L 25/30 (2013.01); G10L 25/78 (2013.01); G10L 2025/783 (2013.01);
Abstract

A system, method, and computer-program product includes receiving speech audio of a multi-turn conversation, generating, via a speech-to-text process, a transcript of the speech audio, wherein the transcript of the speech audio textually segments speech spoken during the multi-turn conversation into a plurality of utterances, generating a speaker diarization prompt that includes contextual information about a plurality of speakers participating in the multi-turn conversation, inputting, to a large language model, the speaker diarization prompt and the transcript of the speech audio, and obtaining, from the large language model, an output comprising an enhanced transcript of the speech audio, wherein the enhanced transcript of the speech audio textually segments the speech spoken during the multi-turn conversation into a plurality of refined utterances and associates a speaker identification value with each of the plurality of refined utterances.
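The flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the speech-to-text step is replaced by a hard-coded transcript, the large language model is replaced by a hypothetical `fake_llm` stub that returns a canned reply, and the names `Utterance`, `build_diarization_prompt`, and `enhance_transcript`, as well as the JSON reply format, are invented for this example.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utterance:
    text: str
    speaker: str = ""  # empty until a speaker identification value is assigned


def build_diarization_prompt(speaker_context: List[str]) -> str:
    """Build a prompt that carries contextual information about the speakers."""
    lines = [
        "Re-segment the transcript into utterances, one speaker per utterance.",
        'Return a JSON list of objects with the keys "speaker" and "text".',
        "Speakers:",
    ]
    lines.extend(f"- {description}" for description in speaker_context)
    return "\n".join(lines)


def enhance_transcript(
    utterances: List[Utterance],
    speaker_context: List[str],
    llm: Callable[[str, str], str],
) -> List[Utterance]:
    """Send the prompt and transcript to the model and parse its JSON reply."""
    prompt = build_diarization_prompt(speaker_context)
    transcript = "\n".join(u.text for u in utterances)
    reply = llm(prompt, transcript)
    return [
        Utterance(text=item["text"], speaker=item["speaker"])
        for item in json.loads(reply)
    ]


def fake_llm(prompt: str, transcript: str) -> str:
    """Hypothetical stand-in for a real model: ignores its inputs, returns a canned reply."""
    return json.dumps([
        {"speaker": "Agent", "text": "Thanks for calling. How can I help?"},
        {"speaker": "Customer", "text": "My card was declined."},
        {"speaker": "Agent", "text": "Let me check that for you."},
    ])


# Hypothetical speech-to-text output: the first utterance wrongly merges two speakers.
RAW_TRANSCRIPT = [
    Utterance("Thanks for calling. How can I help? My card was declined."),
    Utterance("Let me check that for you."),
]
SPEAKER_CONTEXT = [
    "Agent: support representative who opens the call",
    "Customer: caller reporting a problem",
]

for u in enhance_transcript(RAW_TRANSCRIPT, SPEAKER_CONTEXT, fake_llm):
    print(f"{u.speaker}: {u.text}")
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client; a real system would swap `fake_llm` for an actual model call and would need to validate the reply before parsing it.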

