The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Oct. 07, 2025

Filed:
Apr. 07, 2023

Applicant:
Microsoft Technology Licensing, LLC, Redmond, WA (US);

Inventors:

Adrian Wyatt Bonar, Seattle, WA (US);

Jennifer Fox, Seattle, WA (US);

Nicole E. Berdy, Cambridge, MA (US);

Mollie Munoz, Redmond, WA (US);

Shawn Callegari, Redmond, WA (US);

Devis Lucato, Redmond, WA (US);

Ryan H. Volum, Seattle, WA (US);

Assignee:
Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G10L 13/10 (2013.01); G10L 15/26 (2006.01); G10L 13/02 (2013.01); G10L 13/08 (2013.01); G10L 15/18 (2013.01); G10L 15/22 (2006.01); G10L 25/63 (2013.01);
U.S. Cl.
CPC ...
G10L 13/10 (2013.01); G10L 15/26 (2013.01); G10L 13/02 (2013.01); G10L 13/08 (2013.01); G10L 2013/083 (2013.01); G10L 15/1815 (2013.01); G10L 2015/225 (2013.01); G10L 25/63 (2013.01);
Abstract

The techniques disclosed herein enable systems for spoken natural stylistic conversations with large language models. In contrast to many existing modalities for interacting with large language models that are limited to text, the techniques presented herein enable users to carry on a fully spoken conversation with a large language model. This is accomplished by converting a user speech audio input to text and utilizing a prompt engine to analyze a sentiment expressed by the user. A large language model, having been trained on example conversations, generates a text response as well as a style cue to express emotion in response to the sentiment expressed by the speech audio input. A text-to-speech engine can subsequently interpret the text response and style cue to generate an audio output which emulates the sensation of human conversation.
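The data flow described in the abstract (speech to text, sentiment analysis by a prompt engine, a language model that returns text plus a style cue, then text-to-speech) can be sketched roughly as follows. This is only an illustrative outline, not the patented implementation: every name here (`transcribe`, `detect_sentiment`, `build_prompt`, `stub_language_model`, `synthesize`, `converse`, `StyledReply`) is hypothetical, the keyword-based sentiment check is a stand-in for whatever analysis the prompt engine performs, and the "audio" on both ends is represented by plain bytes and strings so the example runs without any speech library.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StyledReply:
    """Hypothetical container for the model's text response and style cue."""
    text: str
    style_cue: str


def transcribe(audio: bytes) -> str:
    # Stand-in for a speech-to-text engine: the "audio" is just UTF-8 text.
    return audio.decode("utf-8")


# Assumed toy vocabularies; a real system would use a proper sentiment model.
NEGATIVE_WORDS = {"sad", "upset", "angry", "frustrated"}
POSITIVE_WORDS = {"happy", "great", "excited", "glad"}


def detect_sentiment(text: str) -> str:
    # Naive keyword lookup standing in for the prompt engine's sentiment analysis.
    words = {w.strip(".,!?").lower() for w in text.split()}
    if words & NEGATIVE_WORDS:
        return "negative"
    if words & POSITIVE_WORDS:
        return "positive"
    return "neutral"


def build_prompt(user_text: str, sentiment: str) -> str:
    # The prompt carries both the transcript and the detected sentiment.
    return f"User sentiment: {sentiment}\nUser: {user_text}\nAssistant:"


def stub_language_model(prompt: str) -> StyledReply:
    # Canned replies standing in for a large language model call.
    if "User sentiment: negative" in prompt:
        return StyledReply("I'm sorry to hear that. How can I help?", "empathetic")
    if "User sentiment: positive" in prompt:
        return StyledReply("That's wonderful to hear!", "cheerful")
    return StyledReply("Okay, tell me more.", "neutral")


def synthesize(reply: StyledReply) -> str:
    # Stand-in for a text-to-speech engine: tags the text with its style cue.
    return f"[{reply.style_cue}] {reply.text}"


def converse(
    audio: bytes,
    model: Callable[[str], StyledReply] = stub_language_model,
) -> str:
    # One conversational turn: audio in, styled "audio" out.
    text = transcribe(audio)
    sentiment = detect_sentiment(text)
    reply = model(build_prompt(text, sentiment))
    return synthesize(reply)


if __name__ == "__main__":
    print(converse(b"I am so frustrated today."))
```

Passing the model in as a callable keeps the sketch independent of any particular language model or speech service; each stub could be swapped for a real component without changing `converse`.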

