The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:

Aug. 19, 2025

Filed:

Feb. 13, 2024

Applicant:

Microsoft Technology Licensing, LLC, Redmond, WA (US);

Inventors:

Naoyuki Kanda, Bellevue, WA (US);

Xuankai Chang, Baltimore, MD (US);

Yashesh Gaur, Redmond, WA (US);

Xiaofei Wang, Bellevue, WA (US);

Zhong Meng, Mercer Island, WA (US);

Takuya Yoshioka, Bellevue, WA (US);

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G10L 15/00 (2013.01); G10L 15/22 (2006.01); G10L 15/26 (2006.01); G10L 17/02 (2013.01); G10L 19/022 (2013.01); G10L 21/0272 (2013.01);
U.S. Cl.
CPC ...
G10L 17/02 (2013.01); G10L 15/22 (2013.01); G10L 15/26 (2013.01); G10L 19/022 (2013.01); G10L 21/0272 (2013.01);
Abstract

A hypothesis stitcher for speech recognition of long-form audio provides superior performance, such as higher accuracy and reduced computational cost. An example disclosed operation includes: segmenting the audio stream into a plurality of audio segments; identifying a plurality of speakers within each of the plurality of audio segments; performing automatic speech recognition (ASR) on each of the plurality of audio segments to generate a plurality of short-segment hypotheses; merging at least a portion of the short-segment hypotheses into a first merged hypothesis set; inserting stitching symbols into the first merged hypothesis set, the stitching symbols including a window change (WC) symbol; and consolidating, with a network-based hypothesis stitcher, the first merged hypothesis set into a first consolidated hypothesis. Multiple variations are disclosed, including alignment-based stitchers and serialized stitchers, which may operate as speaker-specific stitchers or multi-speaker stitchers, and may further support multiple options for differing hypothesis configurations.
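
The segmenting, merging, and symbol-insertion steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The `<WC>` token spelling, the fixed window and hop sizes, and the function names `segment_bounds` and `merge_with_wc` are all assumptions made for illustration. The network-based stitcher that consolidates the merged hypothesis set is not modeled here.

```python
from typing import List, Tuple

# Hypothetical spelling of the window change (WC) stitching symbol.
WC = "<WC>"


def segment_bounds(total: int, window: int, hop: int) -> List[Tuple[int, int]]:
    """Split a stream of `total` samples into (start, end) windows.

    Assumption: fixed-size windows advanced by a fixed hop; the abstract
    only says the stream is segmented, not how.
    """
    bounds = []
    start = 0
    while start < total:
        bounds.append((start, min(start + window, total)))
        if start + window >= total:
            break  # the last window already reaches the end of the stream
        start += hop
    return bounds


def merge_with_wc(hypotheses: List[List[str]]) -> List[str]:
    """Concatenate per-segment word hypotheses, inserting WC between segments."""
    merged: List[str] = []
    for index, hypothesis in enumerate(hypotheses):
        if index > 0:
            merged.append(WC)  # mark the boundary between adjacent windows
        merged.extend(hypothesis)
    return merged
```

For example, `merge_with_wc([["hello", "world"], ["world", "again"]])` yields one token sequence with a single `<WC>` between the two segments. In the described operation, a sequence like this would then be passed to the stitcher for consolidation.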

