The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Jul. 17, 2001
Filed:
Mar. 20, 2000
Inventors:
David E. Heckerman, Bellevue, WA (US);
Fileno A. Alleva, Redmond, WA (US);
Robert L. Rounthwaite, Fall City, WA (US);
Daniel Rosen, Bellevue, WA (US);
Mei-Yuh Hwang, Redmond, WA (US);
Yoram Yaacovi, Redmond, WA (US);
John L. Manferdelli, Redmond, WA (US);
Assignee:
Microsoft Corporation, Redmond, WA (US);
Abstract
Automated methods and apparatus for synchronizing audio and text data, e.g., in the form of electronic files, representing audio and text expressions of the same work or information are described. Also described are automated methods of detecting errors and other discrepancies between the audio and text versions of the same work. A speech recognition operation is performed on the audio data initially using a speaker independent acoustic model. The recognized text in addition to audio time stamps are produced by the speech recognition operation. The recognized text is compared to the text in text data to identify correctly recognized words. The acoustic model is then retrained using the correctly recognized text and corresponding audio segments from the audio data transforming the initial acoustic model into a speaker trained acoustic model. The retrained acoustic model is then used to perform an additional speech recognition operation on the audio data. The audio and text data are synchronized using the results of the updated acoustic model. In addition, one or more error reports based on the final recognition results are generated showing discrepancies between the recognized words and the words included in the text. By retraining the acoustic model in the above described manner, improved accuracy is achieved.
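
The comparison step in the abstract (matching recognized words against the reference text, then using the matches for synchronization and the mismatches for an error report) can be illustrated with a minimal sketch. This is not the patented implementation: the names `RecognizedWord` and `align` are hypothetical, the recognizer output is assumed to already exist as a list of words with time stamps, the alignment uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever comparison the patent uses, and the acoustic model retraining step is not shown.

```python
from difflib import SequenceMatcher
from typing import List, NamedTuple, Tuple


class RecognizedWord(NamedTuple):
    """Hypothetical recognizer output: one word plus its audio time stamps."""
    word: str
    start: float  # seconds from the start of the audio
    end: float


def align(
    recognized: List[RecognizedWord], reference: List[str]
) -> Tuple[List[Tuple[int, float, float]], List[Tuple[str, List[str], List[str]]]]:
    """Compare recognized words with the reference text.

    Returns (sync_points, discrepancies):
      sync_points   -- (reference word index, start, end) for each matched word
      discrepancies -- (kind, reference words, recognized words) for each mismatch
    """
    ref_words = [w.lower() for w in reference]
    rec_words = [r.word.lower() for r in recognized]
    # autojunk=False so frequent words are not silently ignored on long texts.
    matcher = SequenceMatcher(a=ref_words, b=rec_words, autojunk=False)

    sync_points = []
    discrepancies = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Correctly recognized words: attach audio times to text positions.
            for k in range(i2 - i1):
                r = recognized[j1 + k]
                sync_points.append((i1 + k, r.start, r.end))
        else:
            # 'replace', 'delete', or 'insert': candidate entry for an error report.
            discrepancies.append(
                (tag, reference[i1:i2], [r.word for r in recognized[j1:j2]])
            )
    return sync_points, discrepancies
```

In a pipeline like the one the abstract describes, the matched words and their audio segments would presumably serve as training material for the speaker-trained model, and the same comparison would be repeated on the second recognition pass to produce the final synchronization and error reports.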