The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
Aug. 09, 2005
Filed:
Mar. 17, 1999
Applicants:
Ponani Gopalakrishnan, Yorktown Heights, NY (US);
Dimitri Kanevsky, Ossining, NY (US);
Michael Daniel Monkowski, New Windsor, NY (US);
Jan Sedivy, Prague, CZ;
Inventors:
Ponani Gopalakrishnan, Yorktown Heights, NY (US);
Dimitri Kanevsky, Ossining, NY (US);
Michael Daniel Monkowski, New Windsor, NY (US);
Jan Sedivy, Prague, CZ;
Assignee:
International Business Machines Corporation, Armonk, NY (US);
Abstract
Systems and methods are provided for generating a language component vocabulary VC for a speech recognition system having a language vocabulary V of a plurality of word forms. One method for generating a language component vocabulary VC for a speech recognition system having a language vocabulary V of a plurality of word forms includes: partitioning the language vocabulary V into subsets of word forms based on frequencies of occurrence of the respective word forms; in at least one of the subsets, splitting word forms having frequencies less than a threshold to thereby generate word form components; and generating a language component vocabulary VC including word forms and word form components. The resulting language component vocabulary, which includes word forms and word form components, is used to generate a language model that can be efficiently implemented for real-time automatic speech recognition applications for languages with large vocabularies.
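The three steps named in the abstract (partition by frequency, split rare word forms, collect the result) can be illustrated with a minimal Python sketch. It is not the patented method. The abstract does not say how word forms are split, so the fixed-length ending split in `split_word_form`, the `+` boundary markers, and all function names, parameters, and sample counts below are assumptions made purely for illustration.

```python
def split_word_form(word, ending_len=3):
    """Hypothetical splitter: cut a word form into a stem and an ending.

    The abstract does not specify the splitting rule; a fixed-length
    ending is an assumption. Very short words are left whole.
    """
    if len(word) <= ending_len + 1:
        return [word]
    return [word[:-ending_len] + "+", "+" + word[-ending_len:]]


def build_component_vocabulary(freqs, band_edges, threshold, split_bands):
    """Partition V by frequency, split rare word forms, and return VC.

    freqs       -- dict mapping each word form in V to its occurrence count
    band_edges  -- descending frequency boundaries between subsets
    threshold   -- word forms with a count below this are split
    split_bands -- indices of the subsets in which splitting is applied
    """
    # Step 1: partition V into subsets; subset 0 holds the most frequent forms.
    bands = [set() for _ in range(len(band_edges) + 1)]
    for word, count in freqs.items():
        index = sum(count < edge for edge in band_edges)
        bands[index].add(word)

    # Steps 2 and 3: split low-frequency forms in the chosen subsets and
    # collect whole word forms and components into one vocabulary.
    vocabulary = set()
    for index, band in enumerate(bands):
        for word in band:
            if index in split_bands and freqs[word] < threshold:
                vocabulary.update(split_word_form(word))
            else:
                vocabulary.add(word)
    return bands, vocabulary


# Made-up counts, for illustration only.
counts = {"the": 1000, "walked": 40, "walking": 3, "talking": 2, "ox": 1}
bands, vc = build_component_vocabulary(
    counts, band_edges=[100, 10], threshold=5, split_bands={2}
)
print(sorted(vc))
```

In this toy run the two rare forms share the component "+ing", so the component vocabulary has the same number of entries as a vocabulary holding both forms whole; over a large inflected vocabulary, shared components like this are what would keep the vocabulary small enough for the efficient language model the abstract describes.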