The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Apr. 06, 2010

Filed:

Mar. 10, 2004

Applicants:

Mei-Yuh Hwang, Sammamish, WA (US);

Li Jiang, Redmond, WA (US);

Inventors:

Mei-Yuh Hwang, Sammamish, WA (US);

Li Jiang, Redmond, WA (US);

Assignee:

Microsoft Corporation, Redmond, WA (US);

Attorneys:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G10L 15/04 (2006.01);
U.S. Cl.
CPC ...
Abstract

A method and apparatus are provided for segmenting words into component parts. Under the invention, mutual information scores for pairs of graphoneme units found in a set of words are determined. Each graphoneme unit includes at least one letter. The graphoneme units of one pair of graphoneme units are combined based on the mutual information score. This forms a new graphoneme unit. Under one aspect of the invention, a syllable n-gram model is trained based on words that have been segmented into syllables using mutual information. The syllable n-gram model is used to segment a phonetic representation of a new word into syllables. Similarly, an inventory of morphemes is formed using mutual information and a morpheme n-gram is trained that can be used to segment a new word into a sequence of morphemes.
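The merging step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the patent: units are plain letter strings (real graphoneme units also carry phoneme information), only adjacent pairs are scored, and the score is the weighted pointwise form p(a,b) · log(p(a,b) / (p(a) · p(b))). The names `pair_scores`, `merge_pair`, and `merge_best_pair` are hypothetical.

```python
import math
from collections import Counter


def pair_scores(segmented_words):
    """Score each adjacent unit pair as p(a,b) * log(p(a,b) / (p(a) * p(b))).

    Assumption: this weighted pointwise form stands in for the patent's
    mutual information score, which the abstract does not spell out.
    """
    unit_counts = Counter()
    pair_counts = Counter()
    for units in segmented_words:
        unit_counts.update(units)
        pair_counts.update(zip(units, units[1:]))
    total_units = sum(unit_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (a, b), n in pair_counts.items():
        p_ab = n / total_pairs
        p_a = unit_counts[a] / total_units
        p_b = unit_counts[b] / total_units
        scores[(a, b)] = p_ab * math.log(p_ab / (p_a * p_b))
    return scores


def merge_pair(units, pair):
    """Replace every adjacent occurrence of `pair` with one combined unit."""
    merged = []
    i = 0
    while i < len(units):
        if i + 1 < len(units) and (units[i], units[i + 1]) == pair:
            merged.append(units[i] + units[i + 1])
            i += 2
        else:
            merged.append(units[i])
            i += 1
    return merged


def merge_best_pair(segmented_words):
    """Combine the highest-scoring pair across all words into a new unit."""
    scores = pair_scores(segmented_words)
    if not scores:
        return segmented_words, None
    best = max(scores, key=scores.get)
    return [merge_pair(units, best) for units in segmented_words], best


# Toy data (illustrative only): each word starts as a list of single letters.
words = [list(w) for w in ["this", "that", "then", "sing", "ring"]]
words, best = merge_best_pair(words)
print(best, words)
```

Calling `merge_best_pair` repeatedly on its own output would grow larger units one merge at a time. When to stop merging is left open here, since the abstract does not state a stopping rule.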

Published as:
CN1667699A; EP1575029A2; US2005203739A1; JP2005258439A; KR20060043825A; EP1575029A3; US7693715B2; CN1667699B; KR100996817B1; EP1575029B1; ATE508453T1; DE602005027770D1;
