The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Sep. 11, 2007
Filed:
Jul. 15, 2005
Joseph E. Pentheroudakis, Seattle, WA (US);
David G. Bradlee, Seattle, WA (US);
Sonja S. Knoll, Redmond, WA (US);
Microsoft Corporation, Redmond, WA (US);
Abstract
The present invention is a segmenter used in a natural language processing system. The segmenter segments a textual input string into tokens for further natural language processing. In accordance with one feature of the invention, the segmenter includes a tokenizer engine that proposes segmentations and submits them to a linguistic knowledge component for validation. In accordance with another feature of the invention, the segmentation system includes language-specific data that contains a precedence hierarchy for punctuation. If proposed tokens in the input string contain punctuation, they can illustratively be broken into subtokens based on the precedence hierarchy.