The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.
The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.
Patent No.:
Date of Patent:
Dec. 09, 2008
Filed:
Apr. 16, 2004
Xiaoqiang Luo, Ardsley, NY (US);
Robert Todd Ward, Croton on Hudson, NY (US);
Xiaoqiang Luo, Ardsley, NY (US);
Robert Todd Ward, Croton on Hudson, NY (US);
International Business Machines Corporation, Armonk, NY (US);
Abstract
A parser is provided that parses a Chinese text stream at the character level and builds a syntactic structure of Chinese character sequences. A character-based syntactic parse tree contains word boundaries, part-of-speech tags, and phrasal structure information. Syntactic knowledge constrains the system when it determines word boundaries. A deterministic procedure is used to convert word-based parse trees into character-based trees. Character-level tags are derived from word-level part-of-speech tags and word-boundary information is encoded with a positional tag. Word-level parts-of-speech become a constituent label in character-based trees. A maximum entropy parser is then built and tested.