The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Aug. 19, 2014

Filed:
Sep. 26, 2011

Applicants:

Caroline Brun, Grenoble, FR;

Vassilina Nikoulina, Grenoble, FR;

Nikolaos Lagos, Grenoble, FR;

Inventors:

Caroline Brun, Grenoble, FR;

Vassilina Nikoulina, Grenoble, FR;

Nikolaos Lagos, Grenoble, FR;

Assignee:

Xerox Corporation, Norwalk, CT (US);

Attorney:

Primary Examiner:

Int. Cl.
CPC ...
G06F 17/20 (2006.01); G06F 17/28 (2006.01); G06F 17/27 (2006.01); G06F 17/21 (2006.01); G10L 21/00 (2013.01); G06F 17/30 (2006.01);

U.S. Cl.
CPC ...
G06F 17/271 (2013.01); G06F 17/2775 (2013.01); G06F 17/277 (2013.01); G06F 17/30669 (2013.01); G06F 17/278 (2013.01); G06F 17/30684 (2013.01); G06F 17/2735 (2013.01); G06F 17/30666 (2013.01); G06F 17/2818 (2013.01);

Abstract

A system and method for natural language processing of queries are provided. A lexicon includes text elements that are recognized as being a proper noun when capitalized. A natural language query includes a sequence of text elements including words. The query is processed. The processing includes a preprocessing step, in which part of speech features are assigned to the text elements in the query. This includes identifying, from a lexicon, a text element in the query which starts with a lowercase letter and assigning recapitalization information to the text element in the query, based on the lexicon. This information includes a part of speech feature of the capitalized form of the text element. Then parts of speech for the text elements in the query are disambiguated, which includes applying rules for recapitalizing text elements based on the recapitalization information.
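
The two stages described in the abstract (lexicon lookup during preprocessing, then rule-based recapitalization during part-of-speech disambiguation) can be illustrated with a small Python sketch. This is only a toy illustration, not the patented implementation: the lexicon entries, the COMMON_POS table, the tag names, the Token class, and the single disambiguation rule ("recapitalize when there is no common reading, or when the previous token is a preposition") are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple

# Hypothetical lexicon: lowercase form -> (capitalized form, POS of the capitalized form).
LEXICON = {
    "paris": ("Paris", "PROPER_NOUN"),
    "apple": ("Apple", "PROPER_NOUN"),
}

# Hypothetical POS candidates for ordinary lowercase words.
COMMON_POS = {
    "apple": {"NOUN"},
    "hotels": {"NOUN"},
    "pie": {"NOUN"},
    "recipe": {"NOUN"},
    "in": {"PREP"},
}


@dataclass
class Token:
    text: str
    pos: Set[str] = field(default_factory=set)
    # Recapitalization information: (capitalized form, its POS), if any.
    recap: Optional[Tuple[str, str]] = None


def preprocess(query: str) -> list:
    """Assign POS candidates and attach recapitalization info from the lexicon."""
    tokens = []
    for word in query.split():
        tok = Token(text=word, pos=set(COMMON_POS.get(word, {"UNKNOWN"})))
        # Only lowercase-initial tokens found in the lexicon get recap info.
        if word[:1].islower() and word in LEXICON:
            capitalized, cap_pos = LEXICON[word]
            tok.recap = (capitalized, cap_pos)
            tok.pos.add(cap_pos)
        tokens.append(tok)
    return tokens


def disambiguate(tokens: list) -> str:
    """Resolve POS candidates; recapitalize when the toy rule selects the proper noun."""
    out = []
    prev_pos = None
    for tok in tokens:
        if tok.recap is not None:
            capitalized, cap_pos = tok.recap
            common = tok.pos - {cap_pos, "UNKNOWN"}
            # Toy rule (an assumption): prefer the proper-noun reading when no
            # common reading exists or when the token follows a preposition.
            if not common or prev_pos == "PREP":
                tok.text, tok.pos = capitalized, {cap_pos}
            else:
                tok.pos = common
        prev_pos = next(iter(tok.pos)) if len(tok.pos) == 1 else None
        out.append(tok.text)
    return " ".join(out)


print(disambiguate(preprocess("hotels in paris")))
print(disambiguate(preprocess("apple pie recipe")))
```

With these made-up tables, "paris" is recapitalized because it has no common reading, while "apple" in "apple pie recipe" keeps its lowercase noun reading. A real system would rely on a full lexicon and a proper tagger or parser rather than one hand-written rule.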

