The patent badge is an abbreviated version of the USPTO patent document. It lists the patent number, issue date, filing date, title, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract, and it links to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent: Jun. 20, 2017

Filed: Mar. 05, 2015

Applicant: International Business Machines Corporation, Armonk, NY (US);

Inventors:

Branimir K. Boguraev, Bedford, NY (US);

Esme Manandise, Tallahassee, FL (US);

Benjamin P. Segal, Hyde Park, NY (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06F 17/27 (2006.01); G06F 17/30 (2006.01);
U.S. Cl.
CPC ...
G06F 17/2735 (2013.01); G06F 17/2755 (2013.01); G06F 17/30654 (2013.01);
Abstract

According to an aspect, a candidate token sequence including one or more word tokens is extracted from an unstructured domain glossary that includes entries associated with a domain. A look-up operation is performed to retrieve language data for each word token in the candidate token sequence and annotates each word token in the candidate token sequence found by the look-up operation with corresponding retrieved language data to form an annotated sequence. A pattern match of the annotated sequence is performed relative to a repository of patterns and identifies a best matching pattern from the repository of patterns to the annotated sequence based on matching criteria. The annotated sequence is refined with lexical information associated with the best matching pattern as a refined annotated sequence. The candidate token sequence and the refined annotated sequence are output to a domain-specific computational lexicon file.
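The steps named in the abstract (extract, look up, annotate, pattern-match, refine, output) can be illustrated with a small Python sketch. This is not the patented implementation: the glossary format ("term: definition"), the toy part-of-speech table LANGUAGE_DATA, the PATTERNS repository, the scoring rule used as the "matching criteria", and the JSON Lines output format are all assumptions made up for illustration.

```python
import json
from dataclasses import dataclass

# Hypothetical language data: a toy part-of-speech table (assumption).
LANGUAGE_DATA = {"acute": "ADJ", "renal": "ADJ", "failure": "NOUN", "of": "ADP"}


@dataclass(frozen=True)
class Pattern:
    tags: tuple     # expected part-of-speech tag for each token
    category: str   # lexical information attached when the pattern matches
    head: int       # index of the head word in the sequence


# Hypothetical repository of patterns (assumption).
PATTERNS = [
    Pattern(("ADJ", "NOUN"), "noun_phrase", 1),
    Pattern(("ADJ", "ADJ", "NOUN"), "noun_phrase", 2),
    Pattern(("NOUN", "ADP", "NOUN"), "prepositional_noun_phrase", 0),
]


def extract_candidate(entry):
    """Take the term before the first colon and split it into word tokens."""
    return entry.split(":", 1)[0].lower().split()


def annotate(tokens):
    """Pair each token with its looked-up tag, or None if the look-up fails."""
    return [(tok, LANGUAGE_DATA.get(tok)) for tok in tokens]


def score(annotated, pattern):
    """Assumed matching criterion: equal length, no conflicting tag.

    Returns the number of agreeing tags, or -1 if the pattern cannot match.
    Tokens without a tag are compatible with any pattern slot but earn no credit.
    """
    if len(annotated) != len(pattern.tags):
        return -1
    agree = 0
    for (_, tag), expected in zip(annotated, pattern.tags):
        if tag is None:
            continue
        if tag != expected:
            return -1
        agree += 1
    return agree


def best_pattern(annotated, patterns):
    """Return the highest-scoring pattern, or None if nothing matches."""
    best = max(patterns, key=lambda p: score(annotated, p), default=None)
    if best is None or score(annotated, best) < 0:
        return None
    return best


def refine(annotated, pattern):
    """Fill missing tags from the pattern and attach its lexical information."""
    tokens = [
        {"token": tok, "tag": tag or expected}
        for (tok, tag), expected in zip(annotated, pattern.tags)
    ]
    return {
        "category": pattern.category,
        "head": tokens[pattern.head]["token"],
        "tokens": tokens,
    }


def process(entries):
    """Run extract, annotate, match, and refine over glossary entries."""
    records = []
    for entry in entries:
        tokens = extract_candidate(entry)
        annotated = annotate(tokens)
        pattern = best_pattern(annotated, PATTERNS)
        if pattern is None:
            continue  # no pattern fits; this sketch simply skips the entry
        records.append({"candidate": tokens, "refined": refine(annotated, pattern)})
    return records


def write_lexicon(records, path):
    """Write one JSON object per line to the lexicon file (format is assumed)."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")
```

Calling process(["Acute nephritis: inflammation of the kidney"]) shows the refinement step at work: "nephritis" is missing from the toy table, so its tag is filled in from the matched pattern. A real system would presumably use a far richer lexical resource and matching criteria than this exact-length, exact-tag comparison.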

