The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Apr. 12, 2016

Filed:
Aug. 20, 2013

Applicant:
International Business Machines Corporation, Armonk, NY (US);

Inventors:

Ying Chen, San Jose, CA (US);

William S. Spangler, San Martin, CA (US);

Su Yan, San Jose, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 17/27 (2006.01);
U.S. Cl.
CPC ...
G06F 17/27 (2013.01); G06F 17/278 (2013.01); G06F 17/2735 (2013.01);
Abstract

According to one embodiment, a method is provided for approximate named-entity extraction from a dictionary that includes entries, where each of the entries includes one or more words. Words are read from the entries of the dictionary, and network resources are searched to determine a frequency of occurrence of the words on the network resources. In view of the frequency of occurrence of the words located on the network resources, domain relevancy of the words in the entries of the dictionary is determined. A domain repository is created using top-ranked words as determined by the domain relevancy of the words. In view of the domain repository, signatures for both the entries of the dictionary and strings of an input document are computed. The strings of the input document are filtered by comparing the signatures of the strings against the signatures of the entries to identify approximate-match entity names.
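
The steps in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not say how frequencies are gathered, how relevancy is scored, or what a signature contains. The sketch assumes that network search is replaced by two precomputed count tables (`domain_freq` and `general_freq`, both hypothetical), that relevancy is the ratio of domain count to general count, and that a signature is the set of repository words a string contains. All function names are illustrative.

```python
def domain_relevancy(entries, domain_freq, general_freq):
    """Score every dictionary word; higher means more domain-specific.

    Assumption: relevancy = domain count / (general count + 1).
    The count tables stand in for searching network resources.
    """
    words = {w.lower() for entry in entries for w in entry.split()}
    return {
        w: domain_freq.get(w, 0) / (general_freq.get(w, 0) + 1)
        for w in words
    }


def build_repository(scores, top_k):
    """Keep the top_k highest-scoring words (ties broken alphabetically)."""
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return set(ranked[:top_k])


def signature(text, repository):
    """Assumed signature: the set of repository words present in the text."""
    return frozenset(w for w in text.lower().split() if w in repository)


def filter_candidates(entries, document, repository):
    """Return (document string, entry) pairs whose signatures are equal.

    Each entry is compared only against document windows with the same
    number of tokens; this windowing is a simplification of the sketch.
    """
    tokens = document.split()
    hits = []
    for entry in entries:
        entry_sig = signature(entry, repository)
        if not entry_sig:
            continue  # entry has no repository word, so it cannot be filtered
        n = len(entry.split())
        for i in range(len(tokens) - n + 1):
            window = " ".join(tokens[i:i + n])
            if signature(window, repository) == entry_sig:
                hits.append((window, entry))
    return hits


# Toy data, invented for illustration only.
entries = ["acetylsalicylic acid", "sodium chloride"]
domain_freq = {"acetylsalicylic": 50, "acid": 900, "sodium": 400, "chloride": 300}
general_freq = {"acid": 2000, "sodium": 500, "chloride": 100}

scores = domain_relevancy(entries, domain_freq, general_freq)
repository = build_repository(scores, top_k=2)
hits = filter_candidates(
    entries, "patients took acetylsalicylic acids daily", repository
)
```

In this sketch the signature comparison is only a cheap filter. The surviving pairs, which include the inexact string "acetylsalicylic acids", would still need a finer similarity check (for example an edit-distance threshold) before being accepted as approximate matches.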

