The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
May 14, 2019

Filed:

Feb. 27, 2017

Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Laura Chiticariu, San Jose, CA (US);

Jeffrey Thomas Kreulen, San Jose, CA (US);

Rajasekar Krishnamurthy, Campbell, CA (US);

Prithviraj Sen, San Jose, CA (US);

Shivakumar Vaithyanathan, San Jose, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 17/00 (2019.01); G06N 99/00 (2019.01); G06F 17/30 (2006.01); G06N 5/02 (2006.01); G06F 17/24 (2006.01);
U.S. Cl.
CPC ...
G06N 99/005 (2013.01); G06F 17/30705 (2013.01); G06N 5/025 (2013.01); G06F 17/241 (2013.01);
Abstract

One embodiment provides a method for developing a text analytics program for extracting at least one target concept including: utilizing at least one processor to execute computer code that performs the steps of: initiating a development tool that accepts user input to develop rules for extraction of features of the at least one target concept within a dataset comprising textual information; developing, using the rules for feature extraction, an evaluation dataset comprising at least one document annotated with the at least one target concept to be extracted by the text analytics program; creating, using the rules for feature extraction, a rule-based annotator to extract the at least one target concept; training, using the evaluation dataset, a machine-learning annotator to extract the at least one target concept within the dataset; combining the rule-based annotator and the machine learning annotator to form a combined annotator; evaluating, using the evaluation dataset, extraction performance of the combined annotator against a predetermined threshold; and publishing, when the extraction performance of the combined annotator exceeds the predetermined threshold, the combined annotator for use in an application that extracts the at least one target concept from a plurality of datasets.
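
The workflow in the abstract (rule-based annotator, trained annotator, combination, evaluation against a threshold, conditional publishing) can be illustrated with a minimal Python sketch. This is not the patented implementation: the company-name rule, the memorizing stand-in for the machine-learning annotator, the F1 metric, the 0.9 threshold, and every function and variable name below are assumptions made for illustration only.

```python
import re
from typing import Callable, List, Optional, Set, Tuple

Annotator = Callable[[str], Set[str]]
Dataset = List[Tuple[str, Set[str]]]  # (document text, annotated target mentions)

# Hypothetical extraction rule: a capitalized word followed by a company suffix.
COMPANY_RULE = re.compile(r"\b[A-Z][a-z]+ (?:Inc|Corp|Ltd)\b")


def rule_based_annotator(text: str) -> Set[str]:
    """Extract target mentions using the hand-written rule."""
    return set(COMPANY_RULE.findall(text))


def train_ml_annotator(dataset: Dataset) -> Annotator:
    """Stand-in for a learned model: memorize mentions seen in the annotations."""
    known = {mention for _, gold in dataset for mention in gold}

    def annotate(text: str) -> Set[str]:
        return {mention for mention in known if mention in text}

    return annotate


def combine(first: Annotator, second: Annotator) -> Annotator:
    """Combined annotator: union of both annotators' extractions (assumed strategy)."""
    return lambda text: first(text) | second(text)


def f1_score(annotator: Annotator, dataset: Dataset) -> float:
    """Micro-averaged F1 of the annotator over the annotated documents."""
    tp = fp = fn = 0
    for text, gold in dataset:
        predicted = annotator(text)
        tp += len(predicted & gold)
        fp += len(predicted - gold)
        fn += len(gold - predicted)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def publish_if_good(annotator: Annotator, dataset: Dataset,
                    threshold: float) -> Optional[Annotator]:
    """Return the annotator only when its score exceeds the threshold."""
    return annotator if f1_score(annotator, dataset) > threshold else None


# Tiny illustrative evaluation dataset (invented for this sketch).
evaluation = [
    ("Acme Corp hired ten engineers.", {"Acme Corp"}),
    ("She joined IBM last year.", {"IBM"}),
    ("Globex Inc and IBM signed a deal.", {"Globex Inc", "IBM"}),
]

combined = combine(rule_based_annotator, train_ml_annotator(evaluation))
published = publish_if_good(combined, evaluation, threshold=0.9)
```

Following the abstract, the same evaluation dataset is used here for both training and evaluation; a real project would normally score the combined annotator on held-out documents. In this toy data the rule alone misses "IBM", so only the combined annotator clears the assumed threshold.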

