The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Aug. 20, 2019

Filed:
Feb. 26, 2015

Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Julius Goth, III, Franklinton, NC (US);

Dwi Sianto Mansjur, Cary, NC (US);

Kyle L. Croutwater, Chapel Hill, NC (US);

Beata Strack, Durham, NC (US);

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06F 16/245 (2019.01); G06N 20/00 (2019.01); G06N 5/02 (2006.01); G06F 16/2457 (2019.01); G09B 7/00 (2006.01); G09B 7/02 (2006.01);
U.S. Cl.
CPC ...
G06F 16/24578 (2019.01); G06N 5/022 (2013.01); G06N 20/00 (2019.01); G09B 7/00 (2013.01); G09B 7/02 (2013.01);
Abstract

An active learning framework is operative to identify informative questions that should be added to existing question-answer (Q&A) pairs that comprise a training dataset for a learning model. In this approach, the question-answer pairs (to be labeled as 'true' or 'false') are automatically selected from a larger pool of unlabeled data. A spatial-directed clustering algorithm partitions the relevant question-answer space of unlabeled data. A margin-induced loss function is then used to rank a question. For each question selected, a label is then obtained, preferably by assigning a prediction for each associated question-answer pair using a current model that has been trained on labeled question-answer pairs. After the questions are labeled, an additional re-sampling is performed to assure high quality of the training data. Preferably, and with respect to a particular question, this additional re-sampling is based on a distance measure between correct and incorrect answers.
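
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not define the clustering algorithm or the loss function, so the sketch assumes nearest-centroid assignment for the partitioning and the gap between the two highest answer scores as the margin. All function names, dictionary keys, and sample values are hypothetical, and the labeling and re-sampling steps are omitted.

```python
import math
from collections import defaultdict


def margin(scores):
    """Gap between the two highest answer scores for one question.

    A small gap means the current model is unsure which answer is right,
    so the question is treated as more informative (an assumption here).
    """
    top = sorted(scores, reverse=True)
    return top[0] - top[1] if len(top) > 1 else top[0]


def assign_cluster(point, centroids):
    """Return the index of the nearest centroid (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def select_questions(questions, centroids, per_cluster=1):
    """Pick the lowest-margin questions from each cluster.

    questions: list of dicts with hypothetical keys 'id',
    'features' (a vector) and 'scores' (one model score per answer).
    """
    clusters = defaultdict(list)
    for q in questions:
        clusters[assign_cluster(q["features"], centroids)].append(q)

    selected = []
    for idx in sorted(clusters):
        # Rank within the cluster: smallest margin first.
        ranked = sorted(clusters[idx], key=lambda q: margin(q["scores"]))
        selected.extend(q["id"] for q in ranked[:per_cluster])
    return selected


# Made-up unlabeled pool with two obvious spatial groups.
pool = [
    {"id": "q1", "features": (0.0, 0.1), "scores": [0.9, 0.2]},
    {"id": "q2", "features": (0.2, 0.0), "scores": [0.55, 0.5]},
    {"id": "q3", "features": (5.0, 5.1), "scores": [0.6, 0.3]},
    {"id": "q4", "features": (4.9, 5.0), "scores": [0.8, 0.1]},
]
chosen = select_questions(pool, centroids=[(0.0, 0.0), (5.0, 5.0)])
print(chosen)
```

Selecting per cluster rather than globally keeps the chosen questions spread across the question-answer space, so the questions sent for labeling are not all drawn from one region where the model happens to be uncertain.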