The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Nov. 29, 2022

Filed:
Jun. 12, 2019

Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Pathirage D. S. U. Perera, San Jose, CA (US);

Eitan D. Farchi, Pardes Hana, IL;

Orna Raz, Haifa, IL;

Ramani Routray, San Jose, CA (US);

Sheng Hua Bao, San Jose, CA (US);

Marcel Zalmanovici, Kiriat Motzkin, IL;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06V 30/19 (2022.01); G06K 9/62 (2022.01); G06N 20/00 (2019.01); G06F 16/93 (2019.01);
U.S. Cl.
CPC ...
G06V 30/19147 (2022.01); G06F 16/93 (2019.01); G06K 9/6221 (2013.01); G06K 9/6257 (2013.01); G06K 9/6272 (2013.01); G06N 20/00 (2019.01);
Abstract

A computer system trains a machine learning model. A vector representation is generated for each document in a collection of documents. The documents are clustered based on the vector representations of the documents to produce a plurality of clusters. A training set is produced by selecting one or more documents from each cluster, wherein the selected documents represent a sample of the collection of documents to train the machine learning model. The machine learning model is trained by applying the training set to the machine learning model. Embodiments of the present invention further include a method and program product for training a machine learning model in substantially the same manner described above.
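
The pipeline described in the abstract (vectorize each document, cluster the vectors, pick documents from every cluster as the training sample) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the term-count vectors, the k-means clustering, the nearest-to-centroid selection rule, and all function names (`vectorize`, `kmeans`, `select_training_set`) are choices made here for the example; the abstract does not specify any of them. The final step, training a model on the selected documents, depends on the model and is left out.

```python
from collections import Counter
import math


def vectorize(docs):
    """Map each document to a term-count vector over a shared, sorted vocabulary."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for tokens in tokenized for w in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, max_iterations=50):
    """Cluster vectors with plain k-means; returns (assignment, centers)."""
    # Assumption: deterministic farthest-point initialisation, so no random seed is needed.
    centers = [list(vectors[0])]
    while len(centers) < k:
        farthest = max(vectors, key=lambda v: min(distance(v, c) for c in centers))
        centers.append(list(farthest))
    assignment = None
    for _ in range(max_iterations):
        new_assignment = [
            min(range(k), key=lambda j: distance(v, centers[j])) for v in vectors
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing: converged
        assignment = new_assignment
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:  # keep the previous center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers


def select_training_set(docs, k, per_cluster=1):
    """Return indices of the documents chosen to represent each cluster."""
    vectors = vectorize(docs)
    assignment, centers = kmeans(vectors, k)
    selected = []
    for j in range(k):
        members = [i for i, a in enumerate(assignment) if a == j]
        # Assumption: take the documents closest to the cluster centroid.
        members.sort(key=lambda i: distance(vectors[i], centers[j]))
        selected.extend(members[:per_cluster])
    return selected


if __name__ == "__main__":
    # Hypothetical toy collection with two obvious topics.
    docs = [
        "cat dog pet",
        "dog cat animal",
        "pet cat dog animal",
        "stock market price",
        "market price trade",
        "stock trade market price",
    ]
    print(select_training_set(docs, k=2))
```

With the toy collection above, the selection contains one document from each topical group, so a model trained on it sees both topics even though only a third of the collection is used.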

