The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Jan. 30, 2024
Filed:
Oct. 06, 2020
Applicant:
Genpact Luxembourg S.à r.l. II, Luxembourg, LU;
Inventors:
Sreekanth Menon, Bangalore, IN;
Prakash Selvakumar, Bangalore, IN;
Sudheesh Sudevan, Thalassery, IN;
Assignee:
Genpact Luxembourg S.à r.l. II, Luxembourg, LU;
Abstract
A method and system are provided for training a machine-learning (ML) system/module and to provide an ML model. In one embodiment, a method includes using a labeled entities set to train a machine learning (ML) system, to obtain an ML model, and using the trained ML model to predict labels for entities in an unlabeled entities set, yielding a machine-labeled entities set. One or more individual ML models may be trained and used in this way, where each individual ML model corresponds to a respective document source. The document sources can be identified via classification of a corpus of documents. The prediction of labels provides a respective confidence score for each machine-labeled entity. The method also includes selecting from the machine-labeled entities set, a subset of machine-labeled entities having a respective confidence score at least equal to a threshold confidence score; and updating the labeled entities set by adding thereto the selected subset of machine-labeled entities. The method further includes removing from the machine-labeled entities set the selected subset of machine-labeled entities and deleting labels assigned to the entities in the updated machine-labeled entities set to provide the unlabeled entities set for a next iteration. The method also includes, if a termination condition is not reached, repeating the steps above and, otherwise, storing the ML model.
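The iterative loop described in the abstract resembles what is commonly called self-training or pseudo-labeling. The following is a minimal sketch of that loop, not the patented implementation. The names `CentroidModel` and `self_train`, the nearest-centroid classifier, the inverse-distance confidence score, and the termination condition (no entity clears the threshold, no unlabeled entities remain, or an iteration cap is hit) are all assumptions made for illustration. The abstract does not specify the model type, the scoring method, or the termination condition.

```python
import math
from typing import Dict, List, Tuple

Entity = Tuple[float, ...]           # hypothetical feature vector for one entity
LabeledSet = List[Tuple[Entity, str]]


class CentroidModel:
    """Toy stand-in for the ML model: nearest centroid with a confidence score."""

    def __init__(self, labeled: LabeledSet) -> None:
        sums: Dict[str, List[float]] = {}
        counts: Dict[str, int] = {}
        for x, y in labeled:
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {
            y: tuple(v / counts[y] for v in acc) for y, acc in sums.items()
        }

    def predict(self, x: Entity) -> Tuple[str, float]:
        """Return (label, confidence); confidence is a normalized inverse distance."""
        weights = {
            y: 1.0 / (1e-9 + math.dist(x, c)) for y, c in self.centroids.items()
        }
        label = max(weights, key=weights.get)
        return label, weights[label] / sum(weights.values())


def self_train(
    labeled: LabeledSet,
    unlabeled: List[Entity],
    threshold: float = 0.8,
    max_iterations: int = 10,   # assumed cap; not taken from the abstract
) -> Tuple[CentroidModel, LabeledSet, List[Entity]]:
    labeled, unlabeled = list(labeled), list(unlabeled)
    model = CentroidModel(labeled)
    for _ in range(max_iterations):
        # 1. Train on the current labeled entities set.
        model = CentroidModel(labeled)
        # 2. Predict a label and confidence score for each unlabeled entity.
        machine_labeled = [(x, *model.predict(x)) for x in unlabeled]
        # 3. Select predictions whose confidence meets the threshold.
        selected = [(x, y) for x, y, conf in machine_labeled if conf >= threshold]
        # 4. Add the selected subset to the labeled entities set.
        labeled.extend(selected)
        # 5. Remove the selected subset and drop the remaining machine labels,
        #    which yields the unlabeled entities set for the next iteration.
        unlabeled = [x for x, _, conf in machine_labeled if conf < threshold]
        # 6. Assumed termination condition: nothing new selected or nothing left.
        if not selected or not unlabeled:
            break
    # Returning the model stands in for the "storing the ML model" step.
    return model, labeled, unlabeled


if __name__ == "__main__":
    seed = [((0.0,), "a"), ((10.0,), "b")]
    pool = [(1.0,), (9.0,), (5.0,)]
    model, grown, remaining = self_train(seed, pool, threshold=0.8)
    print(len(grown), remaining)
```

To follow the per-source variant mentioned in the abstract, one could call `self_train` once for each document source and keep one returned model per source. How documents are classified into sources is outside this sketch.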