The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
May 14, 2024
Filed:
Jul. 07, 2023
Applicant:
Xerox Corporation, Norwalk, CT (US);
Inventors:
Matthew Shreve, Mountain View, CA (US);
Francisco E. Torres, San Jose, CA (US);
Raja Bala, Pittsford, NY (US);
Robert R. Price, Palo Alto, CA (US);
Pei Li, San Jose, CA (US);
Assignee:
Xerox Corporation, Norwalk, CT (US);
Abstract
A method of labeling a dataset includes inputting a testing set comprising a plurality of input data samples into a plurality of pre-trained machine learning models to generate a set of embeddings output by the plurality of pre-trained machine learning models. The method further includes performing an iterative cluster labeling algorithm that includes generating a plurality of clusterings from the set of embeddings, analyzing the plurality of clusterings to identify a target embedding with a highest cluster quality, analyzing the target embedding to determine a compactness for each of the plurality of clusterings of the target embedding, and identifying a target cluster among the plurality of clusterings of the target embedding based on the compactness. The method further includes assigning pseudo-labels to the subset of the plurality of input data samples that are members of the target cluster.
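The loop described in the abstract can be illustrated with a small Python sketch. This is not the patented implementation: the abstract does not name a clustering algorithm, a quality score, or a compactness measure, so the choices below are assumptions made only for illustration. The sketch uses a tiny deterministic k-means, a quality score defined as minimum centroid separation divided by mean cluster spread, and compactness defined as the mean distance of members to their centroid. The names `kmeans`, `quality`, `compactness`, `pseudo_label`, `model_a`, and `model_b` are hypothetical, and the toy embeddings stand in for the outputs of pre-trained models.

```python
import math

def centroid(points):
    # Coordinate-wise mean of a non-empty list of vectors.
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, k, iters=20):
    # Assumed clustering step: farthest-point init followed by Lloyd updates.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    assign = []
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = centroid(members)
    return assign, centers

def compactness(points, assign, j):
    # Assumed measure: mean distance to the cluster centroid (lower is tighter).
    members = [p for p, a in zip(points, assign) if a == j]
    if not members:
        return math.inf
    c = centroid(members)
    return sum(math.dist(p, c) for p in members) / len(members)

def quality(points, assign, centers, k):
    # Assumed score: minimum centroid separation over mean cluster spread.
    sep = min(math.dist(centers[a], centers[b]) for a in range(k) for b in range(a + 1, k))
    spread = sum(compactness(points, assign, j) for j in range(k)) / k
    return sep / (spread + 1e-9)

def pseudo_label(embeddings, k, rounds):
    # embeddings: model name -> list of vectors, one per sample, same order.
    n = len(next(iter(embeddings.values())))
    remaining, labels = list(range(n)), {}
    for r in range(rounds):
        if len(remaining) < k:
            break
        best = None
        for vecs in embeddings.values():
            pts = [vecs[i] for i in remaining]
            assign, centers = kmeans(pts, k)
            q = quality(pts, assign, centers, k)
            if best is None or q > best[0]:
                best = (q, pts, assign)  # target embedding so far
        _, pts, assign = best
        # Target cluster: the most compact cluster of the target embedding.
        target = min(range(k), key=lambda j: compactness(pts, assign, j))
        for i, a in zip(remaining, assign):
            if a == target:
                labels[i] = f"pseudo_{r}"
        remaining = [i for i in remaining if i not in labels]
    return labels

# Toy data: six samples embedded by two hypothetical models.
embeddings = {
    "model_a": [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)],
    "model_b": [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (6, 5), (5, 6)],
}
labels = pseudo_label(embeddings, k=2, rounds=1)
print(labels)
```

In this toy run, `model_b` separates the samples more cleanly than `model_a`, so it is chosen as the target embedding, and the tight group formed by samples 0, 1, and 2 receives the pseudo-label. A practical version would likely need a stopping criterion and a guard against trivially compact clusters such as singletons, neither of which the abstract specifies.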