The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Sep. 06, 2022

Filed:
Dec. 04, 2019

Applicant:

Microsoft Technology Licensing, LLC, Redmond, WA (US);

Inventors:

Tianhao Lu, New York, NY (US);

Junzhe Miao, San Jose, CA (US);

Yunpeng Xu, Millburn, NJ (US);

Dan Shacham, Sunnyvale, CA (US);

Hong H. Tam, Fremont, CA (US);

Tao Xiong, Jersey City, NJ (US);

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06N 20/00 (2019.01); G06F 16/174 (2019.01); G06F 16/23 (2019.01); G06F 16/953 (2019.01);
U.S. Cl.
CPC ...
G06N 20/00 (2019.01); G06F 16/174 (2019.01); G06F 16/2365 (2019.01);
Abstract

The disclosed embodiments provide a system that identifies duplicate entities. During operation, the system selects training data for a first machine learning model based on confidence scores representing likelihoods that pairs of entities in an online system are duplicates. Next, the system updates parameters of the first machine learning model based on features and labels in the training data. The system then identifies a first subset of additional pairs of the entities as duplicate entities based on scores generated by the first machine learning model from values of the features for the additional pairs and a first threshold associated with the scores. The system also determines a canonical entity in each of the duplicate entities based on additional features. Finally, the system updates content outputted in a user interface of the online system based on the identified first subset of the additional pairs.
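
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the logistic-regression model, the confidence cutoffs (0.2 and 0.8), the two similarity features, the `completeness` field used to pick a canonical entity, and all function names are assumptions made for illustration only.

```python
import math

def select_training_pairs(pairs, low=0.2, high=0.8):
    """Keep only pairs whose prior confidence score is decisive.

    `pairs` is a list of (features, confidence). The cutoffs are
    hypothetical; the abstract does not state how selection is done.
    """
    selected = []
    for features, confidence in pairs:
        if confidence >= high:
            selected.append((features, 1))  # treated as a duplicate pair
        elif confidence <= low:
            selected.append((features, 0))  # treated as a non-duplicate pair
    return selected

def score(weights, bias, features):
    """Logistic score in (0, 1) for one pair of entities."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=500, lr=0.5):
    """Update model parameters from features and labels (plain SGD)."""
    weights, bias = [0.0] * len(examples[0][0]), 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = score(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

def find_duplicates(weights, bias, candidates, threshold=0.5):
    """Return ids of candidate pairs whose score meets the threshold."""
    return [pair_id for pair_id, features in candidates
            if score(weights, bias, features) >= threshold]

def pick_canonical(entity_a, entity_b):
    """Pick the canonical entity using an assumed 'completeness' feature."""
    return max(entity_a, entity_b, key=lambda e: e["completeness"])

# Hypothetical features: [name similarity, address similarity].
seed_pairs = [
    ([0.95, 0.90], 0.97),
    ([0.90, 0.85], 0.90),
    ([0.10, 0.20], 0.05),
    ([0.15, 0.05], 0.10),
    ([0.50, 0.50], 0.50),  # ambiguous confidence, excluded from training
]
training = select_training_pairs(seed_pairs)
weights, bias = train(training)

candidates = [("p1", [0.92, 0.88]), ("p2", [0.10, 0.10])]
duplicates = find_duplicates(weights, bias, candidates)
canonical = pick_canonical({"id": "a", "completeness": 0.4},
                           {"id": "b", "completeness": 0.9})
```

In this sketch, a pair flagged as a duplicate would then drive the user-interface update mentioned in the abstract, for example by showing only the canonical entity; that step is omitted here because the abstract gives no detail about it.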

