The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Apr. 09, 2013
Filed:
Jul. 18, 2012
Applicants:
Zhen Cao, Mountain View, CA (US);
Naval Verma, Sunnyvale, CA (US);
Inventors:
Zhen Cao, Mountain View, CA (US);
Naval Verma, Sunnyvale, CA (US);
Assignee:
Google Inc., Mountain View, CA (US);
Abstract
A model refinement system refines initial split rules that define an initial decision tree to generate final split rules. The model refinement system refines the initial split rules by removing clauses that are satisfied by match scores that are less than a threshold match score, generating initial trimmed rules. Using the initial trimmed rules, the model refinement system classifies an initial training set and filters the initial training set to remove negative training pairs that are classified as duplicate pairs, resulting in a filtered training set. An intermediate decision tree defined by intermediate split rules is generated based on the filtered training set. Final split rules are generated based on the intermediate split rules, and input pairs of data records are classified as duplicate pairs based on attribute values of the input pairs and the final split rules.
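
The trimming and filtering steps described in the abstract can be illustrated with a minimal Python sketch. This is one possible reading, not the patented implementation. Everything in it is an assumption made for illustration: the clause representation `(attribute, operator, value)`, the attribute names, the threshold of 0.5, the choice to drop rules left with no clauses, and all function names. Generating the intermediate decision tree from the filtered set is omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical representation (not from the patent text):
Clause = Tuple[str, str, float]  # (attribute, operator, value); operator is ">" or "<="
Rule = List[Clause]              # conjunction of clauses leading to a "duplicate" leaf
Scores = Dict[str, float]        # per-attribute match scores for one pair of records


def clause_holds(clause: Clause, scores: Scores) -> bool:
    """Check a single clause against the match scores of a record pair."""
    attribute, op, value = clause
    score = scores[attribute]
    return score > value if op == ">" else score <= value


def trim_rule(rule: Rule, threshold: float) -> Rule:
    """Keep only clauses that no match score below `threshold` can satisfy.

    A "<=" clause, or a ">" clause with a bound under the threshold, can be
    satisfied by a low match score, so it is removed.
    """
    return [(a, op, v) for a, op, v in rule if op == ">" and v >= threshold]


def trim_rules(rules: List[Rule], threshold: float) -> List[Rule]:
    """Trim every rule; rules left with no clauses are dropped (an assumption)."""
    trimmed = (trim_rule(rule, threshold) for rule in rules)
    return [rule for rule in trimmed if rule]


def is_duplicate(rules: List[Rule], scores: Scores) -> bool:
    """A pair is a duplicate if all clauses of at least one rule hold."""
    return any(all(clause_holds(c, scores) for c in rule) for rule in rules)


def filter_training_set(rules: List[Rule], training_set):
    """Remove negative pairs that the trimmed rules classify as duplicates."""
    return [(scores, label) for scores, label in training_set
            if label or not is_duplicate(rules, scores)]


# Illustrative data only; attribute names and scores are made up.
initial_rules = [
    [("name", ">", 0.8), ("phone", "<=", 0.4)],
    [("name", ">", 0.3), ("address", ">", 0.7)],
]
training_set = [
    ({"name": 0.9, "phone": 0.9, "address": 0.2}, True),
    ({"name": 0.9, "phone": 0.8, "address": 0.1}, False),
    ({"name": 0.2, "phone": 0.1, "address": 0.3}, False),
]

trimmed_rules = trim_rules(initial_rules, threshold=0.5)
filtered = filter_training_set(trimmed_rules, training_set)
```

In this made-up example, the second training pair is labeled negative, yet the trimmed rule on `name` classifies it as a duplicate, so it is filtered out. Under the abstract's description, the remaining pairs would then be used to generate the intermediate decision tree.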