The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.

Date of Patent:
Feb. 04, 2014

Filed:

Jan. 12, 2012
Applicants:

William P. Mcneill, Seattle, WA (US);

Andrew Borthwick, Kirkland, WA (US);

Inventors:

William P. McNeill, Seattle, WA (US);

Andrew Borthwick, Kirkland, WA (US);

Assignee:

Intelius Inc., Bellevue, WA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 17/30 (2006.01);
U.S. Cl.
CPC ...
Abstract

Dynamic blocking determines which pairs of records in a data set should be examined as potential duplicates. Records are grouped together into blocks by shared properties that are indicators of duplication. Blocks that are too large to be efficiently processed are further subdivided by other properties chosen in a data-driven way. We demonstrate the viability of this algorithm for large data sets. We have scaled this system up to work on billions of records on an 80 node Hadoop cluster.


Find Patent Forward Citations

Loading…