The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Nov. 05, 2019

Filed:

Dec. 27, 2016

Applicant:

Amazon Technologies, Inc., Seattle, WA (US);

Inventors:

Wei Yu, Sammamish, WA (US);

Nengwu Zhu, Sammamish, WA (US);

Hyen Vui Chung, Bellevue, WA (US);

Qihui Lee, Seattle, WA (US);

Assignee:

Amazon Technologies, Inc., Seattle, WA (US);

Attorney:

Primary Examiner:

Int. Cl.

CPC ...
G06F 16/00 (2019.01); G06F 16/13 (2019.01); G06F 16/2455 (2019.01);

U.S. Cl.

CPC ...
G06F 16/13 (2019.01); G06F 16/2456 (2019.01); G06F 16/24554 (2019.01);

Abstract

Technologies are disclosed for providing a large scale data join service within a service provider network. A data set includes first and second sets of files that correspond to each other. Each file includes a first identifier (ID) and a second ID. The first set of files is partitioned based at least in part upon the first ID into a plurality of first subsets of files and the second set of files is partitioned based at least in part upon the first ID into a plurality of second subsets of files. Files within a first group of the plurality of first subsets and files within a second group of the plurality of second subsets are encoded into first and second bitsets, respectively, based at least in part upon the second IDs. An exclusive-or operation is performed on the first and second bitsets to find discrepancies between the data files.
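
The partition, encode, and exclusive-or steps described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes each record reduces to a pair of non-negative integers (first ID, second ID), that each distinct first ID forms its own partition, and that a plain Python integer serves as the bitset. The names `partition`, `to_bitset`, `bit_positions`, and `find_discrepancies` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def partition(records):
    """Group second IDs by first ID (assumption: one partition per first ID)."""
    buckets = defaultdict(list)
    for first_id, second_id in records:
        buckets[first_id].append(second_id)
    return buckets


def to_bitset(second_ids):
    """Encode non-negative integer IDs as an int with one bit set per ID."""
    bits = 0
    for sid in second_ids:
        bits |= 1 << sid
    return bits


def bit_positions(bits):
    """Return the positions of the set bits, lowest first."""
    positions = []
    pos = 0
    while bits:
        if bits & 1:
            positions.append(pos)
        bits >>= 1
        pos += 1
    return positions


def find_discrepancies(left, right):
    """XOR the per-partition bitsets; set bits mark IDs present on one side only."""
    left_parts = partition(left)
    right_parts = partition(right)
    result = {}
    for key in sorted(set(left_parts) | set(right_parts)):
        diff = to_bitset(left_parts.get(key, [])) ^ to_bitset(right_parts.get(key, []))
        if diff:
            result[key] = bit_positions(diff)
    return result


left = [(1, 0), (1, 3), (2, 5)]
right = [(1, 0), (2, 5), (2, 7)]
print(find_discrepancies(left, right))  # {1: [3], 2: [7]}
```

In this sketch, second ID 3 appears only in the left data for first ID 1, and second ID 7 appears only in the right data for first ID 2, so those are the bits that survive the exclusive-or. A bitset records only presence, so duplicate second IDs within one side collapse into a single bit; detecting duplicate records would need a different encoding.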

