The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent: Jun. 13, 2017

Filed: Jul. 14, 2011

Applicant: John W. Bates, Mendon, MA (US)

Inventor: John W. Bates, Mendon, MA (US)

Assignee: EMC IP Holding Company LLC, Hopkinton, MA (US)

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 3/06 (2006.01); G06F 17/30 (2006.01); H04N 19/63 (2014.01); G06F 11/14 (2006.01);
U.S. Cl.
CPC ...
G06F 3/067 (2013.01); G06F 3/0608 (2013.01); G06F 3/0641 (2013.01); G06F 17/30138 (2013.01); H04N 19/63 (2014.11); G06F 11/1453 (2013.01); G06F 17/30247 (2013.01); G06F 17/30312 (2013.01);
Abstract

A method for data deduplication includes the following steps. First, segmenting an original data set into a plurality of data segments. Next, transforming the data in each data segment into a transformed data representation that has a band-type structure for each data segment. The band-type structure includes a plurality of bands. Next, selecting a first set of bands, grouping them together and storing them with the original data set. The first set of bands includes non-identical transformed data for each data segment. Next, selecting a second set of bands and grouping them together. The second set of bands includes identical transformed data for each data segment. Next, applying a hash function onto the transformed data of the second set of bands and thereby generating transformed data segments indexed by hash function indices. Finally, storing the hash function indices and the transformed data representation of one representative data segment in a deduplication database.
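
The steps in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. The abstract does not name the transform, so the pairwise sum/difference (Haar-like) split into two bands used here is an assumption, as are the function names `segment`, `to_bands`, `deduplicate`, and `restore`, the choice of SHA-256 as the hash function, and the use of an in-memory dictionary as the deduplication database. The sketch treats the "sum" band as the second set of bands (hashed and stored once per distinct value) and the "difference" band as the first set (kept for every segment).

```python
import hashlib


def segment(data: bytes, size: int) -> list[bytes]:
    """Split the original data set into fixed-size segments."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def to_bands(seg: bytes) -> tuple[tuple[int, ...], tuple[int, ...]]:
    """Assumed transform: pairwise sums (one band) and differences (another)."""
    pairs = list(zip(seg[0::2], seg[1::2]))
    sums = tuple(a + b for a, b in pairs)
    diffs = tuple(a - b for a, b in pairs)
    return sums, diffs


def deduplicate(data: bytes, size: int = 4):
    """Return (database, records).

    database maps a hash index to one representative "sum" band.
    records holds, per segment, the hash index plus that segment's own
    "difference" band, which is not deduplicated.
    """
    if size % 2 or len(data) % size:
        raise ValueError("sketch assumes an even size that divides len(data)")
    database: dict[str, tuple[int, ...]] = {}
    records: list[tuple[str, tuple[int, ...]]] = []
    for seg in segment(data, size):
        sums, diffs = to_bands(seg)
        # Hash the band that is expected to repeat across segments.
        key = hashlib.sha256(repr(sums).encode()).hexdigest()
        database.setdefault(key, sums)  # store one representative only
        records.append((key, diffs))
    return database, records


def restore(database, records) -> bytes:
    """Invert the assumed transform to rebuild the original bytes."""
    out = bytearray()
    for key, diffs in records:
        for s, d in zip(database[key], diffs):
            out.append((s + d) // 2)  # first byte of the pair
            out.append((s - d) // 2)  # second byte of the pair
    return bytes(out)
```

For example, calling `deduplicate(bytes([10, 20, 30, 40, 12, 18, 32, 38, 1, 2, 3, 4]), 4)` yields three records but only two database entries, because the first two segments share the same "sum" band, and `restore` on the result reproduces the input bytes. A real system would choose the transform so that the hashed bands are identical across many near-duplicate segments.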

