The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Aug. 29, 2017

Filed:
Jun. 08, 2015

Applicant:

International Business Machines Corporation, Armonk, NY (US);

Inventors:

Lior Aronovich, Toronto, CA;

Ron Asher, Tel Aviv, IL;

Michael Hirsch, Mazkeret Batya, IL;

Shmuel T. Klein, Rehovot, IL;

Ehud Meiri, Tel Aviv, IL;

Yair Toaff, Givat Shmuel, IL;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 7/00 (2006.01); G06F 3/06 (2006.01); G06F 17/30 (2006.01); H03M 7/30 (2006.01);
U.S. Cl.
CPC ...
G06F 3/0641 (2013.01); G06F 3/067 (2013.01); G06F 3/0608 (2013.01); G06F 7/00 (2013.01); G06F 17/30159 (2013.01); G06F 17/30303 (2013.01); G06F 17/30371 (2013.01); H03M 7/3091 (2013.01); H03M 7/3093 (2013.01);
Abstract

Exemplary method, system, and computer program product embodiments are provided for scalable data deduplication working with small data chunks in a computing environment. In one embodiment, by way of example only, for each small data chunk, a signature is generated based on a combination of a representation of characters used in selecting data to be deduplicated. A c-spectrum of the small data chunk is a sequence of representations of different characters ordered by frequency of occurrence in the small data chunk, and an f-spectrum of the small data chunk is the corresponding sequence of frequencies of the different characters in the small data chunk.
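
As an illustration of the two spectra defined in the abstract, the following is a minimal sketch, not the patented implementation. It assumes that a "character" is a single byte, that the order is from most to least frequent, and that ties are broken by byte value; none of these details are stated in the abstract. The function name `spectra` is hypothetical. How the spectra are combined into a signature is not shown, since the abstract does not specify it.

```python
from collections import Counter


def spectra(chunk: bytes):
    """Return (c_spectrum, f_spectrum) for a small data chunk.

    c_spectrum: distinct byte values ordered by how often they occur.
    f_spectrum: the occurrence counts, in the same order.

    Assumptions (not from the abstract): characters are bytes, the order
    is descending by frequency, and ties are broken by ascending byte value.
    """
    counts = Counter(chunk)
    # Sort by descending count, then ascending byte value for determinism.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    c_spectrum = [byte for byte, _ in ordered]
    f_spectrum = [count for _, count in ordered]
    return c_spectrum, f_spectrum


c_spectrum, f_spectrum = spectra(b"abracadabra")
print(bytes(c_spectrum))  # b'abrcd'
print(f_spectrum)         # [5, 2, 2, 1, 1]
```

In this example the chunk `abracadabra` contains `a` five times, `b` and `r` twice each, and `c` and `d` once each, so the two lists line up position by position.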

