The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:

Mar. 28, 2023

Filed:

Jan. 22, 2021

Applicant:

EMC IP Holding Company LLC, Hopkinton, MA (US);

Inventors:

Alexei Kabishcer, Ramat Gan, IL;

Uri Shabi, Tel Mond, IL;

Assignee:

EMC IP Holding Company LLC, Hopkinton, MA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 16/00 (2019.01); G06F 16/215 (2019.01); G06F 16/22 (2019.01); H03M 7/30 (2006.01);
U.S. Cl.
CPC ...
G06F 16/215 (2019.01); G06F 16/2255 (2019.01); H03M 7/3088 (2013.01);
Abstract

Dictionary-based compression is performed to compress data units using a similar data unit as the base unit (i.e., dictionary) for each candidate data unit. Similarity may be determined between data units by applying a locality-sensitive hashing scheme to each candidate data unit to produce a hash value, and by determining whether there is a matching value in a hash index of hash values for existing data units on the system. If there is a matching hash value, the candidate data unit may be compressed using the data unit corresponding to the matching hash value as the dictionary. Only a representative portion of the data unit may be hashed to produce the hash value, the portion comprised of chunks of the data unit, where each chunk is a continuous, uninterrupted section of data. The chunks themselves may not be (in some embodiments likely are not) contiguous to one another.
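
The flow described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not name a particular locality-sensitive hashing scheme, chunk layout, or compressor, so everything specific here is an assumption made for illustration. The sketch uses a single min-hash over byte shingles as the locality-sensitive hash, fixed-stride sampling to pick non-contiguous chunks, a plain Python dict as the hash index, and zlib's preset-dictionary support as the dictionary-based compressor. The names `CHUNK_SIZE`, `CHUNK_STRIDE`, `SHINGLE`, `representative_portion`, `lsh_value`, `compress_unit`, and `decompress_unit` are hypothetical.

```python
from __future__ import annotations

import hashlib
import zlib

# All three parameters are hypothetical values chosen for illustration.
CHUNK_SIZE = 64     # length of each sampled chunk, in bytes
CHUNK_STRIDE = 512  # distance between chunk starts, so chunks are not contiguous
SHINGLE = 8         # width of the byte shingles fed to the min-hash


def representative_portion(unit: bytes) -> list[bytes]:
    """Sample chunks of the unit; each chunk is one continuous run of bytes."""
    return [unit[i:i + CHUNK_SIZE] for i in range(0, len(unit), CHUNK_STRIDE)]


def lsh_value(unit: bytes) -> bytes:
    """Min-hash over shingles of the sampled chunks (one possible LSH scheme)."""
    best = None
    for chunk in representative_portion(unit):
        for i in range(max(1, len(chunk) - SHINGLE + 1)):
            h = hashlib.blake2b(chunk[i:i + SHINGLE], digest_size=8).digest()
            if best is None or h < best:
                best = h
    return best or b""


def compress_unit(unit: bytes, index: dict[bytes, bytes]) -> tuple[bytes, bytes | None]:
    """Compress a candidate unit; return (compressed bytes, key of base unit or None)."""
    key = lsh_value(unit)
    base = index.get(key)
    if base is not None:
        # Matching hash value: use the existing unit as the compression dictionary.
        comp = zlib.compressobj(zdict=base)
        return comp.compress(unit) + comp.flush(), key
    # No match: store the unit as a future base and compress it on its own.
    index[key] = unit
    return zlib.compress(unit), None


def decompress_unit(blob: bytes, base_key: bytes | None, index: dict[bytes, bytes]) -> bytes:
    """Reverse compress_unit, using the same base unit as the dictionary if one was used."""
    if base_key is None:
        return zlib.decompress(blob)
    decomp = zlib.decompressobj(zdict=index[base_key])
    return decomp.decompress(blob) + decomp.flush()
```

In this sketch, two units whose sampled chunks are identical always produce the same hash value, so a candidate that differs from an existing unit only outside the sampled regions is compressed against that unit. A practical system would likely need to handle hash collisions between dissimilar units and to bound the size of the index; both are omitted here.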

