The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.

Date of Patent:
Oct. 17, 2023

Filed:

Nov. 23, 2022
Applicant:

Microsoft Technology Licensing, Llc, Redmond, WA (US);

Inventors:

Babatunde Micheal Okutubo, Bellevue, WA (US);

Maninderjit Singh Parmar, Redmond, WA (US);

Edgars Sedols, Bellevue, WA (US);

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 16/00 (2019.01); G06F 16/13 (2019.01); G06F 16/27 (2019.01); G06F 16/28 (2019.01);
U.S. Cl.
CPC ...
G06F 16/13 (2019.01); G06F 16/27 (2019.01); G06F 16/285 (2019.01);
Abstract

Methods and systems are provided for improved access to rows of data in a distributed data system. Each data row is associated with a partition. Data rows are distributed in one or more files and an impure file includes data rows associated multiple partitions. A clustering set is generated from a plurality of impure files by selecting a candidate impure file based on file access activity metrics and one or more neighbor impure files. Data rows of the impure files included in the clustering set are sorted according to their respective associated partitions. A set of disjoint partition range files are generated based on the sorted data rows of the impure files included in the clustering set. Each file of the set of disjoint partition range files is transferred to a respective target partition.


Find Patent Forward Citations

Loading…