The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The badge lists the patent number, issue date, filing date, title, applicant, inventors, assignee, attorney firm, primary examiner, assistant examiner, CPC classifications, and abstract. The linked full patent document is in PDF format.

Patent No.:

US 11755614 B1

Date of Patent:

Sep. 12, 2023

Filed:

Apr. 22, 2022

Generation and graphical display of data transform provenance metadata

Applicant:

Palantir Technologies Inc., Denver, CO (US);

Inventors:

Matthew Maclean, New York, NY (US);

Adam Borochoff, New York, NY (US);

Jared Newman, Costa Mesa, CA (US);

Joseph Rafidi, Mountain View, CA (US);

Assignee:

Palantir Technologies Inc., Denver, CO (US);

Attorney:

Duane Morris LLP

Primary Examiner:

Etienne P Leroux

Int. Cl.

G06F 16/26 (2019.01); G06F 16/27 (2019.01); G06F 16/22 (2019.01); G06F 16/21 (2019.01); G06F 16/25 (2019.01);

U.S. Cl.

CPC ...

G06F 16/26 (2019.01); G06F 16/212 (2019.01); G06F 16/221 (2019.01); G06F 16/2282 (2019.01); G06F 16/258 (2019.01); G06F 16/27 (2019.01);

Abstract

Techniques for propagation of deletion operations among a plurality of related datasets are described herein. In an embodiment, a data processing method comprises, using a distributed database system that is programmed to manage a plurality of different raw datasets and a plurality of derived datasets that have been derived from the raw datasets based on a plurality of derivation relationships that link the raw datasets to the derived datasets: from a first dataset that is stored in the distributed database system, determining a subset of records that are candidates for propagated deletion of specified data values; determining one or more particular raw datasets that contain the subset of records; deleting the specified data values from the particular raw datasets; based on the plurality of derivation relationships and the particular raw datasets, identifying one or more particular derived datasets that have been derived from the particular raw datasets; generating and executing a build of the one or more particular derived datasets to result in creating and storing the one or more particular derived datasets without the specified data values that were deleted from the particular raw datasets; repeating the generating and executing for all derived datasets that have derivation relationships to the particular raw datasets; wherein the method is performed using one or more processors.
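The flow described in the abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the distributed database is replaced by a plain dictionary, "deleting the specified data values" is simplified to dropping whole records that carry those values, and every name here (`propagate_deletion`, `datasets`, `parents`, `builders`) is a hypothetical chosen for illustration. Derivation relationships are modeled as a mapping from each derived dataset to the datasets it is built from, and rebuilds run in topological order so that a derived dataset is always rebuilt after its inputs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def propagate_deletion(datasets, parents, builders, field, values):
    """Drop records whose `field` is in `values` from raw datasets,
    then rebuild every derived dataset downstream of a changed one.

    datasets: name -> list of record dicts (raw and derived)
    parents:  derived name -> list of input dataset names
    builders: derived name -> function(*input_record_lists) -> records
    Returns the names of the rebuilt derived datasets, in build order.
    """
    # Raw datasets are those with no derivation relationship of their own.
    raw_names = [name for name in datasets if name not in parents]
    # Find the raw datasets that actually contain candidate records.
    touched = {
        name for name in raw_names
        if any(rec.get(field) in values for rec in datasets[name])
    }
    # Delete the candidate records from those raw datasets.
    for name in touched:
        datasets[name] = [r for r in datasets[name] if r.get(field) not in values]

    # Walk the derivation graph with inputs before outputs and rebuild
    # anything that has at least one changed (stale) input.
    stale = set(touched)
    rebuilt = []
    for name in TopologicalSorter(parents).static_order():
        if name in parents and stale.intersection(parents[name]):
            inputs = [datasets[p] for p in parents[name]]
            datasets[name] = builders[name](*inputs)
            stale.add(name)
            rebuilt.append(name)
    return rebuilt


# Hypothetical example data: two raw datasets and three derived ones.
datasets = {
    "users_raw": [{"user": "ann"}, {"user": "bob"}],
    "orders_raw": [{"user": "cy", "item": "pen"}],
    "user_names": [{"user": "ann"}, {"user": "bob"}],
    "name_count": [{"count": 2}],
    "order_items": [{"item": "pen"}],
}
parents = {
    "user_names": ["users_raw"],
    "name_count": ["user_names"],
    "order_items": ["orders_raw"],
}
builders = {
    "user_names": lambda users: [{"user": r["user"]} for r in users],
    "name_count": lambda names: [{"count": len(names)}],
    "order_items": lambda orders: [{"item": r["item"]} for r in orders],
}

rebuilt = propagate_deletion(datasets, parents, builders, "user", {"bob"})
```

In this sketch only the datasets downstream of a changed raw dataset are rebuilt, so `order_items` is left alone because `orders_raw` holds no matching records. A real system would also need to handle persistence, access control, and build failures, none of which are modeled here.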
