The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Oct. 14, 2025
Filed:
Jun. 28, 2022
Applicant:
Hewlett Packard Enterprise Development LP, Houston, TX (US);
Inventors:
Annmary Justine Koomthanam, Bangalore, IN;
Suparna Bhattacharya, Bangalore, IN;
Aalap Tripathy, Houston, TX (US);
Sergey Serebryakov, Milpitas, CA (US);
Martin Foltin, Ft. Collins, CO (US);
Paolo Faraboschi, Milpitas, CA (US);
Assignee:
Hewlett Packard Enterprise Development LP, Spring, TX (US);
Abstract
Systems and methods are provided for automatically constructing data lineage representations for distributed data processing pipelines. These data lineage representations (which are constructed and stored in a central repository shared by the multiple data processing sites) can be used to, among other things, clone the distributed data processing pipeline for quality assurance or debugging purposes. Examples of the presently disclosed technology are able to construct data lineage representations for distributed data processing pipelines by (1) generating a hash content value for universally identifying each data artifact of the distributed data processing pipeline across the multiple processing stages/processing sites of the distributed data processing pipeline; and (2) creating a data processing pipeline abstraction hierarchy for associating each data artifact to input and output events for given executions of given data processing stages (performed by the multiple data processing sites).
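The two steps named in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not specify a hash function, a schema, or an API, so the choice of SHA-256 and the names `content_hash`, `Execution`, `LineageStore`, `record`, and `upstream` are all assumptions made for illustration. The idea shown is that an ID derived from an artifact's bytes is the same at every site, and that recording each stage execution with its input and output IDs is enough to trace an artifact back through the pipeline.

```python
import hashlib
from dataclasses import dataclass, field


def content_hash(data: bytes) -> str:
    """Derive an artifact ID from its bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Execution:
    """Hypothetical record: one run of a stage at a site, with artifact IDs."""
    stage: str
    site: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


@dataclass
class LineageStore:
    """Hypothetical stand-in for the central repository shared by all sites."""
    executions: list[Execution] = field(default_factory=list)

    def record(self, stage: str, site: str,
               inputs: list[bytes], outputs: list[bytes]) -> Execution:
        """Store an execution, linking it to input and output artifact IDs."""
        run = Execution(stage, site,
                        [content_hash(d) for d in inputs],
                        [content_hash(d) for d in outputs])
        self.executions.append(run)
        return run

    def upstream(self, artifact_id: str) -> list[Execution]:
        """Walk back from an artifact to every execution that led to it."""
        chain: list[Execution] = []
        frontier = [artifact_id]
        seen: set[str] = set()
        while frontier:
            current = frontier.pop()
            if current in seen:
                continue
            seen.add(current)
            for run in self.executions:
                if current in run.outputs and run not in chain:
                    chain.append(run)
                    frontier.extend(run.inputs)
        return chain


# Two stages run at different sites; the shared artifact links them by hash.
store = LineageStore()
raw, clean, model = b"sensor readings", b"cleaned readings", b"model weights"
store.record("clean", "site-a", [raw], [clean])
store.record("train", "site-b", [clean], [model])
lineage = store.upstream(content_hash(model))
print([(run.stage, run.site) for run in lineage])
```

Because the IDs depend only on content, site-b never needs to be told where its input came from; the match on the hash of `clean` is what connects the two executions.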