The patent badge is an abbreviated version of the USPTO patent document. It contains a link to the full patent document.

The patent badge covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The full patent document linked from the badge is in Adobe Acrobat (PDF) format.

Date of Patent:

Jan. 26, 2021

Filed:

Jul. 20, 2018

Applicant:

EMC IP Holding Company LLC, Hopkinton, MA (US);

Inventors:

Vinícius Michel Gottin, Rio de Janeiro, BR;

Jonas F. Dias, Rio de Janeiro, BR;

Edward José Pacheco Condori, Rio de Janeiro, BR;

Angelo E. M. Ciarlini, Rio de Janeiro, BR;

Bruno Carlos da Cunha Costa, Teresópolis, BR;

Fábio André Machado Porto, Petrópolis, BR;

Paulo de Figueiredo Pires, Rio de Janeiro, BR;

Yania Molina Souto, Petrópolis, BR;

Wagner dos Santos Vieira, Rio de Janeiro, BR;

Assignee:

EMC IP Holding Company LLC, Hopkinton, MA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 9/48 (2006.01); G06F 9/50 (2006.01);
U.S. Cl.
CPC ...
G06F 9/4881 (2013.01); G06F 9/485 (2013.01); G06F 9/5083 (2013.01);
Abstract

Techniques are provided for dataflow execution time estimation for distributed processing frameworks. An exemplary method comprises: obtaining an input dataset for a dataflow for execution; determining a substantially minimal data unit for a given operation of the dataflow processed by the given operation; estimating a number of rounds required to execute a number of data units in the input dataset using nodes assigned to execute the given operation; determining an execution time spent by the given operation to process one data unit; estimating the execution time for the given operation based on the execution time spent by the given operation to process one data unit and the number of rounds required to execute the number of data units in the input dataset; and executing the given operation with the input dataset. A persistent cost model is optionally employed to record the execution times of known dataflow operations.
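The estimation steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each node processes one data unit per round, so the number of rounds is the ceiling of data units over nodes, and that the operation's estimated time is the number of rounds multiplied by the time to process one data unit; the abstract states neither formula explicitly. The names estimate_rounds, estimate_operation_time, and CostModel are hypothetical, and the cost model here is a plain in-memory dictionary standing in for a persistent store.

```python
from typing import Dict, Optional


def estimate_rounds(num_data_units: int, num_nodes: int) -> int:
    """Rounds needed if every node handles one data unit per round (assumption)."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    if num_data_units < 0:
        raise ValueError("num_data_units must be non-negative")
    # Integer ceiling division avoids floating-point rounding on large inputs.
    return (num_data_units + num_nodes - 1) // num_nodes


def estimate_operation_time(
    num_data_units: int, num_nodes: int, time_per_unit: float
) -> float:
    """Estimated time = rounds * time to process one data unit (assumption)."""
    return estimate_rounds(num_data_units, num_nodes) * time_per_unit


class CostModel:
    """Hypothetical in-memory stand-in for a persistent cost model."""

    def __init__(self) -> None:
        self._time_per_unit: Dict[str, float] = {}

    def record(self, operation: str, time_per_unit: float) -> None:
        # Remember the measured per-unit time of a known operation.
        self._time_per_unit[operation] = time_per_unit

    def lookup(self, operation: str) -> Optional[float]:
        # Returns None when the operation has not been seen before.
        return self._time_per_unit.get(operation)


if __name__ == "__main__":
    model = CostModel()
    model.record("map", 2.5)  # illustrative value: seconds per data unit
    per_unit = model.lookup("map")
    if per_unit is not None:
        print(estimate_operation_time(10, 4, per_unit))
```

Under these assumptions, 10 data units on 4 nodes take 3 rounds, so an operation that needs 2.5 seconds per data unit is estimated at 7.5 seconds.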

