The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The full patent document is linked in Adobe Acrobat (PDF) format.

Date of Patent:
Oct. 10, 2017

Filed:
Sep. 13, 2014

Applicant:
International Business Machines Corporation, Armonk, NY (US);

Inventors:
Atreyee Dey, Bangalore, IN;
Prasan Roy, Bangalore, IN;

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06F 17/30 (2006.01); G06F 11/36 (2006.01);
U.S. Cl.
CPC ...
G06F 17/30289 (2013.01); G06F 11/36 (2013.01);
Abstract

Generation of synthetic database data includes annotated query subplans for a multiple-table query workload that includes a desired cardinality for nodes (v) in the subplans. The subplans may be merged and represented by a directed acyclic graph (DAG). The maximum entropy joint probability distribution for each attribute (x) for each node (v) is determined as: $p(x) = \frac{1}{Z} \exp\left(\sum_{v} w_v f_v(x)\right)$, summing over each node v, where $w_v$ is a weight of node v, $f_v$ is a conjunct of predicates in a subplan rooted at node v, and Z is a normalization factor. This distribution is determined such that the desired cardinality, and selectivities for each node v determined from the desired cardinality, are satisfied. The data for a plurality of tables are generated by sampling the maximum entropy joint probability distribution for a domain of attributes (x) of a plurality of tables. Data may be efficiently generated for multiple-table queries and for DAGs.
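
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes a single toy table rather than multiple tables or a DAG of merged subplans, and it fits the weights by plain gradient ascent, which the abstract does not specify. The schema, the names `DOMAIN`, `NODES`, `TABLE_SIZE`, `fit_weights`, and `generate_rows`, and the example predicates and cardinalities are all hypothetical. Each node's target selectivity is taken as its desired cardinality divided by the table size.

```python
import itertools
import math
import random

# Hypothetical toy schema: one table with two small integer attributes (a, b).
DOMAIN = list(itertools.product(range(4), range(3)))

# Hypothetical annotated subplan nodes: each f_v is a conjunct of predicates,
# paired with the desired cardinality for that node.
TABLE_SIZE = 1000
NODES = [
    (lambda x: x[0] >= 2, 300),               # select a >= 2
    (lambda x: x[0] >= 2 and x[1] == 0, 60),  # select a >= 2 AND b = 0
]

def distribution(weights):
    """Return p(x) = exp(sum_v w_v * f_v(x)) / Z for every x in DOMAIN."""
    scores = [
        math.exp(sum(w for w, (f, _) in zip(weights, NODES) if f(x)))
        for x in DOMAIN
    ]
    z = sum(scores)  # normalization factor Z
    return [s / z for s in scores]

def fit_weights(steps=5000, lr=0.5):
    """Assumed fitting procedure: gradient ascent that adjusts each w_v until
    the probability of f_v under p matches the node's target selectivity."""
    targets = [card / TABLE_SIZE for _, card in NODES]
    weights = [0.0] * len(NODES)
    for _ in range(steps):
        p = distribution(weights)
        expected = [
            sum(pi for pi, x in zip(p, DOMAIN) if f(x)) for f, _ in NODES
        ]
        weights = [
            w + lr * (t - e) for w, t, e in zip(weights, targets, expected)
        ]
    return weights

def generate_rows(weights, n=TABLE_SIZE, seed=0):
    """Sample n synthetic rows from the fitted distribution."""
    rng = random.Random(seed)
    return rng.choices(DOMAIN, weights=distribution(weights), k=n)

weights = fit_weights()
p = distribution(weights)
rows = generate_rows(weights)
```

After fitting, the selectivities computed from `p` match the targets closely, while the counts in the sampled `rows` match the desired cardinalities only approximately because of sampling noise. Enumerating the full domain is feasible only for toy examples like this one.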

