The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.
The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.
Patent No.:
Date of Patent:
Oct. 10, 2017
Filed:
Sep. 13, 2014
International Business Machines Corporation, Armonk, NY (US);
Atreyee Dey, Bangalore, IN;
Prasan Roy, Bangalore, IN;
International Business Machines Corporation, Armonk, NY (US);
Abstract
Generation of synthetic database data includes annotated query subplans for a multiple table query workload that includes a desired cardinality for nodes (v) in the subplans. The subplans may be merged and represented by a direct acyclic graph (DAG). The maximum entropy joint probability distribution for each attribute (x) for each node (v) is determined as: for each node v, where wis a weight of node v, fis a conjunct of predicates in a subplan rooted at node v, and Z is a normalization factor. This distribution is determined such that the desired cardinality, and selectivities for each node v determined from the desired cardinality, are satisfied. The data for a plurality of tables are generated by sampling the maximum entropy joint probability distribution for a domain of attributes (x) of a plurality of tables. Data may be efficiently generated for multiple table queries and for DAGs.