The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.
The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.
Patent No.:
Date of Patent:
Jun. 02, 2009
Filed:
Aug. 31, 2006
Paul Geoffrey Brown, San Jose, CA (US);
Peter Jay Haas, San Jose, CA (US);
Paul Geoffrey Brown, San Jose, CA (US);
Peter Jay Haas, San Jose, CA (US);
International Business Machines Corporation, Armonk, NY (US);
Abstract
A sampling infrastructure/scheme that supports flexible, efficient, scalable and uniform sampling is disclosed. A sample is maintained in a compact histogram form while the sample footprint stays below a specified upper bound. If, at any point, the sample footprint exceeds the upper bound, then the compact representation is abandoned, the sample purged to obtain a subsample. The histogram of the purged subsample is expanded to a bag of values while sampling remaining data values of the partitioned subset. The expanded purged subsample is converted to a histogram and uniform random samples are yielded. The sampling scheme retains the bounded footprint property and to a partial degree the compact representation of the Concise Sampling scheme, while ensuring statistical uniformity. Samples from at least two partitioned subsets are merged on demand to yield uniform merged samples of combined partitions wherein the merged samples also maintain the histogram representation and bounded footprint property.