The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.

Date of Patent:
Nov. 16, 1999

Filed:

Dec. 12, 1997
Applicant:
Inventors:

Vineet Singh, San Jose, CA (US);

Khaled Alsabti, Gainesville, FL (US);

Sanjay Ranka, Gainesville, FL (US);

Assignee:

Hitachi America Ltd., Tarrytown, NY (US);

Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06F / ;
U.S. Cl.
CPC ...
707100 ; 707-6 ; 707-7 ; 710132 ; 39580011 ;
Abstract

Multidimensional similarity join finds pairs of multi-dimensional points that are within some small distance of each other. Databases in domains such as multimedia and time-series can require a high number of dimensions. The .epsilon.-k-d-B tree has been proposed as a data structure that scales better as number of dimensions increases compared to previous data structures such as the R-tree (and variations), grid-file, and k-d-B tree. We present a cost model of the .epsilon.-k-d-B tree and use it to optimize the leaf size. This new leaf size is shown to be better in most situations compared to previous work that used a constant leaf size. We present novel parallel procedures for the .epsilon.-k-d-B tree. A load-balancing strategy based on equi-depth histograms is shown to work well for uniform or low-skew situations, whereas another based on weighted, equi-depth histograms works far better for high-skew datasets. The latter strategy is only slightly slower than the former strategy for low skew datasets. The weights for the latter strategy are based on the same cost model that is used to determine optimal leaf sizes.


Find Patent Forward Citations

Loading…