The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
Nov. 09, 1999
Filed:
Oct. 31, 1997
Vineet Singh, San Jose, CA (US);
Sanjay Ranka, Gainesville, FL (US);
Khaled Alsabti, Gainesville, FL (US);
Hitachi America, Ltd., Tarrytown, NY (US);
Abstract
The present invention is directed to an improved data clustering method and apparatus for use in data mining operations. The present invention determines the pattern vectors of a k-d tree structure which are closest to a given prototype cluster by pruning prototypes through geometrical constraints, before a k-means process is applied to the prototypes. For each sub-branch in the k-d tree, a candidate set of prototypes is formed from the parent of a child node. The minimum and maximum distances from any point in the child node to each prototype in the candidate set are determined. The smallest of the maximum distances found is compared to the minimum distance of each prototype in the candidate set. Those prototypes with a minimum distance greater than the smallest of the maximum distances are pruned or eliminated. Pruning remote prototypes reduces the number of distance calculations for the k-means process, significantly reducing the overall computation time.
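The pruning rule described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes each k-d tree node is summarized by an axis-aligned bounding box given by corner points `lo` and `hi`, and it assumes Euclidean distance, which the abstract does not specify. The function names `min_max_dist` and `prune_candidates` are hypothetical and chosen only for this example.

```python
import math


def min_max_dist(lo, hi, p):
    """Return (min, max) Euclidean distance from prototype p to the box [lo, hi].

    Assumption: the node's points all lie inside this axis-aligned box.
    """
    min_sq = 0.0
    max_sq = 0.0
    for l, h, x in zip(lo, hi, p):
        # Closest coordinate inside the box: clamp x to [l, h].
        nearest = min(max(x, l), h)
        min_sq += (x - nearest) ** 2
        # Farthest coordinate is the box face farther from x.
        max_sq += max(abs(x - l), abs(x - h)) ** 2
    return math.sqrt(min_sq), math.sqrt(max_sq)


def prune_candidates(lo, hi, candidates):
    """Keep only prototypes that could be nearest to some point in the box."""
    dists = [min_max_dist(lo, hi, p) for p in candidates]
    # Smallest of the maximum distances over the candidate set.
    smallest_max = min(d_max for _, d_max in dists)
    # A prototype whose minimum distance exceeds that bound is pruned.
    return [p for p, (d_min, _) in zip(candidates, dists) if d_min <= smallest_max]


if __name__ == "__main__":
    box_lo, box_hi = (0.0, 0.0), (1.0, 1.0)
    prototypes = [(-0.5, 0.5), (1.5, 0.5), (10.0, 10.0)]
    # The remote prototype (10.0, 10.0) is pruned; the two nearby ones survive.
    print(prune_candidates(box_lo, box_hi, prototypes))
```

In this sketch the surviving list would be passed down as the candidate set for the child's own sub-branches, so the k-means distance calculations at the leaves only involve prototypes that were not pruned higher in the tree.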