The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.
The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.
Patent No.:
Date of Patent:
Nov. 16, 2010
Filed:
Nov. 17, 2005
William K. Perrizo, Fargo, ND (US);
Taufik Fuadi Abidin, Piscataway, NJ (US);
Amal Shehan Perera, Knoxville, TN (US);
Masum Serazi, Edison, NJ (US);
William K. Perrizo, Fargo, ND (US);
Taufik Fuadi Abidin, Piscataway, NJ (US);
Amal Shehan Perera, Knoxville, TN (US);
Masum Serazi, Edison, NJ (US);
NDSU Research Foundation, Fargo, ND (US);
Abstract
A system and method for performing and accelerating cluster analysis of large data sets is presented. The data set is formatted into binary bit Sequential (bSQ) format and then structured into a Peano Count tree (P-tree) format which represents a lossless tree representation of the original data. A P-tree algebra is defined and used to formulate a vertical set inner product (VSIP) technique that can be used to efficiently and scalably measure the mean value and total variation of a set about a fixed point in the large dataset. The set can be any projected subspace of any vector space, including oblique sub spaces. The VSIPs are used to determine the closeness of a point to a set of points in the large dataset making the VSIPs very useful in classification, clustering and outlier detection. One advantage is that the number of centroids (k) need not be pre-specified but are effectively determined. The high quality of the centroids makes them useful in partitioning clustering methods such as the k-means and the k-medoids clustering. The present invention also identifies the outliers.