The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
Sep. 14, 2010
Filed:
Feb. 25, 2008
Applicants:
Klaus Brinker, Princeton, NJ (US);
Fabian Moerchen, Princeton, NJ (US);
Bernhard Glomann, Bayonne, NJ (US);
Claus Neubauer, Monmouth Junction, NJ (US);
Inventors:
Klaus Brinker, Princeton, NJ (US);
Fabian Moerchen, Princeton, NJ (US);
Bernhard Glomann, Bayonne, NJ (US);
Claus Neubauer, Monmouth Junction, NJ (US);
Assignee:
Siemens Corporation, Iselin, NJ (US);
Abstract
Documents from a data stream are clustered by first generating a feature vector for each document. A set of cluster centroids (e.g., feature vectors of their corresponding clusters) are retrieved from a memory based on the feature vector of the document using a locality sensitive hashing function. The centroids may be retrieved by retrieving a set of cluster identifiers from a cluster table, the cluster identifiers each indicative of a respective cluster centroid, and retrieving the cluster centroids corresponding to the retrieved cluster identifiers from a memory. Documents may then be clustered into one or more of the candidate clusters using distance measures from the feature vector of the document to the cluster centroids.
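The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name a hash family, a feature extractor, or a distance measure, so the sketch assumes random-hyperplane (sign) hashing, a hashed bag-of-words feature vector, and cosine distance. The names `StreamClusterer`, `feature_vector`, `lsh_key`, `DIM`, `NUM_BITS`, and `THRESHOLD` are hypothetical. For simplicity each document joins only its single nearest candidate cluster, whereas the abstract allows one or more.

```python
import math
import random
import zlib
from collections import defaultdict

DIM = 64         # feature vector length (assumed value)
NUM_BITS = 8     # hyperplanes per hash key (assumed value)
THRESHOLD = 0.5  # max cosine distance to join a cluster (assumed value)


def feature_vector(text):
    """Hashed bag-of-words, L2-normalized (a stand-in feature extractor)."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine_distance(a, b):
    """1 - cosine similarity; defined as 1.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


class StreamClusterer:
    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Random hyperplanes: one sign bit per plane forms the hash key.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(DIM)]
                       for _ in range(NUM_BITS)]
        self.cluster_table = defaultdict(set)  # hash key -> cluster identifiers
        self.centroids = {}                    # cluster identifier -> centroid
        self.sizes = {}                        # cluster identifier -> member count

    def lsh_key(self, vec):
        return tuple(sum(p * x for p, x in zip(plane, vec)) >= 0.0
                     for plane in self.planes)

    def add(self, text):
        """Assign one document from the stream and return its cluster identifier."""
        vec = feature_vector(text)
        key = self.lsh_key(vec)
        # Step 1: look up candidate cluster identifiers in the cluster table.
        candidates = self.cluster_table.get(key, ())
        # Step 2: fetch their centroids and keep the nearest within THRESHOLD.
        best_id, best_dist = None, THRESHOLD
        for cid in candidates:
            dist = cosine_distance(vec, self.centroids[cid])
            if dist <= best_dist:
                best_id, best_dist = cid, dist
        if best_id is None:
            # No candidate is close enough: start a new cluster.
            best_id = len(self.centroids)
            self.centroids[best_id] = vec
            self.sizes[best_id] = 1
        else:
            # Fold the document into the centroid as a running mean.
            n = self.sizes[best_id]
            old = self.centroids[best_id]
            self.centroids[best_id] = [(c * n + x) / (n + 1)
                                       for c, x in zip(old, vec)]
            self.sizes[best_id] = n + 1
        # Simplification: register the cluster under the document's key
        # instead of re-hashing the updated centroid.
        self.cluster_table[key].add(best_id)
        return best_id


clusterer = StreamClusterer()
first = clusterer.add("turbine vibration alarm raised")
second = clusterer.add("turbine vibration alarm raised")
```

Because both calls pass the same text, they hash to the same key and `second` equals `first`. Only the centroids registered under that key are compared, which is the point of the lookup: the cost per document depends on the size of the candidate set rather than on the total number of clusters.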