The patent badge is an abbreviated version of the USPTO patent document. It lists the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
Jul. 20, 1999
Filed:
Mar. 28, 1997
Inventors:
Oliver A Hilsenrath, Alamo, CA (US);
Ron Carmel, San Ramon, CA (US);
Hagai Ariel, San Ramon, CA (US);
Assignee:
Mantra Technologies, Inc., San Ramon, CA (US);
Abstract
A computer-implemented method for comparing the contents of two sets of documents includes the step of extracting from a set of documents [44] corresponding sets of document extract entries [46]. The method further includes a step of generating from the sets of document extract entries [46] corresponding sets of word clusters [48]. Each word cluster comprises a cluster word list having N words, an N×N total distance matrix, and an N×N number of connections matrix. The preferred embodiment includes a step of grouping similar word clusters and combining the similar word clusters to form a single word cluster for each group. The grouping comprises evaluating a measure of cluster similarity between two word clusters, and placing them in a common group of similar word clusters if the measure of similarity exceeds a predetermined value. The step of evaluating cluster similarity comprises intersecting clusters to form subclusters and calculating a function of the subclusters. In the preferred embodiment, the method is implemented in a system to automatically identify database documents which are of interest to a given user or users. In this implementation, the method comprises the step of automatically deriving the first set of documents from a local data storage device, such as a user's hard disk. The method also comprises the step of deriving the second set of documents from a second data storage device, such as a network machine. This application of the invention, therefore, provides fast and accurate searching to identify documents of interest to a particular user or users without any need for the user or users to specify what search criteria to use.
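The cluster structure and grouping step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how distances are measured or which function of the subclusters is used, so everything specific here is an assumption made for illustration: the names WordCluster, build_cluster, intersect, similarity, merge, and group_and_combine are hypothetical, distance is taken to be the gap between word positions in an extract entry, and similarity is taken to be the share of connections that survive in the intersected subclusters. It is not the patented implementation.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class WordCluster:
    words: list        # cluster word list (N words)
    distance: list     # N x N total distance matrix
    connections: list  # N x N number-of-connections matrix


def _empty(n):
    return [[0] * n for _ in range(n)]


def build_cluster(entry_words):
    """Build one cluster from one document extract entry (a list of words).
    Assumption: distance between two words is their positional gap."""
    words = sorted(set(entry_words))
    index = {w: i for i, w in enumerate(words)}
    distance, connections = _empty(len(words)), _empty(len(words))
    for (p, a), (q, b) in combinations(enumerate(entry_words), 2):
        if a == b:
            continue  # a word is not connected to itself
        i, j = index[a], index[b]
        distance[i][j] += q - p
        distance[j][i] += q - p
        connections[i][j] += 1
        connections[j][i] += 1
    return WordCluster(words, distance, connections)


def intersect(cluster, shared):
    """Restrict a cluster to the shared words, giving a subcluster."""
    keep = [i for i, w in enumerate(cluster.words) if w in shared]
    return WordCluster(
        [cluster.words[i] for i in keep],
        [[cluster.distance[i][j] for j in keep] for i in keep],
        [[cluster.connections[i][j] for j in keep] for i in keep],
    )


def _total_connections(cluster):
    return sum(map(sum, cluster.connections))


def similarity(a, b):
    """Assumed measure: fraction of all connections kept by the subclusters."""
    shared = set(a.words) & set(b.words)
    total = _total_connections(a) + _total_connections(b)
    if total == 0:
        return 0.0
    kept = (_total_connections(intersect(a, shared))
            + _total_connections(intersect(b, shared)))
    return kept / total


def merge(a, b):
    """Combine two clusters into one by summing their matrices."""
    words = sorted(set(a.words) | set(b.words))
    index = {w: i for i, w in enumerate(words)}
    distance, connections = _empty(len(words)), _empty(len(words))
    for c in (a, b):
        for i, wi in enumerate(c.words):
            for j, wj in enumerate(c.words):
                distance[index[wi]][index[wj]] += c.distance[i][j]
                connections[index[wi]][index[wj]] += c.connections[i][j]
    return WordCluster(words, distance, connections)


def group_and_combine(clusters, threshold):
    """Greedy grouping (an assumption): a cluster joins the first group whose
    combined cluster it resembles by more than the threshold."""
    groups = []
    for c in clusters:
        for k, rep in enumerate(groups):
            if similarity(rep, c) > threshold:
                groups[k] = merge(rep, c)
                break
        else:
            groups.append(c)
    return groups
```

For example, clusters built from the entries "fast document search" and "document search engine" share two words and would be combined at a threshold of 0.3 under this measure, while a cluster built from "apple pie" would stay in its own group.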