The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.
The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.
Patent No.:
Date of Patent:
Jan. 27, 2009
Filed:
Jan. 24, 2005
Matthew S. Sommer, Addison, TX (US);
Kevin B. Thompson, Evanston, IL (US);
Matthew S. Sommer, Addison, TX (US);
Kevin B. Thompson, Evanston, IL (US);
Kroll Ontrack, Inc., Eden Prairie, MN (US);
Abstract
A term-by-document matrix is compiled from a corpus of documents representative of a particular subject matter that represents the frequency of occurrence of each term per document. A weighted term dictionary is created using a global weighting algorithm and then applied to the term-by-document matrix forming a weighted term-by-document matrix. A term vector matrix and a singular value concept matrix are computed by singular value decomposition of the weighted term-document index. The k largest singular concept values are kept and all others are set to zero thereby reducing to the concept dimensions in the term vector matrix and a singular value concept matrix. The reduced term vector matrix, reduced singular value concept matrix and weighted term-document dictionary can be used to project pseudo-document vectors representing documents not appearing in the original document corpus in a representative semantic space. The similarities of those documents can be ascertained from the position of their respective pseudo-document vectors in the representative semantic space.