The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Nov. 30, 2010

Filed:
Mar. 30, 2006

Applicants:

Chenxi Lin, Beijing, CN;

Jie Han, Beijing, CN;

Guirong Xue, Beijing, CN;

Hua-Jun Zeng, Beijing, CN;

Benyu Zhang, Beijing, CN;

Zheng Chen, Beijing, CN;

Jian Wang, Beijing, CN;

Inventors:

Chenxi Lin, Beijing, CN;

Jie Han, Beijing, CN;

Guirong Xue, Beijing, CN;

Hua-Jun Zeng, Beijing, CN;

Benyu Zhang, Beijing, CN;

Zheng Chen, Beijing, CN;

Jian Wang, Beijing, CN;

Assignee:

Microsoft Corporation, Redmond, WA (US);

Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06F 17/27 (2006.01);
U.S. Cl.
CPC ...
Abstract

A scalable two-pass probabilistic latent semantic analysis (PLSA) methodology is disclosed that may perform more efficiently, and in some cases more accurately, than traditional PLSA, especially where large and/or sparse data sets are provided for analysis. The improved methodology can greatly reduce the storage and/or computational costs of training a PLSA model. In the first pass of the two-pass methodology, objects are clustered into groups, and PLSA is performed on the groups instead of the original individual objects. In the second pass, the conditional probability of a latent class, given an object, is obtained. This may be done by extending the training results of the first pass. During the second pass, the most likely latent classes for each object are identified.
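
The two passes described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several things here are assumptions that do not come from the patent text: the group assignment is supplied by the caller (the abstract does not say how objects are clustered), objects are represented as word-count rows, and the second pass uses the standard PLSA "folding-in" step, where P(word | class) from the first pass is held fixed and only P(class | object) is re-estimated. The names `plsa_em`, `two_pass_plsa`, and `group_of` are hypothetical.

```python
import random


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa_em(counts, n_classes, n_iters=50, word_given_class=None, seed=0):
    """EM for PLSA on a row-by-word count matrix.

    Returns (P(class | row), P(word | class)). If word_given_class is
    supplied, it is held fixed and only P(class | row) is estimated
    (the folding-in variant, assumed here for the second pass).
    """
    rng = random.Random(seed)
    n_rows, n_words = len(counts), len(counts[0])
    fixed = word_given_class is not None
    p_w_z = word_given_class if fixed else [
        normalize([rng.random() + 0.1 for _ in range(n_words)])
        for _ in range(n_classes)
    ]
    p_z_r = [
        normalize([rng.random() + 0.1 for _ in range(n_classes)])
        for _ in range(n_rows)
    ]
    for _ in range(n_iters):
        new_w_z = [[0.0] * n_words for _ in range(n_classes)]
        new_z_r = [[0.0] * n_classes for _ in range(n_rows)]
        for r in range(n_rows):
            for w in range(n_words):
                if counts[r][w] == 0:
                    continue  # sparse data: skip empty cells
                # E-step: posterior P(class | row, word)
                post = normalize(
                    [p_z_r[r][z] * p_w_z[z][w] for z in range(n_classes)]
                )
                for z in range(n_classes):
                    mass = counts[r][w] * post[z]
                    new_z_r[r][z] += mass
                    new_w_z[z][w] += mass
        # M-step: renormalize the accumulated expected counts
        p_z_r = [normalize(row) for row in new_z_r]
        if not fixed:
            p_w_z = [normalize(row) for row in new_w_z]
    return p_z_r, p_w_z


def two_pass_plsa(counts, group_of, n_classes, n_iters=50):
    """Two-pass sketch; group_of[i] is the (assumed given) group of object i."""
    n_groups, n_words = max(group_of) + 1, len(counts[0])
    # Merge the objects of each group into one aggregated count row.
    grouped = [[0] * n_words for _ in range(n_groups)]
    for row, g in zip(counts, group_of):
        for w, c in enumerate(row):
            grouped[g][w] += c
    # Pass 1: train PLSA on the groups instead of the individual objects.
    _, p_w_z = plsa_em(grouped, n_classes, n_iters)
    # Pass 2: extend the pass-1 result to obtain P(class | object).
    p_z_d, _ = plsa_em(counts, n_classes, n_iters, word_given_class=p_w_z)
    # Most likely latent class for each object.
    top = [max(range(n_classes), key=lambda z: probs[z]) for probs in p_z_d]
    return p_z_d, top


# Toy data: objects 0-1 use words 0-1, objects 2-3 use words 2-3.
counts = [[4, 3, 0, 0], [4, 3, 0, 0], [0, 0, 3, 5], [0, 0, 4, 4]]
p_z_d, top = two_pass_plsa(counts, group_of=[0, 0, 1, 1], n_classes=2)
```

In this sketch the first pass runs EM over one row per group rather than one row per object, which is where the storage and computation savings mentioned in the abstract would come from when there are far fewer groups than objects. The second pass touches each object only to estimate its own class distribution.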

