The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Jan. 24, 2012

Filed:
Oct. 10, 2008

Applicants:

Rakesh Gupta, Mountain View, CA (US);

Lev Ratinov, Raymond, OH (US);

Inventors:

Rakesh Gupta, Mountain View, CA (US);

Lev Ratinov, Raymond, OH (US);

Assignee:
Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06F 7/00 (2006.01); G06F 17/30 (2006.01);
U.S. Cl.
CPC ...
Abstract

The present invention provides a method for incorporating features from heterogeneous auxiliary datasets into input text data for use in classification. Heterogeneous auxiliary datasets, such as labeled datasets and unlabeled datasets, are accessed after receiving input text data. Features are extracted from each of the heterogeneous auxiliary datasets. The features are combined with the input text data to generate a set of features which may potentially be used to classify the input text data. Classification features are then extracted from the set of features and used to classify the input text data. In one embodiment, the classification features are extracted by calculating a mutual information value associated with each feature in the set of features and identifying features having a mutual information value exceeding a threshold value.
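The selection step described in the abstract can be illustrated with a minimal sketch. It assumes each input text has already been combined with features from the auxiliary datasets into a set of feature strings. The `aux:` prefix, the toy documents, the labels, and the threshold of 0.5 bits are all hypothetical and do not come from the patent. The sketch scores each feature by its mutual information with the class label and keeps those exceeding the threshold. It is not the patented implementation.

```python
import math
from collections import Counter


def mutual_information(feature_values, labels):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    px = Counter(feature_values)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written in terms of raw counts
        mi += p_xy * math.log2(count * n / (px[x] * py[y]))
    return mi


def select_features(docs, labels, threshold):
    """Keep features whose mutual information with the label exceeds threshold."""
    vocabulary = set().union(*docs)
    scores = {}
    for feature in vocabulary:
        # Binary presence/absence of the feature in each document
        presence = [feature in doc for doc in docs]
        scores[feature] = mutual_information(presence, labels)
    return {f: s for f, s in scores.items() if s > threshold}


# Hypothetical toy data: word features plus "aux:" features that stand in
# for features extracted from auxiliary datasets.
docs = [
    {"where", "is", "cup", "aux:location"},
    {"where", "is", "key", "aux:location"},
    {"who", "is", "he", "aux:person"},
    {"who", "is", "she", "aux:person"},
]
labels = ["LOC", "LOC", "PER", "PER"]

selected = select_features(docs, labels, threshold=0.5)
print(sorted(selected))
```

In this toy data, a feature such as "is" appears in every document and carries no information about the label, so it is filtered out. Features that split the classes cleanly score one bit and are retained.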

