The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Oct. 05, 1993

Filed:

Jul. 31, 1991

Applicant:
Inventors:

Brij M Masand, Medford, MA (US);

Stephen J Smith, Lynnfield, MA (US);

Assignee:

Thinking Machines Corporation, Cambridge, MA (US);

Attorneys:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06F / ; G01L / ;
U.S. Cl.
CPC ...
36441908 ;
Abstract

Classification of natural language data wherein the natural language data has an open-ended range of possible values or the data values do not have a relative order. A training database stores training records, wherein each training record includes predictor data fields, each containing a feature that is a natural language term, and a target data field containing a target value representing a classification of the record. Features may also include conjunctions of natural language terms, and each feature may also be a member of a category subset of features. The training database stores, for each feature, a probability weight value representing the probability that a record will have the target value contained in the target data field if the feature contained in a corresponding predictor data field occurs in the record. Features are extracted from a new record, and each feature from the new record is used to query the training records to determine the probability weights from the training records having matching features. The probability weights are accumulated for each training record to determine a comparison score representing the probability that the training record matches the new record, and an output is provided indicating the training records most probably matching the new record.
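
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that features are lowercase whitespace-separated terms, that the probability weight of a feature for a target value is estimated as the fraction of training records containing the feature that carry that target, and that the comparison score is a plain sum of the weights of matching features. The names `extract_features`, `train`, and `rank_matches` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def extract_features(text):
    # Assumption: a feature is a lowercase whitespace-separated term.
    return set(text.lower().split())


def train(records):
    """Build the training database from (text, target) pairs.

    Returns the training records as (features, target) pairs and a dict
    mapping (feature, target) to an estimated probability weight.
    """
    feature_count = Counter()  # number of records containing each feature
    joint_count = Counter()    # number of records with a feature and a target
    indexed = []
    for text, target in records:
        features = extract_features(text)
        indexed.append((features, target))
        for feature in features:
            feature_count[feature] += 1
            joint_count[(feature, target)] += 1
    # Assumed estimate: weight = P(target | feature) from raw counts.
    weights = {
        (feature, target): n / feature_count[feature]
        for (feature, target), n in joint_count.items()
    }
    return indexed, weights


def rank_matches(new_text, indexed, weights, top_k=3):
    """Score each training record against a new record and return the best.

    Each result is (score, training_record_index, target).
    """
    new_features = extract_features(new_text)
    scored = []
    for index, (features, target) in enumerate(indexed):
        # Accumulate the weights of the features shared with the new record.
        score = sum(weights[(f, target)] for f in new_features & features)
        if score > 0:
            scored.append((score, index, target))
    # Highest score first; ties are broken by training record order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_k]


if __name__ == "__main__":
    training = [
        ("software engineer writes code", "engineer"),
        ("registered nurse hospital", "nurse"),
        ("civil engineer builds bridges", "engineer"),
    ]
    indexed, weights = train(training)
    for score, index, target in rank_matches(
        "engineer who writes software", indexed, weights
    ):
        print(index, target, score)
```

Summing the weights keeps the sketch short; conjunctions of terms and category subsets of features, which the abstract also mentions, are left out here.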

