The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent: Aug. 01, 2000
Filed: Mar. 19, 1998
Applicant:
Inventors:
Hideki Yamamoto, Tokyo, JP;
Mihoko Kitamura, Tokyo, JP;
Sayori Shimohata, Tokyo, JP;
Mikio Yamamoto, Tsukuba, JP;

Assignee:
Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F / ;
U.S. Cl.
CPC ...
704/9; 707/531;
Abstract

There is provided a morphological analysis method and device whereby, even if unknown words are present, processing can be performed with high accuracy and at high speed, and with economical use of resources. Expanded characters e_i are generated by adding to each character c_i of the input text, in addition to word division information d_i, expansion information that includes any required, arbitrarily selectable information such as tag information; all possible expanded character sequences are then generated. Beforehand, by training, the partial chain probabilities (appearance probabilities) of N-gram character sequences (where normally N = 1, 2, or 3) are stored in an expanded character table. Partial character sequences are successively extracted from the beginning of each expanded character sequence, their partial chain probabilities are found by referring to the expanded character table, and the product of the probabilities thus found is obtained. This product is found for all the expanded character sequences, and the morphological analysis result is output as a list of word sequences, ordered from the largest product of the corresponding character sequences downward, together with a list of tag sequences and/or arbitrarily selectable information.
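The scoring scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several choices are assumptions made for illustration: each expanded character carries only the word division flag d_i (no tag information), N is fixed at 2, the partial chain probability is taken to be a conditional bigram probability, unseen bigrams receive an arbitrary small floor value, and the names `BOS`, `expand`, `train`, `score`, and `analyze` are hypothetical.

```python
from collections import Counter
from itertools import product

BOS = ("<s>", 0)  # hypothetical start-of-text marker (an assumption)


def expand(words):
    """Turn a segmented sentence into expanded characters (c_i, d_i).

    d_i is 1 when a word boundary follows c_i, otherwise 0.
    """
    out = []
    for word in words:
        for k, ch in enumerate(word):
            out.append((ch, 1 if k == len(word) - 1 else 0))
    return out


def train(corpus):
    """Build the expanded character table from segmented training text."""
    bigrams, contexts = Counter(), Counter()
    for words in corpus:
        seq = [BOS] + expand(words)
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts


def score(seq, table, floor=1e-6):
    """Product of bigram probabilities along one expanded character sequence."""
    bigrams, contexts = table
    p = 1.0
    for prev, cur in zip([BOS] + seq, seq):
        count = bigrams[(prev, cur)]
        # Unseen bigrams get a small floor value (an assumed smoothing choice).
        p *= count / contexts[prev] if count else floor
    return p


def analyze(text, table):
    """Score every possible division of `text` and return the best one."""
    if not text:
        return [], 1.0
    best, best_p = None, -1.0
    # The last character always closes a word, so only len(text) - 1 flags vary.
    for flags in product((0, 1), repeat=len(text) - 1):
        seq = list(zip(text, flags + (1,)))
        p = score(seq, table)
        if p > best_p:
            best, best_p = seq, p
    words, current = [], ""
    for ch, d in best:
        current += ch
        if d:
            words.append(current)
            current = ""
    return words, best_p


table = train([["ab", "c"], ["ab", "ab"], ["c", "ab"]])
print(analyze("abc", table)[0])  # → ['ab', 'c']
```

Enumerating every division is exponential in the text length, so this form is only practical for short inputs. A dynamic-programming search over the same table would be the usual way to scale it, though the abstract does not say how the search is carried out.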

