The patent badge is an abbreviated version of the USPTO patent document and contains a link to the full patent document.

The patent badge covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The full patent document linked from the badge is in Adobe Acrobat (PDF) format.

Patent No.:

US 7139752 B1

Date of Patent:

Nov. 21, 2006

Filed:

May 30, 2003

System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations

Applicants:

Andrei Z Broder, New York, NY (US);

David Carmel, Haifa, IL;

Arthur C Ciccolo, Ridgefield, CT (US);

David Ferrucci, Yorktown Heights, NY (US);

Yoelle Maarek, Haifa, IL;

Yosi Mass, Ramat Gan, IL;

Aya Soffer, Haifa, IL;

Wlodek W Zadrozny, Tarrytown, NY (US);

Inventors:

Andrei Z Broder, New York, NY (US);

David Carmel, Haifa, IL;

Arthur C Ciccolo, Ridgefield, CT (US);

David Ferrucci, Yorktown Heights, NY (US);

Yoelle Maarek, Haifa, IL;

Yosi Mass, Ramat Gan, IL;

Aya Soffer, Haifa, IL;

Wlodek W Zadrozny, Tarrytown, NY (US);

Assignee:

International Business Machines Corporation, Armonk, NY (US);

Attorney:

Harrington & Smith, LLP

Primary Examiner:

Greta Robinson

Int. Cl.

CPC ...

G06F 17/30 (2006.01);

U.S. Cl.

CPC ...

Abstract

Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The search makes use of a two-level searching technique. Also disclosed are a system, method and computer program product to process document data. The method includes inputting a document and operating at least one text analysis engine that comprises a plurality of coupled annotators for tokenizing document data and for identifying and annotating a particular type of semantic content. Operating the at least one text analysis engine generates a plurality of views of a document, where each of the plurality of views is derived from a different tokenization of the document. The method further includes storing the plurality of views in a common data structure associated with the document.
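
The multiple-view idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names (`Annotation`, `View`, `DocumentStore`), the two tokenizers, the two annotators and the `run_engine` function are all hypothetical and chosen only for illustration. The sketch assumes that a "view" is one tokenization of the document plus the annotations produced over it, and that the "common data structure" is a single object holding the text and all of its views.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    kind: str   # type of semantic content, e.g. "capitalized"
    start: int  # character offset where the annotated span begins
    end: int    # character offset one past the end of the span


@dataclass
class View:
    tokens: list                                   # (start, end) character spans
    annotations: list = field(default_factory=list)


@dataclass
class DocumentStore:
    """Hypothetical common data structure holding every view of one document."""
    text: str
    views: dict = field(default_factory=dict)


def whitespace_tokenize(text):
    # Tokens are maximal runs of non-whitespace characters.
    return [m.span() for m in re.finditer(r"\S+", text)]


def word_tokenize(text):
    # Tokens are maximal runs of word characters, so punctuation is dropped.
    return [m.span() for m in re.finditer(r"\w+", text)]


def capitalized_annotator(doc, view):
    # Marks tokens whose first character is an uppercase letter.
    for start, end in view.tokens:
        if doc.text[start].isupper():
            view.annotations.append(Annotation("capitalized", start, end))


def number_annotator(doc, view):
    # Marks tokens that consist only of digits.
    for start, end in view.tokens:
        if doc.text[start:end].isdigit():
            view.annotations.append(Annotation("number", start, end))


def run_engine(text, tokenizers, annotators):
    # Build one view per tokenizer, run the annotators over it in order,
    # and store every view in the same document-level structure.
    doc = DocumentStore(text)
    for name, tokenize in tokenizers.items():
        view = View(tokenize(text))
        for annotate in annotators:
            annotate(doc, view)
        doc.views[name] = view
    return doc


doc = run_engine(
    "IBM filed in 2003.",
    {"whitespace": whitespace_tokenize, "word": word_tokenize},
    [capitalized_annotator, number_annotator],
)
```

In this sketch the two views disagree about the last token: the whitespace view keeps the trailing period, so only the word view yields a "number" annotation. That difference is the reason for keeping several tokenizations side by side in one structure.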
