The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Nov. 21, 2006
Filed:
May 30, 2003
Applicants and Inventors:
Andrei Z Broder, New York, NY (US);
David Carmel, Haifa, IL;
Arthur C Ciccolo, Ridgefield, CT (US);
David Ferrucci, Yorktown Heights, NY (US);
Yoelle Maarek, Haifa, IL;
Yosi Mass, Ramat Gan, IL;
Aya Soffer, Haifa, IL;
Wlodek W Zadrozny, Tarrytown, NY (US);
Assignee:
International Business Machines Corporation, Armonk, NY (US);
Abstract
Disclosed are a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators, and various adapters. The searching technique is a two-level technique. Also disclosed are a system, method and computer program product to process document data. The method includes inputting a document and operating at least one text analysis engine that comprises a plurality of coupled annotators for tokenizing document data and for identifying and annotating a particular type of semantic content. Operating the at least one text analysis engine generates a plurality of views of a document, where each of the plurality of views is derived from a different tokenization of the document. The method further includes storing the plurality of views in a common data structure associated with the document.
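
The document-processing method in the abstract (pipelined annotators, one view per tokenization, all views kept in a common data structure) can be illustrated with a minimal Python sketch. This is not the patented implementation or the API of any existing framework. Every name here (`DocumentStore`, `View`, `Annotation`, `regex_tokenizer`, `capitalized_annotator`, `run_engine`, `covered_text`) is hypothetical, the two regex tokenizations are arbitrary choices, and the "capitalized" annotator is only a stand-in for real semantic annotation.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets into the document text


@dataclass
class Annotation:
    kind: str  # e.g. "token" or "capitalized"
    span: Span


@dataclass
class View:
    name: str
    annotations: List[Annotation] = field(default_factory=list)


@dataclass
class DocumentStore:
    """Hypothetical common data structure: one text, many named views."""
    text: str
    views: Dict[str, View] = field(default_factory=dict)


# An annotator reads the document and appends annotations to one view.
Annotator = Callable[[DocumentStore, View], None]


def regex_tokenizer(pattern: str) -> Annotator:
    """Build a tokenizing annotator; each pattern gives a different tokenization."""
    compiled = re.compile(pattern)

    def annotate(doc: DocumentStore, view: View) -> None:
        for match in compiled.finditer(doc.text):
            view.annotations.append(Annotation("token", match.span()))

    return annotate


def capitalized_annotator(doc: DocumentStore, view: View) -> None:
    """Downstream annotator: marks tokens that start with an uppercase letter.
    (Assumption: a toy substitute for semantic-content annotation.)"""
    for ann in list(view.annotations):
        if ann.kind == "token" and doc.text[ann.span[0]].isupper():
            view.annotations.append(Annotation("capitalized", ann.span))


def run_engine(doc: DocumentStore,
               pipelines: Dict[str, List[Annotator]]) -> DocumentStore:
    """Run each annotator pipeline in order and store its view on the document."""
    for name, annotators in pipelines.items():
        view = View(name)
        for annotate in annotators:
            annotate(doc, view)
        doc.views[name] = view
    return doc


def covered_text(doc: DocumentStore, view_name: str, kind: str) -> List[str]:
    """Return the text covered by annotations of one kind in one view."""
    view = doc.views[view_name]
    return [doc.text[a.span[0]:a.span[1]] for a in view.annotations if a.kind == kind]


doc = run_engine(
    DocumentStore("IBM built state-of-the-art search in Haifa."),
    {
        "whitespace": [regex_tokenizer(r"\S+"), capitalized_annotator],
        "word": [regex_tokenizer(r"\w+"), capitalized_annotator],
    },
)
print(covered_text(doc, "whitespace", "token"))
print(covered_text(doc, "word", "token"))
```

Both views share the same underlying text and store only character offsets, so annotations from different tokenizations can be compared or searched without duplicating the document.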