The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Dec. 07, 1999

Filed:
Nov. 14, 1997

Applicant:

Inventors:
James V Mahoney, Los Angeles, CA (US);
Jeanette L Blomberg, Portola Valley, CA (US);
Randall H Trigg, Palo Alto, CA (US);
Christian K Shin, Fairport, NY (US);

Assignee:
Xerox Corporation, Stamford, CT (US);

Attorney:

Primary Examiner:

Int. Cl.
CPC ...
G06K / ;

U.S. Cl.
CPC ...
382/305; 382/180;

Abstract

A document search system provides a user with a programming interface for dynamically specifying features of documents recorded in a corpus of documents. The programming interface operates at a high level that is suitable for interactive user specification of layout components and structures of documents. In operation, the document search system analyzes a bitmap image of a document to identify layout objects such as text blocks or graphics. Subsequently, the document search system computes a set of attributes for each of the identified layout objects. The computed attributes are used to describe the layout structure of a page image of a document in terms of the spatial relations that layout objects have to frames of reference defined by other layout objects. After the attributes for each layout object have been computed, a user can operate the programming interface to define unique document features. Each document feature is a routine defined by a sequence of selection operations that consume a first set of layout objects and produce a second set of layout objects. The second set of layout objects constitutes the feature in a page image of a document. Using the programming interface, a user flexibly defines a genre of document from the user-specified document features.
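
To make the idea of a feature as "a sequence of selection operations" more concrete, the following is a minimal illustrative sketch in Python. It is not the patented implementation: the names `LayoutObject`, `of_kind`, `in_top_fraction`, `feature`, and `matches_genre`, the attribute set, and the sample page are all hypothetical. As a simplification, spatial relations are measured only against the page itself, whereas the abstract describes frames of reference defined by other layout objects.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class LayoutObject:
    """Hypothetical layout object with a few precomputed attributes."""
    kind: str      # e.g. "text" or "graphics"
    x: float       # left edge, in page units
    y: float       # top edge, in page units (0 is the top of the page)
    width: float
    height: float


# A selection operation consumes one set of layout objects and produces another.
Selection = Callable[[List[LayoutObject]], List[LayoutObject]]


def of_kind(kind: str) -> Selection:
    """Keep only objects of the given kind."""
    return lambda objs: [o for o in objs if o.kind == kind]


def in_top_fraction(fraction: float, page_height: float) -> Selection:
    """Keep only objects lying entirely within the top fraction of the page."""
    limit = fraction * page_height
    return lambda objs: [o for o in objs if o.y + o.height <= limit]


def feature(*steps: Selection) -> Selection:
    """Chain selection operations into a single feature routine."""
    def run(objs: List[LayoutObject]) -> List[LayoutObject]:
        for step in steps:
            objs = step(objs)
        return objs
    return run


def matches_genre(objs: List[LayoutObject], features: Sequence[Selection]) -> bool:
    """Assumed rule: a page fits a genre if every feature selects at least one object."""
    return all(f(objs) for f in features)


# Hypothetical page: a header line, a body block, and a small logo near the top.
PAGE_HEIGHT = 1000.0
page = [
    LayoutObject("text", 100, 50, 400, 40),
    LayoutObject("text", 100, 200, 400, 500),
    LayoutObject("graphics", 450, 60, 80, 30),
]

letterhead = feature(of_kind("graphics"), in_top_fraction(0.2, PAGE_HEIGHT))
header_text = feature(of_kind("text"), in_top_fraction(0.2, PAGE_HEIGHT))

print(letterhead(page))
print(matches_genre(page, [letterhead, header_text]))
```

In this sketch each feature returns the layout objects that make up the feature on the page, and a genre is simply a list of features that must all be present. Other matching rules are equally plausible; the abstract does not specify one.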

