The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.

Date of Patent:
Mar. 16, 1999

Filed:

Jun. 02, 1995
Applicant:
Inventors:

Gary E Kopec, Belmont, CA (US);

Philip A Chou, Menlo Park, CA (US);

Leslie T Niles, Palo Alto, CA (US);

Assignee:

Xerox Corporation, Stamford, CT (US);

Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06K / ; G06K / ;
U.S. Cl.
CPC ...
382310 ; 382155 ; 382159 ; 382161 ; 382215 ; 382226 ; 382230 ;
Abstract

A method and system for automatically modifying an original transcription produced as the output of a recognition operation produces a second, modified transcription, such as, for example, automatically correcting an errorful transcription produced by an OCR operation. The invention uses information in an input text image of character images and in an original transcription associated with the input text image to modify aspects of a formal image source model that models as a grammar the spatial image structure of a set of text images. A recognition operation is then performed on the input text image using the modified formal image source model to produce a second, modified transcription. When the original transcription is errorful, the second transcription is a corrected transcription. Several aspects of the formal image source model may be modified; in particular, character templates to be used in the recognition operation are trained in the font of the glyphs occurring in the input text image. When errors in the original transcription are caused by matching glyphs against templates that are inadequately specified for the given input text image, the subsequently performed recognition operation on the text image using the trained, font-specific character templates produces a more accurate transcription.


Find Patent Forward Citations

Loading…