
The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat PDF format).

Patent No.:

US 5956419 B1

Date of Patent:

Sep. 21, 1999

Filed:

Apr. 28, 1995

Unsupervised training of character templates using unsegmented samples

Applicant:

Inventors:

Gary E Kopec, Belmont, CA (US);

Philip Andrew Chou, Menlo Park, CA (US);

Assignee:

Xerox Corporation, Stamford, CT (US);

Attorney:

Primary Examiner:

Jon Chang

Assistant Examiner:

Jayanti K Patel

Int. Cl.

CPC ...

G06K / ;

U.S. Cl.

CPC ...

382/159; 345/467; 382/112; 382/187; 382/229

Abstract

A method for operating a machine to perform unsupervised training of a set of character templates uses as the source of training samples an image source of character images, called glyphs, that need not be manually or automatically segmented or isolated prior to training. A recognition operation performed on the image source of character images produces a labeled glyph position data structure that includes, for each glyph in the image source, a glyph image position in the image source associating an estimated image location of the glyph in the image source with a character label paired with the glyph image position that indicates the character in the character set being trained. The labeled glyph position data and the image source are then used to determine sample image regions in the image source; each sample image region is large enough to contain at least a single glyph but need not be restricted in size to only contain a single glyph. The template construction process using unsegmented samples is mathematically modeled as an optimization problem that optimizes a function that represents the set of character templates being trained as an ideal image to be reconstructed to match the input image. The method produces all of the character templates substantially contemporaneously by using a novel pixel scoring technique that implements an approximation of a maximum likelihood criterion subject to a constraint on the templates produced which holds that foreground pixels in adjacently positioned character images have substantially nonoverlapping foreground pixels. The character templates produced may be binary templates or arrays of probability values.
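The training loop described in the abstract can be illustrated with a small sketch. This is not the patented pixel scoring function, which the abstract does not spell out. It is a simplified greedy stand-in that assumes binary images, a fixed template canvas larger than any single glyph, and a per-pixel score equal to the number of aligned samples with a foreground pixel minus a fraction of the sample count. The names `train_templates`, `canvas`, and `threshold` are hypothetical. The non-overlap constraint is approximated by clearing each image pixel once a template has claimed it, so a neighboring glyph's template cannot claim it too.

```python
import numpy as np

def train_templates(image, labeled_positions, canvas, threshold=0.5):
    """Greedy sketch of template training from unsegmented samples.

    image             -- 2D binary array (1 = foreground)
    labeled_positions -- list of (label, row, col) glyph origins
    canvas            -- (height, width) of every template; may be wider
                         than one glyph, so samples overlap neighbors
    threshold         -- assumed fraction of samples that must agree
    Assumes every sample region lies fully inside the image.
    """
    h, w = canvas
    work = np.asarray(image, dtype=float).copy()
    by_label = {}
    for label, row, col in labeled_positions:
        by_label.setdefault(label, []).append((row, col))
    templates = {lab: np.zeros((h, w), dtype=np.uint8) for lab in by_label}

    while True:
        best_score, best = 0.0, None
        # All templates compete in every round, so they are built together.
        for lab in sorted(by_label):
            origins = by_label[lab]
            # Count, per canvas pixel, how many aligned samples are still "on".
            acc = sum(work[r:r + h, c:c + w] for r, c in origins)
            score = acc - threshold * len(origins)
            score[templates[lab] == 1] = -np.inf  # already assigned
            i, j = np.unravel_index(np.argmax(score), score.shape)
            if score[i, j] > best_score:
                best_score, best = score[i, j], (lab, i, j)
        if best is None:
            break  # no remaining pixel has a positive score
        lab, i, j = best
        templates[lab][i, j] = 1
        # Claimed image pixels are cleared so no other template can use them.
        for r, c in by_label[lab]:
            work[r + i, c + j] = 0.0
    return templates

# Toy image source: the string "abba" with 2-pixel-wide glyphs, trained
# with a 3-pixel-wide canvas so each sample also covers part of a neighbor.
a = np.array([[1, 1], [1, 0], [1, 1]])
b = np.array([[1, 0], [1, 1], [1, 0]])
image = np.hstack([a, b, b, a, np.zeros((3, 1), dtype=int)])
positions = [("a", 0, 0), ("b", 0, 2), ("b", 0, 4), ("a", 0, 6)]
templates = train_templates(image, positions, canvas=(3, 3))
```

In this toy case every sample of "b" is followed by a glyph whose first column is all foreground, so simple averaging would add that column to the "b" template. Clearing claimed pixels keeps the third canvas column of both templates empty.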
