The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Nov. 10, 1998
Filed:
May 30, 1996
Inventors:
William James Rucklidge, Mountain View, CA (US);
Daniel P Huttenlocher, Ithaca, NY (US);
Eric W Jaquith, San Francisco, CA (US);
Assignee:
Xerox Corporation, Stamford, CT (US);
Abstract
A method and apparatus for comparing symbols extracted from binary images of text for classifying into equivalence classes. The present invention uses a Hausdorff-like method for comparing symbols for similarity. When a symbol contained in a bitmap A is compared to a symbol contained in a bitmap B, it is determined whether or not the symbol in bitmap B fits within a tolerance into a dilated representation of the symbol in bitmap A with no excessive density of errors and whether the symbol in bitmap A fits within a tolerance into a dilated representation of the symbol in bitmap B with no excessive density of errors. If both tests are passed, an error density check is performed to determine a match. The dilated representation of the bitmap accounts for various quantization errors that may occur along the boundaries of a symbol defined in the respective bitmaps. The dilation utilized preserves the topology of the symbol. The topology preserving dilation is one where symbols are 'thickened' yet the local topology (or connectedness) of the symbol is not changed. Such a dilation is performed by applying a set of local rules to 'off' pixels that are adjacent to 'on' pixels. Quantization effects are also accounted for through the use of a non-linear error allowance. The non-linear error allowance implements the idea that small symbols provide for little or no error, whereas large symbols provide for a proportionately larger amount of error.
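The two-way comparison described in the abstract can be illustrated with a rough sketch. This is not the patented method: it assumes a plain 3x3 dilation in place of the topology-preserving dilation, and the function names, the thresholds (`small`, `per`, `max_neighbors`) and the shape of the error allowance are all hypothetical choices made for illustration. Symbols are represented as sets of (row, column) coordinates of 'on' pixels.

```python
# Illustrative sketch only. All names and thresholds are assumptions,
# and the dilation here is a plain 3x3 dilation, NOT the
# topology-preserving dilation the abstract describes.

OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def dilate(pixels):
    """Thicken a symbol by one pixel in every direction (plain dilation)."""
    return {(r + dr, c + dc) for (r, c) in pixels for (dr, dc) in OFFSETS}


def error_allowance(n_on, small=20, per=20):
    """Hypothetical non-linear allowance: no errors are tolerated for
    symbols with at most `small` on-pixels; beyond that, one error pixel
    is allowed per `per` additional on-pixels."""
    return 0 if n_on <= small else (n_on - small) // per


def error_pixels(model, candidate):
    """Pixels of `candidate` that fall outside the dilated `model`."""
    return candidate - dilate(model)


def too_dense(errs, max_neighbors=3):
    """True if any error pixel has at least `max_neighbors` other error
    pixels among its 8 neighbours (a hypothetical density test)."""
    for (r, c) in errs:
        count = sum(
            (r + dr, c + dc) in errs
            for (dr, dc) in OFFSETS
            if (dr, dc) != (0, 0)
        )
        if count >= max_neighbors:
            return True
    return False


def fits(model, candidate):
    """One direction of the test: does `candidate` fit, within the
    allowance and without clustered errors, into the dilated `model`?"""
    errs = error_pixels(model, candidate)
    return len(errs) <= error_allowance(len(candidate)) and not too_dense(errs)


def symbols_match(a, b):
    """Both directions must pass for the symbols to be treated as equivalent."""
    return fits(a, b) and fits(b, a)


# Usage: a vertical bar, the same bar shifted by one column, and a horizontal bar.
bar = {(r, 2) for r in range(10)}
shifted = {(r, 3) for r in range(10)}
horizontal = {(5, c) for c in range(10)}

print(symbols_match(bar, shifted))      # one-pixel shift is absorbed by the dilation
print(symbols_match(bar, horizontal))   # most pixels fall outside the dilated bar
```

Representing symbols as coordinate sets keeps the sketch dependency-free. A practical implementation would more likely operate on packed bitmaps, and would need the local dilation rules from the full patent document to preserve topology.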