The patent badge is an abbreviated version of the USPTO patent document. It lists the patent number, issue date, filing date, title, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract, and it links to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Aug. 14, 2018
Filed:
Jan. 27, 2017
Applicant:
Xerox Corporation, Norwalk, CT (US);
Inventors:
Sainarayanan Gopalakrishnan, Chennai, IN;
Rajasekar Kanagasabai, Chennai, IN;
Sudhagar Subbaian, Coimbatore, IN;
Assignee:
XEROX CORPORATION, Norwalk, CT (US);
Abstract
The present disclosure describes methods and systems for creating a multi-layered Optical Character Recognition (OCR) document that facilitates selection of desired text. The method includes receiving a scanned image corresponding to a document that includes text information. A binary image is generated from the scanned image. Then, a morphological dilation operation is performed, using a horizontal structuring element and a vertical structuring element, to create one or more text groups. Thereafter, an OCR operation is applied to each text group to generate a corresponding OCR layer. The one or more OCR layers are then combined as part of creating the multi-layered OCR document. Finally, the combined OCR layers are superimposed as invisible text layers over the scanned image to create the multi-layered OCR document.
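
The binarization and dilation steps in the abstract can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names (`binarize`, `dilate`, `text_groups`), the fixed threshold of 128, the structuring-element lengths, the choice to apply the horizontal element before the vertical one, and the use of 4-connected components to delimit groups are all assumptions made for illustration. The OCR and layer-superimposition steps are omitted.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Map dark pixels (assumed to be text) to 1 and light pixels to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def dilate(binary, width, height):
    """Dilate with a width x height rectangular structuring element
    centred on each foreground pixel (odd sizes are assumed)."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c]:
                for rr in range(max(0, r - height // 2), min(rows, r + height // 2 + 1)):
                    for cc in range(max(0, c - width // 2), min(cols, c + width // 2 + 1)):
                        out[rr][cc] = 1
    return out


def text_groups(gray, h_len=5, v_len=3, threshold=128):
    """Return bounding boxes (top, left, bottom, right), inclusive,
    of the text groups formed by dilating the binarized image."""
    binary = binarize(gray, threshold)
    # Assumption: horizontal element first (joins characters into lines),
    # then vertical element (joins nearby lines into blocks).
    merged = dilate(dilate(binary, h_len, 1), 1, v_len)

    rows, cols = len(merged), len(merged[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if merged[r][c] and not seen[r][c]:
                # Breadth-first search over one 4-connected component.
                top, left, bottom, right = r, c, r, c
                queue = deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and merged[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes
```

In a fuller pipeline, each returned box would be cropped from the scanned image and passed to an OCR engine to produce one layer per group. Larger structuring elements merge more distant text into the same group, so the lengths would need tuning to the scan resolution.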