The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
Apr. 04, 2017
Filed:
Sep. 03, 2014
Applicant:
Xerox Corporation, Norwalk, CT (US);
Inventors:
Assignee:
Xerox Corporation, Norwalk, CT (US);
Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06K 9/62 (2006.01); G06K 9/00 (2006.01); G06K 9/72 (2006.01);
U.S. Cl.
CPC ...
G06K 9/00449 (2013.01); G06K 9/00442 (2013.01); G06K 9/00463 (2013.01); G06K 9/00469 (2013.01); G06K 9/72 (2013.01);
Abstract
This disclosure provides an exemplary method and system for extracting structured label and value pairwise textual data from a textual document. According to an exemplary method, initially a layout analysis is performed resulting in one or more alternatives for grouping and ordering the textual elements of interest. Next, textual elements are tagged as including a label term, a value term or a label and value term. Finally, a sequence-based method is applied to the tagged elements to generate one or more sequence listings representative of the label and value pairwise data structure(s) and label:value pairwise data is extracted.
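The tag-then-pair idea in the abstract can be illustrated with a small Python sketch. It is not the patented method. The abstract does not specify the sequence-based method, so a simple greedy pairing rule is swapped in here as an assumption, and the layout-analysis step is skipped by assuming the elements already arrive in reading order. The tag names `L`, `V`, `LV`, the function `extract_pairs`, and the "label: value" convention for combined elements are all hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical tag names: the abstract only says each element is tagged as
# a label term, a value term, or a label-and-value term.
LABEL, VALUE, BOTH = "L", "V", "LV"


def extract_pairs(elements: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Greedy pairing (an assumed stand-in for the sequence-based method).

    `elements` is a list of (text, tag) tuples in reading order. Each value
    is attached to the most recent label that has not yet been paired.
    """
    pairs: List[Tuple[str, str]] = []
    pending_label: Optional[str] = None
    for text, tag in elements:
        if tag == BOTH:
            # Assumed convention: a combined element looks like "label: value".
            label, _, value = text.partition(":")
            pairs.append((label.strip(), value.strip()))
            pending_label = None
        elif tag == LABEL:
            # A new label replaces any earlier label that never got a value.
            pending_label = text.strip().rstrip(":").strip()
        elif tag == VALUE and pending_label is not None:
            pairs.append((pending_label, text.strip()))
            pending_label = None
        # A value with no pending label is skipped in this sketch.
    return pairs


if __name__ == "__main__":
    # Fields taken from the badge above, tagged by hand for illustration.
    tagged = [
        ("Patent No.:", LABEL),        # no value follows, so it is dropped
        ("Date of Patent:", LABEL),
        ("Apr. 04, 2017", VALUE),
        ("Filed: Sep. 03, 2014", BOTH),
    ]
    for label, value in extract_pairs(tagged):
        print(f"{label}: {value}")
```

A real system would presumably score several candidate groupings from the layout analysis rather than commit to a single reading order; the greedy rule is used here only to keep the example short.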