The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The linked full patent document is in Adobe Acrobat (PDF) format.

Date of Patent:
Nov. 14, 2023

Filed:

May 27, 2021

Applicant:

Tata Consultancy Services Limited, Mumbai, IN;

Inventors:

Mouli Rastogi, Gurgaon, IN;

Syed Afshan Ali, Gurgaon, IN;

Mrinal Rawat, Gurgaon, IN;

Lovekesh Vig, Gurgaon, IN;

Puneet Agarwal, Noida, IN;

Gautam Shroff, Gurgaon, IN;

Ashwin Srinivasan, Goa, IN;

Assignee:
Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06K 9/00 (2022.01); G06F 16/93 (2019.01); G06F 16/901 (2019.01); G06K 9/62 (2022.01); G06K 9/40 (2006.01); G06K 9/46 (2006.01); G06V 30/418 (2022.01); G06V 10/30 (2022.01); G06V 10/426 (2022.01); G06V 10/75 (2022.01); G06V 30/413 (2022.01); G06V 30/414 (2022.01); G06F 18/21 (2023.01); G06F 18/22 (2023.01); G06V 30/18 (2022.01);
U.S. Cl.
CPC ...
G06V 30/418 (2022.01); G06F 16/9024 (2019.01); G06F 16/93 (2019.01); G06F 18/21 (2023.01); G06F 18/22 (2023.01); G06V 10/30 (2022.01); G06V 10/426 (2022.01); G06V 10/751 (2022.01); G06V 30/18057 (2022.01); G06V 30/413 (2022.01); G06V 30/414 (2022.01);
Abstract

This disclosure relates to a method and system for extracting information from images of one or more templatized documents. A knowledge graph with a fixed schema based on background knowledge is used to capture the spatial and semantic relationships of entities present in a scanned document. An adaptive lattice-based approach based on formal concept analysis (FCA) is used to determine a similarity metric that utilizes both spatial and semantic information to determine whether the structure of the scanned document image adheres to any of the known document templates. If a known document template whose structure closely matches the structure of the scanned document is detected, then an inductive rule learning based approach is used to learn symbolic rules to extract the information present in the scanned document image. If a new document template is detected, then future scanned document images belonging to the new document template are automatically processed using the learnt rules.
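The template-matching idea in the abstract can be illustrated with a toy formal concept analysis example. This is only a sketch under stated assumptions, not the patented method: the abstract does not define the similarity metric, so a plain Jaccard overlap of attribute sets stands in for it, and the template names, attribute strings, the `threshold` value, and every function name below are hypothetical.

```python
from itertools import combinations

# Hypothetical formal context: each known template (object) maps to the set of
# spatial/semantic relation attributes observed in it. All names are invented.
CONTEXT = {
    "invoice_template_A": {"has:invoice_no", "has:total",
                           "total:below:items", "date:right_of:invoice_no"},
    "invoice_template_B": {"has:invoice_no", "has:total", "total:right_of:items"},
    "receipt_template": {"has:total", "has:store_name", "total:below:items"},
}


def common_attributes(context, objects):
    """Derivation operator: attributes shared by every object in `objects`."""
    sets = [context[o] for o in objects]
    if not sets:
        # By convention, the empty object set derives to all attributes.
        return frozenset().union(*context.values())
    return frozenset(set.intersection(*sets))


def objects_having(context, attributes):
    """Derivation operator: objects that carry every attribute in `attributes`."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Exponential in the number of objects, so only suitable for tiny examples.
    """
    names = list(context)
    concepts = set()
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            intent = common_attributes(context, subset)
            extent = objects_having(context, intent)
            concepts.add((extent, intent))
    return concepts


def jaccard(a, b):
    """Shared attributes over all attributes; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_template(context, doc_attrs, threshold=0.5):
    """Return (best matching template or None, per-template scores).

    Assumption: a document adheres to a template when the Jaccard score
    reaches `threshold`; otherwise it is treated as a new template.
    """
    scores = {name: jaccard(doc_attrs, attrs) for name, attrs in context.items()}
    name = max(scores, key=scores.get)
    return (name if scores[name] >= threshold else None), scores


if __name__ == "__main__":
    doc = {"has:invoice_no", "has:total", "total:below:items"}
    match, scores = best_template(CONTEXT, doc)
    print(match, scores)
    print(len(formal_concepts(CONTEXT)), "formal concepts in the toy lattice")
```

In this sketch, a result of `None` from `best_template` corresponds to the "new document template" branch of the abstract. The rule-learning step that follows a match is outside the scope of the example.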

