The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Dec. 31, 2024

Filed:

Feb. 22, 2022

Applicant:

Tao Automation Services Private Limited, Bangalore, IN;

Inventors:

Hariharamoorthy Theriappan, Bangalore, IN;

Amit Rajan, Ranchi, IN;

Nagaraju Pappu, Bangalore, IN;

Jawahar Bekay, Bengaluru, IN;

Assignee:
Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06F 40/284 (2020.01); G06F 40/117 (2020.01); G06N 5/022 (2023.01);
U.S. Cl.
CPC ...
G06F 40/284 (2020.01); G06F 40/117 (2020.01); G06N 5/022 (2013.01);
Abstract

Embodiments of the present disclosure provide systems and methods for extracting entities from semi-structured enterprise documents. The method performed by a server system includes receiving an enterprise document in a semi-structured format. The method includes extracting document features from the enterprise document. The document features include structural, token-specific, and entity-specific features. Further, the method includes identifying candidate entities in the enterprise document based at least on a machine learning model which uses the document features. The candidate entities include candidate tabular entities and candidate non-tabular entities. The method includes computing probability scores for the one or more tokens corresponding to the candidate non-tabular entities and the candidate tabular entities, based at least on the machine learning model. The method includes extracting structured data from the enterprise document according to the candidate non-tabular and tabular entities based at least on the probability scores.
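
The flow described in the abstract (features per token, a probability score per token, then extraction split into tabular and non-tabular entities) can be illustrated with a toy sketch. This is not the patented method: the `Token` class, the feature names, the hand-picked `WEIGHTS` and `BIAS`, and the 0.5 threshold are all hypothetical, and a simple logistic scorer stands in for whatever trained machine learning model the patent actually uses.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    line: int       # structural feature: line index in the document
    in_table: bool  # structural feature: token sits in a table-like row


def token_features(tok: Token) -> dict:
    """Toy structural and token-specific features (hypothetical)."""
    return {
        "is_numeric": float(bool(re.fullmatch(r"[\d.,]+", tok.text))),
        "has_digit": float(any(c.isdigit() for c in tok.text)),
        "in_table": float(tok.in_table),
    }


# Hand-picked weights standing in for a trained model (assumption).
WEIGHTS = {"is_numeric": 2.0, "has_digit": 1.0, "in_table": 0.5}
BIAS = -2.0


def probability(tok: Token) -> float:
    """Logistic score in (0, 1) that the token belongs to an entity."""
    z = BIAS + sum(WEIGHTS[name] * value
                   for name, value in token_features(tok).items())
    return 1.0 / (1.0 + math.exp(-z))


def extract(tokens, threshold=0.5):
    """Keep tokens scoring at or above the threshold, grouped by
    whether they are tabular or non-tabular candidates."""
    result = {"tabular": [], "non_tabular": []}
    for tok in tokens:
        p = probability(tok)
        if p >= threshold:
            key = "tabular" if tok.in_table else "non_tabular"
            result[key].append((tok.text, round(p, 3)))
    return result
```

Calling `extract` on a list of `Token` objects returns a dictionary with `"tabular"` and `"non_tabular"` lists of `(text, score)` pairs. In a real system the scorer would be learned from labeled documents, and the feature set would also include the entity-specific features the abstract mentions.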

