The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:
Jun. 17, 2025

Filed:
Jul. 19, 2022

Applicant:

OneTrust, LLC, Atlanta, GA (US);

Inventors:

Madan Avadhani, Palo Alto, CA (US);

Siddhartha Kille, Atlanta, GA (US);

Assignee:

OneTrust, LLC, Atlanta, GA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 40/284 (2020.01); G06F 18/22 (2023.01); G06F 40/295 (2020.01); G06V 30/32 (2022.01);
U.S. Cl.
CPC ...
G06F 40/284 (2020.01); G06F 18/22 (2023.01); G06F 40/295 (2020.01); G06V 30/32 (2022.01);
Abstract

Methods, systems, and non-transitory computer readable storage media are disclosed for correcting entity detection errors with entity correction and resolution in optical character recognition for digitization of physical documents. Specifically, the disclosed system utilizes named entity recognition to extract entities from character strings (e.g., words) in a digital text document. The disclosed system also tokenizes the character strings in the digital text document based on attributes of the character strings. Furthermore, the disclosed system compares the extracted entities and tokenized character strings to determine similarity metrics between the extracted entities and tokenized character strings. The disclosed system also compares extracted entities to character strings including special/numerical characters to determine similarity metrics indicating correlation probabilities between entities and character strings. The disclosed systems generate mappings between the tokens and entities based on the similarity metrics to resolve entities to likely corresponding character strings while correcting for errors during entity extraction.
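
The matching step summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`tokenize`, `similarity`, `resolve_entities`), the threshold value, and the sample text are all hypothetical, the entity list stands in for the output of a named entity recognition model, and the ratio from Python's standard `difflib.SequenceMatcher` is used as an assumed stand-in for the similarity metric.

```python
from difflib import SequenceMatcher


def tokenize(text):
    """Split text on whitespace and strip punctuation from token edges.

    Inner special characters (e.g. the hyphen in "AB-1234") are kept,
    since the abstract mentions strings with special/numerical characters.
    """
    tokens = []
    for raw in text.split():
        token = raw.strip(".,;:!?()\"'")
        if token:
            tokens.append(token)
    return tokens


def similarity(a, b):
    """Case-insensitive similarity in [0, 1] (assumed metric: difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve_entities(entities, tokens, threshold=0.6):
    """Map each entity to its most similar token, if the score clears the threshold.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    mapping = {}
    for entity in entities:
        best_token, best_score = None, 0.0
        for token in tokens:
            score = similarity(entity, token)
            if score > best_score:
                best_token, best_score = token, score
        if best_token is not None and best_score >= threshold:
            mapping[entity] = (best_token, best_score)
    return mapping


# Hypothetical OCR output containing a digit/letter confusion ("D0e").
ocr_text = "Contact Jane D0e-Smith, account AB-1234."
# Hypothetical entities, as if produced by an NER model on the same page.
entities = ["Doe-Smith", "AB1234", "Zurich"]

for entity, (token, score) in resolve_entities(entities, tokenize(ocr_text)).items():
    print(f"{entity!r} -> {token!r} (similarity {score:.2f})")
```

In this sketch an entity with no sufficiently similar token (here `"Zurich"`) is simply left unmapped; a production system would likely use a metric tuned to common OCR confusions rather than a generic string ratio.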

