The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Feb. 06, 2024

Filed:

Jul. 27, 2021
Applicant:

EMC IP Holding Company LLC, Hopkinton, MA (US);

Inventors:

Paulo Abelha Ferreira, Rio de Janeiro, BR;

Pablo Nascimento da Silva, Niterói, BR;

Rômulo Teixeira de Abreu Pinho, Niterói, BR;

Tiago Salviano Calmon, London, GB;

Vinicius Michel Gottin, Rio de Janeiro, BR;

Assignee:

EMC IP Holding Company LLC, Hopkinton, MA (US);

Attorneys:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06F 30/00 (2020.01); G06V 30/412 (2022.01); G06F 16/35 (2019.01); G06V 30/413 (2022.01); G06V 30/414 (2022.01); G06F 18/214 (2023.01);
U.S. Cl.
CPC ...
G06V 30/412 (2022.01); G06F 16/35 (2019.01); G06F 18/214 (2023.01); G06V 30/413 (2022.01); G06V 30/414 (2022.01);
Abstract

Techniques described herein relate to a method for predicting field values of documents. The method may include identifying a field prediction model generation request; obtaining training documents from a document manager; selecting a first training document; making a first determination that the first training document is a text-based document; performing text-based data extraction to identify first words and first boxes included in the first training document; identifying first keywords and first candidate words included in the first training document based on the first words and the first boxes; and generating a first annotated training document using the first keywords and the first candidate words, wherein the first annotated training document comprises color-based representation masks for the first keywords, the first candidate words, and first general words included in the first training document.
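
The annotation step in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the abstract does not say how keywords or candidate words are identified, so the keyword set, the numeric-looking candidate rule, the color assignments, and all names below (`Word`, `KEYWORDS`, `classify`, `build_mask`) are assumptions made for illustration only. Extracted words and their boxes are taken as given input.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    box: tuple  # (x0, y0, x1, y1) in pixels; x1 and y1 are exclusive


# Hypothetical rules: the abstract does not define how keywords or
# candidate words are identified, so both rules are assumptions.
KEYWORDS = {"invoice", "total", "date"}
CANDIDATE_PATTERN = re.compile(r"^[\d.,/-]*\d[\d.,/-]*$")  # numeric-looking

# Hypothetical color assignment, one RGB color per word class.
COLORS = {
    "keyword": (255, 0, 0),
    "candidate": (0, 255, 0),
    "general": (0, 0, 255),
}
BACKGROUND = (255, 255, 255)


def classify(word):
    """Label a word as 'keyword', 'candidate', or 'general'."""
    token = word.text.lower().strip(":")
    if token in KEYWORDS:
        return "keyword"
    if CANDIDATE_PATTERN.match(word.text):
        return "candidate"
    return "general"


def build_mask(words, width, height):
    """Paint each word's box with its class color on a blank page."""
    mask = [[BACKGROUND] * width for _ in range(height)]
    for word in words:
        x0, y0, x1, y1 = word.box
        color = COLORS[classify(word)]
        # Clip the box to the page so out-of-range boxes are ignored.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = color
    return mask


# Toy input standing in for the output of text-based data extraction.
words = [
    Word("Total:", (0, 0, 3, 2)),
    Word("42.50", (4, 0, 7, 2)),
    Word("thanks", (0, 3, 3, 5)),
]
mask = build_mask(words, width=8, height=6)
```

In this sketch the mask is a plain nested list of RGB tuples, so it has no dependencies. A real pipeline would more likely render the mask as an image and pair it with the source page as training input for the field prediction model.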

