The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Sep. 10, 2024

Filed:

Sep. 24, 2021

Applicant:

Salesforce, Inc., San Francisco, CA (US);

Inventors:

Mingfei Gao, Sunnyvale, CA (US);

Zeyuan Chen, Mountain View, CA (US);

Ran Xu, Mountain View, CA (US);

Assignee:

Salesforce, Inc., San Francisco, CA (US);

Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06N 20/20 (2019.01); G06N 3/084 (2023.01); G06N 5/01 (2023.01); G06N 5/04 (2023.01); G06V 30/412 (2022.01); G06V 30/413 (2022.01);
U.S. Cl.
CPC ...
G06N 20/20 (2019.01); G06N 3/084 (2013.01); G06N 5/01 (2023.01); G06N 5/04 (2013.01); G06V 30/412 (2022.01); G06V 30/413 (2022.01);
Abstract

A field extraction system that does not require field-level annotations for training is provided. Specifically, the training process is bootstrapped by mining pseudo-labels from unlabeled forms using simple rules. Then, a transformer-based structure is used to model interactions between text tokens in the input form and predict a field tag for each token accordingly. The pseudo-labels are used to supervise the transformer training. As the pseudo-labels are noisy, a refinement module that contains a sequence of branches is used to refine the pseudo-labels. Each of the refinement branches conducts field tagging and generates refined labels. At each stage, a branch is optimized by the labels ensembled from all previous branches to reduce label noise.
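
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the field names, the key phrases in FIELD_KEYS, the "value follows its key" rule, the majority-vote ensembling, and the inclusion of the initial pseudo-labels in the ensemble are all assumptions made for illustration, since the abstract does not specify them. The transformer is also left out; each refinement branch is represented by a hypothetical callable that stands in for "train on the target labels, then predict".

```python
from collections import Counter

# Hypothetical fields and key phrases; the abstract does not list the actual rules.
FIELD_KEYS = {
    "invoice_number": ("invoice no", "invoice #"),
    "total": ("total", "amount due"),
}
BACKGROUND = "O"  # tag for tokens that belong to no field


def mine_pseudo_labels(tokens):
    """Assumed rule: the token right after a matched key phrase is the field value."""
    labels = [BACKGROUND] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for field, phrases in FIELD_KEYS.items():
        for phrase in phrases:
            words = phrase.split()
            n = len(words)
            # Stop early enough that a token exists after the phrase.
            for i in range(len(tokens) - n):
                if lowered[i:i + n] == words and labels[i + n] == BACKGROUND:
                    labels[i + n] = field
    return labels


def ensemble_labels(label_sets):
    """Assumed ensembling: per-token majority vote, ties go to the earliest set."""
    ensembled = []
    for votes in zip(*label_sets):
        counts = Counter(votes)
        best = max(counts.values())
        ensembled.append(next(v for v in votes if counts[v] == best))
    return ensembled


def refine(tokens, pseudo_labels, branches):
    """Run the branches in sequence; each one is supervised by the ensemble
    of all label sets produced before it (here including the pseudo-labels)."""
    history = [pseudo_labels]
    for branch in branches:
        target = ensemble_labels(history)
        # `branch` is a stand-in for training a tagger on `target` and predicting.
        history.append(branch(tokens, target))
    return history[-1]
```

In use, mine_pseudo_labels would be applied to the tokens of each unlabeled form, and its output passed to refine together with a list of branch callables. A real system would replace each callable with a trained tagging head on top of the transformer.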

