The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The full patent document is linked in Adobe Acrobat (PDF) format.

Date of Patent:

May 19, 2020

Filed:

Nov. 23, 2016

Applicant:

Google Inc., Mountain View, CA (US);

Inventors:

Ying Sheng, Sunnyvale, CA (US);

Yifeng Lu, Mountain View, CA (US);

Jing Xie, San Jose, CA (US);

Jie Yang, Sunnyvale, CA (US);

Luis Garcia Pueyo, Mountain View, CA (US);

Jinan Lou, Cupertino, CA (US);

James Wendt, Los Angeles, CA (US);

Assignee:

GOOGLE LLC, Mountain View, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 16/00 (2019.01); G06F 16/28 (2019.01); G06N 20/00 (2019.01); G06F 16/93 (2019.01); G06Q 10/10 (2012.01); G06N 20/20 (2019.01); G06F 40/174 (2020.01); G06F 40/186 (2020.01);
U.S. Cl.
CPC ...
G06F 16/285 (2019.01); G06F 16/93 (2019.01); G06F 40/174 (2020.01); G06F 40/186 (2020.01); G06N 20/00 (2019.01); G06N 20/20 (2019.01); G06Q 10/10 (2013.01);
Abstract

Techniques are described herein for automatically generating data extraction templates for structured documents (e.g., B2C emails, invoices, bills, invitations, etc.), and for assigning classifications to those data extraction templates to streamline data extraction from subsequent structured documents. In various implementations, a data extraction template generated from a cluster of structured documents that share fixed content may be identified. Features of the cluster of structured documents may be applied as input to extraction machine learning model(s) trained to provide location(s) of transient field(s) in structured documents, to determine location(s) of transient field(s) in the cluster of structured documents. An association between the data extraction template and the determined transient field location(s) may be stored. Based on the association, data point(s) may be extracted from a given structured document of a user that shares fixed content with the cluster of structured documents. The extracted data point(s) may be surfaced to the user.
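The flow in the abstract (derive a template from a cluster, locate transient fields, store the association, extract from a later document) can be illustrated with a minimal Python sketch. This is not the patented method: the trained extraction machine learning model is replaced here by a plain token-difference heuristic, documents are whitespace-tokenized strings of equal length, and every name (`build_template`, `locate_transient_fields`, `extract`, `associations`) is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A template is a tuple of tokens; None marks a transient (variable) position.
Template = Tuple[Optional[str], ...]


def build_template(cluster: List[str]) -> Template:
    """Keep tokens identical across the cluster; mark the rest as transient."""
    tokenized = [doc.split() for doc in cluster]
    length = len(tokenized[0])
    if any(len(tokens) != length for tokens in tokenized):
        raise ValueError("this sketch assumes equal token counts within a cluster")
    return tuple(
        column[0] if len(set(column)) == 1 else None
        for column in zip(*tokenized)
    )


def locate_transient_fields(template: Template) -> List[int]:
    """Assumed stand-in for the trained model: positions of transient tokens."""
    return [i for i, token in enumerate(template) if token is None]


def extract(doc: str, template: Template, locations: List[int]) -> Optional[List[str]]:
    """Return the data points if the document shares the template's fixed content."""
    tokens = doc.split()
    if len(tokens) != len(template):
        return None
    for token, fixed in zip(tokens, template):
        if fixed is not None and token != fixed:
            return None  # fixed content differs, so the template does not apply
    return [tokens[i] for i in locations]


# Hypothetical cluster of structured documents that share fixed content.
cluster = [
    "Order 1001 ships on Monday",
    "Order 1002 ships on Friday",
]
template = build_template(cluster)

# Store the association between the template and its transient field locations.
associations: Dict[Template, List[int]] = {template: locate_transient_fields(template)}

# Extract data points from a later document that matches the same template.
new_doc = "Order 1003 ships on Tuesday"
data_points = extract(new_doc, template, associations[template])
```

In this toy setup the stored association lets a later document be processed by lookup alone; a document whose fixed content does not match the template yields `None` rather than a partial extraction.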

