The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, also known as PDF).
Patent No.:
Date of Patent:
Oct. 10, 2017
Filed:
Oct. 16, 2014
Applicant:
Google Inc., Mountain View, CA (US);
Inventors:
Marc-Allen Cartright, Stanford, CA (US);
Luis Garcia Pueyo, San Jose, CA (US);
Vanja Josifovski, Los Gatos, CA (US);
Amitabh Saikia, Mountain View, CA (US);
Jie Yang, Santa Clara, CA (US);
Mike Bendersky, Sunnyvale, CA (US);
MyLinh Yang, Saratoga, CA (US);
Assignee:
GOOGLE INC., Mountain View, CA (US);
Abstract
Methods, apparatus, systems, and computer-readable media are provided for generating and applying data extraction templates. In various implementations, a corpus of plain text communications such as emails may be grouped into clusters based on one or more similarities between the plain text communications. One or more segments of communications of a particular cluster may be classified as transient based on textual pattern matching. One or more other segments of the communications of the particular cluster may be classified as transient based on various criteria. One or more transient segments may be assigned a generic and/or specific semantic data type and/or a confidentiality designation based on various signals. A data extraction template may be generated to extract, from subsequent plain text communications, content associated with transient (and in some cases, non-confidential) segments.
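The pipeline the abstract describes (cluster similar messages, mark segments as transient, then build a template that extracts those segments) can be illustrated with a toy sketch. This is not the patented method. The function names, the regular expressions, the `min_fixed_share` threshold, and the use of "token count plus first token" as the similarity signal are all assumptions chosen for brevity. Semantic data types and confidentiality designations are left out entirely.

```python
import re
from collections import Counter, defaultdict

# Hypothetical patterns (an assumption): any token matching one is transient.
TRANSIENT_PATTERNS = [
    re.compile(r"^\d+$"),                # bare numbers such as order IDs
    re.compile(r"^\d{4}-\d{2}-\d{2}$"),  # ISO-style dates
]


def cluster_key(text):
    """Crude similarity signal (an assumption): token count and first token."""
    tokens = text.split()
    return (len(tokens), tokens[0] if tokens else "")


def group_into_clusters(corpus):
    """Group plain text messages whose cluster keys are equal."""
    clusters = defaultdict(list)
    for text in corpus:
        clusters[cluster_key(text)].append(text.split())
    return list(clusters.values())


def build_template(cluster, min_fixed_share=0.8):
    """Return one slot per token position: the fixed token, or None if transient.

    A position is transient if any token there matches a pattern, or if its
    most common token appears in fewer than min_fixed_share of the messages.
    """
    template = []
    for position in zip(*cluster):
        token, count = Counter(position).most_common(1)[0]
        matches_pattern = any(
            p.match(t) for p in TRANSIENT_PATTERNS for t in position
        )
        is_fixed = count / len(position) >= min_fixed_share and not matches_pattern
        template.append(token if is_fixed else None)
    return template


def extract(template, text):
    """Return the tokens in transient slots, or None if the text does not fit."""
    tokens = text.split()
    if len(tokens) != len(template):
        return None
    extracted = []
    for slot, token in zip(template, tokens):
        if slot is None:
            extracted.append(token)
        elif slot != token:
            return None
    return extracted


corpus = [
    "Your order 1234 ships on 2014-10-16 to Alice",
    "Your order 5678 ships on 2014-10-17 to Bob",
    "Your order 9012 ships on 2014-10-18 to Carol",
]
template = build_template(group_into_clusters(corpus)[0])
print(template)
print(extract(template, "Your order 3456 ships on 2014-11-01 to Dave"))
```

In this sketch the order number and date are caught by pattern matching, while the recipient name is caught by the frequency criterion. A real system would need a far more robust similarity measure and segment alignment than fixed token positions.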