The patent badge is an abbreviated version of the USPTO patent document and contains a link to the full patent document.

The patent badge covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The linked full patent document is in Adobe Acrobat (PDF) format.

Date of Patent:

Nov. 26, 2019

Filed:

Apr. 14, 2016

Applicant:

Xerox Corporation, Norwalk, CT (US);

Inventors:

Ioan Calapodescu, Grenoble, FR;

Nicolas Guerin, Notre-Dame-de-Mésage, FR;

Fanchon Jacques, Meylan, FR;

Assignee:

XEROX CORPORATION, Norwalk, CT (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 16/00 (2019.01); G06N 20/00 (2019.01); G06F 16/35 (2019.01); G06F 16/33 (2019.01); G06F 16/30 (2019.01); G06F 16/27 (2019.01); G06F 16/93 (2019.01); G06F 16/332 (2019.01);
U.S. Cl.
CPC ...
G06F 16/353 (2019.01); G06F 16/278 (2019.01); G06F 16/30 (2019.01); G06F 16/3325 (2019.01); G06F 16/3344 (2019.01); G06F 16/35 (2019.01); G06F 16/93 (2019.01); G06N 20/00 (2019.01);
Abstract

A method for extracting entities from a text document includes, for at least a section of a text document, providing a first set of entities extracted from the at least a section and clustering at least a subset of the extracted entities in the first set into clusters, based on locations of the entities in the document. Complete ones of the clusters of entities are identified. Patterns for extracting new entities are learned based on the complete clusters. New entities are extracted from incomplete clusters based on the learned patterns.
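
The four steps in the abstract (cluster by location, identify complete clusters, learn patterns, extract from incomplete clusters) can be illustrated with a toy sketch. This is not the patented method: the `Entity` type, the gap-based clustering rule, the `max_gap` value, the "preceding label" pattern, and the sample text are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    kind: str   # hypothetical entity type, e.g. "name" or "phone"
    text: str
    start: int  # character offset in the document

def cluster_by_location(entities, max_gap):
    """Group entities whose start offset is within max_gap of the previous one."""
    clusters = []
    for ent in sorted(entities, key=lambda e: e.start):
        if clusters and ent.start - clusters[-1][-1].start <= max_gap:
            clusters[-1].append(ent)
        else:
            clusters.append([ent])
    return clusters

def is_complete(cluster, required):
    """A cluster is complete when it holds every required entity kind."""
    return required <= {e.kind for e in cluster}

def learn_patterns(doc, complete_clusters, window=12):
    """Assumed pattern form: the token (plus spacing) right before an entity."""
    patterns = {}
    for cluster in complete_clusters:
        for e in cluster:
            m = re.search(r"\S+\s*$", doc[max(0, e.start - window):e.start])
            if m:
                patterns.setdefault(e.kind, set()).add(m.group(0))
    return patterns

def extract_missing(doc, cluster, required, patterns, max_gap):
    """Apply learned labels near an incomplete cluster to find missing kinds."""
    found = []
    for kind in sorted(required - {e.kind for e in cluster}):
        for label in sorted(patterns.get(kind, ())):
            for m in re.finditer(re.escape(label) + r"(\S+)", doc):
                pos = m.start(1)
                # Keep only candidates located close to a cluster member.
                if any(abs(pos - e.start) <= max_gap for e in cluster):
                    found.append(Entity(kind, m.group(1), pos))
    return found

# Toy document; the first extraction pass is assumed to have missed Carol's phone.
doc = (
    "Name: Alice Tel: 555-0101\n"
    "Name: Bob Tel: 555-0102\n"
    "Name: Carol Tel: 555-0103\n"
)
first_set = [
    Entity("name", "Alice", doc.index("Alice")),
    Entity("phone", "555-0101", doc.index("555-0101")),
    Entity("name", "Bob", doc.index("Bob")),
    Entity("phone", "555-0102", doc.index("555-0102")),
    Entity("name", "Carol", doc.index("Carol")),
]
required = {"name", "phone"}
MAX_GAP = 12  # tuned by hand for this toy text

clusters = cluster_by_location(first_set, MAX_GAP)
complete = [c for c in clusters if is_complete(c, required)]
incomplete = [c for c in clusters if not is_complete(c, required)]
patterns = learn_patterns(doc, complete)
new_entities = [
    e for c in incomplete
    for e in extract_missing(doc, c, required, patterns, MAX_GAP)
]
print(new_entities)
```

In this sketch the two complete clusters teach the label "Tel: " for phone numbers, which is then applied near the incomplete "Carol" cluster. A real system would likely use richer location features and pattern types than a fixed character gap and a literal prefix.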

