The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.

The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.

Date of Patent:
Jun. 06, 2017

Filed:

Sep. 30, 2014
Applicant:

Emc Corporation, Hopkinton, MA (US);

Inventors:

Raphael Cohen, Beer-Sheva, IL;

Alon Grubshtein, Lehavim, IL;

Aisling J. Crowley, Ballincollig, IE;

Peter R. Elliot, Kinsale, IE;

Assignee:

EMC IP Holding Company LLC, Hopkinton, MA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 7/00 (2006.01); G06F 17/00 (2006.01); G06F 17/30 (2006.01); G06N 7/00 (2006.01);
U.S. Cl.
CPC ...
G06F 17/30713 (2013.01); G06F 17/30011 (2013.01); G06N 7/00 (2013.01);
Abstract

An apparatus comprises a processing platform configured to implement a cluster labeling system for documents comprising unstructured text data. The cluster labeling system comprises a clustering module and a visualization module. The clustering module implements a topic model generator and is configured to assign each of the documents to one or more of a plurality of clusters based at least in part on one or more topics identified from the unstructured text data using at least one topic model provided by the topic model generator. The visualization module comprises multiple view generators configured to generate respective distinct visualizations of a selected one of the clusters. The multiple view generators include at least a bigram view generator configured to provide a visualization of a plurality of term pairs from the selected cluster, and a summarization view generator configured to provide a visualization of representative term sequences from the selected cluster.


Find Patent Forward Citations

Loading…