The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Nov. 19, 2013

Filed:
Jan. 06, 2010

Applicants:

Jeffrey M. Achtermann, Austin, TX (US);

Indrajit Bhattacharya, New Delhi, IN;

Kevin W. English, Jr., Fairfield, CT (US);

Shantanu R. Godbole, New Delhi, IN;

Sachindra Joshi, New Delhi, IN;

Ashwin Srinivasan, New Delhi, IN;

Ashish Verma, New Delhi, IN;

Inventors:

Jeffrey M. Achtermann, Austin, TX (US);

Indrajit Bhattacharya, New Delhi, IN;

Kevin W. English, Jr., Fairfield, CT (US);

Shantanu R. Godbole, New Delhi, IN;

Sachindra Joshi, New Delhi, IN;

Ashwin Srinivasan, New Delhi, IN;

Ashish Verma, New Delhi, IN;

Attorneys:

Primary Examiner:

Int. Cl.
CPC ...
G06F 17/30 (2006.01); G06F 17/27 (2006.01);

U.S. Cl.
CPC ...

Abstract

A system and associated method for cross-guided data clustering by aligning target clusters in a target domain to source clusters in a source domain. The cross-guided clustering process takes the target domain and the source domain as inputs. A common word attribute shared by both the target domain and the source domain is a pivot vocabulary, and all other words in both domains are a non-pivot vocabulary. The non-pivot vocabulary is projected onto the pivot vocabulary to improve measurement of similarity between data items. Source centroids representing clusters in the source domain are created and projected to the pivot vocabulary. Target centroids representing clusters in the target domain are initially created by a conventional clustering method and then repetitively aligned to converge with the source centroids by use of a cross-domain similarity graph that measures a respective similarity of each target centroid to each source centroid.
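
The flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented procedure: the co-occurrence projection, the cosine similarity, the `pull` weight, the seeding of target centroids from the first `k` documents (a stand-in for a conventional clustering step), and every function name and toy document below are assumptions made for illustration only.

```python
import math
from collections import Counter


def pivot_vocabulary(source_docs, target_docs):
    """Words that occur in both domains (the shared 'pivot' vocabulary)."""
    source_words = {w for doc in source_docs for w in doc}
    target_words = {w for doc in target_docs for w in doc}
    return sorted(source_words & target_words)


def cooccurrence_projection(docs, pivots):
    """ASSUMPTION: map each non-pivot word to the pivot words it co-occurs
    with in the same document, as normalized counts."""
    pivot_set = set(pivots)
    counts = {}
    for doc in docs:
        present = set(doc)
        for word in present - pivot_set:
            counts.setdefault(word, Counter()).update(present & pivot_set)
    return {word: {p: n / sum(c.values()) for p, n in c.items()}
            for word, c in counts.items()}


def project(doc, pivots, projection):
    """Represent a document as a vector over the pivot vocabulary."""
    vec = dict.fromkeys(pivots, 0.0)
    for word in doc:
        if word in vec:
            vec[word] += 1.0
        else:
            for p, weight in projection.get(word, {}).items():
                vec[p] += weight
    return [vec[p] for p in pivots]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def assign(vectors, centroids):
    """Index of the most similar centroid for each vector."""
    return [max(range(len(centroids)), key=lambda j: cosine(v, centroids[j]))
            for v in vectors]


def cross_guided_clusters(source_docs, source_labels, target_docs, k,
                          pull=0.5, iters=10):
    pivots = pivot_vocabulary(source_docs, target_docs)
    src_proj = cooccurrence_projection(source_docs, pivots)
    tgt_proj = cooccurrence_projection(target_docs, pivots)
    src_vecs = [project(d, pivots, src_proj) for d in source_docs]
    tgt_vecs = [project(d, pivots, tgt_proj) for d in target_docs]
    # One source centroid per known source cluster, in pivot space.
    src_centroids = [
        mean([v for v, lab in zip(src_vecs, source_labels) if lab == label])
        for label in sorted(set(source_labels))
    ]
    # ASSUMPTION: seed target centroids with the first k target documents.
    centroids = [list(v) for v in tgt_vecs[:k]]
    labels = assign(tgt_vecs, centroids)
    alignment = [None] * k
    for _ in range(iters):
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(tgt_vecs, labels) if lab == j]
            c = mean(members) if members else centroids[j]
            # One row of the cross-domain similarity graph:
            # this target centroid against every source centroid.
            sims = [cosine(c, s) for s in src_centroids]
            best = max(range(len(sims)), key=sims.__getitem__)
            if sims[best] > 0:
                # Pull the target centroid toward its closest source centroid.
                c = [(1 - pull) * x + pull * y
                     for x, y in zip(c, src_centroids[best])]
                alignment[j] = best
            new_centroids.append(c)
        centroids = new_centroids
        new_labels = assign(tgt_vecs, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, alignment


# Hypothetical toy data: laptop tickets (source) guide phone tickets (target).
source_docs = [["password", "reset", "login", "laptop"],
               ["password", "login", "locked"],
               ["screen", "broken", "repair", "laptop"],
               ["screen", "repair", "cracked"]]
source_labels = [0, 0, 1, 1]
target_docs = [["password", "login", "phone", "pin"],
               ["screen", "repair", "phone", "glass"],
               ["pin", "login", "forgot"],
               ["glass", "repair", "shattered"]]
labels, alignment = cross_guided_clusters(source_docs, source_labels,
                                          target_docs, k=2)
print(labels, alignment)
```

In this toy run the third target document shares only "login" with the source domain, yet its non-pivot words "pin" and "forgot" are projected onto pivot words, which is what lets it be grouped with the other login ticket. `alignment[j]` gives the index of the source cluster that target cluster `j` was drawn toward.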

