The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Jan. 17, 2017

Filed:
Dec. 14, 2011

Applicants:

Jiewen Huang, New Haven, CT (US);

Zhimin Chen, Seattle, WA (US);

Arvind Arasu, Bothell, WA (US);

Vivek Narasayya, Redmond, WA (US);

Inventors:

Jiewen Huang, New Haven, CT (US);

Zhimin Chen, Seattle, WA (US);

Arvind Arasu, Bothell, WA (US);

Vivek Narasayya, Redmond, WA (US);

Assignee:
Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G06F 17/30 (2006.01);
U.S. Cl.
CPC ...
G06F 17/30867 (2013.01);
Abstract

A set expansion system is described herein that improves precision, recall, and performance of prior set expansion methods for large sets of data. The system maintains high precision and recall by 1) identifying the quality of particular lists and applying that quality through a weight, 2) allowing for the specification of negative examples in a set of seeds to reduce the introduction of bad entities into the set, and 3) applying a cutoff to eliminate lists that include a low number of positive matches. The system may perform multiple passes to first generate a good candidate result set and then refine the set to find a set with the highest quality. The system may also apply MapReduce or other distributed processing techniques to allow calculation in parallel. Thus, the system efficiently expands large concept sets from a potentially small set of initial seeds using readily available web data.
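
The three ideas named in the abstract (list quality weights, negative seeds, and a positive-match cutoff) can be illustrated with a minimal single-pass sketch in Python. This is not the patented method: the function name `expand_set`, its parameters, and the weight formula (positive hits minus negative hits, divided by list size) are assumptions chosen only for illustration, and the multi-pass refinement and distributed processing mentioned in the abstract are left out.

```python
from collections import defaultdict


def expand_set(lists, positive_seeds, negative_seeds=(), min_positive=2, top_k=5):
    """Rank candidate entities by the summed weight of the lists they appear in.

    Hypothetical sketch: the weight formula below is an assumption,
    not the scoring used in the patent.
    """
    pos, neg = set(positive_seeds), set(negative_seeds)
    scores = defaultdict(float)
    for items in lists:
        members = set(items)
        pos_hits = len(members & pos)
        neg_hits = len(members & neg)
        # Cutoff: ignore lists that contain too few positive seeds.
        if pos_hits < min_positive:
            continue
        # Assumed quality weight: negative seeds reduce a list's influence.
        weight = max(pos_hits - neg_hits, 0) / len(members)
        if weight == 0:
            continue
        # Every non-seed member of a surviving list is a candidate.
        for entity in members - pos - neg:
            scores[entity] += weight
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [entity for entity, _ in ranked[:top_k]]


web_lists = [
    ["red", "green", "blue", "yellow"],
    ["red", "green", "blue", "apple"],
    ["red", "apple", "banana"],
    ["red", "green", "orange", "apple"],
]
result = expand_set(web_lists, ["red", "green"], negative_seeds=["apple"])
print(result)
```

In this toy data the third list is dropped by the cutoff because it matches only one positive seed, so "banana" never becomes a candidate, while lists that contain the negative seed "apple" contribute with a reduced weight.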

