The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Jul. 30, 2019

Filed:

Jul. 21, 2016

Applicant:

Spotify AB, Stockholm, SE;

Inventors:

Kurt Jacobson, Stoneham, MA (US);

Daniel E. Stowell, Cambridge, MA (US);

Brian Whitman, Brooklyn, NY (US);

Athena Y. Koumis, Somerville, MA (US);

Jason H. Steinbach, Somerville, MA (US);

Assignee:

SPOTIFY AB, Stockholm, SE;

Attorney:
Primary Examiner:
Assistant Examiner:
Int. Cl.
CPC ...
G06N 5/04 (2006.01); H04L 29/06 (2006.01); G06F 16/35 (2019.01); G06F 16/40 (2019.01); G06F 16/48 (2019.01); G06F 16/28 (2019.01); G06F 16/951 (2019.01); G06F 16/958 (2019.01); G06F 17/27 (2006.01); G06N 7/00 (2006.01); H04L 29/08 (2006.01);
U.S. Cl.
CPC ...
G06N 5/04 (2013.01); G06F 16/285 (2019.01); G06F 16/35 (2019.01); G06F 16/40 (2019.01); G06F 16/48 (2019.01); G06F 16/489 (2019.01); G06F 16/951 (2019.01); G06F 16/986 (2019.01); G06F 17/272 (2013.01); G06F 17/277 (2013.01); G06F 17/278 (2013.01); G06N 7/005 (2013.01); H04L 67/02 (2013.01); H04L 67/06 (2013.01); H04L 67/42 (2013.01);
Abstract

Methods, systems and computer program products for clustering pages into headline clusters are provided by collecting web data, identifying pages from the web data, tokenizing unique words in each page, recognizing unique entities in each page, detecting media links in each page, and constructing a plurality of vector representations of each page. A first dimension of each vector representation includes the unique words tokenized in each page, a second dimension of each vector representation includes the unique entities recognized in each page, and a third dimension of each vector representation includes the media links detected in each page. The vector representations are, in turn, clustered.
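
The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not name a tokenizer, an entity recognizer, a similarity measure, or a clustering algorithm, so everything below is an assumption made for illustration: the regular expressions, the capitalized-phrase heuristic standing in for entity recognition, the weighted Jaccard similarity, the greedy threshold clustering, and all names (`PageVector`, `vectorize`, `similarity`, `cluster`).

```python
import re
from dataclasses import dataclass

# Hypothetical patterns; the abstract does not specify any of these.
MEDIA_LINK = re.compile(r"https?://\S+?\.(?:mp3|mp4|wav|ogg)\b", re.IGNORECASE)
URL = re.compile(r"https?://\S+")
WORD = re.compile(r"[A-Za-z']+")
# Crude stand-in for entity recognition: two or more capitalized words in a row.
ENTITY = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


@dataclass(frozen=True)
class PageVector:
    """One page, represented along the three dimensions named in the abstract."""
    words: frozenset        # unique tokenized words
    entities: frozenset     # unique recognized entities
    media_links: frozenset  # detected media links


def vectorize(page: str) -> PageVector:
    """Build the three-part representation of a single page."""
    media = frozenset(m.group(0).lower() for m in MEDIA_LINK.finditer(page))
    text = URL.sub(" ", page)  # keep URL fragments out of the word set
    entities = frozenset(ENTITY.findall(text))
    words = frozenset(w.lower() for w in WORD.findall(text))
    return PageVector(words, entities, media)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Set overlap in [0, 1]; two empty sets count as no evidence (0.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(p: PageVector, q: PageVector, weights=(0.4, 0.3, 0.3)) -> float:
    """Weighted sum of per-dimension Jaccard scores (assumed weights)."""
    return (weights[0] * jaccard(p.words, q.words)
            + weights[1] * jaccard(p.entities, q.entities)
            + weights[2] * jaccard(p.media_links, q.media_links))


def cluster(pages, threshold=0.3):
    """Greedy clustering: join the first cluster holding a similar enough page."""
    vectors = [vectorize(p) for p in pages]
    clusters = []  # each cluster is a list of page indices
    for i, v in enumerate(vectors):
        for members in clusters:
            if any(similarity(v, vectors[j]) >= threshold for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


if __name__ == "__main__":
    pages = [
        "Taylor Swift releases new single http://example.com/a.mp3",
        "New single released by Taylor Swift today http://example.com/a.mp3",
        "Local bakery wins pie contest",
    ]
    print(cluster(pages))
```

In this toy input, the first two pages share words, an entity, and a media link, so they fall into one cluster, while the third page forms its own. A production system would likely replace the heuristics with a trained entity recognizer and a standard clustering method.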

