The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Feb. 24, 2009

Filed:
Sep. 30, 2005

Applicants:

Srinivasan Balasubramanian, San Jose, CA (US);

Michael Ching, San Jose, CA (US);

Piyoosh Jalan, San Jose, CA (US);

Satish C. Penmetsa, Fremont, CA (US);

Andrew S. Tomkins, San Jose, CA (US);

Inventors:

Srinivasan Balasubramanian, San Jose, CA (US);

Michael Ching, San Jose, CA (US);

Piyoosh Jalan, San Jose, CA (US);

Satish C. Penmetsa, Fremont, CA (US);

Andrew S. Tomkins, San Jose, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 13/30 (2006.01);
U.S. Cl.
CPC ...
Abstract

A system and method of crawling at least one website comprising at least one URL includes maintaining a lookup structure comprising all of the URLs known to be on a website; calculating a hub score for each webpage of the website to be recrawled, wherein the hub score measures how likely the to-be-recrawled webpage is to include links to fresh content published on the website; sorting all the to-be-recrawled pages by their hub scores; and crawling the to-be-recrawled pages in order from highest hub score to lowest hub score. The calculating comprises computing a first value equaling a percentage of a number of new relative URLs on the to-be-recrawled page; computing a second value equaling a percentage of a previous hub score of the to-be-recrawled page; and computing the hub score as a sum of the first and the second values.
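
The scoring and ordering steps described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The names Page, hub_score, and recrawl_order are hypothetical, and so are the two 0.5 weights, since the abstract says only "a percentage" and gives no figures. The sketch also assumes that a "new" URL is a link found on the page that is absent from the lookup structure of known URLs.

```python
from dataclasses import dataclass

# Hypothetical weights: the abstract only says "a percentage" of each quantity.
NEW_URL_WEIGHT = 0.5
PREVIOUS_SCORE_WEIGHT = 0.5


@dataclass
class Page:
    url: str
    links: list                      # relative URLs seen on the page
    previous_hub_score: float = 0.0  # score from the prior crawl cycle


def hub_score(page, known_urls):
    """Sum of a share of the new-URL count and a share of the previous score."""
    # Assumption: a URL is "new" if it is not yet in the lookup structure.
    new_urls = {link for link in page.links if link not in known_urls}
    first_value = NEW_URL_WEIGHT * len(new_urls)
    second_value = PREVIOUS_SCORE_WEIGHT * page.previous_hub_score
    return first_value + second_value


def recrawl_order(pages, known_urls):
    """Return pages sorted from highest hub score to lowest."""
    return sorted(pages, key=lambda p: hub_score(p, known_urls), reverse=True)


# Lookup structure of URLs already known to be on the site.
known = {"/a", "/b"}
pages = [
    Page("/index", ["/a", "/c", "/d"], previous_hub_score=2.0),
    Page("/about", ["/a"], previous_hub_score=1.0),
    Page("/news", ["/e", "/f", "/g"], previous_hub_score=4.0),
]
ordered_urls = [p.url for p in recrawl_order(pages, known)]
```

Carrying a share of the previous score forward means a page that has often linked to fresh content keeps some priority even in a cycle where it shows no new links.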

