The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Jan. 11, 2022

Filed:
Aug. 07, 2019

Applicant:
International Business Machines Corporation, Armonk, NY (US);

Inventors:
Chih-Hsiung Liu, Taipei, TW;
Peter Wu, New Taipei, TW;
Tzu-Chen Chao, Taipei, TW;
I-Chien Lin, Taipei, TW;

Attorney:

Primary Examiner:

Int. Cl.
CPC ...
G06F 16/95 (2019.01); G06F 16/951 (2019.01); G06F 9/455 (2018.01); G06F 16/955 (2019.01);

U.S. Cl.
CPC ...
G06F 16/951 (2019.01); G06F 9/45558 (2013.01); G06F 16/955 (2019.01); G06F 2009/45595 (2013.01);

Abstract

Systems, methods, and computer program products for implementing a web crawler platform comprising one or more containerized web crawler programs working in tandem to synergistically index web resources and reduce redundancy experienced by multiple web crawlers independently indexing overlapping web resources. The platform provides a URL namespace, allowing crawlers to register with the platform and create URL endpoints through which other crawlers can discover existing crawlers registered to the platform and identify web resources previously indexed. The platform provides crawler-to-crawler communication and exchanges of data and metadata obtained from web resources that have been previously indexed, allowing crawlers to share existing data or metadata without having to directly crawl through the web resource. As web crawlers move between data centers of different geolocations, each crawler's registered URL is mapped to subsequent IP addresses, allowing for transparency and continuous identification by other crawlers registered with the platform.
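
The registration and lookup flow described in the abstract can be illustrated with a minimal in-memory sketch. This is not the patented implementation: the class and method names (`CrawlerRegistry`, `register`, `find_indexer`, `relocate`, and so on), the `https://crawlers.example` namespace, and the IP addresses are all hypothetical, chosen only to show how a stable registered URL might stay fixed while the IP address behind it changes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CrawlerRecord:
    """Hypothetical record for one registered crawler."""
    url: str   # stable URL endpoint inside the platform namespace
    ip: str    # current IP address; may change when the crawler moves
    indexed: Dict[str, dict] = field(default_factory=dict)  # resource URL -> metadata


class CrawlerRegistry:
    """Illustrative in-memory stand-in for the platform's URL namespace."""

    def __init__(self, namespace: str) -> None:
        self.namespace = namespace.rstrip("/")
        self._crawlers: Dict[str, CrawlerRecord] = {}

    def register(self, name: str, ip: str) -> str:
        """Register a crawler and return its URL endpoint."""
        url = f"{self.namespace}/{name}"
        self._crawlers[url] = CrawlerRecord(url=url, ip=ip)
        return url

    def discover(self) -> List[str]:
        """List the URL endpoints of all registered crawlers."""
        return sorted(self._crawlers)

    def record_index(self, crawler_url: str, resource: str, metadata: dict) -> None:
        """Note that a crawler has indexed a web resource."""
        self._crawlers[crawler_url].indexed[resource] = metadata

    def find_indexer(self, resource: str) -> Optional[str]:
        """Return the URL of a crawler that already indexed the resource, if any."""
        for url in sorted(self._crawlers):
            if resource in self._crawlers[url].indexed:
                return url
        return None

    def fetch_shared(self, resource: str) -> Optional[dict]:
        """Reuse metadata from an earlier crawl instead of crawling again."""
        owner = self.find_indexer(resource)
        if owner is None:
            return None
        return self._crawlers[owner].indexed[resource]

    def relocate(self, crawler_url: str, new_ip: str) -> None:
        """Remap a registered URL to a new IP after the crawler moves."""
        self._crawlers[crawler_url].ip = new_ip

    def resolve(self, crawler_url: str) -> str:
        """Return the IP address currently mapped to a registered URL."""
        return self._crawlers[crawler_url].ip


if __name__ == "__main__":
    registry = CrawlerRegistry("https://crawlers.example")
    a = registry.register("crawler-a", "10.0.0.5")
    registry.record_index(a, "https://example.org/page", {"title": "Example"})
    registry.relocate(a, "192.0.2.44")  # crawler moved to another data center
    print(registry.find_indexer("https://example.org/page"), registry.resolve(a))
```

In this sketch, other crawlers only ever refer to the registered URL, so a relocation changes the result of `resolve` without affecting `find_indexer` or `fetch_shared`. A real platform would need persistent storage and network transport, which are left out here.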

