The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Oct. 28, 2025

Filed:

Apr. 26, 2024

Applicant:

Dell Products L.P., Round Rock, TX (US);

Inventors:

Brennan Troy Robert Seal, Austin, TX (US);

Chris Everett Peterson, Austin, TX (US);

Nicholas Anthony Esposito, Round Rock, TX (US);

Rachel Gabrielle Mazzini, Dallas, TX (US);

Sandeep Bola Ratnakar, Bangalore, IN;

Siddharth Sreekumar, Bangalore, IN;

Assignee:

Dell Products L.P., Round Rock, TX (US);

Attorney:

Primary Examiner:

Int. Cl.
CPC ...
G06F 16/951 (2019.01); G06F 16/9538 (2019.01); G06F 16/953 (2019.01); G06F 16/9532 (2019.01); G06F 40/205 (2020.01);

U.S. Cl.
CPC ...
G06F 16/951 (2019.01); G06F 16/953 (2019.01); G06F 16/9532 (2019.01); G06F 16/9538 (2019.01); G06F 40/205 (2020.01);

Abstract

Crawling electronic documents, including: for each electronic document of a plurality of electronic documents: obtaining the electronic document including obtaining an entirety of the HyperText Markup Language (HTML) of the electronic document; analyzing the electronic document to identify a plurality of elements of the electronic document, each element of the plurality of elements including HTML tags, text associated with the HTML tags, and HTML attributes; creating a plurality of clusters of texts based on a similarity of the HTML tags, the text associated with the HTML tags, and the HTML attributes of each of the plurality of elements; labeling, for each cluster of the plurality of clusters, the cluster based on the text associated with the HTML tags of one element of the cluster; and updating, for each cluster of the plurality of clusters, an electronic document crawling model with data indicating the label of the cluster.
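
The steps in the abstract (obtain the HTML, identify elements, cluster them by similarity, label each cluster, update a crawling model) can be illustrated with a minimal sketch. This is not the patented implementation. The abstract does not say how similarity is measured or what the model looks like, so the sketch assumes Jaccard similarity over tag, attribute, and word tokens, a greedy single-pass clustering with a hypothetical `threshold` parameter, and a plain dictionary standing in for the crawling model. All names (`ElementCollector`, `features`, `cluster_elements`, `crawl_document`) are illustrative.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; skipped to keep the stack balanced.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ElementCollector(HTMLParser):
    """Collects the tag, attributes, and direct text of each element."""

    def __init__(self):
        super().__init__()
        self.stack = []     # currently open elements, innermost last
        self.elements = []  # closed elements that carried some text

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append({"tag": tag, "attrs": dict(attrs), "text": ""})

    def handle_data(self, data):
        if self.stack:
            self.stack[-1]["text"] += data

    def handle_endtag(self, tag):
        # Simplification: mismatched closing tags are ignored.
        if self.stack and self.stack[-1]["tag"] == tag:
            element = self.stack.pop()
            element["text"] = element["text"].strip()
            if element["text"]:
                self.elements.append(element)


def features(element):
    """Token set built from the element's tag, attributes, and text."""
    tokens = {"tag:" + element["tag"]}
    for name, value in element["attrs"].items():
        tokens.add(f"attr:{name}={value}")
    tokens.update("word:" + word.lower() for word in element["text"].split())
    return tokens


def jaccard(a, b):
    """Assumed similarity measure: |a & b| / |a | b|."""
    return len(a & b) / len(a | b)


def cluster_elements(elements, threshold=0.5):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []
    for element in elements:
        for cluster in clusters:
            if jaccard(features(element), features(cluster[0])) >= threshold:
                cluster.append(element)
                break
        else:
            clusters.append([element])
    return clusters


def crawl_document(html, model, threshold=0.5):
    """Label each cluster by the text of one element and record it in the model."""
    collector = ElementCollector()
    collector.feed(html)
    collector.close()
    for cluster in cluster_elements(collector.elements, threshold):
        label = cluster[0]["text"]  # text of one element of the cluster
        model.setdefault(label, []).extend(e["text"] for e in cluster)
    return model


sample_html = (
    '<ul><li class="price">Price 10</li><li class="price">Price 20</li></ul>'
    '<p id="about">About us</p>'
)
model = crawl_document(sample_html, {})
```

With the sample above, the two `li` elements share a tag, a `class` attribute, and the word "price", so they fall into one cluster labeled "Price 10", while the `p` element forms its own cluster labeled "About us". A real crawler would fetch each document over the network and likely use a more robust parser and clustering method; those choices are outside what the abstract states.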

