The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent: Sep. 28, 2021
Filed: Sep. 30, 2020
Fujitsu Limited, Kawasaki, JP;
FUJITSU LIMITED, Kawasaki, JP;
Abstract
In an embodiment, operations include crawling a set of web pages and labeling one or more items of a first web page based on user input. Each item corresponds to a node in a first tree data structure of the first web page. The operations further include generating a first extraction rule to extract a first item from the one or more first items. The first extraction rule includes a first path, in the first tree data structure, for a first node of the first item, and includes first visual information of each node in the first path. The operations further include comparing the first visual information in the first path with second visual information of each of a plurality of candidate nodes in a second tree data structure of a second web page and further refining the first extraction rule to generate a second extraction rule.
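The flow described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the names `Node`, `ExtractionRule`, `build_rule`, `score`, and `refine` are hypothetical, the visual information is reduced to a dictionary of style attributes, and the refinement strategy (keep only the child indices and visual attributes on which both pages agree) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple

Visual = Dict[str, str]  # assumed stand-in for "visual information" (style attributes)


@dataclass
class Node:
    tag: str
    visual: Visual = field(default_factory=dict)
    text: str = ""
    children: List["Node"] = field(default_factory=list)


@dataclass
class ExtractionRule:
    # One (tag, child index) step per node on the path; None means "any index".
    path: List[Tuple[str, Optional[int]]]
    # Visual information of each node on the path, in the same order.
    visuals: List[Visual]


def nodes_on_path(root: Node, indices: List[int]) -> List[Node]:
    """Follow child indices down from the root; return the visited nodes (root excluded)."""
    nodes, current = [], root
    for i in indices:
        current = current.children[i]
        nodes.append(current)
    return nodes


def build_rule(root: Node, indices: List[int]) -> ExtractionRule:
    """First rule: the path to the labeled node plus the visual info along it."""
    nodes = nodes_on_path(root, indices)
    return ExtractionRule(
        path=[(n.tag, i) for n, i in zip(nodes, indices)],
        visuals=[dict(n.visual) for n in nodes],
    )


def all_paths(root: Node, prefix: Tuple[int, ...] = ()) -> Iterator[List[int]]:
    """Yield the child-index path of every descendant (the candidate nodes)."""
    for i, child in enumerate(root.children):
        yield list(prefix + (i,))
        yield from all_paths(child, prefix + (i,))


def similarity(a: Visual, b: Visual) -> float:
    """Fraction of attributes (over the union of keys) with equal values."""
    keys = set(a) | set(b)
    return 1.0 if not keys else sum(a.get(k) == b.get(k) for k in keys) / len(keys)


def score(rule: ExtractionRule, root: Node, indices: List[int]) -> float:
    """Mean visual similarity along the path; 0.0 if depth or tags differ."""
    nodes = nodes_on_path(root, indices)
    if len(nodes) != len(rule.path):
        return 0.0
    if any(n.tag != tag for n, (tag, _) in zip(nodes, rule.path)):
        return 0.0
    return sum(similarity(v, n.visual) for v, n in zip(rule.visuals, nodes)) / len(nodes)


def refine(rule: ExtractionRule, second_root: Node) -> Optional[Tuple[ExtractionRule, Node]]:
    """Pick the best candidate on the second page and generalize the rule to fit both."""
    best = max(all_paths(second_root), key=lambda p: score(rule, second_root, p), default=None)
    if best is None or score(rule, second_root, best) == 0.0:
        return None
    nodes = nodes_on_path(second_root, best)
    path = [(tag, i if i == j else None) for (tag, i), j in zip(rule.path, best)]
    visuals = [{k: v for k, v in old.items() if n.visual.get(k) == v}
               for old, n in zip(rule.visuals, nodes)]
    return ExtractionRule(path, visuals), nodes[-1]


# Two toy pages: the price is labeled on page1 and must be located on page2.
page1 = Node("html", children=[
    Node("div"),
    Node("div", {"width": "wide"}, children=[
        Node("span", {"weight": "bold", "size": "20px"}, "Widget"),
        Node("span", {"color": "red", "size": "14px"}, "$20"),
    ]),
])
page2 = Node("html", children=[
    Node("div"),
    Node("div"),
    Node("div", {"width": "wide"}, children=[
        Node("span", {"color": "red", "size": "16px"}, "$25"),
        Node("span", {"weight": "bold", "size": "20px"}, "Gadget"),
    ]),
])

first_rule = build_rule(page1, [1, 1])   # user labeled the second span as the item
refined, match = refine(first_rule, page2)
print(match.text, refined)
```

In this toy setup the price sits at a different position on the second page, so a purely path-based rule would miss it. Scoring candidates by visual similarity along the path is what lets the sketch recover the item, and the refined rule then drops the position and font-size constraints that did not hold on both pages.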