The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:

Sep. 09, 2025

Filed:

Apr. 30, 2024

Applicant:

Oracle International Corporation, Redwood Shores, CA (US);

Inventors:

Sagar Gollamudi, San Diego, CA (US);

Vishank Bhatia, Sunnyvale, CA (US);

Xu Zhong, Vermont South, AU;

Thanh Long Duong, Point Cook, AU;

Mark Johnson, Castle Cove, AU;

Srinivasa Phani Kumar Gadde, Fremont, CA (US);

Vishal Vishnoi, Redwood City, CA (US);

Assignee:

Oracle International Corporation, Redwood Shores, CA (US);

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G06F 40/103 (2020.01);
U.S. Cl.
CPC ...
G06F 40/103 (2020.01);
Abstract

A data corpus is partitioned into text strings for header classification. A group characteristic is computed for a text string, and whether the group characteristic satisfies a group characteristic criterion is determined. The text string may be disqualified from header classification if the group characteristic criterion is not satisfied, or one or more font characteristics may be determined for the text string if the group characteristic criterion is satisfied. A font characteristic that meets one or more prevalence criteria may be identified and evaluated to determine whether the font characteristic meets at least one font characteristic criterion. The text string may be disqualified from header classification if the font characteristic criterion is not satisfied, or if the font characteristic meets the font characteristic criterion, the text string is classified as a header, and tagged content is generated by applying a header tag to the text string.
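The flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented method: the abstract does not say what the group characteristic, the prevalence criteria, or the font characteristic criterion actually are. The sketch assumes word count as the group characteristic, the font size covering most of a string's characters as the prevalent font characteristic, and "larger than the corpus body font size" as the font characteristic criterion. The names `Span`, `TextString`, `tag_headers`, the thresholds, and the `<h1>` tag are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """A run of text rendered in a single font size (hypothetical structure)."""
    text: str
    font_size: float


@dataclass
class TextString:
    """One text string from the partitioned corpus, made of one or more spans."""
    spans: List[Span]

    @property
    def text(self) -> str:
        return "".join(span.text for span in self.spans)


def prevalent_font_size(ts: TextString, min_share: float) -> Optional[float]:
    """Assumed prevalence criterion: a font size must cover at least
    min_share of the string's characters; otherwise return None."""
    counts = Counter()
    for span in ts.spans:
        counts[span.font_size] += len(span.text)
    total = sum(counts.values())
    if total == 0:
        return None
    size, chars = counts.most_common(1)[0]
    return size if chars / total >= min_share else None


def body_font_size(corpus: List[TextString]) -> Optional[float]:
    """Assumed baseline: the font size covering the most characters corpus-wide."""
    counts = Counter()
    for ts in corpus:
        for span in ts.spans:
            counts[span.font_size] += len(span.text)
    return counts.most_common(1)[0][0] if counts else None


def tag_headers(corpus: List[TextString], max_words: int = 12,
                min_share: float = 0.8) -> List[str]:
    """Return the corpus text with a hypothetical <h1> tag on detected headers."""
    body = body_font_size(corpus)
    tagged = []
    for ts in corpus:
        text = ts.text
        # Step 1 (assumed group characteristic): word count of the string.
        if body is None or len(text.split()) > max_words:
            tagged.append(text)  # disqualified from header classification
            continue
        # Step 2: find a font characteristic that meets the prevalence criterion.
        size = prevalent_font_size(ts, min_share)
        # Step 3 (assumed font criterion): prevalent size exceeds the body size.
        if size is None or size <= body:
            tagged.append(text)  # disqualified from header classification
            continue
        # Step 4: classify as header and apply the header tag.
        tagged.append(f"<h1>{text}</h1>")
    return tagged
```

Checking the cheap group characteristic first means font analysis only runs on strings that are still header candidates, which mirrors the ordering in the abstract.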

