The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Jun. 28, 2005
Filed:
Jan. 18, 2002
Michael J. Lemon, Palo Alto, CA (US);
Maria Castellanos, Sunnyvale, CA (US);
James R. Stinger, Palo Alto, CA (US);
Hewlett-Packard Development Company, L.P., Houston, TX (US);
Abstract
Embodiments of the present invention are directed to a method for content mining of semi-structured documents. In one embodiment, a semi-structured document is first converted from a document-type specific format such as HTML or PDF, to a document-type independent format such as XML. The document formatting, which contains basic level information about the document's structure, is then analyzed by a series of modules to develop a higher level understanding of the document's structure. These modules append information to the document describing the features which collectively comprise the higher level document structure. The appended information facilitates finding specified information within the document when content mining is performed.
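The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the XML layout, the `font-size` attribute, the `role` and `section` annotations, and the two modules below are all hypothetical, chosen only to show how a series of modules might append structural information to a format-independent document so that a later content-mining step can locate specific text.

```python
import xml.etree.ElementTree as ET

# Hypothetical format-independent input: each <block> carries only basic
# formatting (a font size), standing in for the converted HTML or PDF.
SAMPLE = """<document>
  <block font-size="18">Product Overview</block>
  <block font-size="10">The printer supports duplex output.</block>
  <block font-size="18">Specifications</block>
  <block font-size="10">Speed: 20 pages per minute.</block>
</document>"""


def mark_headings(root):
    """Module 1 (assumed heuristic): blocks larger than the smallest font are headings."""
    body_size = min(int(b.get("font-size")) for b in root.iter("block"))
    for block in root.iter("block"):
        is_heading = int(block.get("font-size")) > body_size
        block.set("role", "heading" if is_heading else "body")


def assign_sections(root):
    """Module 2: build on module 1's annotations to number the sections."""
    section = 0
    for block in root.iter("block"):
        if block.get("role") == "heading":
            section += 1
        block.set("section", str(section))


def annotate(xml_text, modules=(mark_headings, assign_sections)):
    """Run each module in order; every module appends attributes to the tree."""
    root = ET.fromstring(xml_text)
    for module in modules:
        module(root)
    return root


def find_section_text(root, heading):
    """Content-mining step: return the body text under the named heading."""
    for block in root.iter("block"):
        if block.get("role") == "heading" and block.text == heading:
            target = block.get("section")
            return [
                b.text
                for b in root.iter("block")
                if b.get("section") == target and b.get("role") == "body"
            ]
    return []


if __name__ == "__main__":
    annotated = annotate(SAMPLE)
    print(find_section_text(annotated, "Specifications"))
```

Because each module only adds attributes and never removes formatting, later modules can rely on what earlier ones appended, which is the layering the abstract describes.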