The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
Jul. 28, 2015
Filed:
Sep. 30, 2009
Applicants:
Zaiqing Nie, Beijing, CN;
Yong Cao, Beijing, CN;
Ji-Rong Wen, Beijing, CN;
Chunyu Yang, Beijing, CN;
Inventors:
Zaiqing Nie, Beijing, CN;
Yong Cao, Beijing, CN;
Ji-Rong Wen, Beijing, CN;
Chunyu Yang, Beijing, CN;
Assignee:
Microsoft Technology Licensing, LLC, Redmond, WA (US);
Abstract
Described is a technology for understanding entities of a webpage, e.g., to label the entities on the webpage. An iterative and bidirectional framework processes a webpage, including a text understanding component (e.g., extended Semi-CRF model) that provides text segmentation features to a structure understanding component (e.g., extended HCRF model). The structure understanding component uses the text segmentation features and visual layout features of the webpage to identify a structure (e.g., labeled block). The text understanding component in turn uses the labeled block to further understand the text. The process continues iteratively until a similarity criterion is met, at which time the entities may be labeled. Also described is the use of multiple mentions of a set of text in the webpage to help in labeling an entity.
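The iterative, bidirectional loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the two components here are toy rule-based stand-ins rather than extended Semi-CRF or HCRF models, and every name (`Block`, `font_size`, `understand_text`, `understand_structure`, `label_page`) is hypothetical. The similarity criterion is assumed to be agreement between the results of two successive iterations, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Block:
    """A visual block of a webpage: its text plus one toy layout feature."""
    text: str
    font_size: int  # hypothetical visual layout feature


Segmentation = List[Tuple[str, str]]  # (token, entity label) pairs


def understand_text(block: Block, block_label: str) -> Segmentation:
    """Toy stand-in for the text understanding component (not a Semi-CRF).

    Uses the block label from the structure side to refine token labels.
    """
    segments = []
    for token in block.text.split():
        if block_label == "name_block" and token.istitle():
            segments.append((token, "NAME"))
        elif token.isdigit():
            segments.append((token, "NUMBER"))
        else:
            segments.append((token, "OTHER"))
    return segments


def understand_structure(block: Block, segments: Segmentation) -> str:
    """Toy stand-in for the structure understanding component (not an HCRF).

    Combines a text segmentation feature with a visual layout feature.
    """
    title_tokens = sum(1 for token, _ in segments if token.istitle())
    if block.font_size >= 18 and title_tokens >= 2:
        return "name_block"
    return "other_block"


def similarity(previous: list, current: list) -> float:
    """Fraction of positions where two successive results agree."""
    if not current:
        return 1.0
    agree = sum(1 for p, c in zip(previous, current) if p == c)
    return agree / len(current)


def label_page(blocks: List[Block], threshold: float = 1.0, max_iters: int = 10):
    """Alternate between the two components until results stop changing."""
    labels = ["other_block"] * len(blocks)
    segs = [understand_text(b, l) for b, l in zip(blocks, labels)]
    for _ in range(max_iters):
        # Structure side: consume text segmentation features.
        new_labels = [understand_structure(b, s) for b, s in zip(blocks, segs)]
        # Text side: consume the labeled blocks in turn.
        new_segs = [understand_text(b, l) for b, l in zip(blocks, new_labels)]
        converged = (similarity(labels, new_labels) >= threshold
                     and similarity(segs, new_segs) >= threshold)
        labels, segs = new_labels, new_segs
        if converged:  # assumed similarity criterion
            break
    return labels, segs


if __name__ == "__main__":
    page = [Block("Ada Lovelace", 24), Block("posted 42 comments", 10)]
    print(label_page(page))
```

In this sketch the first pass cannot label "Ada Lovelace" as a name because no block is labeled yet; only after the structure side marks the block as `name_block` does the text side relabel its tokens, which mirrors the feedback between components that the abstract describes.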