The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.
The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.
Patent No.:
Date of Patent:
May. 10, 2016
Filed:
Nov. 05, 2012
Microsoft Corporation, Redmond, WA (US);
Zhong Wu, Issaquah, WA (US);
Xian-Sheng Hua, Sammamish, WA (US);
Microsoft Technology Licensing, LLC, Redmond, WA (US);
Abstract
Architecture that includes a junk (unwanted) image detection algorithm which performs junk image detection of unwanted images before the images are actually downloaded for indexing. Features are employed related to image location information and host websites, such as image path descriptor (e.g., URL-uniform resource locator) pattern features, webpage content features, click features, and image aggregated information in a machine learning based framework to predict the probability that an image is unwanted (or wanted) before the images are downloaded. The framework is then applied to build a statistical model and predict junk scores. By removing image URLs marked as 'junk' from the work list of an automated indexer (e.g., crawler), the indexer bandwidth is significantly improved with a corresponding improvement in the publish rate.