The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
May 03, 2016
Filed:
Feb. 13, 2014
Applicant:
Go Daddy Operating Company, LLC, Scottsdale, AZ (US);
Inventors:
Robert Brown, Phoenix, AZ (US);
Tapan Kamdar, San Jose, CA (US);
Ryan Kirkish, Gilbert, AZ (US);
Wei-Cheng Lai, Cupertino, CA (US);
Jeff McLellan, Phoenix, AZ (US);
Assignee:
Go Daddy Operating Company, LLC, Scottsdale, AZ (US);
Abstract
Systems and methods for the categorization of websites are presented. A website is categorized using one or a combination of its domain name and its web page content. The domain name is tokenized, and the tokens compared to categories in a category structure to determine probabilities that the token belongs to each category. Combinations of tokens are similarly compared to the categories. A category may be determined with reference to a vector space in which a training set of websites having known categories is converted according to a methodology into reference vectors containing keyword frequencies. A target website is converted to a target vector using the same methodology, and a distance score of the target vector to each reference vector is calculated. The website represented by the target vector is assigned the category of the reference vector having the lowest distance score.
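The vector-space step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not name the keyword set, the conversion methodology, or the distance measure, so the keyword list, the relative-frequency conversion, the Euclidean distance, and the tiny training set below are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical keyword vocabulary; the abstract does not say how keywords are chosen.
KEYWORDS = ["pizza", "menu", "delivery", "loan", "mortgage", "rates"]


def to_vector(text):
    """Convert page text into a vector of relative keyword frequencies."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty text
    return [counts[k] / total for k in KEYWORDS]


def distance(a, b):
    """Euclidean distance, an assumed choice of distance score."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def categorize(target_text, training_set):
    """Return the category of the reference vector closest to the target vector."""
    target = to_vector(target_text)
    # The same conversion is applied to the reference texts and the target text.
    best = min(training_set, key=lambda item: distance(target, to_vector(item[0])))
    return best[1]


# Hypothetical training set: (website text, known category).
TRAINING_SET = [
    ("pizza menu pizza delivery order online", "restaurant"),
    ("mortgage rates loan calculator loan", "finance"),
]

print(categorize("best pizza delivery in town", TRAINING_SET))
```

In this sketch the target text is assigned the category of whichever training example has the lowest distance score, mirroring the last sentence of the abstract. A real system would likely use a much larger vocabulary and training set.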