The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).
Patent No.:
Date of Patent:
Jan. 29, 2013
Filed:
Dec. 04, 2008
Applicants:
Lei Zheng, Sunnyvale, CA (US);
Sharat Narayan, Sunnyvale, CA (US);
Mark E. Risher, San Francisco, CA (US);
Stanley Ke Wei, Palo Alto, CA (US);
Vishwanath Tumkur Ramarao, Sunnyvale, CA (US);
Anirban Kundu, San Francisco, CA (US);
Inventors:
Lei Zheng, Sunnyvale, CA (US);
Sharat Narayan, Sunnyvale, CA (US);
Mark E. Risher, San Francisco, CA (US);
Stanley Ke Wei, Palo Alto, CA (US);
Vishwanath Tumkur Ramarao, Sunnyvale, CA (US);
Anirban Kundu, San Francisco, CA (US);
Assignee:
Yahoo! Inc., Sunnyvale, CA (US);
Abstract
Embodiments are directed towards classifying messages as spam using a two-phase approach. The first phase employs a statistical classifier to classify messages based on message content. The second phase targets specific message types to capture dynamic characteristics of the messages and identify spam messages using a token-frequency-based approach. A client component receives messages and sends them to the statistical classifier, which determines a probability that a message belongs to a particular type of class. The statistical classifier further provides other information about a message, including a token list and token thresholds. The message class, token list, and thresholds are provided to the second phase, where the number of spam tokens in a given message for a given message class is determined. Based on the threshold, the client component then determines whether the message is spam or non-spam.
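The two-phase flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, keywords, spam tokens, and thresholds in `CLASS_PROFILES` are invented for illustration, and `phase_one` uses a crude keyword-fraction score as a stand-in for the statistical classifier, whose actual model the abstract does not specify. All function and type names (`phase_one`, `count_spam_tokens`, `is_spam`, `PhaseOneResult`) are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical per-class profiles. Every keyword, token, and threshold
# below is made up for illustration; none comes from the patent.
CLASS_PROFILES = {
    "lottery": {
        "keywords": frozenset({"lottery", "winner", "prize"}),
        "spam_tokens": frozenset({"claim", "urgent", "wire", "fee"}),
        "token_threshold": 2,
    },
    "pharmacy": {
        "keywords": frozenset({"pharmacy", "pills", "prescription"}),
        "spam_tokens": frozenset({"cheap", "discount", "overnight"}),
        "token_threshold": 2,
    },
}


@dataclass(frozen=True)
class PhaseOneResult:
    message_class: str      # class the message most likely belongs to
    probability: float      # toy score standing in for a class probability
    spam_tokens: frozenset  # token list handed to the second phase
    token_threshold: int    # spam-token count at which the message is spam


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phase_one(text):
    """Phase 1 (assumed stand-in): choose the class whose keywords make up
    the largest fraction of the message's tokens."""
    tokens = tokenize(text)
    best_class, best_score = None, 0.0
    for name, profile in CLASS_PROFILES.items():
        hits = sum(1 for t in tokens if t in profile["keywords"])
        score = hits / len(tokens) if tokens else 0.0
        if score > best_score:
            best_class, best_score = name, score
    if best_class is None:
        # No targeted class matched; nothing for the second phase to check.
        return PhaseOneResult("other", 0.0, frozenset(), 0)
    profile = CLASS_PROFILES[best_class]
    return PhaseOneResult(best_class, best_score,
                          profile["spam_tokens"], profile["token_threshold"])


def count_spam_tokens(text, result):
    """Phase 2: count occurrences of the class's spam tokens in the message."""
    return sum(1 for t in tokenize(text) if t in result.spam_tokens)


def is_spam(text):
    """Client-side decision: compare the spam-token count to the threshold."""
    result = phase_one(text)
    if result.message_class not in CLASS_PROFILES:
        return False  # untargeted message types are not flagged in this sketch
    return count_spam_tokens(text, result) >= result.token_threshold
```

In this sketch, a message is flagged only when it both falls into a targeted class and contains at least `token_threshold` occurrences of that class's spam tokens, so a harmless message that merely mentions a lottery is classified as "lottery" in the first phase but passes the second. Treating untargeted classes as non-spam is a simplifying assumption of the sketch.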