The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.
The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.
Patent No.:
Date of Patent:
May. 26, 2015
Filed:
Feb. 25, 2012
Michael Hart, Mountain View, CA (US);
Kushal Tayal, Stanford, CA (US);
Phillip Dicorpo, San Francisco, CA (US);
Michael Hart, Mountain View, CA (US);
Kushal Tayal, Stanford, CA (US);
Phillip DiCorpo, San Francisco, CA (US);
Symantec Corporation, Mountain View, CA (US);
Abstract
A computer-implemented method for classifying documents for data loss prevention may include 1) identifying training documents for a machine learning classifier configured for data loss prevention, 2) performing a semantic analysis on training documents to identify topics within the set training documents, 3) applying a similarity metric to the topics to identify at least one unrelated topic with a similarity to the other topics within the plurality of topics, as determined by the similarity metric, that falls below a similarity threshold, 4) identifying, based on the semantic analysis, at least one irrelevant training document within the set of training documents in which a predominance of the unrelated topic is above a predominance threshold, and 5) excluding the irrelevant training document from the set of training documents based on the predominance of the unrelated topic within the irrelevant training document. Various other methods, systems, and computer-readable media are also disclosed.