The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also links to the full patent document (in PDF format).
Patent No.:
Date of Patent:
May 14, 2019
Filed:
Jul. 10, 2017
Applicant:
Sichuan University, Chengdu, Sichuan, CN;
Inventors:
Junfeng Wang, Chengdu, CN;
Jie Liang, Chengdu, CN;
Xiaosong Zhang, Chengdu, CN;
Dong Liu, Chengdu, CN;
Yong Ma, Chengdu, CN;
Assignee:
SICHUAN UNIVERSITY, Chengdu, CN;
Abstract
This invention discloses a malicious software clustering method based on TLSH feature representation, which belongs to the field of malicious software analysis and detection. First, the Cuckoo Sandbox is used to analyze the malicious software and acquire three kinds of character string features: the static features of the software, the resource access records during operation, and the API calls. Then the character strings are split, filtered, and sorted, and the TLSH algorithm is used to compress them into three groups of feature values, each 70 characters long. Finally, the OPTICS algorithm is used to classify the malicious software into families automatically. The invention adopts an unsupervised learning method, so no manual labeling is needed for training in advance. The extracted features are compressed and represented using TLSH, so that, without losing feature information, the data dimensionality is greatly reduced and the clustering speed is improved. By adopting the density-based OPTICS clustering algorithm, the method can recognize clusters of any shape and any number, and it also greatly reduces the influence of the input parameters on the clustering result while improving the efficiency and quality of clustering.
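The clustering stage described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes each sample has already been reduced to a set of string tokens, and it swaps in a Jaccard distance between token sets where a real pipeline would use the distance between TLSH digests (and would keep three digests per sample rather than one token set). The sample data, the function names, `min_pts`, and `threshold` are all hypothetical. The ordering follows the standard OPTICS procedure, and clusters are read off with a simple reachability threshold.

```python
import math

def jaccard_distance(a, b):
    """Stand-in distance between two token sets; NOT a TLSH score."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def core_distance(dist, p, min_pts, eps):
    """Distance from p to its min_pts-th nearest point (counting p itself)."""
    near = sorted(d for d in dist[p] if d <= eps)
    return near[min_pts - 1] if len(near) >= min_pts else math.inf

def optics_order(dist, min_pts, eps=math.inf):
    """Return the OPTICS visiting order and each point's reachability distance."""
    n = len(dist)
    reach = [math.inf] * n
    done = [False] * n
    order = []

    def update(p, seeds):
        cd = core_distance(dist, p, min_pts, eps)
        if math.isinf(cd):
            return  # p is not a core point, so it reaches nothing
        for q in range(n):
            if done[q] or dist[p][q] > eps:
                continue
            new_reach = max(cd, dist[p][q])
            if new_reach < reach[q]:
                reach[q] = new_reach
                seeds.add(q)

    for start in range(n):
        if done[start]:
            continue
        done[start] = True
        order.append(start)
        seeds = set()
        update(start, seeds)
        while seeds:
            # Always expand the unprocessed point that is cheapest to reach.
            q = min(seeds, key=lambda i: (reach[i], i))
            seeds.remove(q)
            done[q] = True
            order.append(q)
            update(q, seeds)
    return order, reach

def extract_clusters(order, reach, dist, min_pts, threshold):
    """Threshold-based cluster extraction from the ordering; -1 marks noise."""
    labels = [-1] * len(order)
    cluster = -1
    for p in order:
        if reach[p] > threshold:
            # Not reachable from the previous cluster: start a new one if p is dense.
            if core_distance(dist, p, min_pts, math.inf) <= threshold:
                cluster += 1
                labels[p] = cluster
        else:
            labels[p] = cluster
    return labels

# Hypothetical token sets for seven samples (two families plus one outlier).
samples = [
    {"createfile", "regopenkey", "connect", "send"},
    {"createfile", "regopenkey", "connect", "recv"},
    {"createfile", "regopenkey", "connect", "send", "recv"},
    {"virtualalloc", "writeprocessmemory", "createremotethread"},
    {"virtualalloc", "writeprocessmemory", "createremotethread", "openprocess"},
    {"virtualalloc", "writeprocessmemory", "openprocess"},
    {"getsystemtime"},
]
dist = [[jaccard_distance(a, b) for b in samples] for a in samples]
order, reach = optics_order(dist, min_pts=2)
labels = extract_clusters(order, reach, dist, min_pts=2, threshold=0.3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

No family labels are supplied at any point, which matches the unsupervised setting the abstract describes. The number of clusters is not an input either: it falls out of the reachability ordering.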