Feature reduction to speed up malware classification

Moonsamy, Veelasha, Tian, Ronghua and Batten, Lynn 2012, Feature reduction to speed up malware classification. In Laud, Peeter (ed), Information security technology for applications, Springer, Berlin, Germany, pp. 176-188, doi: 10.1007/978-3-642-29615-4_13.

Title Feature reduction to speed up malware classification
Author(s) Moonsamy, Veelasha
Tian, Ronghua
Batten, Lynn (ORCID iD: orcid.org/0000-0003-4525-2423)
Title of book Information security technology for applications
Editor(s) Laud, Peeter
Publication date 2012
Series Lecture notes in computer science; v.7161
Chapter number 13
Total chapters 18
Start page 176
End page 188
Total pages 13
Publisher Springer
Place of Publication Berlin, Germany
Keyword(s) dynamic analysis
feature reduction
malware classification
Summary In statistical classification work, one method of speeding up the process is to use only a small percentage of the total parameter set available. In this paper, we apply this technique both to the classification of malware into families and to the identification of malware in a set combined with cleanware. To demonstrate the usefulness of our method, we use the same sets of malware and cleanware as in an earlier paper. Using the statistical technique Information Gain (IG), we reduce the set of features used in the experiment from 7,605 to just over 1,000. The best accuracy obtained in the earlier paper using 7,605 features is 97.3% for malware versus cleanware detection and 97.4% for malware family classification; on the reduced feature set, we obtain a best accuracy of 94.6% on the malware versus cleanware test and 94.5% on the malware family classification test. An interesting feature of the new tests is a reduction in false negative rates by a factor of about 1/3 compared with the results of the earlier paper. In addition, the running time of our tests is reduced by a factor of approximately 3/5 relative to the times reported in the earlier paper. The small loss in accuracy and the improved false negative rate, together with the significant improvement in speed, indicate that feature reduction should be pursued further as a tool to prevent algorithms from becoming intractable due to too much data.
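The Information Gain ranking mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes discrete (here binary) features, uses a made-up four-sample data set, and the function names entropy, information_gain and select_top_k are hypothetical helpers introduced only for illustration. It scores each feature by how much knowing its value reduces the entropy of the class labels, then keeps the k highest-scoring features.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(feature_values, labels):
    """IG = H(labels) - sum over values v of p(v) * H(labels | feature == v)."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(samples, labels, k):
    """Return (indices of the k highest-IG features, all IG scores)."""
    n_features = len(samples[0])
    scores = [
        information_gain([row[j] for row in samples], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data (an assumption for illustration): rows are samples,
# columns are binary features such as "feature j was observed".
samples = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
labels = ["malware", "malware", "cleanware", "cleanware"]

selected, scores = select_top_k(samples, labels, k=1)
print(selected, scores)
```

In this toy set, feature 0 separates the two classes perfectly and receives the highest score, while features 1 and 2 carry no information about the label. A classifier would then be trained only on the selected columns, which is the source of the speed-up the summary describes.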
Notes Presented at NordSec 2011: Information security technology for applications: Proceedings of the 16th Nordic Conference in Secure IT Systems
ISBN 3642296157
Language eng
DOI 10.1007/978-3-642-29615-4_13
Field of Research 080201 Analysis of Algorithms and Complexity
Socio Economic Objective 890301 Electronic Information Storage and Retrieval Services
HERDC Research category B1 Book chapter
Related work DU:30044841
Copyright notice ©2012, Springer-Verlag
Persistent URL http://hdl.handle.net/10536/DRO/DU:30044746

Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: Cited 7 times in TR Web of Science
Cited 12 times in Scopus
Access Statistics: 763 Abstract Views, 30 File Downloads
Created: Mon, 30 Apr 2012, 14:57:56 EST by Barb Robertson

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.