Feature reduction to speed up malware classification

Moonsamy, Veelasha; Tian, Ronghua; Batten, Lynn

chapter

posted on 2012-01-01, 00:00 authored by Veelasha Moonsamy, Ronghua Tian, Lynn Batten

In statistical classification work, one way to speed up the process is to use only a small percentage of the total available parameter set. In this paper, we apply this technique both to the classification of malware into families and to the identification of malware in a set that also contains cleanware. To demonstrate the usefulness of our method, we use the same sets of malware and cleanware as in an earlier paper. Using the statistical technique Information Gain (IG), we reduce the set of features used in the experiment from 7,605 to just over 1,000. The best accuracy obtained in the earlier paper using 7,605 features is 97.3% for malware versus cleanware detection and 97.4% for malware family classification; on the reduced feature set, we obtain a best accuracy of 94.6% on the malware versus cleanware test and 94.5% on the malware classification test. An interesting feature of the new tests presented here is the reduction in false negative rates by a factor of about 1/3 when compared with the results of the earlier paper. In addition, the running time of our tests is reduced by a factor of approximately 3/5 relative to the times reported in the original paper. The small loss in accuracy and the improved false negative rate, along with the significant improvement in speed, indicate that feature reduction should be pursued further as a tool to prevent algorithms from becoming intractable due to too much data.
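As an illustration of the kind of Information Gain ranking the abstract refers to, the following is a minimal sketch only. The abstract does not describe the chapter's feature encoding, tooling, or selection threshold, so the function names (`entropy`, `information_gain`, `select_top_k`), the binary feature encoding, and the toy data are all assumptions made for this example, not the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - sum over v of P(X = v) * H(Y | X = v)."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(rows, labels, k):
    """Return (indices of the k highest-IG features, all IG scores)."""
    n_features = len(rows[0])
    scores = [information_gain([row[j] for row in rows], labels)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: each row is one sample, each column one binary
# feature (for example, whether some behaviour was observed).
rows = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
labels = ["malware", "malware", "cleanware", "cleanware"]

selected, scores = select_top_k(rows, labels, k=1)
print(selected, scores)
```

In this toy set only the first column separates the two classes, so it is the one retained. On a real feature set, `k` (or a score threshold) would be chosen to trade accuracy against running time, which is the trade-off the abstract reports.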

History

Title of book

Information security technology for applications

Series

Lecture notes in computer science; v.7161

Chapter number

13

Pagination

176 - 188

Publisher

Springer

Place of publication

Berlin, Germany

Publisher DOI

https://doi.org/10.1007/978-3-642-29615-4_13

ISBN-13

9783642296154

ISBN-10

3642296157

Language

eng

Notes

Presented at NordSec 2011: Information security technology for applications: Proceedings of the 16th Nordic Conference in Secure IT Systems

Publication classification

B1 Book chapter

Copyright notice

2012, Springer-Verlag

Extent

18

Editor/Contributor(s)

P Laud

Related work

DU:30044841

Keywords

dynamic analysis; feature reduction; malware classification; Science & Technology; Technology; Computer Science, Theory & Methods; Computer Science; FEATURE-SELECTION
