Openly accessible

An automated classification system based on the strings of trojan and virus families

Tian, Ronghua, Batten, Lynn, Islam, Rafiqul and Versteeg, Steve 2009, An automated classification system based on the strings of trojan and virus families, in MALWARE 2009: 4th International Conference on Malicious and Unwanted Software, IEEE, New York, N.Y., pp. 23-30.

Attached Files
Name Description MIMEType Size Downloads
tian-rh-anautomatedclassification-2009.pdf Published version application/pdf 321.35KB 381

Title An automated classification system based on the strings of trojan and virus families
Author(s) Tian, Ronghua
Batten, Lynn
Islam, Rafiqul
Versteeg, Steve
Conference name International Conference on Malicious and Unwanted Software (4th : 2009 : Montréal, Quebec)
Conference location Montréal, Quebec, Canada
Conference dates 13–14 October 2009
Title of proceedings MALWARE 2009: 4th International Conference on Malicious and Unwanted Software
Editor(s) [Unknown]
Publication date 2009
Conference series Malicious and Unwanted Software Conference
Start page 23
End page 30
Total pages 8
Publisher IEEE
Place of publication New York, N.Y.
Keyword(s) malware
classification
strings
Summary Classifying malware correctly is an important research issue for anti-malware software producers. This paper presents an effective and efficient malware classification technique based on string information using several well-known classification algorithms. In our testing we extracted the printable strings from 1367 samples, including unpacked trojans and viruses and clean files. Information describing the printable strings contained in each sample was input to various classification algorithms, including tree-based classifiers, a nearest neighbour algorithm, statistical algorithms and AdaBoost. Using k-fold cross-validation on the unpacked malware and clean files, we achieved a classification accuracy of 97%. Our results reveal that strings from library code (rather than malicious code itself) can be utilised to distinguish different malware families.
Notes This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
ISBN 9781424457878
Language eng
Field of Research 080403 Data Structures
Socio Economic Objective 890205 Information Processing Services (incl. Data Entry and Capture)
HERDC Research category E1 Full written paper - refereed
Copyright notice ©2009, IEEE
Persistent URL http://hdl.handle.net/10536/DRO/DU:30028345
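
Illustrative sketch (not part of the repository record or the paper): the Summary describes extracting printable strings from each sample and evaluating classifiers with k-fold cross-validation. The Python below is a minimal sketch of that general idea under stated assumptions. The function names, the 4-byte minimum string length, the set-of-strings feature representation, the Jaccard similarity and the interleaved fold split are all assumptions made for illustration; the record does not say how the authors encoded their features or configured their classifiers.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> set:
    """Return the printable ASCII runs of at least min_len bytes (assumed threshold)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return {match.decode("ascii") for match in pattern.findall(data)}


def jaccard(a: set, b: set) -> float:
    """Similarity of two string sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_neighbour(sample: set, training: list) -> str:
    """Return the label of the training item most similar to sample."""
    return max(training, key=lambda item: jaccard(sample, item[0]))[1]


def k_fold_accuracy(dataset: list, k: int) -> float:
    """Hold out every k-th item in turn and score the 1-nearest-neighbour rule."""
    correct = 0
    for fold in range(k):
        held_out = dataset[fold::k]
        training = [item for i, item in enumerate(dataset) if i % k != fold]
        correct += sum(nearest_neighbour(s, training) == label
                       for s, label in held_out)
    return correct / len(dataset)


# Hypothetical toy data: two made-up "families" that differ in imported library names.
toy = [
    ({"KERNEL32.dll", "CreateFileA"}, "A"),
    ({"KERNEL32.dll", "CreateFileA", "WriteFile"}, "A"),
    ({"WS2_32.dll", "connect"}, "B"),
    ({"WS2_32.dll", "connect", "send"}, "B"),
]
accuracy = k_fold_accuracy(toy, k=2)
```

The toy families are separated only by library and API names, which loosely mirrors the Summary's observation that strings from library code can distinguish families. A real evaluation would need unpacked samples and one of the classifier families the Summary names.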

Document type: Conference Paper
Collections: School of Information Technology
Open Access Collection
 
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.

Citation counts: cited 6 times in TR Web of Science; cited 13 times in Scopus
Access statistics: 409 abstract views, 385 file downloads
Created: Thu, 15 Apr 2010, 13:22:47 EST by Sandra Dunoon
