Deakin University
File(s) under permanent embargo

An automated classification system based on the strings of trojan and virus families

conference contribution
posted on 2009-01-01, 00:00 authored by Ronghua Tian, Lynn Batten, R Islam, S Versteeg
Classifying malware correctly is an important research issue for anti-malware software producers. This paper presents an effective and efficient malware classification technique based on string information, using several well-known classification algorithms. In our testing we extracted the printable strings from 1367 samples, comprising unpacked trojans, unpacked viruses and clean files. Information describing the printable strings contained in each sample was input to various classification algorithms, including tree-based classifiers, a nearest neighbour algorithm, statistical algorithms and AdaBoost. Using k-fold cross validation on the unpacked malware and clean files, we achieved a classification accuracy of 97%. Our results reveal that strings from library code (rather than the malicious code itself) can be utilised to distinguish different malware families.
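
The pipeline outlined in the abstract (extract printable strings, describe each sample by the strings it contains, evaluate with k-fold cross validation) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the feature encoding, so the minimum string length of 4, the binary presence/absence vectors, and the helper names `extract_strings`, `build_feature_vectors` and `k_fold_indices` are all assumptions made for the example.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII bytes of at least min_len characters.

    Assumption: "printable string" means bytes 0x20-0x7e, and min_len=4
    mirrors the default of the Unix `strings` tool, not the paper.
    """
    pattern = re.compile(rb"[ -~]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


def build_feature_vectors(samples: dict[str, bytes], min_len: int = 4):
    """Encode each sample as a 0/1 vector over the global string vocabulary.

    Assumption: binary presence/absence features; the abstract only says
    that information describing the strings was given to the classifiers.
    """
    per_sample = {
        name: set(extract_strings(data, min_len))
        for name, data in samples.items()
    }
    vocabulary = sorted(set().union(*per_sample.values()))
    vectors = {
        name: [1 if s in strings else 0 for s in vocabulary]
        for name, strings in per_sample.items()
    }
    return vocabulary, vectors


def k_fold_indices(n: int, k: int):
    """Yield (train, test) index lists for k-fold cross validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


if __name__ == "__main__":
    # Hypothetical byte blobs standing in for unpacked executables.
    samples = {
        "sample_a": b"\x00\x01LoadLibraryA\x00\xffabc\x00GetProcAddress",
        "sample_b": b"GetProcAddress\x00\x02hi",
    }
    vocabulary, vectors = build_feature_vectors(samples)
    print(vocabulary)
    print(vectors)
```

The resulting vectors could then be passed to any of the classifier types the abstract names (tree-based, nearest neighbour, statistical, boosted), with each `(train, test)` pair from `k_fold_indices` used for one round of evaluation.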

History

Event

International Conference on Malicious and Unwanted Software (4th : 2009 : Montréal, Quebec)

Pagination

23 - 30

Publisher

IEEE

Location

Montréal, Quebec, Canada

Place of publication

New York, N.Y.

Start date

2009-10-13

End date

2009-10-14

ISBN-13

9781424457878

Language

eng

Notes

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

Publication classification

E1 Full written paper - refereed

Copyright notice

2009, IEEE

Title of proceedings

MALWARE 2009: 4th International Conference on Malicious and Unwanted Software
