Polymorphic malware detection using Hierarchical Hidden Markov Model

conference contribution
posted on 2011-01-01, 00:00 authored by F Muhaya, M Khan, Yang Xiang
Binary signatures are widely used to detect malicious software on the Internet. However, this approach cannot accurately identify polymorphic malware variants, which malware authors can easily generate using code generation engines. These engines randomly produce varying code sequences that perform the same malicious functions. Previous research used flow graphs and signature trees to identify polymorphic malware families. The key difficulty in that work is generating precisely defined state machine models from polymorphic variants. This paper proposes a novel approach that uses a Hierarchical Hidden Markov Model (HHMM) to provide accurate inductive inference of the malware family. The model can capture the self-similar and hierarchical structure of the signature sequences of a polymorphic malware family. To demonstrate the effectiveness and efficiency of this approach, we evaluate it on more than 15,000 real malware samples and find that it achieves a high true positive rate, a low false positive rate, and low computational cost.



Event

IEEE International Conference on Dependable, Autonomic and Secure Computing (9th : 2011 : Sydney, N.S.W.)


Pagination

151-155


Publisher

IEEE Computer Society Conference Publishing Services (CPS)


Location

Sydney, N.S.W.

Place of publication

[Piscataway, N.J.]

Publication classification

E1 Full written paper - refereed

Copyright notice

2011, IEEE

Title of proceedings

DASC 2011 : Proceedings of the 2011 IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing