Deakin University

File(s) under permanent embargo

Static malware clustering using enhanced deep embedding method

Version 2 2024-06-06, 03:13
Version 1 2019-04-30, 15:57
journal contribution
posted on 2024-06-06, 03:13 authored by CK Ng, Frank Jiang, Leo Zhang, W Zhou
Malware refers to any software, program, or file that is intentionally used to compromise a system and cause unexpected losses to end-users, such as economic losses or privacy breaches. The rapid growth of malware makes it impossible to keep up with its progress through human intervention or manual analysis alone. One challenge for human-oriented approaches is that they create a backlog and cannot keep pace with the way malware evolves. Hence, an efficient method is urgently needed to analyse malware effectively and identify it accurately. Malware clustering has been studied extensively in the machine learning area with regard to distance functions, grouping algorithms, and cluster validation. Many research studies have applied behavioral analysis to clustering in order to achieve high malware detection performance. However, behavioral approaches gain this detection performance at a high computational cost. To date, little work has focused on deep learning representations for malware clustering. Therefore, in this paper, we propose an enhanced deep embedded clustering method to facilitate an effective and efficient malware clustering process. The new method takes advantage of linear dimensionality reduction and a customised deep neural network to learn malware representations in an orthogonal space and perform cluster assignments. Our experimental results demonstrate that the proposed clustering model outperforms the traditional K-means method on the enhanced features obtained with various autoencoders, pre-trained weights, and principal component analysis (PCA).
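The abstract does not spell out how the cluster assignments are computed. As a rough illustration only, the sketch below shows the soft assignment and sharpened target distribution used in the standard deep embedded clustering (DEC) formulation, which the proposed method builds on by name. It is not the authors' enhanced variant: the function names, the toy two-dimensional embeddings, the fixed centroids, and the value `alpha=1.0` are all assumptions made for this example, and in practice the embeddings would come from a trained encoder rather than being written by hand.

```python
def soft_assign(embeddings, centroids, alpha=1.0):
    """Student's t soft assignment q[i][j] of embedded point i to cluster j."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            sq_dist = sum((a - b) ** 2 for a, b in zip(z, mu))
            # Heavy-tailed similarity kernel between point and centroid
            row.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalise over clusters
    return q


def target_distribution(q):
    """Sharpened targets: square q, divide by soft cluster size, renormalise."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def hard_labels(q):
    """Pick the most probable cluster for each point."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


# Hypothetical 2-D embeddings of four samples and two cluster centroids
embeddings = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [4.9, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
labels = hard_labels(q)
print(labels)
```

In the standard formulation, training minimises the KL divergence between the targets `p` and the assignments `q`, updating both the encoder and the centroids; a K-means run on the same embeddings is the usual way to initialise the centroids and serves as the natural baseline.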

History

Journal

Concurrency and Computation: Practice and Experience

Volume

31

Season

Special Issue: Special Issue on Algorithmic Advances in Parallel Architectures and Energy Efficient Computing (PPAM2017) and Recent Advances in Machine Learning for Cyber‐security (MLCSec2018)

Article number

ARTN e5234

Pagination

1 - 16

Location

Chichester, Eng.

ISSN

1532-0626

eISSN

1532-0634

Language

English

Publication classification

C1 Refereed article in a scholarly journal

Copyright notice

2019, John Wiley & Sons

Issue

19

Publisher

WILEY