Predicting the impact of Android malicious samples via machine learning

Qiu, Junyang, Luo, Wei, Pan, Lei, Tai, Yonghang, Zhang, Jun and Xiang, Yang 2019, Predicting the impact of Android malicious samples via machine learning, IEEE Access, pp. 1-14, doi: 10.1109/access.2019.2914311.


Title Predicting the impact of Android malicious samples via machine learning
Author(s) Qiu, Junyang
Luo, Wei
Pan, Lei
Tai, Yonghang
Zhang, Jun
Xiang, Yang
Journal name IEEE Access
Start page 1
End page 14
Total pages 14
Publisher IEEE
Place of publication Piscataway, N.J.
Publication date 2019
ISSN 2169-3536
Keyword(s) Malware
deep neural network
high impact malicious samples
low impact malicious samples
static analysis
Summary Android malicious samples threaten the security and privacy of billions of mobile end users. Researchers have designed many methods to identify Android malware samples automatically and accurately. However, the rapid increase in Android malicious samples outpaces the capabilities of traditional Android malware detectors and classifiers with respect to cyber security risk management needs. It is important to identify the small proportion of Android malicious samples that may produce high cyber-security or privacy impact. In this paper, we propose a light-weight solution to automatically identify the Android malicious samples with high security and privacy impact. We manually check a number of Android malware families and the corresponding security incidents, and define two impact metrics for Android malicious samples. Our investigation results in a new Android malware dataset with impact ground truth (low impact or high impact). This new dataset is employed to empirically investigate the intrinsic characteristics of low impact as well as high impact malicious samples. To characterize and capture the patterns of Android malicious samples, reverse engineering is performed to extract semantic features that represent the samples. The features are parsed from both the AndroidManifest.xml files and the disassembled classes.dex binary code. The extracted features are then embedded into numerical vectors. Furthermore, we train highly accurate Support Vector Machine and Deep Neural Network classifiers to categorize candidate Android malicious samples as low impact or high impact. The empirical results validate the effectiveness of our light-weight solution. The method can be further used to identify high impact Android malicious samples in the wild.
Notes Early Access Article
Language eng
DOI 10.1109/access.2019.2914311
HERDC Research category C1 Refereed article in a scholarly journal
Copyright notice ©2019, IEEE
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.
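
The feature-extraction and embedding steps described in the Summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the AndroidManifest.xml has already been decoded from the APK's binary format into plain XML, it uses requested permissions as the only feature type, and the names `extract_permissions`, `embed`, `VOCABULARY` and `SAMPLE_MANIFEST` are hypothetical. The resulting vectors would then be the input to a classifier such as a Support Vector Machine.

```python
import xml.etree.ElementTree as ET

# Attribute names in a decoded manifest carry the Android XML namespace.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME_ATTR = f"{{{ANDROID_NS}}}name"


def extract_permissions(manifest_xml: str) -> set[str]:
    """Return the permission names requested in a decoded AndroidManifest.xml."""
    root = ET.fromstring(manifest_xml)
    return {
        elem.attrib[NAME_ATTR]
        for elem in root.iter("uses-permission")
        if NAME_ATTR in elem.attrib
    }


def embed(features: set[str], vocabulary: list[str]) -> list[int]:
    """Binary embedding: 1 if the vocabulary entry is present, else 0."""
    return [1 if term in features else 0 for term in vocabulary]


# Hypothetical vocabulary and sample manifest, for illustration only.
VOCABULARY = [
    "android.permission.INTERNET",
    "android.permission.READ_SMS",
    "android.permission.SEND_SMS",
]

SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.sample">
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.SEND_SMS" />
</manifest>"""

vector = embed(extract_permissions(SAMPLE_MANIFEST), VOCABULARY)
print(vector)  # [1, 0, 1]
```

A fixed, ordered vocabulary keeps every sample's vector the same length, which is what a downstream classifier needs; features drawn from the disassembled classes.dex code could be appended to the same vocabulary.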

Created: Fri, 03 May 2019, 10:30:12 EST
