A3CM: automatic capability annotation for android malware

Qiu, Junyang; Zhang, Jun; Luo, Wei; Pan, Lei; Nepal, Surya; Wang, Yu; Xiang, Yang

journal contribution

posted on 2024-06-06, 00:31 authored by Junyang Qiu, Jun Zhang, Wei Luo, Lei Pan, Surya Nepal, Yu Wang, Yang Xiang

Android malware poses serious security and privacy threats to mobile users. Traditional malware detection and family classification technologies are becoming less effective because the malware landscape evolves rapidly and so-called zero-day-family malware (malware that does not belong to any existing family) keeps emerging. To address this issue, our paper presents a novel research problem: automatically identifying the security/privacy-related capabilities of any detected malware, which we refer to as Malware Capability Annotation (MCA). Motivated by the observation that known and zero-day-family malware share security/privacy-related capabilities, MCA opens an alternative way to analyze zero-day-family malware effectively by exploiting information and knowledge from known malware families. To address the MCA problem, we design a new MCA solution, Automatic Capability Annotation for Android Malware (A3CM). A3CM works in four steps: 1) A3CM automatically extracts a set of semantic features, such as permissions, API calls and network addresses, from raw binary APKs to characterize malware samples; 2) A3CM applies a statistical embedding method to map the features into a joint feature space, so that malware samples can be represented as numerical vectors; 3) A3CM infers the malicious capabilities by using a multi-label classification model; 4) the trained multi-label model is used to annotate the malicious capabilities of candidate malware samples. To facilitate new research on MCA, we create a new ground-truth dataset that consists of 6,899 annotated Android malware samples from 72 families. We carry out extensive experiments on four representative security/privacy-related capabilities to evaluate the effectiveness of A3CM. Our results show that A3CM achieves promising accuracy of 1.00, 0.98 and 0.63 in inferring multiple capabilities of known Android malware, small-size-family malware and zero-day-family Android malware, respectively.
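The four steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the embedding method or the multi-label model, so the sketch assumes a plain binary presence vector as the embedding and one perceptron per capability (binary relevance) as the multi-label classifier. The feature strings, the capability names `sms_fraud` and `privacy_steal`, and the toy training set are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Step 1 (assumed already done): each sample is a set of extracted features.
# Hypothetical toy data: (features, annotated capabilities).
TRAIN: List[Tuple[Set[str], Set[str]]] = [
    ({"perm:SEND_SMS", "api:sendTextMessage"}, {"sms_fraud"}),
    ({"perm:READ_CONTACTS", "api:getDeviceId", "url:example.test"},
     {"privacy_steal"}),
    ({"perm:SEND_SMS", "api:sendTextMessage",
      "perm:READ_CONTACTS", "api:getDeviceId"},
     {"sms_fraud", "privacy_steal"}),
]


def build_vocab(feature_sets: List[Set[str]]) -> Dict[str, int]:
    """Assign one vector dimension to every feature seen in training."""
    return {f: i for i, f in enumerate(sorted(set().union(*feature_sets)))}


def embed(features: Set[str], vocab: Dict[str, int]) -> List[int]:
    """Step 2 (assumption): binary presence vector; unseen features are ignored."""
    vec = [0] * len(vocab)
    for f in features:
        if f in vocab:
            vec[vocab[f]] = 1
    return vec


def score(w: List[float], b: float, x: List[int]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def fit(train: List[Tuple[Set[str], Set[str]]], epochs: int = 20):
    """Step 3 (assumption): binary relevance, one perceptron per capability."""
    vocab = build_vocab([feats for feats, _ in train])
    X = [embed(feats, vocab) for feats, _ in train]
    labels = sorted(set().union(*(caps for _, caps in train)))
    models: Dict[str, Tuple[List[float], float]] = {}
    for label in labels:
        w, b = [0.0] * len(vocab), 0.0
        for _ in range(epochs):
            for x, (_, caps) in zip(X, train):
                target = 1 if label in caps else -1
                if target * score(w, b, x) <= 0:  # misclassified: update
                    w = [wi + target * xi for wi, xi in zip(w, x)]
                    b += target
        models[label] = (w, b)
    return vocab, models


def annotate(features: Set[str], vocab, models) -> Set[str]:
    """Step 4: annotate a candidate sample with every capability that fires."""
    x = embed(features, vocab)
    return {label for label, (w, b) in models.items() if score(w, b, x) > 0}
```

Calling `fit(TRAIN)` and then `annotate(features, vocab, models)` on a new feature set returns a set of capability labels. Because each capability has its own binary model, a sample can receive zero, one or several annotations, which is the property that distinguishes capability annotation from single-label family classification.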

History

Journal

IEEE Access

Volume

7

Pagination

147156-147168

Location

Piscataway, N.J.

Open access

Yes

Link to full text

https://doi.org/10.1109/access.2019.2946392

ISSN

2169-3536

eISSN

2169-3536

Language

eng

Publication classification

C1 Refereed article in a scholarly journal

Copyright notice

2019, IEEE

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Keywords

Android malware; security/privacy-related capability; multi-label learning; malicious capability prediction; zero-day-family malware; 4604 Cybersecurity and privacy; 4699 Other information and computing sciences; Technology
