Deakin University

An Unsupervised Model for Identifying and Characterizing Dark Web Forums

Version 2 2024-06-04, 04:38
Version 1 2021-09-11, 07:21
journal contribution
posted on 2024-06-04, 04:38 authored by S Nazah, Shamsul Huda, Jemal Abawajy, MM Hassan
Criminals make heavy use of Dark Web forums to trade confidential information and illicit products. This paper addresses the problem of how to identify clusters of discussion forums on the Dark Web and their characteristics. Existing methods mostly depend on a continuous supply of labeled content, which is expensive to produce and not feasible given the nature of Dark Web data. An approach that does not need continuously available labeled forums and related knowledge is therefore required. To this end, we propose an unsupervised model that identifies and characterizes Dark Web forums by combining a clustering algorithm with a decision tree algorithm. The proposed method presents the characteristics in an explainable form that cyber threat intelligence systems and law enforcement can use as scientific evidence when analyzing data breaches or illicit activities in Dark Web forums. To evaluate the performance of our model, comprehensive experiments were conducted using real Dark Web forum data. The proposed approach achieves 98% accuracy and an F1 score of 98%, validating the efficacy of the model in characterizing Dark Web forums. The experimental results suggest that the proposed model could help the cyber threat intelligence and law enforcement communities build an intelligent source of knowledge for detecting data breaches and illicit activities in Dark Web forums.
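
The general idea of pairing clustering with a decision tree can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, the features, or the tree settings, so the k-means routine, the bag-of-words features, the depth-1 tree (a single split), and the toy posts below are all assumptions chosen for illustration. The posts are first grouped without labels, and the resulting cluster ids are then used as targets for a one-split tree, which yields a readable rule describing what separates the clusters.

```python
from collections import Counter

# Toy, made-up posts (hypothetical; not data from the paper).
POSTS = [
    "selling stolen credit card dumps",
    "fresh credit card dumps for sale",
    "credit card numbers with cvv",
    "new exploit kit for windows",
    "windows exploit source code leak",
    "zero day exploit for sale",
]


def vectorize(posts):
    """Turn each post into a word-count vector over a shared, sorted vocabulary."""
    vocab = sorted({word for post in posts for word in post.split()})
    vectors = []
    for post in posts:
        counts = Counter(post.split())
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_indices, iterations=10):
    """Plain k-means; initial centroids are fixed so the result is deterministic."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def gini(labels):
    """Gini impurity of a list of cluster labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_stump(vocab, vectors, labels):
    """Depth-1 decision tree: the word whose presence best splits the clusters."""
    best_word, best_score = None, None
    for j, word in enumerate(vocab):
        present = [lab for v, lab in zip(vectors, labels) if v[j] > 0]
        absent = [lab for v, lab in zip(vectors, labels) if v[j] == 0]
        score = (len(present) * gini(present) + len(absent) * gini(absent)) / len(labels)
        if best_score is None or score < best_score:
            best_word, best_score = word, score
    return best_word, best_score


vocab, vectors = vectorize(POSTS)
# Step 1 (unsupervised): cluster the posts; posts 0 and 3 seed the two centroids.
labels = kmeans(vectors, init_indices=[0, 3])
# Step 2 (explanation): fit a one-split tree on the cluster ids as pseudo-labels.
word, impurity = best_stump(vocab, vectors, labels)
print("cluster labels:", labels)
print(f"rule: contains '{word}' (weighted Gini after split = {impurity})")
```

A single split is enough for this toy data; with more clusters or overlapping vocabularies, a deeper tree would be needed to produce a rule per cluster.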

History

Journal

IEEE Access

Volume

9

Pagination

112871-112892

Location

Piscataway, N.J.

Open access

  • Yes

ISSN

2169-3536

eISSN

2169-3536

Language

English

Publication classification

C1 Refereed article in a scholarly journal

Publisher

IEEE