Deakin University
File(s) under permanent embargo

Ensemble of Local Decision Trees for Anomaly Detection in Mixed Data

Version 2 2024-06-05, 11:56
Version 1 2021-09-16, 00:25
conference contribution
posted on 2024-06-05, 11:56 authored by Sunil Aryal, Jonathan R Wells
Anomaly Detection (AD) is used in many real-world applications such as cybersecurity, banking, and national intelligence. Though many AD algorithms have been proposed in the literature, their effectiveness in practical real-world problems is rather limited. This is mainly because most of them: (i) examine anomalies globally w.r.t. the entire data, whereas some anomalies exhibit suspicious characteristics only w.r.t. their local neighbourhood (local context) and appear normal in the global context; and (ii) assume that all data features are numeric, whereas real-world data have both numeric/quantitative and categorical/qualitative features. In this paper, we propose a simple, robust solution to address these two issues. The main idea is to partition the data space and build local models in different regions rather than building a single global model for the entire data space. To cover sufficient local context around a test data instance, multiple local models from different partitions (an ensemble of local models) are used. As local models, we use classical decision trees, which handle both numeric and categorical features well. Our results show that an Ensemble of Local Decision Trees (ELDT) produces better and more consistent detection accuracies than popular state-of-the-art AD methods, particularly in datasets with mixed types of features.
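The partition-then-ensemble idea in the abstract can be illustrated with a minimal sketch. It is not the authors' ELDT implementation: the abstract does not say how the data space is partitioned or how the trees produce scores, so everything below is an assumption made for illustration. Regions are formed by assigning each instance to the nearest of a few randomly sampled instances under a hypothetical mixed-type distance (`mixed_distance`), and the local model is a simple per-feature deviation scorer (`LocalModel`) used as a stand-in where the paper uses decision trees. All class, function, and parameter names are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def mixed_distance(a, b):
    """Absolute difference on numeric features, 0/1 mismatch on categorical ones."""
    return sum(
        abs(x - y) if isinstance(x, (int, float)) else float(x != y)
        for x, y in zip(a, b)
    )


class LocalModel:
    """Stand-in local scorer (an assumption; the paper uses decision trees here)."""

    def __init__(self, rows):
        self.stats = []
        for col in zip(*rows):
            if isinstance(col[0], (int, float)):
                # Numeric feature: local mean and spread (1.0 if the spread is zero).
                self.stats.append(("num", mean(col), pstdev(col) or 1.0))
            else:
                # Categorical feature: value counts within the region.
                self.stats.append(("cat", Counter(col), len(col)))

    def score(self, x):
        total = 0.0
        for value, (kind, a, b) in zip(x, self.stats):
            if kind == "num":
                total += abs(value - a) / b    # deviation from the local mean
            else:
                total += 1.0 - a[value] / b    # rarity of the value in the region
        return total / len(self.stats)


class LocalEnsemble:
    """Several random partitions of the data, one local model per region."""

    def __init__(self, data, n_partitions=10, n_regions=4, seed=0):
        rng = random.Random(seed)
        self.partitions = []
        for _ in range(n_partitions):
            # Assumed partitioning scheme: nearest of a few sampled instances.
            centers = rng.sample(data, n_regions)
            regions = {}
            for row in data:
                nearest = min(range(n_regions),
                              key=lambda i: mixed_distance(row, centers[i]))
                regions.setdefault(nearest, []).append(row)
            self.partitions.append(
                [(centers[i], LocalModel(rows)) for i, rows in regions.items()]
            )

    def score(self, x):
        # Average the scores of the local models whose regions contain x.
        scores = []
        for partition in self.partitions:
            _, model = min(partition, key=lambda cm: mixed_distance(x, cm[0]))
            scores.append(model.score(x))
        return sum(scores) / len(scores)


# Toy mixed data: two groups, each with its own numeric range and category.
data = [(i * 0.1, "tcp") for i in range(20)] + \
       [(9.0 + i * 0.1, "udp") for i in range(20)]

odd = (10.9, "tcp")     # common values globally, unusual combination locally
usual = (10.9, "udp")

ensemble = LocalEnsemble(data, n_partitions=10, n_regions=4)
global_model = LocalEnsemble(data, n_partitions=1, n_regions=1)

print(ensemble.score(odd) > ensemble.score(usual))           # True
print(global_model.score(odd) == global_model.score(usual))  # True
```

On this toy data, the single-region (global) scorer cannot separate the two test instances because "tcp" and "udp" are equally frequent overall, while the ensemble of local scorers ranks the locally unusual instance higher. This only demonstrates the local-context argument; it says nothing about the accuracy reported in the paper.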

History

Volume

12975

Pagination

687-702

Location

Bilbao, Spain

Start date

2021-09-13

End date

2021-09-17

ISSN

0302-9743

eISSN

1611-3349

ISBN-13

9783030864866

Language

eng

Publication classification

E1 Full written paper - refereed

Title of proceedings

ECML PKDD 2021 : Machine Learning and Knowledge Discovery in Databases. Research Track

Event

Machine Learning and Knowledge Discovery in Databases. Conference (2021 : Bilbao, Spain)

Publisher

Springer

Place of publication

Cham, Switzerland

Series

Lecture Notes in Computer Science ; 12975
