Ensemble of Local Decision Trees for Anomaly Detection in Mixed Data

Aryal, Sunil and Wells, Jonathan 2021, Ensemble of Local Decision Trees for Anomaly Detection in Mixed Data, in ECML PKDD 2021 : Machine Learning and Knowledge Discovery in Databases. Research Track, Springer, Cham, Switzerland, pp. 687-702, doi: 10.1007/978-3-030-86486-6_42.

Title Ensemble of Local Decision Trees for Anomaly Detection in Mixed Data
Author(s) Aryal, Sunil
Wells, Jonathan
Conference name Machine Learning and Knowledge Discovery in Databases. Conference (2021 : Bilbao, Spain)
Conference location Bilbao, Spain
Conference dates 2021/09/13 - 2021/09/17
Title of proceedings ECML PKDD 2021 : Machine Learning and Knowledge Discovery in Databases. Research Track
Publication date 2021
Series Lecture Notes in Computer Science ; 12975
Start page 687
End page 702
Total pages 16
Publisher Springer
Place of publication Cham, Switzerland
Keyword(s) Anomaly detection
Mixed data
LOF
IForest
Ensemble anomaly detection
Decision trees
CORE2020 A
Summary Anomaly Detection (AD) is used in many real-world applications such as cybersecurity, banking, and national intelligence. Though many AD algorithms have been proposed in the literature, their effectiveness in practical real-world problems is rather limited. This is mainly because most of them: (i) examine anomalies globally w.r.t. the entire data, although some anomalies exhibit suspicious characteristics only w.r.t. their local neighbourhood (local context) and appear normal in the global context; and (ii) assume that data features are all numeric, whereas real-world data have both numeric/quantitative and categorical/qualitative features. In this paper, we propose a simple, robust solution to address the above-mentioned issues. The main idea is to partition the data space and build local models in different regions rather than building a global model for the entire data space. To cover sufficient local context around a test data instance, multiple local models from different partitions (an ensemble of local models) are used. As local models, we used classical decision trees, which handle numeric and categorical features well. Our results show that an Ensemble of Local Decision Trees (ELDT) produces better and more consistent detection accuracies than popular state-of-the-art AD methods, particularly on datasets with mixed types of features.
ISBN 9783030864866
Language eng
DOI 10.1007/978-3-030-86486-6_42
HERDC Research category E1 Full written paper - refereed
Persistent URL http://hdl.handle.net/10536/DRO/DU:30155469

Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Created: Thu, 16 Sep 2021, 00:25:55 EST