An investigation of performance analysis of anomaly detection techniques for big data in SCADA systems
Journal contribution posted on 2015-01-01, 00:00, authored by Mohiuddin Ahmed, Adnan Anwar, Abdun Naser Mahmood, Zubair Shah, Michael J Maher
Anomaly detection is an important aspect of data mining, where the main objective is to identify anomalous or unusual data in a given dataset. However, there is no formal categorisation of application-specific anomaly detection techniques for big data, and this creates confusion for data miners. In this paper, we categorise anomaly detection techniques into nearest-neighbour, clustering and statistical approaches and analyse the performance of these techniques in critical infrastructure applications such as SCADA systems. Extensive experimental analysis is conducted to compare representative algorithms from each category using seven benchmark datasets (both real and simulated) from SCADA systems. The effectiveness of the representative algorithms is measured using a number of metrics. We highlight the set of algorithms that perform best for SCADA systems.
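
As a rough illustration of two of the categories named in the abstract, the sketch below scores points with a nearest-neighbour distance and with a simple statistical z-score. It is not taken from the paper: the function names `knn_scores` and `z_scores`, the choice of k, and the toy readings are all assumptions made for illustration, and the paper's representative algorithms may differ.

```python
import math
import statistics

def knn_scores(points, k=2):
    """Nearest-neighbour style score: mean distance to the k closest other points."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def z_scores(values):
    """Statistical style score: absolute deviation from the mean, in standard deviations."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [abs(v - mean) / std for v in values]

# Hypothetical toy readings (not from the paper's datasets); the last one is unusual.
readings = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 1.0), (5.0, 5.2)]

knn = knn_scores(readings, k=2)
z = z_scores([x for x, _ in readings])

# Index of the highest-scoring (most anomalous) reading under each approach.
knn_top = max(range(len(knn)), key=knn.__getitem__)
z_top = max(range(len(z)), key=z.__getitem__)
print(knn_top, z_top)
```

In both cases a higher score means a more unusual point; in practice a threshold or a top-n cut-off would be needed to turn scores into anomaly labels, and that choice is not specified here.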