A new data mining scheme for analysis of big brain signal data
Version 2 2024-06-06, 12:32
Version 1 2019-05-07, 10:08
conference contribution
posted on 2024-06-06, 12:32 authored by S Siuly, R Zarei, H Wang, Y Zhang
Analysis and processing of brain signal data (e.g. electroencephalogram (EEG) data) is a significant challenge in the medical data mining community because of its massive size and dynamic nature. The most crucial part of EEG data analysis is discovering hidden knowledge in a large volume of data through pattern mining. This study focuses on discovering representative patterns from each channel's data so that useful information is recovered while the size of the data is reduced. In this paper, a novel algorithm based on principal component analysis (PCA) is developed to accurately and efficiently extract a pattern from the vast amount of EEG signal data. PCA is used to explore the sequential pattern of each EEG channel's data because it is a dominant tool for finding patterns in such data. To represent the distribution of the pattern, the most significant statistical features (e.g. mean, standard deviation) are computed from the extracted pattern. The features extracted from all of the patterns of a subject are then aggregated into a feature vector set, which is fed separately into random forest (RF) and random tree (RT) classification models to classify the different categories of signals. The proposed methodology is tested on two benchmark EEG datasets from BCI Competition III: dataset V and dataset IVa. To further evaluate performance, the proposed scheme is compared with some recently reported algorithms that used the same datasets. The experimental results demonstrate that the proposed methodology with RF achieves higher performance than with RT and than the recently reported methods. The present study suggests the merits and feasibility of applying the proposed method in medical data mining for efficient analysis of any biomedical signal data.
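
The per-channel step described in the abstract (PCA pattern, then statistical features, then one aggregated feature vector) might look roughly like the sketch below. The abstract does not say how the PCA input is formed or how the "pattern" is defined, so several choices here are assumptions made only for illustration: each channel is cut into non-overlapping segments of a hypothetical width of 8 samples, the pattern is taken to be the rank-1 reconstruction from the first principal component (approximated by power iteration), and the data are synthetic rather than drawn from BCI Competition III. The function names are likewise hypothetical.

```python
import math
import random
import statistics


def segment(signal, width):
    """Split a 1-D signal into non-overlapping rows of length `width`."""
    n_rows = len(signal) // width
    return [signal[i * width:(i + 1) * width] for i in range(n_rows)]


def first_component_pattern(rows, n_iter=200, seed=0):
    """Rank-1 PCA reconstruction of `rows`, flattened back into a sequence."""
    n, d = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - col_means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d) of the centered segments.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration from a seeded random start approximates the
    # leading eigenvector, i.e. the first principal component.
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(n_iter):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # constant signal: no variance to capture
            break
        vec = [x / norm for x in nxt]
    # Project each centered segment onto the component, then map it back.
    pattern = []
    for c in centered:
        score = sum(c[j] * vec[j] for j in range(d))
        pattern.extend(col_means[j] + score * vec[j] for j in range(d))
    return pattern


def channel_features(signal, width=8):
    """Mean and standard deviation of the PCA pattern of one channel."""
    pattern = first_component_pattern(segment(signal, width))
    return [statistics.fmean(pattern), statistics.pstdev(pattern)]


# Synthetic two-channel "recording" (made-up data, not a benchmark dataset).
rng = random.Random(42)
channels = [
    [math.sin(0.2 * t) + rng.gauss(0.0, 0.3) for t in range(256)],
    [math.cos(0.05 * t) + rng.gauss(0.0, 0.3) for t in range(256)],
]
# Aggregate the per-channel features into one feature vector.
feature_vector = [f for ch in channels for f in channel_features(ch)]
print(feature_vector)
```

In a fuller pipeline, one such feature vector per trial or subject would be passed to a classifier. A random forest implementation such as scikit-learn's `RandomForestClassifier` could stand in for the RF model, though the abstract does not state which implementation the authors used.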