File(s) under permanent embargo
New gene selection method using gene expression programing approach on microarray data sets
conference contribution
posted on 2019-01-01, 00:00 authored by R Alanni, Jingyu HouJingyu Hou, Hasseeb Dawood Injas Azzawi, Yong XiangYong XiangFeature selection in machine learning and data mining facilitates the optimization of accuracy attained from the classifier with smallest number of features. The use of feature selection in microarray data mining is quite promising. However, usually it is hard to identify and select the feature genes from microarray data sets because multi-class categories and high dimensionality features exist in microarray data with a small-sized sample. Therefore, using good selection approaches to eliminate incomprehensibility and optimize prediction accuracy is becoming necessary, because it will help obtain genes that are relevant to sample classification when investigating large number of genes. In his paper, we propose a new feature selection method for microarray data sets. The method consists of the Gain Ratio (GR) and Improved Gene Expression Programming (IGEP) algorithms which are for gene filtering and feature selection respectively. Support Vector Machine (SVM) alongside with leave-one-out cross-validation (LOOCV) method was used to evaluate the proposed method on eight microarray datasets captured in the literature. The experimental results showed the effectiveness of the proposed method in selecting small number of features while generating higher classification accuracies compared with other existing feature selection approaches.