Feature selection in machine learning and data mining helps maximize classifier accuracy while using the smallest possible number of features. Its use in microarray data mining is promising. However, feature genes are usually hard to identify and select in microarray data sets, because such data combine multiple class categories and high-dimensional features with a small sample size. Good selection approaches that reduce incomprehensibility and optimize prediction accuracy are therefore becoming necessary, because they help obtain the genes relevant to sample classification when a large number of genes is investigated. In this paper, we propose a new feature selection method for microarray data sets. The method combines the Gain Ratio (GR) and Improved Gene Expression Programming (IGEP) algorithms, which are used for gene filtering and feature selection, respectively. A Support Vector Machine (SVM) with leave-one-out cross-validation (LOOCV) was used to evaluate the proposed method on eight microarray data sets from the literature. The experimental results showed that the proposed method selects a small number of features while achieving higher classification accuracies than other existing feature selection approaches.
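As an illustration of the gene filtering stage only, the following is a minimal sketch of ranking genes by gain ratio (information gain divided by split information). It is not the authors' implementation: it assumes the expression values have already been discretized, and the function names, gene names, and toy data are hypothetical. The IGEP feature selection and the SVM/LOOCV evaluation are not shown.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of one discretized feature with respect to the class labels."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: class entropy minus the weighted entropy after the split.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the feature's own value distribution.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_genes(expression, labels):
    """Return gene names sorted by descending gain ratio."""
    scores = {gene: gain_ratio(values, labels) for gene, values in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy, made-up data: expression levels already discretized to "low"/"high".
labels = ["tumor", "tumor", "normal", "normal"]
expression = {
    "gene_a": ["high", "high", "low", "low"],   # separates the classes perfectly
    "gene_b": ["high", "low", "high", "low"],   # carries no class information
    "gene_c": ["high", "high", "high", "low"],  # partially informative
}
ranking = rank_genes(expression, labels)
```

In a filter of this kind, the top-ranked genes would be passed on to the subsequent selection stage; how many genes to keep is a design choice that the abstract does not specify.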
History
Volume
791
Pagination
17-31
Location
Singapore
Start date
2018-06-06
End date
2018-06-08
ISSN
1860-949X
Language
eng
Publication classification
E Conference publication, E1 Full written paper - refereed
Copyright notice
2019, Springer Nature Switzerland AG
Editor/Contributor(s)
Lee R
Title of proceedings
ICIS 2018 : Proceedings of the 17th IEEE/ACIS International Conference on Computer and Information Science
Event
International Association for Computer and Information Science. Conference (17th : 2018 : Singapore)
Publisher
Springer Nature
Place of publication
Cham, Switzerland
Series
International Association for Computer and Information Science Conference