Enhanced classification models for Iris dataset
Version 2 2024-06-13, 14:06
Version 1 2020-09-08, 15:55
conference contribution
posted on 2024-06-13, 14:06 authored by Y Wu, J He, Y Ji, G Huang, H Yao, P Zhang, W Xu, M Guo, Y Li
© 2020 The Authors. Published by Elsevier B.V.
Data mining and machine learning are both useful tools in data analysis. Classification is one of the most important techniques in data mining, so it is of great significance to select suitable, efficient classification models when solving classification problems on the Iris data. With this goal, a decision tree induction algorithm, namely graftedTree, is proposed to build randomized decision trees. Randomization is explicitly introduced into this algorithm, so that applying it several times to the same training data yields diversified models. An ensemble classification model is then constructed from multiple randomized decision trees via majority voting. To compare the classification performance of different models, we propose using precision, recall, F-Measure, the area under the ROC curve (AUC) and the Gini coefficient as evaluation indexes on the Iris dataset. The experimental results show that the Random Forests model generally performs better than the Boosting Tree model and three other popular algorithms: KNN, SMO and Simple Cart. However, the Gini coefficient of the Random Forests model shows that it yields a less pure training set than the other models. The new GraftedTrees model inherits the advantages of Random Forest and further employs a random mixture of two interchangeable node splitting rule inductions, with the aim of obtaining higher computational efficiency and better accuracy. Given these advantages, the new GraftedTrees model is expected to prove the most powerful classification model in the near future.
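The idea described in the abstract (randomized trees grown on the same training data, combined by majority voting) can be illustrated with a minimal sketch. This is not the authors' graftedTree implementation. The abstract does not name the two interchangeable splitting rules, so the choice of Gini impurity and entropy below is an assumption, as are the random per-node feature selection, the function names, the `max_depth` limit and the toy data.

```python
import math
import random
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def majority_label(labels):
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, rng, depth=0, max_depth=3):
    """Grow one randomized tree; a leaf is a label, a node is a 4-tuple."""
    if len(set(labels)) == 1 or depth == max_depth:
        return majority_label(labels)
    # Assumption: the two interchangeable rules are Gini and entropy.
    criterion = rng.choice([gini_impurity, entropy])
    feature = rng.randrange(len(rows[0]))  # assumption: random feature per node
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [i for i, row in enumerate(rows) if row[feature] <= threshold]
        right = [i for i, row in enumerate(rows) if row[feature] > threshold]
        if not left or not right:
            continue
        # Weighted impurity of the two children under the chosen rule.
        score = sum(
            len(side) / len(rows) * criterion([labels[i] for i in side])
            for side in (left, right)
        )
        if best is None or score < best[0]:
            best = (score, threshold, left, right)
    if best is None:  # the chosen feature is constant here, so stop
        return majority_label(labels)
    _, threshold, left, right = best
    children = [
        build_tree([rows[i] for i in side], [labels[i] for i in side],
                   rng, depth + 1, max_depth)
        for side in (left, right)
    ]
    return (feature, threshold, children[0], children[1])


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree


def build_ensemble(rows, labels, n_trees=5, seed=0):
    """Run the randomized builder several times on the same training data."""
    rng = random.Random(seed)
    return [build_tree(rows, labels, rng) for _ in range(n_trees)]


def predict_ensemble(trees, row):
    """Majority vote over the predictions of the individual trees."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]


# Hypothetical toy data (two measurements per flower), not the real Iris set.
toy_rows = [[1.0, 0.2], [1.3, 0.3], [1.5, 0.2],
            [4.5, 1.5], [4.7, 1.4], [5.0, 1.7]]
toy_labels = ["setosa"] * 3 + ["versicolor"] * 3
toy_trees = build_ensemble(toy_rows, toy_labels)
print(predict_ensemble(toy_trees, [1.2, 0.25]))
```

Because every tree sees the same rows, diversity in this sketch comes only from the random choice of feature and splitting rule at each node, which mirrors the abstract's statement that repeated runs on the same training data give different models.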
History
Volume
162
Pagination
946-954
Location
Granada, Spain
Publisher DOI
Open access
Yes
Link to full text
Start date
2019-11-03
End date
2019-11-06
ISSN
1877-0509
eISSN
1877-0509
Language
eng
Publication classification
E1.1 Full written paper - refereed
Editor/Contributor(s)
Herrera-Viedma E, Shi Y, Berg D, Tien J, Javier Cabrerizo F, Li J
Title of proceedings
ITQM 2019 : Proceedings of the 7th International Conference on Information Technology and Quantitative Management
Event
International Academy of Information Technology and Quantitative Management. Conference (7th : 2019 : Granada, Spain)
Publisher
Elsevier
Place of publication
Amsterdam, The Netherlands
Series
International Academy of Information Technology and Quantitative Management Conference