Deakin University

Enhanced classification models for Iris dataset

Version 2 2024-06-13, 14:06
Version 1 2020-09-08, 15:55
conference contribution
posted on 2024-06-13, 14:06 authored by Y Wu, J He, Y Ji, G Huang, H Yao, P Zhang, W Xu, M Guo, Y Li
© 2020 The Authors. Published by Elsevier B.V. Data mining and machine learning are both useful tools for data analysis. Classification is one of the most important techniques in data mining, so selecting suitable, efficient classification models matters when solving classification problems on the Iris data. With this goal, a decision tree induction algorithm, graftedTree, is proposed to build randomized decision trees. Randomization is explicitly introduced into the algorithm, so that applying it several times to the same training data yields diversified models. An ensemble classification model is then constructed from multiple randomized decision trees via majority voting. To compare the classification performance of different models, we propose precision, recall, F-Measure, the area under the ROC curve (AUC) and the Gini coefficient as evaluation indexes on the Iris dataset. The experimental results show that the Random Forests model generally performs better than the Boosting Tree model and three other popular algorithms: KNN, SMO and Simple Cart. However, the Gini coefficient of the Random Forests model shows that it yields a less pure training set than the other models. The new GraftedTrees model inherits the advantages of Random Forest and further employs a random mixture of two interchangeable node-splitting rule inductions, with the aim of obtaining higher computational efficiency and better accuracy. Given these advantages, the new GraftedTrees model is expected to prove the most powerful model, with better classification performance, in the near future.
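The majority-voting step and three of the evaluation indexes named in the abstract (precision, recall, F-Measure) can be illustrated with a minimal Python sketch. It is not the authors' graftedTree implementation: the per-tree predictions below are made-up values standing in for the output of randomized trees, and the function names `majority_vote`, `ensemble_predict` and `precision_recall_f` are hypothetical.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label predicted most often for one sample."""
    # Ties are resolved in favour of the label that was seen first.
    return Counter(votes).most_common(1)[0][0]


def ensemble_predict(per_tree_predictions):
    """Combine trees by voting; per_tree_predictions[t][i] is tree t's label for sample i."""
    return [majority_vote(votes) for votes in zip(*per_tree_predictions)]


def precision_recall_f(y_true, y_pred, label):
    """Precision, recall and F-Measure for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical predictions from three randomized trees on four Iris samples
# (illustrative values only, not results from the paper).
per_tree = [
    ["setosa", "versicolor", "virginica", "virginica"],
    ["setosa", "virginica", "virginica", "versicolor"],
    ["setosa", "versicolor", "versicolor", "virginica"],
]
y_true = ["setosa", "versicolor", "virginica", "virginica"]

y_ensemble = ensemble_predict(per_tree)
single_tree_scores = precision_recall_f(y_true, per_tree[1], "virginica")
ensemble_scores = precision_recall_f(y_true, y_ensemble, "virginica")
```

In this toy data each individual tree makes at least one mistake, yet the vote recovers every true label, which is the effect the ensemble construction relies on when the trees are diversified.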

History

Volume

162

Pagination

946-954

Location

Granada, Spain

Open access

  • Yes

Start date

2019-11-03

End date

2019-11-06

ISSN

1877-0509

eISSN

1877-0509

Language

eng

Publication classification

E1.1 Full written paper - refereed

Editor/Contributor(s)

Herrera-Viedma E, Shi Y, Berg D, Tien J, Javier Cabrerizo F, Li J

Title of proceedings

ITQM 2019 : Proceedings of the 7th International Conference on Information Technology and Quantitative Management

Event

International Academy of Information Technology and Quantitative Management. Conference (7th : 2019 : Granada, Spain)

Publisher

Elsevier

Place of publication

Amsterdam, The Netherlands

Series

International Academy of Information Technology and Quantitative Management Conference
