Deakin University

Toward reliable diabetes prediction: Innovations in data engineering and machine learning applications

journal contribution
posted on 2024-09-13, 06:36 authored by Md Alamin Talukder, Md Manowarul Islam, Md Ashraf Uddin, Mohsin Kazi, Majdi Khalid, Arnisha Akhter, Mohammad Ali Moni
Objective: Diabetes is a metabolic disorder that raises the risk of stroke, heart disease, kidney failure, and other long-term complications because it leads to excess sugar in the blood. Machine learning (ML) models can aid in diagnosing diabetes at an early stage, so an efficient ML model is needed to diagnose diabetes accurately.

Methods: In this paper, a data preprocessing pipeline was implemented to process the data, and random oversampling was applied to balance it, handling the imbalanced distributions of the observational data. We used four different diabetes datasets to conduct our experiments. Several ML algorithms were evaluated to determine the best models for predicting diabetes.

Results: The performance analysis demonstrates that, among all ML algorithms, random forest surpasses existing work with accuracies of 86% and 98.48% on Dataset 1 and Dataset 2, respectively; extreme gradient boosting and decision tree surpass existing work with accuracies of 99.27% and 100% on Dataset 3 and Dataset 4, respectively. Our proposal can increase accuracy by 12.15% compared to the model without preprocessing.

Conclusions: These findings indicate that the proposed models might be employed to produce more accurate diabetes predictions, supplementing current preventative interventions to reduce the incidence of diabetes and its associated costs.
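
For readers unfamiliar with random oversampling, the balancing step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_oversample`, the `seed` parameter, and the toy rows below are all hypothetical and chosen only for illustration. The idea is to duplicate randomly chosen rows of each smaller class until every class has as many rows as the largest one.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Return copies of rows/labels in which every class has as many
    rows as the largest class (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw the missing rows with replacement from that pool.
        extra = rng.choices(pool, k=target - n)
        out_rows.extend(extra)
        out_labels.extend([cls] * (target - n))
    return out_rows, out_labels


# Toy data (made up): five negative cases and one positive case.
rows = [[1.0], [2.0], [3.0], [4.0], [5.0], [9.0]]
labels = [0, 0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In practice, oversampling is usually applied to the training split only, so that duplicated rows do not leak into the data used to measure accuracy.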

Location

Thousand Oaks, CA.

Open access

  • Yes

Language

eng

Publication classification

C1 Refereed article in a scholarly journal

Journal

Digital Health

Volume

10

Article number

ARTN 20552076241271867

Pagination

1-26

ISSN

2055-2076

eISSN

2055-2076

Publisher

SAGE Publications
