Deakin University

File(s) under embargo

ML-CKDP: Machine learning-based chronic kidney disease prediction with smart web application

journal contribution
posted on 2024-04-08, 04:39 authored by RK Halder, MN Uddin, MA Uddin, Sunil Aryal, S Saha, R Hossen, S Ahmed, MAT Rony, MF Akter
Chronic kidney diseases (CKDs) are a significant public health issue with potential for severe complications such as hypertension, anemia, and renal failure. Timely diagnosis is crucial for effective management. Leveraging machine learning within healthcare offers promising advancements in predictive diagnostics. In this paper, we develop a machine learning-based chronic kidney disease prediction (ML-CKDP) model with dual objectives: to enhance dataset preprocessing for CKD classification and to develop a web-based application for CKD prediction. The proposed model involves a comprehensive data preprocessing protocol: converting categorical variables to numerical values, imputing missing data, and normalizing via Min-Max scaling. Feature selection is performed using a variety of techniques, including Correlation, Chi-Square, Variance Threshold, Recursive Feature Elimination, Sequential Forward Selection, Lasso Regression, and Ridge Regression, to refine the datasets. The model employs seven classifiers to predict CKDs: Random Forest (RF), AdaBoost (AdaB), Gradient Boosting (GB), XGBoost (XgB), Naive Bayes (NB), Support Vector Machine (SVM), and Decision Tree (DT). The effectiveness of the models is assessed by measuring their accuracy, analyzing confusion matrix statistics, and calculating the Area Under the Curve (AUC) for the classification of positive cases. RF and AdaB achieve 100% accuracy across all validation methods tested: train-test splits of 70:30 and 80:20, and K-fold cross-validation with K set to 10 and 15. RF and AdaB also consistently reach AUC scores of 100% across multiple datasets and splitting ratios. Moreover, NB stands out for its efficiency, recording the lowest training and testing times across all datasets and split ratios. Additionally, we present a real-time web-based application to operationalize the model, enhancing accessibility for healthcare practitioners and stakeholders.
Web app link: https://rajib-research-kedney-diseases-prediction.onrender.com/
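
The preprocessing steps named in the abstract (categorical-to-numeric conversion, missing-value imputation, Min-Max scaling) can be illustrated with a minimal sketch using only the Python standard library. This is not the authors' implementation. The function names, the toy columns, and the choice of mean imputation are assumptions made for illustration; the abstract does not state which imputation strategy was used, and applying a mean to an encoded categorical column is a simplification.

```python
from statistics import mean


def encode_categorical(column):
    """Map each distinct non-missing label to an integer code (sorted for determinism)."""
    labels = sorted({v for v in column if v is not None})
    codes = {label: i for i, label in enumerate(labels)}
    return [codes[v] if v is not None else None for v in column]


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values.

    Mean imputation is an assumption; the abstract does not name the strategy.
    """
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def min_max_scale(column):
    """Rescale values to [0, 1] via (x - min) / (max - min); a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def preprocess(column, categorical=False):
    """Encode (if categorical), then impute, then scale a single column."""
    if categorical:
        column = encode_categorical(column)
    return min_max_scale(impute_mean(column))


# Hypothetical toy columns, not taken from the paper's datasets.
age = [48, None, 62, 51]
appetite = ["good", "poor", None, "good"]

age_scaled = preprocess(age)
appetite_scaled = preprocess(appetite, categorical=True)
```

In this sketch every output column lies in [0, 1], so features measured on different scales contribute comparably to distance-based or margin-based classifiers such as SVM. In a real evaluation the imputation value and the min/max would be computed on the training portion only and then reused on the test portion.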

History

Journal

Journal of Pathology Informatics

Volume

15

Article number

100371

Pagination

1-16

Location

Amsterdam, The Netherlands

ISSN

2229-5089

eISSN

2153-3539

Language

eng

Publication classification

C1 Refereed article in a scholarly journal

Publisher

Elsevier