Background
Cardiovascular disease (CVD) is the greatest health problem in Australia: it kills more people than any other disease and incurs enormous costs for the health care system. In this study, we present a benchmark comparison of various artificial intelligence (AI) architectures for predicting mortality among patients with CVD using structured medical claims data. Compared with other research in the clinical literature, our models are more efficient because they use fewer features, and this study could help health professionals choose appropriate AI models to predict mortality among patients with CVD accurately using only claims data before a clinic visit.
Objective
This study aims to support health clinicians in accurately predicting mortality among patients with CVD using only claims data before a clinic visit.
Methods
The data set was obtained from the Medicare Benefits Scheme and Pharmaceutical Benefits Scheme service information in the period between 2004 and 2014, released by the Department of Health Australia in 2016. It comprised 346,201 records, one per patient. Five AI algorithms were developed and compared in this study: four classical machine learning algorithms (logistic regression [LR], random forest [RF], extra trees [ET], and gradient boosting trees [GBT]) and one deep learning algorithm, a densely connected neural network (DNN). In addition, because deceased patients formed a minority class in the data set, a separate experiment was conducted using the Synthetic Minority Oversampling Technique (SMOTE) to oversample that class.
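As an illustration of the oversampling idea behind SMOTE, the following is a minimal sketch, not the implementation used in this study. It assumes numeric feature vectors and Euclidean distance; the function name `smote`, its parameters, and the toy data are hypothetical. Each synthetic point is placed at a random position on the line segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration only: `minority` is a list of
    equal-length numeric feature vectors from the minority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class (made-up values, not study data).
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_synthetic = smote(toy_minority, n_synthetic=6, k=2)
```

In practice, oversampling of this kind is applied to the training split only, so that synthetic records do not leak into the evaluation data.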
Results
In terms of discrimination, GBT and RF achieved the highest area under the receiver operating characteristic curve (97.8% and 97.7%, respectively), followed by ET (96.8%) and LR (96.4%), whereas DNN was the least discriminative (95.3%). In terms of reliability, LR predictions were the least well calibrated of the five algorithms. Although it increased the training time, SMOTE further improved the performance of LR, whereas the other algorithms, especially GBT and DNN, worked well with class-imbalanced data.
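The two evaluation notions used here, discrimination and reliability, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the evaluation code of this study: the function names `auroc` and `reliability_bins` and the toy labels and scores are hypothetical, and equal-width probability bins are only one possible way to inspect calibration. The area under the receiver operating characteristic curve is computed as the fraction of (deceased, surviving) pairs in which the deceased patient receives the higher score, with ties counted as one half.

```python
def auroc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def reliability_bins(y_true, y_prob, n_bins=5):
    """Return (mean predicted, observed rate, count) per non-empty bin.

    A well calibrated model has mean predicted probability close to the
    observed event rate in every bin.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    return [
        (sum(p for _, p in b) / len(b), sum(y for y, _ in b) / len(b), len(b))
        for b in bins
        if b
    ]


# Toy labels (1 = deceased) and predicted probabilities, not study data.
toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_probs)  # 3 of 4 pairs ranked correctly: 0.75
toy_bins = reliability_bins(toy_labels, toy_probs, n_bins=2)
```

The pairwise computation is quadratic in the number of records, so it is suitable only for small illustrations; a rank-based computation would be preferable for a data set of the size described here.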
Conclusions
Compared with other research in the clinical literature involving AI models using claims data to predict patient health outcomes, our models are more efficient because they use fewer features while still achieving high performance. This study could help health professionals choose appropriate AI models to predict mortality among patients with CVD accurately using only claims data before a clinic visit.