Machine Learning Approaches for Predicting Hypertension and Its Associated Factors Using Population-Level Data From Three South Asian Countries

Islam, Shariful; Talukder, Ashis; Awal, Md Abdul; Siddiqui, Md Muhammad Umer; Ahamad, Md Martuza; Ahammed, Benojir; Rawal, Lal B; Alizadehsani, Roohallah; Abawajy, Jemal; Laranjo, Liliana; Chow, Clara K; Maddison, Ralph

journal contribution

posted on 2022-03-31, 00:00; authored by Shariful Islam, Ashis Talukder, Md Abdul Awal, Md Muhammad Umer Siddiqui, Md Martuza Ahamad, Benojir Ahammed, Lal B Rawal, Roohallah Alizadehsani, Jemal Abawajy, Liliana Laranjo, Clara K Chow, Ralph Maddison

Background: Hypertension is the most common modifiable risk factor for cardiovascular diseases in South Asia. Machine learning (ML) models have been shown to outperform statistical methods in clinical risk prediction, but studies using ML to predict hypertension at the population level are lacking. This study applied ML approaches to a dataset from three South Asian countries to predict hypertension and its associated factors, and compared the models' performance.

Methods: We conducted a retrospective study using ML analyses to detect hypertension in population-based surveys. We created a single dataset by harmonizing individual-level data from the most recent nationally representative Demographic and Health Survey in Bangladesh, Nepal, and India. The variables included blood pressure (BP), sociodemographic and economic factors, height, weight, hemoglobin, and random blood glucose. Hypertension was defined based on JNC-7 criteria. We applied six common ML-based classifiers to predict hypertension and its risk factors: decision tree (DT), random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), logistic regression (LR), and linear discriminant analysis (LDA).

Results: Of the 818,603 participants, 82,748 (10.11%) had hypertension. The ML models identified age and BMI as the most important factors for hypertension. Having ever had BP measured, education, taking medicine to lower BP, and a doctor's perception of high BP were also significant, but less so than age and BMI. XGBoost, GBM, LR, and LDA achieved the highest accuracy in predicting hypertension (90%), while RF and DT achieved 89% and 83%, respectively. DT achieved a precision of 91%, and the other classifiers 90%. XGBoost, GBM, LR, and LDA achieved a recall of 100%, RF 99%, and DT 90%. For F1-score, XGBoost, GBM, LR, and LDA scored 95%, RF 94%, and DT 90%. All the algorithms achieved small log loss values (<6%).

Conclusion: ML models performed well in predicting hypertension and its associated factors in South Asians. When deployed on an open-source platform, these models are scalable to millions of people and might help individuals self-screen for hypertension at an early stage. Future studies incorporating biochemical markers are needed to improve the ML algorithms and evaluate them in real-life settings.
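For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal Python sketch of how accuracy, precision, recall, F1-score, and log loss can be computed for a binary hypertension label. It is not the authors' code. The function names, the toy blood pressure readings, and the predicted probabilities are all invented for illustration, and the labeling rule assumes only the standard JNC-7 threshold (systolic BP of at least 140 mmHg or diastolic BP of at least 90 mmHg); the study's exact operational definition may differ.

```python
import math

def jnc7_hypertensive(sbp, dbp):
    """Assumed labeling rule: JNC-7 threshold, SBP >= 140 or DBP >= 90 mmHg."""
    return sbp >= 140 or dbp >= 90

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def log_loss(y_true, prob_pos, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, prob_pos):
        p = min(max(p, eps), 1 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

# Hypothetical (SBP, DBP) readings in mmHg; NOT data from the study.
readings = [(118, 76), (142, 88), (130, 92), (125, 80),
            (150, 95), (110, 70), (135, 85), (128, 82)]
y_true = [int(jnc7_hypertensive(s, d)) for s, d in readings]

# Hypothetical predicted probabilities of hypertension from some classifier.
prob_pos = [0.1, 0.9, 0.4, 0.2, 0.8, 0.1, 0.6, 0.3]
y_pred = [int(p >= 0.5) for p in prob_pos]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
print("log loss:", log_loss(y_true, prob_pos))
```

On this toy data the sketch yields an accuracy of 0.75, with precision, recall, and F1 each equal to 2/3 for the hypertensive class. Because precision, recall, and F1 depend on which class is treated as positive, passing `positive=0` gives different values, which matters when one class is much rarer than the other.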

Journal

Frontiers in Cardiovascular Medicine

Volume

9

Article number

839379

Pagination

1 - 9

Publisher

Frontiers / Frontiers Media / Frontiers Research Foundation

Location

Lausanne, Switzerland

Publisher DOI

https://doi.org/10.3389/fcvm.2022.839379

Link to full text

http://doi.org/10.3389/fcvm.2022.839379

ISSN

2297-055X

eISSN

2297-055X

Language

eng

Author URL

https://www.ncbi.nlm.nih.gov/pubmed/35433854

Publication classification

C1 Refereed article in a scholarly journal

Keywords

algorithms; artificial intelligence; blood pressure; cardiovascular diseases; Demographic and Health Survey; risk factors; South Asia; Science & Technology; Life Sciences & Biomedicine; Cardiac & Cardiovascular Systems; Cardiovascular System & Cardiology; BODY-MASS INDEX; WAIST CIRCUMFERENCE; CARDIOVASCULAR RISK; INCOME COUNTRIES; MIDDLE-INCOME
