Deakin University

A Hybrid Feature Selection with Ensemble Classification for Imbalanced Healthcare Data: A Case Study for Brain Tumor Diagnosis

Version 2 2024-06-04, 04:36
Version 1 2017-04-07, 14:21
journal contribution
posted on 2024-06-04, 04:36 authored by Shamsul Huda, John Yearwood, HF Jelinek, MM Hassan, G Fortino, M Buckland
Electronic health records (EHRs) are providing increased access to healthcare data that can be made available for advanced data analysis. Healthcare professionals can use this analysis to make more informed decisions and improve quality of care. However, because medical data from EHRs are inherently heterogeneous and imbalanced, the data analysis task faces a significant challenge. In this paper, we address the challenges of imbalanced medical data in a brain tumor diagnosis problem. Morphometric analysis of histopathological images is rapidly emerging as a valuable diagnostic tool for neuropathology. Oligodendroglioma is one type of brain tumor that responds well to treatment, provided the tumor subtype is recognized accurately. The genetic variant 1p-/19q- has recently been found to have high chemosensitivity and has morphological attributes that may make it amenable to automated image analysis and histological processing and diagnosis. This paper aims to achieve a fast, affordable, and objective diagnosis of this genetic variant of oligodendroglioma with a novel data mining approach that combines feature selection with ensemble-based classification. Owing to the prevalence and incidence of the tumor variant, 63 instances of brain tumor with oligodendroglioma were obtained for this study. To minimize the effect of the imbalanced healthcare data set, a global optimization-based hybrid wrapper-filter feature selection with ensemble classification is applied. The experimental results show that the proposed approach outperforms the standard techniques used in the brain tumor classification problem to overcome the imbalanced characteristics of medical data.

History

Journal

IEEE Access

Volume

4

Season

Special section on healthcare big data

Pagination

9145-9154

Location

Piscataway, N.J.

Open access

  • Yes

ISSN

2169-3536

eISSN

2169-3536

Language

English

Publication classification

C Journal article, C1 Refereed article in a scholarly journal

Copyright notice

2017, IEEE

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC