A particle swarm based hybrid system for imbalanced medical data sampling

Yang, Pengyi, Xu, Liang, Zhou, Bing B., Zhang, Zili and Zomaya, Albert Y. 2009, A particle swarm based hybrid system for imbalanced medical data sampling, BMC Genomics, vol. 10, Supplement 3, pp. 1-14.

Attached Files
Name Description MIMEType Size Downloads
zhang-particleswarmbased-2009.pdf Published version application/pdf 860.53KB 43

Title A particle swarm based hybrid system for imbalanced medical data sampling
Author(s) Yang, Pengyi
Xu, Liang
Zhou, Bing B.
Zhang, Zili
Zomaya, Albert Y.
Journal name BMC Genomics
Volume number 10
Season Supplement 3
Start page 1
End page 14
Total pages 14
Publisher BioMed Central
Place of publication London, England
Publication date 2009-12-03
ISSN 1471-2164
Summary Background
Medical and biological data commonly have small sample sizes, missing values, and, most importantly, imbalanced class distributions. In this study we propose a particle swarm based hybrid system for remedying the class imbalance problem in medical and biological data mining. This hybrid system combines the particle swarm optimization (PSO) algorithm with multiple classifiers and evaluation metrics for evaluation fusion. Samples from the majority class are ranked using multiple objectives according to their merit in compensating for the class imbalance, and the highest-ranked samples are then combined with the minority class to form a balanced dataset.

Results
One important finding of this study is that different classifiers and metrics often provide different evaluation results. Nevertheless, the proposed hybrid system demonstrates consistent improvements over several alternative methods with three different metrics. The sampling results also demonstrate good generalization on different types of classification algorithms, indicating the advantage of information fusion applied in the hybrid system.

Conclusion
The experimental results demonstrate that, unlike many currently available methods, which often perform unevenly across different datasets, the proposed hybrid system generalizes better, which alleviates the method-data dependency problem. From a biological perspective, the system points to highly ranked samples for further investigation, which may result in the discovery of new conditions or disease subtypes.
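
Illustrative sketch (not part of the published record): the sampling idea described in the Background can be approximated as below. This is a minimal sketch under stated assumptions, not the authors' implementation. It uses a binary PSO in which each particle is a keep/drop mask over the majority class, a single nearest-centroid classifier standing in for the paper's multiple classifiers, and an average of G-mean and F-measure standing in for its evaluation fusion. The function names (fused_fitness, pso_rank, balance), the PSO coefficients, and the ranking by selection frequency are all hypothetical choices made for illustration.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fused_fitness(mask, majority, minority):
    """Assumed stand-in for evaluation fusion: mean of G-mean and F-measure
    for a nearest-centroid classifier built from the selected majority
    samples plus all minority samples, scored on the full data."""
    chosen = [s for s, keep in zip(majority, mask) if keep]
    if not chosen:
        return 0.0
    c_maj, c_min = centroid(chosen), centroid(minority)
    tn = sum(dist(s, c_maj) <= dist(s, c_min) for s in majority)
    tp = sum(dist(s, c_min) < dist(s, c_maj) for s in minority)
    fp = len(majority) - tn
    recall = tp / len(minority)
    g_mean = math.sqrt(recall * tn / len(majority))
    precision = tp / (tp + fp) if tp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return (g_mean + f1) / 2


def pso_rank(majority, minority, particles=10, iters=20, seed=0):
    """Rank majority-sample indices by how often the swarm's best mask
    selects them (most frequently selected first)."""
    rng = random.Random(seed)
    n = len(majority)
    pos = [[rng.random() < 0.5 for _ in range(n)] for _ in range(particles)]
    vel = [[0.0] * n for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fused_fitness(p, majority, minority) for p in pos]
    g = max(range(particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    counts = [0] * n
    for _ in range(iters):
        for i in range(particles):
            for d in range(n):
                # Standard velocity update; coefficients are assumed values.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
                # Binary PSO: sigmoid of velocity gives the keep probability.
                pos[i][d] = rng.random() < 1 / (1 + math.exp(-vel[i][d]))
            f = fused_fitness(pos[i], majority, minority)
            if f > best_fit[i]:
                best_fit[i], best_pos[i] = f, pos[i][:]
            if f > g_fit:
                g_fit, g_pos = f, pos[i][:]
        for d in range(n):
            counts[d] += g_pos[d]
    return sorted(range(n), key=lambda d: counts[d], reverse=True)


def balance(majority, minority, **kwargs):
    """Keep the top-ranked majority samples (label 0), as many as there are
    minority samples (label 1), to form a balanced labelled dataset."""
    ranking = pso_rank(majority, minority, **kwargs)
    kept = [(majority[i], 0) for i in ranking[:len(minority)]]
    return kept + [(s, 1) for s in minority]
```

Calling balance(majority, minority) with two lists of numeric feature tuples returns (sample, label) pairs with equal class counts, provided the majority list is at least as long as the minority list. A fuller treatment would average the fitness over several classifiers and metrics, as the abstract describes.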
Notes This article is part of the supplement from the Eighth International Conference on Bioinformatics (InCoB2009): Computational Biology.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Language eng
Field of Research 080301 Bioinformatics Software
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category C1 Refereed article in a scholarly journal
Copyright notice ©2009, Yang et al
Persistent URL http://hdl.handle.net/10536/DRO/DU:30029415

Document type: Journal Article
Collections: School of Information Technology
Open Access Collection
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.

Citation counts: Cited 14 times in Scopus
Access Statistics: 311 Abstract Views, 43 File Downloads
Created: Fri, 16 Jul 2010, 09:58:13 EST by Leanne Swaneveld
