File(s) under permanent embargo
Boosting imbalanced data learning with Wiener process oversampling
journal contribution
posted on 2017-10-01, 00:00 authored by Q Li, Gang Li, W Niu, Y Cao, L Chang, J Tan, L Guo
Learning from imbalanced data is a challenging task in a wide range of applications and has attracted significant research effort from the machine learning and data mining communities. As a natural approach to this issue, oversampling balances the training samples by replicating existing samples or synthesizing new ones. In general, synthesis outperforms replication because it supplies additional information on the minority class. However, the additional information needs to follow the same normal distribution as the training set, which constrains the new samples to the predefined range of the training set. In this paper, we present the Wiener process oversampling (WPO) technique, which brings a physical phenomenon into sample synthesis. WPO constructs a robust decision region by expanding the attribute ranges of the training set while keeping the same normal distribution. WPO achieves satisfactory performance with much lower computational complexity. In addition, by integrating WPO with ensemble learning, the WPOBoost algorithm outperforms many prevalent imbalance learning solutions.
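The abstract does not give the WPO procedure itself, so the following is only a minimal sketch of the general idea it describes: synthesizing minority samples by letting existing samples take Brownian (Wiener process) steps, so that new points can fall slightly outside the original attribute ranges. The function name `wiener_oversample`, the parameters `dt` and `n_steps`, and the per-attribute scaling by standard deviation are all assumptions made for illustration, not the authors' published algorithm.

```python
import numpy as np

def wiener_oversample(X_min, n_new, dt=0.1, n_steps=1, seed=0):
    """Hypothetical sketch, NOT the published WPO algorithm.

    Picks minority samples at random and moves each one with Wiener
    process increments, which are Gaussian with mean 0 and variance dt.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)

    # Assumption: scale each attribute's step by its standard deviation
    # so that all attributes diffuse at a comparable relative rate.
    scale = X_min.std(axis=0)

    # Start each synthetic sample at a randomly chosen minority sample.
    idx = rng.integers(0, len(X_min), size=n_new)
    walkers = X_min[idx].copy()

    # Each step adds an independent increment ~ N(0, dt) per attribute.
    for _ in range(n_steps):
        noise = rng.standard_normal(walkers.shape)
        walkers += scale * np.sqrt(dt) * noise
    return walkers
```

With `dt=0` the sketch reduces to plain replication of existing samples, while larger `dt` or `n_steps` values spread the synthetic points further from their starting samples.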
Journal
Frontiers of computer science
Volume
11
Issue
5
Pagination
836 - 851
Publisher
Springer
Location
Berlin, Germany
ISSN
2095-2228
eISSN
2095-2236
Language
eng
Publication classification
C Journal article; C1 Refereed article in a scholarly journal
Copyright notice
2016, Higher Education Press and Springer-Verlag Berlin Heidelberg