Boosting imbalanced data learning with Wiener process oversampling

Li, Q; Li, Gang; Niu, W; Cao, Y; Chang, L; Tan, J; Guo, L

journal contribution

Posted on 2017-10-01, 00:00; authored by Q Li, Gang Li, W Niu, Y Cao, L Chang, J Tan, L Guo

Learning from imbalanced data is a challenging task in a wide range of applications and has attracted significant research effort from the machine learning and data mining communities. As a natural approach to this issue, oversampling balances the training samples by replicating existing samples or synthesizing new ones. In general, synthesis outperforms replication by supplying additional information on the minority class. However, the additional information needs to follow the same normal distribution as the training set, which further constrains the new samples to the predefined range of the training set. In this paper, we present the Wiener process oversampling (WPO) technique, which brings a physical phenomenon into sample synthesis. WPO constructs a robust decision region by expanding the attribute ranges of the training set while keeping the same normal distribution. WPO achieves satisfactory performance with much lower computational complexity. In addition, by integrating WPO with ensemble learning, the WPOBoost algorithm outperforms many prevalent imbalance learning solutions.
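The contrast the abstract draws between replication and synthesis can be illustrated with a minimal sketch. It is not the WPO procedure from the paper, whose details are not given on this page. It only assumes that a synthetic minority sample can be produced by starting at an existing sample and accumulating independent Gaussian increments, which is the discrete form of a Wiener process. The function names and the `steps` and `dt` parameters are hypothetical.

```python
import random


def replicate_oversample(minority, n_new, rng):
    """Replication: draw existing minority samples with replacement."""
    return [list(rng.choice(minority)) for _ in range(n_new)]


def random_walk_oversample(minority, n_new, rng, steps=5, dt=0.01):
    """Synthesis by a discretized Wiener process (assumed illustration only).

    Each synthetic point starts at a randomly chosen minority sample and,
    at every step, adds an independent N(0, dt) increment to each attribute.
    Because the walk is unbounded, points may fall outside the attribute
    ranges observed in the training set.
    """
    sigma = dt ** 0.5  # standard deviation of one increment
    synthetic = []
    for _ in range(n_new):
        point = list(rng.choice(minority))
        for _ in range(steps):
            point = [x + rng.gauss(0.0, sigma) for x in point]
        synthetic.append(point)
    return synthetic


rng = random.Random(0)  # fixed seed so the sketch is repeatable
minority = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1]]  # toy minority class
replicated = replicate_oversample(minority, 4, rng)
synthetic = random_walk_oversample(minority, 4, rng)
```

In this sketch the replicated points are always copies of existing samples, while the synthetic points are new and can lie beyond the original attribute ranges, which is the property the abstract attributes to WPO.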

History

Journal

Frontiers of computer science

Volume

11

Pagination

836-851

Location

Berlin, Germany

Publisher DOI

https://doi.org/10.1007/s11704-016-5250-y

ISSN

2095-2228

eISSN

2095-2236

Language

eng

Publication classification

C Journal article, C1 Refereed article in a scholarly journal

Copyright notice

2016, Higher Education Press and Springer-Verlag Berlin Heidelberg

Issue

5

Publisher

Springer

Keywords

970108 Expanding Knowledge in the Information and Computing Sciences; School of Information Technology; 4602 Artificial intelligence
