Deakin University

File(s) under permanent embargo

Reducing performance bias for unbalanced text mining

conference contribution
posted on 2006-01-01, 00:00 authored by Ling Zhuang, Honghua Dai
In text categorization applications, class imbalance, an uneven data distribution in which one class is represented by far fewer instances than the others, is a commonly encountered problem. In such a situation, conventional classifiers tend to have a strong performance bias, which results in a high accuracy rate on the majority class but a very low rate on the minority classes. An extreme strategy for unbalanced learning is to discard the majority instances and apply one-class classification to the minority class. However, this can easily cause another type of bias, which increases the accuracy rate on the minority classes by sacrificing accuracy on the majority class. This paper investigates approaches that reduce these two types of performance bias and improve the reliability of the discovered classification rules. Experimental results show that the inexact field learning method and parameter-optimized one-class classifiers achieve more balanced performance than the standard approaches.
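The two types of performance bias described in the abstract can be illustrated with a minimal sketch. This is not the paper's method or data: the 95/5 class split, the label names, the two degenerate classifiers, and the helper `per_class_recall` are all assumptions made up for illustration. It only shows why overall accuracy hides the bias while per-class accuracy exposes it.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: fraction of that label's instances predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical toy corpus: 95 majority-class documents, 5 minority-class documents.
y_true = ["majority"] * 95 + ["minority"] * 5

# First type of bias: a classifier that always predicts the majority class.
always_majority = ["majority"] * len(y_true)
# Second type of bias: a classifier that always predicts the minority class.
always_minority = ["minority"] * len(y_true)

# Overall accuracy looks good for the first classifier despite total failure
# on the minority class.
overall_accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)

recall_majority_bias = per_class_recall(y_true, always_majority)
recall_minority_bias = per_class_recall(y_true, always_minority)

# Averaging the per-class rates gives one number that penalizes either bias.
balanced_accuracy = sum(recall_majority_bias.values()) / len(recall_majority_bias)

print(overall_accuracy, recall_majority_bias, recall_minority_bias, balanced_accuracy)
```

In this toy setting the always-majority classifier scores 95% overall accuracy while getting every minority document wrong, and the always-minority classifier shows the mirror-image failure. A balanced method in the sense of the abstract would keep both per-class rates reasonably high.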

History

Event

Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06)

Pagination

770 - 774

Publisher

IEEE Computer Society

Location

Hong Kong, China

Place of publication

Los Alamitos, Calif.

Start date

2006-12-18

End date

2006-12-22

Language

eng

Publication classification

E1 Full written paper - refereed

Copyright notice

2006, IEEE

Editor/Contributor(s)

S Tsumoto, C Clifton, N Zhong, X Wu, J Liu, B Wah, Y Cheung

Title of proceedings

ICDM Workshops 2006 proceedings : 18 December, 2006, Hong Kong, China
