Improving reliability of unbalanced text mining by reducing performance bias
Zhuang, Ling, Gan, Min and Dai, Honghua 2012, Improving reliability of unbalanced text mining by reducing performance bias, in Reliable knowledge discovery, Springer, New York, N.Y., pp. 259-268.
Class imbalance in textual data is an important factor affecting the reliability of text mining. On imbalanced textual data, conventional classifiers tend to show a strong performance bias: accuracy is high on the majority class but very low on the minority classes. An extreme strategy for unbalanced learning is to discard the majority instances and apply one-class classification to the minority class. However, this can easily introduce the opposite bias, raising accuracy on the minority classes at the expense of the majority class. This chapter investigates approaches that reduce both types of performance bias and improve the reliability of the discovered classification rules. Experimental results show that the inexact field learning method and parameter-optimized one-class classifiers achieve more balanced performance than the standard approaches.
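The two types of performance bias described in the abstract can be illustrated with a small sketch. It is not taken from the chapter: the 95/5 class split, the label names, and the helper `per_class_recall` are all hypothetical, and the two "classifiers" are trivial stand-ins that always predict one class, used only to show the extreme form of each bias.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return the fraction of correctly predicted instances for each true class."""
    correct = Counter()
    total = Counter()
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


# Hypothetical toy data: 95 majority documents and 5 minority documents.
y_true = ["majority"] * 95 + ["minority"] * 5

# Stand-in for a classifier biased toward the majority class.
always_majority = ["majority"] * len(y_true)
# Stand-in for the opposite bias, where the majority class is sacrificed.
always_minority = ["minority"] * len(y_true)

# Overall accuracy hides the first bias: it looks high despite zero minority recall.
overall_accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)

recall_majority_bias = per_class_recall(y_true, always_majority)
recall_minority_bias = per_class_recall(y_true, always_minority)

print(overall_accuracy)
print(recall_majority_bias)
print(recall_minority_bias)
```

Reporting accuracy per class, rather than a single overall figure, is what makes both kinds of bias visible; a balanced method would score reasonably on both classes at once.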