Improving reliability of unbalanced text mining by reducing performance bias

Zhuang, Ling, Gan, Min and Dai, Honghua 2012, Improving reliability of unbalanced text mining by reducing performance bias. In Dai, Honghua, Liu, James N. K. and Smirnov, Evgueni (eds), Reliable knowledge discovery, Springer, New York, N. Y., pp. 259-268, doi: 10.1007/978-1-4614-1903-7_15.


Title Improving reliability of unbalanced text mining by reducing performance bias
Author(s) Zhuang, Ling; Gan, Min; Dai, Honghua
Title of book Reliable knowledge discovery
Editor(s) Dai, Honghua; Liu, James N. K.; Smirnov, Evgueni
Publication date 2012
Chapter number 15
Total chapters 17
Start page 259
End page 268
Total pages 10
Publisher Springer
Place of Publication New York, N. Y.
Keyword(s) text mining; textual data
Summary Class imbalance in textual data is an important factor affecting the reliability of text mining. On imbalanced textual data, conventional classifiers tend to show a strong performance bias: a high accuracy rate on the majority class but a very low rate on the minority classes. An extreme strategy for unbalanced learning is to discard the majority instances and apply one-class classification to the minority class. However, this can easily introduce another type of bias, in which accuracy on the minority classes increases at the expense of the majority class. This chapter investigates approaches that reduce these two types of performance bias and improve the reliability of the discovered classification rules. Experimental results show that the inexact field learning method and parameter-optimized one-class classifiers achieve more balanced performance than the standard approaches.
ISBN 9781461419020
Language eng
DOI 10.1007/978-1-4614-1903-7_15
Field of Research 080109 Pattern Recognition and Data Mining
Socio Economic Objective 890202 Application Tools and System Utilities
HERDC Research category B1 Book chapter
Copyright notice ©2012, Springer Science+Business Media, LLC
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Created: Tue, 13 Mar 2012, 09:48:31 EST
