Deakin University

Improving reliability of unbalanced text mining by reducing performance bias

Version 2 2024-06-17, 07:45
Version 1 2014-10-28, 09:35
chapter
posted on 2024-06-17, 07:45 authored by L Zhuang, M Gan, H Dai
Class imbalance in textual data is an important factor affecting the reliability of text mining. On imbalanced textual data, conventional classifiers tend to show a strong performance bias: a high accuracy rate on the majority class but a very low rate on the minority classes. An extreme strategy for imbalanced learning is to discard the majority instances and apply one-class classification to the minority class. However, this can easily introduce the opposite bias, raising the accuracy rate on the minority classes at the expense of the majority class. This chapter investigates approaches that reduce these two types of performance bias and improve the reliability of the discovered classification rules. Experimental results show that the inexact field learning method and parameter-optimized one-class classifiers achieve more balanced performance than the standard approaches.
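The two kinds of performance bias described in the abstract can be made concrete with a small sketch. This is only an illustration and not the chapter's method: the 95/5 class split, the label names, and the helper functions `per_class_accuracy`, `overall_accuracy`, and `balanced_accuracy` are all assumptions introduced here. It compares a classifier that always predicts the majority class with one that accepts every document as minority.

```python
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Fraction of each class's instances that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

def overall_accuracy(y_true, y_pred):
    """Fraction of all instances predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class accuracy rates."""
    rates = per_class_accuracy(y_true, y_pred)
    return sum(rates.values()) / len(rates)

# Hypothetical toy corpus: 95 majority documents and 5 minority documents.
y_true = ["majority"] * 95 + ["minority"] * 5

# First bias: a classifier that always predicts the majority class.
always_majority = ["majority"] * len(y_true)

# Second bias: a classifier that accepts every document as minority.
always_minority = ["minority"] * len(y_true)

for name, y_pred in [("always majority", always_majority),
                     ("always minority", always_minority)]:
    print(name,
          "overall:", overall_accuracy(y_true, y_pred),
          "per class:", per_class_accuracy(y_true, y_pred),
          "balanced:", balanced_accuracy(y_true, y_pred))
```

Under these assumptions, the first classifier scores well on overall accuracy while getting no minority document right, and the second does the reverse. The balanced figure is the same for both, which is one reason per-class rates are more informative than a single overall number on imbalanced data.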

History

Chapter number

15

Pagination

259-268

ISBN-13

9781461419020

ISBN-10

1461419026

Language

eng

Publication classification

B1 Book chapter

Copyright notice

2012, Springer Science+Business Media, LLC

Extent

17

Editor/Contributor(s)

Dai H, Liu J, Smirnov E

Publisher

Springer

Place of publication

New York, N.Y.

Title of book

Reliable knowledge discovery
