A novel field learning algorithm for dual imbalance text classification

Zhuang, Ling, Dai, Honghua and Hang, Xiaosha 2005, A novel field learning algorithm for dual imbalance text classification, Lecture notes in computer science, vol. 3614, pp. 39-48.


Title A novel field learning algorithm for dual imbalance text classification
Author(s) Zhuang, Ling
Dai, Honghua
Hang, Xiaosha
Journal name Lecture notes in computer science
Volume number 3614
Start page 39
End page 48
Publisher Springer-Verlag
Place of publication Berlin, Germany
Publication date 2005
ISSN 0302-9743
Keyword(s) vertebrata
Bayes estimation
knowledge base
learning algorithm
Pisces
expert system
classification
text
information retrieval
content analysis
artificial intelligence
Summary The Fish-net algorithm is a novel field learning algorithm that derives classification rules from the range of values of each attribute rather than from individual point values. In this paper, we present a Feature Selection Fish-net learning algorithm to solve the dual imbalance problem in text classification. Dual imbalance comprises instance imbalance and feature imbalance: instance imbalance is caused by unevenly distributed classes, and feature imbalance by differing document lengths. The proposed approach consists of two phases: (1) select a feature subset consisting of the features that are more supportive of the difficult minority class; (2) construct classification rules based on the original Fish-net algorithm. Our experimental results on Reuters-21578 show that the proposed approach achieves a better balanced accuracy rate on both the majority and minority classes than multinomial Naive Bayes and SVM.
Language eng
Field of Research 080201 Analysis of Algorithms and Complexity
HERDC Research category C1 Refereed article in a scholarly journal
Copyright notice ©Springer-Verlag Berlin Heidelberg, 2005
Persistent URL http://hdl.handle.net/10536/DRO/DU:30003424

Document type: Journal Article
Collection: School of Information Technology
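
Illustrative sketch (not taken from the paper): the summary describes rules built from per-attribute value ranges and an evaluation that reports accuracy separately for the majority and minority classes. The Python below is a deliberately simplified toy of that idea, not the Fish-net algorithm itself. The function names (learn_fields, predict, per_class_accuracy), the "count how many attributes fall inside a class's range" scoring, and the tiny term-count data are all assumptions made for illustration; the feature selection phase is not shown.

```python
from collections import defaultdict

def learn_fields(X, y):
    """For each class, record the (min, max) range seen for every attribute."""
    fields = {}
    for row, label in zip(X, y):
        # Start from a degenerate range at the first row seen for this class.
        ranges = fields.setdefault(label, [(v, v) for v in row])
        fields[label] = [(min(lo, v), max(hi, v))
                         for (lo, hi), v in zip(ranges, row)]
    return fields

def predict(fields, row):
    """Pick the class whose ranges contain the most attribute values of row.

    Assumption: a plain containment count stands in for the contribution
    functions a real field learning algorithm would use.
    """
    def score(label):
        return sum(lo <= v <= hi for (lo, hi), v in zip(fields[label], row))
    # sorted() makes tie-breaking deterministic (alphabetical by label).
    return max(sorted(fields), key=score)

def per_class_accuracy(y_true, y_pred):
    """Accuracy computed separately for each true class."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += (t == p)
    return {c: hits[c] / totals[c] for c in totals}

# Hypothetical term-count vectors: four majority documents, two minority ones.
X = [[0, 3], [1, 4], [0, 5], [1, 3], [4, 0], [5, 1]]
y = ["other", "other", "other", "other", "grain", "grain"]
fields = learn_fields(X, y)

test_X = [[0, 4], [5, 0]]
test_y = ["other", "grain"]
predictions = [predict(fields, row) for row in test_X]
print(per_class_accuracy(test_y, predictions))
```

Reporting one accuracy figure per class, rather than a single overall figure, is what keeps a classifier that always predicts the majority class from looking good on imbalanced data.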
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Created: Mon, 07 Jul 2008, 08:52:47 EST
