File(s) under permanent embargo
A new supervised term ranking method for text categorization
conference contribution
posted on 2010-12-01, 00:00 authored by Musa Mammadov, John Yearwood, L Zhao
In text categorization, supervised term weighting methods such as Information Gain, the χ² statistic, and Odds Ratio have been applied to improve classification performance by weighting terms with respect to each category. The literature offers three term ranking methods that summarize a term's per-category weights for multi-class text categorization: Summation, Average, and Maximum. In this paper we present a new term ranking method for summarizing term weights, Maximum Gap. Using two term weighting methods, information gain and the χ² statistic, we set up controlled experiments comparing the term ranking methods. The Reuters-21578 text corpus is used as the dataset. Two popular classification algorithms, SVM and BoosTexter, are adopted to evaluate the performance of the term ranking methods. Experimental results show that the new term ranking method performs better. © 2010 Springer-Verlag.
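To make the three existing ranking methods named in the abstract concrete, here is a minimal Python sketch of how per-category term weights might be collapsed into a single score and used to rank terms. It rests on several assumptions that are not taken from the paper: the function names, the example weights, and the category priors are all hypothetical, and "Average" is taken here to be the prior-weighted average that is common in the feature selection literature. Maximum Gap is not defined in the abstract, so it is deliberately left out.

```python
from typing import Dict, List

# Hypothetical per-category weights (e.g. chi-square scores) for three terms
# over three categories. These numbers are illustrative, not from the paper.
EXAMPLE_WEIGHTS: Dict[str, List[float]] = {
    "wheat": [9.0, 0.5, 0.5],   # strongly tied to one category
    "price": [3.0, 3.5, 4.0],   # moderately tied to every category
    "the":   [0.1, 0.1, 0.1],   # uninformative everywhere
}
# Hypothetical category priors P(c_i); they must sum to 1.
EXAMPLE_PRIORS: List[float] = [0.5, 0.3, 0.2]


def summarize_term_weights(weights: List[float], priors: List[float]) -> Dict[str, float]:
    """Collapse one term's per-category weights into a single score per method."""
    if len(weights) != len(priors):
        raise ValueError("weights and priors must have one entry per category")
    return {
        # Summation: add the weights over all categories.
        "summation": sum(weights),
        # Average: assumed here to be the prior-weighted mean of the weights.
        "average": sum(p * w for p, w in zip(priors, weights)),
        # Maximum: keep only the largest per-category weight.
        "maximum": max(weights),
    }


def rank_terms(term_weights: Dict[str, List[float]],
               priors: List[float],
               method: str) -> List[str]:
    """Return terms sorted by descending summarized score (ties broken by name)."""
    scores = {
        term: summarize_term_weights(weights, priors)[method]
        for term, weights in term_weights.items()
    }
    return sorted(scores, key=lambda term: (-scores[term], term))


if __name__ == "__main__":
    for method in ("summation", "average", "maximum"):
        print(method, rank_terms(EXAMPLE_WEIGHTS, EXAMPLE_PRIORS, method))
```

On this toy data, Summation ranks "price" first while Maximum and Average rank "wheat" first, which illustrates why the choice of summarizing method can change which terms are selected.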
History
Volume: 6464 LNAI
Pagination: 102-111
Location: Adelaide, S. Aust.
Start date: 2010-12-07
End date: 2010-12-10
ISSN: 0302-9743
eISSN: 1611-3349
ISBN-10: 3642174310
Publication classification: EN.1 Other conference paper
Title of proceedings: AI: Australasian Joint Conference on Artificial Intelligence
Publisher: Springer
Place of publication: Berlin, Germany