Deakin University
File(s) under permanent embargo

A new supervised term ranking method for text categorization

conference contribution
posted on 2010-12-01, 00:00 authored by Musa Mammadov, John Yearwood, L Zhao
In text categorization, supervised term weighting methods such as Information Gain, the χ² statistic, and Odds Ratio have been applied to improve classification performance by weighting terms with respect to individual categories. The literature offers three term ranking methods for summarizing a term's per-category weights in multi-class text categorization: Summation, Average, and Maximum. In this paper we present a new term ranking method for summarizing term weights, Maximum Gap. Using two term weighting methods, information gain and the χ² statistic, we set up controlled experiments to compare the term ranking methods. The Reuters-21578 text corpus is used as the dataset. Two popular classification algorithms, SVM and BoosTexter, are adopted to evaluate the performance of the different term ranking methods. Experimental results show that the new term ranking method performs better. © 2010 Springer-Verlag.
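The three existing ranking methods named in the abstract can be illustrated with a minimal Python sketch. It assumes the χ² statistic is computed from a 2x2 term/category contingency table, and that "Average" means a category-prior-weighted mean, as is common in the feature selection literature; the abstract does not state either detail, so both are assumptions. The function names `chi2` and `rank_scores` are hypothetical. Maximum Gap is not sketched, because the abstract does not define it.

```python
from typing import Dict, List


def chi2(a: int, b: int, c: int, d: int) -> float:
    """Chi-square weight of a term for one category (2x2 contingency table).

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): treat as no association.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def rank_scores(weights: List[float], priors: List[float]) -> Dict[str, float]:
    """Summarize one term's per-category weights into single ranking scores.

    weights[i] is the term's weight for category i.
    priors[i] is the assumed prior probability of category i; it is used
    only by the "average" score (an assumption, see the text above).
    """
    return {
        "summation": sum(weights),
        "average": sum(p * w for p, w in zip(priors, weights)),
        "maximum": max(weights),
    }


# Toy usage: one term scored against three categories.
scores = rank_scores([1.0, 3.0, 2.0], [0.5, 0.25, 0.25])
```

Terms would then be sorted by one of these scores and the top-ranked terms kept as features; which score is used is exactly the choice the paper compares.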

History

Volume

6464 LNAI

Pagination

102 - 111

Publisher

Springer

Location

Adelaide, S. Aust.

Place of publication

Berlin, Germany

Start date

2010-12-07

End date

2010-12-10

ISSN

0302-9743

eISSN

1611-3349

ISBN-10

3642174310

Publication classification

EN.1 Other conference paper

Title of proceedings

AI: Australasian Joint Conference on Artificial Intelligence
