Depth first rule generation for text categorization
An, Jiyuan and Chen, Yi-Ping Phoebe 2006, Depth first rule generation for text categorization, Frontiers in artificial intelligence and applications: advances in intelligent IT: active media technology, vol. 138, pp. 302-306.
Title
Depth first rule generation for text categorization
Frontiers in artificial intelligence and applications: advances in intelligent IT: active media technology
Volume number
138
Start page
302
End page
306
Publisher
IOS Press
Place of publication
Amsterdam, Netherlands
Publication date
2006
ISSN
0922-6389 0927-720X
Summary
Text documents are usually categorized with classification methods such as the Rocchio method, naïve Bayes based methods, and SVM-based text classification. These methods learn from labeled text documents and then construct classifiers, which predict the category of a newly arriving document. The keywords in a document are often used to form categorization rules; for example, “kw = computer” can be a rule for the IT documents category. However, the number of keywords is very large, and selecting among them is challenging. Recently, a rule generation method based on enumerating all possible keyword combinations has been proposed [2]. A crucial problem remains in this method: how to prune irrelevant combinations at the early stages of the rule generation procedure. In this paper, we propose a method that can effectively prune irrelevant keywords at an early stage.
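
The summary does not state the pruning criterion the paper uses, so the following is only an illustrative sketch of depth-first enumeration of keyword combinations with early pruning, not the authors' algorithm. It assumes a minimum-support test: a combination is abandoned as soon as it matches too few documents of the target category, because adding more keywords can only shrink that count. The function name `generate_rules`, its parameters `min_support`, `min_confidence`, and `max_len`, and the toy documents are all hypothetical.

```python
from typing import List, Set, Tuple

Doc = Tuple[Set[str], str]            # (keywords in the document, category label)
Rule = Tuple[Tuple[str, ...], int, float]  # (keyword combination, support, confidence)


def generate_rules(docs: List[Doc], target: str, min_support: int = 2,
                   min_confidence: float = 0.8, max_len: int = 3) -> List[Rule]:
    """Enumerate keyword combinations depth first, pruning low-support branches.

    Assumption (not from the paper): support is the number of target-category
    documents containing every keyword of the combination.
    """
    vocab = sorted({kw for kws, _ in docs for kw in kws})
    rules: List[Rule] = []

    def dfs(prefix: List[str], start: int, covered: List[Doc]) -> None:
        for i in range(start, len(vocab)):
            kw = vocab[i]
            matched = [d for d in covered if kw in d[0]]
            hits = sum(1 for _, label in matched if label == target)
            # Early pruning: support never grows when a keyword is added,
            # so no longer combination built on this one can qualify.
            if not matched or hits < min_support:
                continue
            combo = prefix + [kw]
            confidence = hits / len(matched)
            if confidence >= min_confidence:
                rules.append((tuple(combo), hits, confidence))
                continue  # keep the shorter rule; do not specialize it further
            if len(combo) < max_len:
                dfs(combo, i + 1, matched)

    dfs([], 0, docs)
    return rules


# Toy, made-up corpus for illustration only.
docs: List[Doc] = [
    ({"computer", "software"}, "IT"),
    ({"computer", "network"}, "IT"),
    ({"computer", "game", "football"}, "sport"),
    ({"football", "league"}, "sport"),
    ({"software", "network"}, "IT"),
]
rules = generate_rules(docs, "IT")
for keywords, support, confidence in rules:
    print(" AND ".join(f"kw = {k}" for k in keywords), support, confidence)
```

In this toy corpus, "computer" alone is not confident enough to form an IT rule because it also appears in a sport document, and every two-keyword extension of it falls below the support threshold, so that whole branch is cut after one level instead of being enumerated in full.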