posted on 2004-01-01, 00:00 authored by Jiyuan An, Yi-Ping Phoebe Chen
Concept learning of text documents can be viewed as the problem of acquiring the definition of a general category of documents. To define the category of a text document, a conjunction of keywords is usually used. These keywords should be few and comprehensible. A naïve method is to enumerate all combinations of keywords and extract the suitable ones. However, because the number of keyword combinations is enormous, exhaustive enumeration cannot feasibly find the keywords most relevant to describing the categories of documents. Many heuristic methods have been proposed, such as GA-based and immune-based algorithms. In this work, we introduce a pruning power technique and propose a robust enumeration-based concept learning algorithm. Experimental results show that the rules produced by our approach are more comprehensible and simpler than those produced by other methods.
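The abstract does not spell out the pruning rule, so the following is only a minimal illustrative sketch of enumeration with pruning, not the authors' algorithm. It assumes documents are given as keyword sets with a category label, and it uses two generic pruning conditions chosen for illustration: a conjunction covering fewer than a hypothetical `min_pos` positive documents is never extended (adding keywords cannot increase coverage), and a conjunction that already excludes every negative document is reported and not extended (longer rules would only be less simple). The names `learn_conjunctions`, `min_pos`, `max_len`, and the sample documents are all assumptions.

```python
from typing import FrozenSet, List, Set, Tuple

# (keywords in the document, True if it belongs to the target category)
Doc = Tuple[Set[str], bool]


def learn_conjunctions(docs: List[Doc], min_pos: int = 2,
                       max_len: int = 3) -> List[FrozenSet[str]]:
    """Enumerate keyword conjunctions level by level, pruning hopeless branches."""
    # Only keywords that occur in positive documents can appear in a useful rule.
    vocab = sorted({w for words, label in docs if label for w in words})
    rules: List[FrozenSet[str]] = []
    frontier: List[FrozenSet[str]] = [frozenset()]

    for _ in range(max_len):
        next_frontier: List[FrozenSet[str]] = []
        for base in frontier:
            last = max(base) if base else ""
            for w in vocab:
                if w <= last:
                    continue  # extend in sorted order so each set is built once
                cand = base | {w}
                if any(r <= cand for r in rules):
                    continue  # a shorter rule already covers this case
                pos = sum(1 for words, label in docs if label and cand <= words)
                if pos < min_pos:
                    continue  # prune: supersets cannot cover more positives
                neg = sum(1 for words, label in docs if not label and cand <= words)
                if neg == 0:
                    rules.append(cand)  # consistent rule found; do not extend it
                else:
                    next_frontier.append(cand)
        frontier = next_frontier
    return rules


# Hypothetical toy corpus: three "sports" documents and three others.
docs: List[Doc] = [
    ({"goal", "match", "team"}, True),
    ({"goal", "team", "coach"}, True),
    ({"match", "goal", "league"}, True),
    ({"team", "election", "vote"}, False),
    ({"match", "court", "law"}, False),
    ({"goal", "policy", "vote"}, False),
]
rules = learn_conjunctions(docs)
for rule in rules:
    print(" AND ".join(sorted(rule)))
```

In this toy corpus no single keyword separates the categories, so the search has to go to two-keyword conjunctions, while branches such as `coach` or `league` are discarded at the first level because they cover too few positive documents.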
History
Pagination
698 - 701
Location
Beijing, China
Open access
Yes
Start date
2004-09-20
End date
2004-09-24
ISBN-13
9780769521008
ISBN-10
0769521002
Language
eng
Publication classification
E1 Full written paper - refereed; E Conference publication
Copyright notice
2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.