Existing generative classifiers (e.g., BayesNet and AnDE) make independence assumptions and estimate one-dimensional likelihoods. This paper presents a new generative classifier called MassBayes that estimates multi-dimensional likelihoods without making any explicit assumptions. It aggregates the multi-dimensional likelihoods estimated from random subsets of the training data, using random feature subsets of varying size. Our empirical evaluations show that MassBayes yields better classification accuracy than the existing generative classifiers on large data sets. Because it works with fixed-size subsets of the training data, it has constant training time complexity and constant space complexity, and it can easily scale up to very large data sets.
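The aggregation scheme described in the abstract can be illustrated with a rough sketch. This is not the MassBayes estimator: the abstract does not say how each multi-dimensional likelihood is estimated, so the sketch substitutes a simple neighbourhood count (the fraction of sampled points of each class that fall inside a box around the query). The names `fit_ensemble`, `class_scores`, `predict`, and the parameters `n_members`, `psi`, and `radius` are all hypothetical and chosen only for illustration.

```python
import random


def fit_ensemble(X, y, n_members=50, psi=64, seed=0):
    """Build ensemble members from fixed-size random subsamples.

    Each member keeps `psi` random training points, projected onto a
    random feature subset whose size is itself drawn at random.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    members = []
    for _ in range(n_members):
        rows = rng.sample(range(n), min(psi, n))   # fixed-size data subset
        k = rng.randint(1, d)                      # varying subset size, 1..d
        cols = rng.sample(range(d), k)             # random feature subset
        sample = [([X[i][c] for c in cols], y[i]) for i in rows]
        members.append((cols, sample))
    return members


def class_scores(members, x, classes, radius=1.0):
    """Average, over members, a crude estimate of the joint p(x_S, y).

    ASSUMPTION: the box-count estimate below is a stand-in, not the
    estimator used in the paper.
    """
    scores = {c: 0.0 for c in classes}
    for cols, sample in members:
        xs = [x[c] for c in cols]
        for point, label in sample:
            # Count sampled points inside an axis-aligned box around x.
            if all(abs(a - b) <= radius for a, b in zip(xs, point)):
                scores[label] += 1.0 / len(sample)
    return {c: s / len(members) for c, s in scores.items()}


def predict(members, x, classes, radius=1.0):
    """Return the class with the largest aggregated score."""
    scores = class_scores(members, x, classes, radius)
    return max(classes, key=lambda c: scores[c])
```

Because every member stores only `psi` points regardless of how large the training set is, the model size in this sketch depends on `n_members` and `psi` rather than on the number of training instances, which mirrors the constant-space property claimed in the abstract.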
History
Volume: 7818
Pagination: 136-148
Location: Gold Coast, Qld.
Start date: 2013-04-14
End date: 2013-04-17
ISBN-13: 978-3-642-37453-1
Language: eng
Publication classification: E1.1 Full written paper - refereed
Copyright notice: 2013, Springer-Verlag Berlin Heidelberg
Editor/Contributor(s): Pei J, Tseng VS, Cao L, Motoda H, Xu G
Title of proceedings: PAKDD 2013 : Proceedings of the 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining 2013
Event: Knowledge Discovery and Data Mining. Conference (17th : 2013 : Gold Coast, Qld.)