Proportional k-interval discretization for naive-Bayes classifiers
Yang, Ying and Webb, Geoffrey I. 2001, Proportional k-interval discretization for naive-Bayes classifiers, in ECML 2001 : Machine Learning : 12th European Conference on Machine Learning, Springer-Verlag, Berlin, Germany, pp. 564-575.
Title
Proportional k-interval discretization for naive-Bayes classifiers
ECML 2001 : Machine Learning : 12th European Conference on Machine Learning
Editor(s)
Carbonell, Jaime G.; Siekmann, Jorg
Publication date
2001
Series
Lecture notes in computer science ; 2167
Start page
564
End page
575
Publisher
Springer-Verlag
Place of publication
Berlin, Germany
Summary
This paper argues that two commonly used discretization approaches, fixed k-interval discretization and entropy-based discretization, have sub-optimal characteristics for naive-Bayes classification. This analysis leads to a new discretization method, Proportional k-Interval Discretization (PKID), which adjusts the number and size of discretized intervals to the number of training instances and thus seeks an appropriate trade-off between the bias and variance of the probability estimation for naive-Bayes classifiers. We justify PKID in theory and test it on a wide cross-section of datasets. Our experimental results suggest that, in comparison to its alternatives, PKID gives naive-Bayes classifiers competitive classification performance on smaller datasets and better classification performance on larger datasets.
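The idea of tying both the number and the size of intervals to the number of training instances can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes equal-frequency binning with roughly sqrt(n) intervals of roughly sqrt(n) instances each, for n known values of one numeric attribute. The function names `pkid_cut_points` and `discretize` are hypothetical, and placing cut points at midpoints between neighbouring values is an illustrative choice.

```python
import math
from bisect import bisect_right


def pkid_cut_points(values):
    """Return cut points for equal-frequency bins, using about sqrt(n) bins.

    Assumption: the number of intervals is floor(sqrt(n)), so each interval
    also holds about sqrt(n) training values.
    """
    known = sorted(values)
    n = len(known)
    if n == 0:
        return []
    bins = max(1, math.isqrt(n))
    cuts = []
    for i in range(1, bins):
        idx = i * n // bins  # index of the first value in the next bin
        lo, hi = known[idx - 1], known[idx]
        if lo != hi:  # never split a run of identical values
            cuts.append((lo + hi) / 2)
    return cuts


def discretize(value, cuts):
    """Map a numeric value to the index of its interval."""
    return bisect_right(cuts, value)


cuts = pkid_cut_points(range(16))  # 16 values give 4 bins of 4 values each
print(cuts)
print([discretize(v, cuts) for v in (0, 4, 15)])
```

Under these assumptions, a larger training set yields both more intervals and more instances per interval, which is how the bias and variance trade-off named in the summary is approached.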
ISBN
3540425365 9783540425366
Language
eng
Field of Research
080199 Artificial Intelligence and Image Processing not elsewhere classified