Deakin University

Proportional k-interval discretization for naive-Bayes classifiers

conference contribution
posted on 2001-01-01, 00:00 authored by Ying Yang, G Webb
This paper argues that two commonly used discretization approaches, fixed k-interval discretization and entropy-based discretization, have sub-optimal characteristics for naive-Bayes classification. This analysis leads to a new discretization method, Proportional k-Interval Discretization (PKID), which adjusts the number and size of discretized intervals to the number of training instances, thus seeking an appropriate trade-off between the bias and variance of the probability estimation for naive-Bayes classifiers. We justify PKID in theory and test it on a wide cross-section of datasets. Our experimental results suggest that, in comparison to its alternatives, PKID provides naive-Bayes classifiers with competitive classification performance for smaller datasets and better classification performance for larger datasets.
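The abstract describes PKID as tying both the number and the size of intervals to the training-set size n; in the paper's formulation both are set to roughly √n, so that size × count ≈ n, with intervals formed by equal-frequency binning. A minimal sketch of that idea, assuming a hypothetical helper name `pkid_discretize` and a simplified duplicate-handling rule (cuts are only placed between distinct values):

```python
import math

def pkid_discretize(values, n_train):
    """Proportional k-Interval Discretization (sketch, not the authors' code).

    Sets both the target interval size s and the target interval count
    to roughly sqrt(n_train), so that s * count ~= n_train, then performs
    equal-frequency binning on the sorted values. Returns the cut points.
    """
    s = max(1, int(math.sqrt(n_train)))  # target instances per interval
    ordered = sorted(values)
    cuts = []
    i = s
    while i < len(ordered):
        # Place a cut midway between successive intervals of ~s instances,
        # skipping positions where neighbours are equal so identical
        # values are never split across two intervals (a simplification).
        if ordered[i] != ordered[i - 1]:
            cuts.append((ordered[i - 1] + ordered[i]) / 2.0)
        i += s
    return cuts
```

At classification time, a numeric attribute value would be mapped to the interval it falls in, and the naive-Bayes conditional probability estimated from the training-instance counts in that interval; larger training sets yield more, equally populated intervals, which is the bias-variance trade-off the abstract refers to.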

History

Title of proceedings

ECML 2001 : Machine Learning : 12th European Conference on Machine Learning

Event

European Conference on Machine Learning (12th : 2001 : Freiburg, Germany)

Series

Lecture notes in computer science ; 2167

Pagination

564-575

Publisher

Springer-Verlag

Location

Freiburg, Germany

Place of publication

Berlin, Germany

Start date

2001-09-03

End date

2001-09-07

ISBN-13

9783540425366

ISBN-10

3540425365

Language

eng

Publication classification

E1 Full written paper - refereed

Copyright notice

Springer-Verlag Berlin Heidelberg 2001

Editor/Contributor(s)

J Carbonell, J Siekmann
