Deakin University
Browse

File(s) under permanent embargo

On the size of training set and the benefit from ensemble

journal contribution
posted on 2004-01-01, 00:00 authored by Z H Zhou, D Wei, Gang LiGang Li, Honghua Dai
In this paper, the impact of the size of the training set on the benefit from ensemble, i.e. the gains obtained by employing ensemble learning paradigms, is empirically studied. Experiments on Bagged/ Boosted J4.8 decision trees with/without pruning show that enlarging the training set tends to improve the benefit from Boosting but does not significantly impact the benefit from Bagging. This phenomenon is then explained from the view of bias-variance reduction. Moreover, it is shown that even for Boosting, the benefit does not always increase consistently along with the increase of the training set size since single learners sometimes may learn relatively more from additional training data that are randomly provided than ensembles do. Furthermore, it is observed that the benefit from ensemble of unpruned decision trees is usually bigger than that from ensemble of pruned decision trees. This phenomenon is then explained from the view of error-ambiguity balance.

History

Journal

Lecture notes in computer science

Volume

3056

Pagination

298 - 307

Publisher

Springer-Verlag

Location

Heidelberg, Germany

ISSN

0302-9743

eISSN

1611-3349

Language

eng

Publication classification

C1 Refereed article in a scholarly journal

Copyright notice

2004, Springer-Verlag

Usage metrics

    Research Publications

    Categories

    No categories selected

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC