On the size of training set and the benefit from ensemble

Zhou, Zhi-Hua, Wei, Dan, Li, Gang and Dai, Honghua 2004, On the size of training set and the benefit from ensemble, Lecture notes in computer science, vol. 3056, pp. 298-307.

Title On the size of training set and the benefit from ensemble
Author(s) Zhou, Zhi-Hua
Wei, Dan
Li, Gang (ORCID iD: orcid.org/0000-0003-1583-641X)
Dai, Honghua (ORCID iD: orcid.org/0000-0001-9899-7029)
Journal name Lecture notes in computer science
Volume number 3056
Start page 298
End page 307
Publisher Springer-Verlag
Place of publication Heidelberg, Germany
Publication date 2004
ISSN 0302-9743
Summary In this paper, the impact of the size of the training set on the benefit from ensemble, i.e. the gains obtained by employing ensemble learning paradigms, is empirically studied. Experiments on Bagged/Boosted J4.8 decision trees with/without pruning show that enlarging the training set tends to improve the benefit from Boosting but does not significantly impact the benefit from Bagging. This phenomenon is then explained from the view of bias-variance reduction. Moreover, it is shown that even for Boosting, the benefit does not always increase consistently with the training set size, since single learners may sometimes learn relatively more than ensembles do from additional, randomly provided training data. Furthermore, it is observed that the benefit from an ensemble of unpruned decision trees is usually bigger than that from an ensemble of pruned decision trees. This phenomenon is then explained from the view of error-ambiguity balance.
Language eng
Field of Research 080299 Computation Theory and Mathematics not elsewhere classified
HERDC Research category C1 Refereed article in a scholarly journal
Copyright notice ©2004, Springer-Verlag
Persistent URL http://hdl.handle.net/10536/DRO/DU:30008668

Document type: Journal Article
Collection: School of Information Technology
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Created: Mon, 13 Oct 2008, 15:38:34 EST
