The need for low bias algorithms in classification learning from large data sets

Brain, Damien and Webb, Geoffrey I. 2002, The need for low bias algorithms in classification learning from large data sets, in PKDD 2002 : Principles of Data mining and Knowledge Discovery : 6th European Conference Proceedings, PKDD, [Helsinki, Finland], pp. 62-73.

Title The need for low bias algorithms in classification learning from large data sets
Author(s) Brain, Damien
Webb, Geoffrey I.
Conference name Principles of data mining and knowledge discovery. European Conference (6th : 2002 : Helsinki, Finland)
Conference location Helsinki, Finland
Conference dates 19-23 Aug. 2002
Title of proceedings PKDD 2002 : Principles of Data mining and Knowledge Discovery : 6th European Conference Proceedings
Editor(s) Elomaa, Tapio
Mannila, Heikki
Toivonen, Hannu
Publication date 2002
Start page 62
End page 73
Publisher PKDD
Place of publication [Helsinki, Finland]
Summary This paper reviews whether standard machine learning algorithms, which were mainly developed in the context of small data sets, are appropriate for application to large data sets. Sampling and parallelisation have proved useful means of reducing computation time when learning from large data sets. However, such methods assume that algorithms that were designed for use with what are now considered small data sets are also fundamentally suitable for large data sets. It is plausible that optimal learning from large data sets requires a different type of algorithm from optimal learning from small data sets. This paper investigates one respect in which data set size may affect the requirements of a learning algorithm: the bias plus variance decomposition of classification error. Experiments show that learning from large data sets may be more effective when using an algorithm that places greater emphasis on bias management than on variance management.
ISBN 3540440372
Language eng
Field of Research 080110 Simulation and Modelling
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category E1 Full written paper - refereed
Copyright notice ©2002, PKDD
Persistent URL http://hdl.handle.net/10536/DRO/DU:30004684

Document type: Conference Paper
Collection: School of Information Technology

Created: Mon, 07 Jul 2008, 09:40:12 EST
