The impact of sample size and data quality to classification reliability

Dai, Honghua 2012, The impact of sample size and data quality to classification reliability. In Dai, Honghua, Liu, James N. K. and Smirnov, Evgueni (eds), Reliable knowledge discovery, Springer, New York, N. Y., pp.219-226, doi: 10.1007/978-1-4614-1903-7_12.


Title The impact of sample size and data quality to classification reliability
Author(s) Dai, Honghua
Title of book Reliable knowledge discovery
Editor(s) Dai, Honghua
Liu, James N. K.
Smirnov, Evgueni
Publication date 2012
Chapter number 12
Total chapters 17
Start page 219
End page 226
Total pages 8
Publisher Springer
Place of Publication New York, N. Y.
Keyword(s) data oriented factors
data mining
algorithm oriented factors
Summary The reliability of an induced classifier can be affected by several factors, including data-oriented factors and algorithm-oriented factors [3]. In some cases, reliability can also be affected by knowledge-oriented factors. In this chapter, we analyze three special cases to examine the reliability of the discovered knowledge. Our case study results show that (1) when mining from low-quality data, a rough classification approach, which in general tolerates low-quality data, is more reliable than an exact approach; (2) without a sufficiently large data set, the reliability of the discovered knowledge decreases accordingly; (3) a point learning approach can easily be misled by noisy data, and in most cases it generates an unreliable interval, which reduces the reliability of the discovered knowledge. The results also reveal that the inexact field is a good learning strategy that can model the potentials and improve discovery reliability.
ISBN 9781461419020
Language eng
DOI 10.1007/978-1-4614-1903-7_12
Field of Research 080109 Pattern Recognition and Data Mining
Socio Economic Objective 890202 Application Tools and System Utilities
HERDC Research category B1 Book chapter
Copyright notice ©2012, Springer Science+Business Media, LLC

Document type: Book Chapter
Collection: School of Information Technology

Created: Tue, 13 Mar 2012, 09:47:18 EST
