Missing value estimation for mixed-attribute data sets

Zhu, Xiaofeng, Zhang, Shichao, Jin, Zhi, Zhang, Zili and Xu, Zhuoming 2011, Missing value estimation for mixed-attribute data sets, IEEE transactions on knowledge and data engineering, vol. 23, no. 1, pp. 110-121, doi: 10.1109/TKDE.2010.99.

Title Missing value estimation for mixed-attribute data sets
Author(s) Zhu, Xiaofeng
Zhang, Shichao
Jin, Zhi
Zhang, Zili (ORCID iD: orcid.org/0000-0002-8721-9333)
Xu, Zhuoming
Journal name IEEE transactions on knowledge and data engineering
Volume number 23
Issue number 1
Start page 110
End page 121
Total pages 12
Publisher IEEE
Place of publication Piscataway, NJ
Publication date 2011-01
ISSN 1041-4347
Keyword(s) classification
data mining
machine learning
Summary Missing data imputation is a key issue in learning from incomplete data. Various techniques have been developed with great success for dealing with missing values in data sets with homogeneous attributes (their independent attributes are all either continuous or discrete). This paper studies a new setting of missing data imputation, i.e., imputing missing data in data sets with heterogeneous attributes (their independent attributes are of different types), referred to as imputing mixed-attribute data sets. Although many real applications are in this setting, there is no estimator designed for imputing mixed-attribute data sets. This paper first proposes two consistent estimators for discrete and continuous missing target values, respectively. Then, a mixture-kernel-based iterative estimator is advocated to impute mixed-attribute data sets. The proposed method is evaluated in extensive experiments against some typical algorithms, and the results demonstrate that the proposed approach outperforms these existing imputation methods in terms of classification accuracy and root mean square error (RMSE) at different missing ratios.
Language eng
DOI 10.1109/TKDE.2010.99
Field of Research 080109 Pattern Recognition and Data Mining
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category C1 Refereed article in a scholarly journal
HERDC collection year 2011
Copyright notice ©2011, IEEE
Persistent URL http://hdl.handle.net/10536/DRO/DU:30033730
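
The Summary refers to kernel estimators for discrete and continuous missing target values in mixed-attribute data. This record does not give the paper's formulas, so the following is only a minimal illustrative sketch of the general idea, assuming a generic product kernel (Gaussian on continuous attributes, match/mismatch on discrete attributes) with kernel-weighted averaging or voting. The function names, the bandwidth `h`, and the mismatch weight `lam` are hypothetical and are not taken from the paper; the iterative refinement mentioned in the Summary is not implemented here.

```python
import math
from collections import defaultdict


def mixed_kernel(x, y, cont_idx, disc_idx, h=1.0, lam=0.2):
    """Product kernel over mixed attributes (illustrative assumption).

    Continuous attributes use a Gaussian kernel with bandwidth h;
    discrete attributes contribute 1.0 on a match and lam on a mismatch.
    """
    w = 1.0
    for j in cont_idx:
        w *= math.exp(-0.5 * ((x[j] - y[j]) / h) ** 2)
    for j in disc_idx:
        w *= 1.0 if x[j] == y[j] else lam
    return w


def impute_continuous(query, rows, targets, cont_idx, disc_idx, h=1.0, lam=0.2):
    """Kernel-weighted mean of the observed continuous targets."""
    weights = [mixed_kernel(query, r, cont_idx, disc_idx, h, lam) for r in rows]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to the plain mean.
        return sum(targets) / len(targets)
    return sum(w * t for w, t in zip(weights, targets)) / total


def impute_discrete(query, rows, targets, cont_idx, disc_idx, h=1.0, lam=0.2):
    """Kernel-weighted vote over the observed discrete targets."""
    votes = defaultdict(float)
    for r, t in zip(rows, targets):
        votes[t] += mixed_kernel(query, r, cont_idx, disc_idx, h, lam)
    return max(votes, key=votes.get)


# Toy data: attribute 0 is continuous, attribute 1 is discrete.
rows = [(1.0, "a"), (1.1, "a"), (5.0, "b")]
query = (1.05, "a")

estimate = impute_continuous(query, rows, [10.0, 12.0, 50.0], [0], [1])
label = impute_discrete(query, rows, ["x", "x", "y"], [0], [1])
```

In this toy example the query is close to the first two complete rows, so they dominate both the weighted mean and the vote, while the distant third row contributes almost nothing.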

Document type: Journal Article
Collection: School of Information Technology
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: Cited 92 times in TR Web of Science
Cited 137 times in Scopus
Access Statistics: 469 Abstract Views, 5 File Downloads
Created: Mon, 04 Apr 2011, 10:07:46 EST by Sandra Dunoon
