Deakin University

File(s) under permanent embargo

Multivariable data imputation for the analysis of incomplete credit data

journal contribution
posted on 2020-03-01, 00:00, authored by Q Lan, X Xu, H Ma, Gang Li
Missing data significantly reduce the accuracy and usability of credit scoring models, especially when several variables are missing at once. Most credit scoring models address this problem by deleting the incomplete instances from the dataset or by imputing missing values with the mean, mode, or regression estimates. However, these methods often result in a significant loss of information or introduce bias. We proposed a novel method called BNII to impute missing values, which can be helpful for intelligent credit scoring systems. The proposed BNII algorithm consisted of two stages: a preparatory stage and an imputation stage. In the first stage, a Bayesian network over all of the attributes in the original dataset was constructed from the complete dataset, so that both the network structure, which encodes the dependencies between variables, and the parameters of each variable's conditional distribution could be learned. In the second stage, the variables with missing values were iteratively imputed using the Bayesian network models from the first stage. The algorithm was found to be monotonically convergent. The most significant advantages of the method are that it exploits the inherent probabilistic dependencies between variables without assuming a specific probability distribution, and that it is suitable for multivariate missing cases. Three datasets were used for experiments: one was a real dataset from a famous P2P financial company in China, and the other two were benchmark datasets provided by UCI. The experimental results showed that BNII performed significantly better than other well-known imputation techniques. This suggests that the proposed method can be used to improve the performance of a credit scoring system and can be extended to other expert and intelligent systems.
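
The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' BNII implementation, and several things are assumptions made for illustration only: the network structure is specified by hand rather than learned, all variables are discrete, "the complete dataset" is read as the rows with no missing values, and the update rule is iterated conditional modes (each missing entry is set to the value that maximises the joint probability given everything else). The function names, the `parents` mapping, and the toy records are hypothetical.

```python
from collections import Counter, defaultdict


def fit_cpts(complete_rows, parents):
    """Stage 1 (parameters only): count-based tables for P(var | parents)."""
    cpts = {}
    for var, pa in parents.items():
        table = defaultdict(Counter)
        for row in complete_rows:
            table[tuple(row[p] for p in pa)][row[var]] += 1
        cpts[var] = table
    return cpts


def impute(rows, parents, max_sweeps=20, alpha=1.0):
    """Stage 2: fill None entries, then refine them sweep by sweep."""
    variables = list(parents)  # assumption: listed in topological order
    complete = [r for r in rows if all(r[v] is not None for v in variables)]
    domains = {v: sorted({r[v] for r in complete}) for v in variables}
    cpts = fit_cpts(complete, parents)

    def joint(row):
        # Smoothed joint probability of a fully filled row under the network.
        p = 1.0
        for var, pa in parents.items():
            counts = cpts[var][tuple(row[q] for q in pa)]
            total = sum(counts.values())
            p *= (counts[row[var]] + alpha) / (total + alpha * len(domains[var]))
        return p

    filled = []
    for original in rows:
        row = dict(original)  # copy so the input rows are left untouched
        holes = [v for v in variables if row[v] is None]
        # Initial fill: most frequent value given the already filled parents.
        for v in holes:
            counts = cpts[v][tuple(row[p] for p in parents[v])]
            row[v] = max(domains[v], key=lambda x: counts[x])
        # Refinement: each update picks the value maximising the joint
        # probability, so the joint probability never decreases.
        for _ in range(max_sweeps):
            changed = False
            for v in holes:
                best = max(domains[v], key=lambda x: joint({**row, v: x}))
                if best != row[v]:
                    row[v], changed = best, True
            if not changed:
                break
        filled.append(row)
    return filled


# Hypothetical structure: income -> grade -> default (hand-specified).
parents = {"income": (), "grade": ("income",), "default": ("grade",)}

# Invented toy records; None marks a missing value.
rows = (
    [{"income": "high", "grade": "A", "default": "no"}] * 3
    + [{"income": "low", "grade": "B", "default": "yes"}] * 3
    + [
        {"income": "low", "grade": None, "default": None},
        {"income": None, "grade": "A", "default": None},
    ]
)

filled = impute(rows, parents)
for row in filled[-2:]:
    print(row)
```

Because every update keeps or increases the joint probability of the row, the refinement loop cannot cycle indefinitely, which mirrors in a simplified way the monotone convergence mentioned in the abstract. A fuller treatment would also learn the structure from data, which this sketch deliberately omits.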

History

Journal

Expert Systems with Applications

Volume

141

Article number

112926

Publisher

Elsevier

Location

Amsterdam, The Netherlands

ISSN

0957-4174

Language

eng

Publication classification

C1 Refereed article in a scholarly journal

Copyright notice

2019, Elsevier