Robust-diagnostic regression : a prelude for inducing reliable knowledge from regression
Nurunnabi, Abdul Awal Md. and Dai, Honghua 2012, Robust-diagnostic regression : a prelude for inducing reliable knowledge from regression, in Reliable knowledge discovery, Springer, New York, N. Y., pp.69-92.
Regression lies heart in statistics, it is the one of the most important branch of multivariate techniques available for extracting knowledge in almost every field of study and research. Nowadays, it has drawn a huge interest to perform the tasks with different fields like machine learning, pattern recognition and data mining. Investigating outlier (exceptional) is a century long problem to the data analyst and researchers. Blind application of data could have dangerous consequences and leading to discovery of meaningless patterns and carrying to the imperfect knowledge. As a result of digital revolution and the growth of the Internet and Intranet data continues to be accumulated at an exponential rate and thereby importance of detecting outliers and study their costs and benefits as a tool for reliable knowledge discovery claims perfect attention. Investigating outliers in regression has been paid great value for the last few decades within two frames of thoughts in the name of robust regression and regression diagnostics. Robust regression first wants to fit a regression to the majority of the data and then to discover outliers as those points that possess large residuals from the robust output whereas in regression diagnostics one first finds the outliers, delete/correct them and then fit the regular data by classical (usual) methods. At the beginning there seems to be much confusion but now the researchers reach to the consensus, robustness and diagnostics are two complementary approaches to the analysis of data and any one is not good enough. In this chapter, we discuss both of them under the unique spectrum of regression diagnostics. Chapter expresses the necessity and views of regression diagnostics as well as presents several contemporary methods through numerical examples in linear regression within each aforesaid category together with current challenges and possible future research directions. Our aim is to make the chapter self-explained maintaining its general accessibility.