Alternative Robust Variable Selection Procedures in Multiple Regression
Abstract
Most commonly used variable selection techniques for linear regression are sensitive to outliers and high leverage points, and can therefore produce misleading conclusions. This article proposes robust variable selection methods in which suspected outliers and high leverage points are first identified with regression diagnostic tools, and the best variables are then selected after diagnostic checking. The performance of the proposed methods is compared with classical non-robust criteria and existing criteria via simulations. The Hawkins-Bradu-Kass data set is also analyzed for illustration.
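The two-stage idea described in the abstract (diagnose first, select afterwards) can be illustrated with a minimal sketch. The abstract does not specify which diagnostics or selection criteria the article uses, so everything concrete here is an assumption made for illustration: the median/MAD screening rule, the cutoff of 3, the use of BIC with exhaustive subset search, the synthetic data, and the hypothetical helper names `flag_suspects`, `bic`, and `best_subset`.

```python
import itertools
import numpy as np


def flag_suspects(X, y, cutoff=3.0):
    """Flag suspected high leverage points and outliers (illustrative rules only)."""
    # Robust z-scores per predictor: median and MAD replace mean and SD.
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    leverage = (np.abs(X - med) / mad > cutoff).any(axis=1)

    # Fit OLS on the rows not flagged as leverage points,
    # then judge every residual against a robust (MAD) scale.
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A[~leverage], y[~leverage], rcond=None)
    resid = y - A @ beta
    center = np.median(resid[~leverage])
    scale = 1.4826 * np.median(np.abs(resid[~leverage] - center))
    outlier = np.abs(resid - center) / scale > cutoff
    return leverage | outlier


def bic(X, y):
    """BIC of an OLS fit with intercept (Gaussian likelihood, up to a constant)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return n * np.log(rss / n) + A.shape[1] * np.log(n)


def best_subset(X, y):
    """Exhaustive search over non-empty predictor subsets, minimizing BIC."""
    p = X.shape[1]
    subsets = (s for k in range(1, p + 1)
               for s in itertools.combinations(range(p), k))
    return min(subsets, key=lambda s: bic(X[:, list(s)], y))


# Synthetic data (assumed): only predictors 0 and 1 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = 2 + 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(size=100)

# Contaminate 10 rows: extreme values in the irrelevant predictor 2
# paired with shifted responses, so that predictor 2 looks important.
X[:10, 2] = rng.normal(10, 0.5, size=10)
y[:10] = 40 + rng.normal(size=10)

flagged = flag_suspects(X, y)
classical = best_subset(X, y)                      # selection on all rows
robust = best_subset(X[~flagged], y[~flagged])     # selection after screening

print("rows flagged:", int(flagged.sum()))
print("classical selection:", classical)
print("selection after diagnostics:", robust)
```

In this setup the contaminated rows pull the irrelevant predictor into the classically selected model, while selection on the screened rows is driven by the clean bulk of the data. A simple coordinate-wise screening rule like this one is only a stand-in; diagnostics designed for multivariate leverage points would be a more careful choice.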