Alternative Robust Variable Selection Procedures in Multiple Regression

  • Shokrya Saleh Jazan University, Kingdom of Saudi Arabia
  • Ali Hassan Abuzaid Department of Mathematics, Al Azhar University, Gaza
Keywords: Model selection criteria, Regression diagnostics, Robust variable selection, Breakdown point

Abstract

Most commonly used variable selection techniques for linear regression are sensitive to outliers and high leverage points and can therefore produce misleading conclusions. This article proposes robust variable selection methods in which suspected outliers and high leverage points are first identified with regression diagnostic tools, and the best variables are then selected after diagnostic checking. The performance of the proposed methods is compared with that of the classical non-robust criteria and the existing criteria via simulations. The Hawkins-Bradu-Kass data set is analyzed for illustration.
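
The two-stage idea described in the abstract (diagnose first, then select) can be illustrated with a minimal sketch. This is not the authors' procedure: the diagnostic (Cook's distance with the conventional 4/n cutoff), the selection criterion (AIC over all subsets), and the function names `cooks_distance`, `aic`, and `select_after_diagnostics` are all assumptions chosen for illustration. Classical Cook's distance can itself be masked by clusters of outliers, so a robust diagnostic would likely be preferable in practice.

```python
import itertools
import numpy as np

def cooks_distance(X, y):
    """Cook's distance of each observation for an OLS fit with intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])   # design matrix with intercept
    p = A.shape[1]
    H = A @ np.linalg.pinv(A)              # hat (projection) matrix
    h = np.diag(H)                         # leverages
    resid = y - H @ y
    s2 = resid @ resid / (n - p)           # residual variance estimate
    return resid**2 * h / (p * s2 * (1.0 - h) ** 2)

def aic(X, y, cols):
    """Gaussian AIC (up to an additive constant) for the given predictor columns."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ beta) ** 2)
    return n * np.log(rss / n) + 2 * A.shape[1]

def select_after_diagnostics(X, y, cutoff=None):
    """Drop flagged observations, then pick the AIC-best subset of predictors.

    The cutoff 4/n is a common rule of thumb, assumed here for illustration.
    Returns (best column indices, indices of flagged observations).
    """
    n, k = X.shape
    d = cooks_distance(X, y)
    cutoff = 4.0 / n if cutoff is None else cutoff
    keep = d <= cutoff
    Xc, yc = X[keep], y[keep]
    subsets = (c for r in range(1, k + 1)
               for c in itertools.combinations(range(k), r))
    best = min(subsets, key=lambda c: aic(Xc, yc, c))
    return best, np.flatnonzero(~keep)
```

Calling `select_after_diagnostics(X, y)` on an `n x k` predictor array `X` and response vector `y` returns the selected column indices together with the observations that were set aside. The exhaustive subset search is only practical for a small number of predictors.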

Author Biography

Ali Hassan Abuzaid, Department of Mathematics, Al Azhar University, Gaza
Ali H. Abuzaid is an Associate Professor of Statistics in the Department of Mathematics at Al-Azhar University, Gaza, Palestine. He holds a PhD and an MSc in Statistics from the University of Malaya, Malaysia. His research interests are the development of outlier detection procedures for different types of data and their application to real data. He is currently the Dean of Planning and Quality Assurance at Al-Azhar University.

Published
2019-12-01
How to Cite
Saleh, S., & Abuzaid, A. H. (2019). Alternative Robust Variable Selection Procedures in Multiple Regression. Statistics, Optimization & Information Computing, 7(4), 816-825. https://doi.org/10.19139/soic-2310-5070-642
Section
Research Articles