Statistical Inference for Multivariate Conditional Cumulative Distribution Function Estimation By Stochastic Approximation Method
Abstract
This paper addresses the non-parametric estimation of a conditional cumulative distribution function (CCDF). Using a recursive approach, we propose a multivariate recursive estimator defined by a stochastic approximation algorithm. Our main objective is to investigate the statistical inference of this estimator and to compare it with that of the non-recursive Nadaraya-Watson estimator. To this end, we first derive the asymptotic properties of the proposed estimator, which depend strongly on the choice of two parameters: the stepsize (γn) and the bandwidth (hn). The second-generation plug-in method, a bandwidth selection method that minimizes the Mean Weighted Integrated Squared Error (MWISE) of the estimator, yields the optimal choice of the bandwidth and thereby supports an appropriate choice of the stepsize. We show that, under some conditions, the Mean Squared Error (MSE) of the proposed estimator can be smaller than that of the Nadaraya-Watson estimator. We corroborate our theoretical results through simulation studies and two real-data applications: the Insurance Company Benchmark (COIL 2000) dataset and the French hospital data on the COVID-19 epidemic.
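To make the recursive idea concrete, the following is a minimal sketch and not the paper's exact definition. It assumes a common form of stochastic-approximation kernel estimators: a numerator and a denominator are each updated as (1 - γn) times the previous value plus γn times a kernel-weighted term, and the CCDF estimate is their ratio. The Gaussian product kernel, the default stepsize γn = 1/n, the default bandwidth hn = n^(-1/(d+4)), and all function names are assumptions made for illustration only. A batch Nadaraya-Watson estimator with a single bandwidth is included for comparison.

```python
import numpy as np


def gaussian_kernel(u):
    """Product Gaussian kernel evaluated on the last axis of u (dimension d)."""
    d = u.shape[-1]
    return np.exp(-0.5 * np.sum(u ** 2, axis=-1)) / (2.0 * np.pi) ** (d / 2.0)


def recursive_ccdf(X, Y, x, y, gamma=None, h=None):
    """Recursive estimate of P(Y <= y | X = x), one observation at a time.

    Assumed update (illustrative, not taken from the paper):
        num_n = (1 - gamma_n) * num_{n-1} + gamma_n * h_n^{-d} * K(.) * 1{Y_n <= y}
        den_n = (1 - gamma_n) * den_{n-1} + gamma_n * h_n^{-d} * K(.)
    """
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(Y), -1)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n, d = X.shape
    idx = np.arange(1, n + 1)
    # Hypothetical defaults: stepsize 1/n and bandwidth n^(-1/(d+4)).
    gamma = 1.0 / idx if gamma is None else np.asarray(gamma, dtype=float)
    h = idx ** (-1.0 / (d + 4)) if h is None else np.asarray(h, dtype=float)
    num, den = 0.0, 0.0
    for k in range(n):
        w = gaussian_kernel((x - X[k]) / h[k]) / h[k] ** d
        num = (1.0 - gamma[k]) * num + gamma[k] * w * float(Y[k] <= y)
        den = (1.0 - gamma[k]) * den + gamma[k] * w
    return float(num / den) if den > 0 else float("nan")


def nadaraya_watson_ccdf(X, Y, x, y, h):
    """Non-recursive Nadaraya-Watson estimate of P(Y <= y | X = x)."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(Y), -1)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = gaussian_kernel((x - X) / h)
    total = w.sum()
    return float(np.sum(w * (Y <= y)) / total) if total > 0 else float("nan")


# Simulated example: Y = X + noise, so the true value of P(Y <= 0 | X = 0) is 0.5.
rng = np.random.default_rng(0)
X_sim = rng.normal(size=2000)
Y_sim = X_sim + rng.normal(size=2000)
print("recursive      :", recursive_ccdf(X_sim, Y_sim, x=0.0, y=0.0))
print("Nadaraya-Watson:", nadaraya_watson_ccdf(X_sim, Y_sim, x=0.0, y=0.0, h=2000 ** (-1 / 5)))
```

In this sketch the recursive estimate can be refreshed with each new observation at constant cost, whereas the Nadaraya-Watson estimate is recomputed from the full sample. The bandwidth and stepsize used here are placeholders; the abstract indicates that the paper selects them with a plug-in method.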