Estimation of Zero-Inflated Population Mean with Highly Skewed Nonzero Component: A Bootstrapping Approach

  • Khyam Paneru The University of Tampa, USA
  • R. Noah Padgett Baylor University, USA
  • Hanfeng Chen Bowling Green State University, USA
Keywords: two-component model, zero-inflated log-normal, maximum pseudo-likelihood, bootstrap, skewed distribution

Abstract

This paper adopts a bootstrap procedure in the maximum pseudo-likelihood method under probability sampling designs to estimate the mean of a population that is a mixture of excess zeros and a skewed nonzero sub-population. Simulation studies show that the bootstrap confidence intervals for zero-inflated log-normal populations consistently capture the true mean. The proposed method is applied to a real-life data set.
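As a rough illustration of the idea summarized above, the following minimal Python sketch computes a weighted plug-in estimate of a zero-inflated log-normal mean and a percentile bootstrap interval around it. It is not the authors' implementation: the function names `zi_lognormal_mean` and `bootstrap_ci`, the percentile interval, and the naive resampling of (value, weight) pairs are all assumptions made for illustration. With equal weights the estimator reduces to the ordinary maximum likelihood plug-in, delta * exp(mu + sigma^2 / 2), where delta is the proportion of nonzero values and mu and sigma^2 are the mean and variance of the logged nonzero values.

```python
import math
import random


def zi_lognormal_mean(values, weights=None):
    """Weighted plug-in estimate of a zero-inflated log-normal mean.

    Hypothetical helper: `weights` stands in for survey design weights
    (equal weights if omitted).
    """
    if weights is None:
        weights = [1.0] * len(values)
    total_w = sum(weights)
    # Keep the log of each nonzero observation together with its weight.
    pos = [(math.log(y), w) for y, w in zip(values, weights) if y > 0]
    if not pos:
        return 0.0  # an all-zero sample has estimated mean zero
    pos_w = sum(w for _, w in pos)
    delta = pos_w / total_w                       # weighted share of nonzeros
    mu = sum(w * ly for ly, w in pos) / pos_w     # weighted mean of logs
    sigma2 = sum(w * (ly - mu) ** 2 for ly, w in pos) / pos_w
    return delta * math.exp(mu + sigma2 / 2.0)


def bootstrap_ci(values, weights=None, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval (assumed scheme: resample pairs)."""
    rng = random.Random(seed)
    n = len(values)
    if weights is None:
        weights = [1.0] * n
    estimates = []
    for _ in range(n_boot):
        # Draw n indices with replacement and re-estimate on the resample.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            zi_lognormal_mean([values[i] for i in idx],
                              [weights[i] for i in idx])
        )
    estimates.sort()
    alpha = 1.0 - level
    lower = estimates[math.floor(alpha / 2.0 * (n_boot - 1))]
    upper = estimates[math.ceil((1.0 - alpha / 2.0) * (n_boot - 1))]
    return lower, upper


if __name__ == "__main__":
    # Simulated sample: about 40% zeros, nonzero part log-normal(0, 1.5).
    rng = random.Random(1)
    sample = [0.0 if rng.random() < 0.4 else rng.lognormvariate(0.0, 1.5)
              for _ in range(200)]
    print(zi_lognormal_mean(sample), bootstrap_ci(sample))
```

Resampling pairs with replacement is only appropriate for simple designs; a stratified or unequal-probability design would generally need a design-aware bootstrap, which this sketch does not attempt.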

References

H. Chen, J. Chen, and S. Y. Chen, Confidence intervals for the mean of a population containing many zero values under unequal probability sampling, Canadian Journal of Statistics, vol. 38, no. 4, pp. 582–597, 2010.

J. Chen, and R. R. Sitter, A pseudo empirical likelihood approach to the effective use of auxiliary information in complex surveys, Statistica Sinica, vol. 9, pp. 385–406, 1999.

Y. Cui, and W. Yang, Zero-inflated generalized Poisson regression mixture model for mapping quantitative trait loci underlying count trait with many zeros, Journal of Theoretical Biology, vol. 256, no. 2, pp. 276–285, 2009.

Y. S. Dewi, and L. Amaliana, Zero inflated Poisson and geographically weighted zero-inflated Poisson regression model: Application to elephantiasis (filariasis) counts data, Journal of Mathematics and Statistics, vol. 11, no. 2, pp. 52, 2015.

B. Efron, Bootstrap methods: Another look at the jackknife, The Annals of Statistics, vol. 7, no. 1, pp. 1–26, 1979.

B. Efron, Nonparametric estimates of standard error: The jackknife, the bootstrap and other methods, Biometrika, vol. 68, no. 3, pp. 589–599, 1981.

B. Efron, The Jackknife, the Bootstrap and Other Resampling Plans, Society for Industrial and Applied Mathematics, 1982.

B. Efron, and R. J. Tibshirani, An Introduction to the Bootstrap, CRC Press, 1994.

D. Fletcher, D. MacKenzie, and E. Villouta, Modeling skewed data with many zeros: A simple approach combining ordinary and logistic regression, Environmental and Ecological Statistics, vol. 12, no. 1, pp. 45–54, 2005.

A. H. Kvanli, Y. K. Shen, and L. Y. Deng, Construction of confidence intervals for the mean of a population containing many zero values, Journal of Business & Economic Statistics, vol. 16, no. 3, pp. 362–368, 1998.

D. Lambert, Zero-inflated Poisson regression, with an application to defects in manufacturing, Technometrics, vol. 34, no. 1, pp. 1–14, 1992.

L. Liu, Y. C. T. Shih, R. L. Strawderman, D. Zhang, B. A. Johnson, and H. Chai, Statistical analysis of zero-inflated nonnegative continuous data: A review, Statistical Science, vol. 34, no. 2, pp. 253–279, 2019.

T. Loeys, B. Moerkerke, O. De Smet, and A. Buysse, The analysis of zero-inflated count data: Beyond zero-inflated Poisson regression, British Journal of Mathematical and Statistical Psychology, vol. 65, no. 1, pp. 163–180, 2012.

T. Mahmood, and M. Xie, Models and monitoring of zero-inflated processes: The past and current trends, Quality and Reliability Engineering International, vol. 35, no. 8, pp. 2540–2557, 2019.

K. Paneru, Regression analysis for zero inflated population under complex sampling designs (Doctoral dissertation), Bowling Green State University, Bowling Green, OH, USA, 2013.

K. Paneru, and H. Chen, Asymptotic distribution of pseudo-likelihood ratio statistic for zero-inflated generalized linear models under complex sampling designs, Far East Journal of Theoretical Statistics, vol. 49, no. 1, pp. 41–60, 2014.

K. Paneru, and H. Chen, Regression analysis under complex probability sampling designs in presence of many zero-value responses, Advances and Applications in Statistics, vol. 40, no. 1, pp. 1–29, 2014.

K. Paneru, and H. Chen, Estimation of zero-inflated population mean: A bootstrapping approach, Journal of Modern Applied Statistical Methods, vol. 17, no. 1, pp. 1–14, 2018.

B. Pittman, E. Buta, S. Krishnan-Sarin, S. S. O’Malley, T. Liss, and R. Gueorguieva, Models for analyzing zero-inflated and overdispersed count data: an application to cigarette and marijuana use, Nicotine & Tobacco Research, vol. 22, no. 8, pp. 1390–1398, 2020.

F. Satter, and Y. Zhao, Nonparametric interval estimation for the mean of a zero-inflated population, Communications in Statistics-Simulation and Computation, vol. 49, no. 8, pp. 2059–2067, 2020.

F. Satter, and Y. Zhao, Jackknife empirical likelihood for the mean difference of two zero-inflated skewed populations, Journal of Statistical Planning and Inference, vol. 211, pp. 414–422, 2021.

D. J. Taylor, L. L. Kupper, S. M. Rappaport, and R. H. Lyles, A mixture model for occupational exposure mean testing with a limit of detection, Biometrics, vol. 57, no. 3, pp. 681–688, 2001.

L. Tian, Inferences on the mean of zero-inflated log-normal data: The generalized variable approach, Statistics in Medicine, vol. 24, no. 20, pp. 3223–3232, 2005.

A. H. Welsh, R. B. Cunningham, C. F. Donnelly, and D. B. Lindenmayer, Modeling the abundance of rare species: statistical models for counts with extra zeros, Ecological Modelling, vol. 88, no. 1-3, pp. 297–308, 1996.

Z. Xiao-Hua, and W. Tu, Comparison of several independent population means when their samples contain log-normal and possibly zero observations, Biometrics, vol. 55, no. 2, pp. 645–651, 1999.

Z. Xiao-Hua, and W. Tu, Confidence intervals for the mean of diagnostic test charge data containing zeros, Biometrics, vol. 56, no. 4, pp. 1118–1125, 2000.

M. Zidan, J. C. Wang, and M. Niewiadomska-Bugaj, Comparison of k independent, zero-heavy log-normal distributions, Canadian Journal of Statistics, vol. 39, no. 4, pp. 690–702, 2011.

Published
2022-06-12
How to Cite
Paneru, K., Padgett, R. N., & Chen, H. (2022). Estimation of Zero-Inflated Population Mean with Highly Skewed Nonzero Component: A Bootstrapping Approach. Statistics, Optimization & Information Computing, 10(4), 1044-1055. https://doi.org/10.19139/soic-2310-5070-1491
Section
Research Articles