Improved Mean Methods of Imputation

  • Choukri Mohamed
  • Stephen A. Sedory
  • Sarjinder Singh, Texas A&M University - Kingsville
Keywords: Missing data, Imputation, Mean

Abstract

Replacing missing values of a variable with the mean of the non-missing values is a simple and natural way to impute values in the case where data are missing completely at random. Following a short review of this method, we consider three possible improvements: one called the shrinkage method, a second called the weighted interval method, and a third called the known variance method. Estimates of the population mean obtained from each of these methods are compared to those from the mean method, both analytically and by means of numerical examples.
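
As a minimal illustration of the baseline mean method only (not of the three improved methods, whose formulas are not given on this page), the sketch below fills each missing entry with the mean of the observed entries. The function name `mean_impute`, the use of `None`/NaN to mark missing values, and the sample data are assumptions made for this example.

```python
import math


def mean_impute(values):
    """Replace missing entries (None or NaN) with the mean of the observed entries.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Keep only the entries that were actually observed.
    observed = [v for v in values if v is not None and not math.isnan(v)]
    if not observed:
        raise ValueError("no observed values to compute a mean from")
    fill = sum(observed) / len(observed)
    # Substitute the observed mean wherever a value is missing.
    return [fill if (v is None or math.isnan(v)) else v for v in values]


# Made-up sample with two missing responses.
sample = [4.0, None, 6.0, 8.0, None]
completed = mean_impute(sample)

# Point estimate of the population mean from the completed data.
estimate = sum(completed) / len(completed)
```

With this approach the mean of the completed data equals the mean of the observed values, so the point estimate is unchanged by imputation, while the spread of the completed data is smaller than that of the observed values alone.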

Author Biography

Sarjinder Singh, Texas A&M University - Kingsville
Highly motivated, self-driven, and team-oriented individual with significant experience in research and teaching at several universities, currently working as a tenured associate professor at Texas A&M University-Kingsville.

Published
2018-11-02
How to Cite
Mohamed, C., Sedory, S. A., & Singh, S. (2018). Improved Mean Methods of Imputation. Statistics, Optimization & Information Computing, 6(4), 526-535. https://doi.org/10.19139/soic.v6i4.281
Section
Research Articles