Abstract
Predictive mean matching is an imputation method that combines parametric and nonparametric techniques. It imputes missing values from a nearest-neighbor donor, where the distance is computed on the expected values of the missing variables conditional on the observed covariates rather than directly on the covariate values. In ordinary predictive mean matching, the expected values are computed through a linear regression model. This paper studies a generalization of the original predictive mean matching in which the expected values used for computing the distance are estimated through an approach based on Gaussian mixture models. This approach includes the original predictive mean matching as a special case but also accommodates nonlinear relationships among the variables. Its performance is assessed through an empirical evaluation based on simulations.
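To make the matching step concrete, the following is a minimal sketch of ordinary predictive mean matching with a linear regression model, as described in the abstract. It is an illustration under stated assumptions, not the authors' implementation: the function name `pmm_impute`, the use of a single deterministic nearest donor, and the toy data are all hypothetical choices made here.

```python
import numpy as np

def pmm_impute(X, y):
    """Impute NaNs in y by predictive mean matching (illustrative sketch).

    Assumptions (not from the paper): one target variable, fully observed
    covariates X, and a single nearest donor chosen deterministically.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    missing = np.isnan(y)
    observed = ~missing

    # Design matrix with an intercept column.
    A = np.column_stack([np.ones(len(y)), X])

    # Fit the linear regression on the complete cases only.
    beta, *_ = np.linalg.lstsq(A[observed], y[observed], rcond=None)

    # Predicted means for every unit, donors and recipients alike.
    pred = A @ beta

    # Distance is computed between predicted means, not between covariates.
    dist = np.abs(pred[missing][:, None] - pred[observed][None, :])
    donor = dist.argmin(axis=1)

    # Each recipient receives the observed value of its nearest donor.
    y_imputed = y.copy()
    y_imputed[missing] = y[observed][donor]
    return y_imputed

# Toy data: the third value of y is missing.
x = np.array([1.0, 2.0, 3.5, 4.0, 5.0])
y = np.array([2.1, 3.9, np.nan, 8.2, 9.8])
print(pmm_impute(x, y))
```

Because the imputed value is copied from a donor, it is always a value actually observed in the data. In the generalization studied in the paper, the regression-based predicted mean would be replaced by a conditional expectation estimated from a Gaussian mixture model, while the donor-matching step stays the same.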
Cite this article
Di Zio, M., Guarnera, U. Semiparametric predictive mean matching. AStA Adv Stat Anal 93, 175–186 (2009). https://doi.org/10.1007/s10182-008-0081-2
DOI: https://doi.org/10.1007/s10182-008-0081-2