Random covariances and mixed-effects models for imputing multivariate multilevel continuous data. (English) Zbl 1420.62279

Summary: Principled techniques for incomplete data problems are increasingly part of mainstream statistical practice. Among the many techniques proposed so far, inference by multiple imputation (MI) has emerged as one of the most popular. While many strategies leading to inference by MI are available in cross-sectional settings, the same richness does not exist in multilevel applications. The limited methods available for multilevel applications rely on multivariate adaptations of mixed-effects models. This approach preserves the mean structure across clusters and incorporates distinct variance components into the imputation process. In this paper, I add to these methods by considering a random covariance structure and developing computational algorithms. The attraction of this new imputation modelling strategy is that it correctly reflects the mean and variance structure of the joint distribution of the data and allows the covariances to differ across clusters. Using Markov chain Monte Carlo techniques, a predictive distribution of the missing data given the observed data is simulated, leading to the creation of multiple imputations. To circumvent the large sample size required to support independent covariance estimates for the level-1 error term, I consider distributional impositions mimicking random-effects distributions assigned a priori. These techniques are illustrated in an example exploring relationships between victimization and individual- and contextual-level factors that raise the risk of violent crime.

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62P25 Applications of statistics to social sciences

References:

[1] Daniels M (2006) Bayesian modeling of several covariance matrices and some results on propriety of the posterior for linear regression with correlated and/or heterogeneous errors. Journal of Multivariate Analysis, 98, 568-87.
[2] Demidenko E (2004) Mixed models: theory and applications. New York: John Wiley & Sons. · Zbl 1055.62086
[3] Demirtas H Freels S and Yucel R (2008) Plausibility of multivariate normality assumption when multiply imputing non-Gaussian continuous outcomes: a simulation assessment. Journal of Statistical Computation and Simulation, 78, 69-84. · Zbl 1133.62337
[4] Demirtas H and Hedeker D (2008) An imputation strategy for incomplete longitudinal ordinal data. Statistics in Medicine, 27, 4086-93.
[5] Diggle P Liang K and Zeger S (1994) Analysis of longitudinal data. Oxford: Oxford University Press. · Zbl 1031.62002
[6] Fitzmaurice G Laird N and Ware J (2004) Applied longitudinal analysis. New York: John Wiley & Sons. · Zbl 1057.62052
[7] Gelman A Carlin JB Stern HS and Rubin DB (2004) Bayesian data analysis 2nd edition. London: Chapman & Hall Ltd. · Zbl 1039.62018
[8] Goldstein H Carpenter J Kenward M and Levin K (forthcoming) Multilevel models with multivariate mixed response types. Statistical Modelling, 9, 173-97. · Zbl 07257700
[9] Horton NJ Lipsitz SR and Parzen M (2003) A potential for bias when rounding in multiple imputation. The American Statistician, 57, 229-32. · Zbl 1182.62002
[10] Laird N and Ware J (1982) Random-effects models for longitudinal data. Biometrics, 38, 963-74. · Zbl 0512.62107
[11] Li K Meng X Raghunathan T and Rubin D (1991) Significance levels from repeated p-values with multiply-imputed data. Statistica Sinica, 1, 65-92. · Zbl 0823.62009
[12] Little RJA and Rubin DB (2002) Statistical analysis with missing data, 2nd edition. New York: John Wiley & Sons. · Zbl 1011.62004
[13] Liu M Taylor J and Belin T (2000) Multiple imputation and posterior simulation for multivariate missing data in longitudinal studies. Biometrics, 56, 1157-63. · Zbl 1060.62570
[14] McCulloch C and Searle S (2001) Generalized, linear and mixed models. New York: John Wiley & Sons. · Zbl 0964.62061
[15] Meng XL (1994) Multiple-imputation inferences with uncongenial sources of input. Statistical Science, 10, 538-73.
[16] Miethe T and MacDowall D (1993) Contextual effects in models of criminal victimization. Social Forces, 71, 741-59.
[17] Pinheiro J and Bates D (2000) Mixed-effects models in S and S-PLUS. New York: Springer-Verlag Inc. · Zbl 0953.62065
[18] Pourahmadi M Daniels M and Park T (2007) Simultaneous modelling of the Cholesky decomposition of several covariance matrices. Journal of Multivariate Analysis, 98, 568-87. · Zbl 1107.62043
[19] Raghunathan TE Lepkowski JM and VanHoewyk J (2001) A multivariate technique for multiply imputing missing values using a sequence of regression models. Survey Methodology, 27, 1-20.
[20] Rasbash J Steel F Browne W and Prosser B (2006) MLwiN user’s manual. Bristol, UK: Centre for Multilevel Modelling.
[21] R Development Core Team (2007) R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available at http://www.R-project.org
[22] Rountree P Land C and Miethe T (1994) Macro-micro integration in the study of victimization: a hierarchical logistic model analysis across Seattle neighborhoods. Criminology, 32, 287-313.
[23] Rubin DB (1977) Formalizing subjective notions about the effect of non-respondents in sample surveys. Journal of the American Statistical Association, 72, 538-43. · Zbl 0369.62011
[24] Rubin DB (1987) Multiple imputation for non-response in surveys. New York: John Wiley & Sons. · Zbl 1070.62007
[25] SAS Institute (2001) SAS/STAT user’s guide, Version 8.2. Cary, NC: SAS Publishing.
[26] Schafer J (1997a) Analysis of incomplete multivariate data. London: Chapman & Hall. · Zbl 0997.62510
[27] Schafer J and Yucel R (2002) Computational strategies for multivariate linear mixed-effects models with missing values. Journal of Computational and Graphical Statistics, 11, 421-42.
[28] Schafer JL (1997b) Analysis of incomplete multivariate data. London: Chapman & Hall. · Zbl 0997.62510
[29] Schafer JL and Graham JW (2002) Missing data: our view of the state of the art. Psychological Methods, 7, 147-77.
[30] Tierney L (1994) Markov chains for exploring posterior distributions. Annals of Statistics, 22, 1701-28. · Zbl 0829.62080
[31] Van Buuren S and Oudshoorn C (2000) Multivariate imputation by chained equations: MICE V1.0 user’s guide. TNO Preventie en Gezondheid: Report PG/VGZ/00.038.
[32] Verbeke G and Molenberghs G (2002) Linear mixed models for longitudinal data. New York: Springer-Verlag Inc. · Zbl 0956.62055
[33] Vonesh EF and Chinchilli VM (1997) Linear and nonlinear models for the analysis of repeated measurements. New York: Marcel Dekker Inc. · Zbl 0893.62077
[34] Yucel RM (2008) Multiple imputation inference for multivariate multilevel continuous data with ignorable non-response. Philosophical Transactions of the Royal Society of London, Series A, 366, 2389-403.
[35] Yucel RM He Y and Zaslavsky A (2008) Using calibration to improve rounding in imputation. The American Statistician, 62, 125-29.