
Maximum likelihood methods in treating outliers and symmetrically heavy-tailed distributions for nonlinear structural equation models with missing data. (English) Zbl 1306.62462

Summary: Supported by more than a dozen user-friendly packages, structural equation models (SEMs) are widely used in behavioral, educational, social, and psychological research. Because the theory and methods underlying these packages are vulnerable to outliers and to distributions with longer-than-normal tails, a fundamental problem in the field is the development of robust methods that reduce the influence of outliers and distributional deviations on the analysis. In this paper we develop a maximum likelihood (ML) approach that is robust to outliers and symmetrically heavy-tailed distributions for analyzing nonlinear SEMs with ignorable missing data. The analytic strategy is to incorporate a general class of distributions into the latent variables and the error terms in the measurement and structural equations. A Monte Carlo EM (MCEM) algorithm is constructed to obtain the ML estimates, and a path sampling procedure is implemented to compute the observed-data log-likelihood and then the Bayesian information criterion for model comparison. The proposed methodologies are illustrated with simulation studies and an example.
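As a rough illustration of why a heavy-tailed likelihood resists outliers, the following sketch fits a univariate Student-t location-scale model by EM and compares it with a normal fit via BIC. It is not the paper's method: there is no SEM, no missing data, and the E-step is available in closed form, whereas the summary describes a Monte Carlo E-step and path sampling for the log-likelihood. The function names (`t_em`, `t_loglik`, `normal_loglik`, `bic`), the fixed degrees of freedom `nu = 4`, and the simulated data are all assumptions made for this example.

```python
import math
import random


def t_em(y, nu=4.0, iters=200, tol=1e-10):
    """EM for the location mu and scale sigma2 of a Student-t with fixed nu."""
    n = len(y)
    mu = sum(y) / n
    sigma2 = sum((v - mu) ** 2 for v in y) / n
    w = [1.0] * n
    for _ in range(iters):
        # E-step: expected latent precision weights; large residuals get small weights.
        w = [(nu + 1.0) / (nu + (v - mu) ** 2 / sigma2) for v in y]
        # M-step: weighted mean, then weighted variance.
        mu_new = sum(wi * v for wi, v in zip(w, y)) / sum(w)
        sigma2_new = sum(wi * (v - mu_new) ** 2 for wi, v in zip(w, y)) / n
        converged = abs(mu_new - mu) + abs(sigma2_new - sigma2) < tol
        mu, sigma2 = mu_new, sigma2_new
        if converged:
            break
    return mu, sigma2, w


def t_loglik(y, mu, sigma2, nu):
    """Closed-form log-likelihood of the t location-scale model."""
    c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
         - 0.5 * math.log(nu * math.pi * sigma2))
    return sum(c - (nu + 1) / 2 * math.log1p((v - mu) ** 2 / (nu * sigma2))
               for v in y)


def normal_loglik(y, mu, sigma2):
    """Closed-form log-likelihood of the normal model."""
    n = len(y)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((v - mu) ** 2 for v in y) / (2 * sigma2))


def bic(loglik, n_params, n):
    """Schwarz criterion: smaller values indicate the preferred model."""
    return -2.0 * loglik + n_params * math.log(n)


# Hypothetical data: 200 standard normal draws plus 10 gross outliers at 15.
random.seed(0)
y = [random.gauss(0.0, 1.0) for _ in range(200)] + [15.0] * 10
n = len(y)

# Normal ML estimates are the sample mean and the (biased) sample variance.
mu_norm = sum(y) / n
sigma2_norm = sum((v - mu_norm) ** 2 for v in y) / n

mu_t, sigma2_t, weights = t_em(y, nu=4.0)

bic_norm = bic(normal_loglik(y, mu_norm, sigma2_norm), 2, n)
bic_t = bic(t_loglik(y, mu_t, sigma2_t, 4.0), 2, n)  # nu is fixed, so 2 free parameters

print(f"normal: mu={mu_norm:.3f}  BIC={bic_norm:.1f}")
print(f"t(4):   mu={mu_t:.3f}  BIC={bic_t:.1f}")
```

The E-step weights show the mechanism: observations far from the current location receive weights near zero, so the ten outliers barely move the t-based estimate, while the normal mean is pulled toward them.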

MSC:

62P15 Applications of statistics to psychology

Software:

LISREL
