Latent profile analysis with nonnormal mixtures: a Monte Carlo examination of model selection using fit indices. (English) Zbl 1468.62143

Summary: The performance of fit indices used for model selection in cross-sectional mixture modeling with nonnormally distributed indicators was examined in two Monte Carlo studies. Simulation conditions were selected to mirror conditions found in educational and psychological research. The design factors under investigation were indicator distribution, number of indicators, sample size, and profile prevalence. All models contained 5, 10, or 15 continuous indicators with varying departures from normality. The fit indices examined were Akaike's information criterion (AIC), corrected Akaike's information criterion (AICc), consistent Akaike's information criterion (CAIC), Bayesian information criterion (BIC), sample size-adjusted Bayesian information criterion (SSBIC), Draper's information criterion (DIC), integrated classification likelihood criterion with Bayesian-type approximation (ICL), entropy, and the adjusted Lo-Mendell-Rubin likelihood ratio test (LMR). In the first study, nonnormally distributed data were used to estimate the mixture models. No fit index uniformly identified the simulated number of profiles using nonnormal indicators. The fit indices that identified the simulated number of profiles more frequently than the others were BIC, SSBIC, CAIC, and LMR, although the conditions under which this was observed varied. In the second study, the raw data were transformed using van der Waerden quantile normal scores. Although the transformation deflated the indicator variances, the use of normal scores increased the frequency with which fit indices identified the simulated number of profiles across most conditions.
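As a rough illustration of two ingredients named in the summary, the sketch below computes van der Waerden normal scores and a few of the listed information criteria. It assumes the usual textbook definitions (scores of the form Φ⁻¹(rank/(n + 1)) with average ranks for ties, and the standard penalty terms for AIC, BIC, CAIC, and SSBIC); the record does not state the exact formulas used in the paper, and the function names `van_der_waerden_scores` and `information_criteria` are hypothetical.

```python
import math
from statistics import NormalDist


def van_der_waerden_scores(values):
    """Map values to normal scores Phi^{-1}(rank / (n + 1)).

    Assumption: tied values share the average of their ranks.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the group while the next sorted value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of the tie group
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


def information_criteria(loglik, n_params, n_obs):
    """Return AIC, BIC, CAIC, and SSBIC under their standard definitions.

    Assumption: these are the common textbook forms, not necessarily
    the exact expressions used in the paper.
    """
    deviance = -2.0 * loglik
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "CAIC": deviance + n_params * (math.log(n_obs) + 1),
        "SSBIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }
```

In a workflow like the one the summary describes, each indicator would be passed through `van_der_waerden_scores` before the mixture models are estimated, and the criteria would then be compared across candidate numbers of profiles, with smaller values preferred.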

MSC:

62-08 Computational methods for problems pertaining to statistics
62H30 Classification and discrimination; cluster analysis (statistical aspects)
