
A generalized expectation model selection algorithm for latent variable selection in multidimensional item response theory models. (English) Zbl 1529.62034

Summary: In this paper, we propose a generalized expectation model selection (GEMS) algorithm for latent variable selection in multidimensional item response theory models, which are commonly used to identify the relationships between latent traits and test items. Under some mild assumptions, we prove the numerical convergence of GEMS for model selection by minimizing generalized information criteria of observed data in the presence of missing data. For latent variable selection in the multidimensional two-parameter logistic (M2PL) models, we present an efficient implementation of GEMS that minimizes the Bayesian information criterion. To ensure parameter identifiability, the variances of all latent traits are assumed to be unity and each latent trait is required to have an item exclusively associated with it. The convergence of GEMS for the M2PL models is verified. Simulation studies show that GEMS is computationally more efficient than the expectation model selection (EMS) algorithm and the expectation maximization based \(L_1\)-penalized method (EML1), and that it yields a higher correct rate of latent variable selection and a lower mean squared error of parameter estimates than EMS and EML1. The GEMS algorithm is illustrated by analyzing a real dataset related to the Eysenck Personality Questionnaire.
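To make the objects in the summary concrete, the following is a minimal sketch (not the authors' implementation) of the M2PL item response function and the BIC that GEMS minimizes over candidate loading structures. All function names are illustrative, and the plain Monte Carlo integration over the latent traits is an assumption for brevity; the paper works with an EM-type scheme over the observed-data likelihood.

```python
import numpy as np

def m2pl_prob(theta, A, b):
    """M2PL item response function: P(y_j = 1 | theta) = sigmoid(a_j' theta + b_j).

    theta : (n, K) latent-trait values; A : (J, K) loading matrix; b : (J,) intercepts.
    """
    return 1.0 / (1.0 + np.exp(-(theta @ A.T + b)))

def marginal_loglik(Y, A, b, n_mc=2000, seed=0):
    """Observed-data log-likelihood with theta ~ N(0, I_K) integrated out by
    plain Monte Carlo (illustrative stand-in for the quadrature/EM machinery)."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal((n_mc, A.shape[1]))  # draws from the N(0, I) prior
    P = m2pl_prob(theta, A, b)                       # (n_mc, J)
    # log-likelihood of each response pattern under each theta draw: (N, n_mc)
    ll = Y @ np.log(P).T + (1.0 - Y) @ np.log(1.0 - P).T
    m = ll.max(axis=1, keepdims=True)                # log-sum-exp for stability
    return float(np.sum(m[:, 0] + np.log(np.exp(ll - m).mean(axis=1))))

def bic(Y, A, b, **kw):
    """BIC = -2 log L + k log N, where k counts the free parameters of the
    candidate model: the nonzero loadings in A plus the J intercepts."""
    k = np.count_nonzero(A) + b.size
    return -2.0 * marginal_loglik(Y, A, b, **kw) + k * np.log(Y.shape[0])
```

GEMS-style model selection then amounts to searching over sparsity patterns of \(A\) and scoring each candidate by this criterion; the unit trait variances and the one-item-per-trait anchoring mentioned in the summary are the constraints that keep the loading pattern identifiable during that search.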

MSC:

62-08 Computational methods for problems pertaining to statistics
62P15 Applications of statistics to psychology

Software:

glmnet; Rcpp; mirt
Full Text: DOI

References:

[1] Akaike, H., A new look at the statistical model identification, IEEE Trans. Autom. Control, 19, 6, 716-723 (1974) · Zbl 0314.62039 · doi:10.1109/TAC.1974.1100705
[2] An, H.; Gu, L., Fast stepwise procedures of selection of variables by using AIC and BIC criteria, Acta Math. Appl. Sin., 5, 1, 60-67 (1989) · Zbl 0676.62057 · doi:10.1007/BF02006187
[3] Baker, FB; Kim, SH, Item Response Theory: Parameter Estimation Techniques (2004), Boca Raton: CRC Press, Boca Raton · Zbl 1054.62141 · doi:10.1201/9781482276725
[4] Béguin, AA; Glas, CAW, MCMC estimation and some model-fit analysis of multidimensional IRT models, Psychometrika, 66, 4, 541-561 (2001) · Zbl 1293.62234 · doi:10.1007/BF02296195
[5] Bernaards, CA; Jennrich, RI, Gradient projection algorithms and software for arbitrary rotation criteria in factor analysis, Educ. Psychol. Meas., 65, 5, 676-696 (2005) · doi:10.1177/0013164404272507
[6] Bock, RD; Aitkin, M., Marginal maximum likelihood estimation of item parameters: application of an EM algorithm, Psychometrika, 46, 4, 443-459 (1981) · doi:10.1007/BF02293801
[7] Bock, RD; Gibbons, R.; Muraki, E., Full-information item factor analysis, Appl. Psychol. Meas., 12, 3, 261-280 (1988) · doi:10.1177/014662168801200305
[8] Browne, MW, An overview of analytic rotation in exploratory factor analysis, Multivar. Behav. Res., 36, 1, 111-150 (2001) · doi:10.1207/S15327906MBR3601_05
[9] Chalmers, RP, mirt: a multidimensional item response theory package for the R environment, J. Stat. Softw., 48, 6, 1-29 (2012) · doi:10.18637/jss.v048.i06
[10] Chalmers, RP; Flora, DB, Maximum-likelihood estimation of noncompensatory IRT models with the MH-RM algorithm, Appl. Psychol. Meas., 38, 5, 339-358 (2014) · doi:10.1177/0146621614520958
[11] Cho, AE; Wang, C.; Zhang, X.; Xu, G., Gaussian variational estimation for multidimensional item response theory, Br. J. Math. Stat. Psychol., 74, S1, 52-85 (2021) · doi:10.1111/bmsp.12219
[12] Cho, AE; Xiao, J.; Wang, C.; Xu, G., Regularized variational estimation for exploratory item factor analysis, Psychometrika (2022) · Zbl 1541.62333 · doi:10.1007/s11336-022-09874-6
[13] Claeskens, G.; Hjort, NL, Model Selection and Model Averaging (2008), Cambridge: Cambridge University Press, Cambridge · Zbl 1166.62001
[14] da Silva, MA; Liu, R.; Huggins-Manley, AC; Bazán, JL, Incorporating the Q-Matrix into multidimensional item response theory models, Educ. Psychol. Meas., 79, 4, 665-687 (2019) · doi:10.1177/0013164418814898
[15] Dempster, AP; Laird, NM; Rubin, DB, Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc. Ser. B, 39, 1, 1-38 (1977) · Zbl 0364.62022
[16] Derksen, S.; Keselman, HJ, Backward, forward and stepwise automated subset selection algorithms: frequency of obtaining authentic and noise variables, Br. J. Math. Stat. Psychol., 45, 2, 265-282 (1992) · doi:10.1111/j.2044-8317.1992.tb00992.x
[17] Eddelbuettel, D.; Francois, R., Rcpp: seamless R and C++ integration, J. Stat. Softw., 40, 1-18 (2011) · doi:10.18637/jss.v040.i08
[18] Eysenck, S.; Barrett, P., Re-introduction to cross-cultural studies of the EPQ, Pers. Individ. Differ., 54, 4, 485-489 (2013) · doi:10.1016/j.paid.2012.09.022
[19] Friedman, J.; Hastie, T.; Tibshirani, R., Regularization paths for generalized linear models via coordinate descent, J. Stat. Softw., 33, 1, 1-22 (2010) · doi:10.18637/jss.v033.i01
[20] Ibrahim, JG; Zhu, H.; Tang, N., Model selection criteria for missing-data problems using the EM algorithm, J. Am. Stat. Assoc., 103, 484, 1648-1658 (2008) · Zbl 1286.62082 · doi:10.1198/016214508000001057
[21] Janssen, R.; De Boeck, P., Confirmatory analyses of componential test structure using multidimensional item response theory, Multivar. Behav. Res., 34, 2, 245-268 (1999) · doi:10.1207/S15327906Mb340205
[22] Jiang, J.; Nguyen, T.; Rao, JS, The E-MS algorithm: model selection with incomplete data, J. Am. Stat. Assoc., 110, 511, 1136-1147 (2015) · Zbl 1377.62078 · doi:10.1080/01621459.2014.948545
[23] Kline, P., A Handbook of Test Construction: Introduction to Psychometric Design (1986), New York: Methuen & Co, New York
[24] Lange, K., A gradient algorithm locally equivalent to the EM algorithm, J. R. Stat. Soc. Ser. B, 57, 2, 425-437 (1995) · Zbl 0813.62021
[25] Luenberger, DG; Ye, Y., Linear and Nonlinear Programming (2008), New York: Springer, New York · Zbl 1207.90003 · doi:10.1007/978-0-387-74503-9
[26] McKinley, R., Confirmatory analysis of test structure using multidimensional item response theory, ETS Res. Rep. Ser., 1989, 2, i-40 (1989)
[27] McLachlan, GJ; Krishnan, T., The EM Algorithm and Extensions (2008), Hoboken: John Wiley & Sons, Hoboken · Zbl 1165.62019 · doi:10.1002/9780470191613
[28] Meng, X.; Xu, G.; Zhang, J.; Tao, J., Marginalized maximum a posteriori estimation for the four-parameter logistic model under a mixture modelling framework, Br. J. Math. Stat. Psychol., 73, S1, 51-82 (2020) · doi:10.1111/bmsp.12185
[29] Meng, XL; Schilling, S., Fitting full-information factor models and an empirical investigation of bridge sampling, J. Am. Stat. Assoc., 91, 435, 1254-1267 (1996) · Zbl 0925.62220
[30] Neath, AA; Cavanaugh, JE, The Bayesian information criterion: background, derivation, and applications, Wiley Interdiscip. Rev. Comput. Stat., 4, 2, 199-203 (2012) · doi:10.1002/wics.199
[31] Parikh, N.; Boyd, S., Proximal algorithms, Found. Trends Optim., 1, 3, 127-239 (2014) · doi:10.1561/2400000003
[32] Reckase, MD, Multidimensional Item Response Theory (2009), New York: Springer, New York · Zbl 1291.62023 · doi:10.1007/978-0-387-89976-3
[33] Scharf, F.; Nestler, S., Should regularization replace simple structure rotation in exploratory factor analysis?, Struct. Equ. Model., 26, 4, 576-590 (2019)
[34] Schilling, S.; Bock, RD, High-dimensional maximum marginal likelihood item factor analysis by adaptive quadrature, Psychometrika, 70, 3, 533-555 (2005) · Zbl 1306.62497
[35] Schwarz, G., Estimating the dimension of a model, Ann. Stat., 6, 2, 461-464 (1978) · Zbl 0379.62005 · doi:10.1214/aos/1176344136
[36] Shang, L.; Xu, PF; Shan, N.; Tang, ML; Ho, GTS, Accelerating \(L_1\)-penalized expectation maximization algorithm for latent variable selection in multidimensional two-parameter logistic models, PLoS ONE, 18, 1, e0279918 (2023) · doi:10.1371/journal.pone.0279918
[37] Sun, J.; Chen, Y.; Liu, J.; Ying, Z.; Xin, T., Latent variable selection for multidimensional item response theory models via \(L_1\) regularization, Psychometrika, 81, 4, 921-939 (2016) · Zbl 1367.62322 · doi:10.1007/s11336-016-9529-6
[38] Tibshirani, R., Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B, 58, 1, 267-288 (1996) · Zbl 0850.62538
[39] Trendafilov, NT; Adachi, K., Sparse versus simple structure loadings, Psychometrika, 80, 3, 776-790 (2015) · Zbl 1323.62124 · doi:10.1007/s11336-014-9416-y
[40] Vrieze, SI, Model selection and psychological theory: a discussion of the differences between the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), Psychol. Methods, 17, 2, 228-243 (2012) · doi:10.1037/a0027127
[41] Xu, PF; Shang, L.; Zheng, QZ; Shan, N.; Tang, ML, Latent variable selection in multidimensional item response theory models using the expectation model selection algorithm, Br. J. Math. Stat. Psychol., 75, 2, 363-394 (2022) · Zbl 1534.62253 · doi:10.1111/bmsp.12261
[42] Yamashita, T.; Yamashita, K.; Kamimura, R., A stepwise AIC method for variable selection in linear regression, Commun. Stat. Theory Methods, 36, 13, 2395-2403 (2007) · Zbl 1128.62077 · doi:10.1080/03610920701215639
[43] Zhang, S.; Chen, Y., Computation for latent variable model estimation: a unified stochastic proximal framework, Psychometrika, 87, 4, 1473-1502 (2022) · Zbl 1499.62441 · doi:10.1007/s11336-022-09863-9
[44] Zhang, S.; Chen, Y.; Liu, Y., An improved stochastic EM algorithm for large-scale full-information item factor analysis, Br. J. Math. Stat. Psychol., 73, 1, 44-71 (2020) · Zbl 1440.62089 · doi:10.1111/bmsp.12153
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases the data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or perfect matching.