High-dimensional maximum marginal likelihood item factor analysis by adaptive quadrature. (English) Zbl 1306.62497

Summary: Although the Bock-Aitkin likelihood-based estimation method for factor analysis of dichotomous item response data has important advantages over classical analysis of item tetrachoric correlations, a serious limitation of the method is its reliance on fixed-point Gauss-Hermite (G-H) quadrature in the solution of the likelihood equations and likelihood-ratio tests. When the number of latent dimensions is large, computational considerations require that the number of quadrature points per dimension be few. But with large numbers of items, the dispersion of the likelihood, given the response pattern, becomes so small that the likelihood cannot be accurately evaluated with the sparse fixed points in the latent space. In this paper, we demonstrate that substantial improvement in accuracy can be obtained by adapting the quadrature points to the location and dispersion of the likelihood surfaces corresponding to each distinct pattern in the data. In particular, we show that adaptive G-H quadrature, combined with mean and covariance adjustments at each iteration of an EM algorithm, produces an accurate, fast-converging solution with as few as two points per dimension. Evaluations of this method with simulated data are shown to yield accurate recovery of the generating factor loadings for models of up to eight dimensions. Unlike an earlier application of adaptive Gibbs sampling to this problem by Meng and Schilling, the simulations also confirm the validity of the present method in calculating likelihood-ratio chi-square statistics for determining the number of factors required in the model. Finally, we apply the method to a sample of real data from a test of teacher qualifications.
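
The contrast between fixed and adaptive G-H quadrature described in the summary can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: it assumes a logistic item response function, a single latent dimension, a standard normal prior, and invented item parameters and responses, and it centers and scales the two nodes using the posterior mode and curvature found by Newton steps rather than the mean and covariance adjustments within EM that the paper describes. All function and variable names are hypothetical.

```python
import math

# Two-point Gauss-Hermite rule for the weight function exp(-x^2):
# nodes at +/- 1/sqrt(2), both weights equal to sqrt(pi)/2.
NODES = (-1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
WEIGHTS = (math.sqrt(math.pi) / 2.0, math.sqrt(math.pi) / 2.0)

# Hypothetical data (not from the paper): 40 items, 32 answered correctly.
SLOPES = [1.5] * 40
INTERCEPTS = [0.0] * 40
RESPONSES = [1] * 32 + [0] * 8


def prob_correct(theta, slope, intercept):
    """Logistic item response function (an assumption of this sketch)."""
    return 1.0 / (1.0 + math.exp(-(slope * theta + intercept)))


def joint_density(theta):
    """Likelihood of the response pattern times the N(0, 1) prior density."""
    log_lik = 0.0
    for a, c, u in zip(SLOPES, INTERCEPTS, RESPONSES):
        p = prob_correct(theta, a, c)
        log_lik += math.log(p) if u == 1 else math.log(1.0 - p)
    log_prior = -0.5 * theta * theta - 0.5 * math.log(2.0 * math.pi)
    return math.exp(log_lik + log_prior)


def gradient_and_curvature(theta):
    """First and second derivatives of log joint_density at theta."""
    grad, hess = -theta, -1.0
    for a, c, u in zip(SLOPES, INTERCEPTS, RESPONSES):
        p = prob_correct(theta, a, c)
        grad += a * (u - p)
        hess -= a * a * p * (1.0 - p)
    return grad, hess


def posterior_mode_and_sd(iterations=25):
    """Newton steps to the mode; sd taken from the curvature at the mode."""
    theta = 0.0
    for _ in range(iterations):
        grad, hess = gradient_and_curvature(theta)
        theta -= grad / hess
    _, hess = gradient_and_curvature(theta)
    return theta, 1.0 / math.sqrt(-hess)


def gh_integral(center, scale):
    """Two-point G-H estimate of the integral of joint_density.

    Substituting theta = center + sqrt(2) * scale * x gives
    sqrt(2) * scale * sum_k w_k * exp(x_k^2) * joint_density(theta_k).
    center=0, scale=1 reproduces the fixed rule for the N(0, 1) prior.
    """
    total = 0.0
    for x, w in zip(NODES, WEIGHTS):
        theta = center + math.sqrt(2.0) * scale * x
        total += w * math.exp(x * x) * joint_density(theta)
    return math.sqrt(2.0) * scale * total


def reference_integral(lower=-8.0, upper=8.0, steps=16000):
    """Dense midpoint-rule value, used only as a yardstick."""
    h = (upper - lower) / steps
    return h * sum(joint_density(lower + (i + 0.5) * h) for i in range(steps))


fixed = gh_integral(0.0, 1.0)
mode, sd = posterior_mode_and_sd()
adaptive = gh_integral(mode, sd)
reference = reference_integral()
print("fixed / reference:   ", fixed / reference)
print("adaptive / reference:", adaptive / reference)
```

With these made-up values the fixed rule always evaluates the integrand at theta = -1 and theta = +1, wherever the likelihood for the pattern is concentrated, whereas the adaptive rule moves and rescales the same two nodes to sit around the posterior mode. In this setup the adaptive estimate should therefore land much closer to the dense-grid reference than the fixed one.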

MSC:

62P15 Applications of statistics to psychology
62H25 Factor analysis and principal components; correspondence analysis

Software:

TESTFACT; GLLAMM; BILOG; Mplus

References:

[1] Ahrens J.H., Dieter U. (1979) Computer methods for sampling from the exponential and normal distributions. Communications of the Association for Computing Machinery 15:873–882 · Zbl 0247.65002 · doi:10.1145/355604.361593
[2] Ansari A., Jedidi K. (2000) Bayesian factor analysis for multilevel binary observations. Psychometrika 65(4):475–496 · Zbl 1291.62190 · doi:10.1007/BF02296339
[3] Bartholomew D.J., Knott M. (1999) Latent Variable Models and Factor Analysis. Oxford, New York · Zbl 1066.62528
[4] Bock R.D. (1975/1985) Multivariate Statistical Methods in Behavioral Research. McGraw-Hill, New York; 1985 reprint, Chicago: Scientific Software International · Zbl 0398.62086
[5] Bock R.D., Lieberman M. (1970) Fitting a response model for dichotomously scored items. Psychometrika 35:179–197 · doi:10.1007/BF02291262
[6] Bock R.D., Aitkin M. (1981) Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika 46:443–459 · doi:10.1007/BF02293801
[7] Bock R.D., Gibbons R.D., Muraki E. (1987) Full information item factor analysis. Applied Psychological Measurement 12(3):261–280 · doi:10.1177/014662168801200305
[8] Bock R.D., Schilling S.G. (1997) High-dimensional full-information item factor analysis. In: Berkane M. (ed) Latent Variable Modeling and Applications to Causality. Springer, New York, pp 163–176 · Zbl 0919.62058
[9] Bock R.D., Gibbons R.D., Muraki E., Schilling S.G., Wilson D.T., Wood R. (1999) TESTFACT 3: Test Scoring, Item Statistics, and Full-information Item Factor Analysis. Scientific Software International, Chicago
[10] Divgi D.R. (1979) Calculation of the tetrachoric correlation coefficient. Psychometrika 44:169–172 · Zbl 0422.62052 · doi:10.1007/BF02293968
[11] Ferguson G.A. (1941) The factorial interpretation of test difficulty. Psychometrika 6:323–329 · doi:10.1007/BF02288588
[12] Fox J.P., Glas C.A.W. (2001) Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika 66(2):271–288 · Zbl 1293.62242 · doi:10.1007/BF02294839
[13] Guilford J.P. (1941) The difficulty of a test and its factor composition. Psychometrika 6:66–77
[14] Haberman S.J. (1977) Log-linear models and frequency tables with small expected cell counts. Annals of Statistics 5:1148–1169 · Zbl 0404.62025 · doi:10.1214/aos/1176344001
[15] Harman H.H. (1987) Modern Factor Analysis. University of Chicago Press, Chicago · Zbl 0161.39805
[16] Hedeker D., Gibbons R.D. (1994) A random-effects ordinal regression model for multilevel analysis. Biometrics 50:933–944 · Zbl 0826.62049 · doi:10.2307/2533433
[17] Hill H.C., Schilling S.G., Ball D.L. (2004) Developing measures of teachers’ mathematics knowledge for teaching. Elementary School Journal, in press.
[18] Householder A.S. (1964) The Theory of Matrices in Numerical Analysis. Blaisdell, New York · Zbl 0161.12101
[19] Kaiser H.F. (1958) The varimax criterion for analytic rotation in factor analysis. Psychometrika 23:187–200 · Zbl 0095.33603 · doi:10.1007/BF02289233
[20] Leonelli B.T., Chang C.H., Bock R.D., Schilling S.G. (2000) A full-information item factor analysis interpretation of the MMPI-2: Normative Sampling with Non-pathonomic Descriptors. Journal of Personality Assessment 74(3):400–422 · doi:10.1207/S15327752JPA7403_5
[21] Lesaffre E., Spiessens B. (2001) On the effect of the number of quadrature points in a logistic random-effects model: an example. Applied Statistics 50:325–335 · Zbl 1112.62307
[22] Lindstrom M.J., Bates D.M. (1990) Nonlinear mixed effects models for repeated measures data. Biometrics 46:673–687 · doi:10.2307/2532087
[23] Liu C., Rubin D.B., Wu Y.N. (1998) Parameter expansion to accelerate EM: The PX-EM algorithm. Biometrika 85(4):755–770 · Zbl 0921.62071 · doi:10.1093/biomet/85.4.755
[24] Liu Q., Pierce D.A. (1994) A note on G-H quadrature. Biometrika 81(3):624–629 · Zbl 0813.65053
[25] Meng X.L., Schilling S.G. (1996) Fitting full-information factor models and an empirical investigation of bridge sampling. Journal of the American Statistical Association 91:1254–1267 · Zbl 0925.62220 · doi:10.1080/01621459.1996.10476995
[26] Mislevy R.J. (1984) Estimating latent distributions. Psychometrika 49(3):359–381 · Zbl 0555.62093 · doi:10.1007/BF02306026
[27] Muthén B.O. (1984) A general structural equation model with dichotomous, ordered categorical and continuous latent variable indicators. Psychometrika 49:115–132 · doi:10.1007/BF02294210
[28] Muthén L. K., Muthén B.O. (1998–2001). Mplus User’s Guide (Second edition). Muthén & Muthén, Los Angeles CA
[29] Naylor J.C., Smith A.F.M. (1982) Applications of a method for the efficient computation of posterior distributions. Applied Statistics 31:214–225 · Zbl 0521.65017 · doi:10.2307/2347995
[30] Polak E. (1971) Computational Methods in Optimization. Academic Press, New York · Zbl 0257.90055
[31] Powell M.J.D. (1964) An efficient method for several variables without calculating derivatives. Computer Journal 7:155–162 · Zbl 0132.11702 · doi:10.1093/comjnl/7.2.155
[32] Rabe-Hesketh S., Pickles A., Skrondal A., (2001) GLLAMM Manual. Tech. rept. 2001/01. Department of Biostatistics and Computing, Institute of Psychiatry, King’s College, University of London. Downloadable from http://www.gllamm.org.
[33] Rabe-Hesketh S., Skrondal A., Pickles A. (2002) Reliable estimation of generalized linear mixed models using adaptive quadrature. The Stata Journal 2:1–21
[34] Rabe-Hesketh S., Skrondal A., Pickles A. (2005a) Generalized multilevel structural equation modeling. Psychometrika 69:167–190 · Zbl 1306.62484 · doi:10.1007/BF02295939
[35] Rabe-Hesketh S., Skrondal A., Pickles A. (2005b) Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects. Journal of Econometrics, in press. · Zbl 1336.62079
[36] Ramsay J.O. (1998) Estimating smooth monotone functions. Journal of the Royal Statistical Society, Series B. 60:365–375 · Zbl 0909.62041 · doi:10.1111/1467-9868.00130
[37] Raudenbush S.W., Yang M., Yosef M. (2000) Maximum likelihood for generalized linear models with nested random effects via high-order, multivariate Laplace approximation. Journal of Computational and Graphical Statistics 9(1):141–157
[38] Raudenbush S.W., Birk A.S. (2002) Maximum likelihood for generalized linear models with nested random effects via high-order, multivariate Laplace approximation. Journal of Computational and Graphical Statistics 9(1):141–157
[39] Ripley B.D. (1987) Stochastic Simulation. Wiley, New York · Zbl 0613.65006
[40] Schilling S.G. (1993) Advances in Full Information Item Factor Analysis using the Gibbs Sampler. (Unpublished doctoral dissertation, University of Chicago)
[41] Schrage L. (1979) A more portable fortran random number generator. Association for Computing Machinery: Transactions on Mathematical Software 5:132–138 · Zbl 0403.68041 · doi:10.1145/355826.355828
[42] Thurstone L.L. (1947) Multiple Factor Analysis. The University of Chicago Press, Chicago · Zbl 0029.22203
[43] Thurstone L.L., Thurstone T.G. (1941) Factorial studies of intelligence. Psychometric Monographs No. 2. University of Chicago Press, Chicago · JFM 51.0415.04
[44] Tierney L., Kadane J.B. (1986) Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association 81:82–86 · Zbl 0587.62067 · doi:10.1080/01621459.1986.10478240
[45] Wei G.C.G., Tanner M.A. (1990) A Monte Carlo implementation of the EM algorithm and the poor man’s data augmentation algorithm. Journal of the American Statistical Association 85:699–704 · doi:10.1080/01621459.1990.10474930
[46] Wood R., Wilson D.T., Gibbons R.D., Schilling S.G., Muraki E., Bock R.D. (2003) TESTFACT 4: Test Scoring, Item Statistics, and Full-information Item Factor Analysis. Scientific Software International, Chicago
[47] Zimowski M.F., Muraki E., Mislevy R.J., Bock R.D. (1995) BILOG-MG: multiple-group item analysis and test scoring. Scientific Software International, Chicago
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible without claiming completeness or a perfect matching.