Abstract
A Bayesian procedure to estimate the three-parameter normal ogive model and a generalization of the procedure to a model with multidimensional ability parameters are presented. The procedure is a generalization of a procedure by Albert (1992) for estimating the two-parameter normal ogive model. The procedure supports analyzing data from multiple populations and incomplete designs. It is shown that restrictions can be imposed on the factor matrix for testing specific hypotheses about the ability structure. The technique is illustrated using simulated and real data.
References
Ackerman, T.A. (1996a). Developments in multidimensional item response theory. Applied Psychological Measurement, 20, 309–310.
Ackerman, T.A. (1996b). Graphical representation of multidimensional item response theory analyses. Applied Psychological Measurement, 20, 311–329.
ACT. (1997). ACT Assessment Technical Manual. Iowa City, IA: Author.
Albert, J.H. (1992). Bayesian estimation of normal ogive item response functions using Gibbs sampling. Journal of Educational Statistics, 17, 251–269.
Andersen, E.B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123–140.
Baker, F.B. (1998). An investigation of item parameter recovery characteristics of a Gibbs sampling procedure. Applied Psychological Measurement, 22, 153–169.
Bock, R.D., Gibbons, R.D., & Muraki, E. (1988). Full-information factor analysis. Applied Psychological Measurement, 12, 261–280.
Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: An application of an EM-algorithm. Psychometrika, 46, 443–459.
Bock, R.D., & Schilling, S.G. (1997). High dimensional full-information item factor analysis. In M. Berkane (Ed.), Latent variable modeling and applications to causality (pp. 163–176). New York, NY: Springer.
Bock, R.D., & Zimowski, M.F. (1997). Multiple group IRT. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 433–448). New York, NY: Springer.
Box, G., & Tiao, G. (1973). Bayesian inference in statistical analysis. Reading, MA: Addison-Wesley.
Bradlow, E.T., Wainer, H., & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64, 153–168.
Cressie, N., & Holland, P.W. (1983). Characterizing the manifest probabilities of latent trait models. Psychometrika, 48, 129–141.
Fischer, G.H. (1995). Derivations of the Rasch model. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications (pp. 15–38). New York, NY: Springer.
Fox, J.P., & Glas, C.A.W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66, 271–288.
Fraser, C. (1988). NOHARM: A computer program for fitting both unidimensional and multidimensional normal ogive models of latent trait theory. Armidale, Australia: University of New England.
Gelfand, A.E., & Smith, A.F.M. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.
Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B. (1995). Bayesian data analysis. London: Chapman and Hall.
Glas, C.A.W. (1988). The derivation of some tests for the Rasch model from the multinomial distribution. Psychometrika, 53, 525–546.
Glas, C.A.W. (1998). Detection of differential item functioning using Lagrange multiplier tests. Statistica Sinica, 8(1), 647–667.
Glas, C.A.W. (1999). Modification indices for the 2-pl and the nominal response model. Psychometrika, 64, 273–294.
Glas, C.A.W., & Ellis, J.L. (1993). RSP, Rasch scaling program, computer program and user's manual. Groningen: ProGAMMA.
Glas, C.A.W., & Verhelst, N.D. (1989). Extensions of the partial credit model. Psychometrika, 54, 635–659.
Glas, C.A.W., & Verhelst, N.D. (1995). Tests of fit for polytomous Rasch models. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications (pp. 325–352). New York, NY: Springer.
Glas, C.A.W., Wainer, H., & Bradlow, E.T. (2000). MML and EAP estimates for the testlet response model. In W.J. van der Linden & C.A.W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 271–287). Boston, MA: Kluwer-Nijhoff Publishing.
Hoijtink, H., & Molenaar, I.W. (1997). A multidimensional item response model: Constrained latent class analysis using the Gibbs sampler and posterior predictive checks. Psychometrika, 62, 171–189.
Holland, P.W., & Rosenbaum, P.R. (1986). Conditional association and uni-dimensionality in monotone latent variable models. Annals of Statistics, 14, 1523–1543.
Junker, B. (1991). Essential independence and likelihood-based ability estimation for polytomous items. Psychometrika, 56, 255–278.
Kelderman, H. (1984). Loglinear RM tests. Psychometrika, 49, 223–245.
Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika, 54, 681–697.
Lawley, D.N. (1943). On problems connected with item selection and test construction. Proceedings of the Royal Society of Edinburgh, 61, 273–287.
Lawley, D.N. (1944). The factorial analysis of multiple test items. Proceedings of the Royal Society of Edinburgh, Series A, 62, 74–82.
Lord, F.M. (1952). A theory of test scores. Psychometric Monograph No. 7.
Lord, F.M. (1953a). An application of confidence intervals and of maximum likelihood to the estimation of an examinee's ability. Psychometrika, 18, 57–75.
Lord, F.M. (1953b). The relation of test score to the trait underlying the test. Educational and Psychological Measurement, 13, 517–548.
Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Lord, F.M., & Wingersky, M.S. (1984). Comparison of IRT true-score and equipercentile observed-score “equatings”. Applied Psychological Measurement, 8, 453–461.
Martin-Löf, P. (1973). Statistiska Modeller [Statistical models] (Anteckningar från seminarier Läsåret 1969–1970, utarbetade av Rolf Sundberg. Obetydligt ändrat nytryck, oktober 1973). Stockholm: Institutet för Försäkringsmatematik och Matematisk Statistik vid Stockholms Universitet.
Martin-Löf, P. (1974). The notion of redundancy and its use as a quantitative measure of the discrepancy between a statistical hypothesis and a set of observational data. Scandinavian Journal of Statistics, 1, 3–18.
McDonald, R.P. (1967). Nonlinear factor analysis. Psychometric Monograph No. 15.
McDonald, R.P. (1982). Linear versus nonlinear models in item response theory. Applied Psychological Measurement, 6, 379–396.
McDonald, R.P. (1997). Normal-ogive multidimensional model. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 257–269). New York, NY: Springer.
Mellenbergh, G.J. (1994). Generalized linear item response theory. Psychological Bulletin, 115, 300–307.
Meng, X.L., & Schilling, S.G. (1996). Fitting full-information item factor models and an empirical investigation of bridge sampling. Journal of the American Statistical Association, 91, 1254–1267.
Mislevy, R.J. (1986). Bayes modal estimation in item response models. Psychometrika, 51, 177–195.
Mislevy, R.J., & Bock, R.D. (1990). PC-BILOG. Item analysis and test scoring with binary logistic models. Chicago, IL: Scientific Software International.
Mislevy, R.J., & Wu, P.K. (1996). Missing responses and IRT ability estimation: Omits, choice, time limits and adaptive testing (ETS Research Reports RR-96-30-ONR). Princeton, NJ: Educational Testing Service.
Molenaar, I.W. (1995). Estimation of item parameters. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications (pp. 39–51). New York, NY: Springer.
Patz, R.J., & Junker, B.W. (1999a). A straightforward approach to Markov chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24, 146–178.
Patz, R.J., & Junker, B.W. (1999b). Applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24, 342–366.
Reckase, M.D. (1985). The difficulty of test items that measure more than one ability. Applied Psychological Measurement, 9, 401–412.
Reckase, M.D. (1997). A linear logistic multidimensional model for dichotomous item response data. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 271–286). New York, NY: Springer.
Rasch, G. (1977). On specific objectivity: An attempt at formalizing the request for generality and validity of scientific statements. In M. Blegvad (Ed.), The Danish yearbook of philosophy (pp. 58–94). Copenhagen: Munksgaard.
Reiser, M. (1996). Analysis of residuals for the multinomial item response model. Psychometrika, 61, 509–528.
Rosenbaum, P.R. (1984). Testing the conditional independence and monotonicity assumptions of item response theory. Psychometrika, 49, 425–436.
Rubin, D.B. (1976). Inference and missing data. Biometrika, 63, 581–592.
Shi, J.Q., & Lee, S.Y. (1998). Bayesian sampling based approach for factor analysis models with continuous and polytomous data. British Journal of Mathematical and Statistical Psychology, 51, 233–252.
Sijtsma, K. (1998). Methodology review: Nonparametric IRT approaches to the analysis of dichotomous item scores. Applied Psychological Measurement, 22, 3–32.
Stout, W.F. (1987). A nonparametric approach for assessing latent trait dimensionality. Psychometrika, 52, 589–617.
Stout, W.F. (1990). A new item response theory modeling approach with applications to unidimensional assessment and ability estimation. Psychometrika, 55, 293–326.
Thurstone, L.L. (1947). Multiple factor analysis. Chicago, IL: University of Chicago Press.
Wainer, H., Bradlow, E.T., & Du, Z. (2000). Testlet response theory: An analog for the 3pl model useful in testlet-based adaptive testing. In W.J. van der Linden & C.A.W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 245–269). Boston, MA: Kluwer Academic Publishers.
Wilson, D.T., Wood, R., & Gibbons, R. (1991). TESTFACT: Test scoring, item statistics, and item factor analysis [Computer program]. Chicago, IL: Scientific Software International.
Yen, W.M. (1981). Using simulation results to choose a latent trait model. Applied Psychological Measurement, 5, 245–262.
Yen, W.M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8, 125–145.
Zimowski, M.F., Muraki, E., Mislevy, R.J., & Bock, R.D. (1996). Bilog MG: Multiple-group IRT analysis and test maintenance for binary items. Chicago, IL: Scientific Software International.
Additional information
The authors would like to thank Norman Verhelst for his valuable comments and ACT, CITO group and SweSAT for the use of their data.
About this article
Cite this article
Béguin, A.A., Glas, C.A.W. MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika 66, 541–561 (2001). https://doi.org/10.1007/BF02296195