MCMC estimation and some model-fit analysis of multidimensional IRT models

A. A. Béguin¹ &
C. A. W. Glas¹

Abstract

A Bayesian procedure to estimate the three-parameter normal ogive model and a generalization of the procedure to a model with multidimensional ability parameters are presented. The procedure is a generalization of a procedure by Albert (1992) for estimating the two-parameter normal ogive model. The procedure supports analyzing data from multiple populations and incomplete designs. It is shown that restrictions can be imposed on the factor matrix for testing specific hypotheses about the ability structure. The technique is illustrated using simulated and real data.
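
As a rough illustration of the models named in the abstract, the sketch below evaluates a normal ogive response probability with a guessing parameter and a possibly multidimensional ability vector. It also draws the truncated-normal latent variable on which Albert's (1992) data-augmentation Gibbs sampler for the two-parameter case is built. The parameterization (loadings `alpha`, difficulty `beta`, guessing `gamma`, ability `theta`) and all function names are assumptions made for illustration; the abstract does not give the authors' notation, and this is not their implementation.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def linear_predictor(theta, alpha, beta):
    """Return eta = sum_q alpha_q * theta_q - beta for one person-item pair."""
    if len(theta) != len(alpha):
        raise ValueError("theta and alpha must have the same number of dimensions")
    return sum(a * t for a, t in zip(alpha, theta)) - beta


def p_correct(theta, alpha, beta, gamma=0.0):
    """Normal ogive probability of a correct response.

    gamma = 0 gives the two-parameter model; gamma > 0 adds a guessing floor
    (three-parameter model). theta and alpha are sequences with one entry per
    ability dimension, so length 1 is the unidimensional case.
    """
    eta = linear_predictor(theta, alpha, beta)
    return gamma + (1.0 - gamma) * STD_NORMAL.cdf(eta)


def draw_latent_z(y, eta, rng=random):
    """Draw Z ~ N(eta, 1) truncated to Z > 0 if y == 1 and to Z < 0 if y == 0.

    Inverse-CDF sampling that always works in the lower tail; it assumes
    |eta| is moderate enough that the normal CDF does not underflow to 0.
    """
    sign = 1.0 if y == 1 else -1.0
    r = rng.random()
    while r == 0.0:          # keep the uniform draw strictly positive
        r = rng.random()
    v = STD_NORMAL.cdf(sign * eta) * r   # uniform on (0, Phi(sign * eta))
    return eta - sign * STD_NORMAL.inv_cdf(v)


if __name__ == "__main__":
    # Two ability dimensions, one item with a guessing floor of 0.2.
    theta, alpha, beta, gamma = [0.5, -0.2], [1.0, 0.6], 0.1, 0.2
    eta = linear_predictor(theta, alpha, beta)
    print("P(correct):", round(p_correct(theta, alpha, beta, gamma), 4))
    print("Z given y=1:", draw_latent_z(1, eta))
    print("Z given y=0:", draw_latent_z(0, eta))
```

In a full sampler, a draw like `draw_latent_z` would alternate with draws of the item and ability parameters given the latent values; those steps, and the extra augmentation needed for the guessing parameter, are left out here.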

References

Ackerman, T.A. (1996a). Developments in multidimensional item response theory. Applied Psychological Measurement, 20, 309–310.
Ackerman, T.A. (1996b). Graphical representation of multidimensional item response theory analyses. Applied Psychological Measurement, 20, 311–329.
ACT. (1997). ACT Assessment Technical Manual. Iowa City, IA: Author.
Albert, J.H. (1992). Bayesian estimation of normal ogive item response functions using Gibbs sampling. Journal of Educational Statistics, 17, 251–269.
Andersen, E.B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123–140.
Baker, F.B. (1998). An investigation of item parameter recovery characteristics of a Gibbs sampling procedure. Applied Psychological Measurement, 22, 153–169.
Bock, R.D., Gibbons, R.D., & Muraki, E. (1988). Full-information factor analysis. Applied Psychological Measurement, 12, 261–280.
Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: An application of an EM-algorithm. Psychometrika, 46, 443–459.
Bock, R.D., & Schilling, S.G. (1997). High dimensional full-information item factor analysis. In M. Berkane (Ed.), Latent variable modeling and applications of causality (pp. 163–176). New York, NY: Springer.
Bock, R.D., & Zimowski, M.F. (1997). Multiple group IRT. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 433–448). New York, NY: Springer.
Box, G., & Tiao, G. (1973). Bayesian inference in statistical analysis. Reading, MA: Addison-Wesley.
Bradlow, E.T., Wainer, H., & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64, 153–168.
Cressie, N., & Holland, P.W. (1983). Characterizing the manifest probabilities of latent trait models. Psychometrika, 48, 129–141.
Fischer, G.H. (1995). Derivations of the Rasch model. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications (pp. 15–38). New York, NY: Springer.
Fox, J.P., & Glas, C.A.W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66, 271–288.
Fraser, C. (1988). NOHARM: A computer program for fitting both unidimensional and multidimensional normal ogive models of latent trait theory. Armidale, Australia: University of New England.
Gelfand, A.E., & Smith, A.F.M. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.
Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B. (1995). Bayesian data analysis. London: Chapman and Hall.
Glas, C.A.W. (1988). The derivation of some tests for the Rasch model from the multinomial distribution. Psychometrika, 53, 525–546.
Glas, C.A.W. (1998). Detection of differential item functioning using Lagrange multiplier tests. Statistica Sinica, 8(1), 647–667.
Glas, C.A.W. (1999). Modification indices for the 2-pl and the nominal response model. Psychometrika, 64, 273–294.
Glas, C.A.W., & Ellis, J.L. (1993). RSP, Rasch scaling program, computer program and user's manual. Groningen: ProGAMMA.
Glas, C.A.W., & Verhelst, N.D. (1989). Extensions of the partial credit model. Psychometrika, 54, 635–659.
Glas, C.A.W., & Verhelst, N.D. (1995). Tests of fit for polytomous Rasch models. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications (pp. 325–352). New York, NY: Springer.
Glas, C.A.W., Wainer, H., & Bradlow, E.T. (2000). MML and EAP estimates for the testlet response model. In W.J. van der Linden & C.A.W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 271–287). Boston, MA: Kluwer Academic Publishers.
Hoijtink, H., & Molenaar, I.W. (1997). A multidimensional item response model: Constrained latent class analysis using the Gibbs sampler and posterior predictive checks. Psychometrika, 62, 171–189.
Holland, P.W., & Rosenbaum, P.R. (1986). Conditional association and unidimensionality in monotone latent variable models. Annals of Statistics, 14, 1523–1543.
Junker, B. (1991). Essential independence and likelihood-based ability estimation for polytomous items. Psychometrika, 56, 255–278.
Kelderman, H. (1984). Loglinear RM tests. Psychometrika, 49, 223–245.
Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika, 54, 681–697.
Lawley, D.N. (1943). On problems connected with item selection and test construction. Proceedings of the Royal Society of Edinburgh, 61, 273–287.
Lawley, D.N. (1944). The factorial analysis of multiple test items. Proceedings of the Royal Society of Edinburgh, Series A, 62, 74–82.
Lord, F.M. (1952). A theory of test scores. Psychometric Monograph No. 7.
Lord, F.M. (1953a). An application of confidence intervals and of maximum likelihood to the estimation of an examinee's ability. Psychometrika, 18, 57–75.
Lord, F.M. (1953b). The relation of test score to the trait underlying the test. Educational and Psychological Measurement, 13, 517–548.
Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Lord, F.M., & Wingersky, M.S. (1984). Comparison of IRT true-score and equipercentile observed-score “equatings”. Applied Psychological Measurement, 8, 453–461.
Martin-Löf, P. (1973). Statistiska modeller [Statistical models] (Anteckningar från seminarier läsåret 1969–1970, utarbetade av Rolf Sundberg. Obetydligt ändrat nytryck, oktober 1973). Stockholm: Institutet för Försäkringsmatematik och Matematisk Statistik vid Stockholms Universitet.
Martin-Löf, P. (1974). The notion of redundancy and its use as a quantitative measure of the discrepancy between a statistical hypothesis and a set of observational data. Scandinavian Journal of Statistics, 1, 3–18.
McDonald, R.P. (1967). Nonlinear factor analysis. Psychometric Monograph No. 15.
McDonald, R.P. (1982). Linear versus nonlinear models in item response theory. Applied Psychological Measurement, 6, 379–396.
McDonald, R.P. (1997). Normal-ogive multidimensional model. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 257–269). New York, NY: Springer.
Mellenbergh, G.J. (1994). Generalized linear item response theory. Psychological Bulletin, 115, 300–307.
Meng, X.L., & Schilling, S.G. (1996). Fitting full-information item factor models and an empirical investigation of bridge sampling. Journal of the American Statistical Association, 91, 1254–1267.
Mislevy, R.J. (1986). Bayes modal estimation in item response models. Psychometrika, 51, 177–195.
Mislevy, R.J., & Bock, R.D. (1990). PC-BILOG. Item analysis and test scoring with binary logistic models. Chicago, IL: Scientific Software International.
Mislevy, R.J., & Wu, P.K. (1996). Missing responses and IRT ability estimation: Omits, choice, time limits and adaptive testing (ETS Research Reports RR-96-30-ONR). Princeton, NJ: Educational Testing Service.
Molenaar, I.W. (1995). Estimation of item parameters. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications (pp. 39–51). New York, NY: Springer.
Patz, R.J., & Junker, B.W. (1999a). A straightforward approach to Markov chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24, 146–178.
Patz, R.J., & Junker, B.W. (1999b). Applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24, 342–366.
Reckase, M.D. (1985). The difficulty of test items that measure more than one ability. Applied Psychological Measurement, 9, 401–412.
Reckase, M.D. (1997). A linear logistic multidimensional model for dichotomous item response data. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 271–286). New York, NY: Springer.
Rasch, G. (1977). On specific objectivity: An attempt at formalizing the request for generality and validity of scientific statements. In M. Blegvad (Ed.), The Danish yearbook of philosophy (pp. 58–94). Copenhagen: Munksgaard.
Reiser, M. (1996). Analysis of residuals for the multinomial item response model. Psychometrika, 61, 509–528.
Rosenbaum, P.R. (1984). Testing the conditional independence and monotonicity assumptions of item response theory. Psychometrika, 49, 425–436.
Rubin, D.B. (1976). Inference and missing data. Biometrika, 63, 581–592.
Shi, J.Q., & Lee, S.Y. (1998). Bayesian sampling based approach for factor analysis models with continuous and polytomous data. British Journal of Mathematical and Statistical Psychology, 51, 233–252.
Sijtsma, K. (1998). Methodology review: Nonparametric IRT approaches to the analysis of dichotomous item scores. Applied Psychological Measurement, 22, 3–32.
Stout, W.F. (1987). A nonparametric approach for assessing latent trait dimensionality. Psychometrika, 52, 589–617.
Stout, W.F. (1990). A new item response theory modeling approach with applications to unidimensional assessment and ability estimation. Psychometrika, 55, 293–326.
Thurstone, L.L. (1947). Multiple factor analysis. Chicago, IL: University of Chicago Press.
Wainer, H., Bradlow, E.T., & Du, Z. (2000). Testlet response theory: An analog for the 3pl model useful in testlet-based adaptive testing. In W.J. van der Linden & C.A.W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 245–269). Boston, MA: Kluwer Academic Publishers.
Wilson, D.T., Wood, R., & Gibbons, R. (1991). TESTFACT: Test scoring, item statistics, and item factor analysis [Computer program]. Chicago, IL: Scientific Software International.
Yen, W.M. (1981). Using simulation results to choose a latent trait model. Applied Psychological Measurement, 5, 245–262.
Yen, W.M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8, 125–145.
Zimowski, M.F., Muraki, E., Mislevy, R.J., & Bock, R.D. (1996). Bilog MG: Multiple-group IRT analysis and test maintenance for binary items. Chicago, IL: Scientific Software International.

Author information

Authors and Affiliations

University of Twente, The Netherlands
A. A. Béguin & C. A. W. Glas

Additional information

The authors would like to thank Norman Verhelst for his valuable comments and ACT, CITO group and SweSAT for the use of their data.

About this article

Cite this article

Béguin, A.A., Glas, C.A.W. MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika 66, 541–561 (2001). https://doi.org/10.1007/BF02296195

Received: 27 August 1999
Revised: 16 January 2001
Issue Date: December 2001
DOI: https://doi.org/10.1007/BF02296195
