Modification indices for the 2-PL and the nominal response model. (English) Zbl 1291.62207

Summary: It is shown that various violations of the 2-PL model and the nominal response model can be evaluated using the Lagrange multiplier test or the equivalent efficient score test. The tests presented here focus on violation of local stochastic independence and insufficient capture of the form of the item characteristic curves. Primarily, the tests are item-oriented diagnostic tools, but taken together, they also serve the purpose of evaluation of global model fit. A useful feature of Lagrange multiplier statistics is that they are evaluated using maximum likelihood estimates of the null-model only, that is, the parameters of alternative models need not be estimated. As numerical examples, an application to real data and some power studies are presented.
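For orientation, the general form of a Lagrange multiplier (efficient score) statistic can be sketched as follows. This is the standard textbook form, not notation taken from the paper: the partition of the parameter vector into $\eta_1$ (free under the null model) and $\eta_2$ (fixed at a constant $c$ under the null model), the score vector $h$, and the information matrix $I$ are all symbols assumed here for illustration.

```latex
% Assumed notation (not from the paper): \eta = (\eta_1, \eta_2),
% null hypothesis H_0 : \eta_2 = c, log-likelihood \ell(\eta).
\[
  h_2(\eta) = \frac{\partial \ell(\eta)}{\partial \eta_2},
  \qquad
  I(\eta) =
  \begin{pmatrix}
    I_{11} & I_{12} \\
    I_{21} & I_{22}
  \end{pmatrix}
\]
% Everything is evaluated at the restricted ML estimate
% \hat\eta_0 = (\hat\eta_1, c), where the score for \eta_1 vanishes.
\[
  LM = h_2(\hat\eta_0)^{\top}
       \left[ I_{22} - I_{21} I_{11}^{-1} I_{12} \right]^{-1}
       h_2(\hat\eta_0)
\]
% Under H_0, LM is asymptotically chi-square distributed with
% \dim(\eta_2) degrees of freedom.
```

Because only $\hat\eta_0$ enters the statistic, the parameters of the alternative model never have to be estimated, which is the feature the summary highlights.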

MSC:

62P15 Applications of statistics to psychology

Software:

BayesDA; BILOG; MULTILOG
