
On the information obtainable from comparative judgments. (English) Zbl 1499.62430

Summary: Personality tests employing comparative judgments have been proposed as an alternative to Likert-type rating scales. One of the main advantages of a comparative format is that it can reduce faking of responses in high-stakes situations. However, previous research has shown that it is very difficult to obtain trait score estimates that are both faking resistant and sufficiently accurate for individual-level diagnostic decisions. With the goal of contributing to a solution, I study the information obtainable from comparative judgments analyzed by means of Thurstonian IRT models. First, I extend the mathematical theory of ordinal comparative judgments and the corresponding models. Second, I provide optimal test designs for Thurstonian IRT models that maximize the accuracy of people’s trait score estimates from both frequentist and Bayesian statistical perspectives. Third, I derive analytic upper bounds on the accuracy of these trait estimates achievable through ordinal Thurstonian IRT models. Fourth, I perform numerical experiments that complement results obtained in earlier simulation studies. The combined analytical and numerical results suggest that it is indeed possible to design personality tests using comparative judgments that yield trait score estimates sufficiently accurate for individual-level diagnostic decisions, while reducing faking in high-stakes situations. Recommendations for the practical application of comparative judgments to the measurement of personality, specifically in high-stakes situations, are given.
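To make the modeling framework concrete: in a binary Thurstonian IRT model, each item i has a latent utility t_i = mu_i + lambda_i * eta_a + eps_i (with eta_a the trait the item measures and eps_i normal error), and a pairwise comparative judgment records only whether t_i exceeds t_k. The resulting choice probability is a normal CDF of the scaled utility difference. The sketch below illustrates this standard formulation; the specific parameter values are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the binary Thurstonian IRT choice probability.
# Only the *difference* of latent utilities (scaled by its SD) is
# identified, which is why ordinal comparative judgments carry less
# information than direct ratings. Parameter values are illustrative.
from math import erf, sqrt


def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def prob_prefer(mu_i, mu_k, lam_i, lam_k, eta_a, eta_b, psi2_i, psi2_k):
    """P(item i chosen over item k) for a person with trait scores
    eta_a (trait measured by item i) and eta_b (trait measured by item k).

    Latent utilities: t = mu + lam * eta + eps, eps ~ N(0, psi2).
    The response y = 1{t_i - t_k > 0} yields
    P(y = 1) = Phi(((mu_i - mu_k) + lam_i*eta_a - lam_k*eta_b)
                   / sqrt(psi2_i + psi2_k)).
    """
    num = (mu_i - mu_k) + lam_i * eta_a - lam_k * eta_b
    return phi(num / sqrt(psi2_i + psi2_k))


# Example (assumed values): equally attractive items, a person one SD
# above the mean on trait a and one SD below on trait b.
p = prob_prefer(mu_i=0.0, mu_k=0.0, lam_i=1.0, lam_k=1.0,
                eta_a=1.0, eta_b=-1.0, psi2_i=0.5, psi2_k=0.5)
```

With these assumed values the scaled difference is 2.0, so the person prefers item i with probability Phi(2) (about 0.98), showing how strongly a trait contrast can drive an ordinal comparison even when the items themselves are equally attractive.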

MSC:

62P15 Applications of statistics to psychology
62K05 Optimal statistical designs
