
A generalization of the uniform association model for assessing rater agreement in ordinal scales. (English) Zbl 1511.62308

Summary: Recently, data analysts have paid increasing attention to the assessment of rater agreement, especially in the medical sciences. In this context, statistical indices such as kappa and weighted kappa are the most common choices. These indices are simple to calculate and interpret, but they fail to describe the structure of agreement, particularly when the outcome is ordinal. In recent decades, statisticians have suggested more efficient statistical tools, such as the diagonal-parameter, linear-by-linear association, and agreement plus linear-by-linear association models, for describing the structure of rater agreement. In these models, equal-interval scores are the common choice for the levels of the ordinal scales. In this manuscript, we show that choosing the common equal-interval scores does not necessarily lead to the best fit, and we propose a modification using a power transformation of the ordinal scores. We also use two different data sets (IOTN and ovarian-masses data) to illustrate our suggestion more clearly. In addition, we utilize the concept of category distinguishability to interpret the model parameter estimates.
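To make the agreement indices mentioned in the summary concrete, the following is a minimal sketch of Cohen's kappa and its weighted variant, computed from a square rater-by-rater contingency table. The function names and the choice of linear disagreement weights are illustrative assumptions, not code from the paper under review.

```python
def cohen_kappa(table, weights=None):
    """Cohen's kappa (weighted, if a k x k weight matrix is given).

    table: k x k list of lists of counts; rows index rater A's category,
    columns index rater B's category. With weights=None the identity
    weight matrix is used, which gives the unweighted kappa.
    """
    k = len(table)
    n = float(sum(sum(row) for row in table))
    if weights is None:
        # identity weights: only exact agreement on the diagonal counts
        weights = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    # marginal proportions for each rater
    row = [sum(table[i][j] for j in range(k)) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # observed (weighted) agreement and chance-expected agreement
    po = sum(weights[i][j] * table[i][j] / n
             for i in range(k) for j in range(k))
    pe = sum(weights[i][j] * row[i] * col[j]
             for i in range(k) for j in range(k))
    return (po - pe) / (1.0 - pe)


def linear_weights(k):
    """Standard linear weights w_ij = 1 - |i - j| / (k - 1)."""
    return [[1.0 - abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
```

A perfectly diagonal table yields kappa = 1, while a table whose cells match the product of its margins yields kappa = 0; the weighted version credits near-diagonal cells partially, which is why it is usually preferred for ordinal scales. As the summary notes, neither index describes *how* agreement is structured, which is the motivation for the log-linear association models cited below.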

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
Full Text: DOI

References:

[1] Agresti, A. 1988. A model for agreement between ratings on an ordinal scale. Biometrics, 44: 539-548. · Zbl 0707.62227 · doi:10.2307/2531866
[2] Becker, M. P. and Agresti, A. 1992. Log-linear modeling of pairwise interobserver agreement on a categorical scale. Statist. Med., 11: 101-114. · doi:10.1002/sim.4780110109
[3] Calster, B. V., Timmerman, D., Bourne, T., Testa, A. C., Holsbeke, C. V., Domali, E., Jurkovic, D., Neven, P., Van Huffel, S. and Valentin, L. 2007. Discrimination between benign and malignant adnexal masses by specialist ultrasound examination versus serum CA-125. J. Natl Cancer Inst., 99(22): 1706-1714. · doi:10.1093/jnci/djm199
[4] Cohen, J. 1960. A coefficient of agreement for nominal scales. Educ. Psychol. Meas., 20: 37-46. · doi:10.1177/001316446002000104
[5] Cohen, J. 1968. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol. Bull., 70: 213-220. · doi:10.1037/h0026256
[6] Darroch, J. N. and McCloud, P. I. 1986. Category distinguishability and observer agreement. Austral. J. Statist., 28: 371-388. · Zbl 0609.62140 · doi:10.1111/j.1467-842X.1986.tb00709.x
[7] Feinstein, A. R. and Cicchetti, D. V. 1990. High agreement but low kappa: I. The problem of two paradoxes. J. Clin. Epidemiol., 43: 543-549. · doi:10.1016/0895-4356(90)90158-L
[8] Goodman, L. A. 1979. Simple models for the analysis of association in cross-classifications having ordered categories. J. Amer. Statist. Assoc., 74: 537-552.
[9] Koch, G. G., Landis, J. R., Freeman, J. L., Freeman, D. H. and Lehnen, R. G. 1977. A general methodology for the analysis of experiments with repeated measurement of categorical data. Biometrics, 33: 133-158. · Zbl 0351.62037 · doi:10.2307/2529309
[10] Kraemer, H. C., Periakoil, V. S. and Noda, A. 2002. Tutorial in biostatistics, kappa coefficients in medical research. Statist. Med., 21: 2109-2129. · doi:10.1002/sim.1180
[11] May, S. M. 1994. Modeling observer agreement – an alternative to kappa. J. Clin. Epidemiol., 44: 1315-1324. · doi:10.1016/0895-4356(94)90137-6
[12] Perkins, S. M. and Becker, M. P. 2002. Assessing rater agreement using marginal association models. Statist. Med., 21: 1743-1760. · doi:10.1002/sim.1146
[13] Richmond, S., Shaw, W. C., O’Brien, K. D., Buchanan, I. B., Stephens, C. D., Andrews, M. and Roberts, C. T. 1995. The relationship between the index of orthodontic treatment need and consensus opinion of a panel of 74 dentists. Brit. Dental J., 178: 370-374. · doi:10.1038/sj.bdj.4808776
[14] Tanner, M. A. and Young, M. A. 1985. Modeling agreement among raters. J. Amer. Statist. Assoc., 80: 175-180.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases, these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible without claiming completeness or a perfect matching.