The kappa statistic: a second look. (English) Zbl 1234.68406
Summary: In recent years, the kappa coefficient of agreement has become the de facto standard for evaluating intercoder agreement for tagging tasks. In this squib, we highlight issues that affect \(\kappa \) and that the community has largely neglected. First, we discuss the assumptions underlying different computations of the expected agreement component of \(\kappa \). Second, we discuss how prevalence and bias affect the \(\kappa \) measure.
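For orientation, the general form of the statistic and the two common estimates of its expected agreement component can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: two coders, \(P(A)\) the observed proportion of agreement, and \(p_{1j}\), \(p_{2j}\) the proportions of items that coder 1 and coder 2 assign to category \(j\).
\[ \kappa = \frac{P(A) - P(E)}{1 - P(E)} \]
\[ P(E)_{\text{per-coder}} = \sum_j p_{1j}\, p_{2j}, \qquad P(E)_{\text{pooled}} = \sum_j \left( \frac{p_{1j} + p_{2j}}{2} \right)^2 \]
The per-coder form (as in Cohen's kappa) lets each coder have a separate category distribution, while the pooled form (as in Scott's pi) assumes a single shared distribution. Since \(P(E)_{\text{pooled}} - P(E)_{\text{per-coder}} = \sum_j \left( \frac{p_{1j} - p_{2j}}{2} \right)^2 \ge 0\), the pooled computation never yields a larger value of the coefficient than the per-coder one when \(P(A) < 1\), and the two coincide only when the coders' marginal distributions are identical.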
MSC:
68T50 Natural language processing
62P99 Applications of statistics
62H20 Measures of association (correlation, canonical correlation, etc.)
Keywords:
kappa coefficient of agreement; kappa statistics; expected agreement component; bias; prevalence