Computer Science > Computation and Language

arXiv:2003.04642 (cs)

[Submitted on 10 Mar 2020]

Title: A Framework for Evaluation of Machine Reading Comprehension Gold Standards

Authors: Viktor Schlegel, Marco Valentino, André Freitas, Goran Nenadic, Riza Batista-Navarro

Abstract: Machine Reading Comprehension (MRC) is the task of answering a question over a paragraph of text. While neural MRC systems gain popularity and achieve noticeable performance, issues are being raised with the methodology used to establish their performance, particularly concerning the data design of gold standards that are used to evaluate them. There is but a limited understanding of the challenges present in this data, which makes it hard to draw comparisons and formulate reliable hypotheses. As a first step towards alleviating the problem, this paper proposes a unifying framework to systematically investigate the present linguistic features, required reasoning and background knowledge and factual correctness on one hand, and the presence of lexical cues as a lower bound for the requirement of understanding on the other hand. We propose a qualitative annotation schema for the first and a set of approximative metrics for the latter. In a first application of the framework, we analyse modern MRC gold standards and present our findings: the absence of features that contribute towards lexical ambiguity, the varying factual correctness of the expected answers and the presence of lexical cues, all of which potentially lower the reading comprehension complexity and quality of the evaluation data.

Comments:	In Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020)
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2003.04642 [cs.CL]
	(or arXiv:2003.04642v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2003.04642

Submission history

From: Viktor Schlegel
[v1] Tue, 10 Mar 2020 11:30:22 UTC (107 KB)
