ISCA Archive Interspeech 2015

Selection and aggregation techniques for crowdsourced semantic annotation task

Shammur Absar Chowdhury, Marcos Calvo, Arindam Ghosh, Evgeny A. Stepanov, Ali Orkan Bayer, Giuseppe Riccardi, Fernando García, Emilio Sanchis

Crowdsourcing is an accessible and cost-effective alternative to traditional methods of collecting and annotating data. Its application to simple tasks has been well investigated. However, complex tasks such as semantic annotation transfer require workers to make simultaneous decisions on chunk segmentation and labeling while acquiring domain-specific knowledge on the go. The increased task complexity may lead to low agreement among judgments and/or poor performance. The goal of this paper is to address these crowdsourcing requirements with semantic priming and unsupervised quality control mechanisms. We aim for automatic quality control that takes into account workers' differing levels of expertise and annotation task performance. We investigate judgment selection and aggregation techniques on the task of cross-language semantic annotation transfer. We propose stochastic modeling techniques to estimate the task performance of a worker on a particular judgment with respect to the whole worker group. These estimates are used both to select the best judgments and to perform weighted consensus-based annotation aggregation. We demonstrate that the technique is useful for increasing the quality of collected annotations.
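
To make the difference between judgment selection and weighted consensus aggregation concrete, the following is a minimal sketch, not the method of the paper. It assumes each worker's judgment is a token-level label sequence and that a per-worker performance estimate is already available as a number. The function names, the label set, and the weights are hypothetical placeholders; the stochastic models that would produce such estimates are not reproduced here.

```python
from collections import defaultdict


def weighted_consensus(judgments, weights):
    """Aggregate token-level labels by weighted voting.

    judgments: dict mapping worker id -> list of labels (one per token)
    weights:   dict mapping worker id -> estimated performance (assumed given)
    """
    n_tokens = len(next(iter(judgments.values())))
    aggregated = []
    for i in range(n_tokens):
        scores = defaultdict(float)
        for worker, labels in judgments.items():
            # Each worker's vote counts in proportion to its estimated performance.
            scores[labels[i]] += weights[worker]
        # Sorting the labels first makes tie-breaking deterministic.
        aggregated.append(max(sorted(scores), key=scores.get))
    return aggregated


def select_best(judgments, weights):
    """Return the judgment of the worker with the highest estimated performance."""
    best_worker = max(sorted(judgments), key=weights.get)
    return judgments[best_worker]


# Hypothetical judgments for a three-token utterance.
judgments = {
    "w1": ["B-city", "I-city", "O"],
    "w2": ["B-city", "O", "O"],
    "w3": ["O", "O", "O"],
}
# Placeholder weights standing in for the estimated performance scores.
weights = {"w1": 0.5, "w2": 0.4, "w3": 0.3}

consensus = weighted_consensus(judgments, weights)
best = select_best(judgments, weights)
```

With these placeholder values the two strategies disagree on the second token: selection keeps the label of the top-weighted worker, while the weighted vote follows the two lower-weighted workers whose combined weight is larger.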