Normalization for automated metrics: English and Arabic speech translation

S Condon, G Sanders, D Parvaz, A Rubenstein, C Doran, J Aberdeen, B Oshika
Proceedings of Machine Translation Summit XII: Papers, 2009 (aclanthology.org)
Abstract
The Defense Advanced Research Projects Agency (DARPA) Spoken Language Communication and Translation System for Tactical Use (TRANSTAC) program has experimented with applying automated metrics to speech translation dialogues. For translations into English, BLEU, TER, and METEOR scores correlate well with human judgments, but scores for translation into Arabic correlate with human judgments less strongly. This paper provides evidence to support the hypothesis that automated measures of Arabic are lower due to variation and inflection in Arabic by demonstrating that normalization operations improve correlation between BLEU scores and Likert-type judgments of semantic adequacy—as well as between BLEU scores and human judgments of the successful transfer of the meaning of individual content words from English to Arabic.