Normalization for automated metrics: English and Arabic speech translation

S Condon, G Sanders, D Parvaz, A Rubenstein, C Doran, J Aberdeen, B Oshika

Proceedings of Machine Translation Summit XII: Papers, 2009 • aclanthology.org

Abstract

The Defense Advanced Research Projects Agency (DARPA) Spoken Language Communication and Translation System for Tactical Use (TRANSTAC) program has experimented with applying automated metrics to speech translation dialogues. For translations into English, BLEU, TER, and METEOR scores correlate well with human judgments, but scores for translation into Arabic correlate with human judgments less strongly. This paper provides evidence to support the hypothesis that automated measures of Arabic are lower due to variation and inflection in Arabic by demonstrating that normalization operations improve correlation between BLEU scores and Likert-type judgments of semantic adequacy—as well as between BLEU scores and human judgments of the successful transfer of the meaning of individual content words from English to Arabic.
