HPSG-based preprocessing for English-to-Japanese translation

H Isozaki, K Sudoh, H Tsukada, K Duh
ACM Transactions on Asian Language Information Processing (TALIP), 2012 - dl.acm.org
Japanese sentences have completely different word orders from their corresponding English sentences. Typical phrase-based statistical machine translation (SMT) systems such as Moses search for the best word permutation within a given distance limit (the distortion limit). English-to-Japanese translation needs a large distortion limit to obtain acceptable translations, which makes the number of translation candidates extremely large. As a result, SMT systems often fail to find acceptable translations within a limited time. To solve this problem, some researchers use rule-based preprocessing approaches, which reorder English words into Japanese-like order by using dozens of rules. Our idea is based on the following two observations: (1) Japanese is a typical head-final language, and (2) we can detect the heads of English sentences with a head-driven phrase structure grammar (HPSG) parser. The main contributions of this article are twofold. First, we demonstrate how an off-the-shelf, state-of-the-art HPSG parser enables us to write the reordering rules at an abstract level and can easily improve the quality of English-to-Japanese translation. Second, we show that syntactic heads achieve better results than semantic heads. The proposed method outperforms the best system of the NTCIR-7 PATMT EJ task.
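The head-final observation in the abstract can be illustrated with a minimal sketch. It assumes a toy binary tree in which each node records which child is the head, and it moves every head to the end of its constituent. The `Node` class, the `head` field, the `head_final` function, and the example sentence are all hypothetical names chosen for illustration; the sketch does not reproduce the article's reordering rules or the output format of any HPSG parser.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    """Toy binary constituent (hypothetical; not a real parser's output)."""
    left: "Tree"
    right: "Tree"
    head: int  # 0 if the left child is the head, 1 if the right child is


Tree = Union[str, Node]


def head_final(tree: Tree) -> List[str]:
    """Return the words of `tree` with every head placed last in its constituent."""
    if isinstance(tree, str):
        return [tree]  # a leaf is a single word
    left = head_final(tree.left)
    right = head_final(tree.right)
    # If the head is on the left, swap the children so the head comes last.
    return right + left if tree.head == 0 else left + right


# "John hit a ball": the verb heads the VP, the noun heads the NP,
# and the VP heads the sentence (an assumed toy analysis).
sentence = Node("John", Node("hit", Node("a", "ball", head=1), head=0), head=1)
reordered = head_final(sentence)
print(" ".join(reordered))
```

In this toy analysis only the verb phrase is head-initial, so only its two children are swapped and the verb ends up at the end of the sentence, which is the verb-final order Japanese uses. With the source words already in roughly target order, a phrase-based decoder would need far less long-distance reordering.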
ACM Digital Library