Syntax-Based Post-Ordering for Efficient Japanese-to-English Translation

Authors: Katsuhito Sudoh, Hajime Tsukada, Masaaki Nagata

ACM Transactions on Asian Language Information Processing (TALIP), Volume 12, Issue 3

Article No.: 12, Pages 1 - 15

https://doi.org/10.1145/2499955.2499960

Published: 01 August 2013

Abstract

This article proposes a novel reordering method for efficient two-step Japanese-to-English statistical machine translation (SMT) that separates reordering from lexical translation and solves it afterward. This reordering problem, called post-ordering, is solved as an SMT problem from Head-Final English (HFE) to English. HFE is English reordered by syntax-based rules, and it has been used very successfully for reordering in English-to-Japanese SMT. The proposed method carries that advantage over to the reverse direction, Japanese-to-English, and solves the post-ordering problem with accurate syntax-based SMT that uses target-language syntax. Two-step SMT with the proposed post-ordering empirically reduces the decoding time of accurate but slow syntax-based SMT, because translation through the intermediate HFE approximates it well. In Japanese-to-English patent translation experiments, the proposed method decodes about six times faster than syntax-based SMT with comparable translation accuracy.
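To make the relation between English and HFE concrete, the following is a minimal toy sketch, not the article's implementation. It assumes a hand-built binary tree with marked heads and a simplified head-finalization rule (each head child moves to the end of its constituent). The names `Node`, `head_finalize`, `post_order`, and `words` are hypothetical. The recorded `swapped` flags act as an oracle for the reverse step; the article instead solves post-ordering with syntax-based SMT.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Node:
    """A binary constituent. `head` is the index (0 or 1) of its head child."""
    left: "Tree"
    right: "Tree"
    head: int
    swapped: bool = False  # set when head finalization reordered the children


Tree = Union[str, Node]


def head_finalize(tree: Tree) -> Tree:
    """English order -> head-final order: put each node's head child last."""
    if isinstance(tree, str):
        return tree
    left, right = head_finalize(tree.left), head_finalize(tree.right)
    if tree.head == 0:  # head comes first, so move it to the end
        return Node(right, left, head=1, swapped=True)
    return Node(left, right, head=1)


def post_order(tree: Tree) -> Tree:
    """Head-final order -> English order: undo every recorded swap (oracle)."""
    if isinstance(tree, str):
        return tree
    left, right = post_order(tree.left), post_order(tree.right)
    if tree.swapped:
        return Node(right, left, head=0)
    return Node(left, right, head=1)


def words(tree: Tree) -> list:
    """Read the leaves off a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return words(tree.left) + words(tree.right)


# "John hit the ball": S -> NP VP (head VP), VP -> V NP (head V), NP -> Det N (head N)
english = Node("John", Node("hit", Node("the", "ball", head=1), head=0), head=1)
hfe = head_finalize(english)
print(" ".join(words(hfe)))              # John the ball hit
print(" ".join(words(post_order(hfe))))  # John hit the ball
```

In the two-step setting described above, the first step would produce the HFE word sequence from Japanese, and only the second step (here the oracle `post_order`) has to recover English word order.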

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Machine translation

Published In

ACM Transactions on Asian Language Information Processing Volume 12, Issue 3

August 2013

76 pages

ISSN: 1530-0226

EISSN: 1558-3430

DOI: 10.1145/2499955

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 August 2013

Accepted: 01 December 2012

Revised: 01 November 2012

Received: 01 February 2012

Published in TALIP Volume 12, Issue 3

Qualifiers

Research-article
Research
Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 6
Total Downloads: 341
Downloads (Last 12 months): 7
Downloads (Last 6 weeks): 0

Reflects downloads up to 22 Oct 2024

Citations

Cited By

Anirudh C, Kavi N (2023). How Good are Transformers in Reordering? Multi-disciplinary Trends in Artificial Intelligence, 60-67. Online publication date: 24-Jun-2023.
https://doi.org/10.1007/978-3-031-36402-0_5
Farzi S, Faili H, Kianian S (2018). A neural reordering model based on phrasal dependency tree for statistical machine translation. Intelligent Data Analysis 22:5, 1163-1183. Online publication date: 26-Sep-2018.
https://doi.org/10.3233/IDA-173582
Farzi S, Faili H, Kianian S (2018). A preordering model based on phrasal dependency tree. Digital Scholarship in the Humanities 33:4, 748-765. Online publication date: 18-May-2018.
https://doi.org/10.1093/llc/fqy009
Bisazza A, Federico M (2016). A survey of word reordering in statistical machine translation. Computational Linguistics 42:2, 163-205. Online publication date: 1-Jun-2016.
https://dl.acm.org/doi/10.1162/COLI_a_00245
Ding C, Sakanushi K, Touji H, Yamamoto M (2016). Inter-, Intra-, and Extra-Chunk Pre-Ordering for Statistical Japanese-to-English Machine Translation. ACM Transactions on Asian and Low-Resource Language Information Processing 15:3, 1-28. Online publication date: 9-Jan-2016.
https://dl.acm.org/doi/10.1145/2818381
Farzi S, Faili H (2015). Improving Statistical Machine Translation using Syntax-based Learning-to-Rank System. Digital Scholarship in the Humanities (fqv032). Online publication date: 12-Aug-2015.
https://doi.org/10.1093/llc/fqv032
