Selecting artificially-generated sentences for fine-tuning neural machine translation

A Poncelas, A Way - arXiv preprint arXiv:1909.12016, 2019 - arxiv.org
Neural Machine Translation (NMT) models tend to achieve best performance when larger sets of parallel sentences are provided for training. For this reason, augmenting the training set with artificially-generated sentence pairs can boost performance. Nonetheless, the performance can also be improved with a small number of sentences if they are in the same domain as the test set. Accordingly, we want to explore the use of artificially-generated sentences along with data-selection algorithms to improve German-to-English NMT models trained solely with authentic data. In this work, we show how artificially-generated sentences can be more beneficial than authentic pairs, and demonstrate their advantages when used in combination with data-selection algorithms.
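To make the idea of combining synthetic data with data selection concrete, below is a minimal sketch of one common selection strategy: greedily ranking synthetic (e.g., back-translated) sentence pairs by their source-side n-gram overlap with an in-domain seed set, decaying the value of already-covered n-grams so later picks add new in-domain material. The function names, decay scheme, and scoring details are illustrative assumptions in the spirit of Feature Decay Algorithm-style selection, not necessarily the exact algorithm used in the paper.

```python
from collections import Counter

def ngrams(tokens, n_max=3):
    """All n-grams (n = 1..n_max) of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def select_synthetic(seed_sentences, candidate_pairs, k, n_max=3, decay=0.5):
    """Greedy selection sketch: rank candidate (src, tgt) pairs by overlap of
    source-side n-grams with the in-domain seed set, decaying the value of an
    n-gram each time a selected sentence covers it (hypothetical scheme)."""
    # Initial feature values: n-grams that occur in the in-domain seed text.
    value = Counter()
    for sent in seed_sentences:
        for g in ngrams(sent.split(), n_max):
            value[g] = 1.0

    selected, pool = [], list(candidate_pairs)
    while pool and len(selected) < k:
        # Score every remaining candidate under the current feature values,
        # normalising by length so long sentences are not trivially favoured.
        def score(pair):
            toks = pair[0].split()
            return sum(value.get(g, 0.0) for g in ngrams(toks, n_max)) / max(len(toks), 1)

        best = max(pool, key=score)
        pool.remove(best)
        selected.append(best)
        # Decay the value of the n-grams the chosen sentence already covers,
        # so later picks favour sentences contributing new in-domain n-grams.
        for g in ngrams(best[0].split(), n_max):
            if g in value:
                value[g] *= decay
    return selected

# Illustrative usage with toy German-to-English data (made-up examples):
seed = ["der zug nach berlin ist verspätet"]
synthetic = [
    ("der zug nach münchen fährt ab", "the train to munich departs"),
    ("das wetter ist heute schön", "the weather is nice today"),
]
print(select_synthetic(seed, synthetic, k=1))
```

The selected synthetic pairs would then be appended to the authentic training corpus (or used for fine-tuning a model trained on authentic data only), which is the general setting the abstract describes.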