Title: Large Aligned Treebanks for Syntax-based Machine Translation
Authors: Kotzé, Gideon ×
Vandeghinste, Vincent
Martens, Scott
Tiedemann, Jörg #
Issue Date: 6-Oct-2016
Publisher: Springer
Series Title: Language Resources and Evaluation
Article number: DOI 10.1007/s10579-016-9369-0
Abstract: We present a collection of parallel treebanks that have been automatically aligned on both the terminal and the non-terminal constituent level for use in syntax-based machine translation. We describe how they were constructed and applied to a syntax and example-based machine translation system called Parse and Corpus-Based Machine Translation (PaCo-MT). For the language pair Dutch to English, we present non-terminal alignment evaluation scores for a variety of tree alignment approaches. Finally, based on the parallel treebanks created by these approaches, we evaluate the MT system itself and compare the scores with those of Moses, a current state-of-the-art statistical MT system, when trained on the same data.
ISSN: 1574-020X
Publication status: published
KU Leuven publication type: IT
Appears in Collections: Formal and Computational Linguistics (ComForT), Leuven
× corresponding author
# (joint) last author

Files in This Item:
File | Description | Status | Size | Format
LRE2016.pdf | full article | Published | 988Kb | Adobe PDF

These files are only available to some KU Leuven Association staff members

