Southern African Linguistics and Applied Language Studies vol:28 issue:3 pages:283-290
We investigate the extent to which the detection of phraseological (in)consistency in the translation process can be automated. We describe the acquisition of a large corpus of Belgian legal documents consisting of French rulings (arrêts) translated into Dutch. We apply the sentence alignment tool GMA to the corpus and extract phraseological unit candidates from the resulting sentence pairs, using the term candidate extraction tool TermCalc and the word alignment data produced by GIZA++. The candidates are compared to a reference set drawn from a manual study by an MA student at Lessius/KULeuven. They appear to cover only 33% of the bilingual phraseological unit pairs in the reference set, and only four of the French units that have more than one Dutch equivalent. This indicates the need to devise techniques specifically aimed at detecting multiple equivalence, and hence potential phraseological inconsistency.