Morphosyntactic target language matching in statistical machine translation

Simon Zwarts, Mark Dras

Research output: Contribution to journal (Conference paper)

Abstract

While the intuition that morphological preprocessing of languages in various applications can be beneficial appears to be often true, especially in the case of morphologically richer languages, it is not always the case. Previous work on translation between Nordic languages, including the morphologically rich Finnish, found that morphological analysis and preprocessing actually led to a decrease in translation quality below that of the unprocessed baseline. In this paper we investigate the proposition that the effect on translation quality depends on the kind of morphological preprocessing; and in particular that a specific kind of morphological preprocessing before translation could improve translation quality, a preprocessing that first transforms the source language to look more like the target, adapted from work on preprocessing via syntactically motivated reordering. We show that this is indeed the case in translating from Finnish, and that the results hold for different target languages and different morphological analysers.
Original language: English
Pages (from-to): 169-177
Number of pages: 9
Journal: Proceedings of the Australasian Language Technology Workshop 2008 (ALTA 2008)
Publication status: Published - 2008
Event: Australasian Language Technology Association Workshop - Hobart, TAS
Duration: 8 Dec 2008 to 10 Dec 2008
