Choosing the right translation: A syntactically informed classification approach

Simon Zwarts, Mark Dras

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > Research > peer-review

Abstract

One style of Multi-Engine Machine Translation architecture involves choosing the best of a set of outputs from different systems. Choosing the best translation from an arbitrary set, even in the presence of human references, is a difficult problem; it may prove better to look at mechanisms for making such choices in more restricted contexts. In this paper we take a classification-based approach to choosing between candidates from syntactically informed translations. The idea is that using multiple parsers as part of a classifier could help detect syntactic problems in this context that lead to bad translations; these problems could be detected on either the source side - perhaps sentences with difficult or incorrect parses could lead to bad translations - or on the target side - perhaps the output quality could be measured in a more syntactically informed way, looking for syntactic abnormalities. We show that there is no evidence that the source side information is useful. However, a target-side classifier, when used to identify particularly bad translation candidates, can lead to significant improvements in BLEU score. Improvements are even greater when combined with existing language and alignment model approaches.
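The selection idea in the abstract (use a target-side classifier to rule out particularly bad candidates, then choose among the rest with an existing score) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Candidate` type, the `lm_score` field, the `choose_translation` function and the toy `repeated_word_is_bad` check are hypothetical names introduced here; the toy check merely stands in for a parser-based classifier, and the fallback when every candidate is flagged is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Candidate:
    """One translation candidate (hypothetical structure, not from the paper)."""
    text: str
    lm_score: float  # assumed language-model log-probability; higher is better


def choose_translation(
    candidates: Sequence[Candidate],
    is_bad: Callable[[Candidate], bool],
) -> Candidate:
    """Drop candidates the classifier flags as bad, then take the best lm_score.

    If every candidate is flagged, fall back to the full set (an assumption).
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    kept = [c for c in candidates if not is_bad(c)]
    pool = kept if kept else list(candidates)
    return max(pool, key=lambda c: c.lm_score)


def repeated_word_is_bad(candidate: Candidate) -> bool:
    """Toy stand-in for a syntactic classifier: flags adjacent repeated words."""
    words = candidate.text.lower().split()
    return any(a == b for a, b in zip(words, words[1:]))


if __name__ == "__main__":
    options = [
        Candidate("the the house is green", -3.0),
        Candidate("the house is green", -4.0),
    ]
    print(choose_translation(options, repeated_word_is_bad).text)
```

In this sketch the classifier acts only as a veto, so the final ranking is left to the existing score; that mirrors the abstract's description of filtering out bad candidates and combining with language model approaches, but the exact combination used in the paper is not specified here.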

Language: English
Title of host publication: Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference
Place of Publication: Stroudsburg
Publisher: Association for Computational Linguistics (ACL)
Pages: 1153-1160
Number of pages: 8
Volume: 1
ISBN (Print): 9781905593446
Publication status: Published - 2008
Event: 22nd International Conference on Computational Linguistics, Coling 2008 - Manchester, United Kingdom
Duration: 18 Aug 2008 to 22 Aug 2008

Other

Other: 22nd International Conference on Computational Linguistics, Coling 2008
Country: United Kingdom
City: Manchester
Period: 18/08/08 to 22/08/08


Cite this

Zwarts, S., & Dras, M. (2008). Choosing the right translation: A syntactically informed classification approach. In Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference (Vol. 1, pp. 1153-1160). Stroudsburg: Association for Computational Linguistics (ACL).
@inproceedings{f4f85acac569499aa8d6f9cfb6ace035,
title = "Choosing the right translation: A syntactically informed classification approach",
author = "Simon Zwarts and Mark Dras",
year = "2008",
language = "English",
isbn = "9781905593446",
volume = "1",
pages = "1153--1160",
booktitle = "Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference",
publisher = "Association for Computational Linguistics (ACL)",
}


TY - GEN
T1 - Choosing the right translation
T2 - A syntactically informed classification approach
AU - Zwarts, Simon
AU - Dras, Mark
PY - 2008
Y1 - 2008
UR - http://www.scopus.com/inward/record.url?scp=77956311127&partnerID=8YFLogxK
M3 - Conference proceeding contribution
SN - 9781905593446
VL - 1
SP - 1153
EP - 1160
BT - Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
CY - Stroudsburg
ER -
