United Nations General Assembly Resolutions: a six-language parallel corpus

Alexandre Rafalovitch, Robert Dale

Research output: Chapter in Book/Report/Conference proceeding; Conference proceeding contribution; peer-review


In this paper we describe a six-way parallel public-domain corpus consisting of 2100 United Nations General Assembly Resolutions with translations in the six official languages of the United Nations, averaging around 3 million tokens per language. The corpus is available in a preprocessed, formatting-normalized TMX format, with paragraphs aligned across the languages. We describe the background to the corpus and its content, the process of its construction, and some of its interesting properties.
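As a rough illustration of what paragraph-aligned TMX looks like to a consumer, the sketch below reads translation units with Python's standard library. It assumes only the generic TMX layout (`tu` elements containing one `tuv` per language, each with a `seg`). The sample text, the language codes, and the function name `read_aligned_paragraphs` are hypothetical and are not taken from the corpus itself.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up two-language sample in generic TMX layout (not corpus data).
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="0" segtype="paragraph"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>The General Assembly,</seg></tuv>
      <tuv xml:lang="fr"><seg>L'Assemblée générale,</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_aligned_paragraphs(tmx_text):
    """Return one dict per translation unit, mapping language code to text."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            lang = tuv.get(XML_LANG)
            seg = tuv.find("seg")
            if lang is not None and seg is not None:
                # itertext() also collects text nested inside inline markup.
                unit[lang] = "".join(seg.itertext())
        units.append(unit)
    return units


if __name__ == "__main__":
    for unit in read_aligned_paragraphs(SAMPLE_TMX):
        for lang, text in sorted(unit.items()):
            print(lang, text)
```

Each dictionary holds one aligned paragraph, so a six-language file would simply yield up to six keys per unit under the same assumptions.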
Original language: English
Title of host publication: MT Summit XII proceedings
Publisher: International Association of Machine Translation
Number of pages: 8
Publication status: Published - 2009
Event: Machine Translation Summit (12th : 2009) - Ottawa, Canada
Duration: 26 Aug 2009 to 30 Aug 2009



