About the multilingual parallel corpus of translations

The multilingual parallel corpus of translations is based on European Commission data.

The multilingual corpus contains EU acts in 22 official EU languages. However, not all texts in the corpus have been translated into every language, so the number of hits varies from language to language. Most of the texts are available in English, which was the source language in most cases.

Users of this corpus should be aware that only European Community legislation printed in the paper edition of the Official Journal of the European Union is deemed authentic.

The multilingual corpus is especially useful for translators.

Currently, the corpus contains about 98 million words in 22 languages (not all data have been included yet); the distribution by language is shown in the statistics.


Searching the corpus

Enter the search word(s) in the input field. Select one source language and one or more target languages (to select several target languages, hold down the Control (Ctrl) key while clicking). The volume of output can be limited by specifying a full or partial Celex number.
You can select terminology output, corpus output, or both. The corpus output can be either monolingual (KWIC, Key Word in Context) or multilingual.
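
As an illustration of what monolingual KWIC output looks like, the following is a minimal Python sketch and not the search engine used by this corpus. The sample sentences and Celex numbers are invented for the example, the function name kwic and its parameters are hypothetical, and treating a partial Celex number as a prefix is an assumption about how the filter might behave.

```python
# Invented sample data: (Celex number, sentence) pairs, not taken from the corpus.
SAMPLE = [
    ("32006R1234", "The Member States shall notify the Commission of the measures taken."),
    ("32006L0012", "Member States shall take the necessary measures to ensure compliance."),
    ("31999D0001", "The Commission shall be assisted by a committee."),
]


def kwic(records, keyword, width=3, celex_prefix=""):
    """Return (celex, left, match, right) tuples for each hit of `keyword`.

    `width` is the number of context words kept on each side of the hit.
    `celex_prefix` keeps only records whose Celex number starts with it
    (assumption: a partial Celex number is matched as a prefix).
    """
    hits = []
    for celex, sentence in records:
        if not celex.startswith(celex_prefix):
            continue
        words = sentence.split()
        for i, word in enumerate(words):
            # Compare case-insensitively, ignoring surrounding punctuation.
            if word.strip(".,;:()").lower() == keyword.lower():
                left = " ".join(words[max(0, i - width):i])
                right = " ".join(words[i + 1:i + 1 + width])
                hits.append((celex, left, word, right))
    return hits


# Print each hit with the keyword aligned in a central column.
for celex, left, match, right in kwic(SAMPLE, "measures", celex_prefix="32006"):
    print(f"{celex}  {left:>30} [{match}] {right}")
```

Each hit is shown with a few words of context on either side, so the different uses of a search word can be compared at a glance. A multilingual output would instead pair each source sentence with its aligned translations.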

When making a search, the following wildcards can be used:

The corpus was last updated in 2008.

Please send any comments regarding the corpus to the Main Administrative Office of GSV.