Searching the Evrokorpus

The corpus database can be searched in two ways: with the simple search form on the left side of the main page, or with the advanced search, which is reached through the link beneath that form.
Simple search:
To use the simple search, just enter the search word(s) in the input field and select the appropriate corpus. The program uses the following search strategy:
Advanced search:
If you use advanced search, the search word(s) can be entered in English, French, German, Italian or Spanish and/or in Slovene. Additionally, search results can be limited to:

The default output is bilingual: the first part of each hit consists of the header data (field, revision stage and ID number), followed by aligned translation units (usually sentences) in the source and target languages. A monolingual output can also be selected (KWIC, KeyWord In Context; in this case only up to 50 characters to the left and right of the search word are shown), or only the number of hits can be shown.
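The KWIC output described above can be illustrated with a small Python sketch. It is not the program's actual implementation: the function name kwic, the case-sensitive matching and the sample text are assumptions made for illustration; only the 50-character window comes from the description above.

```python
def kwic(text, query, width=50):
    """Return (left, keyword, right) triples for each occurrence of query.

    Assumption: matching is a plain, case-sensitive substring search;
    the real program may match differently.
    """
    hits = []
    start = text.find(query)
    while start != -1:
        end = start + len(query)
        # Keep at most `width` characters of context on each side.
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        hits.append((left, text[start:end], right))
        start = text.find(query, start + 1)
    return hits


# Invented sample sentence, not taken from the corpus.
sample = "The measure applies unless the Member State decides otherwise."
for left, keyword, right in kwic(sample, "unless"):
    print(f"{left}[{keyword}]{right}")
```

Near the beginning or end of a text the context is simply shorter, which matches the "up to 50 characters" wording.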

Results are sorted on the basis of the quality of translations (translations at the highest revision stage appear at the top of the list). The ID number is shown on the right side of the header of each hit and indicates the act from which the sentence was taken. The ID number is a link: if it is clicked, the whole document will be shown.
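The ordering by revision stage can be pictured with the following Python sketch. How the program stores revision stages is not documented here, so the numeric "stage" field (higher meaning more thoroughly revised), the "id" values and the function name sort_hits are all hypothetical.

```python
def sort_hits(hits):
    """Order hits so that the highest revision stage comes first.

    Assumption: each hit is a dict with a numeric "stage" key, where a
    larger number means a later revision stage. Hits with the same stage
    keep their original relative order, because sorted() is stable.
    """
    return sorted(hits, key=lambda hit: hit["stage"], reverse=True)


# Invented hits with made-up IDs and stages.
hits = [
    {"id": "A", "stage": 1},
    {"id": "B", "stage": 3},
    {"id": "C", "stage": 2},
    {"id": "D", "stage": 3},
]
print([hit["id"] for hit in sort_hits(hits)])
```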

When making a search, the following wildcards can be used:

Tips on using the corpus

If you want to check how many times the search query has been translated in a particular way, you can just click the links provided in the detailed output of Evroterm (e.g., there are five possible translations of the word sustainable into Slovene; however, according to Evrokorpus results, only one of them seems to be widely used). If the search query is not found in the Evroterm database, you can switch to advanced search in Evrokorpus and then enter the search query in one language and possible translations (one by one) in another language. This will give the frequency of use of a particular translation.
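The frequency check described above (one query in one language, candidate translations entered one by one in the other) amounts to the counting shown in this Python sketch. The function name translation_frequency, the substring matching on word stems and the sample pairs are assumptions for illustration; the sample data is invented and says nothing about real corpus frequencies.

```python
def translation_frequency(pairs, query, candidates):
    """Count, for each candidate, the aligned units that contain the
    query on the source side and the candidate on the target side.

    Assumption: naive case-insensitive substring matching, so a stem can
    be passed as a candidate to catch inflected forms.
    """
    counts = {candidate: 0 for candidate in candidates}
    needle = query.lower()
    for source, target in pairs:
        if needle not in source.lower():
            continue
        target_lower = target.lower()
        for candidate in candidates:
            if candidate.lower() in target_lower:
                counts[candidate] += 1
    return counts


# Invented aligned units (source, target); not real corpus content.
pairs = [
    ("sustainable development", "trajnostni razvoj"),
    ("sustainable use of resources", "trajnostna raba virov"),
    ("sustainable growth", "vzdržna rast"),
    ("economic growth", "gospodarska rast"),
]
print(translation_frequency(pairs, "sustainable", ["trajnost", "vzdrž"]))
```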

If you want to see the bilingual aligned version of a particular act, enter the appropriate ID (e.g., Celex) number and put the most frequent English word (e.g., the) into the "Search query in English" field. This should return the major part of the act; note, however, that the sequence of sentences on the corpus output page is usually not the same as in the original document.

The corpus can sometimes provide an answer on punctuation. Suppose you want to check whether there is a comma in front of the English word "unless". First count all hits by entering " unless" as the search query (without the double quotes; the space in front of the word is important because it eliminates cases in which "unless" appears at the beginning of the sentence). Then count the hits with a comma preceding the word by entering ", unless" (again without the double quotes) in the entry field. The difference between these two numbers indicates the number of cases in which the search word was not preceded by a comma.
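The arithmetic behind this tip can be sketched in Python as follows. The function name comma_usage and the sample sentences are invented for illustration, and counting each matching unit as one hit is an assumption about how the program counts.

```python
def comma_usage(units, word="unless"):
    """Return (all_hits, with_comma, without_comma) for a word.

    all_hits   counts units containing " word" (leading space, so a
               capitalised word at the start of a sentence is excluded).
    with_comma counts units containing ", word".
    Assumption: a unit counts as one hit however often it matches.
    """
    all_hits = sum(1 for unit in units if " " + word in unit)
    with_comma = sum(1 for unit in units if ", " + word in unit)
    return all_hits, with_comma, all_hits - with_comma


# Invented sentences, not corpus content.
units = [
    "Do not proceed, unless told otherwise.",
    "Stay unless asked to leave.",
    "Unless it rains, we go.",
]
print(comma_usage(units))
```

Every unit that matches ", unless" also matches " unless", which is why the subtraction gives the number of hits without a comma.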
Fast search during translation

Please send any comments regarding the corpus to the Main Administrative Office of GSV.