Biblio at Institute of Formal and Applied Linguistics
Institutional Repository
249 Publications
OpenAIRE 3.0 (OA, funding)
Detailed data provider information (OpenDOAR)


  • Selecting Data for English-to-Czech Machine Translation

    Bojar, Ondřej; Tamchyna, Aleš; Kamran, Amir; Galuščáková, Petra; Stanojević, Miloš (2012)
    Projects: EC | EUROMATRIXPLUS (231720)
    We provide a few insights on data selection for machine translation. We evaluate the quality of the new CzEng 1.0, a parallel data source used in WMT12. We describe a simple technique for reducing the out-of-vocabulary rate after phrase extraction. We discuss the benefits of tuning towards multiple reference translations for the English-Czech language pair. We introduce a novel approach to data selection by full-text indexing and search: we select sentences similar to th...

    NMT at CUNI

    Bojar, Ondřej (2017)
    Projects: EC | QT21 (645452)
    A summary of the development of neural MT at Charles University.

    Representing Layered and Structured Data in the CoNLL-ST Format

    Štěpánek, Jan; Straňák, Pavel (2010)
    Projects: EC | EUROMATRIXPLUS (231720)
    In this paper, we investigate the CoNLL Shared Task format, its properties, and the possibility of using it for complex annotations. We argue that, perhaps despite the original intent, it is one of the most important current formats for syntactically annotated data. We show the limits of the CoNLL-ST data format in its current form and propose several simple enhancements that push those limits further and make the format more robust and future-proof. We analyse several different linguistic ...

    Automatic Source Code Reduction

    Diviš, Jiří; Bojar, Ondřej (2010)
    Projects: EC | EUROMATRIXPLUS (231720)
    The aim of this paper is to introduce Reductor, a program that automatically removes unused parts of the source code of valid programs written in the Mercury language. Reductor implements two main kinds of reductions: statical reduction and dynamical reduction. In the statical reduction, Reductor exploits semantic analysis of the Melbourne Mercury Compiler to find routines which can be removed from the program. Dynamical reduction of routines additionally uses Mercury Deep Profiler and some ...

    CUNI System for WMT17 Automatic Post-Editing Task

    Variš, Dušan; Bojar, Ondřej (2017)
    Projects: EC | QT21 (645452), EC | HimL (644402)
    Following up on last year's CUNI system for automatic post-editing of machine translation output, we focus on exploiting the potential of sequence-to-sequence neural models for this task. In this system description paper, we compare several encoder-decoder architectures on smaller-scale models and present the system we submitted to the WMT 2017 Automatic Post-Editing shared task based on this preliminary comparison. We also show how simple inclusion of synthetic data can improve the overa...
  • No data provider research data found