Name: LINDAT/CLARIN repository
Type: Data Repository
Items: 34 Research Data
Compatibility: OpenAIRE Data (funded, referenced datasets)
OAI-PMH: http://lindat.mff.cuni.cz/repository/oai/openaire_data
More information: Detailed data provider information (Re3data)

 

  • No data provider publications found
  • English-Hindi Parallel Corpus

    Bojar, Ondřej; Straňák, Pavel; Zeman, Daniel; Jain, Gaurav; Damani, Om Prakesh (2010)
    Publisher: Charles University in Prague, UFAL
    Projects: EC | EUROMATRIXPLUS (231720)
    Embargo end date: 2011/11/07
    English-Hindi parallel corpus collected from several sources. Tokenized and sentence-aligned. A part of the data is our patch for the Emille parallel corpus.

    WMT16 Quality Estimation Shared Task Training and Development Data

    Specia, Lucia; Logacheva, Varvara; Scarton, Carolina (2016)
    Publisher: University of Sheffield
    Projects: EC | QT21 (645452)
    Embargo end date: 2016/02/29
    Training and development data for the WMT16 QE task. Test data will be published as a separate item. This shared task will build on its previous four editions to further examine automatic methods for estimating the quality of machine translation output at run-time, without relying on reference translations. We include word-level, sentence-level and document-level estimation. The sentence and word-level tasks will explore a large dataset produced from post-editions by professional translat...

    Manually Classified Errors in Cs->Sk Translation

    Galuščáková, Petra; Bojar, Ondřej (2012)
    Publisher: Charles University in Prague, UFAL
    Projects: EC | EUROMATRIXPLUS (231720)
    Embargo end date: 2012/05/15
    Manual classification of errors of Czech-Slovak translation according to the classification introduced by Vilar et al. [1]. First 50 sentences from WMT 2010 test set were translated by 5 MT systems (Česílko, Česílko2, Google Translate and two Moses setups) and MT errors were manually marked and classified. Classification was applied in MT systems comparison [3]. Reference translation is included. References: [1] David Vilar, Jia Xu, Luis Fernando D’Haro and Hermann Ney. Error Analysi...

    Depfix: Automatic Post-editing of SMT

    Rosa, Rudolf (2015)
    Publisher: Charles University in Prague, UFAL
    Projects: EC | QTLEAP (610516)
    Embargo end date: 2015/01/29
    Depfix, a tool for Automatic Post-editing of SMT. See the project website for more information.

    Hindi Web Texts

    Bojar, Ondřej; Straňák, Pavel; Zeman, Daniel (2011)
    Publisher: Charles University in Prague, UFAL
    Projects: EC | EUROMATRIXPLUS (231720)
    Embargo end date: 2011/11/23
    A Hindi corpus of texts downloaded mostly from news sites. Contains both the original raw texts and an extensively cleaned-up and tokenized version suitable for language modeling. 18M sentences, 308M tokens.