Remember Me
Or use your Academic/Social account:


Or use your Academic/Social account:


You have just completed your registration at OpenAire.

Before you can login to the site, you will need to activate your account. An e-mail will be sent to you with the proper instructions.


Please note that this site is currently undergoing Beta testing.
Any new content you create is not guaranteed to be present to the final version of the site upon release.

Thank you for your patience,
OpenAire Dev Team.

Close This Message


Verify Password:
Verify E-mail:
*All Fields Are Required.
Please Verify You Are Human:
fbtwitterlinkedinvimeoflicker grey 14rssslideshare1
Biblio at Institute of Formal and Applied Linguistics
Institutional Repository
260 Publications
OpenAIRE 3.0 (OA, funding)
More information
Detailed data provider information (OpenDOAR)


  • Towards an Indonesian-English SMT System: A Case Study of an Under-Studied and Under-Resourced Language, Indonesian

    This paper describes a work on preparing an Indonesian-English Statistical Machine Translation (SMT) System. It includes the creation of Indonesian morphological analyzer, MorphInd, and the composing of an Indonesian-English parallel corpus, IDENTIC. We build an SMT system using the state-of-the-art phrase-based SMT system, MOSES. We show several scenarios where the morphological tool is used to incorporate morphological information in the SMT system trained with the composed parallel corpus.

    Morphological Processing for English-Tamil Statistical Machine Translation

    Various experiments from literature suggest that in statistical machine translation (SMT), applying either pre-processing or post-processing to morphologically rich languages leads to better translation quality. In this work, we focus on the English-Tamil language pair. We implement suffix-separation rules for both of the languages and evaluate the impact of this preprocessing on translation quality of the phrase-based as well as hierarchical model in terms of BLEU score and a small manual...

    Bilingual Embeddings and Word Alignments for Translation Quality Estimation

    This paper describes our submission UFAL MULTIVEC to the WMT16 Quality Estimation Shared Task, for EnglishGerman sentence-level post-editing effort prediction and ranking. Our approach exploits the power of bilingual distributed representations, word alignments and also manual post-edits to boost the performance of the baseline QuEst++ set of features. Our model outperforms the baseline, as well as the winning system in WMT15, Referential Translation Machines ...

    A corpus-based finite-state morphological toolkit for contemporary Arabic

    We develop an open-source large-scale finite-state morphological processing toolkit (AraComLex) for Modern StandardArabic (MSA) distributed under the GPLv3 license (http://aracomlex.sourceforge.net). The morphological transducer is based on a lexical database specifically constructed for this purpose. In contrast to previous resources, the database is tuned to MSA, eliminating lexical entries no longer attested in contemporary use. The database is built using a corpus of 1,089,111,204 word toke...

    Approximating a Deep-Syntactic Metric for MT Evaluation and Tuning

    SemPOS is an automatic metric of machine translation quality for Czech and English focused on content words. It correlates well with human judgments but it is computationally costly and hard to adapt to other languages because it relies on a deep-syntactic analysis of the system output and the reference. To remedy this, we attempt at approximating SemPOS using only tagger output and a few heuristics. At a little expense in correlation to human judgments, we can e...
  • No data provider research data found
  • Latest Documents Timeline

    Chart is loading... It may take a bit of time. Please be patient and don't reload the page.

    Document Types

    Chart is loading... It may take a bit of time. Please be patient and don't reload the page.

    Funders in data provider publications

    Chart is loading... It may take a bit of time. Please be patient and don't reload the page.

    Projects with most Publications

    Chart is loading... It may take a bit of time. Please be patient and don't reload the page.

Share - Bookmark