






Future Fellowships - Grant ID: FT130101105

ARC | Future Fellowships


  • Towards Robust and Privacy-preserving Text Representations

    Li, Yitong; Baldwin, Timothy; Cohn, Trevor (2018)
    Projects: ARC | Future Fellowships - Grant ID: FT130101105 (FT130101105)
    Written text often provides sufficient clues to identify the author, their gender, age, and other important attributes. Consequently, the authorship of training and evaluation corpora can have unforeseen impacts, including differing model performance for different user groups, as well as privacy implications. In this paper, we propose an approach to explicitly obscure important author characteristics at training time, such that representations learned are invariant to these attributes. Evalua...

  • What's in a Domain? Learning Domain-Robust Text Representations using Adversarial Training

    Li, Yitong; Baldwin, Timothy; Cohn, Trevor (2018)
    Projects: ARC | Future Fellowships - Grant ID: FT130101105 (FT130101105)
    Most real-world language problems require learning from heterogeneous corpora, raising the problem of learning robust models which generalise well to both similar (in-domain) and dissimilar (out-of-domain) instances to those seen in training. This requires learning an underlying task, while not learning irrelevant signals and biases specific to individual domains. We propose a novel method to optimise both in- and out-of-domain accuracy based on joint learning of a structured neural model with...

  • Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees

    Shareghi, Ehsan; Petri, Matthias; Haffari, Gholamreza; Cohn, Trevor (2016)
    Projects: ARC | Future Fellowships - Grant ID: FT130101105 (FT130101105)
    Efficient methods for storing and querying are critical for scaling high-order n-gram language models to large corpora. We propose a language model based on compressed suffix trees, a representation that is highly compact and can be easily held in memory, while supporting queries needed in computing language model probabilities on-the-fly. We present several optimisations which improve query runtimes up to 2500x, despite only incurring a modest increase in construction time and memory usage. ...

  • Learning when to trust distant supervision: An application to low-resource POS tagging using cross-lingual projection

    Fang, Meng; Cohn, Trevor (2016)
    Projects: ARC | Future Fellowships - Grant ID: FT130101105 (FT130101105)
    Cross-lingual projection of linguistic annotation suffers from many sources of bias and noise, leading to unreliable annotations that cannot be used directly. In this paper, we introduce a novel approach to sequence tagging that learns to correct the errors from cross-lingual projection using an explicit debiasing layer. This is framed as joint learning over two corpora, one tagged with gold standard and the other with projected tags. We evaluated with only 1,000 tokens tagged with gold stand...

  • Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary

    Fang, Meng; Cohn, Trevor (2017)
    Projects: ARC | Future Fellowships - Grant ID: FT130101105 (FT130101105)
    Cross-lingual model transfer is a compelling and popular method for predicting annotations in a low-resource language, whereby parallel corpora provide a bridge to a high-resource language and its associated annotated corpora. However, parallel data is not readily available for many languages, limiting the applicability of these approaches. We address these drawbacks in our framework which takes advantage of cross-lingual word embeddings trained solely on a high coverage bilingual dictionary....

  • Learning Robust Representations of Text

    Li, Yitong; Cohn, Trevor; Baldwin, Timothy (2016)
    Projects: ARC | Future Fellowships - Grant ID: FT130101105 (FT130101105)
    Deep neural networks have achieved remarkable results across many language processing tasks, however these methods are highly sensitive to noise and adversarial attacks. We present a regularization based method for limiting network sensitivity to its inputs, inspired by ideas from computer vision, thus learning models that are more robust. Empirical evaluation over a range of sentiment datasets with a convolutional neural network shows that, compared to a baseline model and the dropout method...

  • A Stochastic Decoder for Neural Machine Translation

    The process of translation is ambiguous, in that there are typically many valid translations for a given sentence. This gives rise to significant variation in parallel corpora; however, most current models of machine translation do not account for this variation, instead treating the problem as a deterministic process. To this end, we present a deep generative model of machine translation which incorporates a chain of latent variables, in order to account for local lexical and syntact...

  • Exploring Prediction Uncertainty in Machine Translation Quality Estimation

    Beck, Daniel; Specia, Lucia; Cohn, Trevor (2016)
    Projects: ARC | Future Fellowships - Grant ID: FT130101105 (FT130101105), EC | QT21 (645452)
    Machine Translation Quality Estimation is a notoriously difficult task, which lessens its usefulness in real-world translation environments. Such scenarios can be improved if quality predictions are accompanied by a measure of uncertainty. However, models in this task are traditionally evaluated only in terms of point estimate metrics, which do not take prediction uncertainty into account. We investigate probabilistic methods for Quality Estimation that can provide well-calibrated uncertainty...
  • No project research data found
  • Scientific Results

    [Charts: Publications in Repositories]