Understanding CERIF - a Use Case

Interview with Nikos Houssos, Greek National Documentation Centre (EKT-NHRF)


Q: In the OpenAIREplus project EKT-NHRF coordinates the euroCRIS contribution to OpenAIREplus. Can you please tell us more about this collaboration?

Yes, in the OpenAIREplus project EKT-NHRF acts as the liaison to euroCRIS, an organization that studies Current Research Information Systems (CRIS) and is the custodian of the EU-recommended CERIF data model for research information. Key euroCRIS experts are directly supported by EKT in the context of OpenAIREplus.

Q: The OpenAIREplus project establishes connections with other infrastructures and several diverse forms of research content systems – CRISs are one of them – to enable harvesting of their resources. One of the project tasks is streamlining the OpenAIREplus data model with the euroCRIS CERIF (Common European Research Information Format) standard, wherever there is conceptual overlap between the two. EKT-NHRF is working to improve the metadata used in OpenAIREplus based on CERIF. Can you tell us about some progress in this area?

OpenAIREplus establishes connections with, and allows harvesting of, several diverse forms of research content systems and infrastructures, including CRIS systems. CRIS systems maintain contextual metadata that provides semantic linkages between, among others, publications, data sets, projects, funding programmes, people and organisations. The standard data model for CRIS systems in Europe is CERIF, which is supported by about 200 institutional, disciplinary and national systems across Europe. OpenAIREplus will support import and export of data in CERIF XML so that (a) the information in CRIS systems is ingested into the OpenAIRE portal, and (b) third-party systems (including CRIS systems) are able to utilise OpenAIREplus data in other applications. Furthermore, the native OpenAIREplus data model design adopts CERIF so that it takes advantage of various CERIF strengths, such as the flexibility in capturing semantic relationships among entities (CERIF Semantic Layer) and the representation of complex, arbitrary entity structures (e.g. for funding programmes).

Q: Can you give us a use-case scenario?

The OpenAIRE portal needs to maintain the linkages of publications and data sets with projects and the respective funding programmes, both at the EU and national level. The funding programmes structure is different for each country. However, due to the flexibility of CERIF, the OpenAIRE portal is able to represent these different structures in a single system. Furthermore, connections of publications and data sets to projects and funding programmes are already available in national CRIS systems, for example the CRIStin system in Norway. The OpenAIRE portal harvests this information for Norway from CRIStin in CERIF XML. On the other hand, institutional and CRIS systems (e.g. the national Greek CRIS system) can benefit from the information provided by the OpenAIRE API, for example to assist the self-archiving of publications and their links to EC projects and funding programmes by populating lists of projects and funding programmes in data entry forms.
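As a rough illustration of how funding structures of different depth can live in one system, the sketch below stores every hierarchy as generic "part of" links and walks them upwards. It is not CERIF XML and not the OpenAIRE implementation; the identifiers, the `PART_OF` list and the `ancestors` helper are all hypothetical names chosen for this example.

```python
# Hypothetical (child, parent) "part of" links. The EU chain has three
# levels above the project, the national chain only one, yet both fit
# the same structure.
PART_OF = [
    ("eu:project-A", "eu:sub-programme-X"),
    ("eu:sub-programme-X", "eu:programme-P"),
    ("eu:programme-P", "eu:framework-F"),
    ("no:project-B", "no:programme-Q"),
]


def ancestors(node, part_of):
    """Return the chain of funding units above `node`, nearest first.

    Assumes each unit has at most one parent and that the links are acyclic.
    """
    parent = dict(part_of)
    chain = []
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


print(ancestors("eu:project-A", PART_OF))
print(ancestors("no:project-B", PART_OF))
```

Because the depth of a hierarchy is a property of the data rather than of the schema, adding a country with a different programme structure only means adding more links.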

Q: In the OpenAIREplus project a link to the CERIF model and CRIS systems is studied in the context of enhanced publications – born-digital information objects which represent combinations of article descriptions (article metadata in OpenAIRE), research data descriptions, further articles (e.g., citations) and possibly other enhanced publications. Can you please tell us a bit more about this?

CERIF can represent enhanced publications, since it naturally captures semantic relationships among multiple objects and enables the generation from CERIF data of any other representation of enhanced publications. Specific CERIF features that make it suitable for utilisation in the enhanced publications context are the following: (a) each component of an enhanced publication represented in CERIF has a unique ID; (b) CERIF can link together these components in multiple ways using its temporal and role-based linkage structure; (c) CERIF can represent all the currently known components of an enhanced publication; and (d) CERIF naturally handles multilingual, multimedia information and versioning.
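Features (a) and (b) can be pictured with a minimal sketch: components carry unique IDs, and links between them carry a role and an optional validity period. This is only an illustration of the idea, not the CERIF schema itself; the `Link` class, the role names and the identifiers are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Link:
    """A role-based, time-bounded link between two identified components."""
    source_id: str
    target_id: str
    role: str                      # meaning is supplied by the user of the model
    start: Optional[date] = None   # None means "no lower bound"
    end: Optional[date] = None     # None means "no upper bound"

    def active_on(self, day: date) -> bool:
        after_start = self.start is None or self.start <= day
        before_end = self.end is None or day <= self.end
        return after_start and before_end


def components_of(links, publication_id, day):
    """Return sorted (role, target_id) pairs linked from a publication on `day`."""
    return sorted(
        (link.role, link.target_id)
        for link in links
        if link.source_id == publication_id and link.active_on(day)
    )


# Hypothetical enhanced publication: one article, two data sets, one citation.
links = [
    Link("pub-1", "dataset-7", "has-supporting-dataset", date(2012, 1, 1)),
    Link("pub-1", "pub-2", "cites", date(2012, 1, 1)),
    Link("pub-1", "dataset-3", "has-supporting-dataset",
         date(2011, 1, 1), date(2011, 12, 31)),
]

print(components_of(links, "pub-1", date(2012, 6, 1)))
```

Querying the same publication for a date in 2011 returns only `dataset-3`, which shows how the time bounds on links allow different versions of an enhanced publication to coexist.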

A comment from OpenAIRE's Technical Manager, Paolo Manghi, Istituto di Scienza e Tecnologie dell'Informazione "A. Faedo" – CNR:
"OpenAIRE proposes an enhanced publication model, which includes specific relationships between a “main publication” and a set of other digital objects, aggregations of such objects, and web resources. In that it provides a very specific model that (presumably and hopefully) captures the needs of researchers in constructing such information packages for others to use. This is different from CERIF, whose aim is to provide a model abstracting over all possible associations between entities involved in the research funding and production chain (publications, datasets, people, organizations, funding schemes, etc.). This is why the CERIF model is designed to be semantic-agnostic, which means semantics is injected by the users of the model to capture their specific needs. Now, since the entities adopted by the enhanced publication model overlap with those of CERIF (not sure about “aggregation objects”, but for sure we can find a mapping into dedicated CERIF objects), CERIF can be instantiated by injecting the semantics of the enhanced publication model and therefore be used to represent such objects. This means that enhanced publications are CERIF compliant."

