Integration of OpenAIRE Graph on OiPub SME platform

EOSC-DIH business pilots in practice

Overview

OiPub is an academic research discovery and discussion platform built around a library of research publications.

Challenge & Scenario

Prior to the EOSC-DIH pilot, OiPub used research paper metadata from the Directory of Open Access Journals (DOAJ) for initial system testing and development (circa 8.5M publications). This library needed to be expanded to give users access to all research paper metadata. The OpenAIRE Graph made it possible to expand the metadata library available to users massively, to 160M publications and growing. This dataset needed to be processed and integrated to make it compatible with OiPub’s topic-based broadcasting and sorting system. Publication metrics also needed to be calculated, adapted, and integrated so that the information shown to users is relevant and important for the community or search applied.

Solution & Implementation

The OpenAIRE Graph data dump was processed and integrated to make it compatible with OiPub’s design. OiPub was supported in this work by EGI computation resources provided through EOSC-DIH, as well as by OpenAIRE expert support and feedback whenever questions arose. Additional OpenAIRE services such as OpenCitations were investigated for publication metrics and additional academic data.

Impact

After the successful integration of the OpenAIRE Graph data, OiPub will allow users to explore, discuss and share the wealth of information it provides. The collaboration grew OiPub’s publications library from the 8.5M publications of the initial DOAJ dataset to the more than 160 million publications provided by the OpenAIRE Graph. In addition, 3.4 billion keyword-publication links with related scores were extracted as part of the data analytics work. This information will be used to categorise and organise all the data from the OpenAIRE Graph, allowing users of OiPub to find all information and discussion about one topic in one place and broadcast this information in flexible, automatically populated communities.

In depth description

Details

The OpenAIRE support, along with that of the EOSC Digital Innovation Hub (EOSC DIH), was vital in developing OiPub’s pilot release and the key data systems it is based on. The aim of the pilot was to investigate the services provided by OpenAIRE and adapt the data these services provide into a pilot version of OiPub. The data and services investigated included the OpenAIRE Graph, OpenCitations and ScholeXplorer.

Downloading the OpenAIRE Graph for initial investigation and analysis was made simple thanks to the OpenAIRE Graph data dump and metadata schema. Similar data dumps are available for OpenCitations and ScholeXplorer. All of the above datasets provide immense value for researchers and the open science movement. The initial investigations allowed the team to determine the structure and plans for the data systems for the pilot, as well as possible future iterations. The focus for the remaining work during this pilot would be on the OpenAIRE Graph.

The next part of the pilot involved adapting the OpenAIRE Graph data to work with OiPub’s design concept. OiPub had already built a keyword dataset of 150 thousand keywords. The next step used Natural Language Processing (NLP) for recognition and relation scoring, linking each of the more than 160 million publication records in the OpenAIRE Graph with OiPub’s keyword dataset. This computational work was supported by tools, support and systems provided by EGI through the EOSC DIH. The NLP analysis and tagging resulted in 3.4 billion publication-keyword links that, along with the OpenAIRE Graph publication metadata, form the foundations of OiPub’s research categorisation and broadcasting system.
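
The keyword linking step can be illustrated with a small sketch. This is not OiPub’s actual pipeline, which the case study does not describe in detail; it only shows one simple way to recognise known keywords in a publication record and attach a relation score to each link. The record fields (`id`, `title`, `abstract`), the function names and the weights are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical weights: a title match counts more than an abstract match.
TITLE_WEIGHT = 3.0
ABSTRACT_WEIGHT = 1.0


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_keyword_index(keywords):
    """Map each normalised keyword phrase back to its original spelling."""
    return {" ".join(tokenize(k)): k for k in keywords if tokenize(k)}


def add_matches(tokens, index, max_len, weight, scores):
    """Look up every n-gram of the text in the keyword index."""
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in index:
                scores[index[phrase]] += weight


def link_keywords(publication, index):
    """Return (publication id, keyword, score) links, highest score first."""
    max_len = max((len(p.split()) for p in index), default=0)
    scores = defaultdict(float)
    add_matches(tokenize(publication.get("title", "")), index, max_len,
                TITLE_WEIGHT, scores)
    add_matches(tokenize(publication.get("abstract", "")), index, max_len,
                ABSTRACT_WEIGHT, scores)
    links = [(publication["id"], kw, s) for kw, s in scores.items()]
    return sorted(links, key=lambda link: -link[2])


# Illustrative record and keywords, not real OpenAIRE Graph data.
publication = {
    "id": "pub-1",
    "title": "Deep learning for protein folding",
    "abstract": "We apply deep learning to protein structure. "
                "Protein folding remains hard.",
}
index = build_keyword_index(["deep learning", "protein folding", "graph theory"])
links = link_keywords(publication, index)
```

Looking up each n-gram of the text in a keyword index, rather than scanning the text once per keyword, keeps the cost per publication independent of the size of the keyword list, which matters when the list holds 150 thousand entries and the corpus more than 160 million records.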

This pilot was focused on the data science and back-end foundations of OiPub. In parallel, OiPub has been developing its front-facing systems.

OiPub’s design concept revolves around automatically populated information hubs and streams of information for discovery and discussion of every research niche. This will allow researchers to find, follow and discuss research far more easily.
OiPub is releasing an open beta for users by the end of 2023, available at https://oipub.com/

We want to hear from you

If you find the case study useful, contact us so we can guide you through the process.