Results from three Canadian funders in one portal
Building a national portal easily
In depth description
Details
Stage 1. Identifying research results through customised text mining/natural language processing
To overcome the gaps in Canadian funding information, OpenAIRE developed a text mining module that identifies, with high accuracy, links between publications and the three main Canadian federal research funding agencies ("Tri-Agency"):
- Canadian Institutes of Health Research (CIHR)
- Natural Sciences and Engineering Research Council (NSERC)
- Social Sciences and Humanities Research Council (SSHRC)
The mining algorithm takes into account all alternative names and acronyms of these funders, in both English and French. As a next step, the mining module is being updated to also infer links to National Research Council Canada (NRC) funding.
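The idea of matching funder names and acronyms in both languages can be sketched as follows. This is a minimal illustration and not OpenAIRE's actual mining module: the variant list is deliberately short, the function names are hypothetical, and a production system would also need context checks (for example, restricting matches to acknowledgement sections) to reach high accuracy.

```python
import re
import unicodedata

# Illustrative name variants only; a real list would be much longer.
FUNDER_VARIANTS = {
    "CIHR": [
        "Canadian Institutes of Health Research",
        "Instituts de recherche en santé du Canada",
        "CIHR",
        "IRSC",
    ],
    "NSERC": [
        "Natural Sciences and Engineering Research Council",
        "Conseil de recherches en sciences naturelles et en génie",
        "NSERC",
        "CRSNG",
    ],
    "SSHRC": [
        "Social Sciences and Humanities Research Council",
        "Conseil de recherches en sciences humaines",
        "SSHRC",
        "CRSH",
    ],
}


def strip_accents(text):
    """Remove diacritics so that 'génie' and 'genie' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_pattern(name):
    """Compile one variant: acronyms are case-sensitive, full names are not."""
    words = [re.escape(word) for word in strip_accents(name).split()]
    flags = 0 if name.isupper() else re.IGNORECASE
    # Allow any run of whitespace (including line breaks) between words.
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", flags)


PATTERNS = {
    funder: [build_pattern(name) for name in names]
    for funder, names in FUNDER_VARIANTS.items()
}


def find_funders(text):
    """Return the sorted funder acronyms mentioned anywhere in the text."""
    plain = strip_accents(text)
    return sorted(
        funder
        for funder, patterns in PATTERNS.items()
        if any(pattern.search(plain) for pattern in patterns)
    )
```

Calling `find_funders` on an acknowledgement written in French that names the Conseil de recherches en sciences naturelles et en génie du Canada would report `NSERC`, since both language variants map to the same funder key.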
Stage 2. Addressing the issue of organisation disambiguation
Organisations appear throughout the Research & Innovation ecosystem in different shapes and formats: the same organisation may appear under different names (e.g., full legal name, short or alternative names, acronym) and in different metadata fields across data sources. Persistent identifiers may be of no help when different data sources identify organisations according to different PID schemes. Because this ambiguity greatly affects the building of the OpenAIRE Research Graph, OpenOrgs, the OpenAIRE service for bridging registries of research organisations, was used.
OpenOrgs supports the disambiguation of organisations with both automated processes and human curation: an algorithm does the first part of the task, suggesting likely duplicates based on the degree of similarity in their metadata. A curator then confirms or rejects each suggestion. This is a key task for creating a solid information space that enables the construction of monitoring services at the institutional and, by extension, the regional level. OpenOrgs is available to Canadian members to help them improve regional coverage and increase the efficiency of scholarly communication.
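The two-step workflow (automatic suggestion, then human confirmation) can be sketched as below. This is not the OpenOrgs algorithm: the string similarity from Python's standard `difflib`, the 0.85 threshold, the function names and the record identifiers are all assumptions chosen for illustration, and the real service compares richer metadata than names alone.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(name):
    """Lowercase and collapse whitespace before comparing names."""
    return " ".join(name.lower().split())


def suggest_duplicates(orgs, threshold=0.85):
    """Return (id_a, id_b, score) for every pair at or above the threshold.

    `orgs` maps a record identifier to an organisation name.
    """
    suggestions = []
    for (id_a, name_a), (id_b, name_b) in combinations(orgs.items(), 2):
        score = SequenceMatcher(None, normalise(name_a), normalise(name_b)).ratio()
        if score >= threshold:
            suggestions.append((id_a, id_b, score))
    # Show the most similar pairs to the curator first.
    return sorted(suggestions, key=lambda item: item[2], reverse=True)


def apply_curation(suggestions, decisions):
    """Keep only the pairs a curator explicitly confirmed.

    `decisions` maps (id_a, id_b) to True (confirmed) or False (rejected);
    pairs without a decision stay unmerged.
    """
    return [(a, b) for a, b, _ in suggestions if decisions.get((a, b)) is True]


# Hypothetical records from three different sources.
orgs = {
    "sourceA:1": "University of Toronto",
    "sourceB:9": "University of  Toronto.",
    "sourceC:4": "McGill University",
}
suggestions = suggest_duplicates(orgs)
confirmed = apply_curation(suggestions, {("sourceA:1", "sourceB:9"): True})
```

Keeping the curator's decision as a separate step means the threshold can be set fairly low without risking wrong merges, at the cost of more suggestions to review.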
Stage 3. Identifying the Canadian subset of the OpenAIRE Research Graph
The outputs of Stages 1 and 2 are used to identify the subset of the OpenAIRE Research Graph that is relevant for Canada. In particular, a research product is considered Canadian if it meets at least one of the following criteria:
- all research products linked to a Canadian funder;
- all research products authored by a researcher affiliated with a Canadian organisation;
- all research products deposited in an institutional repository of a Canadian institution;
- all research products available via the Canadian national aggregators (Canada Research and Federated Research Data Repository (FRDR)).
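The four criteria above combine as a logical OR, which can be sketched as a simple predicate. The `ResearchProduct` record and its field names are hypothetical and do not reflect the actual data model of the OpenAIRE Research Graph; country codes are assumed to be ISO 3166-1 alpha-2.

```python
from dataclasses import dataclass, field

CANADIAN_FUNDERS = {"CIHR", "NSERC", "SSHRC"}


@dataclass
class ResearchProduct:
    """Hypothetical, simplified view of a research product record."""
    title: str
    funders: set = field(default_factory=set)
    # Countries of the organisations the authors are affiliated with.
    affiliation_countries: set = field(default_factory=set)
    # Countries of the institutional repositories holding a deposit.
    repository_countries: set = field(default_factory=set)
    # Names of the aggregators the record was collected from.
    collected_from: set = field(default_factory=set)


def is_canadian(product, national_aggregators):
    """True if the product meets at least one of the four criteria."""
    return (
        bool(product.funders & CANADIAN_FUNDERS)
        or "CA" in product.affiliation_countries
        or "CA" in product.repository_countries
        or bool(product.collected_from & national_aggregators)
    )


aggregators = {"FRDR"}
products = [
    ResearchProduct("funded", funders={"NSERC"}),
    ResearchProduct("affiliated", affiliation_countries={"CA", "FR"}),
    ResearchProduct("aggregated", collected_from={"FRDR"}),
    ResearchProduct("unrelated", affiliation_countries={"FR"}),
]
canadian_titles = [p.title for p in products if is_canadian(p, aggregators)]
```

Because the funder and affiliation checks depend on Stages 1 and 2, any improvement in funder mining or organisation disambiguation directly widens the subset this predicate selects.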
Stage 4. Creating the Canadian Explore portal
To give a better overview of the collaboration and the effort of both teams, and to give all users access to Canadian research results, OpenAIRE set up the customised CANADA.EXPLORE portal, available at http://canada.explore.openaire.eu/. The portal presents the Canadian subset of the OpenAIRE Research Graph.
The search bar allows users to search, browse and access Canadian research products. The home page of the Canadian portal has its own identity, with a dedicated logo and colours. It also presents information about the collaboration with OpenAIRE and statistics about the content. Furthermore, a dedicated page for developers explains how to use the OpenAIRE API to retrieve the Canadian results.
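As a rough illustration of programmatic access, the sketch below only builds a query URL and performs no network request. The endpoint `https://api.openaire.eu/search/publications` and the parameter names `funder`, `format`, `page` and `size` are assumptions here and should be checked against the portal's developers page before use; the function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the developers page of the portal.
BASE_URL = "https://api.openaire.eu/search/publications"


def build_query_url(funder, page=1, size=10):
    """Build a search URL for publications linked to one funder."""
    params = {
        "funder": funder,   # assumed parameter, e.g. "NSERC"
        "format": "json",   # assumed parameter selecting the response format
        "page": page,
        "size": size,
    }
    return BASE_URL + "?" + urlencode(params)


url = build_query_url("NSERC")
```

The resulting URL can then be fetched with any HTTP client, paging through results by increasing `page`.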
The pilot stage of the CARL-OpenAIRE collaboration has been completed. The next step is to encourage and support all Canadian institutional repositories to adopt the OpenAIRE Guidelines so that they can be harvested by the OpenAIRE infrastructure. This next phase was launched with a webinar series that introduced the OpenAIRE services to Canadian stakeholders and provided practical support for repositories to become compliant with the OpenAIRE Guidelines.