OpenOrgs is a new tool created to solve a long-standing problem: the disambiguation of organizations involved in the research process. In particular, OpenOrgs addresses the ambiguity in the information that OpenAIRE aggregates from different research organization registries (e.g. ROR, EC) and that populates the OpenAIRE Research Graph.
It works in two steps:
First, an algorithm automatically detects matches between organizations that appear in different data sources with different names, metadata, PIDs, and so on;
Second, a process of manual curation corroborates the automated process. Data curators can resolve the ambiguity of duplicates detected by the automated process by stating whether or not two or more entities correspond to the same organization. They can also enrich metadata and, where needed, suggest new duplicates, thus improving the automated process.
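The two steps above can be sketched roughly as follows. This is only an illustration and not the actual OpenOrgs algorithm: the `Org` record, the name normalization, the similarity threshold of 0.85, and the sample registry names and identifiers are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from itertools import combinations


@dataclass(frozen=True)
class Org:
    """Hypothetical organization record; field names are illustrative only."""
    source: str  # registry the record came from
    name: str
    pids: frozenset = field(default_factory=frozenset)


def normalize(name: str) -> str:
    # Lowercase and replace punctuation with spaces so trivial variants compare equal.
    kept = "".join(c if c.isalnum() or c.isspace() else " " for c in name.lower())
    return " ".join(kept.split())


def suggest_duplicates(orgs, threshold=0.85):
    """Step 1 (assumed heuristic): propose candidate pairs across different sources."""
    candidates = []
    for a, b in combinations(orgs, 2):
        if a.source == b.source:
            continue
        shared_pid = bool(a.pids & b.pids)
        score = SequenceMatcher(None, normalize(a.name), normalize(b.name)).ratio()
        if shared_pid or score >= threshold:
            candidates.append((a, b))
    return candidates


def curate(candidates, decide):
    """Step 2: a curator approves or rejects each suggested pair."""
    return [(a, b) for a, b in candidates if decide(a, b)]


# Made-up sample data, not taken from any real registry.
orgs = [
    Org("registry_a", "Example University", frozenset({"pid:0001"})),
    Org("registry_b", "Example Univ.", frozenset({"pid:0001"})),
    Org("registry_b", "Sample Institute of Technology"),
]
suggested = suggest_duplicates(orgs)
approved = curate(suggested, decide=lambda a, b: True)
```

Here `decide` stands in for the curator's judgment. In this sketch the automated step only proposes pairs, and nothing is treated as a confirmed duplicate until a curator approves it.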
Essentially, simple users can resolve duplicates by approving or rejecting equivalent orgs suggested by the algorithm. While doing so, curators can also enrich metadata if they want.
In addition to simple users' actions, national admins can approve new orgs (giving them a stable OpenOrg ID) and resolve conflicts.
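The split between the two roles can be pictured as a small permission table. This is a minimal sketch, assuming made-up role and action identifiers; it is not how OpenOrgs implements access control.

```python
from enum import Enum


class Role(Enum):
    USER = "user"
    NATIONAL_ADMIN = "national_admin"


# Actions follow the description above; the identifier strings are hypothetical.
USER_ACTIONS = {"approve_duplicate", "reject_duplicate", "enrich_metadata"}
ADMIN_ONLY_ACTIONS = {"approve_new_org", "resolve_conflict"}

PERMISSIONS = {
    Role.USER: USER_ACTIONS,
    # National admins can do everything a simple user can, plus the admin-only actions.
    Role.NATIONAL_ADMIN: USER_ACTIONS | ADMIN_ONLY_ACTIONS,
}


def can(role: Role, action: str) -> bool:
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS[role]
```

For example, `can(Role.USER, "approve_new_org")` is false in this sketch, while the same call with `Role.NATIONAL_ADMIN` is true.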
The testing phase has been completed and the system has just been moved to org.openaire.eu. This means that from now on the curation done in OpenOrgs contributes to all the OpenAIRE services, such as explore.openaire.eu and monitor.openaire.eu.
Notes help keep track of the choices made. They are especially useful when more than one curator works on a single country, since curators can leave notes about the work done. For example, a note can record which variant of a name was assigned to an organization, or which relationships were established between institutes and departments of the same organization.
National admins can see how a single organization page has been modified and by whom.
To help determine the type of organization, a set of flags imported from the European Commission portals is provided, if available.
During the OSFAIR Conference, a demo session was held in which OpenOrgs was presented, with the aim of highlighting the importance of disambiguating organizations not only for OpenAIRE services but also for building a robust Open Science ecosystem. A set of basic and advanced operations was also shown, e.g. selecting an organization with duplicates, resolving duplicates, enriching metadata, manually adding duplicates, and understanding and resolving conflicts between organizations.
Presentation. OpenOrgs: bridging registries of research organizations. Supporting disambiguation and improving the quality of data: https://zenodo.org/record/5101096#.YW7iJJ4zZ2R
Blogpost. OpenOrgs: the OpenAIRE service for bridging registries of research organisations: https://www.openaire.eu/openorgs-the-openaire-service-for-bridging-registries-of-research-organisations
OpenAIRE has received funding from the European Union's Horizon 2020 Research and Innovation programme under Grant Agreements No. 777541 and 101017452.
Unless otherwise indicated, all materials created by OpenAIRE are licenced under CC ATTRIBUTION 4.0 INTERNATIONAL LICENSE.