OpenOrgs: the OpenAIRE service for bridging registries of research organisations
Organisations appear all over the Research & Innovation ecosystem in different shapes and formats: the same organization may appear with different names (e.g., full legal name, short or alternative names, acronym) and different metadata fields in different data sources. Persistent identifiers may be of no help when different data sources identify organizations according to different PID schemes (ROR, ISNI, EC PIC numbers, etc.). Disambiguating them when trying to build a linked open scholarly communication system is not a trivial problem.
OpenAIRE developed OpenOrgs to address this ambiguity, which greatly affects the information that OpenAIRE aggregates from different research organization registries (e.g., the Research Organization Registry (ROR), the European Commission, and other funder databases) to build the OpenAIRE Research Graph. OpenOrgs was presented in the demo session at the Open Science Fair in September 2021 with two aims: first, to highlight the importance of this disambiguation task not only for OpenAIRE services but also for building a robust Open Science ecosystem; and second, to show in practice OpenOrgs' main functionalities and how users can interact with them.
OpenOrgs mixes automated processes and human curation. An algorithm does the first part of the work, grouping organizations with a certain degree of similarity in their metadata. After that, the curator has to determine whether the suggested identity between the grouped records is real or not. In most cases, it is enough to look at the overall metadata collected from several sources to accept or reject the identity. In some cases (and this is why it is impossible to rely on automation alone) a user may need additional knowledge of the organisation or country to understand whether two organization entities are branches of the same organization or two independent institutions. Users can also suggest new duplicates that the algorithm has not found, and curate and enrich metadata.
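The two-stage workflow described above can be illustrated with a minimal sketch. It does not reproduce the OpenOrgs algorithm: the similarity measure (a plain string ratio on normalised names, restricted to records from the same country), the 0.8 threshold, the record fields, and all identifiers and names below are assumptions made only for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records, as they might arrive from different registries.
RECORDS = [
    {"id": "src1::001", "name": "University of Example", "country": "IT"},
    {"id": "src2::A17", "name": "University of  Example ", "country": "IT"},
    {"id": "src3::x9", "name": "Example Polytechnic", "country": "IT"},
    {"id": "src4::42", "name": "University of Example", "country": "FR"},
]


def normalise(name):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())


def similarity(a, b):
    """Score two records between 0 and 1; records in different countries score 0."""
    if a["country"] != b["country"]:
        return 0.0
    return SequenceMatcher(None, normalise(a["name"]), normalise(b["name"])).ratio()


def candidate_duplicates(records, threshold=0.8):
    """Automated step: list record pairs whose similarity reaches the threshold."""
    pairs = []
    for a, b in combinations(records, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs


def apply_curation(pairs, decisions):
    """Human step: keep only the pairs a curator explicitly accepted."""
    return [(a, b) for a, b, _ in pairs if decisions.get((a, b)) is True]


if __name__ == "__main__":
    suggested = candidate_duplicates(RECORDS)
    # The curator reviews each suggestion and records a decision.
    decisions = {(a, b): True for a, b, _ in suggested}
    print(apply_curation(suggested, decisions))
```

In this sketch the two Italian records with near-identical names are suggested as duplicates, while the record with the same name in another country is not. Cases like that last one are exactly where a curator with local knowledge would have to decide, or add the duplicate manually.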
Our next step is to involve national curators (our NOADs, the National Open Access Desks), who can make scholarly communication more efficient by resolving ambiguity about organisations.
- OSFAIR demo session: showing a set of basic and advanced operations, such as selecting an organization with duplicates, resolving duplicates and enriching metadata, manually adding duplicates, and understanding and resolving conflicts between organizations.
- Beta version of the service. To request access, please write to Gina Pavone, CNR,