Services to support FAIR data, Vienna (10th OpenAIRE workshop - 2019)
In the series of workshops on FAIR services for research data that OpenAIRE is jointly organizing with FAIRsFAIR, EOSC-hub and RDA Europe, the second workshop took place as part of the larger event ‘Linking Open Science in Austria’. The series explores how data infrastructures can work together to meet the challenges of creating, managing, opening and archiving FAIR data.
The aim of the workshops is to:
- Survey the landscape of data infrastructures seeking to integrate FAIR into their services
- Set initial recommendations and identify the challenges and priorities
The interactive workshop "Services to Support FAIR Data" had a similar structure to the first workshop in the series of three, but was aimed at researchers and research support staff.
Four implementation stories were presented, each highlighting a different aspect of the wider spectrum of services needed to support FAIR data successfully.
- Zenodo, on how to make FAIR data happen in a generic repository - Lars Holm Nielsen (CERN)
- FREYA, on PID infrastructure for allowing discovery of research - Simon Lambert (STFC)
- Wikidata, on making Wikidata FAIR - Andra Waagmeester (Micelio)
- Core Trust Seal - Natalie Harrower
The Zenodo talk explored how a generic data repository can provide services for FAIR data, based on a concrete use case: taxonomic treatments in biodiversity that were previously locked up in journal articles.
Using text and data mining, Zenodo can unlock the possibilities lying dormant in existing articles and data. After extracting data, figures and other material, Zenodo is able to give the data a FAIR treatment: adding a persistent identifier, applying a widely adopted metadata schema, and indexing and registering the data in searchable resources. The key challenge they identified was the tension between the specific requirements of domain-specific data and the generic level of a platform that welcomes all disciplines. Zenodo tries to meet this challenge by providing a stable home, which is key to findability and accessibility.
Key takeaway: Good FAIR data takes an ecosystem of independent services to help the creation of FAIR research output.
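As an illustration of the FAIR treatment described above, the following Python sketch builds a minimal metadata record for an item extracted from an article and checks that the basic fields are present. It is not Zenodo's implementation or API: the `ExtractedItem` class, the `to_fair_record` and `missing_fields` helpers, the simplified DataCite-inspired field names and the example DOIs are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ExtractedItem:
    """A figure, table or dataset mined from an article (hypothetical model)."""
    title: str
    creators: list[str]
    source_article_doi: str
    resource_type: str
    keywords: list[str] = field(default_factory=list)


def to_fair_record(item: ExtractedItem, pid: str) -> dict:
    """Build a simplified, DataCite-inspired metadata record for one item."""
    return {
        # Findable: a persistent identifier for the extracted item itself.
        "identifier": {"value": pid, "type": "DOI"},
        "title": item.title,
        "creators": [{"name": name} for name in item.creators],
        "resourceType": item.resource_type,
        "subjects": list(item.keywords),
        # Interoperable: a qualified reference back to the source article.
        "relatedIdentifiers": [
            {
                "relationType": "IsPartOf",
                "value": item.source_article_doi,
                "type": "DOI",
            }
        ],
    }


def missing_fields(record: dict) -> list[str]:
    """Return the names of required fields that are absent or empty."""
    required = ("identifier", "title", "creators", "resourceType")
    return [name for name in required if not record.get(name)]


# Placeholder values only; these DOIs are not real identifiers.
item = ExtractedItem(
    title="Figure 3 from an example taxonomic treatment",
    creators=["Example, Author"],
    source_article_doi="10.1234/example.article",
    resource_type="Image",
    keywords=["biodiversity", "taxonomy"],
)
record = to_fair_record(item, pid="10.1234/example.figure3")
print(missing_fields(record))
```

A generic repository can only enforce generic fields like these; domain-specific requirements, such as taxonomic names, would have to be layered on top by a discipline-specific service.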
The FREYA project aims to connect and integrate PID (persistent identifier) systems, creating relationships across a network of PIDs that can serve as the basis for new services. Its PID graph makes it possible to improve discovery of and access to research data, develop new types of PIDs and demonstrate disciplinary PID systems. FREYA will also be integrated with EOSC by developing a PID infrastructure that facilitates and boosts this ecosystem of PIDs. At a minimum, FREYA issues PIDs and resolves them; on top of that, it builds technical services. The combination of linked PIDs creates a PID superstructure that reveals links between research objects, authors, funders and more, making it possible to answer specific tracking questions.
Key takeaway: FAIR aspects can be found in making objects uniquely identifiable (findable), retrievable (accessible), interoperable through qualified references and re-usable by connecting entities and showing their provenance.
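The idea of answering tracking questions from linked PIDs can be sketched with a small in-memory graph. This is only an illustration of the concept, not FREYA's actual service: the `PIDGraph` class, the relation names and all identifiers below are hypothetical placeholders.

```python
from collections import defaultdict, deque


class PIDGraph:
    """Toy graph of identifiers joined by named relations (hypothetical)."""

    def __init__(self):
        self._edges = defaultdict(set)

    def link(self, source, relation, target):
        """Record a relation and its inverse so links can be followed both ways."""
        self._edges[source].add((relation, target))
        self._edges[target].add((f"inverse:{relation}", source))

    def reachable(self, start, max_hops):
        """Return every identifier within max_hops links of start (breadth-first)."""
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue  # do not expand beyond the hop limit
            for _relation, neighbour in self._edges.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, hops + 1))
        seen.discard(start)
        return seen


# Placeholder identifiers, invented for this example.
graph = PIDGraph()
graph.link("grant:G-1", "fundedBy", "funder:F-1")
graph.link("doi:10.1234/dataset-a", "fundedThrough", "grant:G-1")
graph.link("doi:10.1234/article-b", "cites", "doi:10.1234/dataset-a")
graph.link("doi:10.1234/dataset-a", "createdBy", "person:P-1")

# Tracking question: which research outputs can be traced back to funder F-1?
outputs = sorted(
    pid for pid in graph.reachable("funder:F-1", max_hops=3)
    if pid.startswith("doi:")
)
print(outputs)
```

Storing the inverse of each relation is a design choice made here so that a query can start from any entity, whether a funder, a person or a dataset, and still follow the links.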
Wikidata is a free and open knowledge base that can be read and edited by both humans and machines. Wikidata acts as central storage for the structured data of its Wikimedia sister projects, including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others. Wikidata is a stable platform integrated with the semantic web, putting resources towards semantic data modelling and data FAIRification. Their FAIRification process started with community engagement and model discussion. Data is imported from many sources and ontologies, linked to many identifiers from external databases, and loaded using platforms for (semi-)automatic data ingestion. The varied sources present a challenge, but Wikidata is trying to make data FAIR and fairly available by adding documentation, making discovery possible and sharing the data for all forms of reuse.
Key takeaway: A key problem was the difficulty of finding FAIR experts with technical and conceptual expertise who are preferably also affordable.
Core Trust Seal certification provides a collection of 16 core trustworthy data repository requirements that are mandatory, stand-alone and equally weighted. The Core Trust Seal certification is seen as an entry-level way to assess the robustness of a repository. It also functions as a way to agree on common standard criteria for repositories, since it has grass-roots, bottom-up support. This consensus is the first step in choosing a repository that is trustworthy, safe and well-structured, and it consequently fits in the move towards good data management and FAIR data. It addresses FAIR's need for long-term data stewardship and preservation.
Key takeaway: An important point, previously identified as a challenge, is that certification would shift part of the requirements, and thus the burden of making data FAIR, from the researcher to the infrastructure.
Breakout groups and panel
Breakout group questions were:
- What does it mean for researchers to make their data FAIR?
- How does/could your institution or service help researchers make their data FAIR?
- What could/should you do that you are not already doing? What is the priority?
- Where are the biggest gaps in provision?
- How does/could your institution or service reach researchers and explain the importance of FAIR?
During the discussions, the emphasis placed, often by policies, on researchers as the sole actors responsible for making data FAIR was seen as a big hurdle. It was seen as an important goal to make researchers aware of the importance of FAIR data and to shift the research culture towards a cyclic idea of research, with FAIR and open integrated into research activities rather than added on top of them. It was agreed, however, that the predominant need is for a framework of shared responsibilities.
Making data FAIR should be an aspiration linked to practice rather than a hard set of criteria to meet. Where this practice shows missing elements, it is necessary to point to the right services.
Furthermore, incentives were discussed. Since it takes a long time to adopt new methods, appropriate incentives should be in place, as well as sufficient training for both researchers and support staff.
FAIR data also offers new opportunities that have not yet been fully explored. In the long term, it can open pathways into data curation and data management rather than the creation of new data, which gives opportunities to develop new career paths along those lines. Skilled people are needed both to make that transition and in general. Basic training for researchers, combined with data stewards who possess a different skill set, was seen as another priority.
Takeaways from the panel included:
- Good FAIR data takes an ecosystem of independent services to help the creation of FAIR research output.
- PID services for a wide range of objects can help track and evaluate FAIRness. Emerging PID types should be monitored for maturity.
- A key problem was the difficulty of finding FAIR experts with technical and conceptual expertise. We recommend both entry-level data stewardship training programmes for researchers and specific training for data stewards on FAIR services.
- A challenge is the shift of part of the requirements, and thus the burden of making data FAIR, from the researcher to FAIR services and infrastructure. Setting up and participating in cross-institutional, collaborative communities of practice to advance and implement FAIR services is recommended.
This two-day event is a joint experience and the result of cooperation between OpenAIRE NOAD Austria, e-Infrastructures Austria Plus, RDA Europe, RDA Austria, the Austrian Federal Ministry of Science, Research and Innovation, and GOFAIR International. It is organized by Vienna University Library and Archive Services.