Ilona von Stein, project leader and policy officer at DANS, writes about the importance of Data Management Training.
What would happen if your data were not trusted by your own research community?
What if your research community generates 2 petabytes of raw data per day and needs to share it between institutions?
What if reuse of your data takes place in computational workflows, but data archives do not provide formats and annotations to support this?
And what if you cannot find your own data later in your career?
The challenges are ambitious, and the key is to make data Open and FAIR. The FAIR data principles (Findable, Accessible, Interoperable and Reusable) act as a guideline to enhance the reusability of data. And indeed: worldwide there is an astonishing number of tools and services that help you to “do something” with your data. But which tools and services do we have in our EOSC-hub communities that really contribute to making our data open and FAIR?
A typical research data lifecycle includes steps for re-using, creating or capturing data, processing, analysing, publishing, sharing, and preserving. EOSC-hub offers services to support each step and the training resources necessary to make a Data Management Plan – or DMP, for short.
Let’s have a look at the research data lifecycle phase ‘Access, deposition and sharing’. What does EOSC-hub contribute in this respect? A prominent example is our joint work on the development of an authentication and authorisation infrastructure (AAI) that uses existing security services offered by our EOSC-hub consortium (EGI Check-in, B2ACCESS, INDIGO-IAM). B2SHARE complements the phase as a tool to add clear usage licences to data and let anyone know under which conditions your data can be reused.
EOSC-hub also contributes to other phases of the research data lifecycle. In the ‘Data Management, curation & preservation’ phase, the project is working on federated data storage (possible integrations of storage facilities such as B2SAFE and EGI DataHub). In the ‘Process & Analysis’ phase, progress will be made on, for example, cloud computing and scientific workflow management.
DMPs help researchers and research teams to consider What data goes into a project (reuse) and what comes out of it (potential reuse), How the team takes care of the data, and Who is allowed to do What with the data When. (And yes, funders increasingly demand DMPs.) The information in a DMP covers the research data lifecycle and is essential information for your research support office and IT department or service provider: after all, the DMP specifies the services and legal support that the project needs to make the data as FAIR and Open as possible.
The EOSC-hub training team, DANS and CCFE, supports researchers and support staff in the proper management of research data (as part of EOSC-hub task 11.2). In collaboration with OpenAIRE-Advance, the team provides information about the Why and How of creating DMPs and about the DMP review process of major research funders, such as the Horizon 2020 programme. In addition to support for DMP planning, EOSC-hub also offers training on particular services.