RDM NOADs Starter kit
An extensive overview of Research Data Management resources for beginners and experts, available on the OpenAIRE portal.
Overall
Basics
- Basics of Research Data Management Webinar Beginner
- Open Research Data Pilot in Horizon 2020 Guide Beginner
- Open Science e il mandato Europeo sull’Open Access relativo alle pubblicazioni Webinar in Italian Beginner
- Open Research Data in Horizon 2020 Webinar Expert
- OpenAIRE/LIBER workshop on Dealing with Data. What's the Role for the library? (3rd OpenAIRE Workshop - May 2013) Workshop Expert
- OpenAIRE Research Data Management Briefing paper Guide Expert
RDM: Train the trainer resources
- RDM: Train the trainer resources Guide Beginner
Plan
Data Management Plans
- What you need to know about Data Management Plans Webinar Beginner
- How to create a Data Management Plan Guide Beginner
- How to write a Data Management Plan Webinar Expert
- Guidelines on FAIR Data Management in Horizon 2020 Guide Expert
- Webinar on Research Data and Data Management Plans Webinar in Greek Beginner
- Che cos’è un Data Management Plan: presentazione e casi d’uso Webinar in Italian Beginner
FAIR
- FAIR DATA e Action Plan Webinar in Italian Beginner
Create
Workflows
- Life Sciences and Open Science: Workflows and tools for publishing, licensing, versioning, identifiers, archiving, software… Webinar Beginner
- Natural Sciences and Open Science: Workflows and tools for publishing, licensing, versioning, identifiers, archiving, software… Webinar Beginner
- Humanities and Open Science: Workflows and tools for publishing, licensing, versioning, identifiers, archiving, software Webinar Beginner
Software
- Four Good Practices for Software Development in Open Science Webinar in Greek Beginner
Process
Sensitive data
- Personal data and the Open Research Data Pilot Fact sheet Beginner
- OpenAIRE - EOSC-hub webinar “Data Privacy and Sensitive Data Services” Webinar Expert
- How to deal with sensitive data Guide Expert
Amnesia
- Amnesia: Data anonymization made easy Webinar Beginner
- Amnesia - Anonymize your data before publishing Guide Beginner
GDPR
- Webinar on Open Science and Personal Data: applying the General Data Protection Regulation in today's digital science Webinar in Greek Beginner
Services
Preserve
Zenodo
- Webinar about the Zenodo repository Webinar Beginner
- Zenodo: The free, open repository from OpenAIRE and CERN Fact sheet Beginner
- Zenodo - A universal repository for all your research outcomes Guide Expert
FAIR
- FAIR data and trusted repositories Webinar Beginner Expert
- Data formats for preservation: What you need to know when creating a DMP Guide Beginner
- Workshop Series: Services to Support FAIR Data Workshop Expert
- https://www.openaire.eu/openaire-workshop-making-services-fair-vienna-april-24th-2019 Workshop Expert
- How to make your data FAIR Guide Expert
Repository
- OpenAIRE-Connect Workshop - A user journey in OpenAIRE services through the lens of repository managers Workshop Beginner Expert
- What are repositories? Guide Beginner
- Making your OA repository or OA journal OpenAIRE compatible with OA Horizon 2020 requirements Webinar Expert
- How to find a trustworthy repository for your data Guide Expert
Services
Share / Reuse
DM and Share
- OpenAIRE-Connect Workshop - Data Management and Sharing in Neuroinformatics Workshop Beginner
- Can I reuse someone else’s research data? Learn more on how to reuse research data Guide Beginner
Shared and linked
Services
FAQs
Questions from researchers in different countries about RDM.
- 1. What is Research Data Management?
Research Data Management is a fundamental part of the research cycle. It covers the actions taken to create, organise, store, secure and eventually share the data derived from a research project.
Data is stored, described and catalogued to ensure its findability in the short and long term, not only by its creator but also by other scientists, which allows for better reusability and avoids duplication of work and information loss.
- 2. What is the difference between Research Data Management and a Data Management Plan?
Research Data Management is a set of activities related to creating, organising, storing, sharing and preserving data. There is likely to be guidance, tools and support from your research organisation or external data services that can help you with this.
A Data Management Plan is a brief document that describes your approach to RDM for a given project or context: what data will you collect, how will it be documented, and which data will be shared in the long term? Many funders ask for DMPs, including the European Commission under Horizon 2020.
- 3. What is a Data Management Plan (DMP)?
A DMP is a formal document outlining how the research data collected or generated will be handled during and after a research project. It is a brief plan to define: how the data will be created; how it will be documented; who can access it; where it will be stored; whether it will be shared and where it will be preserved.
It should describe:
- The dataset: what kind of data will the project collect or generate, and to whom might they be useful later on?
- Standards and metadata: What is the data about? Who created it and why? In what forms is it available? Can your data be combined with other data sources (interoperability)? Metadata answers such questions to enable data to be found and understood, ideally according to the particular standards of your scientific discipline. Use your disciplinary standards to enable interoperability or, if there are no standards in your discipline, describe what type of metadata will be created and how (see www.rd-alliance.org/groups/metadata-standards-directory-working-group.html as a reference). Also, document your definitions, variables, machine configurations et cetera in a way that is common in your field.
- Data sharing: Sharing data outside the project team is the default, so legitimate reasons for not sharing resulting data should be explained in the DMP.
- Archiving and preservation: The usability of data depends not only on storage and backup but perhaps also on well-preserved software and on conversion to new file formats. Where will the data, metadata, documentation and software be preserved for the long term?
Please note that the DMP is a “living” document, not a fixed one: it evolves and gains precision and substance during the lifespan of the project, which is why you should keep it updated. See also FAQ 23 for more information on DMPs.
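The four elements listed above can be kept track of with a simple checklist structure. The following is a minimal sketch only: the class name, field names and the `missing_sections` helper are illustrative assumptions and do not correspond to the EC template or to any official tool.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DataManagementPlan:
    """Hypothetical DMP skeleton; field names are illustrative, not an official template."""

    dataset_description: str = ""         # what data is collected or generated, and who may find it useful
    standards_and_metadata: str = ""      # disciplinary standards, or the metadata to be created and how
    data_sharing: str = ""                # how data is shared, or the reasons for not sharing
    archiving_and_preservation: str = ""  # where data, metadata, documentation and software are preserved
    version: int = 1                      # the DMP is a living document, so versions are tracked

    def missing_sections(self) -> List[str]:
        """Return the names of the text sections that are still empty."""
        return [
            f.name
            for f in fields(self)
            if isinstance(getattr(self, f.name), str)
            and not getattr(self, f.name).strip()
        ]


# Example with placeholder text: only the first section has been drafted so far.
draft = DataManagementPlan(dataset_description="Placeholder description of the dataset")
print(draft.missing_sections())
```

A check like this only shows which sections are empty; it says nothing about whether the content of a section is adequate.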
- 4. What is the purpose of a Data Management Plan and which ‘data’ does this apply to?
The data management plan concerns all data you use in your research, and the documentation that is necessary for long-term reusability, both for the researchers involved and for future research.
In principle, all data should be maintained and made accessible for replication studies. This is called the maintenance and archiving of a replication package. At present, H2020 demands that all data underlying publications be made accessible, as open as possible. Other data, such as raw data or versioned data, is just as welcome, whether it will be open access or available on request only. You help other researchers reuse your data for future research if you mention both sorts of ‘other data’ in your data management plan. See also https://researchdata.nl/en/services/data-management/selecting-research-data/.
We advise you strongly to mention and explain the selection criteria in your data management plan.
- 5. Why should I include a draft for Research Data Management in my project proposal? Should I include a first version of the Data Management Plan as well?
It is always good to take data management seriously already in the application. If you plan well from the outset you can anticipate and avoid potential problems such as duplication or data loss.
The Data Management Plan is one of the fundamental aspects of Research Data Management: it describes in detail the approach adopted in the research project for all aspects of the research data cycle. Defining a draft Data Management Plan at the proposal stage ensures that the project has a strong strategy for the medium and long term, and a vision that extends beyond its end. EC-funded projects that are part of the Open Data Pilot are requested to provide an initial DMP in the first 6 months of the project, and DMPs provided at the proposal stage will not be considered for evaluation, as stated in the Guidelines on FAIR Data Management in Horizon 2020. Even so, including a draft DMP at the proposal stage could help the evaluators to better understand the long-term vision of the project. Furthermore, soft evaluation criteria such as open science are just as important for the review process.
- 6. What are the main aspects of Research Data Management that I should include in my project proposal?
At the proposal stage in the H2020 programme, applicants should show that the project has a strategy for managing the data resulting from their research. At this stage you can already describe how informed consent will be organised, how the data will be stored safely, where the data, metadata and other documentation will be archived for the long term, what conditions (if any) will apply to data reuse, et cetera. The proposal should also reflect the current state of consortium agreements regarding data management.
Forward planning is one of the key principles of RDM and helps to prevent data loss and inconsistency from the beginning of the project. Be consistent when referring to the exploitation and protection of results. The DMP can also be considered as a checklist for the future and as a reference for the resource and budget allocations of the project. An overall strategy should be given for:
- Data creation and use - what types of data will the project generate/collect?
- Data organisation, structure, and naming - what standards will be used?
- Data storage, security, access and back up - how will this data be curated and preserved?
- Data sharing (internally and externally), publication and reuse.
- If data cannot be made available, explain why.
Also consider organizing your own periodic data management plan evaluation.
Once funding is approved and the project has started, you must submit a first version of your DMP within the first 6 months of the project. The Commission provides a DMP template in an annex; its use is recommended but voluntary. However, using the EC template will probably make it more efficient for you to support researchers. The data must fulfil the FAIR principles (Findable, Accessible, Interoperable, Reusable). The DMP is a living document and needs to be updated whenever significant changes arise. As a minimum, changes have to be reported in time for the periodic evaluation/assessment of the project. If no other periodic reviews are foreseen within the grant agreement, then such an update needs to be made in time for the final review at the latest. Furthermore, the project consortium can define a timetable for review in the DMP itself. For further information: http://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-data-management/data-management_en.htm.
You may also want to take a look at the following resources:
- 7. Where can I find examples of Research Data Management at the proposal stage?
The following projects publicly share their DMPs, which are based on the current H2020 template (OpenAIRE does not endorse these DMPs, but we are pleased about this openness):
- 8. Should the Data Management Plan be agreed by all the Consortium Members?
Yes, consortium members should state which data they are planning to collect. It is important to agree at the proposal stage on the standards, methods, costs and formats that will be used. If, for instance, different data formats are used, then all the formats have to be listed in the DMP. Similarly, if partners decide to produce articles with joint authorship, they also have to agree where to publish or in which Open Access repository the article will be deposited. A joint DMP for the whole project consortium means that all formats and approaches used are documented in one document, although the storage of data might differ across partners.
- 9. Can project partners submit different Data Management Plans?
It would not make much sense to hand in different Data Management Plans. If, for instance, different data formats are used, then all the formats have to be listed in the DMP. Similarly, if partners decide to produce articles with joint authorship, they also have to agree where to publish or in which Open Access repository the article will be deposited. A joint DMP for the whole project consortium means that all formats and approaches used are documented in one document, although the storage of data might differ across partners.
For projects that have already started writing their DMP under the previous guidelines: when you feel that it would be a waste of effort to “throw together” the individual datasets, a pragmatic solution for reviewers and for those who write the next DMP version(s) could be to add a brief overall description in the beginning of the document, and then simply maintain the existing distinctions.
- 10. What is meant by FAIR Data Management?
The European Commission frames its Data Management guidelines in terms of FAIR. This means that data should be Findable, Accessible, Interoperable and Reusable. Making data FAIR ensures that data is shared in a useful way (e.g. fully documented, using standards, via trustworthy repositories). Being FAIR increases the ability of others (both humans and machines) to find, understand and use the outputs.
For more information you can also see: https://ec.europa.eu/info/sites/info/files/turning_fair_into_reality_1.pdf
- 11. How can I estimate the costs for Research Data Management?
To estimate the costs, you have to consider data collection, storage, access, security and preservation, and possibly staff resources, such as a data steward position. By planning early, costs can be significantly reduced. Under Horizon 2020, research data management costs are eligible for reimbursement for the duration of the grant agreement.
Here is a guide on estimating these types of costs: www.openaire.eu/how-to-comply-to-h2020-mandates-rdm-costs, with an RDM costs estimating tool: www.openaire.eu/estimating-costs-rdm-tool?highlight=WyJjb3N0IiwiY29zdCcsIl0=
Utrecht University has also compiled a Data Management Cost Guide.
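As a rough illustration of how such an estimate can be put together, the sketch below adds up storage, staff and one-off costs over the project duration. It is not the OpenAIRE estimating tool: the function name, the cost categories chosen and all figures in the example are placeholder assumptions that you would replace with your own institution's rates.

```python
def estimate_rdm_costs(storage_tb, price_per_tb_year, years,
                       steward_fte, steward_salary_year, one_off_costs=0.0):
    """Return a simple cost breakdown (same currency unit as the inputs).

    All parameters are hypothetical inputs for illustration:
    storage_tb           -- expected data volume in terabytes
    price_per_tb_year    -- storage and backup price per terabyte per year
    years                -- number of years covered by the estimate
    steward_fte          -- fraction of a full-time data steward position
    steward_salary_year  -- yearly cost of a full-time data steward
    one_off_costs        -- e.g. deposit fees or anonymisation work
    """
    storage = storage_tb * price_per_tb_year * years
    staff = steward_fte * steward_salary_year * years
    return {
        "storage": storage,
        "staff": staff,
        "one_off": one_off_costs,
        "total": storage + staff + one_off_costs,
    }


# Placeholder figures only; they are not taken from any real price list.
breakdown = estimate_rdm_costs(storage_tb=2, price_per_tb_year=100, years=3,
                               steward_fte=0.5, steward_salary_year=50000,
                               one_off_costs=1000)
for item, amount in breakdown.items():
    print(item, amount)
```

Keeping the breakdown by category makes it easier to see which costs fall within the duration of the grant agreement and which arise afterwards.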
- 12. Are RDM costs eligible on H2020 grants?
The Annotated Model Grant Agreement mentions "costs related to open access to research data and related costs, such as data maintenance or storage" to be eligible under the category D.3 (pg 87).
- 13. Which data do I have to publish as open data? All I acquire or simply those that are used for a publication?
H2020 demands that all data underlying publications be made accessible, as open as possible. This does not mean that there is an obligation to open the data. Data needs to be FAIR; whether you then open it or not is a separate decision. You should always rely on the principle “as open as possible, as closed as necessary”. As not all data can be open, projects can opt out at any stage (either before or after signing the grant).
You can choose to opt out and not open your data, but you need to justify your choice in the DMP, with reference to the opt-out options defined by the EC in the “Guidelines to the Rules on Open Access to Scientific Publications and Open Access to Research Data in Horizon 2020”: https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
If you do not opt out, you have to provide open access via a repository to the research data necessary to validate the results presented in a scientific publication. All sorts of data can be preserved: raw data or versioned data as well as data that will be available on request only. You help other researchers reuse your data for future research if you mention all sorts of ‘other data’ in your data management plan. See also: https://researchdata.nl/en/services/data-management/selecting-research-data/
We strongly advise you to mention and explain the selection criteria in your data management plan.
- 14. What happens with the data I do not publish as open data? Do I have to archive them? Are there any regulations on them?
Data originating from an EC-funded project has to be FAIR. This does not mean researchers have to share and open all the data produced, but that they should learn how to manage it in a FAIR way. The decision whether or not to open the data should be taken by the project consortium and described in the DMP. To fulfil the FAIR principles, data should always be archived in a trusted repository, even if it is not open to the general public. Decisions on whether to open or close a dataset should be justified in the DMP, with reference to the opt-out options defined by the EC in the “Guidelines to the Rules on Open Access to Scientific Publications and Open Access to Research Data in Horizon 2020”: https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
- 15. What happens if I use data acquired during the project that I only use for a new publication months after the end of the funded project? Do I have to meet the original funder’s obligations for these later publications as well?
If the data were acquired within a specific project, they will fall under the definition of “any other data (for instance curated data not directly attributable to a publication, or raw data)” that the EC gives in the “Guidelines to the Rules on Open Access to Scientific Publications and Open Access to Research Data in Horizon 2020”: https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
This data will have to be managed in a FAIR way. The decision on whether and when to open the data is up to the researchers and has to be justified in the DMP.
As the publication stems from data collected within a grant, it should also follow the funder obligations on OA to scientific publications. Unfortunately, in this specific case any article processing charges (APCs) cannot be covered by the project grant if the article is published after the project has ended.
- 16. How do I deal with sensitive data? Which data do I have to anonymize and to what extent?
Generally, all data that are personal or can be related to an individual have to be anonymized before publication. OpenAIRE offers the data anonymization tool Amnesia for removing identifying information from data. It is available as an online service or local app. Find out more about Amnesia here: http://catalogue.openaire.eu/service/openaire.amnesia
For more information, please check the OpenAIRE guide on sensitive data: www.openaire.eu/sensitive-data-guide
The German project DataJus at the Technische Universität Dresden analysed the legal framework for RDM and also covered the topic of personal data and data that can be related to a person. You can find the report (in German) here: https://tu-dresden.de/gsw/jura/igetem/jfbimd13/ressourcen/dateien/publikationen/DataJus_Zusammenfassung_Gutachten_12-07-18.pdf?lang=de
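To give a feel for what "can be related to an individual" means in practice, the sketch below checks a table for k-anonymity, a widely used criterion in which every combination of indirectly identifying values (quasi-identifiers) must be shared by at least k records. This is an independent illustration, not Amnesia's interface, and the choice of k-anonymity as the criterion is an assumption made here; the function name, column names and records are hypothetical.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for all quasi-identifier columns (0 for an empty table)."""
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical, already generalised records (age ranges, truncated postcodes).
records = [
    {"age_range": "30-39", "postcode": "10**", "diagnosis": "A"},
    {"age_range": "30-39", "postcode": "10**", "diagnosis": "B"},
    {"age_range": "40-49", "postcode": "20**", "diagnosis": "A"},
]

# The third record is the only one in its group, so the table is only 1-anonymous.
print(k_anonymity(records, ["age_range", "postcode"]))  # prints 1
```

A low value signals that some individuals may still be singled out. A high value alone does not establish that data are anonymous in the legal sense, so a check like this supports, but does not replace, a proper assessment.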
- 17. How can I find a suitable data repository? What obligations are there for the choice of the repository?
The OpenAIRE website mentions the following registries of repositories:
- http://roar.eprints.org/
- http://v2.sherpa.ac.uk/opendoar/
- https://www.re3data.org/ (where you can search or browse by subject: http://www.re3data.org/browse/by-subject/).
Please also take a look at the guidelines of the Core Trust Seal: www.coretrustseal.org/why-certification/requirements/
It is not recommended to store your data in more than one repository. Make sure the repository of your choice is both certified and trustworthy, as well as harvested by OpenAIRE. More information on what it takes to be an OpenAIRE-compliant repository can be found here.
Open platforms also offer the possibility to store your data. Compared to these platforms, actual repositories offer more services, such as long-term access to your data, and they support good documentation and metadata practices. Both Horizon 2020 and OpenAIRE strongly recommend depositing the data with sufficient documentation and metadata.
- 18. What about opening data from DNA analyses? What legal requirements do I have to meet?
In general, when data is not personal, it can be made freely available. If data is personal, then it needs to be assessed for compliance with privacy legislation before it can be made safe for sharing. This usually leaves two choices. The first is to remove sufficient context information to ensure the data becomes anonymous (at the expense of removing analytical power from the data; a free anonymization tool is Amnesia). The second is to properly determine (first of all via a written agreement, but also through other instruments such as a Data Protection Impact Assessment (DPIA), an ethics assessment, proper consent, etc.) that the benefits of sharing the personal data outweigh any possible privacy risks to the individuals behind the data, and that the Data Protection Principles have been followed. It is highly recommended to refer to Article 9 of the GDPR (which can be used for DNA-related data).
When working with genetic/genomic data, the following steps can be considered:
- Assess if the genetic/genomic data relates to a living human being. If not, then this data can be made freely available. If yes, then the next step is...
- Assess if the genetic/genomic data that relates to a person can be made properly anonymous. Otherwise…
- Make sure that personal data can be shared in compliance with the GDPR.
Ideally, this should be done before the start of the research producing personal data and should be a part of any legal and ethics assessments, DPIAs and development of consent forms. If you are trying to determine whether personal genetic data can be shared after it has been collected, consult a privacy specialist to conduct a proper assessment. This assessment will inform whether the data can be shared with or without restrictions. Also, a good starting point is to keep an eye on the European Data Protection Supervisor (EDPS) preliminary opinion on data protection and scientific research.
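The three assessment steps listed above can be summarised as a small decision function. This is a sketch only: the names are hypothetical, and the two inputs stand for the outcome of a legal and ethics assessment carried out by people, which code cannot perform.

```python
from enum import Enum


class SharingRoute(Enum):
    """Hypothetical labels for the outcomes of the three steps."""
    OPEN = "can be made freely available"
    ANONYMISE_FIRST = "make the data properly anonymous before sharing"
    GDPR_ASSESSMENT = "share only if GDPR compliance is established"


def genetic_data_route(relates_to_living_person: bool,
                       can_be_anonymised: bool) -> SharingRoute:
    # Step 1: data that does not relate to a living human being is not personal.
    if not relates_to_living_person:
        return SharingRoute.OPEN
    # Step 2: personal data that can be made properly anonymous.
    if can_be_anonymised:
        return SharingRoute.ANONYMISE_FIRST
    # Step 3: otherwise sharing depends on compliance with the GDPR.
    return SharingRoute.GDPR_ASSESSMENT


print(genetic_data_route(relates_to_living_person=True,
                         can_be_anonymised=False).value)
```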
If it is determined that the researcher can freely share the genetic data, then the usual OA resources (for non-personal data) will suffice. If the researcher cannot freely share the genetic data itself but can share it provided proper data protection agreements are in place, then resources that can hold metadata could be useful to increase the discoverability of the data. It is assumed that researchers who become interested in the data after discovering its metadata would reach out to the original researchers to reach a data sharing agreement that complies with data protection requirements. When data sharing is not possible, the metadata can indicate the possibility of BYOA (Bring Your Own Analysis). There is a whole field of study (differential privacy) working on this topic. In short, a researcher is able to submit queries to the genetic database, and the system can in turn ensure that any response does not disclose personal information.
As for the legal requirements, GDPR compliance has to be met if the genomic DNA relates to an individual, and a DPIA is an appropriate assessment tool to determine this.
This is also a helpful resource: https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/ethics/h2020_hi_ethics-data-protection_en.pdf
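To illustrate the query-based idea, the sketch below answers a counting query with the Laplace mechanism, a standard differential privacy technique: noise is added to the true count so that the presence or absence of any one person changes the answer distribution only slightly. The text above does not prescribe this mechanism; the function name, the records and the value of `epsilon` are assumptions, and a real service would need a vetted library and a managed privacy budget rather than Python's general-purpose random generator.

```python
import random


def private_count(records, predicate, epsilon, rng=random):
    """Return a noisy count of the records matching `predicate`.

    A counting query changes by at most 1 when one person is added or
    removed (sensitivity 1), so Laplace noise with scale 1/epsilon is used.
    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    # The difference of two independent exponential samples with rate
    # epsilon follows a Laplace distribution with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Hypothetical records: whether each participant carries a given variant.
participants = [{"variant": True}, {"variant": False}, {"variant": True}]
answer = private_count(participants, lambda p: p["variant"], epsilon=0.5)
print(round(answer, 2))  # a noisy value near the true count of 2
```

The researcher only ever sees the noisy answer, never the underlying records, which is what makes this kind of analysis possible when the data itself cannot be shared.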