Data will be stored, described and catalogued to ensure its findability in the short and long term, not only by its creator but also by other scientists, to allow for better reusability and to avoid duplication of work and information loss.
Research Data Management (RDM) is a set of activities related to creating, organising, storing, sharing and preserving data. There is likely to be guidance, tools and support from your research organisation or external data services that can help you with this.
A Data Management Plan (DMP) is a brief document that describes your approach to RDM for a given project or context: what data will you collect, how will it be documented, and which data will be shared in the long term? Many funders ask for DMPs, including the European Commission under Horizon 2020.
A DMP is a formal document outlining how the research data collected or generated will be handled during and after a research project. It is a brief plan to define: how the data will be created; how it will be documented; who can access it; where it will be stored; whether it will be shared and where it will be preserved.
It should describe:
Please note that the DMP is a “living” document rather than a fixed one: it evolves and gains more precision and substance during the lifespan of the project, which is why you should keep it updated. See also FAQ 23 for more information on DMPs.
The data management plan concerns all data you use in your research, and the documentation that is necessary for long-term reusability, both for the researchers involved and for future research.
In principle, all data should be maintained and made accessible for replication studies; this is called the maintenance and archiving of a replication package. At this moment H2020 demands that all data underlying publications should be made accessible, as open as possible. Other data, such as raw data or versioned data, and data that will be open access as well as data that will be available on request only, are just as welcome. You help other researchers to reuse your data for future research if you mention both sorts of ‘other data’ in your data management plan. See also https://researchdata.nl/en/services/data-management/selecting-research-data/.
We advise you strongly to mention and explain the selection criteria in your data management plan.
It is always good to take data management seriously from the application stage onwards. If you plan well from the outset, you can anticipate and avoid potential problems such as duplication or data loss.
The Data Management Plan is one of the fundamental aspects of Research Data Management, describing in detail the approach adopted in the research project for all aspects of the research data cycle. Drafting the Data Management Plan at the proposal stage ensures that the project has a strong strategy for the medium and long term, and a vision that extends beyond its end. EC-funded projects that are part of the Open Data Pilot are requested to provide an initial DMP in the first 6 months of the project, and DMPs provided at the proposal stage will not be considered for evaluation, as stated in the Guidelines on FAIR Data Management in Horizon 2020. Even so, including a draft DMP in the proposal could help the evaluators to better understand the long-term vision of the project. Furthermore, soft evaluation criteria such as open science are just as important for the review process.
At the proposal stage in the H2020 program, applicants should show that the project has a strategy for managing the data resulting from their research. At this stage you can already describe how informed consent will be organized, how the data will be stored safely, where the data, metadata and other documentation will be archived for the long term, what conditions (if any) will apply to data reuse, et cetera. The current state of consortium agreements regarding data management should also be reflected.
Forward planning is one of the key principles of RDM and helps to prevent data loss or inconsistency from the beginning of the project. Be consistent when referring to the exploitation and protection of results. The DMP can also be considered as a checklist for the future and as a reference for the resource and budget allocations of the project. An overall strategy should be given for:
Also consider organizing your own periodic data management plan evaluation.
Once funding is approved and the project has started, you must submit a first version of your DMP within the first 6 months of the project. The Commission provides a DMP template in an annex; its use is recommended but voluntary. However, using the EC template will probably make it more efficient for you to support researchers. The data must fulfil the FAIR principles (Findable, Accessible, Interoperable, Reusable). The DMP needs to be updated whenever significant changes arise. The DMP is a living document, and changes have to be reported at least in time for the periodic evaluation/assessment of the project. If no other periodic reviews are foreseen within the grant agreement, then such an update needs to be made in time for the final review at the latest. Furthermore, the project consortium can define a timetable for review in the DMP itself. For further information: http://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-data-management/data-management_en.htm.
You may also want to take a look at the following resources:
It would not make much sense to hand in different Data Management Plans. If, for instance, different data formats are used, then all the formats have to be listed in the DMP. As an example, if partners decide to produce articles with joint authorship, they also have to agree on where to publish or in which Open Access repository the article will be published. A joint DMP for the whole project consortium means that all formats and approaches used are documented in one document, although the storage of data might differ across partners.
For projects that have already started writing their DMP under the previous guidelines: when you feel that it would be a waste of effort to “throw together” the individual datasets, a pragmatic solution for reviewers and for those who write the next DMP version(s) could be to add a brief overall description at the beginning of the document, and then simply maintain the existing distinctions.
The European Commission frames its Data Management guidelines in terms of FAIR. This means that data should be Findable, Accessible, Interoperable and Reusable. Making data FAIR ensures that data is shared in a useful way (e.g. fully documented, using standards, via trustworthy repositories). Being FAIR increases the ability of others (both humans and machines) to find, understand and use the outputs.
For more information you can also see: https://ec.europa.eu/info/sites/info/files/turning_fair_into_reality_1.pdf
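As a rough illustration of what "fully documented" can mean in practice, the following Python sketch checks a dataset's metadata record against a small set of fields grouped by FAIR principle. The field names and their grouping are assumptions made for this example, not requirements taken from the EC guidelines; a real project would follow the metadata standard of its chosen repository.

```python
# Hypothetical grouping of metadata fields by FAIR principle (illustrative only).
FAIR_FIELDS = {
    "Findable": ("identifier", "title", "description"),
    "Accessible": ("repository", "access_conditions"),
    "Interoperable": ("file_format",),
    "Reusable": ("license", "creators"),
}


def missing_fair_fields(record: dict) -> dict:
    """Return, per FAIR principle, the fields that are absent or empty."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


# Example record with invented values; the DOI below is a placeholder.
example_record = {
    "identifier": "doi:10.1234/placeholder",
    "title": "Survey responses, wave 1",
    "description": "Anonymized questionnaire data.",
    "repository": "Example institutional repository",
    "file_format": "CSV",
    "creators": ["A. Researcher"],
}

print(missing_fair_fields(example_record))
# → {'Accessible': ['access_conditions'], 'Reusable': ['license']}
```

A check like this could be run before each DMP update to spot datasets whose documentation is still incomplete.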
To estimate the costs, you have to consider data collection, storage, access, security and preservation, and possibly the availability of resources for staff, such as a data steward position. By planning early, costs can be significantly reduced. Under Horizon 2020, research data management costs are eligible for reimbursement for the duration of the grant agreement.
Here is a guide on estimating these types of costs: www.openaire.eu/how-to-comply-to-h2020-mandates-rdm-costs, with an RDM cost estimation tool: www.openaire.eu/estimating-costs-rdm-tool?highlight=WyJjb3N0IiwiY29zdCcsIl0=
The University of Utrecht has also compiled a Data Management Cost Guide.
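To make the cost categories above more concrete, here is a minimal Python sketch that adds up yearly costs per category. It is not the OpenAIRE tool or the Utrecht guide; the class and function names are hypothetical and all figures are invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CostItem:
    category: str       # e.g. "storage" or "data steward"
    yearly_cost: float  # invented figure, in euros
    years: int          # number of years the cost is incurred

    def total(self) -> float:
        return self.yearly_cost * self.years


def estimate_rdm_costs(items: list[CostItem]) -> tuple[dict[str, float], float]:
    """Return the summed cost per category and the grand total."""
    per_category: dict[str, float] = {}
    for item in items:
        per_category[item.category] = per_category.get(item.category, 0.0) + item.total()
    return per_category, sum(per_category.values())


# Invented example figures for a three-year project.
items = [
    CostItem("data collection", 2000.0, 3),
    CostItem("storage", 500.0, 3),
    CostItem("data steward", 10000.0, 3),
]
per_category, grand_total = estimate_rdm_costs(items)
print(per_category)
print(grand_total)  # → 37500.0
```

Keeping the estimate per category makes it easier to reuse the same numbers in the budget section of the proposal and in later DMP updates.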
H2020 demands that all data underlying publications should be made accessible, as open as possible. This does not mean that there is an obligation to open data. Data need to be FAIR; you can then decide whether or not to open them. You should always rely on the principle “as open as possible, as closed as necessary”. As not all data can be open, projects can opt out at any stage (either before or after signing the grant).
You can choose to opt out and not open your data, but you need to justify your choice in the DMP, with reference to the opt-out options defined by the EC in the “Guidelines to the Rules on Open Access to Scientific Publications and Open Access to Research Data in Horizon 2020”: https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
If you do not opt out, you have to provide open access via a repository to the research data necessary to validate the results presented in a scientific publication. All sorts of data can be preserved: raw data or versioned data as well as data that will be available on request only. You help other researchers to reuse your data for future research if you mention all sorts of ‘other data’ in your data management plan. See also: https://researchdata.nl/en/services/data-management/selecting-research-data/
We strongly advise you to mention and explain the selection criteria in your data management plan.
If the data were acquired within a specific project, they will fall under the definition of “any other data (for instance curated data not directly attributable to a publication, or raw data)” that the EC gives in the “Guidelines to the Rules on Open Access to Scientific Publications and Open Access to Research Data in Horizon 2020”: https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
This data will have to be managed in a FAIR way. The decision on whether and when to open the data is up to the researchers and has to be justified in the DMP.
As the publication stems from data collected within a grant, it should also follow the funder's obligations on OA to scientific publications. Unfortunately, in this specific case, any article processing charges (APCs) cannot be covered by the project grant if the article is published after the project ends.
Generally, all data that are personal or can be related to an individual have to be anonymized before publication. OpenAIRE offers the data anonymization tool Amnesia for removing identifying information from data. It is available as an online service or local app. Find more about Amnesia here: http://catalogue.openaire.eu/service/openaire.amnesia
For more information, please check the OpenAIRE guide on sensitive data: www.openaire.eu/sensitive-data-guide
The German project DataJus at the Technische Universität Dresden analysed the legal framework for RDM and also covered the topic of personal and personally identifiable data. You can find the report (in German) here: https://tu-dresden.de/gsw/jura/igetem/jfbimd13/ressourcen/dateien/publikationen/DataJus_Zusammenfassung_Gutachten_12-07-18.pdf?lang=de
Please also take a look at the CoreTrustSeal requirements: www.coretrustseal.org/why-certification/requirements/
It is not recommended to store your data in more than one repository. Make sure the repository of your choice is both certified and trustworthy, as well as harvested by OpenAIRE. More information on what it takes to be an OpenAIRE-compliant repository can be found here.
Open platforms also offer the possibility to store your data. Compared to these platforms, actual repositories offer more services, such as long-term access to your data, and they support good documentation and metadata practices. Both Horizon 2020 and OpenAIRE strongly recommend depositing the data with sufficient documentation and metadata.
In general, when data is not personal, it can be made freely available. If data is personal, it needs to be assessed for compliance with privacy legislation before it can be shared safely. This usually leaves two choices: either remove sufficient context information to ensure the data becomes anonymous (at the expense of removing analytical power from the data; a free anonymization tool is Amnesia), or properly determine (first of all via a written agreement, but also through other instruments such as a DPIA, an ethics assessment, proper consent, etc.) that the benefits of sharing the personal data outweigh any possible privacy risks to the individuals behind the data, and that the Data Protection Principles have been followed. It is highly recommended to refer to Article 9 of the GDPR, which covers special categories of personal data, including genetic data.
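The trade-off between anonymity and analytical power can be illustrated with a small Python sketch. It uses k-anonymity (the size of the smallest group of records sharing the same quasi-identifying values) as one common measure; this choice, the function names and the sample records are assumptions for this example and are not taken from the guidance above. Passing such a check is not in itself proof of legal anonymity.

```python
from collections import Counter


def drop_direct_identifiers(row, identifiers=("name",)):
    """Remove fields that identify a person directly."""
    return {key: value for key, value in row.items() if key not in identifiers}


def generalise_age(row, width=10):
    """Replace an exact age with a band such as '30-39'."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}


def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


# Invented sample records.
rows = [
    {"name": "A", "age": 31, "postcode": "1011", "diagnosis": "x"},
    {"name": "B", "age": 37, "postcode": "1011", "diagnosis": "y"},
    {"name": "C", "age": 45, "postcode": "1012", "diagnosis": "x"},
    {"name": "D", "age": 48, "postcode": "1012", "diagnosis": "z"},
]

stripped = [drop_direct_identifiers(row) for row in rows]
print(k_anonymity(stripped, ("age", "postcode")))  # → 1

banded = [generalise_age(row) for row in stripped]
print(k_anonymity(banded, ("age", "postcode")))  # → 2
```

Banding the ages raises k from 1 to 2, but exact ages are no longer available for analysis, which is the loss of analytical power mentioned above.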
When working with genetic/genomic data, the following steps can be considered:
If yes, then the next step is...
Assess if the genetic/genomic data that relates to a person can be made properly anonymous. Otherwise…
Ideally, this should be done before the start of the research producing personal data and should be a part of any legal and ethics assessments, DPIAs and development of consent forms. If you are trying to determine whether personal genetic data can be shared after it has been collected, consult a privacy specialist to conduct a proper assessment. This assessment will inform whether the data can be shared with or without restrictions. Also, a good starting point is to keep an eye on the European Data Protection Supervisor (EDPS) preliminary opinion on data protection and scientific research.
If it is determined that the researcher can freely share the genetic data, then the usual OA resources (for non-personal data) will suffice. If the researcher can’t freely share the genetic data itself but can share it provided proper data protection agreements are in place, then resources that can hold metadata could be useful to increase the discoverability of the data. It is assumed that researchers who become interested in the data after discovering its metadata would contact the original researchers to reach a data sharing agreement that complies with data protection requirements. When data sharing is not possible, the metadata can indicate the possibility of BYOA (Bring Your Own Analysis). There is a whole field of study (differential privacy) working on this topic. In short, a researcher is able to submit queries to the genetic database, and the system in turn ensures that no response discloses personal information.
As for the legal requirements, GDPR compliance has to be met if the genomic DNA relates to an individual, and a DPIA is an appropriate assessment tool to determine this.
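The query-based idea can be sketched in Python with the Laplace mechanism, one standard technique from the differential privacy literature. The text above does not name a specific mechanism, so this choice, the function names and the sample records are assumptions for illustration only.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Answer a counting query with noise calibrated to the privacy budget.

    A counting query changes by at most 1 when one person is added or
    removed, so its sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Invented records; "variant" is a hypothetical field name.
records = [
    {"id": 1, "variant": True},
    {"id": 2, "variant": False},
    {"id": 3, "variant": True},
]

# The analyst only ever sees the noisy answer, never the records themselves.
answer = private_count(records, lambda r: r["variant"], epsilon=0.5)
print(round(answer, 2))
```

A smaller epsilon adds more noise and gives stronger protection, at the cost of less accurate answers.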
This is also a helpful resource: https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/ethics/h2020_hi_ethics-data-protection_en.pdf