Published or Private: How to do both? via OpenAIRE-Advance & EOSC-hub
The research life-cycle made easy using OpenAIRE and EOSC-hub services: making data open yet anonymous
Becky is an early career social scientist. She is excited to have been offered a competitive post in a well-established department to work on an international EC Horizon 2020 5-year research project with many partners. The project started a year before she arrived, and her institution leads the research on the language used to describe immigration in the national press.
Data Everywhere
Becky knows that a significant amount of research data has already been gathered during interviews with desk editors. This research data is safely stored in a closed cloud computing area on the secure TSD - Service for Sensitive Data service. TSD is a cloud computing platform designed to meet the security regulations that apply to handling sensitive data. This service is provided through the Hub.
Having the data safely and securely stored via the TSD service means that only authorised researchers have access to it. The data has limited accessibility, is not discoverable and cannot be easily shared with other researchers. As a result, the 'FAIRness' of the data (how findable, accessible, interoperable and reusable it is) is rather poor.
In her second week, Becky receives a reminder from the project’s coordinator that their H2020 Data Management Plan (DMP) is due for an update. She needs to find out two things:
- How to update the data management plan, something she has little experience with; and
- How to comply with the EC’s open research data mandate.
But Is the Data Shareable?
According to the DMP, every time a dataset is included in a publication, it needs to be made publicly available to comply with the FAIR principles. Since the dataset contains sensitive information, Becky first needs to anonymise the personal information of the people interviewed.
Her concern about the sensitivity of the data is reinforced by an email from her university’s ethics department reminding her to check the status of personal data and the need to conform to GDPR principles. Realising that much of the project data is sensitive, Becky goes back to the OpenAIRE Helpdesk for guidance on the different options.
An Anonymous Solution
The helpdesk proposes AMNESIA, a tool developed by OpenAIRE to help researchers anonymise their research data. AMNESIA is a flexible data anonymization tool that allows users to remove identifying information from data. It removes direct identifiers such as names and SSNs, and also transforms secondary identifiers such as birth date and zip code so that individuals cannot be identified in the data.
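To make the two steps described above concrete, here is a minimal Python sketch of the general idea: drop the direct identifiers, then coarsen the secondary identifiers until no record stands alone. It is not AMNESIA's implementation or API. The column names, the sample records, and the helper functions `generalise` and `smallest_group` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical column names for direct identifiers (assumption, not from AMNESIA).
DIRECT_IDENTIFIERS = {"name", "ssn"}


def generalise(record):
    """Return a copy with direct identifiers dropped and secondary identifiers coarsened."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Keep only the year of an ISO "YYYY-MM-DD" birth date.
    out["birth_date"] = record["birth_date"][:4]
    # Keep the first two characters of the zip code and mask the rest.
    zip_code = record["zip_code"]
    out["zip_code"] = zip_code[:2] + "*" * (len(zip_code) - 2)
    return out


def smallest_group(records, secondary=("birth_date", "zip_code")):
    """Size of the smallest group of records sharing the same secondary identifiers."""
    groups = Counter(tuple(r[key] for key in secondary) for r in records)
    return min(groups.values())


# Made-up sample records, for illustration only.
interviews = [
    {"name": "A. Example", "ssn": "000-00-0001", "birth_date": "1980-03-14",
     "zip_code": "10115", "role": "desk editor"},
    {"name": "B. Example", "ssn": "000-00-0002", "birth_date": "1980-11-02",
     "zip_code": "10117", "role": "desk editor"},
    {"name": "C. Example", "ssn": "000-00-0003", "birth_date": "1975-06-30",
     "zip_code": "20095", "role": "desk editor"},
    {"name": "D. Example", "ssn": "000-00-0004", "birth_date": "1975-01-19",
     "zip_code": "20099", "role": "desk editor"},
]

anonymised = [generalise(r) for r in interviews]

print(smallest_group(interviews))  # 1: every original record is unique
print(smallest_group(anonymised))  # 2: each record now shares its values with another
```

The group-size check is one simple way to reason about re-identification risk: a group of size 1 means someone could be singled out by birth date and zip code alone, even with names removed. Real datasets usually need a dedicated tool and a careful choice of which columns count as secondary identifiers.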
Private Enough - Open to All
Now that the dataset is anonymised, Becky makes it available and discoverable by uploading it to B2SHARE -- a data repository provided via the Hub. With B2SHARE, researchers and communities can publish datasets and get a Digital Object Identifier (DOI) to use in publications. All datasets published in B2SHARE are automatically made findable via B2FIND -- an EUDAT metadata discovery portal that allows users to find data collections within an international and inter-disciplinary scope.
After a few months, Becky discovers that the dataset has been cited on a number of occasions by other researchers. The download statistics maintained within the B2SHARE service also show that objects within the dataset have been downloaded frequently.
The Outcome: Thanks to OpenAIRE and EOSC-hub services, Becky was able to get on-the-spot support to make research open for the benefit of all, making sure that her data is well-managed with a management plan, safely stored and published, complying at the same time with GDPR requirements.