Services uptake - what’s new: DECIDO integrates Amnesia!
The collaboration between the DECIDO project and the Amnesia tool has led to an integration that ensures immediate anonymization of data produced by the project.
DECIDO is a Horizon 2020 project that aims to boost the use of the European Open Science Cloud (EOSC) by public authorities. It seeks to enable innovation in the policy-making sector, remove European fragmentation, and allow cross-support, cross-collaboration, and the use of secure compute- and data-intensive services.
In this effort, data from the public sector and from citizens (via outsourced actions) are collected and analyzed in an ICT infrastructure built for the needs of the project. The data follow standard handling and processing procedures in the four main pilot areas, which are addressed in different countries across the EU.
Each partner responsible for a pilot stores personal data in its own storage facilities. Each personal data store satisfies a set of requirements that protect the data from being identifiable and/or leaked. Anonymization and pseudonymization techniques are followed on the occasions of:
The Turin pilot about flood disaster management showcases the way that Amnesia is embedded in the data management workflows. Currently, there are two options for using Amnesia in the pilot:
A | Leveraging the Amnesia UI (with all functionalities) |
B | Using Amnesia on the NAS, potentially in two different modalities: |
B.1 | Leveraging the Amnesia UI (with all functionalities) |
B.2 | Using Amnesia as a black box, invoked through a script over SSH, to anonymize data in a “semi-automatic” way. The process is only semi-automatic because hierarchy files must be provided to the tool. Hierarchy files describe how the data will be anonymized. Invoking Amnesia as an SSH script is completed in three steps: Step 1. In the end, a script will be invoked to anonymize data |
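As an illustration of modality B.2, the following is a minimal Python sketch of how a client could trigger an anonymization script on the NAS over SSH. It is not the pilot's actual script: the host name, the remote script path, and the argument order (dataset first, hierarchy file second) are all hypothetical, and only the standard `ssh` client and the Python standard library are assumed.

```python
import shlex
import subprocess


def build_ssh_command(host, remote_script, dataset, hierarchy):
    """Return the argv list for running a remote anonymization script over SSH.

    All four arguments are hypothetical placeholders: the real script name
    and its parameters depend on how Amnesia is deployed on the NAS.
    """
    # Quote each remote argument so that paths containing spaces
    # survive parsing by the remote shell.
    remote = " ".join(
        shlex.quote(part) for part in (remote_script, dataset, hierarchy)
    )
    return ["ssh", host, remote]


def anonymize_remotely(host, remote_script, dataset, hierarchy):
    """Run the remote script; raises CalledProcessError on a non-zero exit."""
    cmd = build_ssh_command(host, remote_script, dataset, hierarchy)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Separating the command construction from its execution keeps the sketch easy to check without a live NAS; for example, `build_ssh_command("nas.example", "/opt/anonymize.sh", "/data/in.csv", "/data/age.hier")` only builds the argument list and opens no connection.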
Amnesia was ported into a Docker environment to work around an installation issue on the Synology NAS, which runs a proprietary operating system. The differences observed between the two options are:
Option A. Use Amnesia from local PC on LAN
When Amnesia is installed as-is on a local PC, the server consumes that machine's hardware capacity, but performance is higher. However, there are authorization and authentication issues that need to be overcome: the security layer and IDM must manage access from another device.
Option B. Use Amnesia on NAS
On the other hand, when Amnesia is used on the NAS, both the IDM and the security layer for authorization and authentication provide APIs and CRUD operations to interact with the data. In this case, however, performance is lower because two services run on the NAS device at the same time, the database (MongoDB) for data storage and Amnesia for data anonymization, and because the hardware specifications of the device are limited and not scalable.
The collaboration between the DECIDO project and the Amnesia tool led to an integration that ensures immediate anonymization of data produced by the project. Currently, two integration options are used by the Turin pilot. The next steps concentrate on expanding the dockerization process so that Amnesia can be used both with and without its user interface. It still needs to be agreed how hierarchy files will be automatically retrieved or built when needed.
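To make the open question about building hierarchy files more concrete, here is a small Python sketch of one possible approach for a numeric attribute such as age: each value is mapped to progressively wider ranges and finally to a fully suppressed value. This only illustrates the idea of a generalization hierarchy; the function name, the range widths, and the in-memory layout are assumptions and do not reproduce Amnesia's actual hierarchy file format.

```python
def build_range_hierarchy(values, widths):
    """Map each distinct numeric value to its generalization levels.

    `widths` lists the range width used at each level, from most to
    least specific. The last level is always "*" (fully suppressed).
    The returned dict layout is illustrative, not Amnesia's file format.
    """
    hierarchy = {}
    for value in sorted(set(values)):
        levels = []
        for width in widths:
            # Snap the value down to the start of its range.
            low = (value // width) * width
            levels.append(f"{low}-{low + width - 1}")
        levels.append("*")
        hierarchy[value] = levels
    return hierarchy
```

For example, `build_range_hierarchy([23, 37], [10, 20])` maps 23 to `["20-29", "20-39", "*"]` and 37 to `["30-39", "20-39", "*"]`, so both values become indistinguishable at the second level.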
About Amnesia: https://amnesia.openaire.eu/about-documentation.html
Published: July 26, 2021
OpenAIRE has received funding from the European Union's Horizon 2020 Research and Innovation programme under Grant Agreements No. 777541 and 101017452.
Unless otherwise indicated, all materials created by OpenAIRE are licenced under CC ATTRIBUTION 4.0 INTERNATIONAL LICENSE.