
Estimating Costs RDM tool

 
DMP PHASE | ACTIVITY | COMMENTS AND SUGGESTIONS | COSTS
Preparing

Make a Data Management Plan
  • Make a DMP before you start creating data; make decisions about managing your data. A template for H2020 DMPs is available.
  • Check if there is a department within your organization to support data management planning.
2 hrs to 2 days, depending on the complexity of your project
1. Data Collection
Acquiring external datasets

Do you plan to use existing data, and is the data available at a commercial partner?

  • Your library may be able to help you acquire a license to a crucial database
  • In research data repositories, data can be available at no or low costs
Example: A faculty licence on a database for macro-economic analysis: €18.000/y
1. Data Collection

Formatting and organising

  • Are your data files, spreadsheets, measurements, interview transcripts, records etc. all in a uniform format or style?
  • Are files, records and items in the collection clearly named with unique file names and well organised?
  • If planned beforehand by developing templates and data entry forms for individual data files (transcripts, spreadsheets, databases) and by constructing clear file structures – low or no additional cost
  • If needed afterwards – higher cost
Per project, organising style, format and names can be done by a student assistant at level 1* salary or a data manager at level 2* salary
1. Data Collection

Transcription

  • Will you transcribe qualitative data (e.g. recorded interviews or focus group sessions) as part of your research; or will you need to do this specifically so data can be more easily shared and reused?
  • Is full or partial transcription needed?
  • Is translation needed?
  • Will you need to develop a standard transcription template or transcription guidelines, to ensure consistency?
  • If part of research practice – very low or no additional cost
  • If not planned as part of research practice – potentially high additional cost
  • Is additional hardware/software needed?
  • Consider cost of (time needed for) developing procedures, templates and guidance for transcribers
Example: Time needed for transcription – four to eight hours per hour of recording
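
The transcription example can be turned into a rough budget line. The sketch below is illustrative only: it assumes the work is done by a student assistant at the guide's approximate level 1 rate (about 17 euro per hour), and the function name and the 20-interview project are hypothetical.

```python
# Rough transcription budget, assuming the guide's quoted range of
# four to eight working hours per hour of recording.
HOURS_PER_RECORDED_HOUR = (4, 8)
LEVEL_1_RATE_EUR = 17  # assumption: approximate level 1 hourly rate from the guide


def transcription_cost(recorded_hours, rate=LEVEL_1_RATE_EUR):
    """Return (low, high) transcription cost in euro for the recorded hours."""
    low_factor, high_factor = HOURS_PER_RECORDED_HOUR
    return recorded_hours * low_factor * rate, recorded_hours * high_factor * rate


# Hypothetical project: 20 one-hour interviews
low, high = transcription_cost(20)
print(f"Estimated transcription cost: €{low} to €{high}")
```

Local salary scales differ, so the rate should be replaced with the figure that applies at your institution.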
1. Data Collection

Consent for data sharing

  • Do you need to ask participants for their consent for data to be shared?
  • Consent is essential for research in the health and life sciences, and also for qualitative interviews
  • When consent for data sharing is considered as part of standard consent procedures early in research – very low or no additional cost
  • When participants need to be re-contacted or re-visited to obtain active consent – could be high cost
  • Does this require extra preparation of information sheets and consent forms; extra time for consent discussions; or training of interviewers?
Student assistant at level 1* salary or data manager at level 2* salary
1. Data Collection

Data transfer

Are special measures needed to transfer data from mobile devices, from fieldwork sites or from home equipment to a central work server?

Is software or hardware needed for data transfer, for encryption of confidential data before transfer, or for synchronisation of data files across sites?

Free encryption or data transfer software (e.g. SurfFileSender) is available in most cases
2. Data Documentation

Data description and Metadata

  • Are data in a spreadsheet, database or data warehouse clearly marked with variable names, variable labels and value labels, code descriptions, missing value descriptions, etc.?
  • Are validated questionnaires and standard coding used?
  • Are labels consistent?
  • Are files, records and items in the collection clearly described with well-defined metadata or a metadata standard, so that users can interpret the relations between them and quickly select and understand the content?
  • Do textual data like interview transcripts need description of context, e.g. included as a heading page?
  • If data description is carried out as part of data creation, data input or data transcription – low or no additional cost
  • If needed to be added or harmonized afterwards – higher cost
  • Codebooks for datasets can often be easily exported from software packages

Examples: 4 hrs per single experiment (120 measurements) filling in 60 required metadata fields, with assistance of a data manager at level 2* salary

Two to three weeks are costed into an average two-year research grant application to prepare and collate materials for deposit

More information: http://www.data-archive.ac.uk/help/user-faq

2. Data Documentation

Documentation

Do you have documentation for the data that describes the context and methodology of how data were gathered, created, processed and quality controlled?

  • Often essential contextual and methods documentation will be written up in publications and reports
  • If all data creation steps are well documented and documentation is kept well organised during research – low or no additional cost
  • If documentation to be written or compiled specifically afterwards – higher cost
Researcher at level 2* salary.
3. Data Storage & Back-up

Data backup

  • Does the institution provide regular backup or not?
  • Consider how frequently backups should be done, how many backups should be stored.
  • Institutional backup – included in standard indirect costs/overheads
  • Additional backup needed – cost according to number of copies to be kept, frequency of backup and storage media needed

Examples:
University drive: €0.80 per GB/y
Cloud: €0.30 per GB/y
2 x hard drive: €0.14 per GB (single purchase)
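
To compare the backup options over the life of a project, the example prices can be put into a small calculation. This is a sketch under stated assumptions: the prices are the example figures quoted above, the hard-drive price is assumed to apply per drive with two drives bought once, and the function name and the 500 GB, four-year project are hypothetical.

```python
# Example prices quoted in the guide; actual prices vary per institution.
UNIVERSITY_EUR_PER_GB_YEAR = 0.80
CLOUD_EUR_PER_GB_YEAR = 0.30
HARD_DRIVE_EUR_PER_GB = 0.14  # one-off purchase
HARD_DRIVE_COPIES = 2         # assumption: price applies to each of two drives


def backup_costs(size_gb, years):
    """Return the total cost in euro of each backup option for the project."""
    return {
        "university_drive": round(UNIVERSITY_EUR_PER_GB_YEAR * size_gb * years, 2),
        "cloud": round(CLOUD_EUR_PER_GB_YEAR * size_gb * years, 2),
        "hard_drives": round(HARD_DRIVE_EUR_PER_GB * size_gb * HARD_DRIVE_COPIES, 2),
    }


# Hypothetical project: 500 GB kept for 4 years
for option, cost in backup_costs(500, 4).items():
    print(f"{option}: €{cost:.2f}")
```

The comparison covers storage price only; it does not account for staff time, drive replacement or the security of each option.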
3. Data Storage & Back-up

Data storage

  • How much data storage space is needed for the entire duration of the project?
  • Do you need to set up a data model and accompanying database for the data?
  • If storage is provided by the institution – cost is included in standard indirect costs or overheads
  • If additional storage needed – cost server/ disk space, as well as the cost of setting up and maintenance
  • Do you need a data warehouse or a database architect?
Example: Cloud database as a service: €160/month (storage 5 GB, transfer 30 GB)

Database architect at level 2* salary
4. Data Access & Security

Data Access

Do external people require access to research data?

Does remote access via VPN or secure FTP need to be arranged for external people?

In most cases researchers can make use of existing, free services
4. Data Access & Security

Data security

  • Is there an institutional server available where you can store your data safely?
  • Protect data from unauthorised access or use or from disclosure
  • For confidential or privacy sensitive data, determining conditions for controlling access to shared data may require extra time and discussion
  • Can security be arranged by institutional IT services or is extra software/hardware needed?
  • Data files may need encrypting before storage or transfers

Example: TTP (trusted third party), dependent on pseudonymisation type, ca. €1.000- €30.000

Existing encryption services could be used at no cost

5. Data Preservation & Archiving

File format

Do data need to be converted to a standard or open format with long-term validity for long-term preservation?

  • Is additional software or hardware needed for conversion?
  • For audio-visual data, converting to open digital formats can be time-consuming or require special equipment and/or software
  • For databases, conversions may require checking for truncation, loss of metadata or annotation, loss of relationships, etc.
Researcher at level 2* salary
6. Data Sharing & Reuse

Anonymisation

  • Do you need to remove identifying information or conceal the identity of participants (e.g. using pseudonyms) before data can be shared?
  • Anonymisation needs to be consistent throughout a data collection.
  • If anonymisation is planned before data collection or transcription/digitisation – cost can be lowered
  • For audio-visual data – anonymising/editing voices or faces can be very costly and could reduce the usefulness of data
  • For quantitative data (e.g. survey data) – low cost if identifiers are a priori excluded from data files, are easy to remove, or identifiable variables are coded to avoid disclosure; cost may be higher if variables need recoding afterwards to avoid disclosure
  • For qualitative textual data (e.g. interview transcripts) – costs can be reduced if anonymisation is carried out during transcription (or at least highlighted/coded during transcription)
  • Cost depends on how sensitive or complex data are and how much identifying information is recorded in the data– if only removal of names is required, cost is low; pseudonymisation will require more time
  • For files received from participants, check file properties and edit to remove disclosive information such as editor/author name

Free software is available. AMNESIA is a data anonymisation tool that removes identifying information from data.

Example: Transcribing and simultaneously anonymising audio (speech): up to one hour per 5-minute fragment (depending on the level of precision required in the transcription)

Student assistant at level 1* salary

6. Data Sharing & Reuse

Copyright

  • Do other parties hold copyright in the data?
  • Do you need to seek copyright clearance before sharing data?
  • Is time required to seek copyright clearance?
  • Is legal advice required?
Legal advice at level 3* salary
6. Data Sharing & Reuse

Data sharing

  • Will your data be deposited with a data centre or research data repository?
  • Which requirements exist to prepare data to particular standards e.g. regarding documentation or format?
  • Do structured metadata need to be created when data are shared via a data centre or archive, e.g. completing a deposit form for the UK Data Archive?
  • What data will be retained and what not?
  • How long are the data required to be available?
  • A research data repository, data centre or journal can help you make your data open and provide you with the possibility to share your data for reuse. Find out the cost per year of data deposit and/or longer-term storage, and the cost in time and effort needed to prepare the data for sharing and preservation
  • Data centres will have their own metadata forms. Consider using these from the start

Examples: Completing a data repository upload form (e.g. Zenodo, a free-of-charge repository) may take 15 min to 4 hrs

Dryad: €110 once (max 20 GB)
DataverseNL: €3.60 per GB/year
Cloud database as a service: €160/month (storage 5 GB, transfer 30 GB)
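
A one-off deposit fee and a yearly per-GB fee scale very differently with dataset size and retention period. The sketch below compares the Dryad and DataverseNL example prices quoted above. It assumes the Dryad fee is a single payment regardless of retention period, as the word "once" suggests; the function names and the example dataset are hypothetical, and current repository prices should be checked before budgeting.

```python
DRYAD_FEE_EUR = 110.0             # one-off fee from the guide's example
DRYAD_MAX_GB = 20                 # size limit from the guide's example
DATAVERSENL_EUR_PER_GB_YEAR = 3.60


def dryad_cost(size_gb):
    """Return the one-off fee, or None if the dataset exceeds the quoted limit."""
    return DRYAD_FEE_EUR if size_gb <= DRYAD_MAX_GB else None


def dataversenl_cost(size_gb, years):
    """Return the total fee in euro for storing size_gb for the given years."""
    return round(DATAVERSENL_EUR_PER_GB_YEAR * size_gb * years, 2)


# Hypothetical dataset: 10 GB to be kept available for 10 years
print("Dryad:", dryad_cost(10))
print("DataverseNL:", dataversenl_cost(10, 10))
```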

6. Data Sharing & Reuse

Data cleaning

  • Do quantitative data need to be cleaned, checked or verified before sharing, e.g. check validity of codes used, check for anomalous values?
  • Will data match documentation, e.g. same number of variables, cases, records, files?
  • Does textual information in data need to be spell-checked?
  • Do you need to combine your data with other datasets for your research?
  • Data cleaning takes time
  • If carried out as part of data entry and preparation before data analysis – low additional cost
  • If needed afterwards – higher cost

Example: Data cleaning service: €270 to well over €1800

More information: http://datascopic.net/cost-of-data-cleansing/data-cleansing/

Researcher/data manager at level 2* salary

6. Data Sharing & Reuse

Digitisation

Do analogue or paper-based research data (maps, newspaper clippings, photographs, images, text) need to be digitised to increase their potential for sharing?

  • Is additional equipment or software needed for scanning or conversion?
  • If simply image scanning of text – relatively low cost
  • If Optical Character Recognition required, with manual checking for accuracy (revising entire scanned text) – may be high cost
  • If manual data entry or typing needed, e.g. to digitise tabular data – may be high cost
Example: Digitisation €0.50 per page (few pages) OR €320-390 per 1000 pages (OCR included)
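
The two digitisation prices in the example can be compared for a given page count. This is a minimal sketch with assumptions: the bulk price is applied pro rata per page, which a real supplier may not do, and the per-page price is quoted for small jobs only, so the large-job comparison is indicative. The function name and page count are hypothetical.

```python
PER_PAGE_EUR = 0.50               # quoted for jobs of a few pages
BULK_EUR_PER_1000 = (320, 390)    # quoted range per 1000 pages, OCR included


def digitisation_cost(pages):
    """Return the per-page cost and the (low, high) bulk cost in euro."""
    low, high = BULK_EUR_PER_1000
    return {
        "per_page": PER_PAGE_EUR * pages,
        "bulk_low": low * pages / 1000,    # assumption: bulk price scales pro rata
        "bulk_high": high * pages / 1000,
    }


# Hypothetical archive of 5000 pages
print(digitisation_cost(5000))
```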
Overall

Roles and responsibilities

Do you need to allocate roles and responsibilities for various data management activities?

If multiple partner institutions, researchers or funders are involved in research – consider cost of data management planning meetings or discussions

Travel costs, lunch, time
Overall

Operationalising data management 

What measures are needed to implement and operationalise data management throughout the research lifecycle?

  • Do you need extra time and resources to implement data management throughout your research, e.g. regular team meetings, setting up a collaborative research environment?
  • If staff training is required - higher cost
  • Do you need a dedicated data manager?
Data manager at level 2* salary

* Local salary scales differ per country. For example:
- Level 1 (e.g. student assistant): ~17 euro per hour
- Level 2 (researcher, data manager): ~60 euro per hour
- Level 3 (external expert): ~160 euro per hour
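
Because most activities in this guide are costed as staff time at one of these levels, a project's data management staff cost can be estimated by multiplying planned hours by the hourly rate for each level. The sketch below uses the approximate rates above; the function name and the hours in the example plan are hypothetical.

```python
# Approximate hourly rates in euro per salary level, as listed in the guide.
HOURLY_RATE_EUR = {1: 17, 2: 60, 3: 160}


def staff_cost(hours_by_level):
    """Return the total staff cost in euro for a mapping of level -> hours."""
    return sum(HOURLY_RATE_EUR[level] * hours
               for level, hours in hours_by_level.items())


# Hypothetical plan: 40 h student assistant, 16 h data manager, 2 h legal advice
plan = {1: 40, 2: 16, 3: 2}
print(f"Estimated staff cost: €{staff_cost(plan)}")
```

Replace the rates with local salary scales before using the result in a grant budget.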

This guide was based on the work of the UK Data Service and the Landelijk Coördinatiepunt Research Data Management:
  • UK Data Service (2013). Data management costing tool. UK Data Archive, University of Essex.
  • Alisa Westerhof (UU), Tessa Pronk (UU), Annemiek van der Kuil (3TU & TUD), Annemie Mordant (UM) (2015). Data Management Bij wetenschappelijk onderzoek méér dan alleen storage. Landelijk Coördinatiepunt Research Data Management, The Netherlands.
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported Licence.