
Guides for Researchers

How to make your data FAIR

Basic information with links to resources

Introduction

Are you at the start of your project and planning to create research data and other digital research outputs (e.g., software, workflows, protocols)? Read on to find out how to make them Findable, Accessible, Interoperable and Reusable (FAIR) by applying the FAIR principles, with a focus on what Horizon Europe funding requires.

Why are the FAIR principles needed? The increasing availability of digital research objects and online repositories means that data need to be created with long-term stewardship and reuse in mind. Providing other researchers with access to your data and/or to rich metadata describing restricted data facilitates discovery, strengthens transparency and reproducibility, and improves research efficiency by enabling reuse. 

In this context, the FAIR vision emerged from discussions at the Lorentz Center workshop “Jointly Designing a Data FAIRport” (January 2014), and was subsequently formalized as the first published FAIR Guiding Principles in Wilkinson et al. (Scientific Data, 2016). Importantly, FAIR puts specific emphasis on machine-actionability, helping machines (as well as people) to automatically find and use data through metadata, identifiers, standards, and licenses. 

The FAIR principles describe how (meta)data and other digital research objects should be organized and described so they can be more easily accessed, understood, exchanged, and reused by both humans and computational systems. Major funders and research infrastructures in Europe, including those aligned with the European Open Science Cloud (EOSC) vision of a “Web of FAIR Data and Services”, promote FAIR to maximize the value, integrity, and impact of publicly funded research. 

EC FAIR data

What does the EC require from project grantees on FAIR data?

The European Commission frames FAIR as high-level guiding principles (not a single technical standard) that are implemented via responsible Research Data Management (RDM) and documented in a Data Management Plan (DMP). Under Horizon Europe, responsible RDM aligned with FAIR is part of mandatory open science practice, and a DMP is treated as a living document that supports planning, implementation, and updates as the project evolves. 

Key practical point for applicants: in Horizon Europe, a full DMP is not required at proposal submission, but if you expect to generate or reuse data (or other research outputs beyond publications), you should provide a short proposal-stage summary explaining how you will manage them in line with FAIR; the full DMP follows during the project.  

What is FAIR data?

The four basics of FAIR (in practice, FAIR applies to both data and metadata): 

  • Findable: described with rich metadata, assigned a globally unique and persistent identifier (PID), and indexed/registered so that people and machines can discover and reliably reference it.
  • Accessible: retrievable by its identifier using standardized communication protocols. Access can be open or restricted (authentication/authorization are allowed), but metadata should remain accessible even when data are no longer available.
  • Interoperable: described using formal, shared languages, community standards, and controlled vocabularies/ontologies, with qualified links to related (meta)data so that systems can integrate and exchange information across tools, institutions, and borders.
  • Reusable: richly described with accurate context, released under a clear usage license, and accompanied by provenance information, in line with domain-relevant community standards, so that others can reuse it lawfully and meaningfully.
 

Image: FAIR data principles (FOSTER), https://book.fosteropenscience.eu/

Things to remember

  • FAIR is a set of principles, not a standard, specification, or binary label. You can improve FAIRness incrementally as your workflows and infrastructure mature.
  • Does following the FAIR principles mean your data must be shared openly with everyone? NO.
    • Data can be FAIR but not open: FAIR explicitly supports scenarios where metadata are open and informative while access to the data is controlled for legitimate reasons (e.g., privacy, safety, IP, contractual constraints). 
    • Open data may not be FAIR: data can be publicly available yet still hard (or impossible) to reuse if it lacks persistent identifiers, sufficient metadata, standards, provenance information, or an explicit license.
  • If you receive Horizon Europe funding and your project generates or reuses data, a DMP is required: deliver a first version within the first six months and update it as a living document when significant changes arise. Responsible RDM aligned with FAIR is part of mandatory open science practice under Horizon Europe, and open access to research data follows the principle “as open as possible, as closed as necessary.”

FAIR - in depth

Findable

Make your data (and metadata) findable by ensuring it:

  • Has a globally unique, persistent identifier (PID) (e.g., a DOI) 
  • Has rich, machine-readable discovery (citation/descriptive) metadata (e.g., title, creators, abstract, keywords, dates, methods, license) 
  • Is registered/indexed in a searchable resource (typically a repository/catalog) so it is discoverable by people and machines 

Persistent identifiers (PIDs) are important because they unambiguously identify your data and support reliable citation and linking across systems. A common PID for datasets is a Digital Object Identifier (DOI). Choose a repository that mints/registers PIDs and exposes the record through a stable landing page (e.g., Zenodo registers a DOI for uploads via DataCite).
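To illustrate why a PID works as a stable reference, the sketch below normalises the different ways a DOI is commonly written into a single resolvable https://doi.org/ link. The function names and the regular expression are assumptions made for this example; the pattern is a loose approximation, not the full DOI syntax. The DOI shown is the one for the checklist cited at the end of this guide.

```python
import re

# Loose approximation of a DOI: "10.", a 4-9 digit prefix, "/", then a suffix.
# This is an assumption for illustration, not the complete DOI syntax.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written in citations and spreadsheets.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Return the bare, lowercase DOI (DOIs are case-insensitive)."""
    doi = raw.strip()
    for prefix in _PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    doi = doi.strip().lower()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognisable DOI: {raw!r}")
    return doi


def doi_url(raw: str) -> str:
    """Return the resolvable HTTPS form of a DOI."""
    return "https://doi.org/" + normalize_doi(raw)


print(doi_url("doi:10.5281/zenodo.1065991"))
# https://doi.org/10.5281/zenodo.1065991
```

Storing the bare DOI and generating the link on output keeps references consistent, whichever form collaborators typed in.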

The metadata describing your data support findability, citation, and reuse and, critically, machine-actionability (so services can index, link, and reuse your records at scale). Follow community metadata standards where available, or widely used cross-domain standards (e.g., Dublin Core, DataCite Metadata Schema), and prefer standards and vocabularies that are maintained and widely implemented by repositories.
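As a rough sketch of what "rich, machine-readable metadata" can mean in practice, the example below checks a record for empty descriptive fields before deposit. The field names and the required/recommended split are hypothetical choices for this example, not a metadata schema; map them to whatever standard your repository uses. The sample record describes the checklist cited at the end of this guide, and its license is deliberately left out because it is not stated here.

```python
import json

# Hypothetical field names for illustration only; they are not taken from
# DataCite, Dublin Core, or any repository's API.
REQUIRED = ("identifier", "title", "creators", "publication_year", "license")
RECOMMENDED = ("abstract", "keywords", "methods")


def missing_fields(record: dict) -> dict:
    """List required and recommended fields that are absent or empty."""
    def is_empty(name: str) -> bool:
        return record.get(name) in (None, "", [], {})

    return {
        "required": [name for name in REQUIRED if is_empty(name)],
        "recommended": [name for name in RECOMMENDED if is_empty(name)],
    }


record = {
    "identifier": "https://doi.org/10.5281/zenodo.1065991",
    "title": "How FAIR are your data?",
    "creators": ["Jones, S.", "Grootveld, M."],
    "publication_year": 2017,
    "keywords": [],  # present but empty, so it is still reported
}

report = missing_fields(record)
print(json.dumps(report, indent=2))
```

Here `report["required"]` contains only `"license"`, which is the kind of gap that is easiest to close before a record is published.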

To identify appropriate standards and vocabularies, consult:

DCC metadata guidance (disciplinary and general metadata resources)

Accessible

Make your data accessible by ensuring it:

  • Is retrievable by its identifier using standardized, open protocols (e.g., HTTPS; APIs where relevant) 
  • Supports authentication/authorization where necessary (for sensitive, embargoed, or restricted data) 
  • Keeps metadata publicly accessible even if the data are restricted or no longer available 

Remember: not all data must be open to be FAIR. Data can be restricted and still be FAIR, as long as the metadata are openly accessible and the access conditions are described clearly.

This aligns with the European Commission’s principle: “as open as possible, as closed as necessary.” 
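The idea that metadata stay open while data may be restricted can be sketched as a function that decides what an anonymous visitor sees. This is a minimal illustration under stated assumptions: the record layout, the `access` values, and the example.org URL are all hypothetical and do not describe any particular repository.

```python
def public_view(record: dict) -> dict:
    """Return the part of a record that is visible without logging in.

    Metadata and access conditions are always returned; the download
    link is only included when the data themselves are open.
    """
    view = {
        "metadata": record["metadata"],
        "access": record["access"],
        "access_conditions": record["access_conditions"],
    }
    if record["access"] == "open":
        view["data_url"] = record["data_url"]
    return view


# Hypothetical restricted dataset: described openly, shared on request.
restricted = {
    "metadata": {"title": "Interview transcripts (hypothetical example)"},
    "access": "restricted",
    "access_conditions": "Available on request after ethics approval.",
    "data_url": "https://example.org/files/transcripts.zip",
}

view = public_view(restricted)
print(sorted(view))
# ['access', 'access_conditions', 'metadata']
```

The dataset remains discoverable and citable through its metadata, and the access conditions tell a reader how to request the files.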

 

Where can I keep my data (for the long term)?

Long-term storage does not have to mean “open to everyone,” but it should be safe, preservable, and persistently accessible. Look for a repository that:

  • Preserves data for the long term (including format-risk and preservation planning) 
  • Makes (meta)data findable (searchable record pages; indexing/harvesting) 
  • Supports rich, standardized, machine-readable metadata 
  • Captures access conditions and reuse terms (license) in the record metadata 

You can deposit data to a general repository (e.g., Zenodo, Dataverse installations) or a subject-specific repository (e.g., Dryad, domain repositories). Prefer discipline repositories when they exist, because they often enforce domain standards and community metadata.

To identify suitable repositories for your discipline, search:

  • re3data (registry of research data repositories; searchable by discipline and features) 
  • FAIRsharing (also indexes repositories and which standards/policies they align with) 

Interoperable

Make your data interoperable by using:

  • Open, documented file formats (avoid proprietary-only formats where feasible) 
  • Community-agreed schemas/standards and machine-readable structures (so tools can parse and integrate your data) 
  • Controlled vocabularies, thesauri, and ontologies for consistent meaning (semantics) across systems 
  • Qualified links to related entities via PIDs (e.g., link dataset DOI ↔ article DOI ↔ ORCID iD; include funder/organization IDs where applicable) 

Interoperable data can be integrated with other data, applications, and workflows. Avoid creating data that can only be read in proprietary software; when you must use proprietary tools, export and share an interoperable version (e.g., CSV/TSV plus a data dictionary; open exchange formats) so others can reuse it without specialized software.
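A minimal sketch of the "CSV plus a data dictionary" approach, using only the Python standard library. The column names, the sample values, and the choice of "NA" as the missing-value code are assumptions made for this example; what matters is that the same definitions appear in both the data file and the dictionary.

```python
import csv
import io
import json

MISSING = "NA"  # assumed missing-value code, documented in the dictionary

# Hypothetical columns: each entry defines a name, a meaning, and a unit.
columns = [
    {"name": "site_id", "description": "Sampling site code", "unit": None},
    {"name": "temperature", "description": "Water temperature", "unit": "degC"},
]

# Hypothetical measurements; the second row has no temperature value.
rows = [
    {"site_id": "A1", "temperature": 12.5},
    {"site_id": "B2"},
]


def to_csv(rows: list, columns: list) -> str:
    """Serialise rows to CSV text, writing MISSING for absent values."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[column["name"] for column in columns],
        restval=MISSING,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_data_dictionary(columns: list) -> str:
    """Serialise column definitions and the missing-value code to JSON."""
    return json.dumps({"missing_value": MISSING, "columns": columns}, indent=2)


csv_text = to_csv(rows, columns)
dictionary_json = to_data_dictionary(columns)
print(csv_text)
```

In a real project the two strings would be written to files deposited side by side, so that any tool able to read CSV and JSON can interpret the table without the original software.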

Reusable

Make your data reusable by ensuring it:

  • Is well-documented (so others—and your future self—can interpret it correctly) 
  • Has clear reuse terms via a license, recorded in the metadata 
  • Includes provenance and context (how data were generated/processed; versions; assumptions) 
  • Meets domain-relevant community standards where they exist 

Create documentation, e.g., a README (plain text or Markdown preferred because both are easy for machines to read; PDF only if formatting is essential) that includes:

  • For each filename: what it contains and how it relates to figures/tables/publications 
  • For tabular data: definitions of columns/rows, codes (incl. missing values), and units 
  • Processing/cleaning steps that affect interpretation 
  • Links to related datasets stored elsewhere (with PIDs where possible) 
  • Contact point (and ideally ORCID iD) for questions 

Source (README guidance): Dryad’s “Creating a README.” 
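The README items listed above can be assembled from structured inputs, which helps keep the documentation in step with the files. The sketch below is one possible layout, not Dryad's template; the function name, section headings, file names, and contact details are all hypothetical.

```python
def build_readme(title, files, columns, processing, related, contact):
    """Assemble a Markdown README covering the items listed above."""
    lines = [f"# {title}", ""]
    lines += ["## Files", ""]
    lines += [f"- {name}: {purpose}" for name, purpose in files.items()]
    lines += ["", "## Columns", ""]
    lines += [f"- {c['name']} ({c['unit']}): {c['description']}" for c in columns]
    lines += ["", "## Processing", ""]
    lines += [f"- {step}" for step in processing]
    lines += ["", "## Related datasets", ""]
    lines += [f"- {link}" for link in related]
    lines += ["", "## Contact", "", contact, ""]
    return "\n".join(lines)


# All values below are placeholders for illustration.
readme = build_readme(
    title="Stream temperature survey (hypothetical)",
    files={"temperatures.csv": "Raw measurements used for Figure 2"},
    columns=[
        {"name": "temperature", "unit": "degC",
         "description": "Water temperature; NA means not measured"},
    ],
    processing=["Removed readings taken during sensor calibration"],
    related=["https://doi.org/10.5281/zenodo.1065991"],
    contact="Data steward, data@example.org",
)
print(readme)
```

Generating the README from the same column definitions used for the data dictionary avoids the two drifting apart.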

Data should have a clear license to govern the terms of reuse. If reuse is intended, avoid “no license” (which creates legal ambiguity and often blocks reuse). Practical guidance from the DCC can help you choose an appropriate data license and understand trade-offs. 

Where possible, to maximize reuse, consider widely recognized licenses such as CC BY 4.0 (attribution) or CC0 (public domain dedication/waiver), and document any restrictions in both the metadata and DMP. 
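One way to record reuse terms unambiguously is to store a standard license identifier together with its URL. The sketch below uses the SPDX identifiers for the two licenses named above; the `license_metadata` function and the output field names are assumptions for this example, and a repository may expect a different structure.

```python
# SPDX identifiers and canonical URLs for the two licenses mentioned above.
LICENSES = {
    "CC-BY-4.0": {
        "name": "Creative Commons Attribution 4.0 International",
        "url": "https://creativecommons.org/licenses/by/4.0/",
    },
    "CC0-1.0": {
        "name": "Creative Commons Zero v1.0 Universal",
        "url": "https://creativecommons.org/publicdomain/zero/1.0/",
    },
}


def license_metadata(spdx_id: str) -> dict:
    """Return a license entry for a metadata record (hypothetical layout)."""
    try:
        entry = LICENSES[spdx_id]
    except KeyError:
        raise ValueError(f"license not in the local list: {spdx_id!r}") from None
    return {"license_id": spdx_id, **entry}


print(license_metadata("CC-BY-4.0")["url"])
# https://creativecommons.org/licenses/by/4.0/
```

An identifier plus URL is easier for machines to match than free text such as "Creative Commons", which does not say which license or version applies.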

Check out: EUDAT’s License Selector wizard to support consistent license choice, especially when balancing derivatives, share-alike, and commercial reuse questions.

Find out more in the Horizon Europe (HE) online manual.

How FAIR are your data?

Use this checklist to evaluate your data against the FAIR principles:

Jones, S. & Grootveld, M. (2017, November). How FAIR are your data? Zenodo. http://doi.org/10.5281/zenodo.1065991

More resources

Still have questions?

Contact us via our Helpdesk.
We try to respond within 48 hours.