How do I license my research data?
Learn more about licences for research data and how to apply them
This guide, a set of FAQs for researchers on how to license research data, is part of the "User guide on copyright, open science and data". The user guide aims to offer a state-of-the-art, legally advanced, yet manageable set of rules, guidelines, and resources to unlock the full potential of open science (OS) in the EU research field, with a view to addressing copyright and related rights issues.
It depends on what rights protect your research data, if at all. In the light of what is explained in the guide "How do I know if my research data is protected?":
Keep in mind that CC licences deal only with copyright and related rights. Personal data are not covered by CC licences and are analysed separately.
Creative Commons, a global not-for-profit organisation which provides legal tools to promote the sharing and reuse of works of authorship, has produced a number of licences, some of which meet the criteria for Open Access. These offer different levels of permission.
Creative Commons offers licences readable at three different levels: legal, machine (the metadata) and human (non-legal descriptions). Creative Commons has a useful tool to help you determine the licence best for you. More restrictive CC licences are unlikely to meet Open Access requirements (e.g. because they impose restrictions on commercial use).
Licences are not automatic. The owner of the rights in a protected data set must make it clear that a licence applies. Repositories may help you to select the licence applied to data deposited in their repository. Applying a licence can happen by:
If you choose a standard licence from Creative Commons, Creative Commons provides tools to help you attach the licence effectively. For more information, see the accompanying OS repository checklist, which explains how to use those tools.
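As an illustration of making the licence explicit, here is a minimal sketch that records the chosen licence in a metadata record stored alongside the data. The function name `build_metadata` and the field names are hypothetical and do not follow any particular repository's schema; only the two licence URLs are the canonical Creative Commons addresses.

```python
import json

# Canonical Creative Commons URLs for the two tools discussed in this guide.
LICENCE_URLS = {
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
}


def build_metadata(title, creator, licence_id):
    """Return a metadata record (hypothetical field names) that states the licence explicitly."""
    if licence_id not in LICENCE_URLS:
        raise ValueError(f"Unknown licence identifier: {licence_id}")
    return {
        "title": title,
        "creator": creator,
        "licence": licence_id,
        "licence_url": LICENCE_URLS[licence_id],
    }


# Example values are placeholders, not a real dataset.
record = build_metadata("Example survey dataset", "A. Researcher", "CC0-1.0")
print(json.dumps(record, indent=2))
```

A record like this can be saved next to the data files so that the licence travels with the dataset. A repository's own deposit form or the Creative Commons tools remain the authoritative way to attach the licence.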
Attribution is a genuine concern. To help others cite your research, include a citation in your research that users can copy and paste to give you credit for your hard work. If you license your data under CC BY, you are legally requiring attribution, but we recommend that you do this only if you are authoring a work such as a journal article, a photograph or a song. If you are producing protected databases (as explained above), your best choice is probably CC0. You can still ask for attribution, not as a legal requirement but as a request ("please attribute my data") in line with scientific norms.
We recommend that you avoid using a CC BY licence for data. While attribution is a genuine, recognisable concern, a CC BY licence may be legally unenforceable when no underlying copyright or sui generis database right (SGDR) protects the work. It may also send the wrong message, because you would be requiring attribution for something to which the law attaches no attribution right (e.g. the SGDR confers no moral rights).
A better solution is to use CC0 and simply ask for credit (rather than require attribution), and provide a citation for the dataset that others can copy and paste with ease. Such requests are consistent with scholarly norms for citing source materials.
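A copy-and-paste citation request of this kind could be generated as in the sketch below. The citation layout, the function name `format_citation` and all example values are assumptions made for illustration; follow whatever citation style your discipline or repository recommends.

```python
def format_citation(creator, year, title, repository, identifier):
    """Build a plain-text dataset citation in a hypothetical layout."""
    return f"{creator} ({year}). {title} [Data set]. {repository}. {identifier}"


def citation_request(citation):
    """Phrase the citation as a request for credit, not a licence condition."""
    return (
        "This dataset is released under CC0. "
        "If you use it, please cite it as:\n" + citation
    )


# Placeholder values only; replace them with your own dataset details.
citation = format_citation(
    creator="Researcher, A.",
    year=2024,
    title="Example survey dataset",
    repository="Example Repository",
    identifier="https://example.org/dataset/123",
)
print(citation_request(citation))
```

The wording deliberately asks rather than demands, so it does not turn attribution back into a legal condition.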
We recommend you avoid using a non-commercial licence. For legal purposes, drawing a line between what is and is not ‘commercial’ can be tricky; it’s not as black and white as you might think. For example, if you release a dataset under a non-commercial licence, it would clearly prohibit an organisation from selling your dataset to others for a profit. However, it might also prohibit someone from using the dataset in their research if they intend to eventually publish that research. This is because most academic journals are commercial businesses that charge some sort of fee for access to their content; hence, such use could qualify as ‘commercial’. Consequently, using a non-commercial licence may prevent researchers from using your data in work destined for publication. This can in turn limit the dissemination, recognition, and impact of your dataset. And it is definitely NOT Open Access (see the Berlin Declaration, the Bethesda Statement on Open Access Publishing, and the Budapest Open Access Initiative).
We recommend you avoid using a ‘No Derivatives’ (ND) licence. Just as a non-commercial licence might restrict meaningful reuse of your dataset, an ND licence can have the same effect: it may prevent someone from recombining and reusing your data for new research. For data to be truly Open Access, it must permit these important types of reuse. Whether ND is OA-compliant is less clear than for non-commercial licences. The best view is that it depends on what kind of modifications the licence prohibits. There are therefore probably cases where ND is incompatible with OA, and thus you should not use it.
Consider redacting research data to remove personal data, confidential information or third-party intellectual property.
Your CC licence applies only to your original contributions and does not supersede any rights retained by authors whose works you have cited or have permission to use.
We recommend you use the CC0 Public Domain Dedication, which is first and foremost a waiver, but can act as a licence when a waiver is not possible. By applying CC0 to your data you enable everyone to freely reuse your data as they see fit by waiving (giving up) your copyright and related rights in that data.
CC BY 4.0.
If you work for an educational institution, it is good practice to first check with your research director and library. Your institution may already have an Open Access publishing policy for you to consult, and your library will be able to help you decide how to best proceed.
You should keep in mind that there are many situations in which data is not protected by copyright or related rights. Such data can include facts, names and numbers: things that are considered ‘non-original’ and part of the public domain, and thus not subject to copyright protection. Similarly, your database (which is a structured collection of data) might be considered ‘non-original’ and thus ineligible for copyright, and it might additionally be excluded from other forms of protection (like the EU sui generis database right, also known as the ‘SGDR’, for non-original databases).
In these cases, using a Creative Commons licence such as CC BY could signal to users that you claim copyright in the non-original data despite the law, and perhaps despite your real intention. Finally, if your data is in the public domain worldwide, you might state simply and obviously on the material that no restrictions attach to the reuse of your data and apply a Public Domain Mark.
Obligations of confidentiality may be imposed by contract or implication. Most researchers are expected to abide by ethical codes of conduct.
How do I know if my research data is protected?
Learn more about what research data is and how it is protected by intellectual property rights
Can I reuse someone else’s research data
Learn more on how to reuse research data