This guide is a companion Open Science (OS) checklist for Content Providers on how to license repositories. It is meant to offer a state-of-the-art, legally advanced, but still manageable set of rules, guidelines, and resources to enable the full potential of OS in the EU research field, with a view to addressing copyright and related rights issues.
1.1. One of the best licenses you can use for your repository is a CC BY 4.0 license, specifying that “unless otherwise noted, this repository is under a CC BY 4.0 license”.
We recommend using a CC BY 4.0 license as a repository license for the following reasons:
The following declarations and statements provide definitions of Open Access:
Creative Commons provide guidance on how to select the most appropriate license for your work (depending on your sharing preferences), as well as how to mark your work once your appropriate license has been identified:
1.1.1. If you follow point 1.1, the license applies to all “works” or other subject matter in the repository.
This includes:
For University repositories, it is likely that several of these elements co-exist, but it could also well be that the repository is not a protected database. In either case, CC licenses are a good choice because (avoiding technicalities) they only regulate a use if that use requires permission. From this point of view, CC licenses are self-limiting: they apply only where permission is necessary.
See the following source for further details on the EU legislation regulating the legal protection of databases:
For confirmation of this, see:
1.1.2. However, this could become problematic when, as in the case of University repositories, the owner of the repository (the University) and the owner of the journal article (the author, unless they have transferred the copyright) are different persons.
Therefore, by using the recommended “unless otherwise noted” wording, you clarify that the elements that belong to third parties (e.g. journal articles) are distributed under their own license terms (which, as you will see later, is ideally a CC0 or a CC BY).
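The “unless otherwise noted” rule can be pictured as a simple fallback. The following is a minimal Python sketch under stated assumptions, not a prescribed implementation: the names `Item`, `owned_by_repository` and `effective_license` are hypothetical, the license strings are SPDX-style identifiers chosen for illustration, and the treatment of unlicensed third-party items as ‘All Rights Reserved’ follows the reasoning in this guide rather than any specific repository software.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the repository-wide license stated in the terms of service.
REPOSITORY_DEFAULT_LICENSE = "CC-BY-4.0"


@dataclass
class Item:
    """A piece of repository content (hypothetical model for illustration)."""
    title: str
    owned_by_repository: bool       # True if the repository owner holds the rights
    license: Optional[str] = None   # set when a license is noted on the item itself


def effective_license(item: Item) -> str:
    """Apply the "unless otherwise noted" rule to a single item."""
    if item.license:
        # Something is noted on the item, so that license governs.
        return item.license
    if item.owned_by_repository:
        # Nothing noted and the repository owns the rights: the default applies.
        return REPOSITORY_DEFAULT_LICENSE
    # Third-party material without a license cannot inherit the default.
    return "All rights reserved"
```

For example, `effective_license(Item("Journal article", False, "CC0-1.0"))` returns the article's own license, while an item owned by the repository with nothing noted falls back to the repository default.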
It is important to license the repository as a database under an open access compliant license. This is because when a user crawls, scrapes or analyses the database to work with aggregated data (for example in data analytics or text and data mining), authorisation (e.g. a license, or an exception if one exists) is often necessary. If you have applied a CC BY to your repository, this is already taken care of.
In order to meet the definition of open access as provided in the Budapest Declaration, users must be able to crawl the database:
OpenAIRE provides a tool which tests online repository compliance with Open Science guidelines: Validator service.
1.2. The CC BY 4.0 license should be incorporated into the terms of service of the repository.
Terms of service are general rules about how a service, such as a website, can be used. These may include a multitude of conditions, such as privacy policies, limitations of liability, and codes of conduct. All users of the service have to agree to the terms of service.
Creative Commons provide guidance on how to integrate their licenses within your terms of service:
1.2.1. The CC BY 4.0 license exists as a separate legal document from the terms of service.
As such, it must be incorporated by reference into the contractual and broader terms of service which govern all uses of the repository. Creative Commons provide guidance on how to incorporate the CC BY 4.0 license into the repository terms of use.
2.1. You should provide metadata in order to enhance discoverability of your resources.
See the following sources for a discussion on the merits of using metadata in your repository:
Creative Commons provide technical guidance on how to implement metadata via HTML, as well as providing a generation tool for embedding metadata within files:
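Independently of the Creative Commons guidance referred to above, the general idea of machine-readable license information in HTML can be sketched as follows. This is an illustrative Python helper under stated assumptions: the function name `license_notice_html` and the exact wording of the notice are hypothetical, while `rel="license"` is the standard HTML link type that lets crawlers detect the license of a page.

```python
from html import escape

CC_BY_4_URL = "https://creativecommons.org/licenses/by/4.0/"


def license_notice_html(work_title: str,
                        license_url: str = CC_BY_4_URL,
                        license_name: str = "CC BY 4.0") -> str:
    """Build an HTML notice whose link is marked with rel="license"."""
    return (
        f"<p><span>{escape(work_title)}</span> is licensed under "
        f'<a rel="license" href="{escape(license_url, quote=True)}">'
        f"{escape(license_name)}</a>.</p>"
    )


# Hypothetical repository name, used only for illustration.
print(license_notice_html("Example University Repository"))
```

The title and license name are escaped so that arbitrary text cannot break the surrounding page markup.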
Application of a CC0 license for metadata is increasingly recognised as a community standard in the following institutions:
2.1.1. Providing machine-readable bibliographic metadata is a requirement for projects which are funded under Horizon 2020.
See H2020 Framework Programme Regulation (EU) No 1291/2013 of the European Parliament and of the Council of 11 December 2013 establishing Horizon 2020 - The Framework Programme for Research and Innovation (2014 - 2020) (OJ L 347, 20.12.2013, p. 104) for legal basis, and more specifically article 29.2 of the H2020 Programme: AGA - Annotated Model Grant Agreement (2018). (last accessed 1 October 2018)
The requirement for machine-readable bibliographic metadata is detailed by the European Commission in the following sources:
2.2. Metadata often are not protected as such because they are factual information, thus not original or not substantial.
However, in certain cases, complex and elaborate metadata could perhaps be protected. For the avoidance of doubt, apply a CC0 to your metadata. In this way, in those cases when a right exists, you are waiving it and allowing other people to reuse the metadata information.
CC BY should be avoided unless you know exactly what it entails.
Applying a CC BY license may result in “copyfraud” in countries where metadata is not eligible for copyright protection, because in that case the CC BY license imposes more restrictive conditions than the metadata is actually entitled to. Confirmed in:
Currently, 61% of the open access academic repositories listed on OpenDOAR have no clear metadata policy, as detailed in:
2.2.1. In those few cases when metadata can be considered original works, thus protected by copyright, they will enjoy both economic and moral rights.
Moral rights are recognised in most countries (but with important exceptions, such as the US), and may be unwaivable.
This should not represent an issue in the case of CC0, as the waiver clarifies that it only waives the rights as long as this is permitted under applicable law. So if you enjoy unwaivable moral rights, CC0 will not affect your moral rights.
For an example of a jurisdiction with unwaivable moral rights, see Kreutzer’s discussion of the German position, and why the application of a CC0 license in this situation is still valid:
2.3. For metadata to be used meaningfully, it must be standardised to optimise machine-reading (a requirement of H2020 projects). A commonly used format in libraries and cultural heritage institutions is Dublin Core.
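To make the idea of standardised, machine-readable metadata concrete, here is a minimal Python sketch that serialises a bibliographic record as simple Dublin Core inside an `oai_dc` container. The function name `dublin_core_record` and all record values are hypothetical; the two namespace URIs are those of the Dublin Core element set and the OAI `oai_dc` format. It is an illustration only and does not replace the formatting guidance referred to in this section.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use readable prefixes instead of auto-generated ones (ns0, ns1).
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: Dict[str, List[str]]) -> str:
    """Serialise {element name: [values]} as an oai_dc XML string."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        for value in values:
            # Dublin Core elements are repeatable, one XML element per value.
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record values, for illustration only.
record = dublin_core_record({
    "title": ["An example article"],
    "creator": ["Doe, Jane"],
    "date": ["2018-10-01"],
    "type": ["article"],
    "rights": ["https://creativecommons.org/licenses/by/4.0/"],
})
print(record)
```

Here `dc:rights` carries the license of the described article; the license or waiver applied to the metadata record itself (CC0 is recommended in point 2.2) would be stated separately, for instance in the repository's metadata policy.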
The following source details issues which arise from using inconsistent metadata practices:
RIOXX provide a tool which tests repositories’ metadata compliance with open access standards:
For details on the formatting and implementation of Dublin Core bibliographic metadata, see:
3.1. In point 1 you have applied a license to your repository, and to its content “unless otherwise noted”. Now let’s take a look at the “unless otherwise noted” part.
As a repository manager, you (or the University) usually don’t own the copyright in the articles uploaded (unless you have written them).
Therefore, the repository has to implement a license selection procedure that allows the uploader (author or rightsholder) to choose the proper license. As further detailed in point 6, this process should offer a number of choices to the author, but it should be the author who ultimately decides what license to use.
Nevertheless, you can offer and implement some guidance that will help researchers to make the right choice and to adhere to Open Science principles.
In points 4 and 5 you will see what choices are recommended for a) data and databases; and b) articles.
See guidelines on open access as provided by the European Commission for details on self-archiving and open access publishing:
3.2. We recommend a CC BY 4.0 license in respect of the content of the repository. This is detailed further in Point 5. This may not be appropriate for data or datasets as detailed in Point 4.
For discussions on the merits of CC BY see the following resources and details in point 5:
Amiel, T. and Soares, T.C. (2016) Identifying Tensions in the Use of Open Licenses in OER Repositories. International Review of Research in Open and Distributed Learning, 17(3) (last accessed: 5 July 2018), see “Licensing”
Creative Commons UK (2017) Frequently Asked Questions on Creative Commons & Open Access. Zenodo. (last accessed: 3 July 2018), see “How should I license my work for the purposes of Open Access?”
Mewhort, K. (2012) Creative Commons Licenses: Options for Canadian Open Data Providers. (last accessed: 5 July 2018)
Some funders and institutions may require that any outputs are made available under a CC BY license, some of which are detailed in the following source:
3.3. Creative Commons licenses are not appropriate for software.
Instead, we recommend applying a GNU GPL v3.0 or a BSD/Apache style license. These are some of the most well-established public licenses for free and open source software: the GNU GPL is based on the “copyleft” concept, while BSD and Apache style licenses are permissive. They are also highly interoperable with other licenses.
Creative Commons confirm that their licenses should not be applied to software:
The following open access software licenses are appropriate substitutes for a CC BY or CC0 license:
3.4. Whilst Creative Commons licenses may apply to both digital and non-digital content, this guide currently provides advice only in respect of fully digital repositories.
Libraries and cultural heritage institutions may need to audit their non-digitised resources, and check for complex or multilayered content (e.g. multiple authors, orphan works, etc.).
For guidance on digitising works for the purpose of creating a digital repository, see the following sources:
This may be a particularly relevant consideration for any projects funded by the European Research Council, which require that all project materials be machine-readable. In this case, scans are not acceptable, which may impact non-digital repositories:
3.4.1. If you need to digitise materials for use on an online repository, note that there may be special considerations for traditional knowledge works.
Both the Alaska Native Knowledge Network and Charles Darwin University provide examples of special considerations and guidance when digitising or collecting traditional knowledge works:
4.1. Data, datasets and databases should be offered without restrictions on use, meaning under a CC0.
Remember that a database can be protected by copyright (database structure) and/or the sui generis database right, SGDR (substantial investment in obtaining, verifying and presenting data), without prejudice to any copyright or other rights in the underlying material.
Applying a CC0 to a database means that if any rights exist, they are waived; if they don’t exist, CC0 does not create any obligation. If a waiver is not possible, CC0 operates as a license to the same effect, within the limits of applicable law.
4.1.1. The advantages of making data available without restrictions include:
4.2. Where the uploader is concerned about attribution, they can ‘kindly request’ to be attributed, rather than using the more legally restrictive CC BY license.
This request is not legally binding, although it follows standard scholarly practice in crediting researchers for their work.
4.2.1. When ‘kindly requesting’ attribution of a work, the uploader should be advised to offer a citation which can easily be copied and pasted by subsequent users.
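A copy-and-paste citation can be generated from the deposit record itself. The sketch below is illustrative only: the function name `suggested_citation`, the field names and the citation layout are assumptions, and a repository would normally follow its own or its community's citation style.

```python
from typing import List


def suggested_citation(authors: List[str], year: int, title: str,
                       repository: str, identifier: str) -> str:
    """Format a one-line citation that users can copy and paste."""
    author_part = "; ".join(authors)
    return f"{author_part} ({year}). {title}. {repository}. {identifier}"


# Hypothetical dataset details, for illustration only.
print(suggested_citation(
    authors=["Doe, J.", "Roe, R."],
    year=2018,
    title="Example survey dataset",
    repository="Example University Repository",
    identifier="https://example.org/record/123",
))
```

Displaying this string next to the download link makes the ‘kind request’ for attribution easy to honour, even though it is not legally binding under CC0.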
4.2.2. Tools such as the Creative Commons ‘Open Attribute’ tool are available to assist with ensuring adequate attribution.
4.3. Where there are concerns regarding privacy or data protection, these should be dealt with under the relevant legislation or ethics policies.
4.3.1. Both the BMC consultation and the PLOS data policy address these issues.
Suggest which license should be chosen to meet OS requirements, but let the uploader choose.
5.1. Give the uploader the possibility to choose the license.
You can indicate which licenses are better for OA/OS, but don’t choose for them.
To avoid ambiguity, uploaders should be expected to apply a license at the point of upload. Failure to apply a license at upload results in ‘All Rights Reserved’, which generally means people are unable to use, re-use, modify or data-mine the unlicensed content without authorisation.
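One way to make sure a license is applied at the point of upload is to refuse deposits that arrive without one. The following Python sketch shows the idea under stated assumptions: `MissingLicenseError` and `require_license` are hypothetical names, and a real upload form would wire this check into its own validation layer.

```python
from typing import Optional


class MissingLicenseError(ValueError):
    """Raised when an upload is submitted without a license choice."""


def require_license(chosen_license: Optional[str]) -> str:
    """Return the cleaned license choice, or refuse the upload if none was made."""
    if chosen_license is None or not chosen_license.strip():
        raise MissingLicenseError(
            "Please choose a license before uploading; content without a "
            "license is 'All Rights Reserved' and cannot be reused."
        )
    return chosen_license.strip()
```

The check only enforces that a choice is made; which license is chosen remains the uploader's decision.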
5.2. Repositories may play an important role in educating uploaders about open licensing.
5.2.1. The importance of making their work open access should be explained to uploaders prior to upload.
The benefits of open access work include the following. Researchers and their institutions benefit from having a wider audience. Open access allows the use of text and data mining tools without legal barriers. Funders receive a greater return on their investment when the results of research can be utilised by more people and at an earlier date.
5.2.2. The SPARC Europe resource offers a useful summary of the benefits of making work open access.
6.1. Uploaders should be offered all possible guidance and explanation regarding the various licences open to them, and the degree to which these are compatible with open access principles.
6.1.1. This can be done by incorporating some form of ‘Licence Selector’ tool into the upload process. The tools featured here offer examples of how this can be achieved.
6.2. CC BY 4.0 may be considered as a default standard licence, except in the case of data and datasets.
However, any default licence provided should always be accompanied by a selection of alternative licences and comprehensive explanations of the function of each.
The CC BY 4.0 licence is often considered the ‘gold standard’ open access licence, since it is the least restrictive: it allows people to use the licensed content as they choose, provided attribution is given, and it is fully OA compliant. As a note of caution, however, it should always be the uploader who makes the final licence selection.
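The combination of a suggested default and a free final choice can be sketched in a few lines of Python. This is only an illustration: the names `DATA_TYPES`, `preselected_licence` and `final_licence` are hypothetical, the identifiers are SPDX-style strings, and software (see point 6.4) is deliberately left out of this sketch.

```python
from typing import Optional

# Content types treated as data in this sketch (an assumption for illustration).
DATA_TYPES = {"data", "dataset", "database"}


def preselected_licence(content_type: str) -> str:
    """Suggest CC0 for data and CC BY 4.0 for other content."""
    return "CC0-1.0" if content_type.lower() in DATA_TYPES else "CC-BY-4.0"


def final_licence(content_type: str, uploader_choice: Optional[str] = None) -> str:
    """The uploader's explicit choice always overrides the pre-selected default."""
    return uploader_choice or preselected_licence(content_type)
```

With this design the default is only a pre-selected option in the form: `final_licence("article", "CC-BY-SA-4.0")` returns the uploader's choice, while `final_licence("dataset")` returns the suggested CC0 when the uploader accepts the pre-selection.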
6.3. Where uploaders select a licence which is less compatible with open access/science requirements, this should be made clear to them.
This is particularly relevant where uploaders choose Creative Commons licences with NC (non-commercial) or ND (no derivatives) conditions. These licences have been described by Creative Commons as failing to promote ‘free culture’.
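A licence selector can surface this information at the moment of choice. The Python sketch below is a hedged illustration: `open_access_warnings` is a hypothetical helper, the warning texts are example wording, and the check simply looks for the NC and ND elements in a Creative Commons licence identifier such as `CC-BY-NC-ND-4.0` or `CC BY-NC 4.0`.

```python
import re
from typing import List


def open_access_warnings(licence_id: str) -> List[str]:
    """Return advisory messages for licence elements that limit reuse."""
    # Split on hyphens and whitespace so both "CC-BY-NC-4.0" and "CC BY-NC 4.0" work.
    parts = re.split(r"[\s-]+", licence_id.strip().upper())
    warnings = []
    if "NC" in parts:
        warnings.append("NC (non-commercial) restricts reuse and is less "
                        "compatible with open access requirements.")
    if "ND" in parts:
        warnings.append("ND (no derivatives) prevents adaptation and is less "
                        "compatible with open access requirements.")
    return warnings
```

The function only informs; it never blocks the selection, in keeping with the principle that the final decision rests with the uploader.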
6.4. In the case of software, application of a GNU GPL or BSD/Apache style licence is recommended.
These licences are:
6.5. In the case of public sector information, application of an Open Government Licence is mandated by the UK Government Licensing Framework (UKGLF) for all public sector information.
6.6. Ultimately, however, the final decision regarding which licence is applied should rest with the uploader.
6.7. The resources featured here offer comprehensive discussion regarding the benefits of open access principles, and provide an example of how these might be expressed to uploaders.
6.8. Account must be taken of any external limitations on the uploader’s choice of licence.
This may be as a result of funding body stipulations or publishers’ requirements.
6.8.1. The resources featured here, including the European Commission H2020 guidance, provide an example of possible funding body stipulations regarding making work open access and how this should be done.
OpenAIRE has received funding from the European Union's Horizon 2020 Research and Innovation programme under Grant Agreements No. 777541 and 101017452.
Unless otherwise indicated, all materials created by OpenAIRE are licenced under CC ATTRIBUTION 4.0 INTERNATIONAL LICENSE.