
Guides for Content Providers

Making your repository Open

An Open Science checklist on how to license repositories

This guide is a companion Open Science (OS) checklist for Content Providers on how to license repositories. It offers a state-of-the-art, legally advanced, but still manageable set of rules, guidelines, and resources to enable the full potential of OS in the EU research field, with a view to addressing copyright and related rights issues.

  • 1. Apply the right licence to your repository

    1.1. One of the best licenses you can use for your repository is a CC BY 4.0 license, specifying that “unless otherwise noted, this repository is under a CC BY 4.0 license”.

    We recommend using a CC BY 4.0 license as a repository license for the following reasons:

    • Creative Commons licenses are internationally recognised, well-established, and both human-readable and machine-readable;
    • CC BY 4.0 licenses meet the definition of “open access” as defined in the Budapest, Bethesda, and Berlin declarations on open access;
    • CC BY 4.0 is one of the most compatible licenses for interoperability purposes.
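
    The “unless otherwise noted” statement from point 1.1 can also be published in machine-readable form. The following is a minimal Python sketch, not a prescribed implementation: the function name `repository_license_notice` and the repository name are hypothetical, and the `rel="license"` link attribute is assumed here as the marking convention.

```python
from html import escape

# Canonical URL of the CC BY 4.0 license deed.
CC_BY_4_URL = "https://creativecommons.org/licenses/by/4.0/"


def repository_license_notice(repository_name: str) -> str:
    """Return an HTML snippet carrying the repository-level license notice.

    The rel="license" attribute lets crawlers detect the license,
    while the visible text keeps the notice human-readable.
    """
    name = escape(repository_name)  # avoid injecting raw HTML from the name
    return (
        f"<p>Unless otherwise noted, {name} is under a "
        f'<a rel="license" href="{CC_BY_4_URL}">CC BY 4.0 license</a>.</p>'
    )


# Hypothetical repository name, for illustration only.
print(repository_license_notice("Example University Repository"))
```

    Such a snippet would typically sit in a page footer or in the terms of service page, so that both users and crawlers encounter it.
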
    Legal Sources:

    The following declarations and statements provide definitions of Open Access:

    The following sources from Creative Commons and OpenMinTeD provide details on compatibility between licenses (both generally and for the purposes of text and data mining):

    See the following source by Creative Commons UK for details on both how and why you should be interested in making your work open access:

    Creative Commons provide guidance on how to select the most appropriate license for your work (depending on your sharing preferences), as well as how to mark your work once your appropriate license has been identified:

    1.1.1. If you follow point 1.1 it means that the license applies to all “works” or other subject matter in the repository.

    This includes:

    • The repository as a copyright protected database (in case it qualifies);
    • The repository as a sui generis database right protected database (in case it qualifies);
    • The elements composing the database which can be:
      • Not protected, such as a database of temperature measurements. In this case, as these data are not protected in themselves you don’t need a license. CC licenses are written in a way that you only have to accept them if you need permission to use something.
      • Protected (e.g. a database of journal articles)

    For University repositories, it is likely that several of these elements co-exist, but it could also well be that the repository is not a protected database. In either case CC licenses are a good choice because (avoiding technicalities) they only regulate a use if that use requires permission. From this point of view, CC licenses are self-limiting: they apply only when permission is necessary.

    Legal Sources:

    See the following source for further details on the EU legislation regulating the legal protection of databases:

    For confirmation of same see:

    1.1.2. However, this could become problematic when, as in the case of University repositories, the owner of the repository (the University) and the owner of the journal article (the author, unless they transferred the copyright) are different people.

    Therefore, by using the recommended “unless otherwise noted” wording, you clarify that the elements that belong to third parties (e.g. journal articles) are distributed under their own license terms (which, as you will see later, are ideally CC0 or CC BY).
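
    The “unless otherwise noted” rule can be read as a simple fallback: a license noted on an item takes precedence over the repository-level notice. The sketch below illustrates this under stated assumptions: the `Item` class, the `effective_license` function, and the use of SPDX-style identifiers are hypothetical choices, and the fallback is only meaningful for material the repository itself is entitled to license.

```python
from dataclasses import dataclass
from typing import Optional

# Repository-level notice (SPDX-style identifier, assumed for illustration).
REPOSITORY_DEFAULT = "CC-BY-4.0"


@dataclass
class Item:
    title: str
    # License noted on the item itself; None means nothing was noted.
    license_id: Optional[str] = None


def effective_license(item: Item, default: str = REPOSITORY_DEFAULT) -> str:
    """Return the license that governs an item.

    An item-level note overrides the repository notice. Items without
    a note fall back to the repository license, which is appropriate
    only for material the repository owner may license.
    """
    return item.license_id if item.license_id else default


print(effective_license(Item("Repository landing page")))
print(effective_license(Item("Third-party dataset", "CC0-1.0")))
```
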

    It is important to license the repository as a database under an open access compliant license. This is because when a user uses aggregated data (such as in data analytics, text and data mining, etc.) in order to crawl, scrape or analyse the database, authorisation (e.g. a license or an exception if it exists) is often necessary. But if you have applied a CC BY to your repository this is already taken care of!

    Legal Sources:

    In order to meet the definition of open access as provided in the Budapest Declaration, users must be able to crawl the database:

    See also recommendations made to data and e-infrastructure providers in the source below which confirm CC BY 4.0 is the most appropriate license for data access:

    OpenAIRE provides a tool which tests online repository compliance with Open Science guidelines: Validator service.

    1.2. The CC BY 4.0 license should be incorporated into the terms of service of the repository.

    Legal source:

    Terms of service are general rules about how a service, such as a website, can be used. These may include a multitude of conditions, such as privacy policies, limitations of liability, and codes of conduct. All users of the service have to agree to the terms of service.


    Creative Commons provide guidance on how to integrate their licenses within your terms of service:

    1.2.1. The CC BY 4.0 license exists as a separate legal document from the terms of service.

    As such, it must be incorporated by reference into the contractual, and broader terms of service which govern all uses of the repository. Creative Commons provide guidance on how to incorporate the CC BY 4.0 license into the repository terms of use.



  • 2. Don’t forget the metadata

    2.1. You should provide metadata in order to enhance discoverability of your resources.


    See the following sources for a discussion on the merits of using metadata in your repository:

    Creative Commons provide technical guidance on how to implement metadata via HTML, as well as providing a generation tool for embedding metadata within files:

    Application of a CC0 license for metadata is increasingly recognised as a community standard in the following institutions:

    2.1.1. Providing machine-readable bibliographic metadata is a requirement for projects which are funded under Horizon 2020.

    Legal source:

    See H2020 Framework Programme Regulation (EU) No 1291/2013 of the European Parliament and of the Council of 11 December 2013 establishing Horizon 2020 - The Framework Programme for Research and Innovation (2014 - 2020) (OJ L 347, 20.12.2013, p. 104) for the legal basis, and more specifically article 29.2 of the H2020 Programme: AGA - Annotated Model Grant Agreement (2018). (last accessed 1 October 2018)


    The requirement for machine-readable bibliographic metadata is detailed by the European Commission in the following sources:

    • European Commission (date unknown) “Open Access”. (last accessed 13 July 2018), see “Step 2 - Providing open access to publications”
    Any projects funded by Horizon 2020 must have machine-readable bibliographic metadata, as detailed in:

    2.2. Metadata are often not protected as such because they are factual information, and thus not original or not substantial.

    However, in certain cases, complex and elaborate metadata could perhaps be protected. For the avoidance of doubt, apply a CC0 to your metadata. In this way, in those cases when a right exists, you are waiving it and allowing other people to reuse the metadata information.

    CC BY should be avoided for metadata unless you know exactly what it entails.

    Legal sources:

    Applying a CC BY license may result in “copyfraud” in countries where metadata is not eligible for copyright protection (as in this case the application of a CC BY license imposes more restrictive conditions than what the metadata is actually entitled to). Confirmed in:


    Currently, 61% of the open access academic repositories listed on OpenDOAR have no clear metadata policy, as detailed in:

    2.2.1. In those few cases when metadata can be considered original works, thus protected by copyright, they will enjoy both economic and moral rights.

    Moral rights are recognised in most countries (but with important exceptions, such as the US), and may be unwaivable.

    This should not represent an issue in the case of CC0, as the waiver clarifies that it only waives the rights as long as this is permitted under applicable law. So if you enjoy unwaivable moral rights, CC0 will not affect your moral rights.

    Legal sources:

    For an example of a jurisdiction with unwaivable moral rights, see Kreutzer’s discussion of the German position, and why the application of a CC0 license in this situation is still valid:

    2.3. For metadata to be used meaningfully, it must be standardised to optimise machine-reading (a requirement of H2020 projects). A commonly used format in libraries and cultural heritage institutions is Dublin Core.
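
    As an illustration of standardised, machine-readable metadata, the sketch below builds a simple Dublin Core record with Python's standard library. It is a minimal example under stated assumptions: the `dublin_core_record` function and all field values are hypothetical, and the `oai_dc` container is assumed as the wrapper element. Note that `dc:rights` here describes the license of the described work, not of the metadata record itself.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespaces of the oai_dc container and the Dublin Core element set.
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)


def dublin_core_record(fields: Dict[str, List[str]]) -> str:
    """Serialise Dublin Core elements (name -> list of values) as XML."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for name, values in fields.items():
        for value in values:  # Dublin Core elements are repeatable
            ET.SubElement(root, f"{{{DC}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record values, for illustration only.
record = dublin_core_record({
    "title": ["Example article title"],
    "creator": ["Doe, Jane"],
    "date": ["2018-07-04"],
    "type": ["Text"],
    "rights": ["https://creativecommons.org/licenses/by/4.0/"],
})
print(record)
```
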

    Legal Source:

    The following source details issues which arise from using inconsistent metadata practices:

    • Knoth, Petr (2013). From open access metadata to open access content: two principles for increased visibility of open access content. Open Repositories 2013, 8-12 Jul 2013, Charlottetown, Prince Edward Island, Canada. (online resource, last accessed: 4 July 2018)

    RIOXX provide a tool which tests repositories’ metadata compliance with open access standards:

    • RIOXX (date unknown) The RIOXX Metadata Profile and Guidelines (last accessed: 4 July 2018)

    For details on the formatting and implementation of Dublin Core bibliographic metadata, see:

    • Dublin Core (date unknown) Dublin Core Metadata Initiative (last accessed: 3 September 2018)
  • 3. Content should also be licensed

    3.1. In point 1 you have applied a license to your repository, and to its content “unless otherwise noted”. Now let’s take a look at the “unless otherwise noted” part.

    As a repository manager, you (or the University) usually don’t own the copyright in the articles uploaded (unless you have written them).

    Therefore, the repository has to implement a license selection procedure that allows the uploader (author or rightsholder) to choose the proper license. As further detailed in point 6, this process should offer a number of choices to the author, but it should be the author who ultimately decides what license to use.

    Nevertheless, you can offer and implement some guidance that will help researchers to make the right choice and to adhere to Open Science principles.

    In points 4 and 5 you will see what choices are recommended for a) data and databases; and b) articles.

    Legal source:

    See guidelines on open access as provided by the European Commission for details on self-archiving and open access publishing:

    • European Commission (date unknown) “Open Access”. (last accessed 13 July 2018), see “Step 2 - Providing open access to publications”

    3.2. We recommend a CC BY 4.0 license in respect of the content of the repository. This is detailed further in Point 5. This may not be appropriate for data or datasets as detailed in Point 4.

    Legal Source:

    For discussions on the merits of CC BY see the following resources and details in point 5:

    Amiel, T. and Soares, T.C. (2016) Identifying Tensions in the Use of Open Licenses in OER Repositories. International Review of Research in Open and Distributed Learning, 17(3) (last accessed: 5 July 2018), see “Licensing”

    Creative Commons UK (2017) Frequently Asked Questions on Creative Commons & Open Access. Zenodo. (last accessed: 3 July 2018), see “How should I license my work for the purposes of Open Access?”

    Mewhort, K. (2012) Creative Commons Licenses: Options for Canadian Open Data Providers. (last accessed: 5 July 2018)


    Some funders and institutions may require that any outputs are made available under a CC BY license, some of which are detailed in the following source:

    3.3. Creative Commons licenses are not appropriate for software.

    Instead, we would recommend applying a GNU GPL v3.0 or a BSD/Apache style license. These are some of the most well-established public licenses for free and open source software: the GNU GPL is based on the “copyleft” concept, while BSD and Apache style licenses are permissive. They are also highly interoperable with other licenses.

    Legal sources:

    Creative Commons confirm that their licenses should not be applied to software:

    • Creative Commons (2018) Frequently Asked Questions. (last accessed: 4 July 2018), see “Can I apply a Creative Commons license to software?”

    The following open access software licenses are appropriate substitutes for a CC BY or CC0 license:

    3.4. Whilst Creative Commons licenses may apply to both digital and non-digital content, this guide currently provides advice only in respect of fully digital repositories.

    Libraries and cultural heritage institutions may need to audit their non-digitised resources, and check for complex or multilayered content (e.g. multiple authors, orphan works etc.)

    Legal source:

    For guidance on digitising works for the purpose of creating a digital repository, see the following sources:

    • Hamilton, G. and Saunderson, F. (2017) Open Licensing for Cultural Heritage. London, Facet Publishing, chapter 12 (p167-p193)
    • Jordan, M. (2006) Putting Content Online: A Practical Guide for Libraries. Oxford, Chandos Publishing.

    This may be a particularly relevant consideration for any projects funded by the European Research Council, which require that all project materials be machine-readable. In this case, scans are not acceptable, which may impact non-digital repositories:

    3.4.1. If you need to digitise materials for use on an online repository, note that there may be special considerations for traditional knowledge works.


    Both the Alaska Native Knowledge Network and Charles Darwin University provide examples of special considerations and guidance when digitising or collecting traditional knowledge works:

  • 4. Data, datasets and databases should be under CC0

    4.1. Data, datasets and databases should be offered without restrictions on use, meaning under a CC0.

    • Data: as such not protected by copyright.
    • Dataset: not defined by law, can include database (as defined by law) and other structured and unstructured data.
    • Database: defined by law as “a collection of independent works, *data* or other materials arranged in a systematic or methodical way and individually accessible by electronic or other means.”

    Remember that a database can be protected by copyright (database structure) and/or the sui generis database right, SGDR (substantial investment in obtaining, verifying or presenting data), without prejudice to any copyright or other rights in the underlying material.

    Applying CC0 to a database means that any rights that exist are waived; where no rights exist, CC0 does not create any obligation. If a waiver is not possible, CC0 operates as a license to the same effect, within the limits of applicable law.

    Legal sources
    • Creative Commons (2018) Open Data Guide. (last accessed: 3 July 2018).
    • Creative Commons (2018) Open Science. (last accessed: 3 July 2018).

    4.1.1. The advantages of making data available without restrictions include:

    • Greater availability and accessibility of publicly funded scientific research outputs;
    • Possibility for rigorous peer-review processes;
    • Greater reproducibility and transparency of scientific works;
    • Greater impact of scientific research.
    Legal source
    • United Nations Educational, Scientific and Cultural Organisation (2018) Global Open Access Portal. (last accessed: 20 August 2018).

    4.2. Where the uploader is concerned about attribution, they can ‘kindly request’ to be attributed, rather than using the more legally restrictive CC-BY license.

    This request is not legally binding, although it follows standard scholarly practice in crediting researchers for their work.

    Legal source

    4.2.1. When ‘kindly requesting’ attribution of a work, the uploader should be advised to offer a citation which can be easily copied and pasted by subsequent users.
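
    A repository could generate such a ready-to-copy citation from the record's own metadata. The following is a hedged sketch: the `suggested_citation` function, the citation layout, and all sample values (including the `example.org` identifier) are hypothetical and not a required citation style.

```python
from typing import Sequence


def suggested_citation(
    creators: Sequence[str],
    year: int,
    title: str,
    repository: str,
    identifier: str,
) -> str:
    """Build a plain-text citation that users can copy and paste."""
    names = "; ".join(creators)
    return f"{names} ({year}). {title}. {repository}. {identifier}"


# Hypothetical record, for illustration only.
print(
    suggested_citation(
        ["Doe, Jane", "Roe, Richard"],
        2018,
        "Example dataset",
        "Example University Repository",
        "https://example.org/record/1",
    )
)
```

    Displaying this string next to the CC0 notice keeps the request for attribution visible without turning it into a legal condition.
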

    Legal source

    4.2.2. Tools such as the Creative Commons ‘Open Attribute’ tool are available to assist with ensuring adequate attribution.

    Legal source

    4.3. Where there are concerns regarding privacy or data protection, these should be dealt with under the relevant legislation or ethics policies.

    Legal source

    4.3.1. Both the BMC consultation and the PLOS data policy address these issues.

    Legal sources
  • 5. Other works of authorship (articles, images etc) should be licensed under a CC-BY 4.0 licence

    Suggest which license should be chosen to meet OS requirements, but let the uploader choose.

    5.1. Give the uploader the possibility to choose the license.

    You can indicate which licenses are better for OA/OS, but don’t choose for them.

    To avoid ambiguity, uploaders should be expected to apply a license at the point of upload. Failure to apply a license at upload results in ‘All Rights Reserved’, which generally means people are unable to use, re-use, modify or data-mine the unlicensed content without authorisation.
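
    One way to enforce this at the point of upload is to reject submissions that carry no license. The sketch below assumes a hypothetical `validate_upload` function and `MissingLicenseError` exception; a real repository platform would hook an equivalent check into its own submission workflow.

```python
from typing import Optional


class MissingLicenseError(ValueError):
    """Raised when an upload does not state a license."""


def validate_upload(title: str, license_id: Optional[str]) -> dict:
    """Accept an upload only if the uploader has chosen a license.

    Without a license the content would default to 'All Rights
    Reserved', so the submission is refused instead of stored.
    """
    if license_id is None or not license_id.strip():
        raise MissingLicenseError(
            f"'{title}' has no license; please choose one before uploading."
        )
    return {"title": title, "license": license_id.strip()}


print(validate_upload("Example article", "CC-BY-4.0"))
```
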

    Legal sources:

    5.2. Repositories may play an important role in educating uploaders about open licensing.

    Legal source:

    5.2.1. The importance of making their work open access should be explained to uploaders prior to upload.

    The benefits of open access work include:

    • Researchers and their institutions benefit from having a wider audience;
    • Open access allows use of text and data mining tools, without legal barriers;
    • Funders receive a greater return on their investment when results of research can be utilised by more people and at an earlier date.

    Legal sources:

    5.2.2. The SPARC Europe resource offers a useful summary of the benefits of making work open access.

    Legal source:
  • 6. Repositories should recommend the best OS licences but it should be the uploader who chooses which one

    6.1. Uploaders should be offered all possible guidance and explanation on the various licences open to them, and the degree to which these are compatible with open access principles.

    Legal source:

    6.1.1. This can be done by incorporating some form of ‘Licence Selector’ tool into the upload process. The tools featured here offer examples of how this can be achieved.
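
    A very small ‘Licence Selector’ could be sketched as follows. The recommendations mirror this guide (CC0 for data, CC BY 4.0 for articles and images, GNU GPL or BSD/Apache style licences for software), while the `LICENCE_OPTIONS` table, the `licence_choices` function, the content type names, and the SPDX-style identifiers are assumptions made for illustration. The recommended licence is listed first, but the uploader still chooses.

```python
from typing import Dict, List

# For each content type, the first entry is the recommended default;
# the remaining entries are alternatives offered to the uploader.
LICENCE_OPTIONS: Dict[str, List[str]] = {
    "data": ["CC0-1.0", "CC-BY-4.0"],
    "article": ["CC-BY-4.0", "CC-BY-SA-4.0", "CC-BY-NC-4.0", "CC-BY-ND-4.0"],
    "image": ["CC-BY-4.0", "CC-BY-SA-4.0", "CC-BY-NC-4.0", "CC-BY-ND-4.0"],
    "software": ["GPL-3.0-or-later", "Apache-2.0", "BSD-3-Clause"],
}


def licence_choices(content_type: str) -> List[str]:
    """Return the licences to offer, recommended one first."""
    try:
        return list(LICENCE_OPTIONS[content_type])
    except KeyError:
        raise ValueError(f"Unknown content type: {content_type!r}") from None


for kind in LICENCE_OPTIONS:
    options = licence_choices(kind)
    print(f"{kind}: recommended {options[0]}, alternatives {options[1:]}")
```
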

    Legal sources:

    6.2. CC-BY 4.0 may be considered as a default standard licence, except in the case of data and datasets.

    However any default licence provided should always be accompanied by a selection of alternative licences and comprehensive explanations about the function of each.

    The CC-BY 4.0 licence is often considered the ‘gold standard’ open access licence, since it is the least restrictive of the Creative Commons licences, allows people to use the licensed content as they choose provided attribution is given, and is fully OA compliant. As a note of caution, however, it should always be the uploader who makes the final licence selection.

    Legal source:

    6.3. Where uploaders select a licence which is less compatible with open access/science requirements, this should be made clear to them.

    This is particularly relevant where uploaders choose Creative Commons licences with NC (non-commercial) or ND (no derivatives) conditions. These licences have been described by Creative Commons as failing to promote ‘free culture’.
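
    The notice to uploaders can be automated by inspecting the chosen licence for NC and ND elements. This is a minimal sketch that assumes SPDX-style identifiers such as `CC-BY-NC-ND-4.0`; the `open_access_warnings` function and the warning wording are hypothetical.

```python
from typing import List


def open_access_warnings(licence_id: str) -> List[str]:
    """Return warnings for Creative Commons licences with NC or ND terms."""
    parts = licence_id.upper().split("-")
    warnings: List[str] = []
    if parts[0] != "CC":
        return warnings  # only CC BY-family identifiers are inspected
    if "NC" in parts:
        warnings.append(
            "This licence has a non-commercial (NC) condition and is less "
            "compatible with open access requirements."
        )
    if "ND" in parts:
        warnings.append(
            "This licence has a no-derivatives (ND) condition and is less "
            "compatible with open access requirements."
        )
    return warnings


for chosen in ("CC-BY-4.0", "CC-BY-NC-ND-4.0"):
    print(chosen, open_access_warnings(chosen))
```

    The uploader remains free to keep the selected licence; the warnings only make the trade-off explicit.
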

    Legal source:

    6.4. In the case of software, application of a GNU GPL or BSD/Apache style licence is recommended.

    These licences are:

    • Among the most well-established public licences for free software, and
    • Among the most interoperable licences, both in terms of general use and for TDM purposes.
    Legal sources:

    6.5. In the case of public sector information, application of an Open Government Licence is mandated by the UK Government Licensing Framework (UKGLF).

    Legal source:

    6.6. Ultimately, however, the final decision on which licence is applied should rest with the uploader.

    Legal sources:

    6.7. The resources featured here offer comprehensive discussion regarding the benefits of open access principles, and provide an example of how these might be expressed to uploaders.

    Legal sources:

    6.8. Account must be taken of any external limitations on the uploader’s choice of licence.

    This may be as a result of funding body stipulations or publishers’ requirements.

    Legal source:

    6.8.1. The resources featured here, including the European Commission H2020 guidance, provide examples of possible funding body stipulations on making work open access and how this should be done.

    Legal sources:
