Open Innovation Call - Challenge #1

Vision and Description text

The vision underlying the work of Next Generation Repositories is, “to position repositories as the foundation for a distributed, globally networked infrastructure for scholarly communication, on top of which layers of value added services will be deployed, thereby transforming the system, making it more research-oriented, open to and supportive of innovation, while also collectively managed by the scholarly community.”

An important trait of this vision is that repositories will provide access to a wide variety of research outputs, generating the conditions where a greater diversity of contributions to the scholarly record will be accessible, and formally recognized in research assessment processes.

In April 2016, the Confederation of Open Access Repositories (COAR) launched the Next Generation Repository Working Group to identify new functionalities and technologies for repositories. A full report with recommendations for the adoption of new technologies, standards and protocols that will help repositories become more integrated into the web environment and enable them to play a larger role in the scholarly communication ecosystem is here:

https://www.coar-repositories.org/files/NGR-Final-Formatted-Report-cc.pdf

Problem
In general, repositories form a distributed global knowledge network and have the potential to promote the transformation of the scholarly communication ecosystem. One of the obstacles so far has been the lack of web-based integration with other innovative scholarly services. There is a clear need for repository platforms to adopt modern web technologies and protocols that will allow them to interact better with more innovative and sophisticated networked scholarly tools and services.
Challenge

Building on the outcomes of the COAR Next Generation Repositories Working Group, OpenAIRE-Advance calls on interested SMEs (developers, data analysts, etc.) and Young Innovators to develop functionalities and demonstrate use cases that support the implementation of next generation networked services (repositories) within OpenAIRE.

This call will specifically address the following needs/functionalities related to next generation repositories:

  1. Plug-ins and/or add-ons for the implementation of Signposting in DSpace 5 and DSpace 6
  2. Implementation of ResourceSync for EPrints
  3. Innovative techniques/methods for visualizing usage data from and for individual repositories and aggregator services
What we are looking for in this challenge

We are looking for software services (new or already working) that will promote OpenAIRE services towards next generation repositories and can be successfully integrated into the OpenAIRE infrastructure. Select at least one of the categories of action listed below. Proposed services that combine more than one category will gain an advantage (see criteria of selection below).

Definitions

The next generation repository

  • manages and provides access to a wide diversity of resources, including published articles, pre-prints, datasets, working papers, images, software, and so on
  • is resource-centric/oriented, making resources the focus of its services and infrastructure
  • is a networked repository. Cross-repository connections are established by introducing bi-directional links as a result of an interaction between resources in different repositories, or by overlay services that consume activity metadata exposed by repositories
  • is machine-friendly, enabling the development of a wider range of global repository services, with less development effort
  • is active and supports versioning, commenting, updating and linking across resources
Characteristics

Next generation repositories have four main characteristics:

  • Resource-centric/oriented; the focus is on resources rather than metadata, which affects the services and infrastructures that use them
  • Networked; they do not operate in isolation but are cross-connected with other repositories
  • Machine-friendly; they support machine access to resources, using batch, navigation and notification access mechanisms
  • Active; they support versioning, commenting, updating and linking across resources, so that related systems are notified when content changes over time
Important – Categories of actions to work on

Potential technologies, standards, protocols and behaviors to work on and align with the next generation repositories need to support the following actions:

Exposing Identifiers
Signposting is an approach to inform machine agents about the nature of the resources that are linked to the resource they currently interact with. Info: http://signposting.org/
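A minimal sketch of how a machine agent might read Signposting links, assuming the repository exposes them as typed links in an HTTP Link header. The relation types cite-as, item and license follow the Signposting conventions; the example URLs and the function name parse_link_header are illustrative assumptions, and the parser is deliberately simplified rather than a complete Web Linking implementation.

```python
import re

# Simplified patterns: a <target> followed by ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,]*))*)')
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,]*))')


def parse_link_header(value):
    """Group the typed links of a Link header value by relation type."""
    links = {}
    for target, raw_params in LINK_RE.findall(value):
        attrs = {}
        for name, quoted, bare in PARAM_RE.findall(raw_params):
            attrs[name.lower()] = quoted or bare.strip()
        # A single link may carry several space-separated relation types.
        for rel in attrs.pop("rel", "").split():
            links.setdefault(rel, []).append({"href": target, **attrs})
    return links


# Hypothetical header, as a repository landing page might return it.
example_header = (
    '<https://doi.org/10.5555/12345678>; rel="cite-as", '
    '<https://repo.example.org/bitstream/1/paper.pdf>; rel="item"; type="application/pdf", '
    '<https://creativecommons.org/licenses/by/4.0/>; rel="license"'
)
signposts = parse_link_header(example_header)
print(signposts["cite-as"][0]["href"])
print(signposts["license"][0]["href"])
```

The same header carries the persistent identifier (cite-as), the content files (item) and the license, so one lookup can serve several of the actions listed in this section.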
Declaring Licenses at a Resource Level
Discovery through Navigation: improving the discoverability of resources through navigation in repositories
Signposting (see above)
Interacting with Resources (Annotation, Commentary and Review)
  • Activity Streams 2.0 is an approach to describe interactions with resources, including commenting, liking, sharing, etc. Interactions are expressed as JSON-LD and use the Activity Streams 2.0 vocabulary. While this core vocabulary is targeted at general social web activities, extensions can be created to support scholarly use cases. Info: https://www.w3.org/TR/activitystreams-core/; https://www.w3.org/TR/activitystreams-vocabulary/
  • Web Annotation Model and Web Annotation Protocol specify an approach to express annotations (including commentary, review, etc.) and an associated protocol to create and manage them. Annotations are expressed using an RDF-based vocabulary and can be rendered as JSON-LD. The protocol is based on HTTP and adheres to REST design principles. https://www.w3.org/TR/annotation-model/ ; https://www.w3.org/TR/annotation-protocol/
  • International Image Interoperability Framework (IIIF) is a family of APIs that enable easy reuse and sharing of, and interaction with, images for annotation, transcription, composition, authenticated access, etc. Although it is a technology specific to one kind of content in the repository, we firmly believe it is a good example to highlight, as it emphasizes the distributed nature of Next Generation Repositories. Info: http://iiif.io/
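As a rough illustration of the Web Annotation Model mentioned in the list above, the following sketch builds a comment on a repository resource as JSON-LD. The context URI, the "Annotation" type, the "commenting" motivation and the media type profile come from the W3C specifications; the function name, the resource URL and the creator URI are hypothetical.

```python
import json

ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"
# Media type the Web Annotation Protocol uses when an annotation is POSTed to a container.
ANNO_MEDIA_TYPE = 'application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"'


def make_comment_annotation(target_url, comment, creator_uri):
    """Return a Web Annotation (as a dict) that comments on target_url."""
    return {
        "@context": ANNO_CONTEXT,
        "type": "Annotation",
        "motivation": "commenting",
        "creator": creator_uri,
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": target_url,
    }


# Hypothetical resource and commenter.
annotation = make_comment_annotation(
    "https://repo.example.org/record/42",
    "The dataset in section 3 is missing units.",
    "https://orcid.org/0000-0002-1825-0097",
)
print(json.dumps(annotation, indent=2))
```

The annotation deliberately has no "id": under the Web Annotation Protocol the server that stores the annotation assigns its IRI on creation.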
Resource Transfer
  • IPFS is a promising emerging peer-to-peer hypermedia protocol designed to make the web faster, safer, and more open. IPFS should be considered as a possible approach for cases where large data collections need to be shared among a number of parties, each of which actively operates an IPFS node. Tip: https://ipfs.io/
  • ResourceSync is a specification based on the Sitemaps XML format that repository managers can use to provide information that allows third-party systems to remain in sync with the resources in their repository as they evolve. ResourceSync can be used for discovery and synchronization of both content and metadata.
  • SWORD (Simple Web-service Offering Repository Deposit) is a lightweight protocol for depositing content from one location to another. It is a profile of the Atom Publishing Protocol. Info: http://swordapp.org/about/
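To make the ResourceSync description above more concrete, here is a minimal sketch that serializes a Resource List, the ResourceSync document that enumerates the resources available for synchronization. The two namespace URIs and the rs:md element with capability="resourcelist" follow the ResourceSync specification; the function name, the record URLs and the timestamps are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SM_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS_NS = "http://www.openarchives.org/rs/terms/"

# Serialize Sitemaps elements in the default namespace and ResourceSync ones as rs:*.
ET.register_namespace("", SM_NS)
ET.register_namespace("rs", RS_NS)


def build_resource_list(resources, generated_at):
    """Serialize a ResourceSync Resource List for the given resources."""
    urlset = ET.Element(f"{{{SM_NS}}}urlset")
    # Document-level metadata: which capability this is and when it was generated.
    ET.SubElement(urlset, f"{{{RS_NS}}}md",
                  {"capability": "resourcelist", "at": generated_at})
    for res in resources:
        url = ET.SubElement(urlset, f"{{{SM_NS}}}url")
        ET.SubElement(url, f"{{{SM_NS}}}loc").text = res["loc"]
        ET.SubElement(url, f"{{{SM_NS}}}lastmod").text = res["lastmod"]
        # Per-resource metadata helps a harvester decide what to fetch.
        ET.SubElement(url, f"{{{RS_NS}}}md",
                      {"type": res["type"], "length": str(res["length"])})
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical repository content.
resource_list_xml = build_resource_list(
    [
        {"loc": "https://repo.example.org/record/1/paper.pdf",
         "lastmod": "2019-03-01T10:00:00Z", "type": "application/pdf", "length": 482113},
        {"loc": "https://repo.example.org/record/1/metadata.xml",
         "lastmod": "2019-03-02T08:30:00Z", "type": "application/xml", "length": 2048},
    ],
    generated_at="2019-03-03T00:00:00Z",
)
print(resource_list_xml)
```

Because the output is also a valid Sitemap, the same document can double as the repository inventory exposed to search engines.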
Batch Discovery
  • ResourceSync (see above)
  • Signposting (see above)
  • Sitemaps are widely used by webmasters to inform search engines about pages on their sites that are available for crawling. Repository managers can use Sitemaps as a straightforward way to expose a repository inventory to search engines. https://www.sitemaps.org/
Collecting and Exposing Activities
  • Activity Streams 2.0 (see above)
  • Linked Data Notifications is a notification protocol whereby any resource can advertise an inbox to which notifications pertaining to that resource can be posted. For example, an annotation, commenting, or reviewing application can post a notification to a resource’s inbox to inform that resource that an interaction occurred with it, what the nature of the interaction was, who the actor involved in the interaction was, etc. The payload of a notification is expressed as JSON-LD and uses the Activity Streams 2.0 vocabulary. A repository could support an inbox per resource, or an inbox for the entire repository. The repository could surface interactions that took place with its resources in the user interface, could further post them to the inbox of an aggregating application, or could expose them in the aggregate for further machine consumption using WebSub. Info: https://www.w3.org/TR/ldn/
  • ResourceSync Change Notifications is a publish/subscribe protocol based on WebSub that focuses on sending notifications about changes (create/update/delete) to resources in a repository to subscribers. It can be used for discovery and synchronization of both content and metadata and uses the Sitemaps XML format. Info: http://www.openarchives.org/rs/notification
  • Signposting (see above)
  • Webmention is a point-to-point, trackback/pingback approach aimed at informing a resource that it was linked from another resource. It allows, for example, the establishment of bidirectional links. Info: https://www.w3.org/TR/webmention/
  • WebSub is a publish/subscribe protocol, whereby a publisher posts resource updates to a channel on a hub and the hub subsequently relays those updates to channel subscribers. A repository could publish interactions that took place with its resources on a single channel, or on multiple channels, for example, one per type of activity (e.g. citation, review, annotating). Info: https://www.w3.org/TR/websub/
  • Other messaging protocols (e.g. AMQP, Kafka) provide a common mechanism for communication between publishers of any kind of Web content and their subscribers.
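The Linked Data Notifications flow described in the list above can be sketched as follows: a reviewing application builds an Activity Streams 2.0 payload and prepares a POST to a resource's inbox. The context URI and the application/ld+json content type come from the W3C specifications; the choice of the "Announce" activity type, the inbox URL and all other URLs are assumptions for illustration, and the request is only constructed, not sent.

```python
import json
import urllib.request

AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def build_notification(actor, activity_type, obj, target):
    """Describe an interaction with a resource using the AS2 vocabulary."""
    return {
        "@context": AS2_CONTEXT,
        "type": activity_type,
        "actor": actor,    # who performed the interaction
        "object": obj,     # what was produced, e.g. a review
        "target": target,  # the repository resource it concerns
    }


def prepare_ldn_post(inbox_url, notification):
    """Build (but do not send) the HTTP request that delivers the notification."""
    body = json.dumps(notification).encode("utf-8")
    return urllib.request.Request(
        inbox_url,
        data=body,
        headers={"Content-Type": "application/ld+json"},
        method="POST",
    )


# Hypothetical review service announcing a review of a repository record.
notification = build_notification(
    actor="https://reviews.example.org/",
    activity_type="Announce",
    obj="https://reviews.example.org/review/77",
    target="https://repo.example.org/record/42",
)
request = prepare_ldn_post("https://repo.example.org/inbox/", notification)
print(request.get_method(), request.full_url)
```

In a real exchange the sender would first discover the inbox from the target resource, which advertises it with the http://www.w3.org/ns/ldp#inbox relation, and would then submit the request with urllib.request.urlopen or a similar client.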
Identification of Users
  • ORCID is an HTTP(S) URI in the orcid.org domain aimed at unambiguously identifying a scholarly contributor. ORCIDs are increasingly used in a variety of scholarly workflows. A profile is associated with a contributor’s ORCID, which has both a human and machine-readable representation. The machine-readable profile is RDF-based and uses the FOAF vocabulary. The ORCID organization also provides authentication services that can be used in distributed settings. Info: https://orcid.org/
  • Social Network Identities are provided by several social network platforms. In many cases, these platforms also provide facilities for distributed authentication based on the social network identities they provide
  • WebID is an HTTP(S) URI which refers to an agent (person, organization, group, etc.) and that is minted in a domain that is typically owned by the agent. The WebID leads to a machine-readable profile that describes the agent. The RDF-based profile is fully under the agent’s control and uses the FOAF vocabulary. A WebID is commonly used in conjunction with the WebID/TLS authentication approach and the Web Access Control Lists authorization approach. Info: https://www.w3.org/2005/Incubator/webid/spec/identity/
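A repository that stores ORCID URIs may want to sanity-check them before use. The sketch below validates the form of an ORCID URI and its final check character, which ORCID computes with the ISO 7064 MOD 11-2 algorithm. The checksum is not mentioned in this call and is included only as an illustration; the function names are hypothetical.

```python
import re

ORCID_URI_RE = re.compile(r"^https?://orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$")


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid_uri(uri):
    """Return True if uri has the ORCID URI form and a correct check character."""
    match = ORCID_URI_RE.match(uri)
    if not match:
        return False
    digits = match.group(1).replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


print(is_valid_orcid_uri("https://orcid.org/0000-0002-1825-0097"))
print(is_valid_orcid_uri("https://orcid.org/0000-0002-1825-0098"))
```

A passing check only shows that the identifier is well formed; confirming that it belongs to the user still requires ORCID's authentication services.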
Authentication of Users
  • HTTP Signatures provide an authentication approach that is conceptually similar to WebID/TLS but more generic, in that it is not tied solely to the WebID concept. In addition to authentication, it allows verification that the communication between client and server was not tampered with. The approach is currently being standardized at the IETF. Info: https://datatracker.ietf.org/doc/draft-cavage-http-signatures/
  • OpenID Connect 1.0 is a simple identity layer on top of the OAuth 2.0 protocol, which is used for distributed authentication against compliant identity providers. OpenID Connect allows client applications - such as repositories and browsers - to verify a user’s claimed identity by authenticating the user against her identity provider. As a result of a successful authentication, basic profile information about the user can be passed along to the client application. The specification is extensible, allowing participants to use optional features such as encryption of identity data, discovery of OpenID Providers, and session management. Info: http://openid.net/connect/
  • WebID/TLS is a protocol that enables secure user authentication on the basis of the Transport Layer Security protocol (TLS), X.509 Certificates, and a WebID with associated profile. It enables a user to authenticate by simply choosing an appropriate certificate from the ones proposed by the browser. The certificate is used to sign a server’s challenge with the user’s private key but also to convey the user’s WebID. The WebID leads the server to the user’s profile, which contains her public key, allowing the server to verify that the challenge was met correctly. While this authentication approach is elegant, efficient, and fully distributed, its adoption has thus far been hindered by, among other things, issues with generating certificates and user interface challenges. Info: https://www.w3.org/2005/Incubator/webid/spec/tls/
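As one illustration of the OpenID Connect approach listed above, the sketch below builds the authorization request with which a repository, acting as the client application, would redirect a user to her identity provider. The parameter names (response_type, scope, client_id, redirect_uri, state, nonce) are those of the OpenID Connect authorization code flow; the endpoint, client identifier, redirect URI and function name are hypothetical.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorization_endpoint, client_id, redirect_uri):
    """Return the redirect URL plus the state and nonce the client must remember."""
    state = secrets.token_urlsafe(16)  # ties the callback to this browser session
    nonce = secrets.token_urlsafe(16)  # later checked against the ID token
    params = {
        "response_type": "code",
        "scope": "openid profile",     # "openid" is what makes this an OIDC request
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
        "nonce": nonce,
    }
    return f"{authorization_endpoint}?{urlencode(params)}", state, nonce


# Hypothetical identity provider and repository callback.
auth_url, auth_state, auth_nonce = build_authorization_url(
    "https://idp.example.org/authorize",
    "repository-client",
    "https://repo.example.org/oidc/callback",
)
print(auth_url)
print(sorted(parse_qs(urlparse(auth_url).query)))
```

After the user authenticates, the provider redirects back with a code and the same state value; the repository then exchanges the code for tokens at the provider's token endpoint, a step this sketch leaves out.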
Exposing Standardized Usage Metrics
  • Read also Resource Transfer (above)
  • COUNTER provides the standard that enables the knowledge community to count the use of electronic resources. Known as the Code of Practice, the standard ensures vendors and publishers can provide their library customers with consistent, credible and comparable usage data. Info: https://www.projectcounter.org/
  • SUSHI (Standardized Usage Statistics Harvesting Initiative) is an ANSI/NISO Standard that defines an automated request and response model for harvesting e-resource usage data. It is designed to work with COUNTER reports, the most frequently retrieved usage reports.
  • ETag or entity tag is one of several mechanisms that HTTP provides for web cache validation, which allows a client to make conditional requests. This allows caches to be more efficient, and saves bandwidth, as a web server does not need to send a full response if the content has not changed. ETags can also be used for optimistic concurrency control, as a way to help prevent simultaneous updates of a resource from overwriting each other. This is relevant here because it allows central systems to fetch only new data about metrics. Info: https://en.wikipedia.org/wiki/HTTP_ETag
  • Usage metrics service providers for repositories: IRUS-UK (http://irus.mimas.ac.uk/); OpenAIRE using Matomo (https://zenodo.org/record/1034164#.XKMvxJgzZnI, https://matomo.org/); RAMP, the Repository Analytics and Metrics Portal (http://ramp.montana.edu/)
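The ETag mechanism described above can be sketched as a small simulation of a usage-metrics endpoint answering conditional requests. The ETag and If-None-Match semantics are standard HTTP; deriving the tag from a SHA-256 digest, the function names and the sample payload are assumptions, and the comparison is simplified (a real server must also handle lists of tags and the "*" wildcard).

```python
import hashlib


def make_etag(payload):
    """Derive a quoted entity tag from the response body (one possible scheme)."""
    return '"' + hashlib.sha256(payload).hexdigest()[:16] + '"'


def conditional_get(payload, if_none_match=None):
    """Return (status, headers, body) as a server honouring If-None-Match would."""
    etag = make_etag(payload)
    if if_none_match == etag:
        # The client's copy is current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, payload


# Hypothetical usage report served by a repository.
report = b'{"record": 42, "downloads": 17}'

status, headers, body = conditional_get(report)
print(status, len(body))

# The aggregator repeats the request with the tag it received earlier.
status_again, _, body_again = conditional_get(report, headers["ETag"])
print(status_again, len(body_again))
```

An aggregator that stores the last ETag per repository can poll frequently and still transfer a full report only when the metrics have actually changed.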
Preserving Resources
The technologies to support this action are already described in category Resource Transfer.
OpenAIRE-Advance receives funding from the European Union's Horizon 2020 Research and Innovation programme under Grant Agreement No. 777541.

  Unless otherwise indicated, all materials created by OpenAIRE are licenced under CC ATTRIBUTION 4.0 INTERNATIONAL LICENSE.