OpenAIRE Guide for Usage Statistics
Aggregating usage statistics requires standardizing how user events are recorded, how robot accesses are excluded, and how data are exchanged. The OpenAIRE Guidelines for Usage Statistics (v1.0), based on OAI-PMH, specify a common description of usage events that data providers can adopt in a straightforward way.
In alignment with the European Act of personal data protection, the IP address, the session-id and, in some cases, the C-class subnet must be pseudonymised before the usage data are transferred to the aggregator service.
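A minimal sketch of such a pseudonymisation step is shown here. It assumes a keyed hash (HMAC-SHA256) with a secret held by the repository; the guidelines text above does not prescribe this particular method, and the function names and the event field names (`ip`, `session_id`, `c_class`) are hypothetical.

```python
import hashlib
import hmac
import ipaddress


def pseudonymise(value: str, secret: bytes) -> str:
    """Replace a value with a keyed hash so the original cannot be read back."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def c_class_subnet(ip: str) -> str:
    """Return the /24 network address (the "C-class subnet") of an IPv4 address."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


def pseudonymise_event(event: dict, secret: bytes) -> dict:
    """Return a copy of the event with IP, C-class subnet and session-id hashed.

    The field names are assumptions made for this illustration only.
    """
    out = dict(event)
    out["ip"] = pseudonymise(event["ip"], secret)
    out["c_class"] = pseudonymise(c_class_subnet(event["ip"]), secret)
    out["session_id"] = pseudonymise(event["session_id"], secret)
    return out
```

Because the hash is keyed and deterministic, the same requester still maps to the same pseudonym across events, which keeps later steps such as session grouping possible without exposing the raw address.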
Repository log file conversion and transfer tools
Below is a list of tools that can help with the conversion and transfer of collected usage data:
- SURFshare-sure (Statistics on the Usage of Repositories) provides software for converting the Apache2 log files of repositories into OpenURL Context Object files and for transferring them via OAI-PMH to a log aggregator, such as OpenAIRE's Usage Data Aggregator Service.
- OA-Statistik Data Providers plugins for DSpace, WebDoc and OPUS repositories that convert repository usage data formats into the OpenURL Context format and expose them with OAI-PMH.
Check back regularly for updates to the list of tools.
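As a rough illustration of the first step such conversion tools have to perform, the sketch below parses one line of an Apache access log into named fields. It assumes the Apache "combined" log format; the names `COMBINED` and `parse_line` are hypothetical, and the mapping of the parsed fields to an OpenURL Context Object is not shown.

```python
import re
from datetime import datetime

# Pattern for the Apache "combined" log format (an assumption; a repository
# may be configured with a different LogFormat).
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line: str):
    """Parse one log line into a dict, or return None if it is malformed."""
    match = COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Example timestamp: 10/Oct/2012:13:55:36 +0200
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    return record
```

Returning `None` for lines that do not match lets the caller skip malformed entries instead of failing on them.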
- Removal of all malformed usage events.
- Unique publication identifier generation (currently by cross referencing DRIVER's identifiers).
- Filtering out of all robot-initiated requests (using COUNTER and custom robot blacklists).
- Consolidation of multiple simultaneous streams.
- Deduction of the type of request (full-text download or metadata view) for events of unknown type.
- Double click filtering according to the COUNTER rules.
- Unique requester detection, which uses multiple input event fields to enhance precision.
- Grouping of multiple events of each requester into sessions (inactivity rule of 30 mins).
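Three of the steps listed above (robot filtering, double-click filtering and session grouping) can be sketched as follows. This is an illustration under stated assumptions, not the aggregator's actual implementation: the event fields (`requester`, `item`, `time`, `user_agent`) are hypothetical, the robot markers are a placeholder rather than the COUNTER list, and the double-click window is a parameter whose value should be taken from the COUNTER rules. For simplicity the sketch keeps the first request of a double click; which one to retain should also be checked against COUNTER.

```python
from datetime import timedelta

# Placeholder markers for illustration only; NOT the COUNTER robot list.
ROBOT_MARKERS = ("bot", "crawler", "spider")


def is_robot(user_agent: str) -> bool:
    """Flag a request whose user agent contains a known robot marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def filter_double_clicks(events, window=timedelta(seconds=30)):
    """Drop repeated requests by the same requester for the same item
    that arrive within `window` of the previous one (window is an assumption)."""
    kept = []
    last_seen = {}
    for ev in sorted(events, key=lambda e: e["time"]):
        key = (ev["requester"], ev["item"])
        previous = last_seen.get(key)
        last_seen[key] = ev["time"]
        if previous is not None and ev["time"] - previous <= window:
            continue  # treated as a double click
        kept.append(ev)
    return kept


def group_sessions(events, inactivity=timedelta(minutes=30)):
    """Group each requester's events into sessions; a gap longer than
    `inactivity` (30 minutes, as in the list above) starts a new session."""
    sessions = {}
    for ev in sorted(events, key=lambda e: e["time"]):
        requester_sessions = sessions.setdefault(ev["requester"], [])
        if (requester_sessions
                and ev["time"] - requester_sessions[-1][-1]["time"] <= inactivity):
            requester_sessions[-1].append(ev)
        else:
            requester_sessions.append([ev])
    return sessions
```

A caller would typically chain the steps: drop events for which `is_robot` is true, pass the rest through `filter_double_clicks`, then hand the result to `group_sessions`.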
Individual publication results are shown in the OpenAIRE and DRIVER portals.
Aggregated results, e.g., usage data for the publications of a project or scientific area, are also generated.
So far we have retrieved usage data from ~10 repositories in collaboration with PIRUS2, SURF SURE and OA-Statistik. For the initiative to take off and be able to provide overall statistics, we need to hear from you.