Research funders across Europe are increasingly mandating Open Science practices for funded research outputs, to support open and free access to valuable elements of the scholarly communication life-cycle. From OA to publications and the recent Plan S developments, to the promotion and uptake of coordinated RDM practices, to more advanced research assessment exercises that seek to understand innovation and societal impact, there is a need for monitoring of research output.
National and EU e-Infrastructures respond to these needs by embedding and developing monitoring tools to provide evidence-based data on policy uptake, costs, and research impact, while at the same time promoting interoperability of information outputs, shareable across networks.
This two-day workshop, co-organised by OpenAIRE and Data4Impact with the support of Science Europe, explored mechanisms for research policy monitoring and indicators, and how to link these to infrastructure and services. The first day focused on open science indicators as these emerge from national and EU initiatives, while the second day explored more advanced aspects of indicators for innovation and societal impact.
Transparency, talking to one another, and quality: these were just some of the themes that emerged at the OpenAIRE / Data4Impact workshop, attended by over 80 people in Ghent. OpenAIRE has been working in the area of monitoring and tracking research output, especially that of funders, for nearly ten years now.
The workshop explored different ways of monitoring in this new landscape of open science.
Quality must come first. The forward-looking keynote speech, given by Marc Vanholsbeeck, explored the meaning of impact. What do we actually refer to by using the term? There are, in fact, many different kinds of impact in research, with over 3,000 pathways according to the available research. It is therefore hard to pin down what we want to measure. The overarching theme was: quality comes first, impact comes next. It goes without saying that policies, such as those coming from the EC or national funders, can determine impact, but how do we measure it?
What is certainly clear is that OS can both communicate and popularize research to the benefit of society, as well as drive forward new ways of measuring impact. The remaining question is: who are the players and providers who can deliver this as an offer to society?
We need to monitor repositories. And get trained up. A presentation of the OS monitor by David Osimo from Lisbon Council furthered the discussion: the initiative's research has looked into incentives for researchers to share, and very few researchers are willing to share their research data beyond their research groups. One interesting point made was the need to standardize usage data coming from repositories. A clear way forward, certainly in the European landscape, is to train data stewards, who are still few in number.
The RDA-FAIR Data Maturity Model WG - outlined in a presentation by Brecht Wyns and Christophe Bahim - has set out a rich plan looking at how we standardize FAIR; at present there are no benchmarks. This WG will go a long way toward setting these criteria and interpreting the implementation of FAIR. The landscape study will also be crucial. We look forward to that FAIR checklist.
Coalition S, and what monitoring it needs, was then presented by Stephan Kuster. Clearly the Coalition S members have had their hands full taking on board the many responses. There will be some changes altering the existing key principles. These include: no paywalls, more flexibility on CC-BY licensing, and some leeway on allowing hybrid journals, so long as the journal can demonstrate it is moving towards a full open access model. And… the green road is also a very important route to open access, reflecting that Coalition S is part of a global movement. Good to see! There are few sanctions; the ultimate goal here is to change the publishing system, not to punish researchers.
Indicators should be transparent. Dietmar Lampert’s presentation stressed that - above all - indicators should be transparent. The results presented from his study (ZSI Research Policy and Development) gave some interesting insights into which indicators should be developed.
Researchers are all too aware of standards within research. It is therefore a natural progression that we can build one for OS. Other factors in the research workflow can give us monitoring insights, and we need to explore them. Research evaluation should also harness the as-yet-unseized opportunities of the open science era, and potentially new technological and methodological approaches.
The potential of the semantic web and using linked data to measure OS was the key message from Diego-Valerio Chialva (European Research Council). Diego set out an interesting use case for monitoring outputs from repositories: these outputs also need to be collected, but often are not, leading to ineffective evaluation. Semantic data can assist here, adding huge advantages in identifying resources. He also made the important point that OS needs to be monitored in and of itself.
The OPERA project (Karen Hytteballe Ibanez, Technical University of Denmark), built on open-source software for transparency, explored Open Research Analytics using VIVO to describe open science metrics, alongside national and international collaborations and funding, with VOSviewer and Neo4j used for visualisations.
It’s all about good data. Paolo Manghi reiterated the need for reliable data to contribute to the OpenAIRE graph. Much of this graph looks beyond the article itself, linking funder data, projects, research data, and software, which opens up a huge potential for added-value services. And it needs to be kept curated and clean. Ultimately, we are building this graph together as a decentralised open science community: repositories, aggregators, individuals, institutions, and service providers can all contribute, and this would be for the common good. However, the data needs to be reliable.
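As a toy illustration only (not OpenAIRE's actual data model or API), the kind of linking such a graph enables can be sketched with a few typed nodes and labelled edges in plain Python; all identifiers and relation names below are invented:

```python
from collections import defaultdict

# Toy research graph: (source, relation, target) triples.
# All identifiers and relation names are invented for illustration;
# the real OpenAIRE graph uses richer metadata and persistent identifiers.
edges = [
    ("article:A1", "isOutcomeOf", "project:P1"),
    ("dataset:D1", "isOutcomeOf", "project:P1"),
    ("software:S1", "isOutcomeOf", "project:P1"),
    ("project:P1", "isFundedBy", "funder:F1"),
]

# Index edges by relation for simple traversal.
by_relation = defaultdict(list)
for src, rel, dst in edges:
    by_relation[rel].append((src, dst))

def outputs_of_funder(funder):
    """All research outputs reachable via project -> funder links."""
    projects = {src for src, dst in by_relation["isFundedBy"] if dst == funder}
    return sorted(src for src, dst in by_relation["isOutcomeOf"] if dst in projects)

print(outputs_of_funder("funder:F1"))
# -> ['article:A1', 'dataset:D1', 'software:S1']
```

Even this tiny sketch shows the added value the paragraph above describes: once articles, datasets, and software are linked to projects and funders, a funder-level monitoring question becomes a simple graph traversal.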
An inspiring tale from Portugal by Vasco Vaz, Foundation for Science and Technology. RCAAP started in 2008, providing access to all HEI material nation-wide. Involving pretty much everyone from the scholarly communication chain, it offers a number of services. Using OpenAIRE’s services, project information is automatically added to output. It works as a successful case study whereby deposit is a natural part of the life-cycle within the institutional setting, and since standards are implemented early on, the researcher only has to deposit once and reap the benefits.
Panelists included Bregt Saenen (EUA), Helena Cousijn (FREYA), Paolo Manghi (OpenAIRE), James Hardcastle (Clarivate Analytics), Diego-Valerio Chialva (ERC).
Some panel members argued for a consortium approach: a single institution cannot pay alone.
Against the backdrop of the audience mentimeter, it was clear that openness has to be monitored somehow; however, the lack of trusted sources remains the main barrier.
The second workshop day focused on the use of big data technologies for advanced research assessment. The workshop was led by Data4Impact, a Horizon 2020 project funded by the European Commission. Data4Impact pioneers big data techniques and develops pilot approaches that track the legacy and impact of research activities after the end of public funding. In this workshop, the project consortium presented for the first time a series of indicators developed on the performance and societal impact of 40+ research programmes in the health domain.
Data4Impact coordinator Vilius Stanciauskas (PPMI) introduced the project. The key message was that building on data harvested from PubMed, OpenAIRE, Lens.org, PATSTAT, clinical guidelines repositories, company websites, social media and media platforms, EC monitoring data and other databases, Data4Impact enables policymakers, funders, experts, researchers and the public in general to ‘Ask less & know more’ in the context of advanced research assessment.
Following that, Alexander Feidenheimer (Fraunhofer ISI) provided an overview of the Data4Impact Analytical Model Of Societal Impact Assessment (AMOSIA) which has been developed over the course of the project. The analytical model is structured around four distinct phases of the research lifecycle, including input, throughput, output, and impact. Relying on novel big data techniques such as web scraping, crawling and mining as well as text analysis methods such as Natural Language Processing and deep learning, Data4Impact has gathered data for each analytical phase.
The primary purpose of the following presentations by the Data4Impact consortium was to present the indicators which have been developed over the course of the project for each of these phases. Ioanna Grypari (ATHENA RC) discussed the approach used for tracking research outputs.
Then, the consortium, including Ioanna Grypari (ATHENA RC), Iason Demiros (Qualia), Vilius Stanciauskas (PPMI) and Gustaf Nelhans (University of Borås), put emphasis on three categories of impact: academic, economic and societal.
Please refer to the Data4Impact booklet, which demonstrates linkages between data collected across the three dimensions of impact.
The presentations of the second workshop day may be accessed here.
Data4Impact invited the participants to two parallel sessions, focusing on the Data4Impact methodology and indicators in the areas of:
| Level | Indicator | Description | Relevance & Credibility | Comments |
|---|---|---|---|---|
| 1. Input level indicators | 1.1. Funding volume | Monetary expenditure on R&I activities | N/A | N/A |
| 2. Throughput and output level indicators | 2.1. Publications | Number of publications produced by programme/funder cited in patents | N/A | N/A |
| | | Number of highly cited publications in patents | N/A | N/A |
| | 2.2. Patents | Number of patents produced | N/A | N/A |
| | 2.3. Innovation outputs produced by EU FP projects | Number of health-specific innovation outputs produced in projects funded by the EU Framework Programmes | N/A | N/A |
| | 2.4. New companies | Number of new companies/start-ups created in the EU Framework Programmes | N/A | N/A |
| | 2.5. Innovation outputs produced by companies participating in R&I activities | Number of product innovations; or process, service or other innovations announced by companies | N/A | N/A |
| | 2.6. Innovation activities carried out by companies participating in R&I activities | Number of licensing agreements; or cases of acquisitions; or cases of private funding attracted; or cases of public funding attracted by companies; or cases of newly CE-marked medical devices or technologies | N/A | N/A |
| 3. Academic impact | 3.1. Funding priorities | Topic size in PubMed (absolute and normalised) | High | N/A |
| | | Distribution of topics per funder (normalised) | High | N/A |
| | | Distribution of funders per topic (absolute) | High | N/A |
| | 3.2. Timeliness of research performed | Rate of topic growth between 2012-2018 compared to 2005-2011 | Low | The length of the time intervals used to estimate trends, growth and other factors should be derived from a well-established and/or relevant criterion |
| | | Share of funding allocated to the top 10% fastest-growing topics | High | N/A |
| | 3.3. Funding exclusivity | Share of funding allocated to the top 10% smallest research topics by size (i.e. investment in small niche topics) | Medium relevance, high credibility | N/A |
| | | Number of funders per topic whose output share exceeds 3% globally | Medium | To be clarified |
| | | Share of funding allocated to research topics with fewer than 5 funders whose share exceeds the 3% mark (i.e. investment in topics where few other funders invest) | Medium | To be clarified |
| | 3.4. Technological value/significance of patents | Analysis of how commonly patent forward citations (i.e. citations a patent receives from subsequent patent filings) are used | High | N/A |
| 4. Economic impact | 4.1. Economic and innovation performance of companies | Estimated share of enterprises with evidence of innovation activities | High | Might fit better if presented as an input level indicator |
| | | Estimated share of highly innovative enterprises | High | N/A |
| | | Estimated share of enterprises with evidence of licensing activities (incl. patent/trademark license agreements) | High | N/A |
| | | Estimated share of enterprises involved in activities related to acquisitions | High | N/A |
| | | Estimated share of enterprises with evidence of private investment/capital attracted | High | N/A |
| | 4.2. Continuity of innovation activities | Estimated overlap between project activities in FP7 & identified company innovations | Medium | To be clarified |
| | | Number of newly CE-marked devices and medical technologies that could be directly linked to R&I activities in the EU Framework Programmes | Medium | N/A |
| 5. Societal/health impact | 5.1. Impact on public health | Citations of publications in clinical guidelines | Medium | Might be relevant to consider policy impact (e.g. consider health technology assessment) |
| | 5.2. Societal awareness/relevance of research | Rank of research topic based on number of news articles, blogs, posts, tweets, etc. discussing a given topic | Low | N/A |
| | 5.3. Congruence of research funding with societal priorities | Rank similarity of most discussed research topics versus actual spending in the topics | Low | N/A |
| | 5.4. Newly launched medicines and medicinal products | Number of human medicinal products or orphan medicines that could be directly linked to R&I activities in the EU Framework Programmes | High | Most positively assessed indicator |
| | | Strength of link based on the number of mentions of product names and their active substances in EC monitoring data | High | N/A |
Following the discussions on the indicators developed by Data4Impact, the discussion leads from the consortium invited the discussion groups to consider whether any other indicators could be relevant in the context of research assessment. The Academic Impact & Societal Relevance of Research discussion group suggested exploring the possibility of comparing the topic of the call with the topics of the publications which came out of the project. This could help determine whether the research published matches the research promised in terms of topics. It was also suggested to consider an indicator which would normalise the numbers produced to enable cross-disciplinary comparisons.
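One common way to normalise raw counts for cross-disciplinary comparison is to divide each item's score by the average for its field, as in field-normalised citation indicators. A minimal sketch of that idea, with invented numbers and not Data4Impact's actual method:

```python
# Field-normalised scores: raw citation count divided by the field
# average, so 1.0 means "at the field's average level". All numbers
# and field names here are invented for illustration.
papers = [
    {"id": "p1", "field": "oncology", "citations": 40},
    {"id": "p2", "field": "oncology", "citations": 10},
    {"id": "p3", "field": "nursing",  "citations": 4},
    {"id": "p4", "field": "nursing",  "citations": 2},
]

# Mean citations per field.
field_totals = {}
for p in papers:
    total = field_totals.setdefault(p["field"], [0, 0])
    total[0] += p["citations"]
    total[1] += 1
field_means = {f: s / n for f, (s, n) in field_totals.items()}

# Normalised score: comparable across fields despite different baselines.
normalised = {p["id"]: p["citations"] / field_means[p["field"]] for p in papers}
print(normalised)
# -> {'p1': 1.6, 'p2': 0.4, 'p3': 1.333..., 'p4': 0.666...}
```

The point of the design is that a nursing paper cited 4 times and an oncology paper cited 40 times can both score above 1.0 once each is measured against its own field's baseline, which is exactly what cross-disciplinary comparison requires.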
If you are interested in learning more about the Data4Impact methodology and results, we would like to invite you to attend our upcoming Workshop on Data4Impact Methodology and Indicators, which will take place on 24 June 2019 at the premises of the Research Executive Agency (Brussels, Belgium). In this hands-on, interactive workshop we aim to gather feedback on the chosen methodology, coverage and latency/timeliness of the developed indicators, to maximise their relevance for all the stakeholders involved. You may find the programme and registration page for this workshop on our website. If you have any questions about the event, please contact Sonata Brokeviciute at .