Beyond Simple Metrics: How AI and Open Data Reveal the True Complexity of Research Impact
Understanding how publicly funded research translates into real-world benefit is increasingly important as global challenges grow more complex. Policymakers need timely, evidence-based insights to guide investment, particularly in fields like rare diseases where impact unfolds over long, non-linear pathways.
Rare diseases affect fewer than 1 in 2,000 people but collectively impact 36 million Europeans across thousands of conditions. Between 2014 and 2020, the EU funding programmes FP7 and Horizon 2020 collectively invested over €2.9 billion in more than 600 rare disease projects, a significant commitment that calls for effective, reliable impact assessment.
Where traditional approaches fall short
Traditional evaluation methods primarily rely on structured data and statistical indicators. While useful, they only provide a partial view of research outcomes and struggle with persistent challenges, including:
- Long timelines: outcomes may emerge years or decades after initial funding.
- Fragmented data: evaluations rely on siloed data sources, making it difficult to link funding decisions to real-world applications across different sectors.
- Invisible pathways: they miss the intricate knowledge flows through which research actually drives societal change, including indirect contributions from seemingly "unsuccessful" projects that nonetheless provide valuable insights for future breakthroughs.
The misalignment between research impact timelines and policy evaluation cycles creates a fundamental challenge: policymakers need evidence-based insights on cycles shorter than those over which research impact typically unfolds, yet traditional metrics cannot capture the subtle, long-term pathways through which knowledge creates value.
A new AI-enhanced methodology
To address these limitations, this study presents a framework combining artificial intelligence capabilities with human domain expertise. The methodology is guided by four key principles that enable more effective research impact assessment.
- 360-degree view of data integrating diverse sources including publications, patents, clinical trials, company websites, and policy documents. This holistic perspective captures multiple stages of the research lifecycle, from initial funding through commercialisation and clinical application.
- A modular, end-to-end workflow using machine learning and natural language processing (NLP) to extract and categorise relevant entities while applying semantic similarity analysis to identify connections between different research outputs. These techniques structure information into knowledge graphs that link research topics to stakeholders, funding programmes, and translational applications.
- Expert-in-the-loop paradigm recognising that AI-generated outputs require human review and domain contextualisation to ensure validity and policy relevance. This approach balances scalable automation with interpretive accuracy.
- Openness and transparency, building on Open Science infrastructure like the OpenAIRE Graph to ensure reproducibility and enable others to adapt the methodology for different research domains.
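The semantic-similarity linking described in the workflow principle above can be sketched very simply: represent each research output as a term vector, compute cosine similarity between pairs, and keep high-scoring pairs as edges of a knowledge graph. The records, identifiers, and threshold below are invented for illustration; the study's actual pipeline uses far richer NLP models and the OpenAIRE Graph metadata.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Bag-of-words term counts for a short text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Illustrative records: a project abstract, a publication, a clinical trial.
records = {
    "project:RD-001": "gene therapy for rare metabolic disease in children",
    "pub:10.1000/xyz": "clinical outcomes of gene therapy in metabolic disease",
    "trial:NCT000": "phase II trial of enzyme replacement in arthritis",
}

# Any pair of records clearing the threshold becomes an edge
# of a small knowledge graph.
THRESHOLD = 0.4
keys = list(records)
vecs = {k: tokens(v) for k, v in records.items()}
edges = [
    (a, b, round(cosine(vecs[a], vecs[b]), 2))
    for i, a in enumerate(keys)
    for b in keys[i + 1:]
    if cosine(vecs[a], vecs[b]) >= THRESHOLD
]
print(edges)
```

Here only the project and the publication are linked; the unrelated trial falls below the threshold, which is the basic mechanism that keeps a knowledge graph from becoming a hairball.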
Case Study: EU-funded rare disease projects
To demonstrate the methodology's capabilities, the study conducted a comprehensive analysis of EU rare disease investments, applying multiple analytical approaches that reveal different dimensions of research impact.
A three-tier project identification process, combining NLP, filtering, and expert review, produced a curated portfolio of 400 projects. Using this dataset, impact was explored through multiple lenses:
- Funding priorities and thematic evolution: Topic modelling showed shifts aligned with emerging global health concerns; for example, increased focus on arbovirus outbreaks under Horizon 2020, coinciding with heightened concerns over Zika and dengue viruses.
- Collaboration patterns: Network analysis revealed changing partnerships, from strengthened Sub-Saharan collaborations under FP7 to rising Latin American engagement in Horizon 2020.
- Research uptake in clinical practice: Over 1,800 clinical trials citing related publications were traced, including 843 involving original project participants, as well as 100 clinical guidelines referencing outputs from the portfolio.
- Industry continuity: A new R&D Uptake Score highlighted how companies continue or pivot their research focus relative to their past EU-funded work.
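The three-tier identification process mentioned above (NLP, filtering, expert review) can be pictured as a funnel in which automation narrows the candidate pool and humans make the final call. The records, keywords, and rules below are invented for illustration and are not the study's actual selection criteria.

```python
import re

# Hypothetical project records; titles invented for illustration.
projects = [
    {"id": "P1", "title": "Gene therapy for a rare lysosomal disorder"},
    {"id": "P2", "title": "Urban air quality sensing network"},
    {"id": "P3", "title": "Registry for ultra-rare neuromuscular diseases"},
]

# Tier 1: broad NLP/keyword match to catch candidate projects.
KEYWORDS = re.compile(r"\brare\b|\borphan\b|ultra-rare", re.IGNORECASE)
candidates = [p for p in projects if KEYWORDS.search(p["title"])]

# Tier 2: rule-based filtering (here: require a disease-related term).
DISEASE = re.compile(r"disorder|disease", re.IGNORECASE)
filtered = [p for p in candidates if DISEASE.search(p["title"])]

# Tier 3: everything surviving the filters is queued for expert review,
# where domain specialists confirm or reject the match.
for p in filtered:
    p["status"] = "needs expert review"

print([p["id"] for p in filtered])
```

The key design point is that each tier is cheap relative to the next: keyword matching scales to thousands of projects, while expensive expert time is spent only on the short list.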
These findings demonstrate how a systems-based approach can surface impact that remains invisible to traditional evaluation techniques.

Indirect pathways to impact: Citation chain reconstruction showed how "unsuccessful" research contributes to later advances, including a 2015 Ebola trial on Favipiravir that contributed to subsequent ARDS treatment developments.
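Citation chain reconstruction of this kind amounts to a reachability search over a directed citation graph: starting from a project's publications, follow citation links forward to find all later work that traces back to them. The graph and identifiers below are invented for illustration (they merely echo the Favipiravir example), not data from the study.

```python
from collections import deque

# Hypothetical citation graph: an edge A -> B means B cites A,
# so impact flows forward along the arrows.
cites = {
    "pub:ebola-favipiravir-2015": ["pub:antiviral-review-2017"],
    "pub:antiviral-review-2017": ["trial:ards-2020"],
    "pub:unrelated-2016": ["pub:other-2018"],
}

def downstream(start, graph):
    """Collect every node reachable from `start` via breadth-first
    search, i.e. all later work that traces back to it through
    citations."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

reached = downstream("pub:ebola-favipiravir-2015", cites)
print(sorted(reached))
```

Even this toy traversal shows why a trial judged "unsuccessful" on its own terms can still register impact: its downstream set is non-empty, and that set is exactly what output-counting metrics never see.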
Why open infrastructure matters
The rare disease analysis showcases several key advantages of this new approach over traditional evaluation methods that have direct implications for policy and funding decisions.
- It captures indirect and long-term value, revealing contributions traditional metrics overlook.
- It provides real-time intelligence on trends, collaborations, and emerging health challenges.
- It shows concrete pathways from funding to clinical practice.
- It balances scale and accuracy, combining automated analysis with domain expertise.
- It embraces the complexity of scientific ecosystems rather than imposing linear models.
This work was possible thanks to the OpenAIRE Graph, an open scholarly knowledge graph which provides clean, interlinked metadata connecting publications, projects, datasets, software, and more. Such infrastructure enables:
- tracing knowledge flows across domains,
- linking funding to real-world uptake, and
- ensuring analyses are reproducible and auditable.
Open, high-quality metadata is essential for modern research evaluation, not only for access, but for understanding how knowledge moves through society.
The path forward
This work demonstrates that research impact evaluation can move beyond counting outputs to understanding systems. Because the methodology is domain-agnostic, it can be adapted to areas such as energy, climate, and digital technologies.
To keep advancing, the research community needs continued investment in open infrastructure, shared methodological standards, and evaluation approaches that reflect the real dynamics of knowledge creation. By combining advanced analytics with human expertise, we can give policymakers the insights they need to steer research towards maximum societal benefit.
Read the full study
Grypari, I., Di Virgilio, S., Papageorgiou, H., Fergadis, A. and Pappas, D. (2025). Advancing Research Impact Evaluation in the Digital Era: Insights from EU-Funded Rare Disease Projects. fteval Journal for Research and Technology Policy Evaluation, (57), e5. https://doi.org/10.22163/fteval.2025.697