Published
Apr 21, 2026

SciLake’s Legacy: from fragmented research to Scientific Knowledge Graphs

Scientific knowledge is growing at an unprecedented pace, but the knowledge we need to advance research is often hard to access in practice. Publications, datasets, software, protocols, workflows, and contextual information are distributed across systems and formats, so even when outputs are openly available, they can remain functionally fragmented: difficult to connect, query, reuse, and translate into evidence for new research or policy decisions.

After three years of collaboration between knowledge‑management experts and research communities, SciLake presented its results at the final dissemination event on 10 March 2026. The event demonstrated how SciLake’s tools and methods can improve discovery and reuse for researchers, strengthen interoperability and service delivery for research infrastructures, and support evidence‑based analysis for science policy.

Building Scientific Knowledge Graphs in practice

SciLake’s central contribution is a practical way to build domain-specific Scientific Knowledge Graphs (SKGs). In simple terms, an SKG is a connected map of research information that links publications, datasets, software, projects, organisations, and other research outputs, so they can be explored and analysed together, rather than in isolation.

A key part of this approach is the use of the OpenAIRE Graph as a shared “backbone” of open research information. It provides a broad, cross‑disciplinary foundation that can then be refined and enriched for individual research communities, focusing on the concepts that matter in each field.

In practice, SciLake demonstrated how to:

  • collect and clean research outputs from different sources (for example repositories, publishers, and catalogues),
  • turn this information into a graph, i.e. a network where items (like papers or datasets) are linked to each other and to the people, places, methods, and topics they mention,
  • use shared standards (in particular the RDA SKG Interoperability Framework, SKG‑IF) so that graphs built by different communities can be reused, linked to other resources, and maintained over time,
  • make the graph available to others through standard interfaces (APIs) and graph technologies, so it can be reused outside a single platform,
  • and support value-added services built on top of the graph, for example improved discovery, monitoring, or reproducibility-focused tools.
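To make the first two steps above more concrete, here is a minimal sketch in Python of cleaning a few records and turning them into a small graph. It is an illustration only, not SciLake's pipeline: the records, the field names (`uses`, `topics`), the identifiers, and the helper functions are all hypothetical, and the node and edge layout does not follow SKG-IF.

```python
# Hypothetical, hand-written records standing in for harvested metadata.
records = [
    {"id": "pub:1", "type": "publication", "title": "  Offshore wind siting ",
     "uses": ["data:1"], "topics": ["energy planning"]},
    {"id": "data:1", "type": "dataset", "title": "Wind speed grid",
     "uses": [], "topics": ["energy planning"]},
    # Duplicate of the first record, as often happens when merging sources.
    {"id": "pub:1", "type": "publication", "title": "Offshore wind siting",
     "uses": ["data:1"], "topics": ["energy planning"]},
]


def clean(records):
    """Trim whitespace in titles and keep the first record seen for each id."""
    seen = {}
    for record in records:
        if record["id"] not in seen:
            seen[record["id"]] = {**record, "title": record["title"].strip()}
    return list(seen.values())


def build_graph(records):
    """Return (nodes, edges): nodes keyed by id, edges as (source, relation, target)."""
    nodes = {}
    edges = set()
    for record in records:
        nodes[record["id"]] = {"type": record["type"], "title": record["title"]}
        for target in record["uses"]:
            edges.add((record["id"], "uses", target))
        for topic in record["topics"]:
            topic_id = "topic:" + topic.replace(" ", "-")
            # Topics become nodes too, so items sharing a topic are connected.
            nodes.setdefault(topic_id, {"type": "topic", "title": topic})
            edges.add((record["id"], "has_topic", topic_id))
    return nodes, edges


def neighbours(edges, node_id):
    """All nodes directly linked to node_id, in either direction."""
    linked = {t for s, _, t in edges if s == node_id}
    linked |= {s for s, _, t in edges if t == node_id}
    return sorted(linked)


nodes, edges = build_graph(clean(records))
print(neighbours(edges, "topic:energy-planning"))
```

Treating topics as nodes rather than as plain attributes is the design choice that makes the result a graph: a publication and a dataset that never cite each other still become reachable from one another through the concept they share.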

SciLake’s results in real settings

Across five pilots, SciLake showed how domain-specific SKGs can be tailored to concrete community needs. Key pilot results include:

  • Energy planning: extracting geographic “objects of study” from publications, linking them to real-world places (via OpenStreetMap), and enabling spatial exploration (e.g., finding evidence by location).
  • Cancer: building a cancer-focused knowledge graph that integrates relevant ontologies and gene/drug resources, supporting pathway exploration, interaction networks, and ranking of literature for specific questions.
  • Maritime transport: supporting stakeholder-driven monitoring and discovery with a strong open-science focus, including openly released ranking signals (e.g., influence/PageRank and recency-biased popularity) to help prioritise what to read.
  • CCAM (automated mobility): creating targeted subgraphs for CCAM concepts, reducing noise compared with generic search, and using human-in-the-loop validation to improve entity extraction and linking.
  • Neuroscience: enriching the graph with neuroscience entities and impact indicators, enabling query-based meta-analysis and exploring how dataset citation relates to citation impact.
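The maritime pilot mentions PageRank as an influence signal for prioritising literature. As a rough illustration of the idea, the sketch below runs a textbook PageRank power iteration over a toy citation graph. The graph, the function name, and the parameter values are assumptions made for this example; it is not the implementation behind the released SciLake indicators and it leaves out the recency-biased variant.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Textbook PageRank on a dict mapping each paper to the papers it cites."""
    nodes = set(citations) | {t for targets in citations.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Papers that cite nothing spread their score evenly over all papers.
        dangling = sum(rank[node] for node in nodes if not citations.get(node))
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in citations.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy citation graph (hypothetical): A and B cite C, D cites A.
citations = {"A": ["C"], "B": ["C"], "C": [], "D": ["A"]}
scores = pagerank(citations)
for paper in sorted(scores, key=scores.get, reverse=True):
    print(paper, round(scores[paper], 3))
```

In this toy graph, C comes out on top because it is cited by two papers, one of which (A) is itself cited. That is the sense in which PageRank measures influence rather than a raw citation count.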

Full pilot materials and demonstrations are available here: https://scilake.eu/demonstrations-of-scilake-tools-through-pilot-case-studies

Building trust into knowledge graphs in the age of LLMs

An important takeaway from the panel discussion during the final event was that scaling knowledge graphs is not only a technical challenge. It also depends on trust.

When links between research outputs are created or enriched automatically, some errors are inevitable. Trust therefore relies on three practical elements: being able to trace claims back to evidence (provenance), enabling communities to report and correct mistakes, and making verification part of everyday use.

This focus on trust becomes even more important as AI tools become part of everyday discovery and analysis. Large language models make it easier to extract information and summarise complex material, but they can also produce plausible answers that are difficult to verify. SKGs help by linking research information and allowing us to trace the evidence back to the original source.

From project results to community uptake

SciLake leaves behind more than a set of tools. It provides a practical framework for turning fragmented research outputs into connected, reusable knowledge, building on open infrastructures such as the OpenAIRE Graph.

For research communities, this means better ways to explore and combine information across sources. For infrastructures, it offers a pathway to more interoperable and sustainable services. And for policy and research assessment, it enables more transparent and evidence-based analysis.

As Scientific Knowledge Graphs continue to evolve, SciLake’s approach offers a foundation that others can build on, adapt, and extend across disciplines.
