
Year in Review: SciLake


The Project: SciLake is an EU-funded research project that provides innovative and customizable services for the European Open Science Cloud (EOSC). Its main objective is to deliver technologies that facilitate the integration of domain knowledge and open Scientific Knowledge Graphs (SKGs) to simplify the development of valuable services tailored to specific domains. The project involves 13 partners from 9 countries, including experts in scholarly communication and domain experts from four scientific fields: Neuroscience, Cancer, Transport, and Energy. SciLake aims to benefit the entire research community by providing tools for extracting knowledge from unstructured information, promoting interoperability of SKGs, and simplifying knowledge discovery based on the impact and reproducibility of research products.

2023 Developments: Throughout the year, SciLake made significant progress in coordinating the technical efforts of its members and their interaction with the pilot scientific communities. The project focused on ensuring the interoperability of domain-specific SKGs with universal ones, specifically with the OpenAIRE Graph. It also advanced the development of the SciLake services by identifying the specific needs of the pilots. Roadmaps have been created for each service component, and preliminary demos have been held with the pilots to showcase the current state of each service and discuss improvements. Overall, the project has moved from planning to concrete development, with progress in creating and enriching domain-specific SKGs.

Figure: SciLake concept at a glance

2023 Milestones: In September 2023, SciLake reached its first Milestone: "Initial plan and setup," successfully completing the following key tasks:

  • Gathering initial requirements for all SciLake services from project pilot participants,
  • Preparing initial plans for the use cases to be demonstrated by the SciLake pilots,
  • Setting up the necessary infrastructure to host the services and demos,
  • Developing an initial version of the SciLake architecture.

During the same period, the SciLake partners also delivered a report on the gathering and analysis of the initial requirements for the SciLake services from the pilot communities. The report describes the methodology used, including online questionnaires and interviews, and summarizes the feedback received. It also presents a brief analysis of the insights gained and provides an initial use case scenario for each pilot.

"In the first year of the project, we focused on identifying the key sources of domain knowledge for the pilot domains and collecting the special requirements of the respective research communities regarding knowledge discovery and reproducibility. Now we are working hard on building domain-specific knowledge graphs and adapting the SciLake services based on the feedback we received. By summer, we will be able to demonstrate the effectiveness of our services in chosen real-world scenarios."

Thanasis Vergoulis, SciLake Coordinator

2024 Objectives: The main goal for the upcoming year is to have the alpha version of the integrated system ready and deployed by June 2024. Partners will also focus on the components of the SciLake services and customize them based on the specific needs of the pilots. Roadmaps are being finalized for each component, outlining the steps towards the alpha release and ensuring that all functionalities are well documented and explained. Delivery demos will be planned for each service, taking into account the integration of different components within the pilot demos. Additionally, efforts will be dedicated to promoting these developments and increasing engagement with the research community through collaborations with other projects, especially within the EOSC Association.

Long-term Vision: SciLake's ambition is to deliver a Scientific Data Lake that enables democratic access to heterogeneous scholarly content. The project outcomes will enhance and encourage high-quality scientific work by providing smart services that support reproducibility and impact-based knowledge discovery. SciLake has set the following objectives to accomplish by the end of 2026:

  • Develop a scalable platform that extracts information from unstructured scholarly content and transforms it into SKGs.
  • Establish standards, guidelines, and tools to assist developers in creating domain-specific SKGs that are interoperable with the OpenAIRE Graph.
  • Provide a unified API that enables easy access to the contents of the Scientific Data Lake and facilitates the creation of value-added services.
  • Offer customizable added-value services to enhance domain-specific knowledge discovery and reproducibility.

Stay tuned to follow SciLake's exciting progress as it continues to develop.

For more information, follow @SciLake_project on X.