We are looking for 2 developers with the profile of a Data Engineer to join our team. You will use various methods to transform raw data into useful data systems. You should have strong analytical skills and the ability to combine data from different sources. If you are detail-oriented, with excellent organizational skills and experience in this field, we’d like to hear from you.
About the job
The OpenAIRE infrastructure rests on a big data cluster where more than 500 million records and close to 15 million full texts are collected, processed, and made accessible using cutting-edge data wrangling technologies and methods. Challenges are to be tackled at all levels of the data flow:
- Collecting, validating, and storing metadata records and related files of different formats from 25,000 data sources;
- Deduplicating and interlinking metadata records to provide a stable OpenAIRE Research Graph;
- Applying full-text mining and deep learning techniques over the data to enrich or adjust information;
- Analyzing the Graph content to identify indicators of quality, FAIRness, openness, and impact;
- Ensuring content from the Graph is redistributed and made accessible by third-party services via a plethora of standard protocols and APIs.
You will become a member of the OpenAIRE technical team and will lead/implement activities which will r
- Analyze and organize raw data
- Build data systems and pipelines
- Evaluate business needs and objectives
- Prepare data for prescriptive and predictive modeling
- Combine raw information from different sources
- Explore ways to enhance data quality and reliability
- Identify opportunities for data acquisition
- Develop analytical tools and programs
- Work closely with the CTO and other team members on several projects
- Potentially represent OpenAIRE in European project technical meetings.
Qualifications, skills and experience
- Experience in the implementation of algorithms on top of the Apache Spark framework - required
- Experience in modeling and using databases on NoSQL systems (e.g. MongoDB, HBase) and on information retrieval systems (e.g. Solr, Elasticsearch) - required
- Experience in using and applying:
  - Spring & Spring Boot Framework
  - IoC paradigm (e.g. Spring)
  - MVC paradigm and frameworks to support web programming (e.g. AngularJS, Bootstrap, or similar)
  - XML databases (e.g. eXist-db)
- Previous experience and consolidated skills in:
  - Java and at least one other language (e.g. C/C++/C#, Perl, Python, Go)
  - Web services (SOAP, REST), web applications, and application containers (e.g. Tomcat, Jetty, Docker)
  - The architecture and commands of Unix-like systems
  - SQL to create and query databases
  - Management of software projects with Maven and a version control system (e.g. Git)
- Fluent written and spoken English
- Excellent teamwork skills
Terms of employment
- The position is offered for a period of one year as an associate (in consultancy terms). It is renewable upon satisfactory performance.
- This may be a full-time or part-time position.
- Although the OpenAIRE offices are located in Athens, selected candidates will join the OpenAIRE team based in Pisa, at the Consiglio Nazionale delle Ricerche, and can carry out their activities mainly remotely.
- Candidates should be available to travel to meet with the technical team in Europe up to 6 times per year, at the request of the technical coordination (paid travel).
- Depending on experience and country of residence, gross salary for a full-time position will be in the range of € 4,000 to € 6,500 per month.
An equal opportunity employer
Here at OpenAIRE we believe in equality & diversity. OpenAIRE ensures equal opportunities, treatment and access to all candidates regardless of their sex, race, colour, ethnic or social origin, genetic features, language, religion or belief, political or any other opinion, membership of a national minority, property, birth, disability, age or sexual orientation.
How to apply
Dates and deadlines
- Closing date for applications: 30 April 2021
- Online interviews: 4 - 10 May 2021
- Expected starting date: as soon as possible, no later than 15 June 2021