You are an engineer at heart with a pragmatic mentality and the sense of responsibility of someone maintaining production systems. You easily switch between scripting and structured programming in typed languages. You understand failure modes in distributed systems. You are passionate about provisioning and automation, and you understand that a robust system takes true skill and care.

As a Data Engineer, you are responsible for setup, deployment and productionising of data-intensive systems. You are experienced in engineering systems from the ground up: OS-level, distributed databases, big data clusters and distributed indexes are familiar to you. Added to that, you understand how to develop data pipelines including transformation and pre-processing.

What does it mean to be a Data Engineer, and how does it differ from being a Software Engineer or a Data Scientist? Our Data Engineering colleagues describe their daily life:

Fokko - Data Engineer

Tünde - Data Engineer

Ron - Data Engineer

DRIVEN is a series of video portraits of data scientists and data engineers, who talk openly about their work and personal life and finding a proper balance between the two. The series consists of interviews with one guest per episode both inside and outside a studio setting.

Being a real techie, who tries to push technical boundaries, Niels invited us to his automated house. Here, he talked about his passion for technology, from self-driving donkey cars to RGB lighting. But did Niels take his love for home automation one step too far?

DRIVEN - Niels Zeilemaker

The Data Engineer role is a senior position with a central role in our clients' teams, and therefore we require at least 2 years of relevant professional experience. Having said that, we like to be amazed: if you have done something outstanding during your studies, like contributing to open source projects or starting your own company, we encourage you to apply no matter what your level of experience is.

Finally, since most of our customers operate in the Netherlands, a working knowledge of Dutch is a requirement.

Preferred general knowledge and experience includes:

  • Hands-on experience managing distributed systems and clusters
  • Programming in scripting languages, e.g. Python, Groovy, Ruby
  • Programming in a statically typed language, e.g. Java, Scala, C++
  • Deployment and provisioning automation tools
  • Linux systems administration
  • Security, authentication and authorisation (LDAP / Kerberos / PAM)
  • Data management
  • Complex Extract Transform Load (ETL) pipelines
  • Cloud platforms (AWS, Azure, Google Cloud, etc.)

Preferred skills / tool experience includes:

  • Hadoop ecosystem
  • Elasticsearch
  • Ansible / Terraform
  • Docker
  • Java / Scala
  • Python
  • Shell scripting

Renald Buter
Chief Operations, GoDataDriven