Data Engineering and Data Science Training

Develop your data engineering and data science skills and become truly independent from consultancy companies.

Learn to develop robust and scalable Data Science solutions that can be taken into production easily. Our skilled trainers have world-class knowledge and years of experience in the field of Big Data and Data Science.

Organizations need to adapt to change more effectively. To do so, they require Data Scientists who can not only develop statistical models that predict future events, but also take these models into production.

GoDataDriven’s training curriculum enables professionals to take their Data Science skills to the next level. Whether you are new to the Data Science scene and want to learn how to develop models with R, Python, or Spark, or you are an experienced Data Scientist interested in broadening and deepening your skill set, GoDataDriven offers a complete package of high-quality training.

What are the training sessions like?

Part of our curriculum is open for everyone to download and explore. You can find the material here. Opening the slides gives you a feel for the approach used in the classroom.

Even though only samples of some trainings are present, you can expect all the trainings to have a similar structure and depth.


Brochure Data Engineering and Data Science Training

The trainer was very knowledgeable. I specifically liked the way of presenting and the good structure of the training.

Joris van Quaethoven

Complete Curriculum of Data Engineering and Data Science Training

Applied Machine Learning

In this training, we will show you the basics of machine learning and how to apply machine learning techniques to real-world use cases. A working knowledge of Python and basic statistical knowledge are required.

This training is available in-company only. Contact us for more information.

AI For Business Growth

This one-day training is the ideal way for managers and product owners to learn about the impact and potential of artificial intelligence for their own organization. You will learn how successful organizations leverage data to intelligently adapt to real-time behavior. You will be provided with a framework to point out and validate use cases. You will be able to design the organization needed to become a data-driven organization and you will understand what competencies are required to successfully design, develop, and productionize artificial intelligence solutions.

In 2018, this training is scheduled for:

  • 14 May
  • 28 November

One day, € 695.

Apache Airflow: a tool to orchestrate big data

You will develop the skills to use and manage Apache Airflow in this two-day training. Apache Airflow allows you to maximize the value of any type of big data in a company. As a powerful big data orchestration tool, Apache Airflow refines data from an unrefined state into a reliable, more useful state.

In 2018, this training is scheduled for:

  • 10 - 11 December

Two days, € 1195. Find out more about this training, including full programme, dates and registration.

Data Science Accelerator Program

The Data Science Accelerator Program prepares data scientists to transition organizations into data-driven enterprises. Developed by GoDataDriven to create a benchmark in the field, the program combines in-depth lectures with hands-on hackathons and propels data scientists to a higher standard of excellence. This program is suited to experienced data scientists who want to take their skills to the next level.

Starting in 2019, only the in-company version of this program is available. You can choose several topics or join the 24-day Data Science Accelerator Program, which covers all essential topics as well as specializations. Training days for the in-company Data Science Accelerator Program can be scheduled at two-week intervals, meaning that you will have one training day every two weeks.

Data Science with Python

This training will teach you about the relevant packages and tools to do data science in Python. We will cover the tools and packages that help you solve your day-to-day data science needs. Specifically, you will learn how to work with numpy, pandas, matplotlib, and scikit-learn, and we will teach you how to use the Jupyter environment and how to leverage its notebooks. In addition, we will show you how to use the command line to speed up some everyday tasks relevant to data science.

In 2018, this training is scheduled for:

  • 19 - 21 December

In 2019, this training is scheduled for:

  • 28 - 30 January

Three days, € 1795. Find out more about this training, including full programme, dates and registration.

Data Science with R

This training empowers you as a data scientist to use R in RStudio to do data science, analytics and machine learning. You will learn how to apply the tidyverse stack from RStudio, how to create interactive ML dashboards with Shiny, and how to integrate R with databases, Spark and the H2O machine learning tooling.

In 2018, this training is scheduled for:

  • 24 - 25 September

Two days, € 1195. Find out more about this training, including full programme, dates and registration.

Data Science with Spark

This training empowers you as a data practitioner to use Spark for data manipulation, machine learning and streaming algorithms for big data. You will learn how to optimize your Spark queries, fully leverage the machine learning API, and replace your batch pipeline with data streams.

In 2018, this training is scheduled for:

  • 6 - 8 March
  • 13 - 15 June
  • 26 - 28 September
  • 11 - 13 December

In 2019, this training is scheduled for:

  • 23 - 25 January

Three days, € 1795. Find out more about this training, including full programme, dates and registration.

Deep Learning

Explore the essentials of deep learning through real-world applications. The deep learning (DL) approach to artificial intelligence (AI) has already revolutionized many industries. Learn how these autonomous, self-teaching systems, like Google’s voice and image recognition algorithms, are developed and applied in this dynamic, three-day training. Through hands-on experience, you’ll learn how to leverage cloud GPU resources and build deep-learning models for images, text and time series.

In 2018, this training is scheduled for:

  • 23 - 25 May
  • 21 - 23 November

Three days, € 1995. Find out more about this training, including full programme, dates and registration.

Neo4j Masterclass

Despite the large number of players in the NoSQL space, graph technology has been by far the fastest-growing category of database over the last three years, according to industry monitor DB-Engines. With Neo4j currently the indisputable leader in the graph technology space, now is the time to get your hands dirty with this great database. During this two-day interactive training, we will take you on a tour through Neo4j to get you ready to use Neo4j in your projects.

In 2018, this training is scheduled for:

  • 9 - 10 April
  • 19 - 20 September

Two days, € 1195. Find out more about this training, including full programme, dates and registration.

Signal Processing

With the rise of Big Data, not only the amount of data but also its diversity is ever increasing. Beyond 'traditional' data consisting of samples of a fixed number of interpretable variables, there is data such as free text, time series (financial transactions, power usage), audio (speech), images and video. These so-called signals typically need to be processed so that meaningful variables can be extracted and structured prior to further use in data analyses and machine learning applications. This training is focused on making powerful data representations from signals for machine learning applications. In two consecutive parts, we will focus on feature engineering and feature learning, for both of which Python code is provided.

In 2018, this training is scheduled for:

  • 13 April
  • 12 October

One day, € 695. Find out more about this training, including full programme, dates and registration.

Spark Programming

This three-day training is for data engineers, analysts, architects, software engineers, IT operations, and technical managers interested in a thorough, hands-on overview of the Apache Spark platform. Each topic includes slide and lecture content along with the hands-on use of Spark through the elegant Databricks web-based notebook environment. Inspired by tools like IPython/Jupyter and Matlab, Databricks notebooks allow attendees to code jobs, write data analysis queries and generate visualizations using their own Spark cluster, accessed through a web browser.

In 2018, this training is scheduled for:

  • 23 - 25 April
  • 12 - 14 November

Three days, € 2100. Find out more about this training, including full programme, dates and registration.

Reasons to Select GoDataDriven as your Preferred Training Partner

Much effort has been put into creating content that is based on real-life experience with taking data-driven applications into production. You can be certain that the content is fit to be used in any organization. We adapt the pace of the program to every participant and allow as much time for questions and interaction as required.

All training includes materials, unless specified otherwise. Lunch is provided, and refreshments and snacks are available throughout the day. All prices exclude VAT and are subject to change.

  • In 2013, GoDataDriven was named Cloudera EMEA training partner of the year, and in 2016 it was named Trainer of the Year APAC;
  • Proficient and highly experienced trainers, who make sure that the contents are explained vividly and can be applied to your business directly;
  • GoDataDriven has been actively deploying Hadoop since 2009, and has implemented the foundation for data-driven transitions of the largest e-commerce and travel businesses in the Netherlands, including KLM, Schiphol, eBay, and Wehkamp;
  • A state-of-the-art curriculum based on practical and local experience, with no hypothetical examples or experiences from far away;
  • GoDataDriven has been successfully facilitating training throughout Europe and the Middle East and has delivered custom data science solutions to dozens of large enterprises;
  • We are strong believers in open source technology. Because we actively contribute to many open source projects, you can rely on us to know what we are talking about;
  • Our trainers are regular speakers at international conferences like the Spark Summit, Strata Hadoop World, Berlin Buzzwords and PyData events;
  • Leading technology providers, including Cloudera, Datastax, Confluent, and Databricks, have selected GoDataDriven as their exclusive training provider.

Public or In-Company?

All our training is available both as public training and in-company.

Reasons to organize an in-company training:

  • Large groups: When you need to train 6 or more participants from your organization at the same time, an in-company training can be the most economical option.
  • Custom training requirements: When the standard topics do not entirely meet your requirements, with in-company it is possible to make custom arrangements with the trainer directly.
  • Custom date & location: With an in-company training, you determine the dates and location of the training.

Training Location

GoDataDriven provides training at its own training facilities in Amsterdam. The training facilities are located centrally in Amsterdam and can be reached easily by public transportation as well as by car.

GoDataDriven, Wibautstraat 200-202, 1091 GS Amsterdam

Travel Directions


Very interesting, a good overview and useful hands-on, practical labs. Will definitely recommend to others!

Federico Calore


GoDataDriven is the exclusive training partner of Cloudera (Hadoop), Datastax (Cassandra), Confluent (Kafka) and Databricks (Spark). In addition, GoDataDriven is a partner of Revolution Analytics (R) and NeoTechnology (Neo4j).