Five days of data technology

GoDataFest 2019 - Presentations

The Week In Review

Recorded talks and slides

GoDataFest 2019 took place from October 28 to November 1. Every day highlighted a specific technology: Amazon Web Services, Microsoft Azure, Databricks, Google Cloud, Dataiku, and open source. With over 45 speakers, more than 50 hours of presentations, workshops, and tutorials, and 400+ unique participants, it was a full week of data learning at the GoDataDriven office.


    Monday, October 28, 2019

    Amazon Web Services

    The first day of GoDataFest was all about smart applications for Formula 1, training your own DeepRacer and racing it on the track, going in-depth with Elastic Kubernetes Service, learning to set up data platforms, and industrializing AI apps on AWS.

    Artificial intelligence in action: delivering a new experience to Formula 1 fans (Guy Kfir – Amazon Web Services) at GoDataFest 2019

    The success story of data science at Exact. Empowering the accountant of tomorrow.

    The new data-driven products of Exact could never have been successfully implemented without the effective use of big data technologies, machine learning, and microservices. At Exact, the data science department has joined forces with product development to realize products that can digest large amounts of data, extract features from structured and unstructured data, and use that data to develop machine learning models. Serving the model in real time, complying with all legal regulations, and handling continuous integration alongside development have taught us many lessons that we would like to share with you. We will introduce the project of mapping bank transactions to different bookkeeping categories using various AWS services such as EMR, Lambda, SageMaker, Step Functions, and CloudWatch. We will also explain how our end users benefit from the developed tools.

    Tuesday, October 29, 2019

    Microsoft Azure


    The Microsoft Azure day focused on how the advent of the cloud and smart technologies is revealing new scenarios that were simply not possible until now. Smart sensors and connected Internet of Things (IoT) devices now allow us to capture new data from industrial equipment: from factories to farms, from smart cities to homes. And whether it’s a car or even a refrigerator, new devices are increasingly cloud-connected by default.

    DevOps for Data Science and AI

    At GoDataFest 2019, CTOs Marcel de Vries and Niels Zeilemaker talked about DevOps for AI, specifically on Microsoft Azure.

    In this talk, they share how to begin projects with the end in mind, how to bridge the gap between a successful experiment and using it in your business, how to set up a cost-effective data science environment on Azure, and how to be secure and compliant by default.

    The world runs on AI – Tony Krijnen (Microsoft) at GoDataFest 2019

    Smart application on Azure at Vattenfall – Rens Weijers & Peter van ‘t Hof

    During GoDataFest 2019, Rens Weijers, manager data & strategy, and Peter van ‘t Hof, data engineer, share the story of how Vattenfall develops smart applications on Azure.

    Wednesday, October 30, 2019

    Databricks


    Databricks’ founders started the Spark research project at UC Berkeley, which later became Apache Spark™. Databricks’ mission is to accelerate innovation for its customers by unifying Data Science, Engineering and Business. Databricks provides a Unified Analytics Platform powered by Apache Spark for data science teams to collaborate with data engineering and lines of business to build data products. Users achieve faster time-to-value with Databricks by creating analytic workflows that go from ETL and interactive exploration to production.

    Solving the world’s toughest problems – Bilal Aslam (Databricks)

    Team Databricks was present in full force at GoDataFest 2019, a five-day meetup hosted by GoDataDriven, a Dutch BigData and ML consulting company (and recipient of our NEMEA Partner of the Year Award for 2019!). I presented a talk titled “Solving the world’s toughest problems with data” to a crowded room.

    Wehkamp: Applied Machine Learning for Ranking Products in an Ecommerce Setting

    At GoDataFest 2019, data scientists Jerry Vos & Arnoud de Munnik share how Wehkamp ranks products by applying ML.

    As a leading fashion e-commerce company in the Netherlands, Wehkamp dedicates itself to providing a better shopping experience for its customers. Using Spark, the data science team is able to develop various machine learning projects for this purpose based on large-scale product and customer data. In this talk, we are going to demonstrate how we use Spark to build the whole product-ranking pipeline and the challenges we faced along the way.

    Quby: Making Homes Efficient and Comfortable using AI, IoT data and the full Databricks stack

    At GoDataFest 2019, Erni Durdevic from Quby shares how they make homes more efficient and comfortable using AI, IoT data and the full Databricks stack.

    Quby is a leading company offering data-driven home services technology across European markets, known for creating the in-home display and smart thermostat Toon. In this talk, Erni will take you on a tour of how Quby leverages the full Databricks stack to quickly prototype, validate, scale, and launch data science products. We will explore the technical workflow of a data science project from end to end. Starting from developing a notebook prototype and tracking the machine learning model performance with MLflow, we move towards production-grade Databricks jobs with a CI/CD pipeline, debugging production code with Databricks Connect, and finally setting up a monitoring system for the jobs.

    During his talk, Erni Durdevic also demoed a spreadsheet to calculate the pressure of a project. Erni was kind enough to share the sheet with us: Demo project pressure

    Koalas – Tim Hunter (Databricks) at Data Council Amsterdam Meetup

    In this talk, Tim will present Koalas, a new open source project that was announced at the Spark + AI Summit in April. Koalas is a Python package that implements the pandas API on top of Apache Spark, to make the pandas API scalable to big data. Using Koalas, data scientists can make the transition from a single machine to a distributed environment without needing to learn a new framework.

    Tim will demonstrate Koalas’ new functionality since its initial release, discuss its roadmap, and explain how he envisions Koalas becoming the standard API for large-scale data science.

    CI/CD with Azure DevOps, Pre-Commit, and Azure Databricks – Niels Zeilemaker

    During the Data Council Amsterdam meetup at GoDataFest 2019, Niels Zeilemaker talks about developing a Python project and building pipelines with Azure DevOps and Databricks.

    Thursday, October 31, 2019

    Google Cloud Platform

    During the Google Cloud day, speakers introduced topics like Cloud Data Fusion and BigQuery ML, and several GCP customers shared insights and experiences from their cloud journey.

    From cloudy to the cloud: Mollie’s data transformation

    The goal at Mollie is to create a level playing field by taking the complexity out of payments. We do this by offering companies of all sizes convenient, secure, and reliable payments, allowing them to focus on growing their businesses.

    Democratizing AI/ML with GCP – Abishay Rao (Google) at GoDataFest 2019