Streaming Architecture at Scale

Two-Day Training

While some applications have the luxury of processing data in batches, others, such as those in the IoT ecosystem, must ingest and process events as they occur. This training focuses on two key players on the streaming side of data processing: Apache Kafka and Apache Spark.

Download Training brochure

Download the GoDataDriven brochure for a complete overview of available training sessions and data engineering, data science, and analytics translator learning journeys.


This online course is perfect for

IT engineers and architects who work with data stream processing architectures. Basic experience with Python and Apache Spark is required. If you’re not quite there yet, we recommend the Python for Data Engineers and Data Processing at Scale courses, respectively, as preparation for this training.

What will you learn during Streaming Architecture at Scale?

After this training, you will understand how message queuing systems work, how to route incoming events in real time with Apache Kafka, and how to process them in real time with Apache Spark.

The Program

The program consists of both theory and hands-on exercises for Kafka and Spark.


  • How streaming topics work
  • The basics of messaging systems
  • Watermarks
  • The concept of topics
  • Design considerations for a messaging system
  • Run a Kafka cluster with docker-compose
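
To make the concept of a topic concrete, here is a minimal in-memory sketch in Python. It is not Kafka and uses no Kafka API; the names `ToyTopic`, `publish`, and `poll` are hypothetical and chosen for illustration only. It assumes two simplified ideas: a topic is an append-only log split into partitions, and each consumer keeps its own read offset per partition.

```python
from collections import defaultdict
import zlib


class ToyTopic:
    """Illustrative append-only log with partitions (hypothetical, not a Kafka API)."""

    def __init__(self, num_partitions=2):
        self.partitions = [[] for _ in range(num_partitions)]
        # offsets[consumer][partition] -> next position to read
        self.offsets = defaultdict(lambda: defaultdict(int))

    def publish(self, key, value):
        # Events with the same key always land in the same partition,
        # so their relative order is preserved.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        return partition

    def poll(self, consumer):
        # Return everything this consumer has not read yet and advance its offsets.
        unread = []
        for index, log in enumerate(self.partitions):
            start = self.offsets[consumer][index]
            unread.extend(log[start:])
            self.offsets[consumer][index] = len(log)
        return unread


topic = ToyTopic(num_partitions=2)
topic.publish("sensor-a", 21.5)
topic.publish("sensor-b", 19.0)
topic.publish("sensor-a", 22.1)

first_read = topic.poll("dashboard")
second_read = topic.poll("dashboard")  # nothing new since the last poll
late_joiner = topic.poll("archiver")   # a separate consumer starts from offset 0
```

Because offsets are tracked per consumer, the hypothetical `archiver` still sees every event even though `dashboard` has already read them. That independence between readers is the property the sketch is meant to highlight.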


  • Set Spark as a consumer for Kafka
  • Process incoming events in real time
  • Fundamentals of message queuing systems
  • Fundamentals of the Kafka architecture
  • Fundamentals of Spark Streaming, with concepts such as checkpointing, watermarking, streaming windows, and more
  • How to consume and process events from Kafka with Spark
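
Watermarking and streaming windows can be illustrated without a cluster. The pure-Python sketch below is a simplification and not Spark's implementation; the function name `tumbling_counts` and its parameters are hypothetical. It assumes a watermark defined as the highest event time seen so far minus an allowed delay, counts events per fixed-size (tumbling) window, and drops events that arrive behind the watermark.

```python
def tumbling_counts(events, window_size, allowed_delay):
    """Count events per tumbling window, dropping events older than the watermark.

    events: event-time timestamps (in seconds), listed in arrival order.
    """
    counts = {}
    dropped = []
    max_event_time = float("-inf")
    for ts in events:
        max_event_time = max(max_event_time, ts)
        # Simplified watermark: latest event time seen minus the allowed delay.
        watermark = max_event_time - allowed_delay
        if ts < watermark:
            dropped.append(ts)  # too late, ignored
            continue
        window_start = (ts // window_size) * window_size
        counts[window_start] = counts.get(window_start, 0) + 1
    return counts, dropped


# Timestamps 8 and 3 arrive out of order; only 3 is behind the watermark.
arrivals = [1, 4, 12, 8, 25, 3]
counts, dropped = tumbling_counts(arrivals, window_size=10, allowed_delay=5)
```

With these inputs, the event at time 8 is still counted in the window starting at 0, because it arrives within the allowed delay, while the event at time 3 arrives after the watermark has moved past it and is dropped.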

Training Formats

This training is available in the following formats:

In-Company Classroom

In-Company training is perfect for groups of 6 or more. The training takes place online, at your office, or at one of our modern training facilities.

Online Virtual Classroom

Virtual Classrooms provide you with an interactive environment to effectively develop your skills, right from the comfort of your own home or office.

Data Science Engineering Journey

This data engineering learning journey is open to all data experts. Our extensive training programs are designed to develop your skills from junior to senior level.

How do you become a data engineering expert? Start here! We’ve put together a carefully crafted learning journey for data engineers. Knowing that engineers love to figure things out on their own, we packed the program with opportunities to learn hands-on by working through real-life situations. There’s plenty of practical philosophy, too.

We’ll teach you how to use Docker to ease your deployments and how to navigate code written by data scientists (Advanced Python and Data Science in Production). You will learn to use Apache Airflow, Apache Spark, and Kafka like a forklift to move data around.

Click here for more information about the Learning Journey for Data Engineers

Any questions? Please get in touch!

Contact Gert-Jan Steltenpool, our Sales Director, if you want to know more. He’ll be happy to help you!