Streaming Architecture at Scale Training

Xebia Academy

While some applications enjoy the luxury of batch-oriented processing, others, such as those in the IoT ecosystem, expect events to be ingested and processed as they occur. This training focuses on two key players on the streaming side of data processing: Apache Kafka and Apache Spark!


What you'll learn

  • Fundamentals of queue messaging systems
  • Fundamentals of the Kafka architecture
  • Fundamentals of Spark Streaming, with concepts such as checkpointing, watermarking, streaming windows and more
  • How to consume and process events from Kafka with Spark
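
To give a feel for two of the concepts listed above, streaming windows and watermarking, here is a minimal plain-Python sketch. It is not Spark code and does not reproduce Spark's exact semantics: the `Event` class, the `count_per_window` function, and the integer-second timestamps are all hypothetical names and assumptions chosen for illustration. The idea shown is that events are grouped into fixed (tumbling) windows by event time, and an event is discarded when it arrives later than the watermark allows.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    key: str
    event_time: int  # assumed unit: seconds since an arbitrary start


def count_per_window(events, window_size=10, allowed_lateness=5):
    """Count events per (window_start, key) using tumbling windows.

    Simplified watermark rule (an assumption for this sketch): the watermark
    is the largest event time seen so far minus `allowed_lateness`, and any
    event older than the watermark is dropped instead of counted.
    """
    counts = defaultdict(int)
    dropped = []
    max_event_time = None

    # Events are iterated in arrival order, which may differ from event-time order.
    for event in events:
        if max_event_time is None or event.event_time > max_event_time:
            max_event_time = event.event_time
        watermark = max_event_time - allowed_lateness

        if event.event_time < watermark:
            dropped.append(event)  # too late: the watermark has passed it
            continue

        # Align the event to the start of its tumbling window.
        window_start = (event.event_time // window_size) * window_size
        counts[(window_start, event.key)] += 1

    return dict(counts), dropped


if __name__ == "__main__":
    arrivals = [Event("a", 1), Event("a", 12), Event("b", 9), Event("a", 3)]
    counts, dropped = count_per_window(arrivals)
    print(counts)
    print(dropped)
```

In this sketch the event with time 9 is still accepted although it arrives after the event with time 12, because it is within the allowed lateness; the event with time 3 is not.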

The Program

The program consists of both theory and hands-on exercises for Kafka and Spark.


  • How streaming topics work
  • The basics of messaging systems
  • Watermarks
  • The concept of topics
  • Design considerations for a messaging system
  • Run a Kafka cluster with docker-compose


  • Set up Spark as a consumer for Kafka
  • Process incoming events in real time

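As a rough idea of what the two Spark topics above can look like in practice, the following is a sketch using PySpark Structured Streaming, not the course's actual exercise material. The broker address `localhost:9092`, the topic name `events`, the checkpoint directory, and the function names are assumptions for illustration. Running it requires PySpark, a reachable Kafka broker, and Spark's Kafka connector package (`spark-sql-kafka-0-10`) on the classpath.

```python
def kafka_source_options(bootstrap_servers, topic):
    """Options for Spark's built-in "kafka" streaming source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "latest",
    }


def start_event_counts(
    bootstrap_servers="localhost:9092",              # assumed broker address
    topic="events",                                  # assumed topic name
    checkpoint_dir="/tmp/checkpoints/event-counts",  # assumed path
):
    """Start a streaming query that counts Kafka events per 5-minute window."""
    # Imported here so this module can be loaded without PySpark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, window

    spark = SparkSession.builder.appName("kafka-event-counts").getOrCreate()

    # Spark acts as the Kafka consumer: each record becomes a row with
    # key, value, topic, partition, offset and timestamp columns.
    raw = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
    )

    # Tolerate events up to 10 minutes late, then count per 5-minute window.
    counts = (
        raw.select(col("value").cast("string").alias("payload"), col("timestamp"))
        .withWatermark("timestamp", "10 minutes")
        .groupBy(window(col("timestamp"), "5 minutes"))
        .count()
    )

    # The checkpoint location lets the query resume after a restart.
    return (
        counts.writeStream.outputMode("update")
        .format("console")
        .option("checkpointLocation", checkpoint_dir)
        .start()
    )


if __name__ == "__main__":
    query = start_event_counts()
    query.awaitTermination()
```

The console sink is used here only because it needs no extra infrastructure; a real pipeline would typically write to a durable sink instead.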
Climbing a steep Python and Machine Learning curve in three days. This would have taken me months on my own.

FD Mediagroep Data Scientist

This online course is perfect for

IT engineers and architects who work with data stream processing architectures. Basic experience with Python and Apache Spark is required. If you’re not quite there yet, we recommend the Python for Data Engineers and Data Processing at Scale courses, respectively, as preparation for this training.

What will you learn during Streaming Architecture at Scale?

After this training, you will understand how queue messaging systems work, how to route incoming events in real time with Apache Kafka, and finally how to process them in real time with Apache Spark.

Data Engineering

The Learning Journey for Data Engineers

Learn how to take data and AI ideas from concept to prototype and on to a production-ready application. Acquire the skills to develop and run data and AI solutions at enterprise scale with ease! Take part in a specific training or advance through the entire journey. Learn how to build secure data platforms and reliable AI applications that are engineered for scale.

The Right Format For Your Preferred Learning Style

At GoDataDriven we offer four distinct training modalities:

  • In-Classroom & In-Company Training
  • Online, Instructor-Led Training
  • Hybrid and Blended Learning
  • Self-Paced Training

Clients we've helped

  • ING Bank
  • Ahold Delhaize
  • Quby