Apache Airflow

Apache Airflow is a powerful, open-source big data orchestration tool.

What is Apache Airflow?

Apache Airflow is a big data orchestration tool that lets you maximize the value of any type of data in your company, from ETL to model training to any other arbitrary task. Unlike many other orchestrators, everything is written in Python, which makes it easy to use for both engineers and scientists. And because everything is code, it is easy to version and maintain.
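
To make "pipelines as code" concrete, here is a conceptual sketch in plain Python (deliberately without any Airflow dependency, so the idea stands on its own): a pipeline is a set of tasks plus the dependencies between them, and the orchestrator's job is to run each task only after its upstream tasks have finished. The task names are illustrative, not from an actual Airflow DAG.

```python
# Conceptual sketch: a pipeline as a Python dependency graph.
# Each key is a task; its value is the set of upstream tasks it waits for.
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

dependencies = {
    "extract": set(),              # no upstream tasks
    "transform": {"extract"},      # runs after extract
    "train_model": {"transform"},  # runs after transform
    "load": {"transform"},         # also runs after transform
}

# Resolve a valid execution order: every task appears after its upstreams.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

In Airflow the same structure is declared with operators and `>>` dependency arrows inside a DAG file, and the scheduler takes care of running tasks in dependency order, retrying failures, and backfilling.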

As a powerful big data orchestration tool, Apache Airflow refines data from an unrefined state into a reliable, more useful state. What’s more, Airflow spreads the capacity to do this across your organization, thus creating a movement towards data excellence.

GoDataDriven and Apache Airflow

At GoDataDriven we are open-source enthusiasts. We dedicate a lot of time and effort to contributing to open-source projects such as Apache Airflow. Did you know that our very own Fokko Driesprong is one of the project's committers, and that many more colleagues regularly improve its source code?

We regularly organize Apache Airflow code breakfasts and meetups. You can join the Apache Airflow Meetup group here.

Recent Airflow Blogposts/News

Here's an overview of the recent articles we published about Apache Airflow.

Apache Airflow graduation as Apache Top-Level

Highlights from the New Apache Airflow 1.10.2 Release

Open Sourcing Airflow Local Development

Testing and Debugging Apache Airflow

The Zen of Python and Apache Airflow

GoDataDriven Open Source Contribution for January 2019, the Apache Edition

Airflow Training

Are you a data scientist or a data engineer who wants to bring data products to production? This training is for you. In this two-day training, you will get a general introduction to Airflow and learn the situations in which it will benefit you. You will gain hands-on experience with Airflow by writing a DAG that interacts with components on Google Cloud, for instance BigQuery, Dataflow, and Cloud Storage. Furthermore, you will learn how to write custom operators, hooks, and sensors using Airflow internals.

Airflow Training Topics

Our two-day Apache Airflow training will provide you with hands-on experience in:

  • The Airflow ecosystem

  • Analyzing data types that are suitable for using Airflow

  • Building and automating data pipelines

You will develop the skills to:

  • Use and manage Apache Airflow

  • Build and assess data pipelines

  • Integrate Airflow seamlessly into your data landscape using custom operators

More information and registration for Apache Airflow Training

Questions/Contact Us

Do you have any questions about Apache Airflow, upcoming events, or training sessions? Fill out the form below, and we'll be in touch!

We believe that getting data into motion by building pipelines and refineries, running it through machine-learning algorithms, and processing it to drive dashboards is becoming the lifeblood of leading organizations. Yet most organizations struggle to run their data products in a uniform manner, which causes a mismatch between their talents and their requirements. By automating your data pipelines, Apache Airflow allows your organization to turn its dormant assets into dynamic resources that provide a competitive edge.

Giovanni Lanzani
Director of Learning and Development