Godatadriven blogs

Data Engineering

Azure Data Engineering Python
adfPy: an intuitive way to build data pipelines with Azure Data Factory
Daniel van der Ende on 25 July 2022
Apache Spark Data Engineering Data Science and AI Python
Devil’s in the details: Data Leakage
Erdem Başeğmez on 12 July 2022
Apache Spark Data Engineering dbt
DBT’s missing software engineering piece: unit tests
Cor Zuurmond on 27 May 2022
Azure Data Engineering Python
Deploying a Python Azure function as .zip
Jelle Jan Bankert on 11 May 2022
Apache Spark Data Engineering
Real distributed image processing with Apache Spark
Kris Geusebroek on 25 April 2022
Data Engineering MLops
How to deploy your python project on Databricks
Rogier van der Geer on 20 April 2022
Azure Data Engineering Python
Deploying an Azure Function with Terraform
Niels Zeilemaker on 11 March 2022
Analytics Engineering Data Engineering Open Source Software
Airbyte, the open-source data ingester
lassebenninga@godatadriven.com on 09 March 2022
Analytics Engineering Data Engineering Data Governance Data Platforms dbt Python
dbt + SODA: how to manage your data at scale
Guillermo Sánchez Dionis on 08 March 2022
Azure Data Engineering Data Platforms Python
Putting the Factory in Azure Data Factory: Dynamically generated Pipelines
Daniel van der Ende on 21 December 2021
Data Engineering Data Platforms
Data Mesh – a review
Niels Zeilemaker on 20 December 2021
Data Engineering Data Science and AI Python
Python 3.10 Introduces better error messaging
Herbert van Leeuwen on 09 September 2021
Data Engineering Data Science and AI Python
Python 3.10 introduces Pattern Matching
Giovanni Lanzani on 10 August 2021
Data Engineering
An Agile Approach to Building Data Pipelines
Steven Nooijen on 24 June 2021
Analytics Engineering Data Engineering Data Platforms dbt General
Build data pipelines using dbt on Databricks
Data Engineering
Using Draw.io diagrams as Grafana Dashboard
godatadriven on 19 February 2021
Apache Spark Dask Data Engineering
Why Dask if I may ask?
Roel Bertens on 18 February 2021
Apache Spark Data Engineering Data Platforms Open Source Software
Making joins faster in DataFusion based on table statistics
Daniël Heres on 22 December 2020
Data Engineering Data Science and AI Financial Services Trading
BaaS: Backtest, optimize and discover
Diederik Greveling on 06 October 2020
Apache Spark Data Engineering Data Platforms Open Source Software
Spark on Kubernetes with Argo and Helm
godatadriven on 02 August 2020
Apache Airflow Data Engineering Open Source Software Technology
Highlights of the Apache Airflow 1.10.10 release
godatadriven on 12 April 2020
Data Engineering Data Science and AI Financial Services
To the moon with BaaS
Diederik Greveling on 10 April 2020
AWS Data Engineering
Distributed training a DIY AWS SageMaker model
godatadriven on 28 March 2020
Apache Spark Data Engineering Open Source Software
B.EFFICIENT – Large scale Spark optimisation
godatadriven on 06 March 2020