Godatadriven blogs


Data Engineering Data Science Python Spark
Devil’s in the details: Data Leakage
Erdem Başeğmez on 12 July 2022
Data Engineering dbt Spark
DBT’s missing software engineering piece: unit tests
Cor Zuurmond on 27 May 2022
Blog Data Engineering Spark
Real distributed image processing with Apache Spark
Kris Geusebroek on 25 April 2022
Dask Data Engineering Spark
Why Dask if I may ask?
Roel Bertens on 18 February 2021
Data Engineering Data Platforms Open source Spark
Making joins faster in DataFusion based on table statistics
Daniël Heres on 22 December 2020
Data Engineering Data Platforms Open source Spark
Spark on Kubernetes with Argo and Helm
godatadriven on 02 August 2020
Data Engineering Open source Senior Data Engineer Spark
B.EFFICIENT – Large scale Spark optimisation
godatadriven on 06 March 2020
Medior Data Engineer Medior Data Scientist Spark Train
Spark surprises for the uninitiated
Giovanni Lanzani on 28 January 2019
General Spark
How to Write Code Using The Spark Dataframe API: A Focus on Composability And Testing
Giovanni Lanzani on 27 January 2017