Details
- Wednesday, March 9th
- 8:30 – 10:30 CET (doors open and breakfast is served from 8:00 onwards)
- GoDataDriven office, Wibautstraat 200, Amsterdam
What to Expect
In this Code Breakfast, Kris Geusebroek and Niels Zeilemaker will introduce you to the details of the Delta Sharing protocol and give a live demonstration of data requests via the protocol using Apache Spark and Python. After this introduction, we will continue to explore the protocol and its use with different datasets in the hands-on part of the Code Breakfast. Since the protocol and the Data Mesh concept are cloud-agnostic, we will try to combine datasets from different cloud platforms.
Delta Lake and the Data Mesh
The idea of building a decentralized data platform instead of a single central one has gained a lot of traction. The paradigm shift towards a data mesh was triggered by the observation that, while domain-driven design heavily influenced the way we design operational systems, data platforms kept being built as centralized monoliths.
In May of 2021, Databricks released Delta Sharing, an open protocol for the secure exchange of massive datasets. By generating pre-signed, short-lived URLs that point directly at the requested data and distributing them to authenticated data consumers, Delta Sharing facilitates real-time, platform-agnostic access to an organization’s information. This makes it well-suited as a means of inter-domain dataset distribution in the Data Mesh, an architectural paradigm that erupted into the public consciousness at around the same time.
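To make the idea of a pre-signed, short-lived URL concrete, here is a minimal sketch in plain Python. It is a toy illustration only and not the signing scheme used by Delta Sharing or by any cloud storage provider: the secret key, the `expires` and `signature` query parameters, the host name, and the `presign` and `is_valid` helpers are all assumptions made for this example.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret known only to the party that signs and the party that serves the file.
SECRET_KEY = b"demo-secret-not-for-production"


def _signature(path: str, expires: int) -> str:
    """HMAC-SHA256 over the file path and the expiry timestamp."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def presign(base_url: str, path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return a URL for `path` that stays valid for `ttl_seconds`."""
    issued_at = time.time() if now is None else now
    expires = int(issued_at) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{base_url}{path}?{query}"


def is_valid(url: str, now: Optional[float] = None) -> bool:
    """Check that the URL has not expired and that its signature matches."""
    current = time.time() if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if current > expires:
        return False  # the link is past its lifetime
    # Constant-time comparison against a freshly computed signature.
    return hmac.compare_digest(signature, _signature(parsed.path, expires))
```

A consumer who receives `presign("https://storage.example.com", "/sales/part-0000.parquet", 60)` can fetch that one file directly for the next minute without holding any long-lived storage credentials. Once the expiry passes, or if the path in the URL is altered, `is_valid` rejects the request.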
Speakers
Kris Geusebroek
Kris is a seasoned and communicative developer with a passion for combining technologies to create new possibilities for the people around him. He started developing with Java and gained vast experience with the development of Geographical Information Systems. Over time, Kris gradually developed a passion for open source solutions.
Over the past years, Kris has been working with distributed systems and graph databases, like Hadoop and Neo4j, for large enterprises.
His clients include Rabobank, Wehkamp, Dutch National Police, ING, KNAB, Schiphol, ABN AMRO, and Technische Unie.
Niels Zeilemaker
Niels is Chief of Technology at GoDataDriven and works for a wide range of companies where he engineers features and builds models.
Niels completed his PhD at Delft University of Technology, where he researched P2P systems, primarily focusing on privacy and cooperation, including applying encryption and anonymization techniques in the P2P domain.
Niels is experienced in various programming languages, such as Python, Java, C#, and R.