Albert Heijn, the largest supermarket chain in the Benelux region, streamlines operations with an updated demand forecasting system based on machine learning.
Replenish stores more accurately and efficiently
Make more accurate predictions
Create a new demand forecasting system based on machine learning
About Albert Heijn
For more than 130 years, Albert Heijn (AH), the largest supermarket chain in the Netherlands, has provided Dutch consumers with fresh groceries and products. In 2011, it expanded to Belgium and became the largest supermarket in the Benelux region.
To make its logistics and operations more sustainable and efficient, the grocery retailer needed a better demand forecasting system that would lead to more on-shelf availability, reduced loss, and more efficient transportation.
Demand Forecasting for a Better Future
Replenishing stores accurately while also reducing waste was proving to be one of Albert Heijn’s biggest challenges. Although the supermarket used demand forecasting, the decades-old system lacked important features. It didn’t account for (extreme) weather, product discounts, or events near stores, and it often required labor-intensive manual corrections that took up valuable time. Because its forecasts were not always accurate enough, the system could not properly support short- and long-term decisions about stock distribution, transport availability, and staffing. In other words, it wasn’t improving the company’s practices.
From Ice Cream to Winter Sausage: Making More Accurate Predictions
To ensure that more than 27,000 grocery products are available in more than 1,100 physical stores and through multiple online channels, Albert Heijn needed a new demand forecasting system based on machine learning, one with more functionality and features that would provide more accurate predictions.
Its large assortment of diverse products was particularly challenging. Some products show strong seasonality, like rookworst (a Dutch sausage usually consumed in winter) and ice cream. Other products respond to changes in the market, such as the growing demand for vegan products, which requires models to adapt well to change.
The company knew that a more accurate demand forecast would allow time to correct truly unexpected and unforeseeable situations — such as the Covid-related food shortages — and would result in more on-shelf availability, reduced loss, and more efficient transportation.
Albert Heijn needed a reliable partner experienced in machine learning, data engineering, and best practices to help design and build the solution, transferring skills and knowledge to internal data scientists and engineers along the way. The company approached GoDataDriven for help, and the collaboration was successful right from the start.
A Custom Solution Deployed Without a Hiccup
The project had many challenges, including prediction quality, timeliness, system stability, and assortment evolution. There were also critical requirements, such as tracking irregularities: predictions that lie outside the bandwidth of historical demand must be flagged and inspected, which made accurate monitoring paramount to the project’s success. Reliability was equally important. Because the replenishment of stores depends on the system’s predictions, any hiccup could disrupt critical processes. Finally, gaining the trust of stakeholders within AH by proving that the project could reliably deliver quality predictions was an ongoing effort.
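The flagging requirement described above can be illustrated with a minimal sketch. The case study does not say how the bandwidth of historical demand was defined, so this example assumes it is simply the minimum and maximum of past demand for an item, optionally widened by a tolerance. The names `Band`, `historical_band`, and `flag_outliers` are hypothetical and are not taken from the Albert Heijn system.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Band:
    """Lower and upper limits of demand considered normal (assumed definition)."""
    lower: float
    upper: float


def historical_band(history: Sequence[float], margin: float = 0.0) -> Band:
    """Build a band from the min and max of past demand.

    `margin` widens the band by a fraction of its width on each side.
    This definition is an assumption made for illustration only.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    lowest, highest = min(history), max(history)
    slack = margin * (highest - lowest)
    return Band(lower=lowest - slack, upper=highest + slack)


def flag_outliers(predictions: Sequence[float], band: Band) -> List[int]:
    """Return the indices of predictions that fall outside the band."""
    return [
        index
        for index, value in enumerate(predictions)
        if not band.lower <= value <= band.upper
    ]


# Made-up daily demand for one item in one store.
history = [12, 15, 9, 20, 14]
predictions = [13, 25, 8, 20]

strict_flags = flag_outliers(predictions, historical_band(history))
lenient_flags = flag_outliers(predictions, historical_band(history, margin=0.1))
print(strict_flags, lenient_flags)
```

In a setup like this, flagged indices would be routed to a person for inspection rather than corrected automatically, which matches the requirement that irregular predictions be both flagged and inspected.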
But the biggest challenge, by far, was the massive scale of the project. The number of predictions — one for each item in the assortment, for every store, up to 50 days into the future — amounted to over a billion forecasts every day. With only a few hours available to process that data and create forecasts, the enormous demand would put a strain on even the most modern cloud services.
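As a rough check on that scale, the figures quoted in this story can be multiplied out. The sketch below assumes that every store carries the full assortment, which is a simplification, and it uses a hypothetical four-hour processing window, since the text only says "a few hours".

```python
# Figures quoted in the case study, used as round numbers.
PRODUCTS = 27_000      # "more than 27,000 grocery products"
STORES = 1_100         # "more than 1,100 physical stores"
HORIZON_DAYS = 50      # "up to 50 days into the future"

# Hypothetical processing window; the text only says "a few hours".
WINDOW_HOURS = 4


def daily_forecast_count(products: int, stores: int, horizon_days: int) -> int:
    """One forecast per product, per store, per day of the horizon.

    Assumes every store carries every product, which is a simplification.
    """
    return products * stores * horizon_days


def required_throughput(total_forecasts: int, window_hours: float) -> float:
    """Forecasts per second needed to finish within the window."""
    return total_forecasts / (window_hours * 3600)


total = daily_forecast_count(PRODUCTS, STORES, HORIZON_DAYS)
per_second = required_throughput(total, WINDOW_HOURS)

print(f"{total:,} forecasts per night")            # 1,485,000,000 forecasts per night
print(f"{per_second:,.0f} forecasts per second")   # 103,125 forecasts per second
```

Under these assumptions the total comes to roughly 1.5 billion forecasts per night, consistent with the "over a billion" figure, and a four-hour window would require on the order of 100,000 forecasts per second.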
Together with the supermarket’s in-house Machine Learning engineers and Data Scientists, GoDataDriven deployed a custom solution on Albert Heijn’s Azure cloud infrastructure and orchestrated it using Airflow. The set-up consisted of three phases: data loading and feature generation, model training, and generating predictions. The data loading and feature generation were done using dbt (data build tool) with a Spark engine hosted in a Databricks environment.
The Overall Result
Every night, production models are provided with the most recent data to create predictions for up to 50 days into the future. The machine learning models are a major improvement over the previous forecast models in two important respects: first, the number of manual adjustments needed has been greatly reduced; second, the new forecasts are more than 5% more accurate than before. The effects on on-shelf availability and loss are still being measured, but overall it is a step in the right direction for the pioneering supermarket chain, one that improves its operations with better forecasting and contributes to a better future for the world.
Project type: Predictive Machine Learning
Technologies used: Azure, Airflow, dbt, Spark, Databricks
“The team at GoDataDriven has shown a ‘hands-on, can-do’ mentality from the very beginning. They brought their experience in machine learning, data engineering, and best practices with them, and transferred their knowledge to the rest of the team from the get-go.”