Personal recommendations based on viewed content
Custom Predictive Modelling // Divolte
Background info about NPO
For over 60 years, the Nederlandse Publieke Omroep (NPO), Public Broadcasting Organization, has been producing and broadcasting radio and television. Every week, NPO reaches 85% of the Dutch population, with a larger presence online. To provide relevant content for online viewers, NPO recently implemented several smart data applications.
Because of changes in viewing behavior and an increased number of channels, it has become increasingly difficult for media companies to retain viewer attention. To remain relevant in the coming decade as a public broadcaster and to make sure that viewers don’t get lost in the overwhelming offer of video content, NPO set an objective to become a personal broadcasting organization.
NPO began by identifying potential use cases, such as A/B tests, automatically generated playlists, and personal newsletters, to offer consumers personal viewing experiences. To develop these use cases successfully, it defined three main themes: dashboards, recommenders, and the introduction of NPO ID.
The right process for innovation
Product innovation required implementing an agile methodology and working in multidisciplinary teams.
“For NPO, agile working in multidisciplinary teams was not common practice, to put it mildly. We found out that it was not only necessary to adapt our way of working, but that the whole office needed a makeover. We have made necessary adjustments to establish a modern and flexible workspace through the full line of business,” said Marcel Collette, then manager of information systems at NPO.
Because NPO did not employ data engineers and data scientists at the time, the broadcaster decided to in-source these skills. The consultants from GoDataDriven helped kickstart the project and also actively shared knowledge with all team members.
“A strong startup vibe could be felt within the new data team, especially in the early days. Speed was everything,” according to Erik van Heeswijk, then ad interim project manager responsible for big data strategy development at NPO. “In a later phase, the focus shifted more to internal support and process management within the organization.”
Setting up the data platform and developing use cases
Software is often developed for commercial purposes, and its features don’t always fit NPO’s public character and its role as a guide. Webshops put a lot of effort into designing a tight sales funnel or into conversion optimization; for NPO, these topics are not relevant. What has always been very relevant to NPO is transparency. It is important that users know what personal details are stored and what they are used for, which means that NPO only uses data from individual users to provide relevant recommendations when permission has been granted.
NPO has built their infrastructure using proven, open-source technology like Python, Java, Divolte, Hadoop, and Spark. “By building the platform and data-driven applications from the ground up, we have been able to experience the entire learning curve and learn to understand how everything works,” explained Collette.
Much effort has been put into realizing a central data platform based on HDFS and Spark that combines data from the various brands within the NPO organization. The implementation of the platform started with hosting. To process streaming internet data in an optimal and scalable way, NPO chose to implement Divolte as its clickstream collector and to host the platform in the cloud.
“We had very little experience with cloud hosting. It was important that our ICT department built up this expertise, so they were closely involved directly from the start. When the cloud environment proved to be solid and secure, internal support instantly increased,” said Erik van Heeswijk. “The visualization of data in dashboards and the demos after every sprint also contributed to the internal support.”
Understanding behavior through dashboards
For the initial dashboard development and to understand data collection, NPO initiated two pilot projects: one with a daily platform (NOS) and one with a general platform (NPO.nl). Editors were actively involved and determined how to collect data and which metrics and visualizations to focus on. Their specific domain knowledge provided context for what works and what doesn’t.
Setting the right KPIs was essential. If content were valued purely on the number of people who viewed the whole program, NPO would have to cap video duration at five minutes and offer content about cats and Katja Schuurman. The following graph shows the ratio of video length to percentage of viewers that viewed the whole program:
Based on the collected data, various visualizations were developed, including the most popular articles on the NOS website, the life expectancy of an article (how long an article will remain relevant), and the current number of viewers compared to the expected number.
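As a rough illustration of the completion-rate metric discussed above, the sketch below groups views by video length and computes the share of views in which (nearly) the whole program was watched. The records, the bucket boundaries, and the 95% threshold are all assumptions made for this example; they do not come from NPO's data or dashboards.

```python
from collections import defaultdict

# Hypothetical view records: (video_duration_seconds, seconds_watched).
views = [
    (120, 120),
    (180, 90),
    (240, 240),
    (1500, 300),
    (1800, 1800),
    (2700, 600),
    (3000, 450),
]


def duration_bucket(duration_s):
    """Assign a video to a coarse length bucket (boundaries are illustrative)."""
    if duration_s <= 300:
        return "0-5 min"
    if duration_s <= 1800:
        return "5-30 min"
    return "30+ min"


def completion_rate_by_bucket(records, threshold=0.95):
    """Share of views per bucket in which at least `threshold` of the video was watched."""
    totals = defaultdict(int)
    completed = defaultdict(int)
    for duration_s, watched_s in records:
        bucket = duration_bucket(duration_s)
        totals[bucket] += 1
        if watched_s >= threshold * duration_s:
            completed[bucket] += 1
    return {bucket: completed[bucket] / totals[bucket] for bucket in totals}


rates = completion_rate_by_bucket(views)
for bucket, rate in rates.items():
    print(f"{bucket}: {rate:.0%} of views completed")
```

On made-up data like this, short videos show the highest completion rate, which is why completion alone would be a misleading KPI for a broadcaster that also produces long-form programs.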
NPO uses open-source solution Divolte to collect all website interactions in real-time. The collected data can be used to create dashboards as well as recommendations based on viewed content.
Recommendations based on the video content on the site allow NPO to offer visitors a more relevant program. Views of long-tail content have increased, so content from NPO’s entire video catalog is now being served. “Not only do we see that content is watched more often and longer, but the value of older content has also increased due to the recommenders,” explains Erik van Heeswijk.
Correlations between viewing behavior and programs from different NPO brands are analyzed. Every color is a different brand; every circle is a program. The bigger the circle, the longer the playtime.
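The article does not describe how NPO's recommenders work internally. As a minimal sketch of one common way to recommend from viewed content, the example below counts how often two programs are watched by the same viewer and suggests the programs that most often co-occur with what someone has already seen. The viewing histories, program names, and function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical viewing histories: viewer id -> set of program ids watched.
histories = {
    "u1": {"news", "docu_a", "drama_x"},
    "u2": {"news", "docu_a"},
    "u3": {"docu_a", "drama_x"},
    "u4": {"news", "talkshow"},
}


def build_cooccurrence(histories):
    """Count, for every ordered pair of programs, how many viewers watched both."""
    co = defaultdict(Counter)
    for programs in histories.values():
        for a, b in permutations(programs, 2):
            co[a][b] += 1
    return co


def recommend(watched, co, k=2):
    """Return up to k unwatched programs with the highest summed co-occurrence counts."""
    scores = Counter()
    for program in watched:
        for other, count in co.get(program, {}).items():
            if other not in watched:
                scores[other] += count
    # Sort by score (descending), then by name so that ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [program for program, _ in ranked[:k]]


co = build_cooccurrence(histories)
print(recommend({"drama_x"}, co))
```

Because the counts are built from everything viewers watched rather than only from popular titles, an approach like this can surface older or niche programs, in line with the long-tail effect described above. At catalog scale the same counting would typically run as a batch job on the data platform rather than in memory.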
Transparent use of information
Transparency is very valuable to NPO. There is a clear separation between anonymous browsing and a personal environment available behind a log-in.
“As a public broadcasting company, we have to take privacy very seriously. We carefully protect our user data. An organization like Channel 4 acts as a reference for us. Anonymous website visitors receive recommendations based on general trends and editor picks. But logged-in users will receive more personal recommendations soon,” according to Marcel Collette. “Visitors that create an NPO ID will receive better service and content they will appreciate. Building up a relationship between the viewer and NPO is crucial for this relationship and we can only achieve it with a clear proposition. Members can always access their personal data and have it deleted from our databases if they choose.”
By understanding the viewing behavior on the NPO websites better, the public broadcasting organization has been able to develop relevant recommendations that will lead to even more engaging content in the future.