Designing Your Data & AI Team Structure

Some years ago, that definitely was not the most urgent question for companies. But over time, some of the early big data and data science initiatives have evolved towards machine learning models in production and customer-facing AI solutions. Large scale adoption of data and AI solutions is still lacking, e.g. only 20% of respondents of the Data & AI Survey 2020 report actually realizing business value from data and AI, for many reasons.

From our experience at many clients, we have learned that one underlying reason for lack of effectiveness of AI initiatives is the challenge on how to organize (teams of) data experts. This challenge gets even more relevant if one actually succeeds in getting AI models into production. At the same time the number of data and AI experts in companies grows. So today, as a result, the following question has become quite urgent for many companies: How should we structure our AI organization?

In this blog I will try to answer that question. I will start by summarizing the organizational design principles from the book Team Topologies, which are focused on a somehow similar but different domain: software development. The next step is to assess how these principles can be applied to the Data & AI domain and come to a conclusion on which principles drive the design of Data & AI organizations.

Business or IT?

Initiatives involving data and AI often originate either from data heavy business domains, like marketing and sales, or from the place where most of the data lives, the IT department. The first with a focus on potential business value, the latter arises from the opportunities new technology and the availability of data promise to offer.

In the process of digitization of business processes and customer services and products, business and IT departments have gained much experience in working together. The more mature companies have transformed their IT development and operations organizational structure into cross-functional product teams with a long term mission a.k.a. squads, e.g according to the Spotify model.

But is data and AI like business or IT? Does having data experts following the same transformation into product teams work out as we would like? Often this does not, because of the simple fact that a company does not have enough budget and/or data scientists to have a data scientist in every product team requiring one. But even in case you would have many data scientists, not every product team requires an all year round, full-time data scientist dedicated for the team’s lifespan.

Some companies choose to have one or more teams with only data experts, and then have it operate as a product team. In practice the topology of a product team for such a team feels odd most of the time. Product teams work well as long as they have a single shared end-to-end mission, e.g. a customer journey for one specific set of online services. Data and AI product teams tend to work for many different business domains, on many different use cases. This results in many context switches and priority conflicts of which the latter cannot be expected to be solved by the PO of this team. Sure the PO can and should facilitate the process, but decisions on priorities should be taken on a more strategic level.

Existing theory on organizational structure

There are many books, blogs and other publications on the topic of designing organizational structures for software development. Whether that theory directly applies to the data and AI function of your company is questionable, as stated above; data and AI is not quite the same as software development. But what if we can generalize the theory a bit and try to apply it to our main question?

A book on this topic I really appreciate is "Team Topologies" by Matthew Skelton and Manuel Pais (link). One piece of advice I particularly like is that in designing organizational structures one should focus on determining the design principles, not on the organizational design itself.

Team Topologies, a summary

First of all, the book starts with disqualifying org charts and matrices by mentioning that organizations that rely too heavily on them often fail to create the necessary conditions to embrace innovation while still delivering at a fast pace. The org charts and matrices just do not represent how work gets done and teams interact with each other. In practice, the informal lines of communication enable value creation, not the theoretical and formal lines of command.

Already back in 1968 Mel Conway stated that organizations which design systems, are constrained to produce designs, which are copies of the communication structures of these organizations. A result of this law is that if the desired theoretical system architecture does not fit the organizational model, then one of the two will need to change. We can use this to our strategic advance; we can reshape the organization (i.e. the communication structure) to reflect the favored system design.

For teams to be effective we should take care of the team’s cognitive load and motivation. According to Dan Pink the three elements of intrinsic motivation are Autonomy, Mastery and Purpose. Overloading a team with many requests from multiple business domains, responsibilities for many components of software systems, requiring them to be a jack of all trades (however, a master of none) negatively affect all elements of intrinsic motivation. For teams to be autonomous, the need to communicate with other teams should be limited.
Typically, it takes 3 months for a team to become a cohesive unit. Frequent changes in team composition delay this process. Rewarding teams instead of individuals adds to the team spirit.

Behavioral studies suggest that humans work best with others when we can predict their behavior. We can build trust by providing consistent experiences for others in the organization. Clear roles and responsibilities help in defining expected behavior and thus in making interactions between teams more effective.

When used with care, there are only four team topologies needed to build and run modern software systems. These four types act as a powerful template for organization design. All teams should move toward one of these “magnetic poles”.

Team Topologies
Stream-aligned Team		Is a team aligned to a single, valuable stream of work. This might be a single product or service, a set of features, a user journey, a user persona, etc. No hand-offs to other teams
Complicated-Subsystem Team		Responsible for building and maintaining a part of the system that depends heavily on specialist knowledge Goal is to reduce the cognitive load of a single stream-aligned team
Enabling Team		Composed of specialists in a given technical domain, with no direct product development responsibilities Responsible for the time consuming practice of acquiring state of the art knowledge and facilitating teams in applying it Help multiple stream-aligned teams in acquiring missing capabilities, researching, reading about, learning and practicing new advanced skills
Platform Team		Provide internal services to enable stream-aligned teams to deliver work with substantial autonomy A platform is a foundation of self-service APIs, tools, services, knowledge and support which are arranged as an internal product that the stream-aligned teams can easily consume

A key part of the team topologies approach is to carefully design the team boundaries, keeping in mind that the need for all teams to communicate with all other teams in order to achieve their ends should be avoided. Wherever teams need to communicate an interaction mode can be defined. There are three essential interaction modes.

Interaction modes
Collaboration		Working closely together with another team Conway’s law tells us that with more collaboration, the responsibilities and architecture of the solution will get more blended Advantages: rapid innovation and discovery, fewer hand-offs Disadvantages: wide, shared responsibilities for each team, more detail/context needed between teams, leading to higher cognitive load. A team should not use collaboration with more than one team at the same time
X-as-a-service		Consuming or providing something with minimal collaboration Suitable for teams needing a code library, component, API, platform that “just works” without much effort Advantages: clarity of ownership, with clear responsibility boundaries Disadvantages: danger of reduced flow if the boundary is not effective A team should expect to use this interaction with many other teams simultaneously, whether consuming or providing a service
Facilitating		Helping (or being helped by) another team to clear impediments Advantages: unblocking of stream-aligned teams to increase flow Disadvantages: requires experienced staff to not work on “building” or “running” things A team should expect to use this interaction with a small number of teams simultaneously, whether consuming or providing the facilitation

Organizations should be willing to adapt on a frequent basis by organizational sensing: identifying communication lines, required for value creation, outside the current organizational design is considered to be a signal that the current design is not optimal anymore.

More on this book can be found on its website: Team Topologies website

Team Topologies for Data & AI teams

The theory from Team Topologies is focused on the development of software systems. Although there certainly are similarities in software development and Data and AI product development, the latter is much less predictable due to the explorative and research nature of these projects. These different dynamics and the limited availability of data experts require different team setups. Let’s discuss some aspects of the Data and AI organizational structure using the principles and theory as learned from Team Topologies.

The AI solution life cycle

From ideation to production an AI solution passes different phases. Especially AI use cases based on data sources never used before will take considerable data exploration, interpretation and preparation efforts before experiments can be done searching for predictive power in the data. At any moment in this R&D process the total accumulated knowledge may lead to a conclusion that it just won’t work or that we should adapt the objectives of the project to more feasible ones.

The AI use case may range from e.g. a small single feature within a larger software project used by millions of customers, to a one-off decision support insight based on many data sources, for a small management team.

These aspects (i.e. life-cycle phases and use case types) greatly affect the type and amount of communication required throughout the AI solution life cycle. And because it affects communication, it will determine the optimal organizational structure, according to Team Topologies’ Theory.

Research phase		A lot of communication will be needed with business domain experts during the initial research phase of the project, so to prevent the team from having to maintain the collaboration interaction mode (possibly with many teams), it makes sense to form a cross-functional (AI & business) stream-aligned team.
Development phase		As feasible ways towards the objectives start appearing, communication needs with business domain experts decrease and the AI part of the team could evolve into a complicated subsystem team.
Production phase		Even further down the life cycle, when the AI solution is in production, the product may be handed over to and managed by a platform team using the as-a-service interaction mode to stakeholders.

The best fitting topology will depend on the AI use case type. If less exploratory work is required, because well-known data sets are being used and the objectives are fixed and clear, the AI experts may start as a complicated subsystem team right from the start, using the collaboration interaction mode with the business users.

Data-as-a-service

Before data can be used in BI or AI applications, most data need to be cleaned, transformed, enriched, documented, quality-checked etc. This is a very time-consuming process which should only be done when at least one actual application requires the data source.

A newly prepared data source may be used in a single application, but preferably this data set serves many applications later on. This requires a well-designed data architecture, in other words, a proper design of the data system, which should resemble the organizational structure, according to Conway’s Law.

New data	Existing data

When new data is used, the life cycle of this data source is quite similar to the AI solution life cycle, except that planning and outcome will be a bit more predictable. For the data engineers it makes sense to join a stream-aligned team for the period of time necessary to support the analysts and scientists in exploring and preparing the data, which requires a lot of communication.

After the preparation phase a platform team providing data-as-a-service to multiple other teams may better fit the purpose.

Data & AI Maturity

An aspect having major impact on the amount of communication required is the maturity of anyone in the organization when it comes to Data & AI topics. With limited data and AI literacy in business domains, data experts will spend considerable amounts of time explaining basic concepts to business people. Actually, the same holds the other way round, data experts having limited knowledge on the business domain require a thorough introduction to be able to make sense of the data at hand.

Furthermore, when seniority is limited in the group of data scientists or data engineers a lot of time will be spend on pioneering, re-inventing the wheel, trying ways to get things done, promoting effective ways of working to documented best practices etc. all of which is accelerated by spending time together communicating and interacting on the job. Even if you are able to hire for seniority, the rest of the team still needs to be trained, which again will require time and communication. So, maturity very much affects the amount of communication needed, which in turn affects the organizational design, as is stated in Team Topologies.

A design principle aims for autonomy of teams, or in other words limited communication between teams. This promotes cross-functional teams; data/AI and business experts joining forces, educating each other on their domains. At the same time this principle promotes enabling teams of (to-be) data experts since development and sharing of knowledge and skills also requires lots of communication and time together, especially if the data junior’s autonomy is still too limited to operate on their own in a stream-aligned team.

The autonomy principle leads to contradicting arguments on the optimal topology. For individual data experts time should be spent in both the cross-functional team and the enabling team since both offer learning opportunities that are essential for their career development. The best choice in team design in practice will depend on the specific maturity levels and size of the company.

Limited number of Data Experts

As mentioned before, the assumption is that every company will hit budget limits when it comes to providing full time data engineering and data science capacity to every product team and business domain in need for it.
At the same time, other product teams and business domains might just need the full-time capacity for a couple of months, preventing them from hiring data experts themselves. So a static organizational structure will not meet the needs of all teams and domains in the company.

Conclusion

What I believe to be the huge benefit of using the theory from Team Topologies for the organizational design of Data and AI teams, is that it gives us the (both verbal and visual) language and the definitions to discuss it more effectively and specifically. In the process of designing the organizational structure, it facilitates agreeing on a set of starting points and design principles first, before getting to solutions.

Taking into account all above considerations, not one single fixed organizational design will come out. Probably, a more dynamic hybrid design is preferable, with dynamic meaning that the data experts will move between teams (topologies) every now and then. What the specific organizational design will be depends, amongst other aspects, on the maturity and size of the organization, number of data experts and complexity of use cases.

The resulting design does not need to be fully reflected in an org chart; to facilitate the enabling team topology for data scientists, as is required for further development of expert knowledge and skills, a Community of Practice instead of a “physical team” might work very well for one organization. For the less mature organization, with respect to data and AI, a home base physical team of data experts might be the better option, with a part-time allocation to stream-aligned product teams for data and AI product development during the early phases of the AI life cycle.

In a follow-up blog I will discuss three cases of companies, different in size and maturity, and use the Team Topologies’ design principles to explore possible Data & AI organizational designs.

Would you like to discuss the organizational design for data and AI teams specifically for your organization? Please contact us!
Our AI maturity scan can be used to assess specific aspects of your company. You will get recommendations to get your company to the next level of AI maturity and can be used as input for your company’s AI Strategy.

Data & AI team structure: How to design your Data & AI organization?