黑料门 has been implementing data mesh since . Since then, we’ve implemented data mesh with clients globally.

The following 10 recommendations are based on insights gained from that experience. For each, we describe the anti-patterns we have observed, the approach we recommend instead, and why. The recommendations are ordered top-to-bottom by organizational level.
Recommendation #1: Bottom-up-only approaches don’t work, so get top-down buy-in early

Management and C-level buy-in for data mesh can be challenging to obtain. This often leads data mesh evangelists to attempt a bottom-up implementation by building data products within a single domain that specifically wants data mesh.

We have, however, seen these domains and data products hit an impassable barrier when attempting to expand data mesh to other departments that are skeptical about the entire approach. Crossing departmental boundaries and implementing shifts that change priorities, funding, roles, and responsibilities is very difficult to do in a bottom-up manner. It can elicit awkward moments when someone asks “who are you, and why are you telling me what to do?”

Platform teams sometimes face the same issue. They should be the enablers and coaches on best practice, but without top-down support they cannot change data product teams’ ways of working or team setups, or even implement a unified set of data governance policies. Data product teams encounter a similar barrier when they try to persuade a non-data mesh team to give them access to data or help with interpreting it. Scaling data mesh requires a top-down mandate to create agreement and alignment among parties with different agendas and interests.

Both top-down and bottom-up buy-in are required: excited teams at the bottom that are willing to change their way of working, and leadership at the top that supports that change.
Recommendation #2: Start with the operating model

Although data mesh requires changes in both operating model and technology, the operating model is often sidelined because it’s too difficult to change. As a result, organizations often attempt a technology-first approach to data mesh. This approach, while improving technological practices, often results in failure within the first year, because the structures required to support the scaling of a data mesh haven’t been adequately changed to accommodate new ways of working.

While it’s true that changing the operating model can be difficult, it’s also critical. It should be tackled from day one of a data mesh initiative.

Data mesh isn’t a project; it’s an enterprise program. It requires support from across the organization because it affects how teams collaborate with one another. This means high-level sponsorship and buy-in from the top are needed to ensure organization-wide alignment. It also requires a certain level of change management, such as creating governance bodies, redefining roles and responsibilities and upskilling the organization. A transformation office can help here: by bringing together people with knowledge and experience of both the operating model and the technology, it can ensure organizational alignment.

In summary, tackling an operating model change can be daunting, but it is an essential part of successfully implementing data mesh within an organization. Do it early.
Recommendation #3: Define domains in a way that represents organizational domain objects and optimizes for efficiency and communication

Domain ownership is a key principle in data mesh. It’s important because it ensures that each data product in the data mesh is owned by someone who has expertise in that specific area of the organization. The benefit is that it makes data products more useful to those who might want to use them: it removes potential confusion about what certain terms mean in given data fields and helps mitigate inconsistencies. In other words, if something doesn’t make sense, the domain owner is best placed to amend it or provide the necessary context.

Of course, this isn’t without challenges; defining the boundaries of organizational domains, and who owns what, is difficult. The difficulty often stems from predefined budget and reporting lines or from political undercurrents. Because of this, at the beginning it might be easiest to define domains along the boundaries of existing functions. You can then split domains that become too complex, or redraw domain boundaries as the project progresses and new information surfaces. Better yet, start with one domain and explore outwards iteratively over time. Organizations already working with domain-driven design in their operating model have an easier time making this operational shift.

There are, in the end, multiple ways to define a domain. What’s important is that the domains make sense in the context of the organization, are documented, and make communication and processes faster and more efficient. Some options for drawing domain boundaries:

Along existing business units or functions
Along business outcomes (goals such as “increase profits” or “increase customer satisfaction”)
Along value streams (initiatives that deliver value or outcomes to customers)
Using the Inverse Conway Maneuver (structuring teams according to the desired architecture rather than letting existing communication paths and structures shape it)
Using domain-driven design

In summary, domain definitions allow organizations to identify the owners and experts of data and to optimize for efficient lines of communication and collaboration. The definitions are unique to every organization. While the first domain definition can be difficult, start with the method that would be easiest to adopt in your organization and then evolve it once teams are familiar with the ways of working and responsibilities.
Recommendation #4: Develop the operating model, data governance and platform together

The platform brings to life the roles, responsibilities and ways of working defined by the operating model, along with the policies defined by data governance.

One anti-pattern that we often see is the operating model and data governance being run as separate, isolated projects, each of which produces a collection of documentation that is thrown over the wall to an IT department to implement. Here it may be helpful to think of “data mesh” as a product, where the operating model, data governance and platform share the same underlying goals, hypotheses, initiatives and backlog. With this unified approach, the operating model and data governance concepts are tested and improved iteratively through data products, which are enabled by the platform.

We recommend that the first chosen use cases define operating model and data governance policies as part of their requirements. An example requirement:

“When a data product team creates a data product, it is automatically registered in the data catalog along with a description of its input ports, output ports, schemas, SLOs, domain, and domain owners in order to increase transparency of the data product within the organization.”

The platform can then implement such a requirement, after which the relevant operating model and data governance offices can monitor the impact of the policy in the data mesh ecosystem and check whether it lives up to its defined measures of success.
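To make the example requirement concrete, here is a minimal Python sketch of how a platform might couple data product creation to catalog registration. Everything named here is an assumption for illustration: `DataProductDescriptor`, `InMemoryCatalog`, `create_data_product`, the field names, and the choice of which fields are mandatory are all hypothetical and do not correspond to any particular catalog tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProductDescriptor:
    """Metadata named in the example requirement (field names are hypothetical)."""
    name: str
    domain: str
    domain_owners: tuple[str, ...]
    output_ports: tuple[str, ...]
    schemas: dict[str, dict[str, str]]  # output port -> {column: type}
    slos: dict[str, str]                # e.g. {"freshness": "24h"}
    input_ports: tuple[str, ...] = ()   # assumed optional for source-aligned products


# Assumption: these fields must be non-empty before a product may be registered.
REQUIRED_FIELDS = ("domain", "domain_owners", "output_ports", "schemas", "slos")


class InMemoryCatalog:
    """Stand-in for a real data catalog tool; illustration only."""

    def __init__(self) -> None:
        self._entries: dict[str, DataProductDescriptor] = {}

    def register(self, descriptor: DataProductDescriptor) -> None:
        missing = [f for f in REQUIRED_FIELDS if not getattr(descriptor, f)]
        if missing:
            raise ValueError(f"{descriptor.name}: missing {', '.join(missing)}")
        self._entries[descriptor.name] = descriptor

    def lookup(self, name: str) -> DataProductDescriptor:
        return self._entries[name]


def create_data_product(descriptor: DataProductDescriptor,
                        catalog: InMemoryCatalog) -> DataProductDescriptor:
    """Hypothetical platform entry point: registration is not a separate, optional step."""
    catalog.register(descriptor)  # fails fast if governance metadata is incomplete
    # Provisioning of storage, compute and so on would follow here (elided).
    return descriptor


catalog = InMemoryCatalog()
create_data_product(
    DataProductDescriptor(
        name="orders",
        domain="sales",
        domain_owners=("sales-data-owner@example.com",),
        output_ports=("orders_daily",),
        schemas={"orders_daily": {"order_id": "string", "amount": "decimal"}},
        slos={"freshness": "24h"},
    ),
    catalog,
)
```

The design choice worth noting is that the team never calls the catalog directly: because registration sits inside the creation path, the governance policy is enforced by the platform rather than by documentation.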
Recommendation #5: Establish a good platform early

The notion of a self-serve data platform is one of the four principles of data mesh. The platform is the technical implementation of the decisions made about baseline data product capabilities, domain definitions, and governance policies. It offers capabilities to data product teams in a self-service way so they don’t need to wait for the platform team to create resources and integrations on an ad-hoc basis.
With the data mesh approach, the platform team is no longer responsible for maintaining and transforming data for analysts to consume. Instead, it is responsible for shaping and building data product offerings (self-contained deployable packages) that data product teams can request and use to maintain their data.

These data product offerings are what is called an “architectural quantum” (as defined in Evolutionary Architectures). They contain everything a data product team needs to build its data product, such as base capabilities for data ingestion, storage, distributed compute for data transformation, integration points with the data catalog tool, monitors in the data quality tool, and governance policies. Sometimes there are multiple offerings for different types of data products. These guardrails and templates make it as easy as possible for teams to deliver data products.

As self-serve data platforms are the main enabler of data mesh, they need to be established early. A common mistake is for organizations to start data product teams without a basic platform. These initial data product teams are then stuck for months at a time, waiting for the platform to develop baseline capabilities, which puts a strain on ongoing initiatives. At the other end of the spectrum, some organizations spend years attempting to build the perfect platform, which takes too long to deliver value and is too difficult to maintain and operate.

The right answer is somewhere in the middle: a “good platform” is one built upon requirements that were determined by researching what data product teams actually need. A well-researched self-service platform plays a significant role in reducing the friction that can occur when creating data products.

To avoid spending too much time building the perfect platform, we recommend defining a set of core MVP capabilities that are “just good enough” to get data product teams moving quickly and delivering value to the organization. Then continue to build and scale iteratively with learnings from new data product teams and domains.
Recommendation #6: Reuse your existing technologies in the new data mesh platform

Many clients at the early stage of data mesh are demoralized by the sheer number of technologies and tools required to power it. The truth is that while some tools might fit the requirements of data mesh better than others, it’s best to leverage your existing stack, licenses and expertise where it makes sense. You can then add custom layers to improve developer experience, and use new tools to fill any remaining capability gaps.

Note that “reusing existing technologies” does not mean “reusing an existing or older platform”. Existing or older platforms may not be compatible with the data mesh approach because data mesh requires a critical paradigm shift in the way the platform is built. Reusing older platforms that don’t conform to the data mesh approach can also add complexity, increase costs and slow you down.

We suggest taking an inventory of existing technologies and comparing them to the required capabilities of a data mesh self-serve data platform and your data product requirements. Reuse the tools that fit those requirements and are mature enough to be used via self-service (for example, APIs are available, or resources can be defined declaratively for the purposes of Infrastructure as Code).
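The inventory comparison described above can be sketched as a small gap analysis. This is only an illustration under stated assumptions: the `Tool` type, the `plan_reuse` function, the capability names and the tool names are all hypothetical, and the self-service test simply encodes the two criteria mentioned in the text (an API is available, or resources can be defined declaratively).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """One entry in the technology inventory (hypothetical structure)."""
    name: str
    capabilities: frozenset[str]
    has_api: bool
    declarative_config: bool

    @property
    def self_service_ready(self) -> bool:
        # Mirrors the text: usable via self-service if either criterion holds.
        return self.has_api or self.declarative_config


def plan_reuse(required: set[str], inventory: list[Tool]) -> tuple[dict[str, str], list[str]]:
    """Map each required capability to the first suitable existing tool.

    Returns (reuse, gaps): capabilities covered by an existing tool, and
    capabilities for which a new tool or custom layer is still needed.
    """
    reuse: dict[str, str] = {}
    for capability in sorted(required):
        for tool in inventory:
            if capability in tool.capabilities and tool.self_service_ready:
                reuse[capability] = tool.name
                break
    gaps = sorted(c for c in required if c not in reuse)
    return reuse, gaps


# Made-up inventory and requirements, purely for illustration.
required = {"storage", "compute", "catalog", "data_quality"}
inventory = [
    Tool("warehouse_x", frozenset({"storage", "compute"}), has_api=True, declarative_config=True),
    Tool("legacy_catalog", frozenset({"catalog"}), has_api=False, declarative_config=False),
]
reuse, gaps = plan_reuse(required, inventory)
print(reuse)  # {'compute': 'warehouse_x', 'storage': 'warehouse_x'}
print(gaps)   # ['catalog', 'data_quality']
```

In this made-up inventory the existing catalog covers the right capability but is excluded because it cannot be driven via self-service, so it shows up as a gap alongside data quality.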
Recommendation #7: Move slow and start small with use cases

The number one reason why data mesh projects fail is that they try to scale too fast. Instead, give your organization the time to learn and adjust to the change. This initial patience will pay off in the long run.

We recommend starting with a use case that has two to three data products and spans three dimensions (operating model, product and tech) over the course of six months.

Understandably, some organizations believe this approach to be “too conservative”. They may want a more aggressive approach that onboards several use cases at once at the beginning. We have found, however, that it typically takes six months to bootstrap the first data products across the three dimensions above. Once that period is complete, data product owners need to be trained and a platform team needs to be assembled to begin building out new platform capabilities based on learnings. New templates and ways of working are also needed to shift the organization from a centralized to a decentralized model. A data mesh governance board will also need to be established, with a roadmap to onboard additional domains.

Simply put: don’t run before you can walk. There will be plenty of learnings to take away on the journey. Don’t miss them.
Recommendation #8: Be deliberate about your first use cases

Choosing your first set of use cases can be daunting, but it’s important to remember there is no single right answer. Every organization is different. The decisions you make will depend on what you want to optimize for and your organization’s risk appetite.

For example, some clients choose highly urgent, complex use cases. They’re often eager to address inefficiency and internal politics that are causing pain in the organization. Others have gone the more conservative route by picking a simple, isolated use case to test the organizational appetite for data mesh.

Other clients, meanwhile, choose to optimize for the build-out of a diverse set of platform capabilities. A mature data mesh data platform provides capabilities for batch data processing, [near] real-time data processing, analytics, AI/machine learning (ML) and data governance. Choosing use cases that address each of these capabilities sets the precedent (or priority) to build the groundwork for all platform capabilities in parallel.

Another benefit of this approach is that use cases requiring more than one capability (such as ML and batch capabilities, a common dependency) become candidates for the first use cases. This approach, however, requires a large and strong platform team that can handle product thinking and the complexity of integrating capabilities in a compatible and interoperable way.
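For teams that take the capability-driven route, a quick coverage check can show which platform capabilities a candidate set of use cases would exercise. The sketch below is illustrative only: the capability labels are shorthand for the five capabilities listed above, and `assess_use_cases` and the use case names are hypothetical.

```python
# Shorthand labels for the platform capabilities listed in the text.
PLATFORM_CAPABILITIES = frozenset(
    {"batch", "near_real_time", "analytics", "ml", "governance"}
)


def assess_use_cases(use_cases: dict[str, set[str]]) -> tuple[list[str], list[str]]:
    """Return (missing, multi_capability).

    missing: platform capabilities that no candidate use case exercises.
    multi_capability: use cases that need more than one capability and so
    force the platform team to integrate capabilities early.
    """
    covered: set[str] = set()
    for needs in use_cases.values():
        covered |= needs & PLATFORM_CAPABILITIES
    missing = sorted(PLATFORM_CAPABILITIES - covered)
    multi_capability = sorted(
        name
        for name, needs in use_cases.items()
        if len(needs & PLATFORM_CAPABILITIES) > 1
    )
    return missing, multi_capability


# Made-up candidate use cases, purely for illustration.
candidates = {
    "churn_model": {"ml", "batch"},
    "fraud_alerts": {"near_real_time"},
    "sales_dashboard": {"analytics", "batch"},
}
missing, multi_capability = assess_use_cases(candidates)
print(missing)           # ['governance']
print(multi_capability)  # ['churn_model', 'sales_dashboard']
```

A check like this does not pick the use cases for you; it only makes visible which capabilities would be left without a driving use case.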
While there are many approaches and ways to optimize, our recommendation is that the first chosen use case should be:

Manageable given the organization’s existing capabilities
Tied to business goals with clear metrics of success
Attainable

It’s easy to become consumed by the choice of use cases due to internal politics and over-optimization, but there is often no single perfect use case. A common pitfall to avoid is paralysis by analysis: timebox the exercise and start with “good enough”.
Recommendation #9: Onboard data products onto the platform AND the operating model’s governance structures

An anti-pattern that we see in IT-driven organizations is that the onboarding of data product teams stops once the team is on the platform. In reality, it is also critical to onboard them to the organizational processes set up by the operating model.

Proper onboarding to the operating model allows data product teams to be represented in various forums where they can influence important activities, such as feature prioritization. Onboarding also involves adding them to the right communication channels so that they don’t miss out on important information about new features, releases, and learning opportunities.

In the long run, teams that are not onboarded to the operating model might not be aligned with the principles of the wider data mesh ecosystem. This could lead to discontent and frustration on all sides.
Recommendation #10: Be committed

Data mesh implementations sometimes require large changes over time that affect many people, existing departments and decision-making processes. This can prove difficult when some parts of an organization are resistant to change.

It might be argued that fast-changing organizations (like scale-ups, where experimentation is valued and there is a more relaxed attitude to data access policies) and more established, mature enterprises (with greater centralization and more stubborn legacy silos) face different problems and so should be treated differently. While there is an element of truth to this, the reality is that either way, if the organization wants to be successful, it needs to be willing to commit to the change in terms of implementation and resources.

Our most successful data mesh adopters have done the following:

Obtained dedicated top-level sponsorship early on
Adopted data mesh as part of the organization’s identity, with everyone willing to take on the practices and mindset of data mesh via a company-wide, data-first initiative
Dedicated the time to learning the new approach and changing their processes
Quickly moved the right people to the right places and invested quickly where there were gaps
Accepted and implemented a decentralized model
Were ready to align with domains (if they weren’t already)

Organizations that have made up their minds to go all-in achieve value with data mesh faster (and more cheaply). That’s why a full commitment to data mesh is required for efficiency and success in the shift.
Summary

A data mesh initiative could bring innovation and positive impact to your organization, but it requires dedication and commitment to implementing it properly. Change can be challenging, but with the right approach and the right process it’s possible to overcome growing pains and make a success of data mesh.

Do you want to learn more about how to bring data mesh to your organization or evaluate whether you are ready? Get in touch with us.