Brief summary
Data contracts are a bit like APIs for data — they make it possible to interface with data in a way that ensures the transfer of data from one place to another is stable and reliable. This is particularly important for building more reliable data-driven applications.
To discuss data contracts, host Lilly Ryan is joined on the Technology Podcast by Andrew Jones, the creator of the data contract concept (in 2021) and author of Driving Data Quality with Data Contracts (2023), and Thoughtworker Ryan Collingwood who is currently writing their own book on data contracts due to be published in 2025. Andrew and Ryan offer their perspectives on the topic, explaining the origins and motivation for the idea and outlining how they can be used in practice.
You can find Andrew’s book .
Episode transcript
Lilly Ryan: Hello, and welcome to the Thoughtworks Technology Podcast. My name is Lilly Ryan. I'm one of your regular hosts and I am speaking to you from Melbourne, Australia. In today's conversation, I'm joined by Ryan Collingwood and Andrew Jones. Ryan is a data strategist at Thoughtworks and is currently writing a book on data contracts that will be out in, what, 2025, Ryan?
Ryan Collingwood: Yes, Q1 2025. Fingers crossed.
Lilly: Wonderful. Andrew is the author of Driving Data Quality with Data Contracts, which was published in 2023. Andrew, would you mind introducing yourself?
Andrew Jones: Yes, sure. Hi, everyone. I'm Andrew. I'm an independent data consultant, and I help organizations build data platforms that reduce risk and drive revenue. I've been doing that for a little while now, which is why I ended up coming up with data contracts a few years ago and writing that book that you mentioned.
Lilly: That's wonderful. As you may have guessed, we are here today to talk about data contracts. This is a topic that has been kicking around for quite some time and has recently, I think, come through quite strongly in the way that we're all focusing on data in a lot of our practices. To talk about it today and to get to terms with what data contracts are, where they fit into the software delivery lifecycle and all of that stuff, Ryan and Andrew are going to be our expert guides. To kick us off, what is a data contract?
Andrew: Yes, sure. I think the easiest way to think about it, particularly if you're not familiar with data engineering and the data landscape, is as an API for data. The reason why I came up with the idea originally was that we were getting data directly from our databases, chucking it into a data warehouse, and trying to use that to drive data applications. If you know about APIs, you definitely don't go direct to the database because that database keeps changing, the schema keeps changing, the database is evolving all the time, and you don't want that to affect your data applications.
That's the situation we were in. We wanted to use our data for more important things, use it for more revenue-generating applications, use it for more AI/ML-based applications. We couldn't do that because that data was not reliable enough. I started thinking about why that was the case. Eventually, I realized that the problem was the lack of interfaces. The fact that we were building on top of the database meant that we couldn't have a reliable data stream from the upstream applications. Because I've got a software engineering background, I started thinking about why that was the case, and eventually, I came to the idea that we need some API here, we need some interface here.
That's what was missing. That's what was preventing us from building reliable applications on top of our data. Yes, I like to think of it as an API for data, and the keyword there is interface. It's an interface for data that provides abstraction, provides a place for us to ensure that the transfer of data between one place and the other is reliable, more reliable than it was before.
Lilly: How does this differ from straight-up API documentation?
Andrew: It's similar in a way. The difference really is the interface that you provide data through. With an API, generally, you're making calls back and forth, you're getting a small amount of data moving around, or you're making a call to start an action. With data, you generally need to have an entire data set, for example, if you're training a data science model. So that might be in a table in a data warehouse somewhere, rather than an HTTP API. Or it might be in a streaming application like Kafka or similar. You could use that contract there.
Generally, you move around greater volumes of data, and the interaction between the consumer and provider of that interface is a bit different compared to a standard HTTP or REST-based API.
Ryan: Coming back to this topic of interfaces, it's interesting, Andrew, you were speaking about one of the fundamental differences with data compared to, say, API specifications as we tend to think of them for web services, which is that the interface can change. You mentioned a couple of them, like Kafka or even your flavor of SQL. That's an implementation detail that is fundamental. As much as people might want to have an omni connector for every single type of database interface in the world, and there are some things that take some of that pain away, there's still going to be some difference here and there.
Coming back to the value, one of the values that I see around the data contract, again with that word interface, is that it is a pane of glass where both systems and people can look at the data and understand what the data is. When I say data, it's data plus context, so information about the data. It's having that standardized and not having to think about how we're going to describe the interface because we figured that out ahead of time, and then we apply that pattern.
As time goes by, I don't know whether it's because I'm getting lazier or getting older, but I see real value in not having to think about how I'm going to do the thing while doing the thing if I can just do the thing. [chuckles] That makes me happy. That's a great callout there around how, yes, we can think of it as an API for data, but fundamentally, the nature of the problem does mean that there are going to be a few differences here and there. Having something consistent that we can hold on to, and that also gives us guidance around the things that we should be describing as part of that information about our data, is only going to be a good thing.
Andrew: Yes, exactly that. When we talk about interfaces, we started off talking about the technical side, so it could be Kafka, it could be a database. But when you start describing the interface, you don't want to describe exactly how it's done, you want to describe what you want to achieve from that. You start describing, "I want to make data available to my consumers through, say, a streaming application." Then you also want to create this data. There are more things you need to describe with data. Maybe you want to describe how timely it is. Is it going to be hourly? Is it going to be daily? Is it going to be real-time?
You might also want to describe what kind of data is in there. Is it personal data? Is it not personal data? How can this data be used? What can it be joined with? What does it actually mean in a business context? This is the kind of journey we went on. We started by thinking that we need an interface. Once you start describing data in enough detail to create this interface, you can start putting all sorts of things in your contract to help describe the data even more.
Like I said, it's turning that data plus the metadata plus context into information that can be consumed much more quickly and much more easily by their consumers, so they can build those applications much more quickly, as well as much more reliably, and deliver greater value to the organization. I think it's particularly important as we're trying to do more with data, particularly with things like ML and AI these days, but whatever it is you're trying to do with data, being able to do it quicker and cheaper is a great thing for an organization.
Lilly: To give our listeners a sense of what you're talking about when it comes to the way that these things are defined, we have talked about the schemas that you might want. What does it actually look like in practice for someone to define? There is, for example, the Open Data Contract Standard, which is a standard, but we know that there could be other standards and organizations may want to define their own depending on their use cases. How do you actually, from a nuts and bolts perspective, put a data contract together? What does it look like?
Andrew: There is a standard, the Open Data Contract Standard, which I'm a little bit involved in as part of the steering committee there, and that's quite good because it's a general standard that contains everything you could think about putting in a data contract. I wouldn't necessarily say that people should use that just off the bat, because when you think about how to define data contracts in your organization, every organization is a bit different, but more importantly, so are the people who are going to be completing or using that data contract. The implementation of data contracts needs to work for them.
For example, where I worked previously, we defined data contracts in this modern language that many people might not have heard of called Jsonnet. It's like JSON with extensions. It's a configuration language that came out of Google. It's not because it's the best way to implement data contracts. It's because our engineering teams, who we wanted to create, manage, and own these data contracts, used Jsonnet to define their APIs and to define their infrastructure as code.
It made sense for us to define data contracts in the same way because we wanted it to be clear that data contracts were, at this level, very similar to APIs, similar to your infrastructure as code, and as important as those things. To reduce the friction for adoption, we wanted it to be as easy as possible for our software engineers to use. That's why we used Jsonnet. I wouldn't recommend anyone use Jsonnet unless they're already using it. The key point there is, when you think about implementing data contracts in your organization, think about, first of all, what you want to put in it, which might be quite similar compared to other organizations, so you can use the standard for inspiration there.
Things like ownership, things like SLOs, things like schemas. They're going to be quite standard across organizations, but when it comes to implementing it in your organization, think about how you want the data contract owners to complete and own their data contracts, and how they're going to be using them. Start from there rather than making them use, say, Jsonnet or Avro or Protobuf or whatever it is you might want to use. Think about the user first.
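
To make the kinds of fields Andrew lists (ownership, SLOs, schemas, whether data is personal) a little more concrete, here is a minimal Python sketch of a contract expressed as plain data. The class names, field names, and example values are hypothetical; they are not taken from the Open Data Contract Standard or from any implementation discussed in this episode, and a team could just as well capture the same information in YAML, Jsonnet, or a spreadsheet.

```python
from dataclasses import dataclass, field


@dataclass
class FieldSpec:
    """One column or attribute in the data set (hypothetical structure)."""
    name: str
    type: str                     # e.g. "string", "integer", "timestamp"
    description: str = ""         # what the field means in business terms
    personal_data: bool = False   # flag used for privacy tooling


@dataclass
class DataContract:
    """A contract bundling schema with ownership and service-level information."""
    name: str
    owner: str                    # team accountable for producing the data
    delivery: str                 # how consumers get it, e.g. "table" or "stream"
    freshness_slo: str            # e.g. "hourly", "daily", "real-time"
    fields: list[FieldSpec] = field(default_factory=list)

    def personal_fields(self) -> list[str]:
        """Names of fields flagged as personal data."""
        return [f.name for f in self.fields if f.personal_data]


# Example values are invented for illustration only.
orders = DataContract(
    name="orders",
    owner="checkout-team",
    delivery="table",
    freshness_slo="hourly",
    fields=[
        FieldSpec("order_id", "string", "Unique order identifier"),
        FieldSpec("customer_email", "string", "Contact email", personal_data=True),
        FieldSpec("total_cents", "integer", "Order total in cents"),
    ],
)

print(orders.personal_fields())
```

Keeping the contract as data rather than prose is what lets both people and tools read the same document.
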
Ryan: Yes, that definitely resonates with me. In my context when I first encountered data contracts, the people that had serious technical chops were definitely in the minority. I'll just say it, my first attempt at a data contract was essentially an Excel spreadsheet because that is what the people that I needed to get knowledge out of understood. That's what they were comfortable with. Yes, it's not necessarily my favorite way to structure data but it is a start.
It was good enough for me to get going, and then gradually over time, I managed to find something that worked a little bit better for me, but was still within their tolerance that they could work with that we could then collaborate on and work together. Yes, I definitely agree. Identify your audience. Perhaps there's also maybe a moment to talk about the general parties involved, the data producers, and the data consumers. You may find yourself in the situation where I was initially where I was both the consumer and effectively the producer working in a centralized data team, which is perhaps a topic we can briefly chat about in a second.
Recognize who your parties are and what their level of comfort is. I think it's great in your situation, Andrew, where you had an existing set of tooling that worked for people. They were comfortable with it. When we're changing the way people work, I do believe that people have a tolerance for the amount of stuff that you can throw at them at once before they start getting uncomfortable, and it varies by person. If reusing things that people are already comfortable with gives you a little bit more wiggle room to make people uncomfortable, then that's a smart move.
Lilly: You are talking a lot about making sure that you meet people where they're at within the organization and ensuring that whatever your data contracts are are things that people can actively use and understand and be part of creating. What kind of maturity does an organization need to have as a prerequisite for getting value out of data contracts or making them work in the first place? In one case, to me, maybe it would imply the existence of databases, but we also know that many organizations have to start with spreadsheets and really, it makes a lot of sense.
The world runs on Excel in a lot of cases and that's something that we have to work with. What kind of situation does an organization need to get benefit from a data contract and make it work for them?
Ryan: It's not one-size-fits-all, but certainly, the problem that led me to encountering data contracts was that I was at a point in an organization where, after much introspection of the data platforms and data systems which I had inherited, there was still this perception of there being a data problem. Yes, there were some things that needed some uplift, but really, what I came down to in the end was, as career-limiting as it may sound, that often data is a side effect of things happening.
If those things that are happening are not happening in a consistent manner, if there are three different ways to process a refund, that means that there could be at least three different expressions of the data associated with that event. With three ways to process a refund, you're going to have data quality challenges. We got to the stage at this organization where it's like, "All right, let's have a conversation about what our processes are. Let's standardize some things." There had been rapid growth, people had to adapt and invent as they went, and sometimes that's necessary.
Sometimes you've just got to make it up, you've got to figure it out, you've got to get things done, but then certainly, I think technologists are familiar with the idea of technical debt. At some point, you have to pay down the debt that you've accumulated. Just like technical debt, there is process debt. That's what initially led me to this thing of, all right, well, we are having this conversation about what our process is, here is an opportunity to codify all this great information and this agreement that we're getting into some sort of thing that can be reused. As a recovering business analyst, in my past lives I had exposure to things like data dictionaries and what have you.
I think also having some software development exposure, I thought, well, if we do this right, if we do this in a way that is structured and parsable, we can use this. This can be a living document. This doesn't have to be a document that's correct at the launch of this thing, and then immediately becomes out of date once we hit our first Sev 1, and then we do an in-place fix, and then the documentation and the implementation just diverge. If the documentation can be part of the integration, can be part of the solution as it evolves and as it lives, wow, that's a document that's worth something.
To come back to the question of where you should be as an organization: I think you should be at a point of having data quality challenges. And if you are considering data quality challenges, it's also worth having a conversation around, are our processes actually what we need them to be? You can tackle the data quality on its own, but I feel that without having that conversation around, is the activity that generates this data what we expect it to be, are we agreed, [laughs] you're just kicking the ball down the road. That would be the big callout there. If you are serious about this, I would ask, are you willing to have a conversation about what it is that we do that generates this data?
Are we agreed? Are we aligned on what those processes are? We've briefly touched on it, but within the parlance of data contracts, and I really love the way that you've framed this in your book, Andrew, there are data producers and data consumers. Data producers are people that are producing data. It says it there in the name, and consumers, obviously, want to consume it. The real big gotcha is that often data producers don't know or don't recognize, or if you want to be nihilistic, don't care, but I'd say that's the minority of cases.
I think it's more that they don't know, they don't recognize that they're producing data. If you're dealing with a department of your company that's in procurement and you ask them, "What's your role? What's your function?" they're not going to say it's to enter purchase orders. [laughs] They're going to say something like, "It's to source materials at a great price so that we can have a better margin." It might be something like that, but they're not going to say that it's something related to data entry. Capturing that data, things like capturing purchase orders, is vitally important for the rest of the business because it's all a series of interconnected value chains.
That information about the purchase orders makes doing things like reconciling the finances possible. If it's not done or it's poorly done, it makes this other really important task really difficult. That's why I say, where does an organization need to be? I think it needs to be at that point of a reckoning. Yes, maybe it's born out of a concern around data quality, but I think to really make a difference, there must be an openness to have a conversation about what our process is. I think underlying it, and this is where it touches on the social aspect of it, is being prepared to develop empathy across that scenario of a data producer way upstream and a data consumer way downstream.
I do believe that if we help those people who are upstream have empathy for the people who are downstream from them to go like, "This information that is perhaps a chore for you to fill in and capture and do has a real material difference to these people downstream."
Lilly: I want to come back to that issue around empathy in a minute, Ryan, because I think it's something that's really worth digging into. Andrew, from your point of view and in your experience, what has it been like working with a variety of different contexts to make data contracts work? What have you learned about it over the time where you've been maturing these ideas for yourself?
Andrew: Yes, that's a good question. Like Ryan said, many organizations struggle with data quality, so that's where they come from. The ones that are looking most at data contracts to try and help them change the organization are ones where they are now using data for something more important, even more important than it was before. It's some part of their strategic goal, whether that's AI/ML-based, or whether that's just using data to create products, drive revenue, or differentiate themselves in the market.
It's that realization that data is key to a business that gives those organizations the, I don't want to say the freedom, but the ability, the chance to look again at how they're doing data and really do a root cause analysis on why can we not achieve this now? What is the problem we're trying to solve here to achieve our goal? Then you go back to the quality issue, you go back to how you're sourcing the data, and whether there's any interface around that, and eventually, you come back to data contracts. These are organizations that have this strategic goal which requires better data, and they go from there back up the stream to why their data isn't what it needs to be.
Lilly: How do you get to that point in a conversation where you help people to the realization that it is a data quality issue? Because as far as I can see and in a lot of my experience, it has certainly been the case that data quality issues are the root of many different types of business problems, but because they are so deeply interconnected, as Ryan said, these are byproducts of other actions that create the data. Because data's so deeply interconnected with everything a business does, it can be really hard to identify that as a particular issue.
When you do, it's also difficult to get buy-in from people who are not data engineers or data scientists to participate in a conversation like this and see it as something that is worth investing their time in and worth investing their time in evolving and maintaining over a longer period. In your experience, Andrew, how do you get to a point with a business like that where you really need to come to that realization collectively in order for a data contract to work?
Andrew: It's really all about communication. The problem data teams often have is that they are put in a part of the business, maybe under IT, or even under finance, and are generally somewhat hidden in the business, and they have been almost suffering in silence with this. They are expected to get data from a variety of sources, at a variety of quality levels, and turn that into reporting. That was okay when the results weren't that important. They're kind of important, reporting is important, particularly if they started doing revenue reporting and things like that. They're important to the business, but probably not critical to the business. If it fails, the business carries on running.
Now that they're starting to do more important things, they need to, or they have the ability to, start having these conversations with the folks producing data and really highlight the issues they're having, like, "Why am I not able to turn this data around quickly? Why do their pipelines keep failing? If they keep failing, then how can we build whatever strategic goal they want to build on the back of data?" When people ask questions like that, they need to have the ability, or they need to feel confident enough, to raise issues and say, "This is the reason. It's because something is changing or the data entry is not correct," or whatever it might be.
They need to have these conversations. That's really what we spent a lot of time doing. I started off talking about data contracts in a technical sense, but really, in my experience, a lot of it is about communication. When I started doing data contracts, I had a great manager I worked with, and between us, we spent so much time talking to all parts of our business, from our software engineers right up to the CTO and the people in between. It doesn't really matter who, but always talking, always communicating the challenges we're having.
Really repeating that message in different ways, different audiences, explaining why we had to change things in one part of business to achieve things in our part of business, in the data part of business. Really, it's all about communication.
Lilly: Ryan, I've seen you in the past when we've discussed this topic say that data contracts are a document for both humans and machines. That is something that I would love to hear a bit more about from you, particularly as it relates to the way that we can evolve data contracts now that they are becoming a bit more of a mainstream topic that people really want to engage with.
Ryan: As I mentioned previously, a data contract is bundling. Yes, this is about data, but this is really information about data. It's a word that has been used so often that it feels like it's losing its meaning, but really, a lot of the value I see in the data contract is that this is a document that describes what the data is. What I mean by that is the meaning of the data. One of my bugbears around the discourse, particularly amongst data engineering folks, and God bless them, I love them, they're my tribe, is that we tend to get really fixated on schema and that's often where the conversation ends.
Anyone else outside of the data engineering world does not particularly care about schema, they don't understand schema. They'd be far more interested in the semantics. What does this data mean? How can I use it? How am I allowed to use it? What were the conditions under which this information was captured? Because if you think of, say, a table of email addresses, let's just say, and you have no other context about why this table of email addresses exists — for one thing, PII spidey senses tingling — but let's overlook that for a second.
If you just know that it is a table that has string or varchar-type data that looks like email addresses, that doesn't really tell you a whole lot, but if you then learn, "Oh, hold on, these email addresses represent people that have told us, 'Please stop emailing me,'" that's a very different conversation. Because if you didn't know that and perhaps you were misbehaving, you might take that list of email addresses and go like, "Oh, fantastic. These must belong to customers. They want to hear from us. Let me send them some good news about the fantastic things we're doing," but this data set represents people that do not want to hear from you.
This is the kind of information that is often, as I say, not bundled with the data. This is another important element of data contracts: they allow us to speak for the data in ways that data can't necessarily speak for itself. That's a really important aspect of how data contracts are both for the benefit of machines and also the benefit of humans. Where the benefit for machines comes in is, yes, we can put in these things that we know and love like schema. We can also put in things like range checks and pattern matches that we expect to hold. Things that we can test and validate and build into our pipelines.
Certainly this applies if you are transmitting data between different systems and different programming languages. This afternoon, I went off on a bit of a tangent on how, with JSON Schema, by virtue of JSON being descended from JavaScript, if you don't take the time and care to be really specific with your numbers, you can have some really less than optimal experiences because of JavaScript's, let's say, relaxed approach to data types compared to some other programming languages. Let's not even get started on equality. There's additional information that we can put into the data contract that gives us some safeguards and some security around systems and interoperability between systems.
It also, importantly, gives us safety and, I guess, a degree of comfort around interoperability for people, as in, how can I use this? Can I use this? Where can I use this? What were the conditions under which this was captured? When I think of it as being for both humans and machines, it's, again, that interface. As in, this tells you where you can access the data. This tells you why you can access the data and other really important things like that.
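
The range checks and pattern matches Ryan mentions are the part of a contract a pipeline can enforce directly. The following is a minimal Python sketch of that idea, assuming a hypothetical contract fragment held as a dictionary; the field names, rule keys, and the deliberately simple email pattern are illustrative assumptions rather than part of any published standard.

```python
import re

# Hypothetical contract fragment: each field declares a type plus optional checks.
CONTRACT = {
    "email": {"type": str, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
    "age": {"type": int, "min": 0, "max": 150},
}


def validate(record: dict, contract: dict = CONTRACT) -> list[str]:
    """Return human-readable violations; an empty list means the record passes."""
    errors = []
    for name, rules in contract.items():
        if name not in record:
            errors.append(f"{name}: missing")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject booleans explicitly.
        if isinstance(value, bool) or not isinstance(value, rules["type"]):
            errors.append(f"{name}: expected {rules['type'].__name__}")
            continue
        # Pattern match: the whole value must match the declared expression.
        if "pattern" in rules and not re.fullmatch(rules["pattern"], value):
            errors.append(f"{name}: does not match pattern")
        # Range checks: inclusive lower and upper bounds.
        if "min" in rules and value < rules["min"]:
            errors.append(f"{name}: below minimum {rules['min']}")
        if "max" in rules and value > rules["max"]:
            errors.append(f"{name}: above maximum {rules['max']}")
    return errors


print(validate({"email": "a@example.com", "age": 30}))
print(validate({"email": "not-an-email", "age": 200}))
```

Checks like these cover only the machine side. The meaning of the data, such as whether a list of email addresses represents opt-outs, still has to be written down for the people reading the contract.
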
Andrew: Yes, I agree. It should be human-readable and machine-readable, and being machine-readable allows you to do the things that you spoke about, Ryan. We can go even further than that. We realized quite early on that once you have it in a machine-readable format as well, you can do all sorts of things and really build a whole data platform around it. We can do things like change management around the schema. Is it a breaking change or is it not? Have a CI check and prevent breaking changes from making it to production in the first place. Really removing a whole class of incidents we had.
We can categorize data, whether it's personal data or not, and quite easily build tooling that anonymizes data or that handles data retention when we should no longer have that data. Really, in the last five or six years since we started using data contracts, we haven't found anything we couldn't express in a data contract that we could then implement at platform level. Even simple things like doing backups on data and how long the backup should be kept for, making sure that's done. All people have to do with a data contract is say, "I want backups for 30 days," and the tool just takes care of it. They don't need to know exactly how it's been backed up, where it's been backed up to. It just happens.
A lot of these ideas come from platform engineering, which some of our listeners may be familiar with, but really, it's very powerful. Once you have this machine readable and human readable contract, there's no limit to what you can automate through that, which really helps with data governance as well.
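
The CI check Andrew describes, which stops breaking schema changes before they reach production, can be sketched in a few lines of Python. This is an illustration only: the schema representation (a mapping from field name to type name) and the rule that removed fields and changed types are breaking while added fields are not are assumptions made for the example, not a description of the tooling his team built.

```python
def breaking_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Compare two {field_name: type_name} schemas and list changes that could break consumers."""
    problems = []
    for name, old_type in old.items():
        if name not in new:
            # Consumers reading this field would fail.
            problems.append(f"removed field: {name}")
        elif new[name] != old_type:
            # Consumers parsing the old type would fail.
            problems.append(f"type changed: {name} ({old_type} -> {new[name]})")
    # Fields that only exist in `new` are treated as non-breaking in this sketch.
    return problems


# Invented example schemas: v2 changes the type of `total` and adds `currency`.
v1 = {"order_id": "string", "total": "integer"}
v2 = {"order_id": "string", "total": "string", "currency": "string"}

for problem in breaking_changes(v1, v2):
    print(problem)
```

A CI job could run a comparison like this against the contract on the main branch and fail the build whenever the returned list is not empty.
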
Ryan: Coming back to that thing around interfaces, what you described there, to me, is being able to be declarative in your intent rather than having to be explicit. When you think of the things that have endured, let's just take our favorite, SQL, you declare your intent. You don't tell it exactly how it should go and manipulate the bits and bytes on the hard drive. You say, "Go get me this thing from this thing and do this aggregation." In my mind, those are the interfaces that endure.
If you have something you can evolve that is very much around aligning people to say, "This is my intent and these are the things that I care about," you can then abstract the implementation details, or even better yet, cater for a variety of implementation details. That's the other thing that I think is important not to overlook: from this one document, if you need to, you can cater for a variety of implementations. Perhaps you have a very federated data capability around your organization. Perhaps there's a very compelling reason for team A to implement a functionality that team B has in a very specific way. That's okay as long as they can agree on the interface.
Lilly: What is the future or the desired end state for data contracts in general now that we're seeing a lot of attention on them and more adoption? Andrew, I think you are probably the person who has put more thought into this than probably anyone else in the world. I'd like to know where your thoughts are trending when it comes to what happens next in the current environment.
Andrew: That's a good question. I think data contract evolution will continue because I think it's a very simple idea: apply interfaces to data, describe data, and then use that description to power a platform. Very simple idea, but very powerful, and I think we'll continue to do that. I like to think the standard will continue to take off and evolve as well. We have the data contract standard we spoke about earlier, because what I find quite often is that we have this data contract in one format, and we're constantly converting it to a different format to integrate with a different tool.
We convert it to protopath to configure Kafka or whatever it might be or convert it to some sort of JSON document to integrate by data catalog. What would be great if we could just convert it to the standard format, and then you get a data catalog. You get a tooling that helps you with authentication and privacy and governance and all those kind of things. You just plug it into your data platform. That'd be really cool if we can achieve that. That's what we're working towards with the data contract standard. See, I think data contracts keep growing as idea and I think it would be driven by the idea that, what Ryan was saying, a lot of companies struggle with data quality.
That has a great cost to the organization in terms of how much data engineering time is spent there and how many incidents you have. At the same time, people want to do more with their data and achieve more with their data. I think while those two things remain true, we're going to start applying more discipline to our data, and data contracts are a way of applying that discipline.
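As a rough illustration of the conversion problem Andrew describes, the following Python sketch keeps one contract description and derives several tool-specific outputs from it. The contract layout, the field names, and the helper functions are all assumptions made for this example; they do not follow the data contract standard mentioned in the conversation or any particular tool's format.

```python
import json

# Hypothetical, minimal data contract. The keys and layout are illustrative
# only and do not follow any published standard.
CONTRACT = {
    "name": "customer_order",
    "owner": "orders-team",
    "fields": [
        {"name": "order_id", "type": "string", "required": True, "pii": False},
        {"name": "customer_email", "type": "string", "required": True, "pii": True},
        {"name": "total_cents", "type": "integer", "required": False, "pii": False},
    ],
}

# Assumed mapping from the contract's type names to proto3 scalar types.
PROTO_TYPES = {"string": "string", "integer": "int64"}


def to_json_schema(contract):
    """Render the contract as a JSON Schema style dictionary."""
    return {
        "title": contract["name"],
        "type": "object",
        "properties": {f["name"]: {"type": f["type"]} for f in contract["fields"]},
        "required": [f["name"] for f in contract["fields"] if f["required"]],
    }


def to_proto(contract):
    """Render the contract as the text of a proto3 message definition."""
    message_name = "".join(part.capitalize() for part in contract["name"].split("_"))
    lines = ['syntax = "proto3";', "", f"message {message_name} {{"]
    for number, field in enumerate(contract["fields"], start=1):
        lines.append(f"  {PROTO_TYPES[field['type']]} {field['name']} = {number};")
    lines.append("}")
    return "\n".join(lines)


def pii_fields(contract):
    """List the fields a privacy or governance tool would need to know about."""
    return [f["name"] for f in contract["fields"] if f["pii"]]


if __name__ == "__main__":
    print(json.dumps(to_json_schema(CONTRACT), indent=2))
    print(to_proto(CONTRACT))
    print(pii_fields(CONTRACT))
```

Each converter here is hand-written for one target. The hope expressed above is that a shared standard format would let tools read the contract directly, so this kind of per-tool conversion code would no longer be needed.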
Lilly: Both of you have put a lot of thought into this recently, with a variety of things that you've been writing. As we mentioned at the top, Ryan, you're working on a book right now about data contracts. Do you want to talk a little bit about that, what you're hoping will come from it, and where people can find it when it's ready?
Ryan: [laughs]
Lilly: No pressure.
Ryan: No pressure. I'm writing a book. I may yet live to regret that decision. It's certainly been a growth opportunity and I will certainly be happy when it is done, but it's been a great opportunity to grow and learn. What am I trying to do with this book? What I'm looking to do is build upon the work that Andrew and others have done. I alluded to it earlier: I feel that the conversation around data contracts started within data engineering and software development circles. That's where it needed to start and I think that's a valid place for it, but what I am looking to do is broaden the conversation.
As I mentioned, I am a recovering business analyst, and when I initially approached data contracts, I saw that it was an interface. It was declarative, and it reminded me of a number of things that I'd gotten a lot of value out of in my career as a business analyst. Things like data dictionaries, things like Cucumber and behavior-driven development, Cucumber-style requirement specifications. I thought, "Yes, this is fantastic," because of all the things that we spoke about earlier and the idea that it's an interface for people and machines. It is part of a solution rather than separate from the solution.
Also, I see it as a way to bring in some disciplines that I feel have been, I don't want to say missing, but it certainly feels like we had an understanding a couple of years ago, across disciplines, of how to build a three-tier application. We knew it from a database perspective, we knew it from a software engineering perspective, we knew it from a requirements management perspective. Everyone had the benefit of seeing the pattern a good few times, so it was, "Yes, we know what we're doing," and then the world changed rapidly. We had microservices, we had this idea of big data, which has now just become, well, is it small enough to fit on a MacBook? Yes, okay, it's data.
There have still been some shifts around, again, that conversation about what the data means. What do we desire? What do we want from this data? What I'm hoping to do with this book is, yes, talk to practitioners, data engineers, and software engineers, but also reach out to people from that requirements management space, the business analysts and the data analysts, and then also to leadership, to say that, again, you may come to this seeking answers around data quality, but data is a side effect of things happening. Have we had a conversation about how those things happen in our organization? Do we agree?
Because if we can't agree on that, then we're going to struggle to agree on just about everything else that flows from this. That's really where I'm looking to engage with my book is to make the circle bigger. I think it's time. The data engineers and the software engineers, we've been talking about it for a while and I think we need to stretch out and bring some more people in.
Lilly: For something that encompasses an entire organization's processes, yes, we've got to involve the entire organization. I'm really glad to see a lot more conversation going on about how we can involve different people from different disciplines into that space. Andrew, you also recently wrote a white paper about data contracts as well. Could you talk a bit about that?
Andrew: Yes, that's right. I wrote a relatively short white paper. Again, really just to give an introduction to data contracts for people who haven't heard of them before, and to touch on the power of the idea, why it's important, and what problems it can solve in your organization. I published that recently on my website. You can get to it at dc101.io. DC for data contracts, the numbers 101, .io. Yes, if you are interested in data contracts based on what we've been talking about today, then that could be a good next step to go a little bit deeper and really understand the problems it can solve in your organization.
Lilly: Hopefully, throughout this conversation, we've piqued your interest in the entire topic. I want to thank you both so much, Ryan Collingwood and Andrew Jones for joining me here today on the ThoughtWorks Technology Podcast. We will see you next time.