Hello and welcome to the latest issue of Data Patterns. If this is your first one, you’ll find all previous issues in the archive.
In the last few issues we’ve started talking about making sense of the data landscape using Wardley Maps with a quick introduction to the tool. My plan in introducing this tool has been twofold from the very beginning.
First, to help you understand what the data landscape looks like so you get a sense for it. Second, and most importantly, to introduce you to a few strategy frameworks that I find to be the most useful out there.
The most important criterion for me is that the framework was developed and used by a practitioner. Wardley Maps fits perfectly here, as Simon Wardley developed it for himself and used it successfully multiple times.
I tested the idea of analysts learning strategy frameworks and it was surprisingly popular. Both Twitter and LinkedIn posts performed really well which makes me think you will enjoy this type of content.
So in this issue I want to introduce you to another framework (more of a set of tools) for analyzing problems to find the root cause. These tools come from the work of Dr Eliyahu Goldratt on the Theory of Constraints (TOC), which has been used successfully since the 1980s. You may have heard about him from the popular business novel “The Goal.”
Since you’re interested in data and analytics, I’ll explain them through examples from analytics. TOC calls it the “Thinking Process” but I found the book Logical Thinking Processes by H. William Dettmer to be an easier introduction.
This is going to be a hefty one so stick around because I promise it will be worth it in the end.
Before we dive into the details it’s important to understand some of the key principles of TOC:
If a system functions as a chain of interdependent components, you can find and strengthen the weakest link (aka the constraint).
You cannot improve such a system by improving all the components individually (aka local optima) because you’re still limited by the weakest link. If a chain has a weakest link, it doesn’t matter how much you strengthen the other links.
All systems operate in an environment of cause and effect. In complex systems this is nearly impossible to discern due to feedback loops and the actions of independent agents, but there are plenty of systems where you can.
Almost all the problems you encounter in daily interactions with most systems are not actually problems but rather symptoms of very few root causes (often just a single one). In TOC nomenclature these symptoms are called undesirable effects (UDEs). Because so few root causes sit behind them, most systems are inherently simple.
Both “cause and effect” thinking and the “inherent simplicity” assumption are fundamental beliefs of Dr Goldratt’s philosophy given his background in physics.
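To make the weakest-link principle concrete, here is a minimal Python sketch. The step names and capacities are made up for illustration and do not come from Goldratt or Dettmer. It models a chain whose throughput equals the capacity of its slowest step, then compares improving everything except the constraint with improving only the constraint.

```python
# Hypothetical capacities (units of work per day) for a chain of dependent steps.
capacities = {"collect": 120, "clean": 40, "model": 90, "report": 150}


def throughput(chain):
    """A chain can only move as much work as its slowest step allows."""
    return min(chain.values())


def constraint(chain):
    """Name of the weakest link (the step with the lowest capacity)."""
    return min(chain, key=chain.get)


weakest = constraint(capacities)

# Local optima: double every step except the constraint.
local = {k: (v if k == weakest else v * 2) for k, v in capacities.items()}

# Targeted fix: double only the constraint.
targeted = {k: (v * 2 if k == weakest else v) for k, v in capacities.items()}

print(throughput(capacities), throughput(local), throughput(targeted))
```

Under these assumed numbers, doubling three of the four steps leaves throughput unchanged, while doubling only the constraint doubles it. In a real system the constraint can also move to another step once the first one is strengthened, which this sketch does not model.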
Now let’s take a look at the first thinking tool called the Goal Tree.
The Goal Tree
In his books and lectures Goldratt often begins an analysis of a situation by creating logical trees. The idea is to link together all the UDEs and find a single root cause. This is called a Current Reality Tree or CRT.
However, in his book Dettmer suggests it’s a lot easier to start with what he calls a Goal Tree (IO Map in the book), which maps out the system’s goal, the critical success factors and the necessary conditions. That makes it much easier to build the CRT and find the root causes.
I decided to learn both of these tools by trying to figure out the critical root causes of all the problems I’ve encountered in the data field, so I asked my audience to give me some examples. You can find that list on Twitter and LinkedIn.
I then used this list to come up with the Goal Tree which you can see below.
The tree is made up of 3 tiers:
The overall System Goal is at the top
The Critical Success Factors (CSFs) that support the goal sit right underneath it
The Necessary Conditions (NCs) that support the CSFs form a hierarchy of dependencies underneath them
Let’s go over the specifics of this tree.
Goal: Analytics is used effectively by everyone in the organization.
The Goal is stated as “Analytics is used effectively by everyone in the organization.” Obviously this is what any data driven business wants. They want to use analytics effectively to make decisions on how to improve the operations of the business.
I specifically chose this as a goal vs say “Organization is data driven” because it makes the CSFs a lot easier to see. If you want to go somewhere specific, it makes sense to start with the goal in mind and Dettmer suggests this diagram is key to doing that.
The Goal Tree is fractal. It can apply at the highest level (org wide for example) or at the individual department level. Dettmer is quite adamant about choosing a level which you have influence over so that you can actually make changes.
The CSFs support the goal in a very specific way. First, you’re not supposed to have more than a handful of CSFs (Dettmer suggests 5 max). Second, each CSF must be necessary in order to achieve the goal. If any of them isn’t present, the goal should not be achievable. This is very important to narrow down your focus.
I’ve listed 5 CSFs on my tree:
The value of analytics is clear and understood by everyone
Everyone knows how to use analytics to improve the organization
The key metrics are consistent everywhere and trusted
Everyone knows the key metrics and can access them easily
The organization is strongly motivated to become data driven
Let’s go through these one by one to see why they are indeed critical to success:
CSF 1: The value of analytics is clear and understood by everyone
The first one is about the value of analytics or the “why.” When you understand the value analytics brings to the organization very clearly, you’re well on your way to using it effectively.
If you don’t, any efforts to improve data quality, build reports and dashboards, design and deploy ML models go to waste and your goal of using analytics effectively falls short. That makes it a critical success factor.
Even in today’s world where every company says they’re data driven, if you ask any stakeholder why they think analytics is important, you get vague answers like “it provides actionable insights.” This indicates the value of analytics is still not as clear as say the value of marketing or sales.
The necessary condition to support this factor is that analytics is considered important by everyone in the organization. Obviously it’s necessary for everyone to consider analytics important in order for it to be valued across the organization. If it’s not, then its value remains opaque and the organization falls short of the goal.
CSF 2: Everyone knows how to use analytics to improve the organization
The second one is about the mechanism of analytics or the “how.” When you understand the value of analytics and how it is actually used to improve the performance of the organization, you’re much more likely to be effective in its use.
If you understand the value of analytics very clearly but you don’t know how to use it to improve the organization, you’ll fall short of the goal. That makes it a critical success factor.
There are two necessary conditions that support this success factor:
Everyone understands the definitions of input and output metrics
Everyone understands how to use input and output metrics
The concepts of input and output metrics come from Amazon as described in the book Working Backwards. Output Metrics track the performance of key business processes while Input Metrics are the knobs and levers that allow you to manipulate the Output Metrics.
CSF 3: The key metrics are consistent everywhere and trusted
When you have a set of metrics that are consistent and trusted by everyone, and you understand the importance of analytics and you know how to use analytics, you’re much more likely to use analytics effectively. If your numbers are inconsistent, trust evaporates and all efforts to become data driven fall short. That makes it a critical success factor.
This CSF has the most necessary conditions:
Metric definitions are agreed upon by everyone in the org.
Enough key metrics are defined and understood
Both input and output metrics are defined
Data quality remains consistent and high
Data quality is closely monitored
Data quality issues are communicated and resolved quickly
There are policies and processes in place to ensure data quality
Core business processes are well instrumented
Since metrics are at the heart of how analytics creates value, it’s important that their definitions are agreed upon by everyone and consistent across the organization. Each organization needs enough key metrics to be well defined and understood. To ensure consistency, the underlying data needs to be of very high quality.
To ensure data quality, you need to have policies in place to monitor it and processes that deal with communication of issues when they arise. Finally, key business processes need to be well instrumented so that data can be trusted.
CSF 4: Everyone knows the key metrics and can access them easily
The fourth one is also about metrics. If you understand the value of analytics, know how analytics works, have consistent metrics and these metrics are widely known, documented and used by everyone, you’re well on your way to effective use of analytics. If people don’t know what they are, you’re going to fall short of your goal. That makes it a critical success factor.
There are two necessary conditions that support this success factor:
The most important dashboards are easily accessible
Key metrics are well documented and easy to find
To ensure that everyone knows what the key metrics are and can access them, these metrics need to be well documented and easy to find, and the key dashboards and reports that show them need to be easily accessible.
CSF 5: The organization is strongly motivated to become data driven
The fifth one is about the people and the organization. When the organization is strongly motivated to become data driven and all the above are true, you will very likely achieve your goal.
However if all four of the above are true but the organization isn’t motivated to become data driven, you will most likely fall short of the goal. Decisions will be made by gut feel and later supported with data. That makes it a critical success factor.
The necessary condition to support this factor is that the organization conducts regular business reviews. This is another idea that’s discussed quite heavily in Working Backwards. In fact, the weekly business review (WBR) is how Amazon actually puts the metrics to use to improve operations.
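The necessity logic of the Goal Tree can also be sketched in code. This is a rough illustration rather than Dettmer’s notation. The Node class, the csf helper and the met flag are names I made up. For brevity it lists only three of the necessary conditions under CSF 3, and it flattens the hierarchy to one level of NCs per CSF. The point is that a parent is satisfied only when every child is, so one unmet necessary condition is enough to break the goal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One box in the tree: the goal, a CSF, or an NC."""
    name: str
    children: List["Node"] = field(default_factory=list)
    met: bool = True  # assumed status, only consulted for leaf nodes

    def satisfied(self) -> bool:
        # A leaf reports its own status; a parent needs every child satisfied.
        if not self.children:
            return self.met
        return all(child.satisfied() for child in self.children)


def csf(name: str, *conditions: str) -> Node:
    """Build a CSF whose necessary conditions are leaf nodes."""
    return Node(name, [Node(c) for c in conditions])


goal = Node(
    "Analytics is used effectively by everyone in the organization",
    [
        csf("The value of analytics is clear and understood by everyone",
            "Analytics is considered important by everyone"),
        csf("Everyone knows how to use analytics to improve the organization",
            "Everyone understands the definitions of input and output metrics",
            "Everyone understands how to use input and output metrics"),
        csf("The key metrics are consistent everywhere and trusted",
            "Metric definitions are agreed upon by everyone",
            "Data quality remains consistent and high",
            "Core business processes are well instrumented"),
        csf("Everyone knows the key metrics and can access them easily",
            "The most important dashboards are easily accessible",
            "Key metrics are well documented and easy to find"),
        csf("The organization is strongly motivated to become data driven",
            "The organization conducts regular business reviews"),
    ],
)

all_met = goal.satisfied()

# Hypothetical scenario: data quality slips, so one NC under CSF 3 is unmet.
goal.children[2].children[1].met = False
broken_csfs = [c.name for c in goal.children if not c.satisfied()]
goal_after = goal.satisfied()

print(all_met, goal_after, broken_csfs)
```

Walking the tree this way is a quick check on whether each CSF really is necessary: flip any single leaf to unmet and the goal should fail.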
The Current Reality Tree (CRT)
Dettmer suggests that building the Goal Tree first is necessary to make building the CRT easier and faster. The CSFs and NCs become UDEs which are then linked with each other through cause and effect arrows to help you find the root cause of the problem. The CRT is read from the bottom using “if x then y” statements.
Here’s how to read the tree.
IF the value of analytics is unclear THEN analytics is not considered as important.
IF analytics is not considered as important THEN data quality isn’t prioritized.
IF analytics is not considered as important AND not everyone knows how to use analytics to improve the org THEN metrics definitions are vague and inconsistent.
IF not everyone knows how to use analytics to improve the org THEN key metrics are unknown and undefined.
You can continue like this, following the chains of cause and effect to eventually reach the top where the organization loses interest in becoming data driven.
The root causes are at the bottom:
The value of analytics is opaque/unclear
Not everyone knows how to use analytics to improve the org
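The four IF/THEN statements above can be encoded as a small rule set to show how the root causes fall out mechanically. This is only a sketch. The rules list and the derive function are my own illustrative names, and the full tree has more links than the four quoted here. A root cause is any statement that appears as a cause but never as an effect.

```python
# Each entry reads: IF all of the causes hold THEN the effect holds.
rules = [
    (["The value of analytics is unclear"],
     "Analytics is not considered as important"),
    (["Analytics is not considered as important"],
     "Data quality isn't prioritized"),
    (["Analytics is not considered as important",
      "Not everyone knows how to use analytics to improve the org"],
     "Metric definitions are vague and inconsistent"),
    (["Not everyone knows how to use analytics to improve the org"],
     "Key metrics are unknown and undefined"),
]

# Root causes: statements that cause something but are caused by nothing.
causes = {c for cs, _ in rules for c in cs}
effects = {e for _, e in rules}
root_causes = sorted(causes - effects)


def derive(facts, rules):
    """Forward-chain: keep firing rules until no new effect appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for cs, effect in rules:
            if effect not in known and all(c in known for c in cs):
                known.add(effect)
                changed = True
    return known


for cause in root_causes:
    print(cause)
```

Calling derive(root_causes, rules) reproduces every statement in the four rules, while starting from only one root cause leaves the AND-linked effect (vague metric definitions) underived. That is the sense in which both root causes matter.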
That’s why I believe that the fundamental problem in the data field today is market education. Many data vendors take this for granted and try to solve problems upstream which makes their solutions ineffective.
The important thing to keep in mind is that if you try and fix the symptoms (which appear to be problems) you will most likely not succeed with the overall mission of becoming data driven.
I know this is a lot to digest so please feel free to reply or comment with feedback and questions on these trees.
That’s it for now, I’ll write more about this topic because it’s of very high interest to me and I hope to you as well.
Until next time.
Great article. Maybe one of the problems is that people think they should become data driven.
They are, but they are not aware how much they are driven by data in all parts of their organisation (that is, unless they work alone, but then they are not an organisation).