Debugging Your Business with Data
How to diagnose the root cause of a 90% drop in pipeline in 5 minutes. An interview with Abhi Sivasailam and Ankur Chawla of Levers Labs
I have a treat for you this week! This is one of my longest and most in-depth pieces, but it still barely scratches the surface. So strap in, grab a cup of your favorite hot beverage, and enjoy.
When I was a data analyst, there was one task I absolutely dreaded having to do: “Hey Ergest, sales are down 5%. What’s going on?”
Every time this happened I had to dig through the sales funnel from paid ads to website conversion, to lead qualification, to sales conversion, etc. Every time there would be more questions or something I had missed.
Sure, I would save all the queries for next time, but it was a chore. It would take a couple of days, and often I didn’t have a definitive answer. If I missed something, I would look incompetent.
As regular readers of this newsletter know, I’ve become increasingly passionate about metric trees since being introduced to the concept last year. Now that I’ve learned about them, I’ve been getting more interested in data-driven root-cause analysis.
What I’m drawn to in particular is how the metric tree concept is really a way of expressing a causal model of how a business works. As my friend Cedric Chin says, “The purpose of becoming data driven is to build a causal model of the business in your head. The purpose of doing all this work is that you want to understand how your business actually works and grows, not rely on superstitious beliefs about how your business works and grows.”
So the whole purpose of mechanisms like Weekly Business Reviews (WBR) and the XmR charts used in statistical process control (SPC) is to help bootstrap human intuition into building a causal mental model of the business. That’s why, as Cedric explains, the WBR is done deliberately in the direction of causes → effects (input metrics → output metrics).
This is supposed to build in your mind a causal diagram that expresses the business model in the shape of a Directed Acyclic Graph (DAG). Yes, I know there are a lot of feedback loops in business processes, but we’re aiming for simplicity here.
While the WBR makes the causal DAG implicit, the metric tree makes it explicit and interactive. In my mind, metric trees are an excellent tool both for helping build the necessary mental model of the business and sharpening human intuition.
So over the next few editions of this newsletter, we’re going to talk about how to put causal models to use, by tackling the dreaded Root-Cause Analysis.
To kick off, I invited my friends Abhi Sivasailam and Ankur Chawla to join me for an interview this week, which we’ve transcribed below. Abhi and Ankur are respectively the CEO and Head of Professional Services for Levers Labs, where they build products and provide services to help companies define and operationalize their causal models/metric trees.
I’m extremely excited about what they’re doing at Levers (far and away the leaders in this stuff), and was especially excited when they told me about a recent project around Root Cause Analysis.
The Interview
Ergest: Hey guys, thanks for joining. Let’s get right to it. I love the story around the most recent root-cause analysis you performed for a client: analyzing a 90% drop in sales pipeline in 5 minutes. Tell us about what you walked into here.
Ankur: We’re working with a fast-growing B2B SaaS company and earlier this month, they saw this massive and sudden drop in pipeline. And the company basically reacted the way companies always react in these situations. Sales blames marketing and says “See, we told you all that Marketing leads were getting worse”.
Marketing blames lazy SDRs and entitled AEs. Amateur and professional Analysts alike descend on dozens and dozens of dashboards and reports trying to piece together conflicting narratives. Lots of finger-pointing, thrash, and confusion and days of cumulative human-hours spent getting to a resolution executives are satisfied with.
Ergest: So what did you guys do?
Ankur: So the first thing we did was try to contextualize it – how much of an anomaly is this metric? The most important question in root cause analysis is actually just: “does this even matter?” So often, numbers go up and numbers go down, but these movements are just a part of normal variation. Cedric [Chin] talks about this a lot and he champions the use of XmR charts and we think this is pretty important.
So that's where we started – by just looking at pipeline generation through the lens of SPC and what we saw was that though the drop in pipeline looked precipitous at first blush, it was actually just at the edge of normal variation. This shouldn't have been a surprise – there had been a few other weeks like this in the past few quarters.
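To make the "is this even an anomaly?" check concrete, here is a minimal sketch of how XmR natural process limits are usually computed: the mean of the series plus or minus 2.66 times the average moving range. The function names and the weekly figures are made up for illustration; this is not the Levers implementation.

```python
from statistics import mean

def xmr_limits(values):
    """Return (center, lower, upper) natural process limits for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two data points")
    # Moving range: absolute difference between consecutive observations
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    avg_moving_range = mean(moving_ranges)
    # 2.66 is the standard XmR scaling constant (3 / 1.128)
    spread = 2.66 * avg_moving_range
    return center, center - spread, center + spread

def is_exceptional(value, baseline):
    """True if `value` falls outside the limits computed from `baseline`."""
    _, lower, upper = xmr_limits(baseline)
    return value < lower or value > upper

# Hypothetical weekly pipeline generation (in $k), invented for this example
weekly_pipeline = [520, 180, 610, 450, 90, 700, 380, 560, 150, 640]
center, lower, upper = xmr_limits(weekly_pipeline)
print(f"center={center:.0f} lower={lower:.0f} upper={upper:.0f}")
```

With a series this noisy the limits come out very wide, which is the point: a week that looks catastrophic on a dashboard can still sit inside routine variation.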
From there, we just applied our standard framework for looking at root cause, which is to consider the changes in a metric as resulting from 5 possible causes:
Component Drift, which is when the algebraic components of a metric have changed. For example, we know that Pipeline = Avg Pipeline Amount * Sales Qualified Opportunities. So if Pipeline is down, Avg Pipeline Amount and/or Sales Qualified Opportunities is down. Similarly we know that Sales Qualified Opportunities is the product of two other metrics: “Meetings Held” * “SQO Rate”. And so on and so forth. By expanding out this formula, we can see how big of an impact each upstream metric has on the output.
Seasonality, which is when a metric is moving in a predictable way owing to its own cyclical nature over time. For example, Inbound Leads are always lower in December.
Segment Drift, which is when a metric is moving because of dimensional “mix shift”. For example, if Avg Pipeline Amount is down, perhaps it’s down just because we have more Mid-Market leads that have lower Pipeline potential.
Influence Drift, which is when a metric is moving because some probabilistic driver upstream has changed. Abhi’s favorite example is that speed-to-lead influences conversion rates, and changes in speed-to-lead can therefore result in changes to conversion rates.
Event Shocks, which is when some kind of event has taken place within the company or outside the company that has disrupted key processes. For example, a new product launch or pricing change has changed the level of interest the market has in our products.
When we do RCAs at Levers, we run through each of these factors. So in the case of this most recent RCA, we first looked at the Component Drift and were able to say things like: “the 20% drop in Avg Pipeline Amount contributed to 10% of the decline in Pipeline”.
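One way to arrive at a statement like that is a log-ratio attribution over the factors of a product metric. The sketch below assumes Pipeline = Avg Pipeline Amount * Meetings Held * SQO Rate, following the formulas Ankur gives, and uses invented before/after values. Log-ratio attribution is just one of several possible conventions; the interview does not say which one Levers uses.

```python
import math

def component_contributions(before, after):
    """Split the change in a product metric across its factors.

    `before` and `after` map factor name -> value, and the metric is the
    product of all factors. Because ln(after/before) of a product is the
    sum of the factors' log ratios, each factor's share is its log ratio
    divided by the total, and the shares sum to 1.
    """
    log_changes = {name: math.log(after[name] / before[name]) for name in before}
    total = sum(log_changes.values())
    if total == 0:
        raise ValueError("the metric did not change")
    return {name: change / total for name, change in log_changes.items()}

# Hypothetical values chosen so that Pipeline falls by roughly 90%
before = {"avg_pipeline_amount": 50_000, "meetings_held": 100, "sqo_rate": 0.40}
after = {"avg_pipeline_amount": 40_000, "meetings_held": 14, "sqo_rate": 0.36}

shares = component_contributions(before, after)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of the change")
```

In this made-up data the 20% drop in Avg Pipeline Amount accounts for only about a tenth of the decline, while the drop in meetings dominates.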
Using the results from our Component Drift analysis, we were able to home in on one key process step within the sales funnel that had a disproportionate impact on Pipeline: the Booked Meeting flow. Account Executives were simply having fewer meetings than we’d expect, all else being equal.
This drop couldn’t be explained by seasonality, or upstream influences, or by any exogenous “events” we could identify, but we did find one significant change in our “mix shift” analysis: recently, when SDRs booked meetings, those meetings were moving farther into the future. And it was these “bumped” meetings that were temporarily dragging down Pipeline.
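A mix shift analysis can be sketched as splitting the change in a weighted average into a "mix" effect (segment shares changed) and a "rate" effect (values within segments changed). The example below reuses the Mid-Market illustration from the Segment Drift definition with invented numbers, and it assumes the same segments appear in both periods. It is a generic decomposition, not the specific analysis run for this client.

```python
def mix_shift(before, after):
    """Decompose the change in a weighted average into (mix, rate) effects.

    `before` and `after` map segment -> (count, average value).
    mix  = sum over segments of (new share - old share) * old value
    rate = sum over segments of new share * (new value - old value)
    mix + rate equals the total change in the overall average.
    """
    def shares(data):
        total = sum(count for count, _ in data.values())
        return {segment: count / total for segment, (count, _) in data.items()}

    old_share, new_share = shares(before), shares(after)
    mix = sum((new_share[s] - old_share[s]) * before[s][1] for s in before)
    rate = sum(new_share[s] * (after[s][1] - before[s][1]) for s in before)
    return mix, rate

# Hypothetical lead counts and Avg Pipeline Amount per segment
before = {"Enterprise": (40, 80_000), "Mid-Market": (60, 30_000)}
after = {"Enterprise": (20, 80_000), "Mid-Market": (80, 30_000)}

mix, rate = mix_shift(before, after)
print(f"mix effect: {mix:,.0f}  rate effect: {rate:,.0f}")
```

Here nothing changed inside either segment, so the whole decline in the overall average shows up as a mix effect: more Mid-Market leads, fewer Enterprise ones.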
We were able to run through all this analysis and get to the core data story in minutes using the products we’ve built at Levers, but anyone can roll-their-own tools using the same kind of structured approach.
Ergest: Ok great, something’s wrong with the booked meeting flow – but…what was it?
Ankur: *laughs* Spring break! That was Easter Monday and spring break week for many American families. SDRs, AEs, and customers being on vacation resulted in lots of would-be meetings being pushed out by a week or two. It was really that simple.
Abhi: It's worth noting that on some level, the actual answer here really is so trivial it’s silly. Like, no one checked the calendar? But this is often how it goes. Taking a step back, the real answer to a root cause analysis isn't a number – it isn't “data”.
The real answer is qualitative – it's some change in how real people performed or participated in some process out there in the world – not in a spreadsheet or a data warehouse table. But the problem is that there are lots of processes that could be implicated and lots of ways those processes could have changed. It's a large search space.
The point of RCA is to bound the search space – it's to use quantitative methods to “shine the flashlight” on where to deploy the last-mile qualitative investigation. In this case, maybe the qualitative investigation was simple – as simple as “talk to some SDRs” or even “look at the calendar”.
But folks are so awash in their biases and presuppositions – and even in data and dashboards – that they can sometimes miss the forest for the trees. Having a robust framework for RCA helps to insure against that happening.
Ankur: Yeah, I want to underscore the “robust framework” piece here. When I’ve done RCA in the past, it has always been hypothesis-driven – start with a bunch of hypotheses based on things we’ve seen before – seasonality, cohort “bake time”, etc., and see whether those seem to align with the reality we’re looking at.
Approaching RCA that way is a process-of-elimination game: it can take a long time and is often inconclusive. Having a framework like the one above, where we turn RCA into a systematic science, has been really revelatory for me.
Ergest: What needs to be in place for this “robust framework” to be possible?
Abhi: Basically, an expressive growth model or causal model that represents the mechanics of how the business and its key processes work. Each of those sources of variation that Ankur just talked about is a property of a well-specified growth model. Models define components and influences; those components and influences have dimensions/slices; etc.
Note that the building blocks of these models are just metrics; the most important requisite here is having enough of the right metrics. At this client, easy RCA was only possible because we had first taken the time to build the right metrics foundation.
Ergest: Makes sense. How do you turn these metrics into a “model”?
Abhi: I think you've talked before in your newsletter about how to think about these models and their outputs and types of inputs. That post of yours was specifically about expressing these models as trees, but the model can be a spreadsheet, it could be a dashboard, it could be on a post-it – whatever. Personally, we prefer to express that model as a metric “tree” (really they're metric “graphs”) because we think they’re a really useful abstraction to help folks reason about, communicate, understand, and collaborate on these growth models.
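As a rough illustration of what expressing a model as a metric tree can look like in code, here is a minimal sketch using the Pipeline formulas mentioned earlier in the interview. The `Metric` class is hypothetical, and it assumes every parent is simply the product of its children; real metric graphs also include sums, ratios, and probabilistic influences, which this sketch ignores.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Optional

@dataclass
class Metric:
    """A node in a (purely multiplicative) metric tree. Hypothetical helper."""
    name: str
    value: Optional[float] = None  # set only on leaf (input) metrics
    components: List["Metric"] = field(default_factory=list)

    def evaluate(self) -> float:
        """Leaf: return its value. Parent: multiply its components."""
        if not self.components:
            if self.value is None:
                raise ValueError(f"leaf metric {self.name!r} has no value")
            return self.value
        return prod(child.evaluate() for child in self.components)

    def inputs(self) -> List[str]:
        """Names of all leaf metrics upstream of this one."""
        if not self.components:
            return [self.name]
        return [name for child in self.components for name in child.inputs()]

# Pipeline = Avg Pipeline Amount * SQOs, and SQOs = Meetings Held * SQO Rate
# (values are invented for illustration)
pipeline = Metric("Pipeline", components=[
    Metric("Avg Pipeline Amount", 50_000),
    Metric("Sales Qualified Opportunities", components=[
        Metric("Meetings Held", 100),
        Metric("SQO Rate", 0.40),
    ]),
])

print(pipeline.evaluate())
print(pipeline.inputs())
```

Once the structure is explicit like this, walking the tree gives the list of upstream inputs to check whenever the output metric moves.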
Ergest: Fascinating stuff. Where can people learn more about these concepts?
Abhi: You know we've really been on the lookout for great content that we can point people to for Root Cause Analysis and we actually haven't found much that we think really does justice to such an important topic. So, we're working on some content right now of our own. It's such a big topic, that we're going to have to break it up into a few pieces, but we should be launching the first piece later this week. We're excited for you to read it and to extend it with your own thoughts soon!
Ergest: Looking forward to it. Any closing thoughts?
Abhi: Only that I'm glad you're spending time on RCA. I'm convinced it's actually the most important analysis that data people support. I talk a lot about the four fundamental questions: What happened, why did it happen, what's going to happen next, and what should we do next? I used to think that the most valuable question for data people to enable was the one closest to action: what should we do next? I now think that's wrong.
Actions emerge from decisions and decisions emerge from…mental models. Telling someone what to do next doesn't change their mental model. Telling them why something happened that they can't explain does. That's the real power of RCA. And why it matters to get it right, and to build the foundation to be able to get it right.
Fin.
Hope you enjoyed this issue. If you did, please like the post and send it to your friends and colleagues.
See you in the next one.
Love this!
Especially tying RCA and metric trees to developing the organization's mental model. It's accepting that data isn't as effective at directly influencing decisions and actions as it is at refining this model. That's the real "root cause" of better decisions ;)
Very much looking forward to more of this type of content from both Abhi and you!