Reducing the Data Question Deluge
How well-designed self-service analytics can reduce the request load on data teams
Before I get into this week’s essay, a quick update on the book update project.
Thank you to those who helped me set up a working DuckDB and dbt repo. I’ve finished updating all the chapters and writing new ones, so all that’s left now is a few rounds of editing and it should be ready.
Now back to this week’s post.
Data teams are overburdened with questions and requests. Questions about business performance, metrics performance, root causes of issues, requests for data to validate some hypothesis, requests for more dashboards, etc.
Self-service analytics has been touted as the best way to solve this problem: you build structured tables, then give data users a visualization tool so they can build their own reports and analyses. More recently, text-to-SQL tools have been all the rage here.
This is unfortunately misguided.
You simply cannot expect busy executives to fiddle with BI tools or perform their own analysis. First, because proper analysis is very time-consuming; more importantly, improper analysis answers the wrong questions and leads to the wrong decisions.
In this post we’ll discuss a framework for reducing the question deluge and show a self-service analytics tool that actually works.
Reducing the question deluge
The key to reducing the question deluge is to realize that most questions can be reduced to the following “meta questions,” which cover 80% or more of what the business asks:
What’s happening with X? Are things working well?
If yes, how can I improve it? What are the drivers?
If no, what happened? Why did it happen? How can we fix it?
The beauty of these questions is their fractal nature. They apply the same way on multiple levels of abstraction in the business. Whether you’re talking about a single metric, a business process, a business function (e.g. sales) or the entire business, you can keep asking the same exact questions.
But these questions by themselves don’t reduce the deluge.
You need a standard way of measuring performance across the business. There are many different ways to assess whether a metric is doing well, and many different metrics that measure the same thing.
Stop me if you’ve heard this before: “Sales are up week over week, flat month over month but we’re down year over year.” How is anyone supposed to make sense of this statement? Each of these measurements indicates a different action.
So what do you do?
First you need to standardize metrics across the business. This way, when someone asks “What’s happening with sales?” the answer is clear, or can easily be clarified by asking which metric they’re referring to.
Next you need to standardize ways of assessing metric performance. Are you comparing against a target? Are you comparing to a previous time period? Are you looking at trends over time? Are you looking at changes in variability?
Amazon, for example, uses trailing 6-week and trailing 12-month trends for each of their output metrics.
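To make this concrete, here is a minimal sketch of one standardized assessment: compare the latest weekly value against its trailing 6-week average. The function name, the three-color scale, and the ±5% tolerance are illustrative assumptions, not anything prescribed by the post.

```python
# Flag a weekly metric by comparing its latest value to the trailing
# 6-week average. Thresholds here are illustrative assumptions.

def assess_metric(weekly_values, window=6, tolerance=0.05):
    """Return 'green', 'yellow', or 'red' for the latest weekly value."""
    if len(weekly_values) < window + 1:
        raise ValueError("need at least window + 1 data points")
    latest = weekly_values[-1]
    # Trailing average over the `window` weeks before the latest one.
    baseline = sum(weekly_values[-window - 1:-1]) / window
    change = (latest - baseline) / baseline
    if change >= -tolerance:
        return "green"        # at or above trend
    if change >= -2 * tolerance:
        return "yellow"       # mildly below trend
    return "red"              # clearly below trend

new_mrr = [100, 102, 101, 105, 103, 104, 88]  # last week drops sharply
print(assess_metric(new_mrr))  # -> red
```

The point is not the specific thresholds but that the rule is fixed and shared: everyone assessing “is New MRR doing well?” gets the same answer.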
Self-serve analytics that works
With that out of the way let’s talk about self-service analytics that works. The ideal tool accomplishes very specific goals:
It shows which key metrics are doing well vs which are not
It shows the key drivers of each metric, helping you debug the business
It focuses your attention on what matters instead of scattering it
Imagine the following scenario:
New MRR is down for the third week in a row. Marketing, Finance and Sales all scramble to find answers. Everyone is looking for the right dashboard and requesting new ones. Because answers are needed yesterday, someone will eventually take the blame. Decisions will be made and actions will be taken that simply have no impact.
Meanwhile, data analysts are working overtime to uncover the root cause. Three weeks later they come up with the answer: new leads from the trade show were down 25% from what was expected. By then it’s too late. Everyone has moved on.
What would happen if you had the proper self service analytics tool?
Take a look at this chart. It’s tracking New MRR, which is trending down (red line). Below it are the drivers: Closed Won % and Average Sale Price (ASP). We see that Closed Won % is red, so that’s where we need to focus to diagnose the problem.
Now we “drill into the reds” and uncover that trade show leads were down. The CFO/CRO fires off an email to the marketing team to understand why. It turns out inclement weather meant lots of no-shows.
Problem solved and nobody had to lose any sleep.
Of course, this means you must already have a metric tree set up, with the drivers of each key metric identified. This kind of work can be done upfront. All you need afterwards is to maintain accuracy and find new metrics and drivers, which is essentially research work.
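The scenario above can be sketched as a toy metric tree: each node carries a status, its children are its drivers, and “drill into the reds” is just a walk down the red branches until you hit a leaf. The class, the traversal, and the statuses are illustrative assumptions built from the New MRR example, not a real tool.

```python
# Toy metric tree for the New MRR scenario: each metric has a status
# ("green"/"red") and a list of driver metrics. Drilling into the reds
# is a depth-first walk down red branches to the red leaves.

from dataclasses import dataclass, field

@dataclass
class Metric:
    name: str
    status: str                          # "green" or "red"
    drivers: list = field(default_factory=list)

def drill_into_reds(metric, path=()):
    """Yield paths from a red metric down to its red leaf drivers."""
    if metric.status != "red":
        return
    path = path + (metric.name,)
    red_children = [d for d in metric.drivers if d.status == "red"]
    if not red_children:
        yield path                       # red leaf: a likely root cause
    for child in red_children:
        yield from drill_into_reds(child, path)

tree = Metric("New MRR", "red", [
    Metric("Closed Won %", "red", [
        Metric("Trade show leads", "red"),
        Metric("Inbound leads", "green"),
    ]),
    Metric("Average Sale Price", "green"),
])

for path in drill_into_reds(tree):
    print(" -> ".join(path))
# New MRR -> Closed Won % -> Trade show leads
```

The traversal prunes every green branch immediately, which is exactly how the tool focuses attention: nobody wastes time on ASP because it isn’t red.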
Where to begin
If you want a quick heuristic about where to begin, start with sales. Take the company’s P&L statement and turn it into a metric tree. Enrich it with any other metrics you have available and start looking for causal relationships and input metrics.
Here’s a sample tree to get you started.
That’s it for this week. Until next time.