How to Tell When You Have Enough Data to Act


We rely on statistics to help us make many decisions.

However, the timing of our decisions can matter almost as much as which decision we make. This article goes over a few concepts to help us figure out when we have enough information to take a chance.

In this article, I’ll focus on a conceptual understanding of how statistics reveals the world and what that means for the timing of our actions. For the moment, let’s hold off on the mathematical details.

Bit by Bit

Before we can work out when we have enough data to act, it is helpful to understand how statistics tends to reveal information.

Statistics can gradually reveal the world around us, much like a picture coming into focus. At first, we know very little. Maybe we have some information about broad areas of light and dark. As we receive more information, the picture starts to clear up. Perhaps we can now see that we are looking at a forest and that something may be looking back at us. With still more information, we can see that it is an animal with large antlers.

With more data, the picture becomes more precise. However, even with very little data we have some information about what is in front of us.

Based on a Creative Commons image from a Flickr user.

As we incorporate more and more data, we are able to get a more and more precise picture. With statistics, this could be a literal picture (perhaps with astronomical data) or a forecast of the impact of certain decisions.

Interestingly, the first few bits of information actually tell us the most about the situation. Those initial data points dramatically narrow the range of possibilities. With our picture analogy, we are able to go from “it could be anything, including a city, a forest, or an underwater scene” to “it is probably a forest or the edge of a forest.” Later information can still be important, but it doesn’t narrow the possibilities nearly as much. Perhaps our estimates go from:

  • It is an elk with antlers extending four to nine feet into the air at the edge of the forest.

to:

  • It is an elk with antlers extending six to seven feet into the air at the edge of the forest.
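One way to see this diminishing return is a small sketch. It assumes something the article does not specify: that we are estimating an average from independent observations with a known spread (`sigma`), so the half-width of an approximate 95% interval shrinks with the square root of the sample size. The function name and the sample sizes are illustrative only.

```python
import math

def interval_half_width(n, sigma=1.0, z=1.96):
    """Half-width of an approximate 95% interval for a mean of n observations.

    Assumes independent observations with known standard deviation sigma
    (an illustrative simplification, not something the article specifies).
    """
    return z * sigma / math.sqrt(n)

sample_sizes = [1, 4, 16, 64, 256]
widths = [interval_half_width(n) for n in sample_sizes]

for n, width in zip(sample_sizes, widths):
    print(f"n={n:>3}  half-width={width:.3f}")

# How much the interval narrowed for the first and the last batch of new data.
early_gain = widths[0] - widths[1]   # going from 1 to 4 observations
late_gain = widths[3] - widths[4]    # going from 64 to 256 observations
print(f"early gain={early_gain:.3f}, late gain={late_gain:.3f}")
```

Under these assumptions, the first three extra observations cut the half-width from 1.96 to 0.98, while the last 192 only trim it from about 0.25 to about 0.12.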

What About Statistical Significance?

This “gradual focusing” analogy may at first seem a bit at odds with the concept of statistical significance. But “significance” is really just a tool for a different situation.

Let’s say we are testing a new variety of wheat. If someone tells us that this strain of wheat is “significantly better” (statistically), that is a shorthand way of saying:

If we assume this strain is the same as our baseline, it is very difficult to explain its higher performance with random variation alone.

Importantly, this doesn’t mean the difference matters. If we have enough data, a difference of 0.0001% can be significant, but that doesn’t mean it is worth switching to this new strain of wheat.
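To make this concrete, here is a minimal sketch using a two-sample z-test. The numbers are hypothetical: a true yield difference of 0.01 bushels per acre and a plot-to-plot standard deviation of 1.0, both assumed to be known. The helper `two_sided_p_value` is my own illustration, not a library function.

```python
import math

def two_sided_p_value(diff, sigma, n_per_group):
    """Two-sided p-value for a two-sample z-test with known, equal sigma."""
    standard_error = sigma * math.sqrt(2.0 / n_per_group)
    z = diff / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

tiny_difference = 0.01   # hypothetical yield gain, bushels per acre
sigma = 1.0              # hypothetical plot-to-plot standard deviation

p_small_sample = two_sided_p_value(tiny_difference, sigma, n_per_group=100)
p_large_sample = two_sided_p_value(tiny_difference, sigma, n_per_group=10_000_000)

print(f"100 plots per strain:        p = {p_small_sample:.3f}")
print(f"10,000,000 plots per strain: p = {p_large_sample:.3g}")
```

The difference is the same tiny amount in both cases. Only the amount of data changed, and with it whether the result counts as significant.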

In situations where the impact is small, significance just tells us how much it doesn’t matter. In many situations, the probability that the change is large enough to be relevant to the decision is much more important than whether it is significant. Happily, we can often work out that probability with much less data.
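As a rough sketch of that idea, suppose our uncertainty about the true improvement is approximately normal around the current estimate (an assumption of mine, roughly what a flat prior with a normal likelihood would give). The estimate, standard error, and threshold below are hypothetical, and `prob_exceeds` is an illustrative helper.

```python
import math

def prob_exceeds(estimate, standard_error, threshold):
    """P(true effect > threshold), treating the true effect as normally
    distributed around the estimate (an illustrative assumption)."""
    z = (threshold - estimate) / standard_error
    # For a standard normal Z, P(Z > z) equals 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2.0))

estimate = 5.0         # hypothetical estimated improvement, in percent
standard_error = 4.0   # hypothetical uncertainty of that estimate
threshold = 1.0        # smallest improvement that would justify switching

p_relevant = prob_exceeds(estimate, standard_error, threshold)
is_significant = abs(estimate) / standard_error > 1.96

print(f"Probability the improvement exceeds {threshold}%: {p_relevant:.2f}")
print(f"Significant at the 5% level: {is_significant}")
```

With these made-up numbers, the estimate is not statistically significant, yet there is roughly an 84% chance that the improvement is large enough to matter for the decision.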

When to Act

With knowledge that gets progressively more precise, deciding when to act is about balancing the value of improved knowledge against the value of a quicker decision. Another way to think of this trade-off is as a balance between two costs:

  1. Inaccuracy: the cost of making the wrong decision
  2. Inaction: the opportunity cost of not making a decision yet

When it is appropriate to act depends a lot on the specific decision. For some decisions, inaction will be more costly than inaccuracy; for others, the reverse is true.

Expectation

Let’s work through an example together.

Imagine you have a retirement investment that is reliably earning $2.50 a month. You are considering switching to another investment, but you don’t yet know how much it earns. With the data they presently have, your advisors say there is a 60% chance it will earn $10 a month and a 40% chance it will lose $10 a month.

With this information, you work out the expected monthly earnings for this new investment. You do this by multiplying $10 by the probability it will happen (60%) and -$10 by the probability it will happen (40%). Adding these together, you get:

$$\begin{align*} \text{Expected Monthly Earnings} &= 10.00 \times 0.6 - 10.00 \times 0.4 \\
&= 2.00
\end{align*}$$

So, each month this new investment is expected to earn $2.00 and your present investment is earning $2.50. In this case, inaccuracy is more costly than inaction. This suggests that you would be better off sticking with your present investment until there is stronger evidence that this new investment will make $10 a month rather than lose $10.

But what if your situation were slightly different? What if inaction were a little more costly? Let’s say your knowledge of the new investment is the same, but your present investment is only earning $1 a month. In this case, the $2.00 expected monthly earnings from the new investment would already be more than you are making. Now inaction is more costly than inaccuracy. Assuming the additional risk of the new investment isn’t a concern, these same probabilities suggest it is wise to switch investments now rather than wait for stronger evidence.
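The two scenarios above can be sketched in a few lines of Python. The function names are my own, and the comparison deliberately ignores risk, just as the example does.

```python
def expected_value(outcomes):
    """Sum of value * probability over (value, probability) pairs."""
    return sum(value * probability for value, probability in outcomes)

def better_choice(current_monthly, expected_new_monthly):
    """Pick the option with the higher expected monthly earnings.

    Ignores risk, as the worked example does.
    """
    return "switch" if expected_new_monthly > current_monthly else "stay"

# The advisors' estimate: 60% chance of +$10, 40% chance of -$10 per month.
new_investment = [(10.00, 0.6), (-10.00, 0.4)]
expected_new = expected_value(new_investment)

print(f"Expected monthly earnings of the new investment: ${expected_new:.2f}")
print("Present investment earns $2.50:", better_choice(2.50, expected_new))
print("Present investment earns $1.00:", better_choice(1.00, expected_new))
```

The probabilities are identical in both cases. What flips the decision is the cost of inaction, that is, how much the present investment earns while you wait.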

In reality, the math is usually more complicated, but the same principles apply.[1] We want to choose the path with the best expected outcome. In order to identify this option, it is helpful to consider both the value and likelihood of possible outcomes.

Summary

Incorporating more data often leads to an increasingly precise estimate, similar to a picture coming into focus over time.

When we make decisions in the context of these improving estimates, we are balancing the costs of inaccuracy and inaction. We can improve accuracy by waiting for more data, but that comes at the cost of inaction.[2] We can act quickly, but that would be with less accuracy than we might have later. How we balance these depends on both the probability and value of the different possible outcomes.

By clarifying what we know, statistics can help inform our decisions. It can also help us discover when it is the right time to take a chance and when it is better to wait.


  1. The math is often complicated by the costs and benefits themselves being only partially known. There are some great mathematical techniques for working with this and balancing inaction and inaccuracy, but I’ll save the details of those for another day.
  2. This cost also usually goes up; it takes more and more data to improve accuracy by the same amount.
