
The average has long been considered the best statistical estimator of the center of an underlying normal distribution. We can think of an average as a central point of gravity for all the values around it. I’ve gathered here the 5 Laws of Averages that should always be considered when making a practical decision.

1. Averages don’t apply to individuals! by Jim C. Otar

This is an important thing to remember. As an individual, in order to learn from history and to make any kind of informed estimate, your best choice is usually to take the average of historical data. This does not mean the estimate is useful in all situations.

A doctor’s perspective

Imagine you are a doctor who knows that, on average, about 10% of patients have an allergic reaction to penicillin. When deciding about an individual patient, this knowledge is not very useful: even though you know that 90% of the time the patient will not be allergic, you still need to perform additional checks every time to be sure.

A gambler’s perspective

Now imagine you are a gambler who knows, on average 90% of the time, where the roulette wheel is going to stop. This knowledge would make you very rich in a very short period of time.

2. Expect the Worst and then Some

When building a bridge, you should strive for a firm structure that can withstand some of the most adverse effects of nature. This means that in order to build a resilient bridge you should not design for the average wind speed, but for the worst possible wind speed and then some. Averages can still help to test design resilience: knowing the average wind speed together with its standard deviation gives you some insight into how extreme the conditions can get.

You can then estimate the worst outcome from the available data using the average and standard deviation, and ensure that you’ve covered the cases that go well beyond that point.
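As a rough illustration of that idea, here is a minimal Python sketch. The wind speeds are made-up numbers, and the multiplier `k` and the `safety_factor` are assumptions chosen for the example, not engineering recommendations.

```python
from statistics import mean, stdev


def design_threshold(samples, k=3.0, safety_factor=1.5):
    """Estimate an extreme value as mean + k * std, then pad it further.

    k and safety_factor are illustrative assumptions, not design standards.
    """
    mu = mean(samples)
    sigma = stdev(samples)  # sample standard deviation
    extreme = mu + k * sigma  # "the worst" suggested by the data
    target = safety_factor * extreme  # "and then some"
    return mu, sigma, extreme, target


# Hypothetical daily peak wind speeds in km/h.
wind_speeds = [32, 41, 28, 55, 47, 38, 60, 35, 44, 50]

mu, sigma, extreme, target = design_threshold(wind_speeds)
print(f"mean={mu:.1f} std={sigma:.1f} mean+3*std={extreme:.1f} target={target:.1f}")
```

Note that the design target ends up well above both the average and the largest value actually observed, which is the point of this law.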

3. Averages Give Some Guarantees

Even though averages cannot be relied on in individual cases or when designing resilient systems, they still provide some guarantees. In particular, the Bienaymé–Chebyshev inequality guarantees that, for any probability distribution with a finite mean and variance, at least 75% of values lie within 2 standard deviations of the mean, and at least 89% lie within 3 standard deviations of the mean.
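The 75% and 89% figures come from the bound 1 - 1/k², where k is the number of standard deviations. The sketch below computes that bound and compares it with a deliberately skewed, non-normal sample; the exponential sample and the function names are assumptions made for the example.

```python
import random
from statistics import mean, pstdev


def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    return 1.0 - 1.0 / (k * k)


def fraction_within(samples, k):
    """Observed fraction of samples within k standard deviations of their mean."""
    mu = mean(samples)
    sigma = pstdev(samples)  # population standard deviation of the sample
    return sum(abs(x - mu) <= k * sigma for x in samples) / len(samples)


random.seed(0)
# A skewed sample, to show the bound does not rely on normality.
samples = [random.expovariate(1.0) for _ in range(10_000)]

for k in (2, 3):
    bound = chebyshev_lower_bound(k)
    observed = fraction_within(samples, k)
    print(f"k={k}: guaranteed >= {bound:.3f}, observed {observed:.3f}")
```

The observed fraction is typically much higher than the guarantee. Chebyshev gives a worst-case floor, not a prediction.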

4. Trust The Global Average More

In 1955, Professor Charles Stein of Stanford University introduced a revolutionary concept called “shrinkage”. The concept is counterintuitive because it states that measuring the weight of candies and the number of World Cup attendees can somehow improve estimates of basketball players’ scoring ability, if you estimate them together.

How is that possible? Obviously, those data points are completely unrelated; still, when estimated together they produce better estimates, with less total risk, than when each is estimated individually.

Let’s apply this concept to a basketball player example.

Each basketball player has his own average of points scored. Given that, we would naturally assume that the best estimate for a player is his own average. The James–Stein estimator shows that we can get better predictions for individual basketball players overall by taking the global average of all players in the world and then using a simple formula to calculate an individual estimate per player:

$$z = \bar{y} + c\,(y - \bar{y})$$

In this formula, $z$ is the new estimate for each player, $\bar{y}$ is the grand average of all players, $c$ is the “shrinkage” factor, and $y$ is the player’s own average.
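The post does not say how the shrinkage factor is chosen. One common textbook choice, assumed here, is c = 1 - (k - 3)σ² / Σ(y - ȳ)², where k is the number of players and σ² is the (assumed known and equal) variance of each player's average. The sketch below uses that choice, clipped at zero, with made-up scoring averages.

```python
from statistics import mean


def james_stein(averages, sigma2):
    """Shrink each average toward the grand average.

    Assumes every average has the same known variance sigma2 and uses
    c = 1 - (k - 3) * sigma2 / sum((y - grand) ** 2), clipped at 0.
    """
    k = len(averages)
    if k < 4:
        raise ValueError("shrinking toward the grand average needs at least 4 values")
    grand = mean(averages)
    spread = sum((y - grand) ** 2 for y in averages)
    if spread == 0:
        c = 0.0  # all averages are identical, nothing to shrink
    else:
        c = max(0.0, 1.0 - (k - 3) * sigma2 / spread)
    estimates = [grand + c * (y - grand) for y in averages]
    return grand, c, estimates


# Hypothetical points-per-game averages and an assumed variance.
players = [12.0, 18.5, 25.0, 9.5, 21.0, 15.0]
grand, c, estimates = james_stein(players, sigma2=9.0)

for y, z in zip(players, estimates):
    print(f"own average {y:5.1f} -> shrunk estimate {z:5.2f}")
```

Every estimate moves toward the grand average: high scorers are pulled down a little and low scorers are pulled up. The noisier the individual averages (larger `sigma2`), the smaller c becomes and the stronger the pull.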

5. Expect Change

There are only a handful of phenomena in this universe that remain constant over long periods of time. Most of the things we see around us change; some need more time, some change a lot faster. We should expect things to change, and therefore we should not be surprised when that happens. Averages change. Fortunately, there are quick and easy tests to compare old conclusions with new data and to check what statisticians call goodness of fit, meaning that the new data still matches the old. One such test is the Kolmogorov–Smirnov test, named after Andrey Kolmogorov and Nikolai Smirnov. Using this test we can answer the following question: “How likely is it that we would see two sets of samples like this if they were drawn from the same (but unknown) probability distribution?” The answer helps us determine whether the data has changed over time and whether it’s time to move on from old conclusions.
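To make this concrete, here is a minimal pure-Python sketch of the two-sample version of the test. It computes the largest gap between the two empirical distribution functions and compares it with the usual large-sample threshold at the 5% level. The data is simulated, and the function names are hypothetical. In practice a library routine such as `scipy.stats.ks_2samp` would normally be used instead.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(old, new):
    """Largest vertical gap between the two empirical CDFs."""
    a, b = sorted(old), sorted(new)
    d = 0.0
    for x in a + b:  # the gap can only change at observed values
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample approximation of the rejection threshold."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))


random.seed(1)
# Simulated "old" data and "new" data whose average has drifted upward.
old = [random.gauss(20.0, 5.0) for _ in range(500)]
shifted = [random.gauss(23.0, 5.0) for _ in range(500)]

d = ks_statistic(old, shifted)
threshold = ks_critical_value(len(old), len(shifted))
print(f"D={d:.3f}, threshold={threshold:.3f}, changed={d > threshold}")
```

If the statistic exceeds the threshold, samples this different would be unlikely to come from the same distribution, which is a signal to revisit the old average.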


Averages have been a friend of ours for a very long time. They have made our lives easier and our calculations more accurate, and they have given us a tool to reduce a huge amount of complexity to a single number. To help use averages wisely, this post has discussed some of the interesting aspects of taking an average of a set of data. This is the start of the Average series of blog posts, where we will dive deeper into the fascinating world of measurement and, with it, predictions about the future.