The other day I was browsing through the list of the world’s most popular goals on 43 Things, when I came across something one might call a “true lie”.
Apparently, 25,271 people want to “Fall in love”, and on average it takes them 9 years to complete this goal. Not too far down the list, 19,421 people want to “get married”, and on average it takes them 8 years. Taken at face value, the average person gets married a year before falling in love. And therein lies the fundamental flaw in the typical perception of statistical averaging.
The dictionary defines an average as:
central tendency around the middle of a scale of evaluation.
Averaging has long been an important methodological “assumption” in data-driven understanding. A lot of analysis and decision-making around the world is based on taking the averages of various quantitative measurements. But is it really a reliable way to represent results?
An average is a single value that is meant to typify a list of values, and it can mislead when misused. I think the problem with averaging largely comes down to how little it reveals about the underlying distribution. For example, on average every person has one testicle and one breast. Misleading, but true. Without valid segmentation, an average doesn’t accurately characterize the data set, and any inference drawn from it becomes biased.
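To make that example concrete, here is a minimal sketch in Python. The population is hypothetical and deliberately oversimplified (two equal groups of 50), and the helper names are my own, purely for illustration.

```python
from statistics import mean

# Hypothetical, deliberately simplified population: two equal groups of 50.
people = (
    [{"group": "male", "testicles": 2, "breasts": 0}] * 50
    + [{"group": "female", "testicles": 0, "breasts": 2}] * 50
)

def average(rows, field):
    """Average of one field over every row, ignoring groups."""
    return mean(row[field] for row in rows)

def average_by_group(rows, field):
    """Average of one field computed separately for each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row["group"], []).append(row[field])
    return {name: mean(values) for name, values in groups.items()}

fields = ("testicles", "breasts")
overall = {field: average(people, field) for field in fields}
segmented = {field: average_by_group(people, field) for field in fields}

print("Overall:  ", overall)
print("Segmented:", segmented)
```

The overall figure is arithmetically correct and describes nobody in the data set. Only the per-group averages match anyone who is actually in it.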
In most cases, an average ends up sounding like a generalized fact, mainly to justify a marketing strategy. If a company promotes a product by stating that it’s been proven to be effective for “75% of people on average”, it leaves a lot unsaid. Which age groups, genders, income levels and so on did these people fall into when the average was derived? The less segmentation a company discloses, the less confident we should be that the product will work for us, and the less inclined we should be to buy it. However, marketers know well that it’s this very lack of segmentation that impairs people’s judgement. We buy what others buy, as game theory comes into action.
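As a rough illustration of how much a headline figure like that can hide, consider the sketch below. The segments and counts are invented for the example and do not come from any real product or study.

```python
# Hypothetical trial results, invented for illustration:
# (segment, people tested, people for whom the product worked)
trial = [
    ("age 18-30", 600, 570),
    ("age 31-50", 300, 165),
    ("age 51+", 100, 15),
]

total_tested = sum(tested for _, tested, _ in trial)
total_worked = sum(worked for _, _, worked in trial)

# The headline number pools everyone together.
overall_rate = total_worked / total_tested

# The same data, broken down by segment.
segment_rates = {name: worked / tested for name, tested, worked in trial}

print(f"Overall: {overall_rate:.0%}")
for name, rate in segment_rates.items():
    print(f"{name}: {rate:.0%}")
```

With these made-up numbers the pooled rate is 75%, yet the product works for only 15% of the oldest segment. The headline is dominated by whichever segment was sampled most heavily, which is exactly the detail the advertisement leaves out.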
On the other side, many market researchers overuse averaging and reach invalid conclusions. Organizations then act on these conclusions, to their own detriment. Many organizations produce stuff that may never be widely adopted, but since their market research was based on generic averaging to start with, they think otherwise.
Averaging data without clear information about how that data is segmented is a vague and pointless exercise, and the cognitive bias it feeds can have grave consequences. Unfortunately, it’s also the most popular way of drawing abstract conclusions.