Nowadays, researchers can access a wealth of software packages that can readily analyse data and output the results of complex statistical tests. While these are powerful resources, they also make it easy for people without a full statistical understanding to miss some of the subtleties within a dataset and to draw wildly incorrect conclusions.
I love this post so much—it pokes at a topic that I think is incredibly important in all of our professional (and personal!) lives today: the inability of most people to reason statistically.
As a data scientist, it isn’t your job to find the right answer: it is your job to convince other people of the right answer. Knowing that something is true is completely without value if that knowledge doesn’t effect change in the world, and that almost always requires consensus.
I think of these paradoxes (Simpson’s paradox, Berkson’s paradox, the Will Rogers paradox) as fables: short anecdotes that teach statistical reasoning. As with most fables, repetition is the key. Know them by heart and reference them when explaining why a particular line of reasoning is faulty.
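To make one of these fables concrete, here is a minimal sketch of Simpson’s paradox in Python. The counts, the treatment labels `A` and `B`, the group names, and the helper functions are all assumptions made up for illustration; they are not taken from the post. The point is only that a treatment can win inside every subgroup and still lose once the subgroups are pooled.

```python
from fractions import Fraction

# Illustrative (successes, trials) counts for two treatments, split by a
# confounding variable (here, case severity). These numbers are assumed
# for the sake of the example.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, trials):
    """Exact success rate for one subgroup."""
    return Fraction(successes, trials)


def pooled_rate(groups):
    """Success rate after lumping all subgroups together."""
    total_successes = sum(s for s, _ in groups.values())
    total_trials = sum(n for _, n in groups.values())
    return Fraction(total_successes, total_trials)


# Does A beat B inside each subgroup?
a_wins_within = {
    group: rate(*counts["A"][group]) > rate(*counts["B"][group])
    for group in ("small", "large")
}

# Does A beat B once the subgroups are pooled?
a_wins_overall = pooled_rate(counts["A"]) > pooled_rate(counts["B"])

for group, a_wins in a_wins_within.items():
    print(f"{group}: A better than B? {a_wins}")
print(f"pooled: A better than B? {a_wins_overall}")
```

With these counts, A comes out ahead in both subgroups but behind in the pooled comparison, because A was given mostly to the harder cases. That reversal is the whole fable: the aggregate hides how the groups were mixed.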