We have previously discussed interesting phenomena in analytics that, if not carefully considered or understood, can lead to unexpected or incorrect conclusions. See, for example, Simpson’s Paradox and Goodhart’s law. Then there is “data dredging,” of which so-called p-hacking is one type.
The Texas sharpshooter fallacy is another pitfall to avoid. It is less a mathematical phenomenon, like Goodhart’s law and Simpson’s Paradox, than a manifestation of human bias. It happens when similarities in data are emphasized and differences are ignored. The metaphor refers to a Texan (no offense intended; we have a few on the First Analytics team) who fires rifle shots into the wall of a barn, then paints a target around the largest, densest cluster of hits and proclaims himself a sharpshooter.
In analytics, we often see this in cluster analysis, which is used for segmentation. Although the groupings are suggested by math, decisions about how the clusters are formed and evaluated are nearly always based on subjective criteria. This opens the door to cognitive biases (in their many varieties) that produce groupings reflecting what the human eye wants to see more than what the data suggest.
Many applications involve classifying and grouping entities, especially in marketing and customer analytics. To mitigate the risk of a biased segmentation scheme, good data scientists define objective measures of how tight and how well separated the groupings are before interpreting them. The approach is not perfect, and some subjectivity remains, but in anything data related, the data should be allowed to speak for themselves before being wrangled by a human.
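As one illustration of what an objective measurement might look like, here is a minimal Python sketch of the mean silhouette coefficient, a common score for how tight and well separated a set of clusters is. This is our choice of example, not the only option, and the function name `mean_silhouette`, the synthetic two-blob data, and the shuffled-label baseline are all assumptions made for illustration. The idea is to compare the score of a proposed segmentation against the score of the same data with labels assigned at random: if the two are close, the "target" may have been painted after the shots were fired.

```python
import math
import random


def mean_silhouette(points, labels):
    """Mean silhouette coefficient: near 1 means tight, well-separated
    clusters; near 0 means the grouping is no better than arbitrary."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of p's own cluster
        # (the distance from p to itself is 0, so it adds nothing to the sum)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Synthetic data (an assumption for illustration): two well-separated blobs.
rng = random.Random(0)
blob_a = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
blob_b = [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)]
points = blob_a + blob_b

true_labels = [0] * 30 + [1] * 30
shuffled_labels = true_labels[:]
rng.shuffle(shuffled_labels)  # baseline: same points, arbitrary grouping

score_structured = mean_silhouette(points, true_labels)
score_shuffled = mean_silhouette(points, shuffled_labels)
print(f"structured: {score_structured:.2f}, shuffled: {score_shuffled:.2f}")
```

On real customer data the gap between the two scores will rarely be this stark, and the silhouette favors compact, roughly round clusters, so it is best treated as one check among several rather than a verdict.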