In 1835, mortality statistics identified "student" as the most dangerous profession, with an average age at death of just over 20 years. During the Second World War, analysis of damage to returning warplanes almost led the Air Ministry to add armour to areas of the plane that did not need extra protection. And in a recent report on More or Less, the OU co-produced series looking at the numbers in the news, in the episode first broadcast on Friday 06 September 2013, we learned that "left-handers die on average several years earlier than right-handers".
Or not, as the case may be...
Imagine an exaggerated scenario where there were no left handers at all born before 1973 - 40 years ago.
If we now look at death records for 2013 and ask who, among the dead, was left-handed we would see that all of them died at or before the age of 40. That would be much younger than the average of age at death of right-handers.
Nothing like this exaggerated scenario ever occurred in reality - but the number of people identifying as left-handed did grow dramatically during the 20th Century.
So the idea that left-handers die nine years earlier than right-handers is a myth.
- BBC News Magazine, Do left-handed people really die young?
What all these stories have in common is an error in the way we're tempted to think only about the numbers we have collected, and to ignore the numbers we have not. In the case of the left-handers, left-handedness was often under-reported - and under-acknowledged - until the mid-20th Century.
As Chris McManus, Professor of Psychology and Medical Education at University College London, suggests, researchers on the left-handed mortality study succumbed to a very subtle error by only considering the dead.
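The exaggerated scenario quoted above can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the same number of deaths at every age from 0 to 90, a 10% share of left-handers among people born in 1973 or later, and handedness having no effect on when anyone dies. The function names and figures are hypothetical and are not taken from the study.

```python
from statistics import mean

YEAR = 2013    # the year whose death records we inspect
CUTOFF = 1973  # exaggerated scenario: no left-handers born before this year


def death_records(deaths_per_age=100, max_age=90, left_share=0.1):
    """Build (hand, age_at_death) records for everyone dying in YEAR.

    Assumption: the same number of people die at every age, and
    handedness has no influence at all on the age at death.
    """
    records = []
    for age in range(max_age + 1):
        birth_year = YEAR - age
        # Left-handers only exist among those born in or after CUTOFF.
        n_left = round(deaths_per_age * left_share) if birth_year >= CUTOFF else 0
        records += [("left", age)] * n_left
        records += [("right", age)] * (deaths_per_age - n_left)
    return records


def mean_age_at_death(records, hand):
    """Average age at death among the dead of the given handedness."""
    return mean(age for h, age in records if h == hand)


records = death_records()
print("left :", mean_age_at_death(records, "left"))
print("right:", mean_age_at_death(records, "right"))
```

Even though handedness plays no part in who dies in this model, the left-handed dead are much younger on average, simply because no left-hander in the records could have been older than 40.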
In the analysis that identified students as the most dangerous profession, professions were compared by the "average longevity" (that is, average age at death) of their members, and "students" came out with the shortest average lifespan (Howard Wainer, The Most Dangerous Profession: A Note on Nonsampling Error, Psychological Methods, 1999, 4(3):250-256).
Students? Was the student lifestyle then so extreme that few students ever made it to graduation?
The mistake lies, of course, in viewing the "student" profession in the same way as that of a lawyer or architect.
Anyone who becomes a lawyer or architect is likely to have spent an earlier part of their career as a student.
Once a student graduates and enters a full profession, they are no longer in a position to be classed as a student when they die: they will be recorded as a lawyer, or architect. The range of ages of all possible students may have covered ages 15 to 25, for example, whereas the ages of doctors or professors are more likely to have ranged from 25 or 30 through to 70 or 80, or maybe even older if they are not recorded as "retired" on their death certificate. And so on, for each of the professions.
If we were to look at the distribution of the ages of all members of a profession at a particular time (not just the dead ones!), we would see each profession would span a range of ages, but not necessarily the same range of ages.
And mortality patterns within those ranges may differ (for example, we may expect students to die at a constant rate across the range of student ages, but professors typically to die towards the upper end of the range of ages for that profession).
What we need to be mindful of, then, is not just how we analyse the data we have collected, but also the data we have not collected.
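The effect of the different age ranges can be illustrated with a toy cohort model. The sketch below assumes, purely for illustration, that everyone is labelled "student" from 15 to 24 and "professor" from 25 onwards, and that the yearly risk of dying depends only on the hypothetical hazard function passed in. None of the numbers come from the 1835 data.

```python
def mean_age_at_death_by_label(hazard, start_age=15, graduation_age=25, max_age=100):
    """Follow one cohort and average the age at death per recorded label.

    `hazard(age, label)` is the assumed probability of dying during that
    year of age; anyone still alive at `max_age` is taken to die then.
    """
    alive = 1.0  # fraction of the cohort still alive
    totals = {"student": [0.0, 0.0], "professor": [0.0, 0.0]}  # [deaths, deaths * age]
    for age in range(start_age, max_age + 1):
        label = "student" if age < graduation_age else "professor"
        rate = 1.0 if age == max_age else hazard(age, label)
        deaths = alive * rate
        totals[label][0] += deaths
        totals[label][1] += deaths * age
        alive -= deaths
    return {label: weighted / deaths for label, (deaths, weighted) in totals.items()}


def same_risk(age, label):
    # Hypothetical: 0.5% yearly risk below 60, 5% from 60, whatever the label.
    return 0.005 if age < 60 else 0.05


def safer_students(age, label):
    # Hypothetical: being a student is ten times safer than anything else.
    base = same_risk(age, label)
    return base / 10 if label == "student" else base


print(mean_age_at_death_by_label(same_risk))
print(mean_age_at_death_by_label(safer_students))
```

In both runs the average age at death of "students" sits between 15 and 25, because nobody older than that can be recorded as a student. Making students ten times safer barely moves the figure, which is why average age at death says little about how dangerous a profession is.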
A study by Abraham Wald (A Reprint of a Method of Estimating Plane Vulnerability Based on Damage of Survivors, also described in The hole story: What you don't see will kill you) makes this clear.
Analysing bullet holes on returning aircraft produced an averaged distribution of damage that was not uniform across a plane - more holes appeared on the wings and certain parts of the fuselage than on other key areas. Where then to put the armour? At first glance, you might assume you need to protect the areas where there appears to be a higher density of bullet holes.
But Wald reasoned that if the distribution of bullet holes across all planes that were hit - not just the planes that returned - was uniform, the areas that needed protecting more were the areas that had the fewest bullet holes.
Flipping this round, if we assume that the planes that did not return took damage to the areas that are under-represented in the averaged damage across the planes that did return, then hits to those areas are more likely to result in a plane not making it home.
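A minimal numerical sketch of this reasoning follows. It is a simplification for illustration and not Wald's actual method: it assumes each plane is hit exactly once, that hits fall equally often on each of four sections, and that the analyst knows how many planes were hit in total. The section names and survival probabilities are invented.

```python
# Hypothetical chance that a plane makes it home after a hit to each section.
SURVIVAL_IF_HIT = {"engine": 0.4, "cockpit": 0.5, "fuselage": 0.9, "wings": 0.95}


def holes_on_returning_planes(planes_hit_per_section=1000):
    """Expected hole counts seen on the planes that returned.

    Assumption: hits are uniform, so every section is hit on the same
    number of planes; only the survivors can be inspected.
    """
    return {s: planes_hit_per_section * p for s, p in SURVIVAL_IF_HIT.items()}


def estimated_loss_rate(observed, planes_hit_per_section=1000):
    """Infer, per section, the share of hit planes that never came back."""
    return {s: 1 - n / planes_hit_per_section for s, n in observed.items()}


observed = holes_on_returning_planes()
losses = estimated_loss_rate(observed)
print("fewest holes seen on:", min(observed, key=observed.get))
print("highest estimated loss rate:", max(losses, key=losses.get))
```

Under these assumptions the section with the fewest holes on returning planes is exactly the one where a hit is most often fatal, so that is where the armour belongs.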
So the moral of the story? Always bear in mind the data you didn't collect, as well as the data you did!