3.4 Misleading correlation
It is important to use data sense when interpreting relationships based on correlation.
Correlation, not causation
Sometimes correlation exists between two variables but this does not automatically mean there is causation.
For example, ice-cream sales tend to be highest when the number of accidents in swimming pools is highest. To claim that ice-cream sales cause accidents in pools, or vice versa, would be false. In reality, both may be caused by a heat wave. The heat wave is a hidden or unseen variable, also known as a confounding variable.
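Learners who are comfortable with a little programming could explore the effect of a confounding variable with a small simulation. The sketch below is illustrative only: the data are randomly generated, not real, and the names and numbers (`temperature`, `ice_cream_sales`, `pool_accidents`, the slopes and noise levels) are assumptions chosen for the example. Temperature drives both of the other variables, and neither of them affects the other.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, xs):
    """What is left of ys after removing its straight-line trend against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up data: daily temperature is the hidden (confounding) variable.
temperature = [rng.uniform(10, 35) for _ in range(500)]
# Both quantities depend on temperature only, plus random noise.
ice_cream_sales = [20 * t + rng.gauss(0, 60) for t in temperature]
pool_accidents = [0.5 * t + rng.gauss(0, 2) for t in temperature]

# Correlation between sales and accidents, ignoring temperature.
raw_r = pearson(ice_cream_sales, pool_accidents)

# Correlation after the effect of temperature is removed from both.
adjusted_r = pearson(residuals(ice_cream_sales, temperature),
                     residuals(pool_accidents, temperature))

print(f"ignoring temperature:       r = {raw_r:.2f}")
print(f"allowing for temperature:   r = {adjusted_r:.2f}")
```

In a run of this sketch the first correlation should be strong and positive, while the second should be close to zero. This mirrors the point above: the apparent link between the two variables disappears once the heat wave is taken into account.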
Spurious correlation
Sometimes a correlation can be found between two variables even though there is no relationship or connection between them at all. This is known as spurious correlation or, in other words, ‘false correlation’.
For example, the number of astronauts dying in spacecraft is directly correlated with seatbelt use in cars. At first glance there appears to be a link between the two variables, but there is no real relationship between them, so the correlation is spurious.
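One way to show learners how a spurious correlation can arise purely by chance is to compare many unrelated sets of numbers and pick out the pair that happens to match best. The sketch below is a hypothetical illustration using random numbers only. The choices of 200 series and 10 values per series are assumptions made for the example.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(1)  # fixed seed so the run is repeatable

# 200 unrelated "variables", each with only 10 random observations.
series = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]

# Search every pair for the strongest correlation (positive or negative).
best_r = 0.0
best_pair = None
for i in range(len(series)):
    for j in range(i + 1, len(series)):
        r = pearson(series[i], series[j])
        if abs(r) > abs(best_r):
            best_r, best_pair = r, (i, j)

print(f"strongest correlation found: r = {best_r:.2f} for series {best_pair}")
```

None of these series has anything to do with any other, yet with so many pairs to choose from, at least one pair is very likely to show a strong correlation. Short data sets and a large number of comparisons make this kind of chance match more likely.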
Having discussions with your learners about when correlation does and does not imply a relationship is a useful way to support the development of their statistical thinking.