Basic science: understanding numbers

4.4 Correlation, causation and coincidence

Graphs are a great tool for presenting complicated results: they can help communicate the relationship between two or more variables.

Correlation

A striking example of this is the recognition of a correlation between smoking and lung cancer. Lung cancer used to be a rare disease: in 1878, only 1% of autopsies performed by the Institute of Pathology of the University of Dresden showed malignant lung tumours. Unfortunately, lung cancer did not remain rare and, over the following 50 years, this figure rose to more than 14%.

Causation

A particularly observant scientist, Franz Müller from Cologne Hospital, published a study in 1939, identifying the correlation between tobacco smoke and lung cancer. The study compared 86 lung cancer cases and a similar number of cancer-free controls, showing that the people who smoked were far more likely to suffer lung cancer.

Figure 11 A graph displaying per capita cigarette consumption and lung cancer deaths per 100,000 between 1900 and 2000

However, while a correlation between two sets of observations or measurements can point to a causal relationship between them, correlation does not always imply causation. This was part of the basis for the long-running debate about smoking and lung cancer: the link was accepted only some decades later, once scientific evidence for the cause had emerged.

Coincidence

Consider this odd correlation between worldwide launches of non-commercial space missions and the number of sociology doctorates awarded in the USA. The graph shows how both of these variables rise and fall together, as if connected in some way. You might wonder whether the sociology graduates work on the space missions. However, this is not the case: people with sociology doctorates usually work in the field they trained in, rather than turning their hands to physics or space engineering. This correlation, as real as it is, is a coincidence. It is also not a freak occurrence; a quick internet search for ‘spurious correlations’ will bring up a whole host of correlations that are purely coincidental.

Figure 12 A graph showing the number of sociology doctorates awarded each year in the US compared to worldwide non-commercial shuttle launches by year

The role of a scientist is to critically assess correlations encountered in their results. There are several criteria scientists use to test the validity of correlations and their significance. The detail is beyond the scope of this course, but applying these criteria is a crucial science skill. Ways of testing a correlation include checking its goodness of fit (in other words, working out how strong the correlation is) and checking whether it can be reproduced and examined further.
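One common way of putting a number on how strong a correlation is uses the Pearson correlation coefficient, usually written r. It ranges from -1 (one variable falls steadily as the other rises) through 0 (no straight-line relationship) to +1 (both rise steadily together). The sketch below is only an illustration, not a method taken from this course: the function name pearson_r and all of the numbers are made up for the example and are not real measurements.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviation of every point from the mean of its own list.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # How much the two lists vary together, and how much each varies alone.
    co_variation = sum(a * b for a, b in zip(dx, dy))
    spread_x = sum(a * a for a in dx)
    spread_y = sum(b * b for b in dy)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when one variable never changes")
    return co_variation / math.sqrt(spread_x * spread_y)


# Made-up illustrative values (assumption: not real data).
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # rises exactly in step with x
y_down = [10, 8, 6, 4, 2]   # falls exactly as x rises
y_mixed = [2, 1, 4, 3, 5]   # rises overall, but not perfectly

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
r_mixed = pearson_r(x, y_mixed)

print(f"perfect rise: r = {r_up:.2f}")
print(f"perfect fall: r = {r_down:.2f}")
print(f"rough rise:   r = {r_mixed:.2f}")
```

A value of r close to +1 or -1 says only that the points lie close to a straight line. It says nothing about whether one variable causes the other, which is why reproducibility and further examination matter as well.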

What about the apparent correlation in the graph of space missions and sociology graduates? A scientist might ask why the graph is only plotted between the years 1997 and 2009, since both space missions and sociology graduates were around before and after those dates – did the person plotting the data avoid those earlier and later dates because the correlation breaks down?
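The effect of choosing which years to plot can be sketched with a small calculation. The example below uses two invented series of twelve values each (they are assumptions for illustration only and are not the space mission or sociology data) and compares the correlation coefficient over a hand-picked window with the coefficient over the whole range. The helper pearson_r and the window positions are likewise hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_variation = sum(a * b for a, b in zip(dx, dy))
    return co_variation / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Invented values, one per time step (assumption: not real data).
series_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]   # rises steadily throughout
series_b = [9, 7, 5, 3, 5, 6, 7, 8, 6, 4, 2, 1]      # rises only in the middle

# A hand-picked window covering the stretch where both series rise together.
start, stop = 3, 8
r_window = pearson_r(series_a[start:stop], series_b[start:stop])

# The same calculation over every time step.
r_full = pearson_r(series_a, series_b)

print(f"chosen window: r = {r_window:.2f}")
print(f"full range:    r = {r_full:.2f}")
```

With these invented numbers the chosen window gives a strong positive correlation, while the full range gives a negative one. A plot limited to the window would therefore suggest a relationship that the complete data do not support.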

Despite precautions such as checking goodness of fit and reproducibility, scientists sometimes get it wrong and over-interpret a correlation or apply causal mechanisms to coincidences. Can you think of any examples where this has been the case?
