Diary of a data sleuth: Getting to grips with the census data

Updated Friday 14th December 2012

With the recent release of key statistics around the UK's 2011 census by the Office for National Statistics, our resident data sleuth set out to see what he could find.

A few days ago, a set of key statistics around the UK's 2011 census was released by the Office for National Statistics (ONS). For amateur statisticians, releases such as this provide a wealth of data that can be explored. For the data sleuth, official data releases supposedly make discovering the data sets relatively straightforward, although evidence recently presented to the House of Commons Public Administration Select Committee (PASC) has been highly critical of the Office for National Statistics website in general (BBC: ONS website is improving, Andrew Dilnot tells critical MPs; watch Andrew Dilnot's evidence to the PASC, December 2012). If we start at the beginning, though, with a general web search engine, and the task of tracking down census data, how well do we get on? A quick search for uk census 2011 data turns up the Census 2011 area of the ONS website.

The census website provides three ways in to exploring more about the census data:

Figure: The census website. Screenshots from the Office for National Statistics website. Copyright: ONS.

The census analysis ("Stories") section takes us to a series of articles that help us get some sort of understanding about the make-up of the population of England and Wales (on census day, at least!). Each article addresses a particular topic, starting with religious affiliations, or ethnicity and national identity. As Professor Kevin McConway describes in his compelling inaugural lecture, statisticians love to compare things, and in the case of the census, one obvious comparison to make is between regional or local distributions, comparisons which are often illustrated with maps.

The articles also hint at how we can start to pull stories out of data. It is quite possible that section headings along the lines of "Religious affiliation across the English regions and Wales" arose from a statistician asking him- or herself the question: "Do religious affiliations differ across regions, or within local areas?"; looking at the data through this lens;  refining this question as "How do religious affiliations differ across regions, or within local areas?"; and then reporting back on what they found. In many cases, one or more stories may suggest themselves as to why the population is distributed in this way, which can lead to a further round of analysis or the start of a journalistic or academic research project.

As well as illustrated text articles, some of the topics are also summarised by means of narrated videos, which are collated on the ONS Stats YouTube channel.

Religion in England and Wales

If you want to dig into the numbers behind the charts, links are provided to files containing the data in the Microsoft Office Excel/.xls spreadsheet format. This is slightly irritating, because it presupposes that we have a spreadsheet application to hand to view the data or access to an online spreadsheet application such as Google Sheets. While spreadsheets provide one of the most widely used tools for working with data, alternative approaches to working with data and creating data visualisations are increasingly to be found. One of my favourites is DataWrapper, a package built originally to support the work of data journalists, which lets you visualise simple data sets with a range of attractive charts that can be embedded in your own web pages. There are also power tools available for working with data in its rawest form, such as "data mechanics" tools like OpenRefine (which amongst other things will open data files, let you preview them, and then convert the data to other file formats), and statistical programming tools such as R/RStudio.

Unfortunately, the map-based figures don't give you direct access to the data. True, the interactive census tools do allow you to experiment, to a certain extent, with various map views, allowing you to select what values you want to compare on a geographical basis as well as over time in the form of a side by side view of the 2001 and 2011 census data.

But no data... Maybe it's in the data area of the site, hidden away amongst the Reference Tables? There are certainly lots of data there, but it may not be clear to anyone other than data wrangling experts how you might take that data yourself and get it onto a map.

The rise of online and computer mapping tools, triggered in part by the expectations raised by Google Maps about what maps can - and should - be able to do, has made producing digital maps relatively easy:

  1. if you know what tools to use, and
  2. you're prepared to get your hands bitty. (Bitty as in bits... bits and bytes? Bits versus atoms? Oh, never mind...)

In brief, there are two basic approaches to producing 'coloured in' maps (a particular form of thematic map also known as choropleth maps). The first is to load your data into a tool that already knows about the shapes of different explicitly identified geographical regions. If you identify each region with an appropriate code, the area of a map associated with this identification code value can then be coloured in appropriately.

Figure: Census area codes. Screenshots from the Office for National Statistics website. Copyright: ONS.
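To make the first approach a little more concrete, here is a minimal Python sketch of the "colouring in" step that such a tool performs once it has a value for each area code. It is only an illustration under stated assumptions: the area codes and percentages are invented placeholders rather than ONS figures, the equal-interval banding is just one of several possible classification schemes, and the function names are hypothetical.

```python
# Illustrative placeholders only: these area codes and values are invented,
# not taken from any ONS release.
values_by_area = {
    "AREA001": 12.5,
    "AREA002": 30.0,
    "AREA003": 47.5,
    "AREA004": 20.0,
}

# Light-to-dark palette; one colour per class.
PALETTE = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]


def classify(value, low, high, n_classes):
    """Return the equal-interval class index (0..n_classes-1) for value."""
    if high == low:
        return 0
    width = (high - low) / n_classes
    index = int((value - low) / width)
    return min(index, n_classes - 1)  # the maximum value falls in the top class


def colour_by_area(values, palette):
    """Map each area code to the palette colour for its value's class."""
    low, high = min(values.values()), max(values.values())
    return {
        code: palette[classify(value, low, high, len(palette))]
        for code, value in values.items()
    }


colours = colour_by_area(values_by_area, PALETTE)
for code in sorted(colours):
    print(code, colours[code])
```

A mapping tool of the first type would then look up the boundary it already holds for each code and fill it with the corresponding colour. The only thing we have to supply is the code-to-value table.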

The second approach is to bundle so-called "shapefile" data along with your own data. A plotting tool can then be used to draw out the regional boundaries from the shape data, and then fill each region with a colour according to one of the statistics values associated with it. If you look carefully at the data sets, you will see that where regions are mentioned, they are identified by an explicit Area Code value. These codes may be recognised by mapping tools of the first type, tools such as OpenHeatMap. The ONS also publishes "shape" data for each of the Area Code identified regions, which can be used to power mapping tools of the second type. Conveniently, the Census Dissemination Unit, operated by Mimas at The University of Manchester, publish combined data and shape-data files that make it relatively easy to create your own maps. So for example, here's one I created this morning (it's easy when you know how).

Figure: Census 2011 demo. An example demo of Census 2011 data.
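The second approach, where the boundary shapes travel with the data, can be sketched in a few lines of Python. This is not how the map above was made. It is a toy under stated assumptions: the regions are held in a GeoJSON-style structure rather than an actual shapefile, the two square "regions", their codes and their values are all invented, and the function names are hypothetical. The output is a small SVG image in which each region is shaded according to its value.

```python
# A GeoJSON-style feature collection: each region carries both its boundary
# and its statistic. Codes, coordinates and values are invented placeholders.
features = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"area_code": "AREA001", "value": 10.0},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [50, 0], [50, 50], [0, 50], [0, 0]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"area_code": "AREA002", "value": 40.0},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[50, 0], [100, 0], [100, 50], [50, 50], [50, 0]]],
            },
        },
    ],
}


def grey_for(value, low, high):
    """Map value linearly to a grey shade: low -> white, high -> black."""
    share = 0.0 if high == low else (value - low) / (high - low)
    level = round(255 * (1 - share))
    return "#{0:02x}{0:02x}{0:02x}".format(level)


def to_svg(collection, width=100, height=50):
    """Draw each region's outer boundary and fill it according to its value."""
    feats = collection["features"]
    values = [f["properties"]["value"] for f in feats]
    low, high = min(values), max(values)
    shapes = []
    for f in feats:
        ring = f["geometry"]["coordinates"][0]  # outer boundary only
        points = " ".join("{},{}".format(x, y) for x, y in ring)
        fill = grey_for(f["properties"]["value"], low, high)
        shapes.append(
            '<polygon points="{}" fill="{}" stroke="black"/>'.format(points, fill)
        )
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="{}" height="{}">{}</svg>'
    ).format(width, height, "".join(shapes))


svg = to_svg(features)
print(svg)
```

Saving the printed text to a file with an .svg extension and opening it in a web browser should show two shaded squares. Real boundary data would also need its coordinates projecting and the vertical axis flipping, because SVG measures y downwards, which is exactly the sort of work a dedicated mapping tool does for you.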

For more about online mapping, see OpenLearn: Visual representations of data and information - Geographical data.

In the field of astronomy, experienced amateur astronomers have made, and continue to make, significant contributions to professional astronomy. In part, this is because there's a big sky out there and there aren't enough professional astronomers to watch it all, all of the time! With the release of ever more public datasets, such as the 2011 Census data, might there be similar opportunities for dedicated amateur statisticians to help make sense out of it all? As citizen journalists, might we be able to spot interesting stories within the data, particularly at a local or regional level familiar to us, that may have been missed in national reports? And as creative technologists, might we be able to find new ways of exploring or visualising the data that make it easier to analyse for amateurs and professionals alike?

As a recently advertised job for the "Head of Rich Content" at the ONS suggests, the ONS website should anticipate and meet user needs and expectations, including those of the Citizen User [my emphasis]. Are you one of those users? And if so, how are you using the ONS website in general, and the ONS census website in particular? Let us know via the comments below.

[DISCLAIMER: In the interests of full disclosure, OpenLearn’s Laura Dewis will be leaving the OU to work for the ONS in early 2013. This article reflects the opinions of Tony Hirst, the author, and Laura wasn’t involved in writing or publishing any aspect of it. Everyone at OpenLearn looks forward to seeing how the ONS website develops with Laura on board their team.]

 
