Science, Maths & Technology

Diary of a data sleuth: Getting to grips with the census data

Updated Friday 14th December 2012

With the recent release of key statistics around the UK's 2011 census by the Office for National Statistics, our resident data sleuth set out to see what he could find.

A few days ago, a set of key statistics around the UK's 2011 census was released by the Office for National Statistics (ONS). For amateur statisticians, releases such as this provide a wealth of data that can be explored. For the data sleuth, official data releases supposedly make discovering the data sets relatively straightforward, although evidence recently presented to the House of Commons Public Administration Committee (PASC) has been highly critical of the Office for National Statistics website in general (BBC: ONS website is improving, Andrew Dilnot tells critical MPs; watch Andrew Dilnot's evidence to the PASC, December 2012). If we start at the beginning, though, with a general web search engine and the task of tracking down census data, how well do we get on? A quick search for "uk census 2011 data" turns up the Census 2011 area of the ONS website.

The census website provides three ways in to exploring more about the census data:

[Figure: The census website. Screenshots from the Office for National Statistics website. Copyright: ONS]

The census analysis ("Stories") section takes us to a series of articles that help us get some sort of understanding of the make-up of the population of England and Wales (on census day, at least!). Each article addresses a particular topic, starting with religious affiliations, or ethnicity and national identity. As Professor Kevin McConway describes in his compelling inaugural lecture, statisticians love to compare things, and in the case of the census, one obvious comparison to make is between regional or local distributions, comparisons which are often illustrated with maps.

The articles also hint at how we can start to pull stories out of data. It is quite possible that section headings along the lines of "Religious affiliation across the English regions and Wales" arose from a statistician asking him- or herself the question: "Do religious affiliations differ across regions, or within local areas?"; looking at the data through this lens;  refining this question as "How do religious affiliations differ across regions, or within local areas?"; and then reporting back on what they found. In many cases, one or more stories may suggest themselves as to why the population is distributed in this way, which can lead to a further round of analysis or the start of a journalistic or academic research project.
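To make that first question a little more concrete, here is a minimal Python sketch of the comparison. The region names and counts are invented purely for illustration (they are not census figures), and `shares` is a hypothetical helper: it converts each region's raw counts into percentages, so that regions of different sizes can be compared side by side.

```python
# Hypothetical counts, invented for illustration; NOT real census figures.
counts = {
    "Region A": {"Christian": 600, "No religion": 300, "Other": 100},
    "Region B": {"Christian": 450, "No religion": 450, "Other": 100},
}


def shares(region_counts):
    """Convert raw counts for one region into percentages of that region's total."""
    total = sum(region_counts.values())
    return {group: 100 * n / total for group, n in region_counts.items()}


# Percentages per region, ready to be compared group by group.
region_shares = {region: shares(c) for region, c in counts.items()}

for region, s in region_shares.items():
    print(region, {group: round(pct, 1) for group, pct in s.items()})
```

Working in percentages rather than raw counts is what makes the "do they differ?" question answerable at a glance; the "how do they differ?" question then becomes a matter of looking at which groups show the biggest gaps.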

As well as illustrated text articles, some of the topics are also summarised by means of narrated videos, which are collated on the ONS Stats YouTube channel.

[Video: Religion in England and Wales]

If you want to dig into the numbers behind the charts, links are provided to files containing the data in the Microsoft Office Excel/.xls spreadsheet format. This is slightly irritating, because it presupposes that we either have a spreadsheet application to hand to view the data, or have access to an online one such as Google Sheets. While spreadsheets provide one of the most widely used tools for working with data, alternative approaches to working with data and creating data visualisations are increasingly available. One of my favourites is DataWrapper, a package built originally to support the work of data journalists, which lets you visualise simple data sets with a range of attractive charts that can be embedded in your own web pages. There are also power tools available for working with data in its rawest form, such as "data mechanics" tools like OpenRefine (which amongst other things will open data files, let you preview them, and then convert the data to other file formats), and statistical programming tools such as R/RStudio.

Unfortunately, the map-based figures don't give you direct access to the data. True, the interactive census tools do allow you to experiment, to a certain extent, with various map views, allowing you to select what values you want to compare on a geographical basis as well as over time in the form of a side by side view of the 2001 and 2011 census data.

But no data... Maybe it's in the data area of the site, hidden away amongst the Reference Tables? There is certainly a lot of data there, but to all but data-wrangling experts it may not be clear how you might take that data yourself and get it onto a map.

The rise of online and computer mapping tools, triggered in part by the expectations raised by Google Maps about what maps can - and should - be able to do, has made producing digital maps relatively easy:

  1. if you know what tools to use, and
  2. if you're prepared to get your hands bitty. (Bitty as in bits... bits and bytes? Bits versus atoms? Oh, never mind...)

In brief, there are two basic approaches to producing 'coloured in' maps (a particular form of thematic map also known as choropleth maps). The first is to load your data into a tool that already knows about the shapes of different explicitly identified geographical regions. If you identify each region with an appropriate code, the area of a map associated with this identification code value can then be coloured in appropriately.

[Figure: Census area codes. Screenshots from the Office for National Statistics website. Copyright: ONS]
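As a rough sketch of what a tool of this first kind does behind the scenes, the Python below assigns a fill colour to each area code by splitting the range of values into equal-width bins. The area codes, values and palette are all made up for illustration, and `colour_for` is a hypothetical helper; real tools may well use other binning schemes, such as quantiles.

```python
# Hypothetical area codes and values, invented for illustration.
values = {"X01": 12.0, "X02": 37.5, "X03": 61.0, "X04": 88.0}

# A light-to-dark palette; these particular hex colours are an arbitrary choice.
PALETTE = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]


def colour_for(value, lo, hi, palette=PALETTE):
    """Pick a palette colour by splitting the range lo..hi into equal-width bins."""
    if hi == lo:
        # Every area has the same value, so there is nothing to distinguish.
        return palette[0]
    position = (value - lo) / (hi - lo)  # 0.0 at the minimum, 1.0 at the maximum
    # The maximum value would land one past the last bin, so clamp the index.
    index = min(int(position * len(palette)), len(palette) - 1)
    return palette[index]


lo, hi = min(values.values()), max(values.values())
colours = {code: colour_for(v, lo, hi) for code, v in values.items()}

for code in sorted(colours):
    print(code, values[code], colours[code])
```

The important point is that the tool only needs the code and the value from you: the shape of region X01 is something it already knows.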

The second approach is to bundle so-called "shapefile" data along with your actual data. A plotting tool can then be used to draw out the regional boundaries from the shape data, and then fill each region with a colour according to one of the statistics values associated with it. If you look carefully at the data sets, you will see that where regions are mentioned, they are identified by an explicit Area Code value. These codes may be recognised by mapping tools of the first type, tools such as OpenHeatMap. The ONS also publishes "shape" data for each of the regions identified by an Area Code, which can be used to power mapping tools of the second type. Conveniently, the Census Dissemination Unit, operated by Mimas at The University of Manchester, publishes combined data and shape-data files that make it relatively easy to create your own maps. So for example, here's one I created this morning (it's easy when you know how).

[Figure: Census 2011 demo. An example demo of Census 2011 data.]
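At heart, the second approach is a join on the area code. The sketch below is one way it might look in Python, and it rests on several assumptions of mine: that the boundaries arrive as a GeoJSON-style structure (a shapefile would need converting first), that each feature carries its code in a property I have called `area_code`, and that the codes, coordinates and values are invented stand-ins. `square` and `attach_stats` are hypothetical helpers, not part of any mapping library.

```python
import copy


def square(code, x):
    """Build a toy one-by-one square feature at horizontal offset x."""
    ring = [[x, 0], [x + 1, 0], [x + 1, 1], [x, 1], [x, 0]]
    return {
        "type": "Feature",
        "properties": {"area_code": code},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Toy boundary data standing in for real shape data.
shapes = {
    "type": "FeatureCollection",
    "features": [square("X01", 0), square("X02", 1), square("X03", 2)],
}

# Toy statistics keyed by the same codes; X03 is deliberately missing.
stats = {"X01": 12.0, "X02": 37.5}


def attach_stats(feature_collection, stats, key="area_code", field="value"):
    """Return a copy with each feature's statistic attached, plus unmatched codes."""
    merged = copy.deepcopy(feature_collection)  # leave the input untouched
    unmatched = []
    for feature in merged["features"]:
        code = feature["properties"].get(key)
        if code in stats:
            feature["properties"][field] = stats[code]
        else:
            feature["properties"][field] = None
            unmatched.append(code)
    return merged, unmatched


merged, unmatched = attach_stats(shapes, stats)
print("Areas with no statistic:", unmatched)
```

Reporting the unmatched codes matters more than it might seem: a region that silently fails to join simply ends up uncoloured on the map, which is easy to mistake for a real feature of the data.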

For more about online mapping, see OpenLearn: Visual representations of data and information - Geographical data.

In the field of astronomy, experienced amateur astronomers have in the past made, and continue to this day to make, significant contributions to professional astronomy. In part, this is because there's a big sky out there and there aren't enough professional astronomers to watch it all, all of the time! With the release of ever more public datasets, such as the 2011 Census data, might there be similar opportunities for dedicated amateur statisticians to help make sense of it all? As citizen journalists, might we be able to spot interesting stories within the data, particularly at a local or regional level familiar to us, that may have been missed in national reports? And as creative technologists, might we be able to find new ways of exploring or visualising the data that make it easier to analyse for amateurs and professionals alike?

As a recently advertised job for the "Head of Rich Content" at the ONS suggests, the ONS website should anticipate and meet user needs and expectations, including those of the Citizen User [my emphasis]. Are you one of those users? And if so, how are you using the ONS website in general, and the ONS census website in particular? Let us know via the comments below.

[DISCLAIMER: In the interests of full disclosure, OpenLearn’s Laura Dewis will be leaving the OU to work for the ONS in early 2013. This article reflects the opinions of Tony Hirst, the author, and Laura wasn’t involved in writing or publishing any aspect of it. Everyone at OpenLearn looks forward to seeing how the ONS website develops with Laura on board their team.]

 
