Finding and charting data

Updated Monday 4th November 2013

To test assumptions you need to be able to find and analyse relevant data. Tony Hirst shares some techniques.

"Besides everything else, look at the data, look at the facts about the world," enthuses Hans Rosling in Don't Panic, encouraging us all to look to the data to make sense of the world, rather than relying on our preconceived ideas about fertility rates or literacy rates around the world.

That's all very well for a statistician to say, but how would the rest of us even know where to start if we wanted to explore such data visualisations ourselves?

Aside: Many of us encounter numerical data on an everyday basis, but it's not always obvious how we can make use of it, especially if we aren't very comfortable working with numbers. The free OpenLearn unit Using numbers and handling data reviews how we can manipulate and make use of numbers, before describing some straightforward, but still powerful, ways of looking at data using simple charts and summary statistics.

Gapminder

One good place to start is Hans Rosling's own Gapminder site (although if you're on a tablet computer, you may find it doesn't work; try your laptop or desktop machine instead).

For several years, the Gapminder Foundation has taken on the role of a 'modern "museum" on the Internet', curating a collection of datasets relating to international health and development indicators from a range of trusted sources. To help bring the data alive, site visitors can use a range of interactive software tools—most notably the "Trendalyser"—to engage with the data.

The Gapminder interface (image: The Open University, under a Creative Commons BY-NC-SA 4.0 license). Click through for the interactive tool itself.

As well as using the application to animate how the data changes over time, we can also use it to generate static images to illustrate a particular story. For example, the following screenshot shows how we can track the relationship between the average income per person and the number of children per woman (the fertility rate) for different countries over the last two hundred years or so.

Charts and diagrams are used on a daily basis to communicate information—but do we read them as well as we might? Learn more about visual literacy from this free OpenLearn unit on Diagrams, charts and graphs.

In the example shown here, I've highlighted the data relating to Bangladesh, as well as keeping track of how that relationship evolved over time (try it here).

The Gapminder application allows you to compare a whole range of indicators from within the application itself, and share links that include the settings for a specific setup.

Finding the Data

Whilst the Gapminder Trendalyser application is a powerful tool for exploring a range of datasets, for some situations you may want to get hold of the data yourself. Once again, the Gapminder website is a great place to start: all the data we can explore with the application can be downloaded from the site as simple spreadsheet data files.

Downloading Gapminder data (image: The Open University, under a Creative Commons BY-NC-SA 4.0 license).

You may notice that the original source of each data set is also provided. Although there is an increasing amount of data published on the web, finding good quality, trusted data can often be problematic, so it's encouraging to see that Gapminder both describes the provenance of the data it uses and draws that data from reliable sources.

If you are keen to start exploring the data riches that are published by some of the international development agencies, here's a list of sites to get you started:

When looking for data, try to find files that have the suffix .xls or .xlsx (spreadsheet files) or .csv or .tsv (plain text files of comma-separated or tab-separated values, respectively). If the data appears as a table in a PDF or word processor document, it's much harder to get it into a form that you can actually work with, though it can be done if you have the right tools available.
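
If you are comfortable with a little programming, the plain text formats are also the easiest to read directly. The following is a minimal sketch in Python, using only the standard library, of how a .csv or .tsv file might be loaded by choosing the separator from the file suffix. The function names (parse_delimited, load_table), the column names and the sample values are all made up for illustration; they do not come from Gapminder or any other site.

```python
import csv
import io
from pathlib import Path

# Map plain-text file suffixes to the character that separates values.
DELIMITERS = {".csv": ",", ".tsv": "\t"}


def parse_delimited(text, suffix):
    """Parse CSV or TSV text into a list of dicts keyed by the header row."""
    delimiter = DELIMITERS.get(suffix.lower())
    if delimiter is None:
        raise ValueError(f"Not a plain-text table format: {suffix}")
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def load_table(path):
    """Read a .csv or .tsv file from disk, picking the delimiter by suffix."""
    path = Path(path)
    with path.open(newline="", encoding="utf-8") as handle:
        return parse_delimited(handle.read(), path.suffix)


# Made-up sample values, for illustration only.
sample_csv = "country,year,children_per_woman\nExampleland,1973,6.9\n"
rows = parse_delimited(sample_csv, ".csv")
print(rows[0]["country"], rows[0]["year"])
```

Every value comes back as text, so numbers need converting (for example with float) before you chart or summarise them. Spreadsheet files (.xls, .xlsx) are not covered by this sketch; they need a spreadsheet application or a dedicated library.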

Motion Charts via Google Charts

In recent years, the type of chart that Hans Rosling and the Gapminder Foundation popularised has started to appear more widely as a chart type known as a motion chart. In the same way that you can use a spreadsheet to generate line charts or bar charts from your own data, some spreadsheet applications also support motion charts.

For example, if you want to create a motion chart from your own data that is contained in a Google Spreadsheet, all you need to do is make sure that the data is organised in the following way:

  • the first column should contain the names of the things you want to track, such as country names.
  • the second column must contain date values; this might just be a year (for example, a number formatted column containing a value such as 1973), a full date (as a date formatted column), a week number (for example, 2008W03, as a string column type) or a quarter (for example, 2008Q3, again as a string column type).
  • all subsequent columns should contain either numbers, which will be available in the X, Y, Color and Size axes of the motion chart, or text, the options for which will only appear in the Color menu.
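
Downloaded data is not always laid out this way. As a rough sketch, and assuming your file has one row per country and one column per year (an assumption about your data, not something every source guarantees), the Python below reshapes such rows into the name, date, value layout described above. The function name wide_to_long, the indicator name and the sample values are hypothetical, and full dates are not handled.

```python
import re

# Date labels this sketch accepts: a plain year (1973), a week (2008W03)
# or a quarter (2008Q3). Full dates are deliberately left out.
DATE_PATTERN = re.compile(r"^\d{4}(W\d{2}|Q[1-4])?$")


def wide_to_long(wide_rows, name_key, indicator):
    """Reshape one-row-per-name, one-column-per-date rows into a table with
    the name in column one, the date in column two and the number after."""
    table = [[name_key, "date", indicator]]  # header row
    for row in wide_rows:
        for column, value in row.items():
            if column == name_key or value == "":
                continue  # skip the name column itself and empty cells
            if not DATE_PATTERN.match(column):
                raise ValueError(f"Unrecognised date label: {column}")
            # Plain years become numbers; week and quarter labels stay text.
            date = int(column) if column.isdigit() else column
            table.append([row[name_key], date, float(value)])
    return table


# Made-up sample values, for illustration only.
wide = [{"country": "Exampleland", "1972": "7.0", "1973": "6.9"}]
table = wide_to_long(wide, "country", "children_per_woman")
for line in table:
    print(line)
```

The resulting rows can be written out as a .csv file and imported into a spreadsheet, with further indicators added as extra columns to the right.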
Example Google motion chart (image: The Open University, under a Creative Commons BY-NC-SA 4.0 license). Click through for the real version.

Here is an example of a motion chart in Google Spreadsheets; unfortunately, as with the Gapminder Trendalyser application, it's unlikely to work on many tablet computers.

Google Motion Charts can also be generated from visualisation toolkits associated with data analysis programming tools such as R. This is a little too involved to describe here, but if you're interested, these links should get you started: googleVis toolkit (R), First steps of using googleVis on shiny.

Web Native Motion Charts

One of the first world problems associated with the Gapminder Trendalyser and Google motion charts is that they don't work on many tablet computers because they depend on specific browser plugins (notably Adobe Flash). So what options are left to us if we want to make use of such visualisations, safe in the knowledge that we will actually be able to see them working on a wider range of computers?

One possibility is to make use of one of the growing number of visualisation libraries that have been developed to support rich interactive data visualisations using standardised (and near universal) web technologies.

For example, the following motion chart has been constructed using the popular d3.js visualisation toolkit (click here for the interactive version).

d3.js motion chart (image: The Open University, under a Creative Commons BY-NC-SA 4.0 license).

Whilst it's getting easier to generate such charts from data analysis tools such as R (in this case, using the rCharts library, for example), components such as the motion chart still require a certain amount of programming knowledge to get the data into them and the charts actually up and running.

So what's stopping you...?

Once upon a time it was quite hard for anyone other than academics or public servants operating at national or international agencies to get hold of timely, and good quality, data. With more and more data being published onto the web by these agencies, we are now in a position where we can start learning directly from the data ourselves, as well as checking the stories that others tell us based on their reading of—or spin applied to—the data.

Now you've seen where the data lives, as well as some ways of quizzing it using free online interactive data visualisation tools. So which of your assumptions are you now going to put to the data test?

If you would like to learn more about data visualisation, why not try this free OpenLearn unit on Visual representations of data and information?
 
