Author: Tony Hirst

Diary of a data sleuth: Data-scraping the sick bucket

Updated Thursday, 20th December 2012
Are more of us getting sick in winter or is it just hype? Our resident data sleuth set out to unpick the stories of winter bugs in the run up to Christmas.


It's that time of year again when the dreaded "norovirus" hits the news, with tales of wards closing and hospital visitors being encouraged to stay away from their loved ones if at all possible. Norovirus, aka the "winter sickness bug", is back, and with a vengeance, it seems (you can read up about it on the NHS Choices website: norovirus).

A quick look at Google Trends, a tool for reviewing the relative volume of search terms entered on the Google web search engine, shows how searches for "norovirus" grow around this time each year...

Google Trends for Norovirus search terms

(You can also use Google Trends to look up search activity around other terms. Try presents, for example, or trifle, or try them both together).

Of course, this may in part be a reflection of the extent to which people are looking up "norovirus" to see what it is, having heard it referenced in the news. For example, if we look at the coverage of "norovirus"-related stories in the Guardian over the last couple of years, we see how stories mentioning the disease feature prominently at the end of the year, and particularly in the run up to Christmas...

Data snapshots relating to the Norovirus

The story this year appears to be that Winter bug cases [are] '83% up on 2011' based on estimated occurrences since the summer compared to the same period last year, as reported by the Health Protection Agency. In addition, "[t]he figures also show there were 61 outbreaks of norovirus in hospitals in the fortnight up to December 16 - almost double the number in the same period last year when there were 35."
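To put "almost double" into numbers, here is a quick arithmetic check using only the two outbreak counts quoted above. The function name percent_change is just an illustrative choice.

```python
# Hospital outbreak counts quoted in the report, for the fortnight up to 16 December.
outbreaks_2012 = 61
outbreaks_2011 = 35

def percent_change(new, old):
    """Percentage change going from old to new."""
    return (new - old) / old * 100

ratio = outbreaks_2012 / outbreaks_2011
change = percent_change(outbreaks_2012, outbreaks_2011)

print(f"{ratio:.2f} times last year's figure, a rise of {change:.0f}%")
# prints: 1.74 times last year's figure, a rise of 74%
```

So the hospital outbreak count is up by roughly three quarters, which is a large rise but a little short of a doubling.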

So can we find any data to explore these claims in a little more detail? Well, it so happens we can... sort of. Each week, the NHS publishes a winter pressures daily situation report (or "sitrep"), a spreadsheet containing a variety of Health Trust-related status reports collected each weekday. This data includes bed closures resulting from "D & V / Norovirus", where I take "D & V" to mean diarrhoea and vomiting. A couple of weeks ago, I started looking at ways of aggregating this data in a convenient online database, rather than having to download and then try to cope with the officially released spreadsheet. The technique I used to grab the data is often referred to as data-scraping, and the tool I used is called Scraperwiki. (On my blog, you can read about some of the trials and tribulations associated with getting the data out of the released spreadsheets and tidying it up so it can be put into a database.)
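The general idea of scraping a spreadsheet into a database can be sketched in a few lines of Python. This is not the Scraperwiki scraper itself: it assumes a sheet has already been exported as CSV, and the layout (one row per trust, one column per day), the table name sitrep, the trust names, and all the values are invented for illustration.

```python
import csv
import io
import sqlite3

# Invented sample data in a hypothetical layout: one row per trust, one column per day.
SAMPLE = """Trust,2012-12-10,2012-12-11,2012-12-12
Trust A,12,15,9
Trust B,0,4,-
"""

def wide_to_long(text):
    """Turn one-row-per-trust data into (trust, date, beds_closed) records."""
    reader = csv.reader(io.StringIO(text))
    dates = next(reader)[1:]  # the header row holds the reporting dates
    records = []
    for row in reader:
        if not row:
            continue  # ignore empty lines
        trust, values = row[0], row[1:]
        for date, value in zip(dates, values):
            value = value.strip()
            if not value.isdigit():
                continue  # skip blanks and placeholder marks such as "-"
            records.append((trust, date, int(value)))
    return records

def load(records):
    """Store the tidied records in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sitrep (trust TEXT, date TEXT, beds_closed INTEGER)")
    conn.executemany("INSERT INTO sitrep VALUES (?, ?, ?)", records)
    conn.commit()
    return conn

records = wide_to_long(SAMPLE)
conn = load(records)

# With the data in a database, totals per day are a single query away.
daily_totals = conn.execute(
    "SELECT date, SUM(beds_closed) FROM sitrep GROUP BY date ORDER BY date"
).fetchall()
print(daily_totals)
```

Reshaping the data to one record per trust per day is a design choice: it makes it easy to add further weeks, or a further year, to the same table without changing its structure.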

With the news about norovirus cases being up on last year, I had a quick look to see if there was winter sitrep data available from last year, and indeed there is: NHS winter sitrep data 2011/12. Rather conveniently, the spreadsheet format used to release the data this year is the same as the one used last year, so I pointed a copy of my datascraper at last year's data and got that into a database, too.

A little bit of tinkering, and I managed to plot a handful of charts showing the number of hospital bed closures due to norovirus outbreaks over the winter period 2011/2012, and since early November of this year (2012). You can see the live charts here, and a snapshot is shown below. The chart on the left shows figures for 2011, the one on the right for 2012. If you click through to the actual charts, you will find they are interactive. The slider at the bottom of the chart allows you to zoom in to a particular date range.

In each chart, the upper blue line shows reported figures at 8am or 9am on the day of reporting for Beds closed norovirus ("The number of beds closed due to D&V or norovirus-like symptoms"); and the lower red line shows Beds closed unocc ("Of the number of beds closed due to D&V or norovirus-like symptoms, the number of beds that are unoccupied"). These terms are described in the data release notes.
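Because the second measure is defined as a subset of the first, the gap between the two lines is the number of closed beds that still have a patient in them. A minimal sketch of that relationship follows; the class name, field names, and figures are made up for illustration and are not taken from the released data.

```python
from dataclasses import dataclass

@dataclass
class DailyReport:
    date: str
    beds_closed: int        # "Beds closed norovirus"
    beds_closed_unocc: int  # "Beds closed unocc", a subset of beds_closed

    @property
    def beds_closed_occupied(self):
        """Closed beds that are still occupied (the gap between the two lines)."""
        return self.beds_closed - self.beds_closed_unocc

    @property
    def unoccupied_share(self):
        """Fraction of closed beds that are standing empty."""
        if self.beds_closed == 0:
            return 0.0
        return self.beds_closed_unocc / self.beds_closed

# Invented figures, not real sitrep values.
report = DailyReport("2012-12-10", beds_closed=40, beds_closed_unocc=10)
print(report.beds_closed_occupied, report.unoccupied_share)
# prints: 30 0.25
```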

Data snapshots relating to the Norovirus

As you can see, norovirus does seem to be having more of an effect than it did last year.

For more examples of how to view the winter sitrep data, see NHS Winter Situation Reports: Shiny Viewer v2.

For more information on the measures hospitals are likely to put in place around a norovirus outbreak, see the NHS Guidelines for the management of norovirus outbreaks in acute and community health and social care settings.
