Diary of a data sleuth: Data-scraping the sick bucket
Are more of us getting sick in winter or is it just...
It's that time of year again when the dreaded "norovirus" hits the news, with tales of wards closing and hospital visitors being encouraged to stay away from their loved ones if at all possible. Norovirus, aka the "winter sickness bug", is back, and with a vengeance, it seems (you can read up about it on the NHS Choices website: norovirus).
A quick look at Google Trends, a tool for reviewing the relative volume of search terms entered on the Google web search engine, shows how searches for "norovirus" grow around this time each year...
Of course, this may in part reflect the extent to which people are looking up "norovirus" to see what it is, having heard it referenced in the news. For example, if we look at the coverage of "norovirus"-related stories in the Guardian over the last couple of years, we see how stories mentioning the disease feature prominently at the end of the year, and particularly in the run-up to Christmas...
The story this year appears to be that Winter bug cases [are] '83% up on 2011' based on estimated occurrences since the summer compared to the same period last year, as reported by the Health Protection Agency. In addition, "[t]he figures also show there were 61 outbreaks of norovirus in hospitals in the fortnight up to December 16 - almost double the number in the same period last year when there were 35."
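As a quick sanity check on the "almost double" description, the arithmetic can be run directly. This is only a sketch: the two counts are the ones quoted above, and the variable names are purely illustrative.

```python
# Hospital outbreak counts quoted above (fortnight up to December 16).
outbreaks_2012 = 61
outbreaks_2011 = 35

# How many times larger this year's count is than last year's.
ratio = outbreaks_2012 / outbreaks_2011

# The same comparison expressed as a percentage increase.
pct_increase = (outbreaks_2012 - outbreaks_2011) / outbreaks_2011 * 100

print(f"ratio: {ratio:.2f}x, increase: {pct_increase:.0f}%")
# prints: ratio: 1.74x, increase: 74%
```

So the hospital outbreak count is up by roughly three quarters, which is a little short of a literal doubling.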
So can we find any data to explore these claims in a little more detail? Well, it so happens we can... sort of. Each week, the NHS publishes a winter pressures daily situation report containing a variety of Health Trust-related status reports collected each weekday. This data includes bed closures resulting from "D & V / Norovirus", where I take D & V to mean diarrhoea and vomiting. A couple of weeks ago, I started looking at ways of aggregating this data in a convenient online database, rather than having to download and then try to cope with the officially released spreadsheets. The technique I used to grab the data is often referred to as data-scraping, and the tool I use is called Scraperwiki. (You can read on my blog about some of the trials and tribulations associated with getting the data out of the released spreadsheets and tidied up so it can be put into a database.)
With the news about norovirus cases being up on last year, I had a quick look to see if there was winter sitrep data available from last year, and indeed there is: NHS winter sitrep data 2011/12. Rather conveniently, the spreadsheet format used to release the data this year is the same as the one used last year, so I pointed a copy of my datascraper at last year's data and got that into a database, too.
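To give a flavour of what "getting it into a database" can look like, here is a minimal sketch using only Python's standard library. It is not the Scraperwiki scraper itself: the column names, the trust names and every number in the sample are invented placeholders rather than real sitrep figures, and it assumes the spreadsheet has already been tidied into simple rows.

```python
import csv
import io
import sqlite3

# Placeholder rows standing in for a tidied-up sitrep sheet.
# Column names, trust names and numbers are invented for illustration only.
SAMPLE_CSV = """season,date,trust,beds_closed_norovirus
2011/12,2011-12-15,Trust A,10
2011/12,2011-12-15,Trust B,5
2012/13,2012-12-13,Trust A,20
2012/13,2012-12-13,Trust B,8
"""


def load_rows(conn, csv_text):
    """Insert each CSV row into one table that holds both winters."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sitrep ("
        "season TEXT, date TEXT, trust TEXT, beds_closed INTEGER)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (r["season"], r["date"], r["trust"], int(r["beds_closed_norovirus"]))
        for r in reader
    ]
    conn.executemany("INSERT INTO sitrep VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def totals_by_season(conn):
    """Return total bed closures per season as a dict."""
    cur = conn.execute(
        "SELECT season, SUM(beds_closed) FROM sitrep "
        "GROUP BY season ORDER BY season"
    )
    return dict(cur.fetchall())


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
load_rows(conn, SAMPLE_CSV)
print(totals_by_season(conn))
```

Because both years share the same layout, the same loading function can be pointed at either year's rows, and keeping a season column in a single table makes year-on-year comparisons a one-line query.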
A little bit of tinkering, and I managed to plot a handful of charts showing the number of hospital bed closures due to norovirus outbreaks over the winter period 2011/2012, and since early November of this year (2012). You can see the live charts here, and a snapshot is shown below. The chart on the left shows figures for 2011, the one on the right for 2012. If you click through to the actual charts, you will find they are interactive. The slider at the bottom of the chart allows you to zoom in to a particular date range.
As you can see, norovirus does seem to be having more of an effect than it did last year.
For more examples of how to view the winter sitrep data, see NHS Winter Situation Reports: Shiny Viewer v2.