1.6.3 Don’t jump to conclusions
Time-series graphs are popular with newspapers for suggesting and comparing trends. But showing how a single quantity varies with time is not the same as showing how two quantities vary, and then suggesting a link between them.
Graphs showing the variation of two things with time often use two different vertical scales. You saw an example of this in the graph in Figure 30 charting the number of medical staff. Figure 34 shows an example taken from a national newspaper. This graph was included in a front-page article suggesting that there is a link between the level of unemployment among young men and the number of offenders committing burglaries. The way the graph has been drawn seems unambiguously to support the claim that when unemployment rises so does crime and, by virtue of the closeness of the shape of the two curves, carries the strong implication that indeed unemployment causes crime. You are not, of course, expected to draw the alternative conclusion that an increase in crime causes an increase in unemployment!
But you should not jump to conclusions. First, look carefully at what the graph shows and read off the information that is actually there. Along the bottom, the scale represents the years 1977 to 1990. The vertical axis on the left-hand side shows the level of unemployment among men under 25 years old, expressed as a percentage. Notice that the scale divisions are 4% apart, except for the top one, which is 2%, although this may be a misprint and ‘20’ should have been printed as ‘22’.
On the right-hand side, the scale shows the number of offenders per 100,000. Note that the graph on its own does not make clear just what this scale means. Is it the number of offenders per 100,000 men under 25, or might it be the number of offenders per 100,000 unemployed men under 25? The graph gives no clues, so you would have to look elsewhere for clarification, which emphasises the point that all graphs are part of a wider context. Now look at the line graphs themselves. There are two lines, one relating to the left-hand scale and one relating to the right-hand scale. The two vertical scales have been chosen so that both lines occupy roughly the same vertical height and, if you look at the bottom left of the graph, start together. The conclusion you are invited to draw, of course, is that as unemployment goes up, so does crime, with the further implication that it is the unemployed who turn to crime. To draw that conclusion you are asked to compare trends, but detailed comparison is difficult because the two lines are plotted against different vertical scales. The visual impression created by the way the graphic has been drawn encourages you to think that there is a strong causal link between two different trends.
But you can play the same game. Figure 35 shows the same data, only now the vertical axes are both scaled in percentages. The left-hand axis still shows the percentage level of unemployment, but now the right-hand axis shows the number of offenders expressed as a percentage. This time you could argue that the graph tells quite a different story – that the level of crime is hardly affected at all by the level of unemployment. In spite of a significant 13% (or is it 15%?) increase in joblessness between 1979 and 1983, the number of offenders increased by less than 0.2%.
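The rescaling behind this comparison is simple arithmetic, and a minimal Python sketch may make it concrete. The function name and the figure of 200 offenders per 100,000 are illustrative assumptions, not values read from the newspaper graph; 200 is chosen only because it corresponds to the 0.2% mentioned above.

```python
def per_100k_to_percent(rate_per_100k: float) -> float:
    """Convert a rate per 100,000 into a percentage.

    'Per 100,000' divided by 1000 gives 'per 100', that is, per cent.
    """
    return rate_per_100k / 1000


# Illustrative value only (an assumption, not data from the graph):
# a rise of 200 offenders per 100,000 is a rise of 0.2 percentage points.
illustrative_rise_per_100k = 200
rise_in_percent = per_100k_to_percent(illustrative_rise_per_100k)
print(f"A rise of {illustrative_rise_per_100k} per 100,000 is {rise_in_percent}%")
```

A change that fills the whole height of an axis marked in offenders per 100,000 becomes almost invisible when the same numbers are plotted on an axis that runs from 0% to 20%.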
Even the original graphic starts to tell a different story towards the end of the 1980s, revealing that in 1990 the level of unemployment had dropped back almost to the 10% level of 1980, while crime was not far below its 1984 peak. The strong visual impression of the two overlaid graphs and the apparent close match between 1977 and 1983 work to divert attention from what is going on in the last years of the decade. What the graphic actually shows is no more and no less than two separate time-series graphs that have been drawn in the same place. There may or may not be a causal link between crime and unemployment, but graphical similarity on its own does not tell you about cause. For that you need additional knowledge about the factors and forces that influence an actual, real-world situation.
Comparing trends requires a notion that the variables plotted against time are somehow related. But any such relationship must be established elsewhere: the graph itself cannot do it. A graph is a presentational device; all it can do is display data in a chosen format. Graphs are drawn by people, and it is people who decide what a graph shows and how it shows it. There is nothing inevitable about a graph.
Graphical representations are not restricted to two or three variables. The story a graphic tells can be dramatically enhanced by allowing additional variables to play a subtle counterpoint to the main theme. Perhaps one of the most famous multivariate narrative graphics, using six plotted variables, is the one by Minard examined in the next activity.
Activity 13: The fate of Napoleon’s army
Study the reader article ‘Narrative graphics of space and time’ by Edward Tufte. Use Minard’s graphic showing the fate of Napoleon’s army to tackle these questions.
When did the army reach Smolensk during the retreat from Moscow?
What was the size of the army at that stage, as a percentage of the size at the beginning of the campaign?
The winter was bitterly cold. Minard’s graphic gives the temperature in degrees on the Reaumur scale (which we shall write as °R). On this scale, water freezes at 0 and boils at 80 °R. What was the lowest temperature in °C that the army had to endure?
How many of the men who were at Moscow made it back to the River Niemen?
Napoleon's army reached Smolensk on the 14th of November 1812, with about 24 000 men, less than 6 per cent of the 422 000 who started the campaign.
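The percentage can be checked with a short calculation. This is a minimal sketch using the two figures quoted in the answer; the variable names are only illustrative.

```python
# Figures quoted in the answer above.
army_at_start = 422_000
army_at_smolensk = 24_000

# Size at Smolensk as a percentage of the starting size.
share_percent = army_at_smolensk / army_at_start * 100
print(f"{share_percent:.1f}% of the original army reached Smolensk")
```

The result is a little under 5.7%, which is consistent with the statement that fewer than 6 per cent of the men remained.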
The lowest temperature indicated by Minard is −30 °R, on 6 December. The Celsius scale is directly proportional to the Reaumur temperature scale. The freezing point of water is the same on both scales, but the boiling point is different. On a conversion graph with degrees Reaumur plotted horizontally, and degrees Celsius plotted vertically, the two fixed points are (0, 0) and (80, 100). If you sketch the graph and work out the formula you should be able to convert between the two scales. The straight line through these two points has gradient 100/80, so the formula is
temperature in °C = (100/80) × temperature in °R.
So a temperature of −30°R is equivalent to −30 × (100/80)=−37.5°C.
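The conversion can also be written as a small Python function. This is a sketch based only on the two fixed points given above, (0, 0) and (80, 100); the function name is an illustrative choice.

```python
def reaumur_to_celsius(temp_r: float) -> float:
    """Convert degrees Reaumur to degrees Celsius.

    The scales are directly proportional: 80 °R corresponds to 100 °C,
    and both scales put the freezing point of water at 0.
    """
    return temp_r * 100 / 80


# Check the two fixed points, then convert Minard's lowest temperature.
freezing_c = reaumur_to_celsius(0)
boiling_c = reaumur_to_celsius(80)
lowest_c = reaumur_to_celsius(-30)
print(freezing_c, boiling_c, lowest_c)
```

Checking the fixed points first is a quick way to confirm that the gradient is the right way up: 100/80 for Reaumur to Celsius, not 80/100.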
The numbers on Minard's graphic indicate that 100 000 men left Moscow in October 1812 for the long march home. Only 20 000 were left by the time they were joined at Bobr (Bobruysh) by a 30 000 strong column from the north, boosting the army to 50 000 men. Of this number, two out of every five men had marched from Moscow. 22 000 were lost crossing the Berezina River in the sub-zero temperatures, and of the survivors only 4000 reached the Niemen River, their starting point six months earlier. Assuming that this number contained the same proportion of men, two out of five, who had left Moscow, then of that 100 000 strong force only 1600 were still with the army. Death and desertion accounted for the rest. Remember also that women and children would have accompanied the army, as well as others not directly part of the fighting force. Their numbers are not recorded, but we must assume that they shared the soldiers’ fate.
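The arithmetic in this answer can be laid out step by step. The sketch below uses only the figures quoted above, and it makes the same assumption as the text: that men who had marched from Moscow made up the same two-fifths of the final survivors as they did of the combined army at Bobr. Variable names are illustrative.

```python
from fractions import Fraction

# Figures quoted in the answer above.
left_moscow = 100_000
from_moscow_at_bobr = 20_000
northern_column = 30_000
lost_at_berezina = 22_000
reached_niemen = 4_000

# Combined army after the northern column joined.
combined_at_bobr = from_moscow_at_bobr + northern_column

# Proportion of the combined army that had marched from Moscow.
moscow_fraction = Fraction(from_moscow_at_bobr, combined_at_bobr)

# Men remaining after the crossing of the Berezina.
after_berezina = combined_at_bobr - lost_at_berezina

# Assumption from the text: the same proportion holds at the Niemen.
moscow_men_at_niemen = int(reached_niemen * moscow_fraction)

# Survivors as a percentage of those who left Moscow.
survival_percent = moscow_men_at_niemen / left_moscow * 100

print(combined_at_bobr, moscow_fraction, after_berezina)
print(moscow_men_at_niemen, f"{survival_percent:.1f}%")
```

On these figures, about 1.6% of the men who left Moscow were still with the army at the Niemen. The estimate depends entirely on the proportionality assumption, which Minard's graphic cannot confirm.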
Minard's graphic speaks eloquently and poignantly of a human tragedy of the highest proportions.