3 The distribution of repeated measurements
As noted in the previous section, if the same quantity is measured repeatedly, the results will generally be scattered across a range of values. This is perhaps best illustrated using a real example. Table 1 shows 10 measurements of a quantity called the ‘unit cell constant’ for an industrial catalyst used in the refining of petrol; this is an important quantity which determines how well the catalyst works, and can be measured by X-ray diffraction techniques. Notice that the cell constant is very small and is measured in nanometres.
Measurement | Cell constant/nm |
1 | 2.458 |
2 | 2.452 |
3 | 2.454 |
4 | 2.452 |
5 | 2.459 |
6 | 2.455 |
7 | 2.464 |
8 | 2.453 |
9 | 2.449 |
10 | 2.448 |
It is often difficult to see patterns in lists or tables of numbers. However, if the data are put into the form of a histogram, the task becomes much easier. The histogram provides a visual representation of the way in which measurements are distributed across a range of values.
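To make the idea concrete, the short Python sketch below groups the ten values from Table 1 into bins and prints a rough text histogram. The bin width of 0.004 nm and the function name `histogram_counts` are illustrative assumptions, not choices taken from the course material; a different bin width would give a differently shaped histogram.

```python
# Ten measurements of the unit cell constant from Table 1, in nanometres.
cell_constants_nm = [2.458, 2.452, 2.454, 2.452, 2.459,
                     2.455, 2.464, 2.453, 2.449, 2.448]

# Assumed bin width of 4 picometres (0.004 nm); not specified in the text.
BIN_WIDTH_PM = 4


def histogram_counts(values_nm, bin_width_pm=BIN_WIDTH_PM):
    """Count how many measurements fall into each bin.

    Values are converted to whole picometres first so that the binning
    uses exact integer arithmetic rather than floating-point division.
    Returns a list of (lower_edge_pm, count) pairs, including empty bins.
    """
    values_pm = [round(v * 1000) for v in values_nm]
    start = min(values_pm)
    n_bins = (max(values_pm) - start) // bin_width_pm + 1
    counts = [0] * n_bins
    for v in values_pm:
        counts[(v - start) // bin_width_pm] += 1
    return [(start + i * bin_width_pm, c) for i, c in enumerate(counts)]


# Print one row per bin; each '#' represents one measurement.
for lower_pm, count in histogram_counts(cell_constants_nm):
    upper_pm = lower_pm + BIN_WIDTH_PM
    print(f"{lower_pm / 1000:.3f}-{upper_pm / 1000:.3f} nm | {'#' * count}")
```

With this choice of bins, five of the ten measurements fall in the bin running from 2.452 nm up to (but not including) 2.456 nm, which already hints at where the results cluster.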
The following examples show four variations on a histogram, using the same core data: distributions for repeated measurements of the unit cell constant of a batch of industrial catalyst. Compare the histograms. What happens when the number of measurements is increased?
The distributions in these examples all give some impression of the spread of the measurements, and the way the results cluster at the peak of the distribution in examples c and d suggests that this peak might represent the average or 'best estimate' value. However, a scientist would want a more quantitative and succinct way to describe such results and to communicate them to other people working on similar problems. The mean and standard deviation are the measures most commonly used to summarise large sets of data.
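As a rough illustration, the sketch below computes both measures for the ten values in Table 1 (a small data set, used here only because it is the one to hand). The function names are hypothetical, and because this section does not say whether the standard deviation should divide by n or by n - 1, the code shows both conventions rather than assuming one.

```python
import math

# Ten measurements of the unit cell constant from Table 1, in nanometres.
cell_constants_nm = [2.458, 2.452, 2.454, 2.452, 2.459,
                     2.455, 2.464, 2.453, 2.449, 2.448]


def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def standard_deviation(values, sample=False):
    """Square root of the averaged squared deviations from the mean.

    sample=False divides by n; sample=True divides by n - 1.
    Which convention applies is an assumption the reader must settle.
    """
    m = mean(values)
    sum_of_squares = sum((v - m) ** 2 for v in values)
    divisor = len(values) - 1 if sample else len(values)
    return math.sqrt(sum_of_squares / divisor)


print(f"mean               = {mean(cell_constants_nm):.4f} nm")
print(f"std dev (n)        = {standard_deviation(cell_constants_nm):.4f} nm")
print(f"std dev (n - 1)    = {standard_deviation(cell_constants_nm, sample=True):.4f} nm")
```

For these data the mean comes out at about 2.4544 nm, and both versions of the standard deviation round to about 0.005 nm, so the two numbers together summarise the table far more compactly than the list of raw values.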