# 4.3 Numerical variation

When you design for large numbers of people, it becomes impractical to ask every individual what they think. However, the practical methods you have just used can be scaled up using statistical methods. For example, from Activity 8 you now have a desk height that suits you, and you might assume that most people of your size will also like that desk height. To use this result more widely, you need to know how many people are the same size as you, and for that you need statistics.

## Box 4 Working with datasets

A dataset is a collection of data, usually presented in the form of a table. In engineering, numerical data is particularly important.

The range of a dataset is the difference between its largest and smallest values.

It can be useful to distinguish between two different types of data:

- Discrete data is data that can take only certain values.
- Continuous data is data that can take any value (often over a particular range).

For example, metric screws are available in certain thicknesses and lengths. The dataset of screw lengths available might include 12 mm, 16 mm, 20 mm and 25 mm. It would not include 12.312 mm or other lengths in between these standard values. This is discrete data.

In contrast, a person’s height can take any value (though you might reasonably expect it to lie between the two extremes quoted earlier!). The only restriction is the accuracy to which you choose, or are practically able, to measure it. This is continuous data.

## Activity 9 Discrete and continuous data

Identify each of the following as discrete or continuous data:

- a. the number of students in a tutor group
- b. the distance from a person’s home to the nearest bus stop
- c. the price of a ballpoint pen, in pence
- d. wind speed, measured in kilometres per hour.

### Discussion

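- a. Discrete – you can have only a whole number of people.
- b. Continuous – a distance can take any value, limited only by how accurately you measure it.
- c. Discrete – a price in pence can only be a whole number of pence.
- d. Continuous (though if you used the Beaufort scale to measure wind speed the data would be discrete).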

Before carrying out a detailed analysis of any dataset, it is a good idea to examine the values to see if any patterns or unusual values stand out. You should never ignore data because it is different from what you were expecting, but if a dataset does contain surprising values, it is a good idea to question them in case a mistake has been made. You might look for the following:

- Missing data – for instance, if you are given a table of data that has obvious unexpected gaps in it.
- Spurious precision – for instance, if a number is quoted to more significant figures than is plausible, or too few significant figures to be useful for the intended purpose.
- Dubious data, perhaps caused by a misplaced decimal point.
- Coded values, where the data provider may have used a code to indicate something, such as a missing value.
- Constraints – for instance, there may be some good reason why the data has to lie within a particular range.
- The presence of outliers – single values that are very different from the rest of the dataset.
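
Some of the checks above can be sketched in code. The following is a minimal illustration only, not a definitive method: the function name `check_heights`, the plausible range of 1000 mm to 2500 mm and the missing-value code `-999` are all assumptions made for this example. The function flags values for you to question; it does not remove them.

```python
def check_heights(values, low=1000.0, high=2500.0, missing_code=-999):
    """Return (index, value, reason) for each value worth questioning.

    low, high and missing_code are assumed values chosen for
    illustration; pick ones that suit your own dataset.
    """
    flagged = []
    for i, v in enumerate(values):
        if v is None:
            # Missing data: an unexpected gap in the table
            flagged.append((i, v, "missing"))
        elif v == missing_code:
            # Coded value: the provider used a code instead of a measurement
            flagged.append((i, v, "coded value"))
        elif not (low <= v <= high):
            # Constraint: the value lies outside the plausible range,
            # perhaps because of a misplaced decimal point
            flagged.append((i, v, "outside expected range"))
    return flagged


# Hypothetical sample: heights in mm with three deliberate problems
sample = [1671, 1817, None, -999, 173.3, 1733]
for item in check_heights(sample):
    print(item)
```

Here the value 173.3 is flagged as outside the expected range, which is the kind of result a misplaced decimal point would produce. A flagged value should be queried with whoever supplied the data rather than silently discarded.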

Having checked that your dataset looks valid, you can perform calculations on it. For example, it may be useful to find the average of a set of values.

You have probably come across the idea of an average before, and you may also know that there are different kinds of average. The most useful types of average are usually the mean and the median.

## Box 5 Average values: mean and median

Finding the mean of a dataset: To find the mean of a set of numbers, add all the numbers together and divide by however many numbers there are in the set.

Finding the median of a dataset: To find the median of a set of numbers:

- Sort the data into increasing (or decreasing) order.
- If there is an odd number of data values, the median is the middle value.
- If there is an even number of data values, the median is the mean of the middle two values.
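
The two procedures in this box can be written as a short Python sketch. The function names `mean` and `median` are chosen here for illustration, and the example values are the screw lengths quoted in Box 4.

```python
def mean(values):
    """Add all the numbers together and divide by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Sort the data, then take the middle value (or the mean of the middle two)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: the median is the middle value
        return ordered[mid]
    # Even number of values: the median is the mean of the middle two
    return (ordered[mid - 1] + ordered[mid]) / 2


screw_lengths_mm = [12, 16, 20, 25]
print(mean(screw_lengths_mm))    # 18.25
print(median(screw_lengths_mm))  # 18.0
print(median([12, 16, 20]))      # 16
```

Note that with an even number of values the median (18 mm here) need not be one of the values in the dataset.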

These basic statistical tools are useful when it comes to designing for people.

## Activity 10 Statistical methods

Below are the heights of 15 people taken at random from the general population.

Heights (mm): 1671, 1817, 1763, 1733, 1722, 1745, 1773, 1778, 1725, 1696, 1689, 1718, 1735, 1705, 1734.

- a. Write down the minimum and maximum values in this dataset. Hence calculate the range of heights.
- b. Calculate the mean height, based on all the data values. Give your answer to the nearest mm.
- c. State the median height, based on all the data values.
- d. Compare the mean and median heights to your own height, and calculate the difference to each.

### Discussion

a. Minimum = 1671 mm, maximum = 1817 mm.

Range = 1817 mm − 1671 mm = 146 mm.

b. To find the mean value, add the 15 different values together and divide by 15. All heights are in mm.

1671 + 1817 + 1763 + 1733 + 1722 + 1745 + 1773 + 1778 + 1725 + 1696 + 1689 + 1718 + 1735 + 1705 + 1734 = 26 004.

Mean height = 26 004 mm ÷ 15 = 1733.6 mm, which is 1734 mm to the nearest mm.

c. Putting the heights in order gives (in mm): 1671, 1689, 1696, 1705, 1718, 1722, 1725, 1733, 1734, 1735, 1745, 1763, 1773, 1778, 1817.

The median is the middle value, which is 1733 mm.

d. The answer to this part will depend on your own height. The example answer given below is based on a height of 1690 mm.

The mean is 1734 mm, making the difference 1734 mm − 1690 mm = 44 mm.

The median is 1733 mm, making the difference 1733 mm − 1690 mm = 43 mm.
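
If you would like to check this kind of arithmetic by computer, one possible approach is sketched below using Python's standard `statistics` module. The variable names are chosen for illustration, and `own_height_mm` is set to the example height from part (d); replace it with your own height.

```python
import statistics

# Heights of the 15 people in Activity 10, in mm
heights_mm = [1671, 1817, 1763, 1733, 1722, 1745, 1773, 1778,
              1725, 1696, 1689, 1718, 1735, 1705, 1734]

# (a) Range: largest value minus smallest value
height_range = max(heights_mm) - min(heights_mm)

# (b) Mean, rounded to the nearest mm
mean_height = round(statistics.mean(heights_mm))

# (c) Median: the middle value of the sorted data
median_height = statistics.median(heights_mm)

# (d) Differences from your own height (example value; change as needed)
own_height_mm = 1690
diff_to_mean = mean_height - own_height_mm
diff_to_median = median_height - own_height_mm

print("Range:", height_range, "mm")      # Range: 146 mm
print("Mean:", mean_height, "mm")        # Mean: 1734 mm
print("Median:", median_height, "mm")    # Median: 1733 mm
print("Difference to mean:", diff_to_mean, "mm")
print("Difference to median:", diff_to_median, "mm")
```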

Before you move on from this activity, it is important to realise that this is not just a theoretical exercise. The example height used in part (d) of the answer to Activity 10 produces a real value (44 mm) that can be used practically. As you saw in Activity 8, small differences (even smaller than 44 mm) can have a significant effect. It is important to be aware of what the values used actually represent when you use any mathematical methods in design engineering.