7.2.1 Mean, median and mode
The mean, median and mode are all types of average and are typical of the data they represent. Each has advantages and disadvantages, and can be used in different situations, but they all give us an idea of the general size of the values involved. Here we provide brief definitions, and some idea of when each should be used.
The following set of data is the number of miles (to the nearest mile) walked in a week by a group of students. You are going to look at how to calculate the mean, median and mode of these values.
Data set A: 1, 2, 2, 3, 4, 5, 5, 6, 6, 6, 6, 9, 11, 12, 14, 15
The mean is found by adding up all the values in a set of numbers and dividing that total by the number of values in the set.
For data set A, the total is 107. There are 16 values, so you need to divide 107 by 16. This gives 6.6875. So, the mean of this set of data is 6.6875 miles.
You also need to consider how many decimal places to quote. It is unlikely that each student walked a whole number of miles; the actual distances have been rounded to the nearest mile. As a result, you can only give the mean correct to one decimal place, or even to the nearest mile. Rounding 6.6875 to one decimal place gives 6.7 miles. If you look at the data, you will see that 6.7 seems to be quite a good average measure of these values.
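If you would like to check this arithmetic, the calculation can be sketched in a few lines of Python. This is only an illustration: the variable names are our own choice and are not part of the definition of the mean.

```python
# Data set A: miles walked in a week (to the nearest mile).
data_set_a = [1, 2, 2, 3, 4, 5, 5, 6, 6, 6, 6, 9, 11, 12, 14, 15]

total = sum(data_set_a)   # add up all the values
count = len(data_set_a)   # number of values in the set
mean = total / count      # divide the total by the number of values

print(total, count, mean)  # 107 16 6.6875
print(round(mean, 1))      # 6.7 (quoted to one decimal place)
```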
Many people would choose to calculate the mean of a set of data if they wanted to get an idea of a typical value of the set, but there are times when the mean isn't as representative as you would like. For instance, if one of these students were a very keen walker, the largest value could have been, say, 50 miles rather than 15. If you replace the 15 by 50 and then recalculate the mean, the answer you get is 8.875, or 8.9 correct to one decimal place. Thus, the single value of 50 has increased the mean from 6.7 to 8.9, so that it is now less representative of the group of data as a whole.
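The effect of a single large value can be seen by recalculating the mean both ways. The sketch below uses Python's standard `statistics` module; the name `with_keen_walker` is just an illustrative label for the altered data.

```python
from statistics import mean

data_set_a = [1, 2, 2, 3, 4, 5, 5, 6, 6, 6, 6, 9, 11, 12, 14, 15]

# Replace the largest value (15) by 50, leaving the other values unchanged.
with_keen_walker = data_set_a[:-1] + [50]

print(mean(data_set_a))        # 6.6875
print(mean(with_keen_walker))  # 8.875
```

Only one of the 16 values has changed, yet the mean has risen by more than 2 miles.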
Another problem with using the mean is that sometimes when you calculate the mean of a set of data, the value you get is difficult to interpret. For example, if you calculate the mean number of children in a group of families, you could get a value like 2.1 or 2.3. Of course, it is not possible to have 2.3 children.
The median is the middle value of a set of numbers arranged in ascending (or descending) order. If the set has an even number of values, then the median is the mean of the two middle numbers.
Data set A is already in ascending order: 1, 2, 2, 3, 4, 5, 5, 6, 6, 6, 6, 9, 11, 12, 14, 15
This is a set of 16 numbers, so the median is the mean of the two middle numbers, which are the 8th and 9th data values (both 6). The median is:
(6 + 6) ÷ 2 = 6
In some ways, this value is more representative of the group. For data set A, the median tells us that half the group walk 6 miles or less and half the group walk 6 miles or more. Here it doesn't matter what the largest (or the smallest) value is; if the largest value were 50, the median would still be 6.
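The rule for finding the median can be written as a short Python function. This is a minimal sketch under our own naming (`median_of` is not a standard function), and it assumes the list contains at least one value.

```python
def median_of(values):
    """Return the middle value, or the mean of the two middle values."""
    ordered = sorted(values)   # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:             # odd number of values: one middle value
        return ordered[mid]
    # even number of values: mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

data_set_a = [1, 2, 2, 3, 4, 5, 5, 6, 6, 6, 6, 9, 11, 12, 14, 15]
with_keen_walker = data_set_a[:-1] + [50]   # largest value 15 replaced by 50

print(median_of(data_set_a))        # 6.0
print(median_of(with_keen_walker))  # 6.0
```

Changing the largest value does not move the two middle values, so the median is unchanged.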
The mode (or modal value) is the most popular value in a set of numbers; that is, the one that occurs most often.
Looking at data set A, you can see that the mode is 6. In other words, more people walked 6 miles than any other distance.
The value of the mode, like the median, is not affected by any particularly large or small values in the set.
While every set of data has a mean and a median, it is not always possible to find a single mode. Some sets of data have only single instances of each value, so there is no mode; other sets have more than one value that occurs equally often, so there is more than one mode.
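These three cases (one mode, no mode, several modes) can be illustrated with a small Python sketch. The function name `modes` and the two short lists at the end are made up purely for illustration; they are not data from this section. The function assumes a non-empty list.

```python
from collections import Counter

def modes(values):
    """Return a list of the values that occur most often (empty if none repeats)."""
    counts = Counter(values)          # how many times each value occurs
    highest = max(counts.values())
    if highest == 1:
        return []                     # every value occurs once: no mode
    return [value for value, count in counts.items() if count == highest]

data_set_a = [1, 2, 2, 3, 4, 5, 5, 6, 6, 6, 6, 9, 11, 12, 14, 15]

print(modes(data_set_a))       # [6]
print(modes([1, 2, 3, 4]))     # []      (illustrative list: no value repeats)
print(modes([1, 2, 2, 3, 3]))  # [2, 3]  (illustrative list: two modes)
```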
The following set of data is the number of hours (to the nearest hour) of television watched by a group of students in a week.
8, 6, 2, 3, 6, 0, 5, 7, 3, 9, 1, 0, 5, 12, 6
Calculate the mean, median and mode.
Arranging the 15 values in ascending order gives:
0, 0, 1, 2, 3, 3, 5, 5, 6, 6, 6, 7, 8, 9, 12
The mean (total of all the hours divided by 15) = 73 ÷ 15 = 4.9 hours (to one decimal place).
The median (here the median is the 8th value) = 5 hours.
The mode (the value which appears most frequently in the set of numbers) = 6 hours.
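You can check these answers with Python's standard `statistics` module, which provides `mean`, `median` and `mode` functions. The variable name `hours` is our own.

```python
from statistics import mean, median, mode

# Hours of television watched in a week (to the nearest hour).
hours = [8, 6, 2, 3, 6, 0, 5, 7, 3, 9, 1, 0, 5, 12, 6]

print(sorted(hours))          # [0, 0, 1, 2, 3, 3, 5, 5, 6, 6, 6, 7, 8, 9, 12]
print(round(mean(hours), 1))  # 4.9
print(median(hours))          # 5
print(mode(hours))            # 6
```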
Table 13: Summary of use of mean, median and mode with each type of variable
|Type of variable||Mean||Median||Mode|
|Discrete – word category||not possible||only if you can order the data||works for most types of category data|
|Discrete – numerical||OK but answer may be difficult to interpret (e.g. 2.3 children)||OK||OK if at least one of the values occurs more than once|
|Continuous||OK||OK||data needs to be grouped|