1.5 Grouping data

When there are a large number of possible outcomes, it may be necessary to group the data.

For example, if a learner is investigating how many DVDs each of their classmates owns, the values could vary from 0 to 100 (or more).

Entering more than 100 possible outcomes into a table would be very time-consuming, and the results would be difficult to plot on any kind of graph. Grouping the data in a sensible way makes data collection and analysis much more straightforward.

The grouped frequency table below could be used to investigate how many DVDs the learner’s classmates own.

Table 3 How many DVDs do you own?
Number of DVDs | Tally | Frequency
0–5            |       |
6–10           |       |
11–20          |       |
21–40          |       |
Over 40        |       |
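
As a rough illustration of how responses could be sorted into the classes in Table 3, here is a minimal Python sketch. The function name `group_counts` and the list `sample_responses` are invented for this example, and the responses are made-up values rather than real survey data.

```python
# Class intervals from Table 3 as (label, lowest value, highest value).
# None means the class has no upper limit.
CLASSES = [
    ("0–5", 0, 5),
    ("6–10", 6, 10),
    ("11–20", 11, 20),
    ("21–40", 21, 40),
    ("Over 40", 41, None),
]


def group_counts(values):
    """Return a dict mapping each class label to its frequency."""
    frequencies = {label: 0 for label, _, _ in CLASSES}
    for value in values:
        for label, low, high in CLASSES:
            if value >= low and (high is None or value <= high):
                frequencies[label] += 1
                break  # each value belongs to exactly one class
    return frequencies


# Hypothetical responses, invented for illustration only.
sample_responses = [0, 3, 5, 6, 12, 20, 21, 35, 40, 41, 100, 8]

table = group_counts(sample_responses)
for label, frequency in table.items():
    # Print the class, a simple tally of strokes, and the frequency.
    print(f"{label:<8} {'|' * frequency:<6} {frequency}")
```

Each response falls into exactly one class, so the frequencies add up to the number of responses collected.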

Activity 4 Reflecting

Timing: Allow 5 minutes

Did you notice that there was no overlap in the class intervals and that all possible outcomes were included?

Did you notice that the class intervals in the example above are not equal?

Discussion

If there is no upper bound for a numerical category, it is common to use ‘over’ or ‘more than’ to cover all of the possible responses. Similarly, if there is no lower bound, it is common to use ‘less than’; in this example, the first category could have been written as ‘less than 6’.

Depending on how the data is going to be used, it might be necessary to use classes with equal widths. We will discuss this later this week.
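
The two observations from Activity 4 can also be checked with a short calculation. The Python sketch below assumes whole-number data, so the width of a class is taken as the number of values it contains. The variable names are invented for this example, and the open-ended ‘Over 40’ class is left out because it has no upper limit.

```python
# Bounded classes from Table 3 as (lowest value, highest value).
# "Over 40" is omitted because it has no upper limit.
bounded_classes = [(0, 5), (6, 10), (11, 20), (21, 40)]

# For whole-number data, the width is the count of values in the class.
widths = [high - low + 1 for low, high in bounded_classes]

# No gaps and no overlap: each class starts one above where the previous one ends.
no_gaps_or_overlap = all(
    next_low == high + 1
    for (_, high), (next_low, _) in zip(bounded_classes, bounded_classes[1:])
)

# The class widths are equal only if they are all the same number.
equal_widths = len(set(widths)) == 1

print(widths)              # [6, 5, 10, 20]
print(no_gaps_or_overlap)  # True
print(equal_widths)        # False
```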