2.1.3 Box-and-whisker plot
Another way of displaying information about the spread of data is the box-and-whisker plot (also referred to as the boxplot). Boxplots display the five-number summary (minimum, lower quartile, median, upper quartile and maximum) of continuous or discrete data. They consist of:
- a central box that spans from Q1 (lower quartile, 25th percentile) to Q3 (upper quartile, 75th percentile); the range from Q1 to Q3 is known as the interquartile range (IQR)
- a line in the box that marks the median (50th percentile)
- lines (whiskers) that extend from the box either to the smallest (minimum) and largest (maximum) observations that are not outliers, or up to 1.5 times the interquartile range beyond each end of the box, with any values further out treated as outliers (if it is ambiguous which method is used, it is normally stated in the figure's legend)
- outliers, which are data values that are far away from other data values. On a boxplot, outliers are identified by a symbol such as a dot or an asterisk.
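The components listed above can be computed directly from a dataset. The following is a minimal sketch using only the Python standard library. The function name `boxplot_stats`, the sample values and the choice of the "inclusive" quartile method are assumptions made for illustration; statistical packages differ in how they interpolate quartiles, so their results may not match exactly. The sketch uses the 1.5 × IQR rule to separate whisker ends from outliers.

```python
from statistics import quantiles


def boxplot_stats(values, k=1.5):
    """Return the quantities drawn on a boxplot (illustrative helper).

    Needs at least two data values. Values further than k * IQR
    beyond the box are treated as outliers.
    """
    data = sorted(values)
    # Quartiles by linear interpolation ("inclusive" method; an assumption).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Made-up values for illustration only (not taken from Figure 5).
sample = [2, 4, 5, 5, 6, 7, 8, 9, 30]
stats = boxplot_stats(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

In this made-up sample the value 30 lies more than 1.5 × IQR above Q3, so it would be drawn as an outlier symbol and the upper whisker would stop at 9.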
Boxplots are most beneficial when used for a side-by-side comparison of more than one distribution.
Activity 5: Understanding box-and-whisker plots
Timing: Allow about 15 minutes
Look at the boxplots in Figure 5. Can you annotate one of them to show the five-number summary for a distribution?

Figure 5 Biofilm formation by Stenotrophomonas maltophilia isolated from patient samples according to patient and sample types. Biofilm biomass, assessed by spectrophotometric assay after crystal violet assay, was stratified according to (A) patients with or without cystic fibrosis (CF, non-CF) and (B) sample type. (Significance level from Mann-Whitney test: * p < 0.05; ** p < 0.01; *** p < 0.001) (Pompilio et al., 2020)
Box-and-whisker plots showing the biomass of biofilm formed by S. maltophilia isolated from samples from patients with and without cystic fibrosis (CF). The left-hand plot (A) shows two plots obtained from CF and non-CF patients; the right-hand plot (B) shows three plots obtained from CF patients, respiratory samples from non-CF patients and blood samples.
Answer

Figure 6 Annotated Figure 5
Box-and-whisker plots showing the biomass of biofilm formed by S. maltophilia isolated from samples from patients with and without cystic fibrosis (CF). The left-hand plot (A) shows two plots obtained from CF and non-CF patients; the right-hand plot (B) shows three plots obtained from CF patients, respiratory samples from non-CF patients and blood samples. Labels have been added to the left-hand plot (A) to show where the Maximum, Median, Minimum, Q1 and Q3 are.
The box extends from the first quartile (Q1) to the third quartile (Q3), and the line inside the box is plotted at the median. The lines extending from the box are called whiskers; here they extend to the minimum and maximum values in the dataset, excluding outliers. Note that the asterisks in Figure 5 and Figure 6 are not outliers; rather, they indicate the statistical significance of the results.
The strengths and limitations of boxplots are listed below:
Table 9 Strengths and limitations of boxplots
Strengths | Limitations |
---|---|
Summarise large datasets | Exact values cannot be read from the plot |
Summarise the distribution of the data, including its symmetry and skewness | Emphasise the tails of the distribution, which are the least certain points in a dataset |
Show outliers, unlike many other graphs | Do not show many details of the distribution, so are best used in combination with a histogram |
Allow the distributions of different datasets to be compared | |
Are an important tool for exploratory data analysis | |
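The limitation that a boxplot does not show many details of the distribution can be illustrated with a short sketch. The two datasets below are invented for this purpose, and the helper name `five_number_summary` and the "inclusive" quartile method are assumptions. Both datasets share the same five-number summary, so their boxplots would look identical, yet a simple text histogram shows that their shapes differ.

```python
from collections import Counter
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a dataset."""
    data = sorted(values)
    # "inclusive" interpolation is an assumption; other methods exist.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return (data[0], q1, median, q3, data[-1])


# Invented datasets for illustration only.
evenly_spread = [1, 2, 3, 4, 5, 6, 7, 8, 9]
clustered = [1, 3, 3, 3, 5, 7, 7, 7, 9]

summary_even = five_number_summary(evenly_spread)
summary_clustered = five_number_summary(clustered)
print("Same five-number summary:", summary_even == summary_clustered)

# A crude text histogram reveals the difference in shape.
for name, data in (("evenly spread", evenly_spread), ("clustered", clustered)):
    counts = Counter(data)
    print(name)
    for value in range(1, 10):
        print(f"  {value}: {'#' * counts[value]}")
```

The first dataset has one observation at every value, whereas the second is clustered at 3 and 7; only the histogram makes this visible.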