4.2 Statistical determinants of sample size

Three key statistical parameters have to be specified when determining sample size: precision, confidence level and power. In general, if a study sample is too small, the study will fall short on one or more of these, which means you can’t reliably generalise the findings from your sample to the target population. You therefore need to understand and decide on appropriate values for each parameter in order to calculate the minimum sample size required.

Precision describes the margin of random error around an estimate. Larger sample sizes result in less random error, so larger sample sizes are required for greater precision (that is, a smaller margin of error). Margins of random error of 5% (0.05) or less are commonly used. Precision measures are generally given on a scale of 0 to 1 (or 0% to 100%).

For example, say that researchers want to determine the prevalence of resistance to oxytetracycline in E. coli isolates sampled from commercial layers. The researchers plan to use an appropriate probability sampling strategy to select a representative sample of farms and individual animals. They decide they want their study results to be within 5% of the true prevalence (+/– 5%).

  • What is their desired precision?

  • Their desired precision is 0.05.
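As a rough sketch of how the desired precision feeds into a sample size calculation, the Python snippet below uses the standard normal-approximation formula for estimating a proportion, n = z^2 * p * (1 - p) / d^2, where d is the desired precision. The expected prevalence of 0.5 (the most conservative choice) and the 95% confidence level are assumptions made for illustration; neither is stated in the example above, and the function name is hypothetical. The formula also assumes simple random sampling from a large population, so it does not account for clustering of animals within farms.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(precision, expected_prevalence=0.5, confidence=0.95):
    """Minimum n to estimate a proportion to within +/- precision.

    Normal-approximation formula: n = z^2 * p * (1 - p) / d^2.
    Assumes simple random sampling from a large population.
    """
    # Two-sided z multiplier, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = expected_prevalence
    n = z ** 2 * p * (1 - p) / precision ** 2
    return math.ceil(n)  # round up to a whole number of isolates


# Desired precision of 0.05 from the example. The prevalence of 0.5 and the
# 95% confidence level are illustrative assumptions, not study values.
print(sample_size_for_proportion(0.05))  # 385
print(sample_size_for_proportion(0.02))  # tighter precision needs a larger sample
```

Under these assumptions, tightening the precision from 0.05 to 0.02 increases the required sample size several times over, which is why the choice of precision should reflect what the study genuinely needs.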

The confidence level relates to the uncertainty associated with the estimate. It is defined as the chance that the margin of error around the estimate contains the true value you are trying to estimate. The higher the confidence level, the less likely it is that the observed results are due to chance. For a given precision, a higher confidence level requires a larger sample size. Confidence levels of 90% or 95% are commonly used.
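The link between confidence level and sample size can be sketched in Python as follows. The snippet converts a confidence level into its two-sided standard normal multiplier (z) and shows the sample size that the normal-approximation formula for a proportion would then require. The precision of 0.05 and the expected prevalence of 0.5 are illustrative assumptions, the function names are hypothetical, and simple random sampling is assumed.

```python
import math
from statistics import NormalDist


def z_multiplier(confidence):
    """Two-sided standard normal multiplier for a given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size(confidence, precision=0.05, expected_prevalence=0.5):
    """Minimum n for a proportion: n = z^2 * p * (1 - p) / d^2."""
    z = z_multiplier(confidence)
    p = expected_prevalence
    return math.ceil(z ** 2 * p * (1 - p) / precision ** 2)


# Higher confidence levels give larger multipliers and larger sample sizes.
for confidence in (0.90, 0.95, 0.99):
    z = z_multiplier(confidence)
    print(f"{confidence:.0%}: z = {z:.3f}, n = {sample_size(confidence)}")
```

With these assumed inputs, moving from 90% to 95% confidence raises the required sample from 271 to 385.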

For example, say that researchers want to compare use of 3GCs in Canadian and Australian piggeries. They plan to use a probability sampling strategy to select a representative sample of farms in both countries. If a difference in use between the two countries is detected, they want to be 95% confident that this is because a difference truly does exist. This means they accept a 5% probability that the study will show a difference that is due to chance (or luck) alone when no true difference exists. For example, if all the Australian farms with high levels of 3GC use and all the Canadian farms with low levels of 3GC use happened to be selected, the results could suggest a difference that doesn’t actually exist.

The power of a study is the probability of detecting an effect, such as a difference between two groups, if it is truly present in the target population. Larger sample sizes result in more power. In general, a power of 80% or higher is considered acceptable. Look again at the example comparing use of 3GCs in Australian and Canadian piggeries: if there is a true difference in 3GC use, the researchers want to be at least 80% sure that their study will uncover this difference. Their desired power is 80%.
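To illustrate how power enters a sample size calculation, the sketch below uses a common normal-approximation formula for comparing two independent proportions. It assumes the outcome is the proportion of piggeries that use 3GCs and that the researchers expect 30% in one country and 15% in the other; both the outcome measure and these values are hypothetical, chosen only for illustration. The function name is also hypothetical, and equal-sized simple random samples are assumed.

```python
import math
from statistics import NormalDist


def farms_per_group(p1, p2, confidence=0.95, power=0.80):
    """Approximate n per group to detect a difference between two proportions.

    n = (z_alpha + z_beta)^2 * [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2
    Assumes two independent simple random samples of equal size.
    """
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)  # round up to a whole number of farms


# Hypothetical proportions of farms using 3GCs (not taken from the example).
print(farms_per_group(0.30, 0.15))              # 118 farms per country at 80% power
print(farms_per_group(0.30, 0.15, power=0.90))  # higher power needs more farms
```

Under these assumptions, raising the desired power from 80% to 90% increases the number of farms needed in each country, and a smaller expected difference between the countries would increase it further.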
