Printable page generated Thursday, 25 April 2024, 11:44 AM
Unless otherwise stated, copyright © 2024 The Open University, all rights reserved.

Sampling (animal health)

Introduction

To generate data on AMR in surveillance programmes or research studies, we first need to make decisions about how, where and from which individuals or groups to collect data. This is known as ‘sampling’.

This module will focus on sampling and how a sample size is determined, including statistical and non-statistical considerations. It will focus on sampling for AMR in livestock and aquatic animal health, but draw comparisons to human health where relevant, as part of the emphasis on the importance of One Health approaches in tackling AMR.

After completing this module, you will be able to:

  • describe the purpose of sampling livestock and aquatic species for AMR
  • explain what factors need to be considered when choosing which livestock and aquatic species to sample for AMR surveillance
  • recognise the lists of priority pathogens suggested for sampling in livestock and aquatic species
  • list the steps involved in sampling livestock and aquatic species for AMR
  • explain the common problems associated with identifying sampling frames and how they can be addressed.

Activity 1: Assessing your skills and knowledge

Timing: Allow about 10 minutes

Before you begin this module, you should take a moment to think about the learning outcomes and how confident you feel about your knowledge and skills in these areas. Do not worry if you do not feel very confident in some skills – they may be areas that you are hoping to develop by studying these modules.

Now use the interactive tool to rate your confidence in these areas using the following scale:

  • 5 Very confident
  • 4 Confident
  • 3 Neither confident nor not confident
  • 2 Not very confident
  • 1 Not at all confident

This is for you to reflect on the knowledge and skills you already have.


1 Sampling basics

In this section we consider the basics of sampling: why and how to do it.

1.1 Why do we need to sample?

Have you ever taken part in a census, such as a national human population census, or a livestock census? A census involves collecting data from every single unit (such as a human, animal, farm) in the population. Conducting a true census is a very resource-intensive activity. What happens if we would like to answer a question such as:

‘How frequently is resistance to third-generation cephalosporins identified in Escherichia coli isolated from farmed chickens?’

Is it necessary to collect data on every single chicken in a country to answer this question? In most cases, it is impractical to conduct a census, especially when we need to collect specimens, such as blood, urine or faecal samples, from every single person or animal in the population.

Fortunately, for the majority of research or surveillance, it is not necessary to conduct a census. Instead, we can select an appropriate sample of subjects from the population. Sampling allows us to make inferences about a larger population. But we can’t just pick any animals we happen to find and expect that this ‘sample’ allows us to make inferences (apply the findings) to the entire population; instead, there are several steps we have to go through to ensure that our sample is representative (accurately reflects the characteristics) of the broader population that we are interested in. Going through these steps is the focus of this section.

1.2 Sampling terminology

How can we determine what constitutes an ‘appropriate’ sample? It is helpful to introduce some terminology as we go through the steps in selecting a sample.

Figure 1 From populations to samples.

First, we need to identify the population in which we are interested – this is known as the ‘target (reference) population’. Then we need to identify how we can choose a sample that is representative of the target population. There are three stages of defining the sample: identifying the source population, selecting the study sample, and choosing the sampling unit.

Activity 2: Understanding sampling

Timing: Allow about 5 minutes

Can you think of a livestock target population you are familiar with or interested in? (If not, try this example: fish farms around the coast of an island country.)

Use the space below to answer the following questions:

  • Can you identify the target population?
  • Now think about the source population you might select. Can you imagine a possible study sample for this source population?
  • Finally, note the possible sampling unit.
Discussion

How did you do with this activity?

If you chose to use the example provided, your target population might have been all the fish farms in the island country. Your source population could be farms in three bays around the island, and the sample could be 500 fish from five farms in each bay. Sampling units could be individual fish, but you might have chosen pooled sampling units such as sediment collected from the bottom of the ponds. This works because bacteria isolated from these environmental samples are likely to be present in the fish living in that environment.

1.3 Sampling and validity

Why is it important to describe and identify these sampling parameters?

Here, we need to think about the ‘validity’ of sampling: how well a measurement represents the true situation. If you have already completed the Fundamentals of data for AMR module, you may recall that the concepts of external validity and internal validity were introduced. These concepts are defined in terms of the relationship between the study sample and the target population.

Figure 2 External and internal validity.

Epidemiological principles of sampling in animal populations are the same as the principles applied to sampling in human populations. In human health, it is similarly necessary to define the target population, the source population, the study sample and the sampling unit. In human health, individuals are the most common lowest-level sampling unit. In animal health, it is somewhat more common to choose a flock, herd or other group as the lowest-level sampling unit: this is because in a herd, all the animals are in the same physical area and exposed to the same risk factors. There are well-established AMR sampling protocols that use farms or slaughterhouses as the lowest-level sampling unit.

Activity 3 has an example of this.

Activity 3: Extracting useful sampling information

Timing: Allow about 15 minutes

Read the following abstract from a study describing surveillance of AMR in poultry (Nguyen et al., 2015). Use the space below to identify the target population, source population, study sample and sampling unit.

Objectives: To describe the prevalence of AMR among commensal Escherichia coli isolates on household and small-scale chicken farms, common in southern Vietnam, and to investigate the association of AMR with farming practices and antimicrobial use (AMU).

Methods: We collected data on farming and AMU from 208 chicken farms. E. coli was isolated from boot swabs and cultured on selective media. E. coli isolates were tested for their susceptibility to 11 antimicrobials. Risk factor analyses were carried out at both the bacterial population and farm levels.

Results: E. coli resistant to gentamicin, ciprofloxacin and third-generation cephalosporins (3GCs) was detected on 201 (96.6%), 191 (91.8%) and 77 (37.0%) of the farms, respectively. Of the 895 E. coli isolates included in the analysis, resistance to gentamicin, ciprofloxacin and 3GCs was detected in 178 (19.9%), 291 (32.5%) and 29 (3.2%) of the isolates, respectively. Ciprofloxacin resistance was significantly associated with quinolone usage and tetracycline usage.

Conclusions: Household and small farms showed frequent AMU associated with a high prevalence of resistance to the most commonly used antimicrobials. Given the weak biocontainment, the high prevalence of resistant E. coli could represent a risk to the environment and to humans.

Discussion

How did you find this exercise? Was there enough information available in the abstract to complete this activity?

The target population is household and small-scale chicken farms in Vietnam.

The source population is not described in this abstract. If you read the full study, it reports that the source population is three districts in the province of Tien Giang.

The study sample is 208 chicken farms.

The sampling unit is a single chicken farm. How do we know this? It is not clearly described in the abstract, but there’s an important clue in the description of the data collection method. ‘Boot swab samples’ were collected from farm workers, which represent the bacteria present on the entire farm. Similar to sediment sampling in fish, if certain types of bacteria are present in the environment in a farm, they are almost certainly present in the majority of the flock. If you are looking for prevalence of particular bacteria in a flock it is therefore often much more efficient to sample the environment (through the boot swab technique) rather than individual animals.
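The choice of sampling unit also changes how prevalence is reported. As a minimal sketch, the counts quoted in the abstract can be turned into farm-level and isolate-level percentages. The variable and function names are illustrative assumptions; only the counts come from the abstract.

```python
# Counts taken from the abstract quoted above (Nguyen et al., 2015).
FARMS_SAMPLED = 208
ISOLATES_TESTED = 895

# antimicrobial -> (farms with resistant E. coli detected, resistant isolates)
resistant_counts = {
    "gentamicin": (201, 178),
    "ciprofloxacin": (191, 291),
    "3GCs": (77, 29),
}


def percent(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Prevalence when the farm is the unit of analysis.
farm_level = {
    drug: percent(farms, FARMS_SAMPLED)
    for drug, (farms, _) in resistant_counts.items()
}

# Prevalence when the individual isolate is the unit of analysis.
isolate_level = {
    drug: percent(isolates, ISOLATES_TESTED)
    for drug, (_, isolates) in resistant_counts.items()
}

for drug in resistant_counts:
    print(f"{drug}: {farm_level[drug]}% of farms, {isolate_level[drug]}% of isolates")
```

The same data give very different figures depending on the unit: a farm counts as positive as soon as one resistant isolate is found on it, so farm-level prevalence is much higher than isolate-level prevalence.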

2 Sampling frames

We now know that in order to sample from a population, we need to first identify the target population and the source (study) population. We then select the study sample. But how do we identify sampling units that form part of the source population? We need a sampling frame.

A sampling frame might be a list of all of the poultry farms in the province(s) selected as the source population, for example. In human health, a sampling frame might consist of a list of all hospitals in a region, along with a list of all the patients being treated at these hospitals (either as inpatients or outpatients, depending on the topic of interest) in a defined time period. Note that because sampling frames include identifying information such as farmers’ names and farm addresses, extra attention must be paid to ensuring that this information is stored and accessed securely by authorised members of the research team or surveillance programme (see the Legal and ethical considerations in AMR data module).

Because sampling frames constitute a data-collection process, they are subject to the same risks of error and bias as all other AMR-related data collection and analysis (see the Fundamentals of data for AMR module for more information on error and bias). Most of the time, available sampling frames do not perfectly correspond to the actual source population: for example, a sampling frame for AMR surveillance in poultry might consist of a list of registered poultry slaughterhouses. However, not all slaughterhouses might be registered – the list might be out of date, or only record slaughterhouses of a minimum processing capacity. Further, not all poultry are processed through slaughterhouses: some are slaughtered at the point of sale in live animal markets, or in backyards for home consumption. In general, a sampling frame should include facilities (such as slaughterhouses or farms) that account for at least 80% of the target population.

Activity 4: Identifying a sampling frame

Timing: Allow about 5 minutes

Imagine that researchers would like to study the proportion of E. coli isolates that are resistant to specific antimicrobials among commercial layer poultry in Java, Indonesia. Commercial layers are not slaughtered until the end of their lay, so they decide to sample on farms.

Can you identify a possible source population and sampling frame for this study?

Discussion

The source population is identified as all commercial layer birds in Java. A possible sampling frame is a Ministry of Agriculture list of all commercial layer farms in Java, which is maintained for auditing purposes.

In reality, sampling frames are not always available or complete, and therefore might not include all units in the source population, and so by extension are not representative of the target population. The lack of a reliable sampling frame is particularly common in livestock and aquaculture studies and surveillance programmes, compared to human clinical studies and public health surveillance. Examples include when there is no list of farms in a region, or when a list of individual animals within a flock, herd or pen does not exist. For instance, fish may need to be selected from a pen, but the fish farm does not keep a list of the individual fish within that pen.

Three potential solutions to achieving representative sampling when no adequate sampling frame exists are summarised below:

  • Random geographic coordinates sampling: An online tool generates a series of randomly selected map coordinates that fall within certain geographic boundaries. Research teams then need to travel to each specific point (or as close as possible). Once the point is reached, sampling occurs as close to that point as possible. This might be used for sampling wild or feral animals, for example.
  • Systematic random sampling: The nth unit is selected from a series of units as each unit presents; for example, selecting every tenth fish in a pen that swims through a race. This is best performed when the total number of sampling units and the required sample size are known. The first unit to be selected should be decided by randomly generating a number. This method is also included as a sampling method in the next section.
  • Stakeholder consultation: In some circumstances, it may be possible to develop a sampling frame through stakeholder consultation. For example, if there is no list of farms in a particular region, it may be possible to arrange a meeting with local leaders or community members and ask them to list the farms they are aware of in the region. This approach is not perfect, but might be preferable to having no sampling frame at all.
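The first of these options, random geographic coordinates sampling, can be sketched in a few lines. This is an illustration only: the function name and the bounding box values are hypothetical, and a real study area is rarely a simple rectangle, so points that fall outside the true boundary (for example, in the sea) would need to be discarded and redrawn.

```python
import random


def random_points(n_points, min_lat, max_lat, min_lon, max_lon, seed=None):
    """Return n_points random (latitude, longitude) pairs inside a bounding box."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return [
        (rng.uniform(min_lat, max_lat), rng.uniform(min_lon, max_lon))
        for _ in range(n_points)
    ]


# Hypothetical bounding box for a small study area (values are placeholders).
MIN_LAT, MAX_LAT = 10.0, 10.5
MIN_LON, MAX_LON = 106.0, 106.5

points = random_points(10, MIN_LAT, MAX_LAT, MIN_LON, MAX_LON, seed=7)
for lat, lon in points:
    print(f"{lat:.5f}, {lon:.5f}")
```

Drawing latitude and longitude uniformly is a reasonable approximation for a small area; over a very large region it would slightly over-represent areas far from the equator.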

3 Sampling methods

Once we have a sampling frame, or an appropriate alternative, we need methods to actually select the sampling units. There are many types; different academic sources report slightly different lists of sampling methods. However, the consensus is that all sampling methods are categorised into two groups: probability and non-probability methods.

3.1 Probability sampling

In probability sampling methods, every sampling unit within the population has the same (or known) probability of being selected. They allow for representative sampling (such that results can be generalised to the target population) and limit selection bias (see the Fundamentals of data for AMR module).

Probability sampling methods include the following:

  • Simple random sampling: Where a sampling frame is available and is used to randomly generate a list of units to be sampled. This can be done manually (such as ‘pulling numbers out of a hat’) or using a random number generator to compile a list of sampling units.
  • Systematic random sampling: Where every nth unit is selected as each unit presents (or appears). For example, this could involve selecting every tenth fish that swims through a race. In public health, this could involve selecting every fifth patient presenting at a primary healthcare facility included in the sampling frame. Both the sampling interval (what value n takes) and starting point (whether systematic sampling starts from the first, second, third or nth unit) should be selected randomly.
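The two methods above can be sketched as follows. The sampling frame of 200 farms and the function names are hypothetical. In this sketch the sampling interval is derived from the frame size and the required sample size (an assumption that suits a known frame), while the starting point is chosen at random.

```python
import random


def simple_random_sample(frame, sample_size, seed=None):
    """Pick sample_size units from the frame without replacement, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, sample_size)


def systematic_random_sample(frame, sample_size, seed=None):
    """Pick every nth unit after a randomly chosen starting point."""
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample_size must be between 1 and the frame length")
    rng = random.Random(seed)
    interval = len(frame) // sample_size  # the 'n' in 'every nth unit'
    start = rng.randrange(interval)       # random start within the first interval
    return [frame[start + i * interval] for i in range(sample_size)]


# Hypothetical sampling frame: a list of 200 registered farms.
sampling_frame = [f"farm_{i:03d}" for i in range(1, 201)]

srs = simple_random_sample(sampling_frame, 20, seed=1)
systematic = systematic_random_sample(sampling_frame, 20, seed=1)

print("Simple random:", srs[:5])
print("Systematic:   ", systematic[:5])
```

When units present one at a time and no list exists (such as fish swimming through a race), the same logic applies: count the units as they pass and take every nth one after the random start.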

Video 1 summarises simple random sampling and systematic random sampling.

Video 1 Simple random sampling and systematic sampling.
  • What is the main difference between simple and systematic random sampling?

  • A simple random sample is drawn using a random number generator, whereas a systematic sample starts at a random number, and then selects every nth unit.

There are important extensions to simple random sampling or systematic random sampling, to allow for probability sampling of primary, secondary and tertiary sampling units (as described in Section 1.2), as required. This includes the following:

  • Multistage sampling: A random sample of primary units is selected, followed by a random sample of secondary units (and then tertiary, and so forth). For example, randomly selecting regions and then randomly selecting farms and then randomly selecting animals from each farm.
  • Stratified random sampling: The source population is divided into mutually exclusive strata based on factors that may affect the outcome, such as geographic region. A known number of units are then randomly selected from each stratum.
  • Cluster sampling: This is similar to multistage sampling, except that all sampling units are sampled in the final stage. That is, the farms are randomly selected, and then all animals on those farms are sampled.

These different sampling methods can be combined. For example, stratified sampling can be included within a multistage sampling design.
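To make the differences between these three designs concrete, here is a minimal sketch using a made-up population of 12 farms in two regions, with 30 animals per farm. All names and numbers are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, rng):
    """Split units into strata, then randomly select n_per_stratum from each."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {name: rng.sample(members, n_per_stratum) for name, members in strata.items()}


def multistage_sample(farms_to_animals, n_farms, n_animals_per_farm, rng):
    """Randomly select farms, then randomly select animals within each chosen farm."""
    chosen = rng.sample(sorted(farms_to_animals), n_farms)
    return {farm: rng.sample(farms_to_animals[farm], n_animals_per_farm) for farm in chosen}


def cluster_sample(farms_to_animals, n_farms, rng):
    """Randomly select farms, then take every animal on each chosen farm."""
    chosen = rng.sample(sorted(farms_to_animals), n_farms)
    return {farm: list(farms_to_animals[farm]) for farm in chosen}


# Hypothetical population: 12 farms, 6 per region, 30 animals each.
rng = random.Random(42)
region_of = {}
farms_to_animals = {}
for i in range(1, 13):
    farm = f"farm_{i:02d}"
    region_of[farm] = "north" if i <= 6 else "south"
    farms_to_animals[farm] = [f"{farm}_animal_{j:02d}" for j in range(1, 31)]

stratified = stratified_sample(list(farms_to_animals), region_of.get, 2, rng)
two_stage = multistage_sample(farms_to_animals, 3, 5, rng)
clusters = cluster_sample(farms_to_animals, 3, rng)

print("Stratified (farms per region):", stratified)
print("Two-stage (animals per farm): ", {f: len(a) for f, a in two_stage.items()})
print("Cluster (animals per farm):   ", {f: len(a) for f, a in clusters.items()})
```

The only difference between `multistage_sample` and `cluster_sample` is the final stage: a random subset of animals versus all animals on the selected farms.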

Video 2 summarises two of the more advanced approaches to probability sampling. Watch the video and then answer the question below.

Video 2 Stratified sampling and multi-stage cluster sampling.
  • Which type of sampling divides the sample population according to individual attributes of a person or animal?

  • Stratified sampling. By contrast, multistage sampling first divides the sample population according to their geographic area or similar variable, but not their personal attributes (such as age or gender).

3.2 Non-probability sampling

In non-probability sampling methods, sampling is done without determining a sampling unit’s probability of being sampled. Non-probability sampling methods should be avoided, as they introduce substantial bias, and greatly limit the applicability of the findings to the target population. There are two broad types of non-probability sampling methods:

  • Convenience sampling is the collection of easily accessible sampling units, such as animals that present to a veterinary clinic, or farms located close to a laboratory with capacity for antimicrobial susceptibility testing (AST).

    Convenience sampling is common in AMR surveillance programmes, but is highly prone to selection bias. For example, farms located close to veterinary laboratories might have different biosecurity practices and other characteristics than farms in more remote areas. Convenience samples are typically poorly representative of the source population, and the findings from convenience samples cannot be generalised to the target population. Therefore, it is difficult to justify convenience sampling, even though it is relatively commonly used. Efforts to identify and select from sampling frames should be promoted over convenience sampling.

  • In purposive sampling, units are deliberately selected because they have particular characteristics. Purposive sampling might be appropriate when dealing with a very rare disease or other health-related characteristic, as it can be impractical to use probability-based sampling in these circumstances. Instead, efforts are made to sample as many sampling units that have the disease or characteristic as possible. For example, a study of the prevalence of AMR in E. coli isolates from meat in South Africa made use of both systematic random sampling and purposive sampling (Jaja et al., 2020):

    ‘A total of 83 and 35 carcasses were sampled in the formal meat sector (FMS) and informal meat sector (INMS) respectively by swabbing the rump, neck, brisket, and flank areas[.] … Systematic random sampling was adopted for the FMS, whereas a purposive sampling technique was adopted for the INMS. The difference in the sampling method is due to the disparity in the number of animals slaughtered in the FMS and INMS.’

    Simply put, when there was a small number of animals, all of them were sampled; however, because numbers are much larger in the formal meat sector, it was possible to use systematic random sampling.

Video 3 summarises what you have learned so far.

Video 3 Sampling from populations.
  • What is the first step in sampling?

  • Defining the population of interest.

4 Sample size calculations

By this point, you have learned how to:

  • describe the target population and source population
  • identify a suitable sampling frame
  • select a robust sampling method to select sampling units (e.g. individual animals) from the sampling frame.

A crucial step in finalising your sample is to determine how many sampling units should be selected. In this section, we will need to introduce some statistical concepts, as we use statistical calculations to define the minimum required sample size. However, there are also non-statistical considerations, which we will review first.

4.1 Non-statistical considerations for determining sample size

There are several non-statistical considerations when determining sample size. Firstly, in most cases, the availability of funding, time and human resources is a fundamental consideration when determining sample size. Secondly, when complete sampling frames do not exist, simple random sampling is impossible and alternative sampling strategies that may require higher sample sizes for equivalent precision are needed.

Most importantly, however, study objectives will ultimately determine the statistical parameters that are acceptable, and certain decisions need to be made before progressing to calculating the required sample size. In studies comparing AMR in one population to another, or temporal trends, one key consideration is the minimum difference that you want to detect. In general, the study should be able to detect the minimum ‘clinically meaningful’ difference. This is a clinical judgement, not a statistical calculation. For example, in a very large study, it might be possible to detect a 1% difference between two groups, but if only a 10% or higher difference would influence clinical practice or be important for public health, then a smaller sample in which a 10% difference can be detected is sufficient. Let’s review an example in Activity 5.

Activity 5: Minimum differences in practice

Timing: Allow about 20 minutes

Figure 3 is taken from a study of AMU over a 12-month period on six cattle farms (labelled A to F) in the United Kingdom (Mills et al., 2018). AMU on farms is of interest to public health because most AMU globally occurs in agriculture, and there is a risk that resistant organisms can be transferred to humans from food animals. Even if your work is not focused on animal health, it is important to be aware of how AMR and AMU are measured and understood across all sectors, given the importance of One Health approaches to combatting AMR.

Figure 3 AMU over a 12-month period on six cattle farms.

Each farm is represented as a separate bar in Figure 3. AMU is calculated using the defined daily dose for animals (DDDvet) unit, which is very similar to the defined daily dose for humans unit. (You can read more about this measurement unit in the Fundamentals of data for AMR and Processing and analysing AMR data modules.)

Knowing that Figure 3 represents AMU on six different farms, how might you describe the magnitude of difference in AMU between farms? Which differences might be large enough to be ‘clinically meaningful’? Use the space below to make notes of your conclusions.

Discussion

Figure 3 shows that Farm A reports no AMU, and Farm D reports very little. Farms C and E have very similar AMU, such that the difference between them might not be considered important. By contrast, Farm B has approximately double the AMU of Farms C and E, and AMU on Farm F is approximately six times higher than Farms C and E.

A twofold or greater difference in AMU might have implications for AMR prevention and control, and thus it would be ‘meaningful’ to detect such differences in a study or surveillance programme. In fact, we might consider the ‘minimum’ important difference to detect to be less than a doubling in the amount of AMU. Perhaps we might be concerned about any farm that had, for example, 30% more AMU than Farm D.

4.2 Statistical determinants of sample size

There are three key statistical parameters that have to be specified when determining sample size. In general, if a study sample is too small, the precision, confidence level and power of a study will fall short. This means you can’t reliably generalise the findings from your sample to the target population. Therefore, you need to understand and decide on appropriate values for each of these parameters in order to calculate the minimum sample size required.

Precision describes the margin of random error around an estimate. Larger sample sizes result in less random error, so larger sample sizes are required for greater precision. Margins of random error of 5% (0.05) or less are commonly used. Precision measures are generally given on a scale of 0 to 1 (or 0% to 100%).

For example, say that researchers want to determine the prevalence of resistance to oxytetracycline in E. coli isolates sampled from commercial layers. The researchers plan to use an appropriate probability sampling strategy to select a representative sample of farms and individual animals. They decide they want their study results to be within 5% of the true prevalence (+/– 5%).

  • What is their desired precision?

  • Their desired precision is 0.05.
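To see how precision drives sample size, here is a minimal sketch of a commonly used formula for estimating a single proportion, n = z² × p × (1 − p) / d², where d is the desired precision. This is an assumption for illustration rather than a method prescribed by this module: it presumes simple random sampling from a large population, and because the example does not state an expected prevalence, the sketch uses p = 0.5, which gives the largest (most cautious) sample size. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_single_proportion(expected_prevalence, precision, confidence=0.95):
    """Minimum sample size to estimate a proportion to within +/- precision."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * expected_prevalence * (1 - expected_prevalence) / precision ** 2
    return math.ceil(n)  # always round up to a whole sampling unit


# Assumed expected prevalence of 0.5, desired precision of 0.05, 95% confidence.
n_precise = sample_size_single_proportion(0.5, 0.05)

# Relaxing the precision to 0.10 needs roughly a quarter of the sample.
n_relaxed = sample_size_single_proportion(0.5, 0.10)

print(n_precise, n_relaxed)
```

Because precision is squared in the denominator, halving the margin of error roughly quadruples the required sample size.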

The confidence level relates to the uncertainty associated with the estimate and is defined as the chance that the margin of error around the estimate contains the true value you are trying to estimate. The higher the confidence level, the less likely the results observed are due to chance. Larger sample sizes result in higher levels of confidence. Confidence levels of 90% or 95% are commonly used.

For example, say that researchers want to compare use of 3GCs in Canadian and Australian piggeries. They are planning on using a probability sampling strategy to select a representative sample of farms in both countries. If a difference in use between the two countries is detected, they want to be 95% confident that this is because a difference truly does exist. What this means is that if the study shows a difference between the two countries, there is still a 5% probability that the difference found is due to chance (or luck) alone. For example, if all the Australian farms with high levels of 3GC use and all the Canadian farms with low levels of 3GC use happened to be selected, the results could suggest a difference that doesn’t actually exist.

The power of a study is the probability of detecting an effect, such as a difference between two groups, if it is truly present in the target population. Larger sample sizes result in more power. In general, powers of 80% or higher are considered acceptable. Look again at the example comparing use of 3GCs in Australian and Canadian piggeries: if there is a true difference in 3GC use, the researchers want to be at least 80% sure that their study will uncover this difference. Their desired power is 80%.
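The link between sample size and power can be sketched with a normal approximation for comparing two proportions. The proportions used here (50% and 55%) are hypothetical values chosen only for illustration, the function name is an assumption, and the approximation ignores the very small chance of detecting a difference in the wrong direction.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(p1, p2, n_per_group, confidence=0.95):
    """Approximate power of a two-sided comparison of two proportions."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - (1 - confidence) / 2)
    p_bar = (p1 + p2) / 2
    # Spread of the difference if there is truly no difference between groups.
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    # Spread of the difference if the two true proportions are p1 and p2.
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    z = (abs(p1 - p2) * sqrt(n_per_group) - z_alpha * sd_null) / sd_alt
    return norm.cdf(z)


# Hypothetical proportions in the two groups being compared.
for n in (500, 1000, 2095):
    print(f"n per group = {n}: power = {approximate_power(0.50, 0.55, n):.2f}")
```

With these assumed proportions, a few hundred units per group would leave the study more likely to miss the difference than to find it, while power rises steadily as the sample grows.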

4.3 Choosing a sample size calculator

To make life easier when designing a study there are several sample size calculators available online, including Epitools and Sample Size Calculators. Which sample size calculator to choose depends on the objectives of a study. Many AMR studies aim to estimate a single proportion, such as the proportion of isolates that are resistant to a particular drug. Some also compare two proportions, for example the proportion of resistant isolates among cases in one region compared with another region. Similarly, AMU studies might aim to measure the total amount of AMU, or compare AMU between farms. There are appropriate sample size calculators available for both of these objectives.

In addition to specifying the desired precision, confidence level and power to calculate a sample size, you may also have to provide values for other parameters. For example, to estimate the sample size required for a study comparing two proportions, you need to provide an input value for the proportion in the baseline (or control) population, and an input value for the proportion in the comparison population. This is where it can be helpful to think about the minimum clinically meaningful difference you wish to detect.

Activity 6: Sample size calculations in practice

Timing: Allow about 30 minutes

Let’s use an online statistical calculator to calculate the sample size for a study comparing two proportions.

Imagine you wish to compare the prevalence of multidrug-resistant Salmonella isolates on pig farms in two neighbouring provinces. You know from a previous study that the prevalence of multidrug resistance in Salmonella isolates in pigs from one of the two provinces is around 50%. You’re not sure about the equivalent figure in the second province, but you are concerned that it is higher than in the first province. You decide it is important to be able to detect a prevalence of multidrug resistance of at least 55% in the second province, to test your suspicions. You would like your study to have a confidence level of 95%, and a power of 90%.

Using this Epitools online calculator, calculate the minimum sample size required to detect a difference in multidrug resistance prevalence between the two provinces. Don’t change the default values for ‘ratio of sample sizes’, or ‘use of 1 or 2 tailed test’ (which should be set at 1, and 2-tailed respectively) – these concepts are beyond the scope of this module.

Hint: remember that a proportion is a decimal between 0 and 1, so 50% is expressed as 0.5. You usually need to enter proportions, not percentages, when calculating sample sizes.

Fill in your values in the calculator and click on ‘Submit’. Use the space below to make notes on what you have done, and on the outcomes of the calculation. Are you surprised by the size of the sample needed to answer your experimental question?

Discussion

The total required sample size is 4268 isolates. This means you would need to collect samples from at least 2134 pigs in each province. In practice, more pigs may need to be sampled than calculated, to compensate for potential issues such as loss of collected samples, issues with sample transport, failure to isolate Salmonella from some of the collected samples, expected bacterial prevalence (because not all animals will normally carry Salmonella – see details in Section 5.4) and so on. This type of consideration is often described as adjusting for the anticipated ‘dropout rate’ of the study. The size of the required sample is likely to have resource implications that need to be considered before the decision is made to proceed.
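As a rough sketch of this adjustment, the number of animals to sample can be inflated by the fraction of samples expected to yield a usable isolate. The isolation rate (20%) and loss rate (5%) below are hypothetical placeholders, not figures from this module, and the function name is an assumption; you would substitute values appropriate to your own setting.

```python
import math


def animals_to_sample(isolates_needed, isolation_rate, loss_rate=0.0):
    """Inflate the required number of isolates to the number of animals to sample.

    isolation_rate: expected fraction of samples that yield the target organism.
    loss_rate: expected fraction of samples lost or spoiled before testing.
    """
    if not 0 < isolation_rate <= 1 or not 0 <= loss_rate < 1:
        raise ValueError("isolation_rate must be in (0, 1] and loss_rate in [0, 1)")
    usable_fraction = isolation_rate * (1 - loss_rate)
    return math.ceil(isolates_needed / usable_fraction)


ISOLATES_PER_PROVINCE = 2134  # from the Activity 6 calculation

# Hypothetical assumptions: Salmonella isolated from 20% of pigs, 5% of samples lost.
pigs_per_province = animals_to_sample(ISOLATES_PER_PROVINCE, isolation_rate=0.20, loss_rate=0.05)
print(pigs_per_province)
```

Under these assumed rates the number of pigs to sample in each province is more than five times the number of isolates required, which shows why such adjustments need to be budgeted for early.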

4.4 Adjusting statistical parameters to suit constraints

In Activity 6, you calculated a sample size for a given set of statistical parameters. However, you may have noticed that the required sample size is quite large. In general, you should always aim to match your resources to the required sample size. That is, if the required sample size is large, endeavour to seek enough funding or capacity to achieve this sample size.

However, key statistical parameters can be altered from an ideal value to something less-than-ideal in order to achieve a more practical sample size that is sufficient (if not perfect) for addressing the research topic or surveillance objective. Reducing precision, confidence level or power, or increasing the minimum difference to detect, will lead to lower sample size requirements.

Note though that in practice, confidence level is rarely set below 90%, and power is rarely set below 80%. As a general guide, you should aim to have 95% confidence, 80% power and 5% precision (where relevant), as well as a minimum detectable difference that is truly clinically meaningful (which might be as small as 5% or as large as 50%, depending on the topic).

  • Think back to the data presented in Activity 5, where there was a twofold difference between some farms and a sixfold difference between others. Which of these clinical differences would require a bigger sample size to demonstrate?

  • The twofold difference, because it is smaller.
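
The link between the size of the difference and the required sample size can be illustrated with a minimal Python sketch. It assumes one common normal-approximation formula for comparing two proportions with equal group sizes and a two-tailed test. The function name and the example prevalences (10% versus 20% for a twofold difference, 10% versus 60% for a sixfold difference) are hypothetical, and online calculators such as Epitools may use a different formula or apply corrections, so the results should not be expected to match theirs exactly.

```python
from math import ceil, sqrt
from statistics import NormalDist


def isolates_per_group(p1, p2, confidence=0.95, power=0.80):
    """Approximate number of isolates per group needed to detect p1 versus p2.

    Assumes equal group sizes, a two-tailed test and a normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-tailed
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2  # average proportion across the two groups
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical prevalences (entered as proportions, not percentages)
twofold = isolates_per_group(0.10, 0.20)
sixfold = isolates_per_group(0.10, 0.60)
print("Twofold difference:", twofold, "isolates per group")
print("Sixfold difference:", sixfold, "isolates per group")
```

Under these assumptions, the smaller (twofold) difference requires many more isolates per group than the larger (sixfold) difference, and raising the confidence level or power increases the requirement further.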

Activity 7: Refining the sample size

Timing: Allow about 15 minutes

Go back to the example and calculation you made in Activity 6. What happens if you:

  • reduce only the confidence level to 90%?
  • reduce only the power to 80%?
  • increase only the proportion in the second province to 70%?
  • make all three of the changes above?
Answer
  • Changing only the confidence level to 90% reduces the total sample size to 3494.
  • Changing only the power to 80% reduces the total sample size to 3210.
  • Increasing only the proportion in the second province to 70% reduces the total sample size to 268.
  • Making all three of the changes above reduces the total sample size to 166.

Were you surprised by how much the sample size changed when increasing the proportion in the second province? Try other changes to the parameter values and see what happens. Reflect on the guidance in the previous section about which parameters might be appropriate to change.
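
To compare the effect of each change, it can help to express the totals from the answer above as a percentage reduction from the original 4268 isolates. The short Python sketch below does only this arithmetic; it does not recalculate the sample sizes, and the function and variable names are illustrative.

```python
baseline = 4268  # total isolates calculated in Activity 6

# Total sample sizes reported in the Activity 7 answer
scenarios = {
    "confidence level 90%": 3494,
    "power 80%": 3210,
    "second province 70%": 268,
    "all three changes": 166,
}


def percent_reduction(original, revised):
    """Percentage by which the revised sample size is smaller than the original."""
    return 100 * (original - revised) / original


for label, total in scenarios.items():
    reduction = percent_reduction(baseline, total)
    print(f"{label}: {total} isolates ({reduction:.1f}% fewer than {baseline})")
```

Lowering the confidence level or the power reduces the total by roughly a fifth to a quarter, whereas increasing the difference to be detected reduces it by more than 90%.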

4.5 Additional considerations when calculating sample sizes

So far this module has introduced you to the concepts and methods for determining a suitable sample size for your research study or surveillance programme. However, there are many other considerations. You should always consult with an epidemiologist or statistician when planning to conduct a sample size calculation, as you may need to consider the following situations:

  • When animals are ‘clustered’ into herds or flocks, or when you have used a multistage or stratified sampling design, this needs to be taken into account in the calculation of the sample size. There are a number of ways to achieve this, depending on the structure of the clustered population.
  • Sometimes you might have multiple objectives for AMR surveillance, and it is important to ensure the overall sample size is sufficient for each of these. For example, you might want to compare differences in AMR between multiple regions, but also compare changes in AMR over time. You will need different sample size calculations for these two objectives. Make sure your calculated sample size meets the minimum required precision, confidence level and power of all your important objectives, not just the main or primary objective.
  • Special methods (known as ‘exact’ methods) may need to be used to calculate sample size if the expected frequency of the resistant organisms or use of a particular antimicrobial being studied is very low; for example, if fewer than five isolates are expected. Where possible, avoid this situation by increasing your sample size.
  • Various things can go wrong when collecting and processing samples. It might be impossible to reach the selected surveillance site due to flooding or bad weather. Specimens might be lost or accidentally destroyed during transport or in the laboratory. Combined, this is variously described as ‘dropout’, ‘loss to follow-up’ or ‘missing data’. As a general rule, increase your sample size by 5–10% above the calculated value to allow for dropouts.
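
Two of these adjustments can be sketched in a few lines of Python. The dropout adjustment follows the 5–10% guide given above. The clustering adjustment uses the design effect, 1 + (m − 1) × ICC, which is one widely used approach and only an assumption here; the cluster size (m) and intracluster correlation (ICC) in the example are hypothetical, and an epidemiologist or statistician should advise on values suited to your population.

```python
from math import ceil


def adjust_for_clustering(n, cluster_size, icc):
    """Inflate n by the design effect 1 + (cluster_size - 1) * icc."""
    design_effect = 1 + (cluster_size - 1) * icc
    # Round first so tiny floating-point errors do not add an extra unit
    return ceil(round(n * design_effect, 6))


def adjust_for_dropout(n, dropout_percent):
    """Increase n by a whole-number percentage, rounding up."""
    return -(-n * (100 + dropout_percent) // 100)  # integer ceiling division


calculated = 4268  # total isolates from Activity 6
print("With 5% dropout allowance:", adjust_for_dropout(calculated, 5))
print("With 10% dropout allowance:", adjust_for_dropout(calculated, 10))

# Hypothetical clustering: 11 animals sampled per farm, ICC of 0.1
print("With clustering:", adjust_for_clustering(calculated, 11, 0.1))
```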

5 Putting it all together: sampling for AMR in livestock and aquaculture

This section puts many of the concepts you have learned in this module into the broader context in which they are applied: designing and conducting AMR studies and surveillance.

5.1 The purpose of sampling animals for AMR

Livestock species

Bacteria, including those resistant to antimicrobials, can be transmitted from food-producing animals to humans, either through ingestion as food, or handling (for example, by slaughterhouse workers). This is the primary reason why AMR surveillance is conducted in food-producing animals. Other reasons to monitor AMR in food-producing animals include:

  • understanding the epidemiology of AMR in the food chain
  • monitoring the impact of AMU in animals
  • providing data for risk analysis for both humans and animals
  • evaluating the impact of interventions that aim to reduce AMR in animals
  • informing treatment guidelines for common bacterial infections in animals.

Aquaculture species

In aquaculture species, the reasons for AMR surveillance are similar to those we considered for livestock. Bacteria and antimicrobial residues in aquatic species can persist in aquatic food products, resulting in entry into the food chain. Additionally, because aquatic species are in water-based systems, resistant pathogens or residues from antimicrobials used for treatment can readily leak into the environment, affecting native species and potentially contaminating water supplies for humans and livestock.

In some situations, livestock and aquatic species may exchange pathogens through integrated fish farming systems. An example is housing poultry directly above fish farms, so that the poultry faeces provide nutrients to the fish (Figure 4).

Figure 4 An example of integrated poultry and fish farming.

5.2 Choosing animals to sample for AMR

The first step to sampling for AMR is to determine the target population. When deciding on the target population, both species and type of farming system should be considered. Intensive poultry systems may have different levels of AMR from free-range village chickens, for example. When resources are limited, consider prioritising the species most commonly consumed as a human food source.

Other factors to consider are surveillance objectives and which products to sample. If the objective is to provide data for an analysis of the risk of human infection by resistant bacteria through consumption of animal products, and if resources are limited, prioritise the animal part that is most commonly consumed. Often this is muscle meat, but in some communities, products such as liver and other offal are popular. If the objective is to estimate the prevalence of resistant bacteria carried by layers in a country, potentially contaminating eggs, handlers and the environment, then faecal or cloacal samples would be the most appropriate.

Consider the most appropriate sampling unit. Sampling at farms allows for collecting detailed data on location, farm management practices, AMU on the farm and other contextual data in addition to collecting samples for bacterial isolation. However, farm-level surveillance is resource-intensive, and sampling frames are often unavailable. Sometimes, effluent from slaughterhouses is sampled. This is an efficient way to sample animals at the point that they enter the food chain. The main disadvantage of this is that additional contextual data are not readily available, because it is not possible to trace isolates identified in effluent to specific farms. Sampling of meat or fish products at markets can also be conducted. Compared to sampling at slaughterhouses, it may be more time consuming to take samples of meat products, and there are similar limitations when there is a lack of traceability of animal products back to the farm where they originated. For aquatic species, environmental samples can also be used to investigate resistance patterns in certain bacterial species. However, it can be difficult to determine the source of resistant pathogens because of the presence of integrated farming and run-off.

Lastly, consider the production stage of the animals to be sampled. Animals that have lived longer are more likely to carry resistant pathogens.

Figure 5 Fish at a market: a possible site for AMR surveillance.

5.3 Choosing pathogens to isolate

There are hundreds of pathogens that could potentially be isolated and tested for antimicrobial susceptibility. The most important livestock animal pathogens in relation to AMR are commonly split into two categories:

  • Commensal bacteria, such as Escherichia coli, Enterococcus faecium and E. faecalis, are carried by all animals and are commonly isolated from animal intestinal contents and faeces. Commensal bacteria are exposed to antimicrobials administered through food and water, and may be a reservoir for transferable resistance.
  • Food-borne zoonotic bacteria, such as Salmonella spp. and Campylobacter spp., occur in animals and cause food-borne infections in humans.

For aquaculture, the World Organisation for Animal Health’s (OIE’s) Aquatic Animal Health Code suggests three pathogenic bacterial species capable of causing disease in humans (i.e. zoonotic) that should be isolated as a minimum.

Individual countries or regions may choose their own list of priority pathogens to isolate from animals and test for antimicrobial resistance. This may include sampling for pathogens that only cause disease in animals, given the importance of these diseases to animal health and trade. In the interest of harmonisation, the OIE has released a list of priority pathogens (Table 1).

Table 1 OIE priority pathogens list for animals.
Type of animals | Bacteria category | Priority pathogens
Livestock | Commensals | E. coli, E. faecium, E. faecalis
Livestock | Zoonotic bacteria | Salmonella spp. and Campylobacter spp., and other bacteria that may be pathogenic to humans such as methicillin-resistant Staphylococcus aureus (MRSA) and Listeria monocytogenes
Aquaculture | Pathogens | Salmonella spp., Vibrio parahaemolyticus and Listeria monocytogenes

5.4 Samples versus isolates

So far, we have been talking about sampling and sample size calculations without specifying whether we mean animal specimens (such as faecal or caecal samples) or bacterial isolates. In general, sample size calculations in AMR studies are based on the number of bacterial isolates.

The type of bacteria you are sampling affects the proportion of samples in which isolates can be expected to be identified, and therefore the final sample size calculation for ‘animal samples’ from which isolates are obtained.

When sampling for AMR in commensal bacteria, there is a very high chance that the target bacterial species will be isolated from all collected samples. For example, caecal samples will almost certainly contain intestinal bacteria such as E. coli and E. faecium. Therefore, one sample is assumed to represent one or more isolates.

In contrast, when sampling for zoonotic bacteria, not all animal samples will contain disease-causing bacteria. For example, Salmonella spp. might be isolated from 40% of poultry samples, whereas Campylobacter spp. might only be isolated from 10% of poultry samples. Therefore, we need to increase the number of animal specimens in inverse proportion to the expected prevalence: the lower the prevalence, the more specimens are needed to obtain the same number of isolates. Activity 8 invites you to consider this.

Activity 8: Sampling in action

Timing: Allow about 5 minutes

How many poultry specimens would you need to analyse if the sample size calculation suggests that 246 Salmonella spp. isolates are needed, and 40% of poultry specimens are expected to contain Salmonella spp.?

Answer

To calculate the total number of animal specimens, divide the number of isolates by the proportion of samples which contain the required bacteria. In this case, we divide 246 by 0.4 (which is the same as 40%), giving us a total sample size of 615 poultry specimens.
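
The same calculation can be written as a small Python function, shown here as a minimal sketch. The function name is illustrative. The first call reproduces the answer above. The second call applies the same 246 isolates to the 10% figure given earlier for Campylobacter spp., which is a hypothetical combination used only to show how quickly the number of specimens grows as prevalence falls.

```python
from fractions import Fraction
from math import ceil


def specimens_needed(isolates_required, expected_prevalence):
    """Number of animal specimens to collect to obtain the required isolates."""
    if not 0 < expected_prevalence <= 1:
        raise ValueError("expected_prevalence must be a proportion between 0 and 1")
    # Fraction(str(...)) avoids floating-point surprises when rounding up
    return ceil(isolates_required / Fraction(str(expected_prevalence)))


print(specimens_needed(246, 0.4))  # Salmonella spp. at 40% prevalence: 615
print(specimens_needed(246, 0.1))  # hypothetical: same target at 10% prevalence
```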

5.5 Steps involved in sampling for AMR

As an overview, the following steps should be considered when sampling livestock or aquaculture species for AMR.

  1. Decide on the source of your sample animals: In most cases, this will be slaughterhouses, but it could also be farms or the market/point of sale.
  2. Identify the target bacterial species: This might be commensal bacteria or zoonotic bacteria species.
  3. Outline a sampling frame: This might be a list of all the farms or ponds in the target population, or all the slaughterhouses. Remember that it is ideal to have at least 80% of the total target population listed in the sampling frame, although this can be difficult to achieve.
  4. Determine a sampling strategy, including how to select sampling units: Consider using probability sampling methods.
  5. Calculate the required sample size: This may require more than one calculation (such as the number of farms and then number of animals on each farm), and may require adjustment in light of non-statistical determinants of sample size. Be sure to consider whether you need to adjust the number of animal specimens to sample if the target bacterial species are not expected to be isolated from all samples.
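
Steps 3 and 4 can be illustrated with a minimal Python sketch, assuming the simplest probability method (simple random sampling without replacement). The sampling frame, farm identifiers and population estimate are hypothetical, and the function names are illustrative only.

```python
import random


def frame_coverage(units_in_frame, estimated_population_units):
    """Proportion of the target population listed in the sampling frame."""
    return units_in_frame / estimated_population_units


def simple_random_sample(sampling_frame, n, seed=None):
    """Select n sampling units without replacement, each equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return rng.sample(sampling_frame, n)


# Hypothetical frame: 850 registered farms out of an estimated 1000 farms
frame = [f"farm_{i:04d}" for i in range(1, 851)]
coverage = frame_coverage(len(frame), 1000)
status = "meets" if coverage >= 0.8 else "is below"
print(f"Frame coverage is {coverage:.0%}, which {status} the 80% guide")

selected = simple_random_sample(frame, 30, seed=1)
print("First five selected farms:", selected[:5])
```

Recording the seed alongside the sampling frame allows the selection to be repeated and audited later.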

Activity 9: Planning your own sampling

Timing: Allow about 30 minutes

Think of a target population of interest to you in your role, or in your country more generally. You might want to use the same example you considered in Activity 2. Think about an important research or surveillance question and use the space below to reflect on the steps for planning sampling:

  1. What are the possible sources of samples? Are there options at slaughterhouses, farms and retail outlets? Which sampling site do you think would be most effective, and why?
  2. Identify your target bacterial species. For your example, is it of more interest to focus on commensal or zoonotic bacteria in your population?
  3. Identify a possible sampling frame, and the source of information for this sampling frame.
  4. What type of sampling method would you use? Are there advantages to using multistage sampling compared to simple random sampling?
  5. What considerations might you need to take into account when preparing a sample size calculation? Do you expect the target bacteria to be present in all the animal samples collected? Do you need to take dropout or clustering into account?
Discussion

How did you get on with this activity?

If you used the fish farm example from Activity 2, you may have considered the following steps:

  1. Sample sources include fish farms, fish slaughterhouses, wet markets or retail outlets. As testing for AMR in aquaculture often happens in the context of an outbreak investigation, it is common to collect samples of diseased fish at farms. In aquaculture, many farmed fish are slaughtered in processing units at the farm, or sold live to consumers who kill the fish at home, so fish slaughterhouses are not a representative sample source. Sampling at wet markets and retail outlets may be feasible, but it is difficult to capture data on AMU when sampling away from farms.
  2. Target bacterial species in fish might include Vibrio parahaemolyticus or Listeria monocytogenes, which infect a wide range of farmed fish and shellfish, and can cause severe disease in humans. These are two of the species OIE recommends for surveillance of AMR in aquaculture (see Table 1).
  3. A sampling frame of fish farms might be obtained by requesting access to the official registry of fish farms maintained by the Ministry of Fisheries or equivalent. A sampling frame of retail outlets might be more difficult to obtain, especially as there are likely to be informal market sellers as well as large outlets.
  4. A multistage sampling strategy is appropriate, where geographical areas such as districts are first selected, and then farms within the district are randomly selected. If you are interested in sampling more than one commodity species (such as tilapia and catfish), you might consider a stratified multistage sample to ensure adequate representation of each commodity species.
  5. The target bacteria cause disease in fish as well as humans, so are unlikely to be present in every collected sample. The sample size will need to be adjusted accordingly. Because most fish farms consist of one or more ponds, the specified number of samples should be collected evenly from each pond, rather than all samples from one pond. Either way, this additional level of clustering will need to be taken into account at design and analysis stages.
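
The multistage selection in step 4 and the even allocation across ponds in step 5 can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions (equal selection probabilities at each stage, no stratification by commodity species); the district names, farm identifiers and function names are hypothetical.

```python
import random


def two_stage_sample(farms_by_district, n_districts, farms_per_district, seed=None):
    """Randomly select districts, then farms within each selected district."""
    rng = random.Random(seed)
    districts = rng.sample(sorted(farms_by_district), n_districts)
    return {
        district: rng.sample(
            farms_by_district[district],
            min(farms_per_district, len(farms_by_district[district])),
        )
        for district in districts
    }


def allocate_evenly(total_samples, n_ponds):
    """Split a farm's samples across its ponds as evenly as possible."""
    base, extra = divmod(total_samples, n_ponds)
    return [base + 1 if i < extra else base for i in range(n_ponds)]


# Hypothetical sampling frame: district -> registered fish farms
frame = {
    "district_A": ["A1", "A2", "A3", "A4"],
    "district_B": ["B1", "B2", "B3"],
    "district_C": ["C1", "C2", "C3", "C4", "C5"],
}
chosen = two_stage_sample(frame, n_districts=2, farms_per_district=2, seed=7)
print(chosen)
print(allocate_evenly(10, 3))  # 10 samples across 3 ponds: [4, 3, 3]
```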

6 End-of-module quiz

Well done – you have reached the end of this module and can now do the quiz to test your learning.

This quiz is an opportunity for you to reflect on what you have learned rather than a test, and you can revisit it as many times as you like. 

Open the quiz in a new tab or window by holding down ‘Ctrl’ (or ‘Cmd’ on a Mac) when you click on the link.

7 Summary

In this module you have learned about different approaches to sampling for AMR studies and surveillance in animals. You have learned about important parameters and best practice methods for sampling animals, and have also learned that many AMR studies and surveillance programmes use less than ideal sampling methods.

You should now be able to:

  • describe the purpose of sampling livestock and aquatic species for AMR
  • explain what factors need to be considered when choosing which livestock and aquatic species to sample for AMR surveillance
  • recognise the lists of priority pathogens suggested for sampling in livestock and aquatic species
  • list the steps involved in sampling livestock and aquatic species for AMR
  • explain the common problems associated with identifying sampling frames and how they can be addressed.

Now that you have completed this module, consider the following questions:

  • What is the single most important lesson that you have taken away from this module?
  • How relevant is it to your work?
  • Can you suggest ways in which this new knowledge can benefit your practice?

When you have reflected on these, go to your reflective blog and note down your thoughts.

Activity 10: Reflecting on your progress

Timing: Allow about 15 minutes

Do you remember at the beginning of this module you were asked to take a moment to think about these learning outcomes and how confident you felt about your knowledge and skills in these areas?

Now that you have completed this module, take some time to reflect on your progress and use the interactive tool to rate your confidence in these areas using the following scale:

  • 5 Very confident
  • 4 Confident
  • 3 Neither confident nor not confident
  • 2 Not very confident
  • 1 Not at all confident

Try to use the full range of ratings shown above to rate yourself:


When you have reflected on your answers and your progress on this module, go to your reflective blog and note down your thoughts.

8 Your experience of this module

Now that you have completed this module, take a few moments to reflect on your experience of working through it. Please complete a survey to tell us about your reflections. Your responses will allow us to gauge how useful you have found this module and how effectively you have engaged with the content. We will also use your feedback on this pathway to better inform the design of future online experiences for our learners.

Many thanks for your help.

Now go to the survey.

References

ACER Consulting [YouTube user] (2018) ‘Epidemiology 101: sampling from populations’, YouTube, 27 July [online]. Available at https://www.youtube.com/watch?v=sXKcaVGtKDU (accessed 16 February 2021).
Course 0HP00 [YouTube user] (2017a) ‘4 Simple random sampling and systematic sampling’, YouTube, 14 March [online]. Available at https://www.youtube.com/watch?v=7yf2HGLrnwQ (accessed 16 February 2021).
Course 0HP00 [YouTube user] (2017b) ‘4 Stratified sampling and multi-stage cluster sampling’, YouTube, 14 March [online]. Available at https://www.youtube.com/watch?v=BEK66aJWvt0 (accessed 16 February 2021).
Epitools, https://epitools.ausvet.com.au/ (accessed 16 February 2021).
Jaja, I.F., Oguttu, J., Jaja, C.-J.I. and Green, E. (2020) ‘Prevalence and distribution of antimicrobial resistance determinants of Escherichia coli isolates obtained from meat in South Africa’, PLOS ONE, 15(5), p. e0216914. doi: 10.1371/journal.pone.0216914 [online]. Available at https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0216914 (accessed 17 February 2021).
Mills, H.L., Turner, A., Morgans, L., Massey, J., Schubert, H., Rees, G., Barrett, D., Dowsey, A. and Reyher, K.K. (2018) ‘Evaluation of metrics for benchmarking antimicrobial use in the UK dairy industry’, Veterinary Record, 182(13), p. 379. doi: 10.1136/vr.104701.
Nguyen, V.T., Carrique-Mas, J.J., Ngo, T.H., Ho, H.M., Ha, T.T., Campbell, J.I., Nguyen, T.N., Hoang, N.N., Pham, V.M., Wagenaar, J.A., Hardon, A., Thai, Q.H. and Schultsz, C. (2015) ‘Prevalence and risk factors for carriage of antimicrobial-resistant Escherichia coli on household and small-scale chicken farms in the Mekong Delta of Vietnam’, Journal of Antimicrobial Chemotherapy, 70(7), pp. 2144–52. doi: 10.1093/jac/dkv053.
Sample Size Calculators, https://sample-size.net/ (accessed 16 February 2021).

Acknowledgements

This free course was collaboratively written by Melanie Bannister-Tyrrell, Emma Zalcman and Clare Sansom, and reviewed by Siddharth Mookerjee, Claire Gordon, Natalie Moyen and Hilary MacQueen.

Except for third party materials and otherwise stated (see terms and conditions), this content is made available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 Licence.

The material acknowledged below is Proprietary and used under licence (not subject to Creative Commons Licence). Grateful acknowledgement is made to the following sources for permission to reproduce material in this free course:

Images

Module image: AndreasReh/Getty Images.

Figures 1 and 2: Ausvet Pty Ltd.

Tables

Table 1: based on information sourced from Chapter 6.8 of the Terrestrial Animal Health Code (2019) and Chapter 6.4 of the Aquatic Animal Health Code (2019), World Organisation for Animal Health (OIE), https://www.oie.int/en/standard-setting/overview/.

Text

Activity 3: Nguyen, V.T., Carrique-Mas, J.J., Ngo, T.H., Ho, H.M., Ha, T.T., Campbell, J.I., Nguyen, T.N., Hoang, N.N., Pham, V.M., Wagenaar, J.A., Hardon, A., Thai, Q.H. and Schultsz, C. (2015) ‘Prevalence and risk factors for carriage of antimicrobial-resistant Escherichia coli on household and small-scale chicken farms in the Mekong Delta of Vietnam’, Journal of Antimicrobial Chemotherapy, 70(7), pp. 2144–52. doi: 10.1093/jac/dkv053. An open access article distributed under the terms of the Creative Commons Attribution licence, http://creativecommons.org/licenses/by/4.0/.

Section 3.2: Jaja, I.F., Oguttu, J., Jaja, C.-J.I. and Green, E. (2020) ‘Prevalence and distribution of antimicrobial resistance determinants of Escherichia coli isolates obtained from meat in South Africa’, PLOS ONE, 15(5), p. e0216914. doi: 10.1371/journal.pone.0216914. An open access article distributed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) licence, http://creativecommons.org/licenses/by/4.0/.

Every effort has been made to contact copyright owners. If any have been inadvertently overlooked, the publishers will be pleased to make the necessary arrangements at the first opportunity.