5 Error, bias and validity

When we conduct an analysis of AMR-related data, we are aiming to estimate quantities such as the true level of AMR in a defined population, or the true relationship between an exposure and an outcome. However, AMR-related data sources, and our methods for collecting, managing and analysing data, are not perfect – in fact, no data sources can be perfect. Error occurs when observed, measured or calculated values differ from the true value.

Sources of error are often divided into three categories:

  • A random error is the error that occurs due to chance alone. It is present in any sample, but its effect is greatest when sample sizes are small. For example, if you toss a fair coin hundreds of times, you would find that close to 50% of your tosses land on ‘heads’, and close to 50% land on ‘tails’. This result approximates the ‘truth’, which is that there is a 50% chance of a ‘heads’ or ‘tails’ result from tossing such a coin. However, if you only tossed a coin three times, it is possible that, just by chance, you get three ‘heads’ in a row. In fact, the probability is quite high: it is one in 2³ (one in eight, or 12.5%). Based on this small sample, you might conclude that the chance of getting ‘heads’ is 100%. This would be random error. Confidence intervals and tests of statistical significance (p-values) are used to assess whether a finding is likely to be due to chance. There is more detail on random error in the Processing and analysing AMR data module.
  • A systematic error (or bias) is an error that represents a consistent deviation from the true value, and can arise during design, data collection, data management or analysis stages in AMR surveillance systems or studies. For example, if only patients who fail first-line antimicrobial treatment are sampled for AST, then estimates of the prevalence of AMR might consistently exceed the true population value.
  • Confounding is a type of bias that occurs when a conceptual error in understanding an observed association between two variables leads to the ‘wrong explanation’. It occurs when the true relationship between an exposure and an outcome is distorted by a third variable, called a ‘confounder’. The confounder is associated with the exposure and independently affects the risk of the outcome, but it is not a consequence of the exposure (it does not lie on the causal pathway between exposure and outcome). A simple example is that grey hair is associated with cancer, but as we know, grey hair doesn’t actually cause cancer. The association is confounded by age, as older people are more likely to have grey hair and are also more likely to get cancer. There are no statistical tests for confounding – it is a type of conceptual error that has to be taken into account when we plan, conduct and report data analyses.
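
The difference between random and systematic error in the list above can be illustrated with a short simulation. The following is a minimal sketch in Python, not a prescribed method. The first part reproduces the coin-toss example (the one-in-eight chance of three ‘heads’, and how the spread of estimates narrows as the number of tosses grows). The second part mimics the AST sampling example. All of its numbers (a true resistance prevalence of 20%, and treatment failure probabilities of 80% for resistant and 10% for susceptible infections) are hypothetical values chosen purely for illustration, as are the function names and the random seed.

```python
import random
from fractions import Fraction


def prob_all_heads(n_tosses: int) -> Fraction:
    """Exact probability that a fair coin lands 'heads' on all n tosses."""
    return Fraction(1, 2) ** n_tosses


def heads_fraction(n_tosses: int, rng: random.Random) -> float:
    """Toss a simulated fair coin n times; return the observed share of 'heads'."""
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


def spread_of_estimates(n_tosses: int, n_repeats: int, rng: random.Random) -> float:
    """Range (max minus min) of the 'heads' share over repeated experiments."""
    estimates = [heads_fraction(n_tosses, rng) for _ in range(n_repeats)]
    return max(estimates) - min(estimates)


def simulate_patients(n, rng, true_prevalence=0.2,
                      fail_if_resistant=0.8, fail_if_susceptible=0.1):
    """Return a list of (resistant, treatment_failed) pairs.

    All three probabilities are hypothetical, illustrative values.
    """
    patients = []
    for _ in range(n):
        resistant = rng.random() < true_prevalence
        p_fail = fail_if_resistant if resistant else fail_if_susceptible
        patients.append((resistant, rng.random() < p_fail))
    return patients


def prevalence(patients) -> float:
    """Share of patients whose infection is resistant."""
    return sum(resistant for resistant, _ in patients) / len(patients)


rng = random.Random(2024)  # arbitrary seed so that the run is repeatable

# Random error: estimates from 3 tosses vary far more than from 1000 tosses.
print(prob_all_heads(3))  # prints 1/8
spread_small = spread_of_estimates(3, 1000, rng)
spread_large = spread_of_estimates(1000, 1000, rng)
print(spread_small, spread_large)

# Systematic error: testing only treatment failures overestimates prevalence,
# and a larger sample does not remove the deviation.
all_patients = simulate_patients(100_000, rng)
failures_only = [p for p in all_patients if p[1]]
prev_all = prevalence(all_patients)
prev_failures = prevalence(failures_only)
print(prev_all, prev_failures)
```

Under these assumed values, the estimate based on all patients should sit close to the true 20%, while the estimate based only on treatment failures should sit well above it however many patients are sampled. This is the practical distinction: increasing the sample size reduces random error, but it does not correct a systematic error in how the sample was selected.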


5.1 Bias in AMR studies