2 Sampling frames

We now know that, to sample from a population, we first need to identify the target population and the source (study) population. We then select the study sample. But how do we identify the sampling units that make up the source population? We need a sampling frame: a list of the sampling units in the source population from which the study sample can be drawn.

For example, a sampling frame might be a list of all of the hospitals in the province(s) selected as the source population, along with a list of all the patients treated at these hospitals in a defined time period (either as inpatients resident in hospital or as outpatients attending walk-in clinics, depending on the topic of interest). Because sampling frames often include identifying information such as names and addresses, extra attention must be paid to ensuring that this information is stored securely and accessed only by authorised members of the research team or surveillance programme (see the Legal and ethical considerations in AMR data course).

Because compiling a sampling frame is itself a data-collection process, sampling frames are subject to the same risks of error and bias as all other AMR-related data collection and analysis (see the Fundamentals of data for AMR course for more information on error and bias). Most of the time, available sampling frames do not correspond perfectly to the actual source population. For example, a sampling frame for AMR surveillance might consist of a list of all antimicrobial susceptibility tests (ASTs) performed in laboratories on samples collected from hospitalised patients in a city. However, not all patients with resistant infections have a sample collected, and samples that are collected may not be fully tested, that is, may not be tested against all the drugs of interest. Some hospitals have limited capacity and experience with collecting samples for bacterial isolation and AST. Some patients may decline to have their sample tested if they have to pay a fee for the laboratory test. In general, a sampling frame should include facilities, such as hospital sites, that account for at least 80% of the target population.
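To make the 80% guideline concrete, the short Python sketch below estimates the share of a population that is accounted for by the facilities in a sampling frame. The facility names and patient counts are invented purely for illustration, and the function name frame_coverage is an assumption rather than part of any standard tool. In practice, you would substitute the best available estimates for your own setting.

```python
def frame_coverage(population_by_facility, frame_facilities):
    """Return the fraction of the population served by facilities in the frame."""
    total = sum(population_by_facility.values())
    if total == 0:
        raise ValueError("population counts must sum to more than zero")
    covered = sum(
        count
        for facility, count in population_by_facility.items()
        if facility in frame_facilities
    )
    return covered / total


# Hypothetical figures: estimated patients served by each hospital in a city.
population_by_facility = {
    "Hospital A": 500,
    "Hospital B": 300,
    "Hospital C": 150,
    "Hospital D": 50,
}

# Hypothetical sampling frame that lists only two of the four hospitals.
frame_facilities = {"Hospital A", "Hospital B"}

coverage = frame_coverage(population_by_facility, frame_facilities)
meets_guideline = coverage >= 0.80  # the 80% guideline described above
print(f"Frame coverage: {coverage:.0%}; meets 80% guideline: {meets_guideline}")
```

A check like this is only as good as the population estimates behind it, so it is best treated as a rough screening step rather than proof that a frame is representative.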

Activity 4: Identifying a sampling frame

Timing: Allow about 5 minutes

Imagine that researchers would like to study the proportion of E. coli isolates that are resistant to specific antimicrobials among non-hospitalised patients with UTIs in their country (target population). Their study sample consists of all patients who present at selected primary healthcare facilities in one city over a three-month period.

Can you identify a possible source population and sampling frame for this study?

Discussion

The source population might include all non-hospitalised people with UTIs in one or more cities within the country. Most of them might go to a primary healthcare facility, but some might present to a hospital emergency department without being admitted, and some might seek care from alternative or traditional medicine providers.

One possible sampling frame is a Ministry of Health list of all primary healthcare facilities in the city where the study is conducted. The researchers may then have selected primary healthcare facilities from this sampling frame and enrolled all eligible patients at each of these facilities in their study.

A limitation you may have thought of is that there may not be an official register of alternative and traditional medicine providers, so patients who seek care from them would be missing from the sampling frame.

In reality, sampling frames are not always available or complete. An incomplete sampling frame does not include all units in the source population and, by extension, may not be representative of the target population. For example, an area may have a mix of public and private health facilities, but only the public facilities are included in the sampling frame because it is more convenient to do so. For studies or surveillance of AMR or antimicrobial use (AMU) in the community, many settings have no complete list of every person who lives in a village or district of interest. This can make it difficult to collate a comprehensive list of sampling units to include in the sampling frame.

Three potential solutions to achieving representative sampling when no adequate sampling frame exists are summarised below:

  • Random geographic coordinates sampling: An online tool generates a series of randomly selected map coordinates that fall within defined geographic boundaries. Research teams then travel to each point, or as close to it as possible, and sample the unit nearest to that point. This might be used for sampling households in a village where no street addresses are used, for example.
  • Systematic random sampling: Every nth unit is selected from a series of units as they present at sites (e.g. clinics) included in the sampling frame; for example, selecting every tenth patient who presents to a clinic. This works best when the total number of sampling units and the required sample size are both known, because the sampling interval can then be calculated by dividing the former by the latter. The first unit to be selected should be chosen by generating a random number within the first interval. This method is also covered in the next section.
  • Stakeholder consultation: In some circumstances, it may be possible to develop a sampling frame through stakeholder consultation. For example, if there is no list of private (including traditional or alternative) health providers in a particular region, it may be possible to arrange a meeting with local leaders or community members and ask them to list the private practices they are aware of in the region. This approach is not perfect, but might be preferable to having no sampling frame at all.
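
The first two approaches in the list above can be sketched in a few lines of Python. This is an illustrative sketch only, not a replacement for a dedicated sampling tool. The function names, the bounding-box coordinates and the patient labels are all hypothetical. The coordinate function also assumes that the study area can be approximated by a simple rectangle and is small enough that drawing latitude and longitude uniformly gives a roughly even spread of points; real boundaries are usually irregular and would need a proper mapping tool.

```python
import random


def random_points_in_box(n_points, min_lat, max_lat, min_lon, max_lon, rng=None):
    """Draw random (latitude, longitude) pairs inside a rectangular area."""
    if rng is None:
        rng = random.Random()
    return [
        (rng.uniform(min_lat, max_lat), rng.uniform(min_lon, max_lon))
        for _ in range(n_points)
    ]


def systematic_sample(units, sample_size, rng=None):
    """Select every k-th unit, starting from a random point in the first interval."""
    if rng is None:
        rng = random.Random()
    if not 0 < sample_size <= len(units):
        raise ValueError("sample_size must be between 1 and the number of units")
    interval = len(units) // sample_size  # k: total units divided by sample size
    start = rng.randrange(interval)       # random start within the first interval
    return [units[start + i * interval] for i in range(sample_size)]


# Example usage with made-up values.
rng = random.Random(2024)  # fixed seed so the selection can be reproduced

# Five random points inside a hypothetical bounding box around a village.
points = random_points_in_box(5, -1.30, -1.28, 36.80, 36.82, rng)

# Ten of 100 hypothetical clinic attendees, i.e. every tenth patient.
patients = [f"patient_{i}" for i in range(1, 101)]
selected = systematic_sample(patients, 10, rng)

print(points)
print(selected)
```

Fixing the random seed, as in the example, makes the selection reproducible, which helps when documenting how a sample was drawn.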
