1.1 Review of data types

Table 1 below reminds you of the definitions of data types previously described in the module Fundamentals of data for AMR.

Table 1 Key definitions
Term | Definition
Variable

A variable is an attribute used to characterise a data unit. Variables are so named because their values vary from one data unit to another and may change over time. Variables commonly used in AMR data include date of admission, sex, species, production type, sample type and minimum inhibitory concentration (MIC).

Variables can be classified into numeric (quantitative) and categorical (qualitative) variables. The classification of variables as numeric or categorical has implications for how data is analysed and visualised.

Numeric variables

Numeric variables contain only numbers and have meaning as a measurement or a count. Numeric data can be represented as integers (1, 2, 3), fractions (½, ¼), decimals (0.377, 39.134) or percentages (20%, 50%).

Numeric data can be further defined as either discrete or continuous.

Discrete data represent items that can be counted and can only take distinct, separate values (typically whole numbers). Examples of discrete data include the number of hospitalised patients, the number of deaths attributable to resistant pathogens and the number of different antimicrobials to which resistance is identified.

Continuous data represent measurements. Their values are not counted but measured in units, and they can in principle take any value within a range. Examples of continuous data include MIC, zone diameter and milligrams of antimicrobial ingredient per kilo of bodyweight.

Numeric data can be split into categories by applying one or more cut-off values to the data. For example, to determine the susceptibility status of isolates to an antimicrobial, you could apply susceptible and resistant clinical breakpoints to the MIC measurements. This yields three categories: susceptible, intermediate and resistant.
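The cut-off approach described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `categorise_mic`, the breakpoint values (2 and 8 mg/L) and the MIC values are hypothetical, and the convention used here (susceptible if the MIC is at or below the lower breakpoint, resistant if it is above the upper breakpoint) is an assumption rather than something stated in this course. Real breakpoints depend on the organism, the antimicrobial and the standard being applied.

```python
def categorise_mic(mic, s_breakpoint, r_breakpoint):
    """Turn a continuous MIC (mg/L) into a susceptibility category.

    Assumed convention (not taken from the course text):
    susceptible if MIC <= s_breakpoint, resistant if MIC > r_breakpoint,
    intermediate otherwise.
    """
    if s_breakpoint > r_breakpoint:
        raise ValueError("s_breakpoint must not exceed r_breakpoint")
    if mic <= s_breakpoint:
        return "susceptible"
    if mic > r_breakpoint:
        return "resistant"
    return "intermediate"


# Hypothetical MIC measurements (mg/L) and hypothetical breakpoints.
mic_values = [0.5, 2, 4, 8, 16]
categories = [categorise_mic(m, s_breakpoint=2, r_breakpoint=8) for m in mic_values]
print(categories)
```

Applying two cut-off values turns one continuous variable into a three-level categorical variable, so the categorised result should be analysed and displayed as categorical data rather than as numeric data.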

Categorical variables

Categorical variables represent characteristics of distinct groups. Categorical data is represented by a name, a string of alphanumeric characters or numeric values. A numerical code may be given to a categorical variable for analysis purposes (e.g. 1 for female and 0 for male), but these numbers have no mathematical meaning.

Categorical data can be further defined as nominal or ordinal.

Nominal data are entirely qualitative and are unordered. That is, the meaning of the data does not change if the categories are reordered. Examples include sex, production type and antimicrobial class. The categories of a nominal variable are often mutually exclusive, such as female/male or alive/dead.

Ordinal data represents values that are discrete and ordered. These values are ranked, and the order of the ranking is important. The distance between values is not necessarily equal. Examples of ordinal data include socioeconomic status (low income, middle income, high income) or antimicrobial resistance status (susceptible, intermediate, resistant).
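The difference between nominal and ordinal data can be illustrated with a short Python sketch. The records below are hypothetical and the variable names are chosen only for illustration. The sex coding (1 for female, 0 for male) follows the example given above. The numeric ranks assigned to the resistance categories are an assumption used to express their order, not measured quantities.

```python
from collections import Counter

# Nominal variable stored as numeric codes (1 = female, 0 = male).
# The codes are labels, not quantities, so arithmetic on them is meaningless.
sex_labels = {1: "female", 0: "male"}
sex_codes = [1, 0, 0, 1, 1]  # hypothetical records

# Counting how often each category occurs is meaningful for nominal data.
sex_counts = Counter(sex_labels[code] for code in sex_codes)

# Ordinal variable: the ranks encode order only; the distance between
# neighbouring ranks is not assumed to be equal.
resistance_rank = {"susceptible": 0, "intermediate": 1, "resistant": 2}
results = ["resistant", "susceptible", "intermediate", "susceptible", "resistant"]

# Sorting by rank follows the category order rather than alphabetical order.
ordered = sorted(results, key=resistance_rank.get)

# With an odd number of records, the middle value is the median category.
median_category = ordered[len(ordered) // 2]

print(sex_counts)
print(ordered)
print(median_category)
```

Counts are a valid summary for both kinds of categorical data, whereas sorting and taking a median category only make sense when the categories have an order.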

You first need to understand the types of data you have (i.e. discrete, continuous, nominal or ordinal) before preparing to present them in visual formats. This is because the data type determines the formats you can use to summarise and display findings.

To understand your dataset, begin by examining individual records and summarising the data in tables. You may find that summary tables are sufficient for presenting the data, especially if the dataset is small. If the data are more complex, then graphs or maps can help highlight important findings, trends or errors that need to be corrected (such as data entry errors).
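As a rough illustration of summarising records in a table, the following Python sketch counts isolates by sample type and susceptibility category. The records are hypothetical and invented purely for the example, and the layout of the printed table is just one possible choice.

```python
from collections import Counter

# Hypothetical isolate records: (sample type, susceptibility category).
records = [
    ("urine", "susceptible"),
    ("urine", "resistant"),
    ("blood", "resistant"),
    ("urine", "susceptible"),
    ("blood", "intermediate"),
    ("blood", "resistant"),
]

# Count each (sample type, category) combination.
counts = Counter(records)

sample_types = sorted({sample for sample, _ in records})
categories = ["susceptible", "intermediate", "resistant"]  # kept in ordinal order

# One row per sample type, one column per category; missing combinations count as 0.
table = {
    sample: [counts[(sample, category)] for category in categories]
    for sample in sample_types
}

# Print a plain-text summary table.
print(f"{'sample':<8}" + "".join(f"{c:>14}" for c in categories))
for sample, row in table.items():
    print(f"{sample:<8}" + "".join(f"{n:>14}" for n in row))
```

A table like this also makes unexpected entries easier to spot: a misspelled category would not match any column and would be left out of the counts, which is one way data entry errors can surface when the row totals are compared with the number of records.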

Activity 2: Reviewing data types used in AMR analysis

Timing: Allow about 15 minutes
