OLCreate: Fleming Fund Q Fundamentals of data for AMR: 2.1.2 Variables

A variable is any characteristic or attribute of a data unit that can be measured. The term variable is used because the value can vary from one data unit to another and may change over time. Variables commonly used to characterise a patient in a hospital include ‘age’, ‘gender’, ‘date of admission’, ‘body temperature at admission’ and ‘primary diagnosis’. Variables used to characterise an individual animal might include ‘age’, ‘sex’, ‘breed’, ‘production type’, ‘farm identification number’ and ‘herd size’. Variables relevant to a laboratory that performs AST might include ‘date of sample’, ‘specimen identification number’, ‘sample type’, ‘minimum inhibitory concentration’ and many others.

Activity 3: Common variables for characterising data units

Timing: Allow about 5 minutes

In your workplace, what data units are most often used and what are some of the common variables that might be relevant for characterising these units?

Some variables can be observed or recorded directly – for example, the body temperature of a person or animal can be observed by reading the result on the thermometer, provided it is working and used correctly. Other variables are defined by applying rules or calculations to directly observed variables. For example, the variable ‘fever’ might be based on a binary classification of body temperature: patients with a recorded body temperature higher than 37.5°C are classified as having a fever, while those with a body temperature of 37.5°C or less are classified as ‘no fever’. Similarly, ‘age in years’ can be recorded directly or calculated from the documented date of birth.
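The two derived variables described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the course material: the function names `classify_fever` and `age_in_years` are hypothetical, and the cut-off simply follows the rule stated above (a body temperature higher than 37.5°C counts as a fever).

```python
from datetime import date

FEVER_CUTOFF_C = 37.5  # cut-off from the text: above this value counts as fever


def classify_fever(body_temperature_c: float) -> str:
    """Derive the categorical variable 'fever' from an observed temperature."""
    return "Yes" if body_temperature_c > FEVER_CUTOFF_C else "No"


def age_in_years(date_of_birth: date, on_date: date) -> int:
    """Calculate age in completed years from a documented date of birth."""
    years = on_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (on_date.month, on_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


print(classify_fever(37.5))  # No
print(classify_fever(37.6))  # Yes
print(age_in_years(date(1990, 6, 15), date(2024, 6, 14)))  # 33
```

In both cases the derived variable depends entirely on the rule applied to the directly observed value, so changing the rule (for example, the cut-off) changes the variable without any new measurement being taken.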

Variables can be further classified into numeric (quantitative) and categorical (qualitative) variables. This has implications for how much information is conveyed by data, and how it can be analysed.

Consider once more the example of body temperature. Imagine that you measure the body temperature of five patients using a standard thermometer, with the results shown in Table 2.

**Table 2**

| Patient number | Body temperature |
| --- | --- |
| 1 | 36.6°C |
| 2 | 41.2°C |
| 3 | 37.6°C |
| 4 | 37.4°C |
| 5 | 20.5°C |

Now imagine that you have data for the same patients, but this time the data are represented using the variable ‘Fever’, which has two categories, ‘Yes’ and ‘No’. Their body temperatures are shown again in Table 3, alongside the fever classification. As before, a body temperature higher than 37.5°C is classified as a fever.

**Table 3**

| Patient number | Body temperature | Fever |
| --- | --- | --- |
| 1 | 36.6°C | No |
| 2 | 41.2°C | Yes |
| 3 | 37.6°C | Yes |
| 4 | 37.4°C | No |
| 5 | 20.5°C | No |

Activity 4: Different types of information implied by ‘body temperature’ and ‘fever’ variables

Timing: Allow about 10 minutes

What do you notice about these two variables, ‘body temperature’ and ‘fever’? What different types of information are implied by each variable?

Discussion

‘Body temperature’ is an example of a numeric (quantitative) variable, whereas ‘fever’ is an example of a categorical (qualitative) variable. In general, there is more information available when the data are presented as numeric variables. In this example, we can see that patient 2 has a very high body temperature, which might indicate a life-threatening infection. We can also see that patients 3 and 4 have very similar body temperatures (only 0.2°C difference), yet when the cut-off is applied, only patient 3 is classified as having ‘fever’. Finally, we observe that the body temperature of patient 5 is implausibly low, and is therefore likely to be an error (the thermometer might be broken or the user might not have read it properly – or possibly, the patient might be dead!). We would not detect this problem if we only had access to the variable ‘fever’, as this patient is simply categorised as having no fever. In fact, we would normally remove this observation from our analysis as it is a clear error.
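The point about patient 5 can be illustrated with a short Python sketch using the values from Table 2. The plausible range used here (30.0°C to 45.0°C) is an assumption chosen purely for illustration, not a clinical standard, and the variable names are hypothetical.

```python
# Body temperatures (°C) for the five patients in Table 2, keyed by patient number.
temperatures = {1: 36.6, 2: 41.2, 3: 37.6, 4: 37.4, 5: 20.5}

FEVER_CUTOFF_C = 37.5  # higher than this value is classified as fever

# Assumed plausible range, for illustration only; not a clinical standard.
PLAUSIBLE_MIN_C, PLAUSIBLE_MAX_C = 30.0, 45.0

# Categorical variable: the implausible reading is silently classified as 'No'.
fever = {p: ("Yes" if t > FEVER_CUTOFF_C else "No") for p, t in temperatures.items()}

# Numeric variable: a simple range check flags the implausible reading.
implausible = [
    p for p, t in temperatures.items()
    if not (PLAUSIBLE_MIN_C <= t <= PLAUSIBLE_MAX_C)
]

print(fever)        # {1: 'No', 2: 'Yes', 3: 'Yes', 4: 'No', 5: 'No'}
print(implausible)  # [5]
```

The range check is only possible because the numeric values were kept. Once the data are reduced to ‘Yes’ and ‘No’, patient 5 is indistinguishable from patients 1 and 4.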

Despite containing less information overall, categorical variables are still very useful. For example, it is both simpler and more clinically meaningful to describe the proportion of hospitalised patients with blood infections who have fever than to describe their average body temperature. Using categorical variables also means that comparisons can be made between groups, for example comparing outcomes for patients classified as having a fever versus patients with ‘no fever’. There is more detail on how to describe and summarise different types of data in the Processing and analysing AMR data module.
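As a rough sketch of the two kinds of summary mentioned above, the following Python snippet computes the proportion of patients with fever and the mean body temperature. It reuses the Table 2 values purely as example data (these patients are not described as having blood infections), leaves out patient 5 on the basis that the reading is treated as an error in the Discussion, and uses hypothetical variable names.

```python
from statistics import mean

# Body temperatures (°C) from Table 2, keyed by patient number.
# Patient 5 (20.5°C) is left out because the reading is treated as an error.
temperatures = {1: 36.6, 2: 41.2, 3: 37.6, 4: 37.4}

FEVER_CUTOFF_C = 37.5  # higher than this value is classified as fever

# Categorical summary: what share of patients is classified as having a fever?
has_fever = {p: t > FEVER_CUTOFF_C for p, t in temperatures.items()}
proportion_with_fever = sum(has_fever.values()) / len(has_fever)

# Numeric summary: the average of the recorded temperatures.
mean_temperature = mean(temperatures.values())

print(f"Proportion with fever: {proportion_with_fever:.0%}")  # 50%
print(f"Mean body temperature: {mean_temperature:.1f}°C")     # 38.2°C
```

The proportion answers a clinical question directly (how many patients have a fever), whereas the mean is pulled upwards by patient 2 and says little about any individual patient.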

By now you can probably infer some of the key differences between numeric and categorical variables.

In addition to classification as ‘numeric’ or ‘categorical’, variables are further defined by what they measure (see Figure 1).

Variable classifications matter because each one affects how data can be analysed, interpreted and used. You’ll learn more about this in the Processing and analysing AMR data module. The key takeaway for this module, however, is to recognise that variable classifications are partly determined by the nature of the data unit that is being measured (for example, you can’t have half a person, so counts of people or animals are always discrete), and partly determined by how data are processed (such as when categorical variables are generated from continuous variables).
