# 12.5  Data analysis

The data in Table 12.2 are for only seven people. Imagine how large the table would need to be for a whole community! Analysing data enables you to present information in a clearer and more useful way. Data analysis means describing and summarising your findings in an unbiased way. The results obtained from the analysis will not only help you to meet your community survey objectives, but will also enable you to:

• Monitor and evaluate your activities and establish whether you have progressed as planned.
• Assess the effect of your activities on the knowledge, perceptions, behaviour, and ultimately on the health, of the individuals within your community.
• Share your results with interested stakeholders in your community and with local government officials.

To analyse your data, you first need to identify the type of data you have. A variable is any measured characteristic or attribute that differs between people, households and so on. You may have collected quantitative or qualitative data: qualitative data use names or descriptions to describe variables, while quantitative data usually use numbers.

• Give an example from Table 12.2 of quantitative data and an example of qualitative data.

• An example of quantitative data would be the column listing the respondents’ age. An example of qualitative data would be ethnicity, occupation, house type or water source.

Several terms are used to describe types of variable. Some variables, called categorical variables, have a limited number of possible responses, in other words a limited number of categories. For example, ‘gender’ is a categorical variable because it has two categories: ‘male’ and ‘female’. Other variables, known as continuous variables, can take many different values, though usually within a certain range. For example, age is a continuous variable, within the range of a normal human lifespan.

Variables that are described by a number are, unsurprisingly, also known as numerical variables. For example, the number of new AIDS cases reported during a one-year period, the number of beds available in a particular hospital, or a person’s weight or temperature are all numerical variables.

• Of the variables given in Table 12.2, gender is one categorical variable. Can you find another?

• Another categorical variable would be marital status, because everyone can be placed in one of a small number of categories: single, married, divorced, widowed or cohabiting (living together without being married).

• ‘Blood group’ is a variable. Each person has one of four blood groups: A, B, AB or O. Is blood group a categorical or a continuous variable?

• Blood group is a categorical variable because it has four categories. Each person has one of the four blood groups – A, B, AB or O.

At times, you may find it useful to transform numerical data into categorical data. You can do this by dividing the range of values of the variable into intervals, i.e. by grouping the data. For example, the numerical variable ‘age’ might be transformed into a categorical variable ‘age group’, which consists of categories such as under 30 years, 30–44, 45–59, and 60 years and over. This transformation is useful if the researcher is interested in the number of people falling into each of these four categories (Figure 12.3).
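The grouping rule in the paragraph above can be sketched in Python. This is only an illustration: the function name `age_group` is invented for this example, the ages in the loop are hypothetical, and the sketch assumes that a person aged exactly 60 is placed in the oldest group.

```python
def age_group(age_years: int) -> str:
    """Return the age-group label for one person's age in whole years."""
    if age_years < 30:
        return "under 30"
    if age_years <= 44:
        return "30–44"
    if age_years <= 59:
        return "45–59"
    # Assumption: a person aged exactly 60 belongs in the oldest group.
    return "60 and over"


# Hypothetical ages, not taken from any table in this study session.
for age in (24, 30, 59, 60):
    print(age, "->", age_group(age))
```

Each person falls into exactly one category, which is what allows the categories to be counted afterwards.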

Figure 12.3  The age group of community members is often used in surveys. (Photo: Janet Haresnape)

• Suppose you find that the ages of a group of people you interviewed about tuberculosis in your kebele are as shown in Table 12.3. How many of these people would be in each of the age groups under 21, 21–30, 31–40, 41–50, 51–60 and over 60? Put your answers in Table 12.4a. Which age group has the most people in it?

## Table 12.3  Ages of people interviewed about tuberculosis.

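| Age (years) | Number of people |
|---|---|
| 19 | 2 |
| 20 | 3 |
| 21 | 4 |
| 22 | 3 |
| 23 | 4 |
| 24 | 4 |
| 25 | 3 |
| 26 | 2 |
| 28 | 3 |
| 30 | 1 |
| 32 | 3 |
| 35 | 3 |
| 38 | 3 |
| 45 | 1 |
| 49 | 1 |
| 55 | 1 |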

## Table 12.4a  Age groups of people surveyed about tuberculosis (for completion).

| Age group (years) | Number of people |
|---|---|
| under 21 | |
| 21–30 | |
| 31–40 | |
| 41–50 | |
| 51–60 | |
| over 60 | |

• Your completed table should look like Table 12.4b below. The age group with the most people in it is the 21–30 years category, with 24 people.

## Table 12.4b  Age groups of people surveyed about tuberculosis (completed).

| Age group (years) | Number of people |
|---|---|
| under 21 | 5 |
| 21–30 | 24 |
| 31–40 | 9 |
| 41–50 | 2 |
| 51–60 | 1 |
| over 60 | 0 |
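If you have access to a computer, the tally in Table 12.4b can be checked with a short Python sketch like the one below. It reads each row of Table 12.3 as an age followed by a number of people. The names `age_counts`, `groups` and `group_label` are invented for this example, and the upper limit of 150 years for the oldest group is an arbitrary assumption.

```python
from collections import Counter

# Table 12.3 as (age in years, number of people) pairs.
age_counts = [
    (19, 2), (20, 3), (21, 4), (22, 3), (23, 4), (24, 4), (25, 3), (26, 2),
    (28, 3), (30, 1), (32, 3), (35, 3), (38, 3), (45, 1), (49, 1), (55, 1),
]

# Age groups as (label, lowest age, highest age), both ends inclusive.
# Assumptions: the fifth group runs to 60, as in Table 12.4b, and
# 150 is an arbitrary upper limit for the oldest group.
groups = [
    ("under 21", 0, 20),
    ("21–30", 21, 30),
    ("31–40", 31, 40),
    ("41–50", 41, 50),
    ("51–60", 51, 60),
    ("over 60", 61, 150),
]


def group_label(age):
    """Return the label of the group whose range contains this age."""
    for label, low, high in groups:
        if low <= age <= high:
            return label
    raise ValueError(f"age {age} is outside every group")


# Start every group at zero so that empty groups still appear in the result.
tally = Counter({label: 0 for label, _, _ in groups})
for age, people in age_counts:
    tally[group_label(age)] += people

for label, _, _ in groups:
    print(f"{label:>8}: {tally[label]}")

largest = max(tally, key=tally.get)
print("Largest group:", largest)
```

A useful check on any grouped table is that the group counts add up to the number of people you started with. Here both Table 12.3 and Table 12.4b give a total of 41 people.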
