Printable page generated Tuesday, 29 Nov 2022, 12:30

# Unit 11: Communicating with Data, Charts, and Graphs

## 11 Introduction

In every aspect of our lives, data—information, numbers, words, or images—are collected, recorded, analyzed, interpreted, and used. We encounter this information in the form of statistics too—everything from graphs of the latest home sales figures to census results, the current rate of inflation, or the unemployment rate.

Being able to make sense of data is an important skill. In Section 11.1, we will consider how to summarize a set of data by using an average value, calculated using the mean, median or mode.

In Section 11.2, we will look at how to present and understand data in tables. The focus will then move to graphs, and different types of charts that all make data easier to understand in Sections 11.3, 11.4, and 11.5. Graphs and charts are very useful for displaying the relationship between quantities quickly and in an accessible form, but these can also be used to mislead the unwary reader—so you’ll learn how to critically assess this type of information too.

Finally, in Section 11.6, you'll see some of the new types of interactive data visualizations that are becoming more common.

### Practice Quiz

Check your understanding of averages and interpreting data from charts and graphs before you start by completing the Unit 11 pre-quiz, then use the feedback to help you plan your study.

The quiz does not check all the topics in the unit, but it should give you some idea of the areas you may need to spend the most time on. Remember, it doesn’t matter if you get some or even all of the questions wrong; it just indicates how much time you may need for this unit!

### 11.0.1 What to Expect in this Unit

This unit should take around ten hours to complete. In this unit you will learn about:

• The three different averages.
• Understanding and constructing tables.
• What to look out for in charts and graphs.
• How to construct charts and graphs to display information.

## 11.1 Averages

Think of a trip that you or a friend might take frequently, such as shopping or going to work. How long would you say that trip takes?

You may have said something like, "Well, it usually takes about 45 minutes to get to work, but, if the traffic is heavy, it can take up to hours.” In other words, 45 minutes is a typical time, though there are exceptions. How did you decide what time to say? Did you pick a time that seemed to occur quite often, or perhaps a time that was somewhere in the middle of the times taken for recent trips?

Knowing a typical value for a data set, and also how the data is spread out around this typical value, is important for all sorts of situations.

A value that is typical of the values in a set of data is known as an average. There are different ways of calculating an average, and you will learn about three of these: the mean, the median, and the mode.

### 11.1.1 Calculating the Mean of a Data Set

You have probably come across the mean before; it is the most commonly used type of average and takes into account all the data.

Let’s look at an example to explore this type of average.

Suppose eight students took an exam, with the following scores:

 9 7 6 7 8 4 3 9

To get a feel for the problem and to help you check that your final answer is reasonable, look at the exam scores and decide what a typical value might be. Make a note of your estimate.

To calculate the mean value, we add all the data values together and then divide this sum by the number of values.

In this example, the sum of the data values is \(9 + 7 + 6 + 7 + 8 + 4 + 3 + 9 = 53\).

There are eight data values. Therefore, the mean value is \(53 \div 8 = 6.625\).

The mean exam score for these students was 6.6 (rounded to 1 decimal place).

How did this compare with the typical value you estimated at the beginning? Did you decide that 6 or 7 might be a typical value?

The mean value will always lie between the smallest value and the largest value, and will often be somewhere towards the middle, though exactly where depends on the actual values in the data set.
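If you have access to Python, the calculation above can be checked with a few lines of code. This is only a minimal sketch: the names `scores` and `mean` are chosen here for illustration and are not part of the unit.

```python
# Exam scores of the eight students from the example above.
scores = [9, 7, 6, 7, 8, 4, 3, 9]


def mean(values):
    """Add all the values together, then divide by how many there are."""
    return sum(values) / len(values)


exam_mean = mean(scores)

print("Sum of scores:", sum(scores))
print("Number of scores:", len(scores))
print("Mean (to 1 d.p.):", round(exam_mean, 1))

# Sanity check: the mean must lie between the smallest and largest values.
print("Between min and max:", min(scores) <= exam_mean <= max(scores))
```

The final line mirrors the check described in the text: a mean outside the range of the data signals a mistake in the calculation.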

#### Calculating the Mean

Calculating a mean can be summarized as follows:

• Add all the values together to find their sum.
• Count the number of values.
• Divide the sum by the number of values.
• Write down the conclusion and include the units.

Alternatively, \(\text{mean} = \dfrac{\text{sum of values}}{\text{number of values}}\).

Remember to make a note of this in your math notebook for easy reference later.

Take a look at this pencast to see how to calculate the mean.

View document

#### Activity: Finding the Mean Time for a Trip

Here’s an example for you to try. The times for my trip to work during one week last month are shown in the table below:

| Day of Week | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Time in Minutes | 42 | 58 | 45 | 47 | 52 |

First, look at the data. What would you say is a typical length of time for the trip from this set of data? Write down your estimate.

Now calculate the mean commute time using the method given above.

The smallest time is 42 minutes and the largest is 58 minutes, so a typical time would lie between these, perhaps 50 minutes. Your estimate may be different from this, of course, because it is just a sensible guess at a typical value.

To calculate the mean:

The sum of the values is \(42 + 58 + 45 + 47 + 52 = 244\).

There are five data values, so the mean is \(244 \div 5 = 48.8\).

The mean commute time over that week was about 49 minutes.

Remember to include the units. In this case, the units are minutes.

The mean is fairly close to the estimated typical value of 50 minutes, so it looks as if the calculated value for the mean is correct.

### 11.1.2 Using a Calculator to Find the Mean

The calculator can be accessed on the left-hand side bar under Toolkit.

In the last activity, you were asked to calculate the mean of the following commute times:

Commute Times

| Day of Week | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Time in Minutes | 42 | 58 | 45 | 47 | 52 |

A student used the calculator to work out the mean in one step—can you spot anything that is not correct with how they did this?

The key sequence used was: 42 + 58 + 45 + 47 + 52 ÷ 5 =

#### Activity: Using a Calculator to Find the Mean

Enter the student’s key sequence above into the calculator.

(a) Write down the answer given and explain how you know that the answer is incorrect.

(a) The answer given is 202.4. You know this is incorrect because it is much larger than all of the values. The mean value will always be between the smallest and largest values. You may also have said that it does not correspond to our previous answer which is another good way to check the calculator answer.

(b) Where did the student make a mistake?

(b) The student should have divided the sum of the values by the number of values. Remember the order of operations (PEDMAS), which we discussed in detail in Unit 4. As division takes precedence over addition, the calculator has divided 52 by 5 to get 10.4 and then added that to the first four numbers. Did you spot this when you looked at the key sequence?

(c) Write down two calculator sequences that could be used to calculate the mean correctly.

(c) Brackets can be used to calculate the sum before the division. The key sequence is: ( 42 + 58 + 45 + 47 + 52 ) ÷ 5 =

Alternatively, the sum can be calculated by pressing the equals sign before the division is carried out, then dividing the sum by the appropriate value.

The key sequence is: 42 + 58 + 45 + 47 + 52 = ÷ 5 =

Use the calculator to check that these two sequences give the correct answer.

When you are calculating a mean value, remember to check that it lies between the smallest and largest values in the data set. And remember to include the units, too!

In this example, both sequences give a mean commute time of 48.8 minutes, which agrees with our previous answer.
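The same order-of-operations pitfall appears in most programming languages. The short Python sketch below is an illustration only (the variable names are made up here): it reproduces the student's incorrect calculation and the two correct approaches.

```python
times = [42, 58, 45, 47, 52]  # commute times in minutes

# The student's sequence: division is carried out before addition,
# so only the last value (52) is divided by 5.
student_answer = 42 + 58 + 45 + 47 + 52 / 5

# Brackets force the sum to be worked out before the division.
with_brackets = (42 + 58 + 45 + 47 + 52) / 5

# Two-step approach: find the sum first, then divide by the number of values.
total = sum(times)
two_step = total / len(times)

print(f"Student's answer: {student_answer:.1f}")
print(f"With brackets:    {with_brackets:.1f}")
print(f"Two-step:         {two_step:.1f}")
```

Only the last two results lie between the smallest time (42) and the largest (58), which is the quick check recommended above.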

Here’s a fun game that lets you see how changing values affects the mean.

### 11.1.3 Finding the Median of a Data Set

One of the advantages of using the mean as an average value is that it takes account of all the values in the data set. However, this also means that if one of the values is much higher or lower than the other values in the data set, it can greatly affect the mean.

Suppose you are planning a visit to Vermont, USA in October, and you want an idea of how much rainfall there will be. You consult records from the National Climate Data Center and find the following:

| Year | 2007 | 2008 | 2009 | 2010 | 2011 |
|---|---|---|---|---|---|
| Rainfall, in inches (to 1 d.p.) | 5.8 | 5.2 | 4.6 | 9.3 | 4.5 |

Reference: Enloe, Jesse. “Vermont Precipitation October 1895-2011.” National Oceanic and Atmospheric Administration Climatic Data Center, National Environmental Satellite, Data, and Information Service (NESDIS), U.S. Department of Commerce. Available online at http://www.ncdc.noaa.gov/temp-and-precip/time-series/index.php (accessed December 22, 2011).

#### Activity: How Wet is it in Vermont?

Find the mean rainfall in Vermont in October for these five years.

The sum of the five data values is: \(5.8 + 5.2 + 4.6 + 9.3 + 4.5 = 29.4\).

The mean rainfall is \(29.4 \div 5 = 5.88\).

Over these five years the mean rainfall was 5.9 inches (to 1 decimal place).

Now look at the five data values again. Does 5.9 inches give you a good idea of how much rainfall there has been? If the data values are plotted on a line, they look like this.

You can see that in four out of the five years, rainfall was below the mean, while it was above the mean in only one year. In that year, the rainfall was particularly high and this has pulled the mean up. So the mean is perhaps not a particularly good choice for a typical value in this case. An alternative average is the median: the middle value when the data values are arranged in order.

Let's take a look at an example in this pencast (click on “View document”).

View document

#### Calculating the Median Rainfall in Vermont

To find the median rainfall, first arrange the data values in order:

 4.5 4.6 5.2 5.8 9.3

There are five data values—an odd number—so choose the middle value, 5.2.

Therefore the median rainfall is 5.2 inches.

Looking at the data displayed on the number line and where the median value falls, it appears that the median may be a better choice for an average or typical value in this case.
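The steps for the Vermont data can also be sketched in Python. This is an illustrative example rather than part of the unit; it uses the `mean` and `median` functions from Python's standard `statistics` module, and the variable names are chosen here.

```python
from statistics import mean, median

# October rainfall in Vermont, in inches, for 2007 to 2011.
rainfall = [5.8, 5.2, 4.6, 9.3, 4.5]

# Step 1: arrange the values in order.
ordered = sorted(rainfall)

# Step 2: with an odd number of values, pick the middle one.
middle = ordered[len(ordered) // 2]

print("Ordered values:", ordered)
print("Median by hand:", middle)
print("Median from the statistics module:", median(rainfall))
print("Mean (to 1 d.p.):", round(mean(rainfall), 1))
```

Comparing the last two lines shows how the single wet year (9.3 inches) pulls the mean above the median.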

Now try out the following activity yourself!

### 11.1.4 More Means and Medians

#### Activity: How Good is Charlie at His Computer Game?

Charlie likes to play a computer game. He plays it every day for six days and records his score. He is happy if he averages more than 450 over the six days. Usually he has high scores, but on Saturday he felt sick and didn’t play very well. Here are his results:

| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
|---|---|---|---|---|---|
| 450 | 470 | 430 | 490 | 490 | 220 |

(a) Find Charlie’s mean score over these six days.

##### Comment

When finding the median, remember to arrange the scores in order first and, if there is an even number of values, to take the mean of the two middle values.

(a) To find the mean, find the sum of the scores and divide by the number of scores, six.

The sum of the scores is \(450 + 470 + 430 + 490 + 490 + 220 = 2550\).

So Charlie’s mean score is \(2550 \div 6 = 425\).

(b) Find Charlie’s median score over these six days.

(b) To find the median score, arrange the data values in order:

 220 430 450 470 490 490

There is an even number of values (6) so take the mean of the two middle values, 450 and 470.

The mean of 450 and 470 is \((450 + 470) \div 2 = 460\), so Charlie’s median score is 460.

(c) Would you say that the mean score or the median score is more typical of Charlie’s performance?

(c) The low score on Saturday has pulled Charlie’s mean score down, so he might prefer to calculate his average using the median score. It is more typical of the general standard of his scores.

#### Activity: Another Computer Game for Charlie

The next week, Charlie starts playing a new computer game. Here are his scores:

| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |
|---|---|---|---|---|---|---|
| 250 | 270 | 300 | 290 | 320 | 290 | 270 |

(a) Find Charlie’s mean score over these seven days.

(a) To find the mean, find the sum of the scores and divide by the number of scores, seven.

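The sum of the scores is \(250 + 270 + 300 + 290 + 320 + 290 + 270 = 1990\).

So Charlie’s mean score is \(1990 \div 7 \approx 284\) (to the nearest whole number).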

(b) Find Charlie’s median score over these seven days.

(b) To find the median score, arrange the data values in order:

 250 270 270 290 290 300 320

There is an odd number of values, 7, so Charlie’s median score is the middle value, 290.

(c) How do the median score and the mean score compare this time?

(c) This time, the mean score and the median score are quite close. This is because the data values are fairly evenly clustered around the mean value without any extremely high or low values.
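The two activities above can be compared side by side with a short Python sketch. It is an illustration under our own naming (the function `median` is written out by hand here so that the odd and even cases are both visible); it is not the only way to do this.

```python
def median(values):
    """Middle value of the ordered data, or the mean of the two middle
    values when there is an even number of values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


first_game = [450, 470, 430, 490, 490, 220]        # includes the low Saturday score
second_game = [250, 270, 300, 290, 320, 290, 270]  # no extreme values

for name, scores in [("First game", first_game), ("Second game", second_game)]:
    game_mean = sum(scores) / len(scores)
    game_median = median(scores)
    gap = game_median - game_mean
    print(f"{name}: mean {game_mean:.0f}, median {game_median:.0f}, gap {gap:.0f}")
```

The gap between the median and the mean is much larger for the first game, where one unusually low score drags the mean down.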

### 11.1.5 Finding the Mode of a Data Set

Another method of finding an average is simply to pick the value that occurs most often. The value that occurs most often is known as the mode or the modal value.

In some data sets, more than one value occurs most frequently and so there is more than one mode. The mode is fairly easy to find, but it does not consider all the values in the data set. However it can be useful in some situations, such as when making decisions about which sizes of a particular piece of clothing should be stocked in a shop.

#### Finding the Mode

Finding the mode of a data set can be summarized as follows:

• Arrange all the values in ascending or descending order.
• Choose the value (or values) that occurs most frequently.
• Write down the conclusion and include the units.

Let's look at an example in the following pencast (click on “View document”).

View document

From the previous example, Charlie’s first set of scores is shown below:

| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
|---|---|---|---|---|---|
| 450 | 470 | 430 | 490 | 490 | 220 |

Arranging Charlie’s scores in order gives: 220, 430, 450, 470, 490, 490.

Each score occurs once, except for the score of 490, which occurs twice.

So, the mode (or the modal score) is 490.

In this case, the mode is the highest score. Do you think this is a good way of describing Charlie’s typical score?

#### Activity: Finding the Mode of a Data Set

Charlie’s second set of scores is shown below:

| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |
|---|---|---|---|---|---|---|
| 250 | 270 | 300 | 290 | 320 | 290 | 270 |

Find the two modal values.

Arranging Charlie’s scores in order gives: 250, 270, 270, 290, 290, 300, 320.

Each score occurs once, except 270 and 290, which both occur twice. The modal scores are therefore 270 and 290.
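Finding the mode amounts to counting how often each value occurs. As a rough sketch, assuming Python is available, the helper `modes` below (a name invented for this example) uses `Counter` from the standard `collections` module and returns every value that shares the highest count.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often, in ascending order."""
    counts = Counter(values)          # how many times each value occurs
    highest = max(counts.values())    # the largest number of occurrences
    return sorted(value for value, count in counts.items() if count == highest)


first_game = [450, 470, 430, 490, 490, 220]
second_game = [250, 270, 300, 290, 320, 290, 270]

print("Modes, first game:", modes(first_game))
print("Modes, second game:", modes(second_game))
```

Returning a list rather than a single number allows for data sets with more than one mode, such as Charlie's second set of scores.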

### 11.1.6 Using Different Averages

You have now met three different ways of describing an average value: mean, median, and mode.

For Charlie’s first set of scores:

• The mean score was 425.
• The median score was 460.
• The modal score was 490.

These values are very different, and so it is important that when you meet an "average" value, you find out which average is being used.

For example, suppose a group of five employees in a small company each receive a wage of $500 per week and the director receives $3500 a week.

The mean wage is: \((5 \times \$500 + \$3500) \div 6 = \$6000 \div 6 = \$1000\).

However, the median (and also the modal) wage is $500. To see how the median and mode were calculated, take a look at this pencast (click on “View document”).

View document

So if there were a dispute about the employees’ pay in the company, their representative could say, “The average pay in the company is only $500 per week,” whereas the director might say, “The average pay is $1000 per week.” Both statements are technically correct because neither party has stated the type of average they have used. However, both parties have chosen the average that best suits their cause!

Knowing what average has been used is very important when you are trying to understand data that you encounter. Knowing something about how the data is spread will provide you with additional information that will shed more light on the average value that you are given. We are now going to look at the spread or range of a data set.

### 11.1.7 Measuring the Spread of a Data Set

As well as finding a typical value or average for a set of data, it is also important to see how the data values are spread about the typical value.

Last year, Sam grew two varieties of sunflowers and measured the heights of the flowers. Here are Sam’s results:

| Variety | Heights (inches) |
|---|---|
| Variety A | 60, 59, 66, 55, 55, 62, 58, 65 |
| Variety B | 52, 69, 61, 60, 54, 67, 50, 67 |

Sam wants to compare how each sunflower variety performed, so wants to look at the data in more detail.

#### Activity: How Tall Are the Sunflowers on Average?

(a) Find the mean height of the sunflowers of variety A, using the data from the table above.

##### Answer

(a) The sum of the heights of the sunflowers of variety A is \(60 + 59 + 66 + 55 + 55 + 62 + 58 + 65 = 480\).

So the mean height of the sunflowers of variety A is \(480 \div 8 = 60\) inches.

Remember to include the units, in this case, inches; otherwise your answer won’t make much sense.

(b) Find the mean height of the sunflowers of variety B, using the data from the table above.

##### Answer

(b) For variety B, the sum of the heights is \(52 + 69 + 61 + 60 + 54 + 67 + 50 + 67 = 480\).

So the mean height is \(480 \div 8 = 60\) inches.
On average, the sunflowers of the two varieties are the same height, 60 inches. When Sam looked at the sunflowers, they did look about the same height on average, but there was still a difference. To investigate this, you can plot the heights on number lines, like this:

The blue dots represent the heights in inches of the sunflowers of variety A, and the red dots represent the heights in inches of the sunflowers of variety B.

Can you see that the red dots are more spread out than the blue dots? The heights of sunflowers of variety A are more consistent, closer together, while the heights of the sunflowers of variety B are more variable, more spread out.

One way of measuring the spread of a data set is called the range. It is the largest value minus the smallest value.

#### Finding the Range

For the sunflowers of variety A, the smallest value is 55 inches and the largest value is 66 inches, so the range in inches is \(66 - 55 = 11\).

Remember to include the units when you state the range: The range in the heights of variety A sunflowers is 11 inches.

#### Activity: The Range for Sunflowers of Variety B

Find the range in the heights of the sunflowers of variety B.

##### Answer

The smallest value in the data set for variety B is 50 inches, and the largest value is 69 inches. So the range in inches is \(69 - 50 = 19\).

Therefore, the range in the heights of variety B sunflowers is 19 inches.

So even though both sunflower varieties had the same mean height, there was much more variation, that is a bigger range, of heights seen in variety B compared to variety A.

If you grew sunflowers commercially, can you think of any reasons why knowing this information about the range may be significant? You might like to discuss this with friends and see what they think.

The range gives an idea of the spread of the data, although it can be affected by either very high or very low values.
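The comparison of the two sunflower varieties can be reproduced with a few lines of Python. This is a minimal sketch; the function names `mean` and `value_range` are our own choices for illustration.

```python
# Heights of Sam's sunflowers, in inches.
variety_a = [60, 59, 66, 55, 55, 62, 58, 65]
variety_b = [52, 69, 61, 60, 54, 67, 50, 67]


def mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


for name, heights in [("Variety A", variety_a), ("Variety B", variety_b)]:
    print(f"{name}: mean {mean(heights):.0f} inches, "
          f"range {value_range(heights)} inches")
```

Both varieties have the same mean, so it is the range that reveals the difference between them.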
There are other ways of measuring the spread of a data set, and you’ll meet these if you continue studying mathematics or statistics.

### 11.1.8 More About Data

So far in this unit, you have discovered how to describe a set of data by looking at a typical value, an average value. You have learned how to find three kinds of average: mean, median, and mode. You have also seen how to measure the spread of data using the range.

Averages are used frequently in reports and in the media to summarize data. When you come across an average in this way, it’s worth asking yourself some questions before drawing any conclusions.

• Do you know what kind of average was used?
• How many values were used to calculate the average?
• How was the data collected?

Then, you’ll be prepared to consider the data, and the type of average used, in a critical manner.

#### Activity: How Many Data?

Write down some ideas about why you think that the number of values collected is something to consider.

##### Answer

The more data that has been used to calculate the average, the more likely it is to be a reliable result.

When averages are calculated, this may be based upon a sample of the total data available rather than all the data, particularly when dealing with large populations. For example, if you wanted to know the average height of men in a country, it would be impractical to measure and record this value for all men, so a sample of the population would be taken. The more measurements you had from across the whole country, the better idea you would get of an average value for the whole population.

As well as this, the more data that you have, the less influence extremes can have on the average if the mean is used.

Now that we have looked at a few ways to analyze data, let’s move on to how to present it using tables, graphs, or charts, which make data easier to understand.
## 11.2 Tables

Tables are often used to display data clearly, particularly when it is important that the data are displayed accurately, such as on a food label, or when a lot of data needs to be displayed in a concise form, such as on a bus or train schedule.

The table shown below is from the Irish Tourism Fact Card for 2003. There’s a lot of information here to sort out, but imagine how confusing it would be if it was a list!

Employment in the Irish Tourist Industry (Numbers Employed by Sector)

| Sector/Year | 1999 | 2001 | 2002 | 2003 | %+/– (’01–’03) |
|---|---|---|---|---|---|
| Hotels | 53,906 | 54,275 | 54,656 | 54,164 | −0.2% |
| Guesthouses | 3,115 | 2,943 | 2,914 | 2,879 | −2.2% |
| Self-catering accommodation | 4,580 | 3,830 | n/a | 3,878 | +1.3% |
| Restaurants | 40,283 | 41,827 | 41,409 | 41,085 | −1.8% |
| Non-licensed restaurants | 15,221 | 13,849 | n/a | 15,642 | +12.9% |
| Licensed premises | 78,300 | 78,225 | 80,121 | 79,319 | +1.4% |
| Tourism services and attractions | 33,910 | 34,568 | 34,852 | 34,749 | +0.5% |
| Total | 229,315 | 229,517 | | 231,716 | +0.96% |

Source: Fáilte Ireland (2004). Note: Percentages have been rounded.

So how do we start to understand what is shown in a table?

The first thing to do is to look at the title. This should explain clearly what the table contains. This table contains the number of people employed in the Irish tourist industry, divided into sectors. [“Fáilte”, pronounced fall-sha, is an Irish word meaning “welcome”.]

Next, check the source of the data. Have the data been collected by a reputable organization? Can you check how the data have been collected? This is an important step: if, for example, the data are only based on a few results or have not been collected properly, then you may not wish to rely on the values given in the table. The source of these data is given below the table, a survey conducted in 2004 by Fáilte Ireland, the Tourist Board for Ireland. This is an official organization, so the data are likely to be reliable.

The next step is to examine the column and row headings. What information is being given? Raw numbers, or percentages?
What units are being used? Do you understand any abbreviations used and how the table is constructed?

The column headings are the years 1999, 2001, 2002, and 2003. Note that the year 2000 is missing; this could affect your interpretation of the figures. The final column has a confusing heading. Can you work out what it means? Look carefully at the symbols used. It is the percentage change between 2001 and 2003, given as either a positive or a negative number. Between 2001 and 2003, for instance, the number of people employed in hotels in Ireland fell by 0.2%.

The rows give types of tourist facilities, such as hotels and restaurants, with the total for all types given in the bottom row.

Finally, you can look at the main body of the table to find the information you need. For example, to find the number of people employed in restaurants in 2001, move along the row labeled restaurants, and down the column labeled 2001. Where this row and column meet, the value is 41,827. This means that, in 2001, 41,827 people were employed in restaurants.

### Reading a Table

Before you go on to use the information in the table, here is a summary of the steps in reading a table.

• Read the title.
• Check the source of the data.
• Examine the column and row headings, particularly the units used.
• Check that you understand the meaning of any symbols or abbreviations.
• Extract the information you need from the main body of the table.

### 11.2.1 Using Tables

Now it’s time to use the table from the Tourism Fact Card.
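The final column of the table, the percentage change between 2001 and 2003, can be reproduced from the 2001 and 2003 columns. The Python sketch below is only an illustration: the function name `percentage_change` is our own, and it assumes the usual definition of percentage change, with the 2001 figure as the original value.

```python
def percentage_change(original, new):
    """Change from the original value, as a percentage of the original value."""
    return (new - original) / original * 100


# Figures for 2001 and 2003, taken from the employment table.
hotels_2001, hotels_2003 = 54_275, 54_164
guesthouses_2001, guesthouses_2003 = 2_943, 2_879

hotels_change = percentage_change(hotels_2001, hotels_2003)
guesthouses_change = percentage_change(guesthouses_2001, guesthouses_2003)

# The + in the format shows the sign explicitly, as in the column heading.
print(f"Hotels:      {hotels_change:+.1f}%")
print(f"Guesthouses: {guesthouses_change:+.1f}%")
```

A negative result means the number employed fell over the period, which is why the sign matters when reading this column.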
#### Activity: Employment in the Irish Tourist Industry

Employment in the Irish Tourist Industry (Numbers Employed by Sector)

| Sector/Year | 1999 | 2001 | 2002 | 2003 | %+/– (’01–’03) |
|---|---|---|---|---|---|
| Hotels | 53,906 | 54,275 | 54,656 | 54,164 | −0.2% |
| Guesthouses | 3,115 | 2,943 | 2,914 | 2,879 | −2.2% |
| Self-catering accommodation | 4,580 | 3,830 | n/a | 3,878 | +1.3% |
| Restaurants | 40,283 | 41,827 | 41,409 | 41,085 | −1.8% |
| Non-licensed restaurants | 15,221 | 13,849 | n/a | 15,642 | +12.9% |
| Licensed premises | 78,300 | 78,225 | 80,121 | 79,319 | +1.4% |
| Tourism services and attractions | 33,910 | 34,568 | 34,852 | 34,749 | +0.5% |
| Total | 229,315 | 229,517 | | 231,716 | +0.96% |

Source: Fáilte Ireland (2004). Note: Percentages have been rounded.

The table above is the same table you saw on the previous page. Use it to answer the following questions.

(a) How many people were employed in guesthouses in Ireland in 2003?

##### Comment

Look down the first column for the sector and along this row until you reach the correct year.

##### Answer

(a) Going across the row marked ‘Guesthouses’ and down the column headed ‘2003’ shows that 2,879 people were employed in guesthouses in 2003.

(b) Why do you think that there is no total for 2002?

##### Comment

Look carefully at all the data in this column.

##### Answer

(b) In the column for 2002, the number of people employed in self-catering accommodation and in non-licensed restaurants was not available (n/a), so the overall total can’t be found.

(c) Which type of premises employed the highest number of people in 2001?

##### Answer

(c) Going down the column for 2001, the highest number (apart from the total) is 78,225. So, licensed premises employed the highest number of people in 2001.

(d) What type of premises has shown the largest percentage rise between 2001 and 2003? How has this percentage been calculated?

##### Comment

When calculating the percentage, think carefully about what you have been asked to work out: which value is the original value that you are comparing the increase to?
##### Answer

(d) The largest percentage rise is for non-licensed restaurants, 12.9%.

The increase in the number of people employed in non-licensed restaurants between 2001 and 2003 is \(15{,}642 - 13{,}849 = 1{,}793\).

So the percentage increase is \(\dfrac{1{,}793}{13{,}849} \times 100\% = 12.9\%\) (rounded to 1 d.p.).

It’s handy that the calculation has been done for you, though!

### 11.2.2 Representing Data in Tables

Some tables contain very large numbers or very small numbers, and to make these easier to read, the way they are presented may be changed. For example, suppose the number of tourists (rounded to the nearest thousand) visiting south-west Ireland in four quarterly periods were 556,000, 320,000, 284,000, and 355,000. These numbers can be expressed as \(556 \times 1000\), \(320 \times 1000\), and so on. Using units of 1000, the numbers in the table can be written simply as 556, 320, 284, and 355, and the column heading changed to “Numbers (000s)” or “Numbers (thousands).” The values then become much easier to read.

The table below is also taken from the Tourism Fact Card and it indicates the number of visitors to the different regions of Ireland in 2003 and the revenue generated.

#### Where Did the Tourists Go and How Much Did They Spend?

Regional Numbers (000s) and Revenue (Euro m) 2003

| Numbers (000s)<br>*Revenue (€m)* | Overseas Tourists | Northern Ireland | Domestic | Total |
|---|---|---|---|---|
| Dublin | 3,444<br>*1,051.3* | 171<br>*55.1* | 893<br>*113.8* | 4,508<br>*1,220.2* |
| Midlands/East | 775<br>*310.2* | 42<br>*9.2* | 802<br>*90.6* | 1,619<br>*410.0* |
| South-East | 907<br>*268.4* | 9<br>*6.6* | 1,042<br>*138.1* | 1,958<br>*413.1* |
| South-West | 1,515<br>*619.3* | 47<br>*16.7* | 1,287<br>*232.2* | 2,849<br>*868.2* |
| Shannon | 983<br>*334.2* | 26<br>*4.8* | 818<br>*115.0* | 1,827<br>*454.0* |
| West | 1,159<br>*456.5* | 82<br>*30.0* | 1,249<br>*204.3* | 2,490<br>*690.8* |
| North-West | 476<br>*187.9* | 219<br>*56.1* | 566<br>*76.9* | 1,261<br>*320.9* |
| Total Revenue | *3,227.7* | *178.5* | *970.9* | *4,377.1* |

“Euro m” (or “€m”) is an abbreviation for “million euros;” euros are the currency used in Ireland and throughout much of Europe.

#### Activity: Where Did the Tourists Go?
Use the table above to answer the following questions. Remember, “Euro m” (or “€m”) is an abbreviation for “million euros;” euros are the currency used in Ireland and throughout much of Europe.

(a) Look carefully at the table. What information does it give?

##### Answer

(a) The table shows where tourists went in Ireland during 2003 and how much they spent. Seven regions of Ireland are included, and the tourists are divided into those from overseas, those from Northern Ireland and domestic tourists. Each cell in the table contains two numbers. The top number is the number of tourists in thousands, and the second number, in italics, shows how much they spent in millions of euros (€m).

(b) How reliable do you think the data in the table are? Do you know how the data were collected?

##### Answer

(b) No source is given on the table, which might make you question the reliability. However, there was reference in the original document to the Central Statistical Office for Ireland, so it is probably information from an official source. You might have questioned, though, how tourists are counted. Are they people who stayed at hotels or guesthouses, perhaps, or those who visited tourist attractions, or perhaps tourists were counted at the borders as they entered Ireland? Estimating how much money tourists spend would not be straightforward either.

(c) How many domestic tourists visited Shannon in 2003?

##### Comment

Remember to take note of the units used for numbers of visitors.

##### Answer

(c) Take care with the figures when you read them from the table. The number of people is given in thousands, so the number of domestic tourists who visited Shannon in 2003 was 818,000.

(d) Roughly how much did all the tourists spend in Dublin in 2003?

##### Answer

(d) Tourists visiting Dublin spent 1220.2 million euros, or a little over 1.22 billion euros.

1220.2 million = 1,220,200,000

1 billion = 1,000,000,000

So 1220.2 million = 1.22 billion (to 2 d.p.)

(e) In which regions were there more domestic tourists than overseas tourists?

##### Answer

(e) Comparing the columns for the overseas tourists and the domestic tourists shows that there were more domestic tourists than overseas tourists in the midlands/east, south-east, west, and north-west regions.

(f) Why do you think the data was collected?

##### Answer

(f) It is possible that the data was collected to help plan tourist initiatives, such as targeted marketing.

### 11.2.3 Constructing Your Own Tables

Sometimes, you may need to summarize a set of data and construct a table yourself. The first step is to decide how to sort the data into categories to stress the points you wish to make. For example, you may have collected data from a group of people that included their age. Should you divide this into “under 25,” “25–40,” and “over 40,” perhaps? Your choice of category would depend on the kind of data that you have and what you want to show.

If you are counting the data elements yourself rather than using data that are already tabulated, you may want to employ some strategies to save time and minimize the chances for error. If you have a lot of data, for instance, you may find it useful to use a tally chart. If you are unsure what a tally chart is or how to construct one, take a look at this pencast (click on “View document”).

View document

First draw up a blank table with the row and column headings. Then, for each data value, put a vertical mark in the cell to which it belongs. When you have counted the item, you may find it helpful to cross it off your list, so that you know which item you are up to: it is easy to lose track! When you get to the fifth value in the cell, draw a diagonal line across the previous four marks (instead of a new vertical mark), making a “gate” that stands for five items. This will make it easier to total the number in each cell, as you can add groups of five together quickly.
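A tally chart is a count of how many times each category appears, and the same idea can be sketched in Python with `Counter` from the standard `collections` module. The example below is an illustration only; it uses the lunch preferences from the example in this section (C for cheese sandwich, H for ham sandwich, S for salad), and the variable names are our own.

```python
from collections import Counter

# Lunch preferences: C = cheese sandwich, H = ham sandwich, S = salad.
preferences = "H S C H H C C H S C H H".split()

# Counter plays the role of the tally chart: one running count per category.
tally = Counter(preferences)

# As with a paper tally, check that the total matches the number of items.
total_counted = sum(tally.values())
print("Items on list:", len(preferences), "Items tallied:", total_counted)

for choice in sorted(tally):
    # Split each count into complete "gates" of five and leftover single marks.
    gates, singles = divmod(tally[choice], 5)
    print(f"{choice}: {gates} gate(s) of five + {singles} mark(s) = {tally[choice]}")
```

Splitting each count with `divmod` mirrors the way gates of five make a paper tally quicker to add up.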
When you have completed the tally chart, check that the total number of tally marks agrees with the number of items on your data list.

For example, suppose you have volunteered to collect the lunch preferences for your colleagues at work. There is a choice of cheese (C) sandwich, ham (H) sandwich, or salad (S). Your colleagues have expressed their preferences as follows:

H S C H H C C H S C H H

The tallies for the lunch order would be:

• Cheese (C): four marks, total 4
• Ham (H): one gate of five plus one mark, total 6
• Salad (S): two marks, total 2

You will need four cheese sandwiches, six ham sandwiches, and two salads.

There are different ways of drawing up a tally table. A statistician named John Tukey suggested using a square symbol, putting one dot in each corner for the first four items of data, then a side for each of the next four items and finally a diagonal line for each of the last two items. This symbol then represented ten data items. Tukey felt that this symbol was easier to read and add up than the gate symbol. What do you think? Your answer will probably depend upon what you have used in the past!

#### Activity: More About Tourists

Imagine you are the manager of a small Irish hotel that has guests of different ages and nationalities. You would like to know what kinds of guests visit your hotel, so you decide to summarize this information in a table. You decide to divide the ages into groups: child (under 16), adult (16–60), and senior (over 60); and the nationalities into the categories Irish (Ir), British (B), Mainland European (E) and the Rest of the World (W).

From the room bookings the following data were collected. Each entry is of the form 16E, where the number represents the age (16) and the letter represents the category (E), so 16E represents a 16-year-old guest from Mainland Europe.
16E 7B 24Ir 26Ir 46Ir 43Ir 5Ir 50W 55W 13Ir 13B 15Ir 61B 8Ir 37W 48E 8Ir 62B 49E 6W 55Ir 11Ir 9Ir 12B 62W 65B 65B 67Ir 13W 12W 54B 72Ir 61Ir 48B 61W 15B 62Ir 67E 10W 27B 12Ir 31B 35W 8W 42B 43B 15W

Construct a blank table with a title, the source of the data and column and row headings corresponding to the categories above. Use a tally to determine the number of guests in each age group and each category in your table first. Then from the tally chart, construct the final table.

The data probably looks quite confusing; take your time and work methodically through the data to make sure you don’t miss any! The title, column and row headings should make it clear to the reader what information is contained in the table, so think carefully about these.

##### Answer

The final table should look something like the table below. You may have put the rows and columns the other way round.

Origin and Age Categories of Hotel Guests

| | Child | Adult | Senior | Total |
|---|---|---|---|---|
| Irish | 8 | 5 | 4 | 17 |
| British | 4 | 6 | 4 | 14 |
| Mainland European | 0 | 3 | 1 | 4 |
| Rest of the world | 6 | 4 | 2 | 12 |
| Total | 18 | 18 | 11 | 47 |

You can work out the total number of visitors either by adding together the numbers of children, adults and seniors ($18 + 18 + 11 = 47$), or by adding the numbers of Irish, British, Mainland European and “Rest of the world” visitors ($17 + 14 + 4 + 12 = 47$). These two totals should agree, so it is useful to work out both as a check.

Do you think that there is any other information that would be useful to include with this table? For example, how old is a child or a senior?

### 11.2.4 Analyzing the Tourist Table

Here is the table from the previous page that you have just constructed.

Origin and Age Categories of Hotel Guests

| | Child | Adult | Senior | Total |
|---|---|---|---|---|
| Irish | 8 | 5 | 4 | 17 |
| British | 4 | 6 | 4 | 14 |
| Mainland European | 0 | 3 | 1 | 4 |
| Rest of the world | 6 | 4 | 2 | 12 |
| Total | 18 | 18 | 11 | 47 |

#### Activity: Using a Table

From the table you constructed in the previous activity, work out the following:

(a) The percentage of visitors that are children.

##### Answer

(a) Eighteen out of the 47 visitors are children.
So the percentage of visitors that are children is given by: $\frac{18}{47} \times 100\% \approx 38\%$ (to the nearest whole number).

(b) The percentage of visitors that come from outside Ireland.

##### Answer

(b) 17 visitors were Irish. So the number of visitors that came from outside Ireland is $47 - 17 = 30$. (You might have determined the number of non-Irish visitors by adding the totals from other regions: $14 + 4 + 12 = 30$.) So the percentage of visitors from outside Ireland is given by: $\frac{30}{47} \times 100\% \approx 64\%$ (to the nearest whole number).

(c) Why is it helpful to calculate percentages here? What other calculations might be useful?

##### Answer

(c) Percentages indicate the proportion of guests in each category, so, for this sample of data, just over a third of the visitors were children. This sort of information may be useful for marketing or development purposes. Many other calculations can be made, such as the percentage of seniors or the percentage of visitors from mainland Europe. Percentages also make it much easier to compare the data in different categories than looking at the raw numbers alone.

If you would like more practice interpreting tables, try the following exercise on temperatures in Jamaica.

## 11.3 Graphs and Charts

Although tables are useful for summarizing a lot of data concisely, clearly, and accurately, sometimes you want to get an overall message across quickly. You can do this using a graph or a chart. Graphs can also be used to explore relationships between sets of data, such as the way in which a sunflower grows over a season.

### Activity: Graphs and Charts

You have probably seen different types of graphs and charts in your day-to-day life, when watching the TV, reading a newspaper or on the Internet. Spend a few moments now writing down all the different types that you have seen. Don’t worry if you don’t know the math language for these at the moment!

#### Answer

Here are some we came up with:

• Bar charts
• Pie charts
• Line graphs
• Scatter graphs (similar to a line graph, but where the plotted points are not joined).
In this unit we will be looking at line graphs, pie charts, and bar charts. Keep an eye out for examples around you!

### 11.3.1 Plotting Points

Many graphs and charts are constructed by plotting points on a grid. The graph below shows two perpendicular lines: the horizontal axis and the vertical axis, which set up a grid. Each axis is marked with a scale, in this case from 0 to 3. The point where the two axes meet and where the value on both scales is zero is known as the origin. (Axes is the plural form of axis.)

We can describe the location of any point on the grid by its horizontal and vertical distance from the origin. In the graph below, the horizontal axis has been labeled with an “x” and the vertical axis with a “y.” This is because the horizontal axis is usually known as the x-axis and the vertical axis as the y-axis.

Look at the point P in the above graph. To get to the point P from the origin, move 3 units across and 2 units up. (Alternatively, the point P is opposite 3 on the x-axis and 2 on the y-axis.) We say that P has the coordinates (3, 2). The horizontal distance of P from the origin is known as the x-coordinate of P and the vertical distance from the origin is known as the y-coordinate of P. So for point P, 3 is the x-coordinate and 2 is the y-coordinate.

Notice that the x-coordinate is the first coordinate in the pair and the y-coordinate is the second coordinate in the pair. So the general form for the coordinates of a point is (x, y). This is important to remember to avoid plotting points the wrong way round!

In the same way, Q is 1 unit across and 3 units up, so Q has coordinates (1, 3).

### 11.3.2 Negative Coordinates

The axes can be extended downward and to the left to include negative values as shown here.

In the graph below, B is 3 units to the left of the origin so its x-coordinate is -3. It is 1 unit up, so its y-coordinate is 1. So B has coordinates (-3, 1).
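As a rough illustration of the (x, y) convention, the short Python sketch below turns a pair of coordinates into the moves needed to reach the point from the origin. The function name `describe_point` is made up for this example; it is just one way of spelling out the rule that the first coordinate is the horizontal move and the second is the vertical move.

```python
def describe_point(x, y):
    """Describe how to reach the point (x, y) starting from the origin."""
    if x == 0 and y == 0:
        return "the origin"
    # First coordinate: horizontal move (positive is right, negative is left).
    if x > 0:
        horizontal = f"{x} right"
    elif x < 0:
        horizontal = f"{-x} left"
    else:
        horizontal = "0 across"
    # Second coordinate: vertical move (positive is up, negative is down).
    if y > 0:
        vertical = f"{y} up"
    elif y < 0:
        vertical = f"{-y} down"
    else:
        vertical = "0 up"
    return f"{horizontal}, {vertical}"


print(describe_point(3, 2))   # point P
print(describe_point(-3, 1))  # point B, 3 units to the left and 1 unit up
```

Swapping the two numbers gives a different point, which is why the order of the coordinates matters.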
#### Activity: Reading Coordinates

Write down the coordinates of points A, C, and D, shown in the graph above. What are the coordinates of the origin?

##### Comment

The x-coordinate is given first.

##### Answer

A is opposite 3 on the x-axis, so its x-coordinate is 3. It is opposite 1 on the y-axis, so its y-coordinate is 1. The coordinates of A are (3, 1).

C is opposite 2 on the x-axis, so its x-coordinate is 2. It is opposite -2 on the y-axis, so its y-coordinate is -2. The coordinates of C are (2, -2).

The coordinates of D are (2, 1).

The origin has coordinates (0, 0).

For some more fun practice reading plotted points, try this game. There are three different levels of difficulty to choose from; try medium to start with.

#### Activity: Plotting Your Own Points

Draw x- and y-axes and mark scales as in the graph above. Mark in the points E at (1, 1), F at (2, 1), G at (1, -3) and H at (2, 3).

It would be useful to do this exercise on graph paper as it will be much easier to plot the points if you have a grid to work with. If you have none, you may download a graph paper PDF file from the link provided.

##### Answer

The plot you drew should have the points E, F, G, and H marked in the same positions as they are shown on the plot above.

E has coordinates (1, 1), so it is plotted 1 unit to the right and 1 unit up, that is, opposite 1 on the x-axis and opposite 1 on the y-axis.

F has coordinates (2, 1), so F is plotted opposite 2 on the x-axis and opposite 1 on the y-axis.

The coordinates of G are (1, -3), so G is plotted opposite 1 on the x-axis and opposite -3 on the y-axis.

The coordinates of H are (2, 3), so H is plotted opposite 2 on the x-axis and opposite 3 on the y-axis.

Now try the game again, but choose “hard” as the level and you can plot your own points.

### 11.3.3 Decimal Coordinates

In the examples so far, all the coordinates were whole numbers and the scales were also marked at whole unit intervals.
It is possible to plot points that have decimal coordinates such as (1.8, 2.1). In the graph below, each unit along the axes has been divided into ten, so each gray line represents a tenth of a whole unit, or 0.1, just like on the number lines we saw in Unit 2.

The point A has coordinates (1.3, 0.8) and is plotted 1.3 units across to the right of the origin and 0.8 units up.

#### Activity: Reading Decimal Coordinates

Write down the coordinates for the points B and C in the graph above.

##### Answer

B is plotted at the point which is 3 intervals to the left of the origin on the horizontal scale and 4 intervals past the 1 mark on the vertical scale, i.e. opposite -0.3 on the x-axis and opposite 1.4 on the y-axis. So the coordinates of B are (-0.3, 1.4).

Similarly, the coordinates of C are (1.5, 0.9).

For more practice in plotting points, try the following pages:

### 11.3.4 Displaying Data on a Graph or Chart

You have seen that, provided you have both an x- and a y-coordinate, you can plot a point on a chart. So if a set of data consists of pairs of data values, then the data can be plotted on a graph. For example, suppose a nurse is monitoring the weight of a woman during the last few weeks of her pregnancy and has recorded the woman’s weight each week as shown in the table below.

| Week of pregnancy | 34 | 36 | 38 | 40 |
|---|---|---|---|---|
| Weight in lb | 166 | 170 | 172 | 173 |

The second row heading, “Weight in lb,” tells us that the woman’s weight is expressed in lb (pounds). So in week 34, the woman had a weight of 166 lb. This can be written as (34, 166) and plotted on a graph in the same way as the earlier examples.

If we take the x-coordinate of each point as the week of pregnancy and the y-coordinate as the weight in lb, then points can be plotted at (34, 166), (36, 170), (38, 172), and (40, 173) to represent this set of data.

Now that we know how we are going to plot the points, we need to decide how to draw the graph itself.
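If you keep data like this on a computer, pairing the two rows of the table into (x, y) points is a one-line job. The sketch below is only an illustration in Python; the variable names `weeks`, `weights_lb`, and `points` are our own choices.

```python
# The two rows of the table: week of pregnancy and weight in lb.
weeks = [34, 36, 38, 40]
weights_lb = [166, 170, 172, 173]

# Pair each week (x-coordinate) with the weight recorded that week (y-coordinate).
points = list(zip(weeks, weights_lb))

print(points)  # [(34, 166), (36, 170), (38, 172), (40, 173)]
```

Each pair is written with the week first and the weight second, matching the (x, y) order used for plotting.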
### 11.3.5 Choosing the Scales for a Graph or Chart

In the examples considered so far, the scales on both axes have been provided for you. The scales have been the same on both axes, and they have both started at zero. However, this is not always necessary. Scales are usually chosen to illustrate the data clearly, making good use of the graph paper and using a scale that is easy to interpret.

In the example of weight during pregnancy, if the scales started at 0, all the points would be plotted in the top right-hand corner and would be difficult to read. The y-values range from 166 to 173, and the x-values start at 34 and end at 40. So, the difference between the lowest and highest y-coordinate is 7, and the difference in the x-coordinates is 6. A clearer way of presenting this data would be to use a scale on the x-axis that starts at 30 and ends at 42, and a scale on the y-axis that starts at 160 and ends at 174.

Using these scales, the points could be plotted as follows.

When the scales have been marked on the axes, remember to label the axes to show what is being measured, and include the units. Then plot the points and, if appropriate, join them with a line or smooth curve. In this case, you would expect the woman’s weight to change steadily between hospital visits over this period, so it is reasonable to join the points with a curve to show the overall trend.

Add a title so that your reader can see at a glance what your graph illustrates. Finally, state the source so your reader can check the data if they wish.

So here’s what you need to do to draw a graph.

• Choose an appropriate scale for the x- and y-axis.
• Plot the points accurately.
• Label both axes, including a brief description of the data and the units.
• Give your graph a suitable title.
• State the source of data.

You will be able to plot your own graph later in this unit.
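To see why the choice of scale matters for the pregnancy data, the following Python sketch compares how much of each axis the data occupies with the scales suggested above and with scales that start at zero. The helper names `data_range`, `axis_covers`, and `fraction_of_axis_used` are invented for this illustration, and "fraction of the axis used" is just one simple way of judging whether a scale makes good use of the paper.

```python
weeks = [34, 36, 38, 40]
weights_lb = [166, 170, 172, 173]


def data_range(values):
    """Difference between the highest and lowest data value."""
    return max(values) - min(values)


def axis_covers(values, axis_min, axis_max):
    """True if every data value fits between the ends of the axis."""
    return axis_min <= min(values) and max(values) <= axis_max


def fraction_of_axis_used(values, axis_min, axis_max):
    """Share of the axis length actually occupied by the data."""
    return data_range(values) / (axis_max - axis_min)


print(data_range(weeks), data_range(weights_lb))               # 6 7
print(axis_covers(weeks, 30, 42), axis_covers(weights_lb, 160, 174))

# Scales suggested in the text: the data fills half of each axis.
print(fraction_of_axis_used(weeks, 30, 42))                    # 0.5
print(fraction_of_axis_used(weights_lb, 160, 174))             # 0.5

# A vertical scale starting at 0 squeezes the data into a thin strip.
print(round(fraction_of_axis_used(weights_lb, 0, 174), 2))     # 0.04
```

With a vertical scale from 0 to 174 the points would occupy only about 4% of the height of the graph, which is why they would be hard to read.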
### 11.3.6 Using the Graph

Since you would expect the woman’s weight to change steadily between hospital visits, it is possible to use the graph to estimate her weight at other times during the six weeks.

[ It may be helpful to print this page, so that you may draw lines on the graphs to identify values. You may also use an object with a straight edge, such as a ruler or piece of paper, and hold it up to your monitor to help visualize where certain points appear in relation to the axis. ]

For example, to estimate her weight at 35 weeks, first find 35 on the horizontal axis. Then from this point, draw a line parallel to the vertical axis, until the line meets the curve at point P. From P, draw a line horizontally, to meet the vertical axis. Read off this value: here, it is about 168.4 lb. So, an estimate of the woman’s weight at 35 weeks is between 168 and 169 lb.

Determining values between the plotted points is known as interpolation.

#### Activity: Reading a Value from a Graph

Use the graph above to answer the following.

(a) How would you estimate the woman’s weight at 37 weeks?

##### Answer

(a) Find 37 on the horizontal axis and draw a line vertically up to the curve. Then draw a line horizontally from this point to intersect the vertical axis. This value is approximately 171.2 lb. The woman’s weight at 37 weeks is estimated to be about 171 lb.

(b) When do you think the woman’s weight was 167 lb?

##### Answer

(b) Find 167 lb on the vertical axis. Draw a horizontal line to meet the curve. Then draw a line vertically down to meet the horizontal axis. Read off the value: it is just over 34 weeks. So the woman’s weight is estimated to have been 167 lb after about 34 weeks.

(c) From the graph, can you predict the weight of the woman at times later than 40 weeks?

##### Answer

(c) Sometimes it is possible to extend a graph past the plotted points and use the graph to read off values outside the data set.
For example, if the baby had still not been born after 41 weeks, then you might estimate that the mother would weigh about 173.4 lb. However, a woman’s pregnancy usually lasts about 40 weeks and when the baby is born, her weight will fall substantially. If the baby is born before 41 weeks, the woman’s weight at 41 weeks would not be 173.4 lb. As there is no information given on when the baby was born, it is not possible to use the graph to estimate the woman’s weight after 40 weeks.

Trying to estimate the coordinates of points on the graph that lie outside the plotted points is known as extrapolation. It can sometimes be used to find values that are close to the plotted data set, if you are confident that the graph continues in a similar manner. However, you do need to be cautious: we cannot be certain that trends shown in graphs will continue.

The graph can also be used to determine the overall trend in the data, that is, how one value is changing with the other. In this case, the graph emphasizes the change in the woman’s weight during the last few weeks of pregnancy. It shows that she gained weight quite rapidly between weeks 34 and 36, but from week 36 to week 40 she gained less.

### 11.3.7 Drawing a Graph

The following data values were collected on the circumferences and diameters of various circular objects. All of the measurements are in inches.

| Diameter (in.) | Circumference (in.) |
|---|---|
| 4.4 | 14.1 |
| 6.6 | 20.9 |
| 8.6 | 27.0 |
| 10.3 | 32.3 |
| 23.0 | 72.8 |

#### Activity: Drawing a Graph

(a) On some graph paper or in your notebook, plot these points on a graph with the diameter on the horizontal (x) axis and the circumference on the vertical (y) axis. Draw a straight line through the points.

##### Comment

Remember to refer back to the summary of how to draw a graph so that you include all the detail needed. In particular, look carefully at the data you are given before you decide what scale to use for the x- and y-axes.
##### Answer

(a) Your graph should look like this:

Note the title and labeling of the axes. You may have used a different scale from us if you had different graph paper. This is fine as long as it makes best use of the paper and is easy to read from.

(b) Use your graph from part (a) to estimate the circumference of a circle of diameter 15 inches.

##### Answer

(b) To find the circumference of a circle of diameter 15 inches, find 15 on the horizontal axis. From this point, draw a vertical line to meet the graph. Then draw a horizontal line across to the vertical axis and read off the value. The circumference of a circle of diameter 15 in. is about 47 inches.

(c) Use your graph from part (a) to estimate the radius of a circle of circumference 30 inches.

##### Comment

Remind yourself what the radius of a circle is; you may have it in your math notebook.

##### Answer

(c) The circumference is 30 in., so find 30 on the vertical axis. From this point, draw a line horizontally across to the graph, then vertically down to the horizontal axis. The point where this line crosses the horizontal axis gives the diameter of a circle whose circumference is 30 inches. It is about 9.5 inches, so the radius of the circle will be half of this, about 4.7 to 4.8 inches.

Optional Brain Stretcher

This is an interesting graph as you can make an estimate for π (pi) from it. π is the ratio of the circumference to the diameter of any circle and is a constant. By measuring the slope (gradient) of the graph that you have drawn you will obtain an estimate for π. This is calculated by dividing the rise of the line by the corresponding run of the line as shown below. Have a go and see if you get a value close to 3.142.

##### Answer

$$\text{Gradient} = \frac{\text{rise}}{\text{run}} = \frac{72.8\ \text{in} - 14.1\ \text{in}}{23.0\ \text{in} - 4.4\ \text{in}} = \frac{58.7\ \text{in}}{18.6\ \text{in}} \approx 3.156 \text{ (to 3 d.p.)}$$

This is not a bad estimate for pi!
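The readings in this activity can also be approximated by calculation. The Python sketch below is an illustration under one stated assumption: instead of a single straight line drawn by eye through all the points, it joins each pair of neighboring points with a straight line and reads values off those segments, so its results differ slightly from values read from a hand-drawn graph. The function name `interpolate` and the variable names are our own.

```python
import math

# (diameter, circumference) measurements in inches, from the table.
measurements = [(4.4, 14.1), (6.6, 20.9), (8.6, 27.0), (10.3, 32.3), (23.0, 72.8)]


def interpolate(points, x):
    """Estimate y at x by joining neighboring points with straight lines."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    # Values outside the plotted points would need extrapolation instead.
    raise ValueError("x lies outside the plotted points")


# Part (b): circumference for a diameter of 15 in.
circumference_at_15 = interpolate(measurements, 15)

# Part (c): read the graph the other way round by swapping the coordinates.
swapped = [(c, d) for d, c in measurements]
diameter_at_30 = interpolate(swapped, 30)
radius_at_30 = diameter_at_30 / 2

# Brain stretcher: gradient between the first and last points.
(d_first, c_first), (d_last, c_last) = measurements[0], measurements[-1]
gradient = (c_last - c_first) / (d_last - d_first)

print(f"Circumference at d = 15 in.: {circumference_at_15:.1f} in.")  # about 47.3
print(f"Diameter at c = 30 in.: {diameter_at_30:.1f} in.")            # about 9.6
print(f"Radius at c = 30 in.: {radius_at_30:.1f} in.")                # about 4.8
print(f"Gradient: {gradient:.3f}, pi: {math.pi:.3f}")                 # 3.156 and 3.142
```

The calculated values agree with the graph readings to within the accuracy you would expect from measuring on paper.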
For more practice in drawing and interpreting graphs, try:

In the next section we will look at bar charts and the different ways that these can be drawn depending on the data that you have and what you want to show.

## 11.4 Bar Charts

Earlier, you saw data from the Irish Tourism Fact Card presented in tables. Information can also be illustrated in bar charts. The chart below is also from the Tourism Fact Card and shows the numbers of visitors participating in different activities. The data were based on estimates from the Irish Central Statistical Office “Purpose of Visit” survey.

Fáilte Ireland (2004)

In a bar chart, the length of each bar represents the number in that category. On this chart, you can see that the longest bar is for hiking/hill walking, so this was the most popular activity. The shortest bar is for cruising, indicating that this was the least popular activity of those listed.

From the title, we know that the numbers of visitors are measured in thousands. On this bar chart, the values represented by the bars have been marked directly on the chart. For example, the number of visitors who played golf was 134,000 and the number of visitors who went hiking/hill walking was 168,000.

If the values had not been marked on the bars, they could have been estimated by drawing a line from the end of the bar and seeing where it intersected the horizontal axis. For example, the end of the cycling bar is level with the 100 marked on the horizontal scale (as shown by the dashed line), so approximately 100,000 visitors went cycling.

Because the bars on this chart are horizontal, the chart is known as a horizontal bar chart. Bar charts can also be drawn with the bars vertical.

Note that in a bar chart, each bar has the same width, and since the bars represent different and unrelated categories, the bars do not touch each other and are separated by gaps.

Can you see anything that is missing from this bar chart?
There is no label on the horizontal axis, which you would normally expect to complete a bar chart.

### 11.4.1 Drawing a Bar Chart

The same basic guidelines apply to drawing a bar chart as they do to drawing a graph, so bear these in mind when completing this activity.

#### Activity: Drawing a Bar Chart

In an earlier activity, you created the following table showing information about hotel guests. Use the table to draw a vertical bar chart to show the total number of guests in each of the nationality categories. Mark the nationality categories on the horizontal axis and the number of guests on the vertical axis.

Origin and Age Categories of Hotel Guests

| | Child | Adult | Senior | Total |
|---|---|---|---|---|
| Irish | 8 | 5 | 4 | 17 |
| British | 4 | 6 | 4 | 14 |
| Mainland European | 0 | 3 | 1 | 4 |
| Rest of the world | 6 | 4 | 2 | 12 |
| Total | 18 | 18 | 11 | 47 |

Source: Hotel accommodation records

##### Answer

Your bar chart should look like this:

Did you remember to include the following essential information?

• A title describing the information shown in the chart?
• The source of the data?
• Labels for the axes?
• A scale for the vertical axis?

Notice how this bar chart stresses the total number of tourists in each category. You can see easily that there were more tourists from Ireland than any of the other groups.

### 11.4.2 Component Bar Charts

If you wished to show how the totals were broken down into the different age groups, you could split each bar into three different sections where the length of each section represented the number of guests in that age group. The resulting component bar chart is shown below.

Notice that a key or legend has been added to the graph to explain the shading for the different categories. When you read a component bar chart, you need to find the height of the relevant section. For instance, the top of the section representing British adults is opposite the 10 on the vertical scale. The bottom of that section is opposite the 4. So the number of British adults is $10 - 4 = 6$.
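The way the sections of a component bar stack up can be sketched with running totals. The Python example below is an illustration only: it assumes the sections are stacked in the order child, adult, senior from the bottom of each bar (which matches the British adult section running from 4 to 10), and the function name `section_bounds` is our own.

```python
from itertools import accumulate

# Counts from the table "Origin and Age Categories of Hotel Guests".
guests = {
    "Irish": {"Child": 8, "Adult": 5, "Senior": 4},
    "British": {"Child": 4, "Adult": 6, "Senior": 4},
    "Mainland European": {"Child": 0, "Adult": 3, "Senior": 1},
    "Rest of the world": {"Child": 6, "Adult": 4, "Senior": 2},
}


def section_bounds(counts):
    """Return (label, bottom, top) for each stacked section of one bar."""
    tops = list(accumulate(counts.values()))  # running totals give the tops
    bottoms = [0] + tops[:-1]                 # each section starts where the last ended
    return [(label, bottom, top)
            for label, bottom, top in zip(counts, bottoms, tops)]


for origin, counts in guests.items():
    print(origin)
    for label, bottom, top in section_bounds(counts):
        # The height of a section is its top minus its bottom.
        print(f"  {label}: from {bottom} to {top}, height {top - bottom}")
```

Reading a component bar chart reverses this calculation: you subtract the bottom of a section from its top to recover the count.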
#### Activity: Component Bar Chart

Draw the component bar chart for yourself from the hotel data and compare it to that shown above when you have finished.

### 11.4.3 Comparative Bar Charts

Another way of displaying the totals would be to split each bar into two, one representing the number of males and the other the number of females in each category. To enable these numbers to be compared directly, the bars representing males and females can be placed next to each other as shown below. This is an example of a comparative bar chart.

#### Activity: Reading a Comparative Bar Chart

Using the bar chart above, answer the following questions.

(a) How many female guests from Britain were there?

##### Answer

(a) There were ten female guests from Britain.

(b) How many male guests in total stayed at the hotel?

##### Answer

(b) The total number of male guests is found by adding the heights of the four bars representing males.

Notice how all these charts followed the same format with the title and source clearly labeled and the axes and scales clearly marked, but they emphasized different aspects of the data. When you are displaying data in a graphical form, it is important to choose a chart or graph that stresses the main points as simply as possible, so that your reader can understand your chart quickly and easily.

The tourist market bar charts display how many items (in this case, guests) there are in each category. How many times an item occurs is known as the frequency. Charts that display frequencies are also known as frequency diagrams.

For more practice interpreting bar charts, try the following lesson:

## 11.5 Pie Charts

You may have heard of Florence Nightingale as a pioneer in nursing practice, but did you know how important data was to her and that she also pioneered a form of the pie chart? This brief video will give you an insight into her work:
Although bar charts are useful to display numbers and percentages, sometimes you may wish to stress how different components contribute to the whole. Pie charts are often used where you want to compare different proportions in the data set. The area of each slice (or sector) of the pie represents the proportion in that particular category. For example, the pie chart below illustrates the favorite type of exercise for a group of people.

This pie chart shows the percentages for each category, so you can read these off directly. The most popular activity was walking (40%), followed by cycling (24%) and swimming (18%).

Notice, however, that the source of the data has not been stated, so you do not know where it came from or how many people were interviewed. It is important to consider both these points. If, for example, it was a small sample, you may not wish to rely on the results to generalize to a larger group.

Sometimes the percentages are not marked on the chart, so the proportions have to be roughly estimated by eye. For example, the pie chart below illustrates how employees in a particular company travel to work.

### Activity: How Employees Travel to Work

Use the pie chart above to answer the following questions.

(a) Do more people travel to work by bus or by train?

#### Answer

(a) The slice of the pie representing bus travel is larger than the slice representing train travel, so yes, more people travel by bus than by train.

(b) Would it be correct to say that the number of employees who travel to work as car drivers is about twice the number who travel as car passengers?

#### Answer

(b) The slice of the pie representing car drivers is about twice the size of the slice representing car passengers, so it is correct to say that the number of employees who travel to work as car drivers is about twice the number who travel as car passengers.

(c) Estimate the proportion of employees who travel to work by car, either as the driver or as a passenger.
#### Answer

(c) The two slices for car drivers and passengers cover about two-thirds of the circle, so about two-thirds of employees travel to work by car, either as the driver or as a passenger.

It is important to remember that pie charts are useful only for comparing proportions, looking at parts of a whole. If you are interested in other aspects such as trends or frequencies, then other graphs or charts may be more appropriate.

If necessary, and provided the pie chart has been drawn accurately, you can also work out the percentages by measuring the angle at the center of the pie for each sector, using a protractor. For example, the angle at the center of the circle for the “Car Passenger” category is about 80°. Because a complete circle measures 360°, the fraction that this sector represents is $\frac{80}{360}$, or about 22%. So about 22% of the employees travel to work as car passengers.

Pie charts are often used to give an overall impression rather than detailed information, so on many occasions a rough estimate will suffice.

### Optional Activity: How Employees Travel to Work

If you have a protractor you can print off the pie chart and work out the proportions for the other sectors.

#### Comment

You may need to extend the straight edges of each sector so that you can measure them properly with the protractor. Line up the bottom 0° line with one side of the sector and make sure that the center of the protractor is on the center of the circle. Check your measurements add up to 360°.

#### Answer

The angles measured from the pie chart, and the percentages calculated from them, are:

| Mode of travel | Angle | Percentage (to 1 d.p.) |
|---|---|---|
| Passengers in cars | 79° | 21.9% |
| Bus | 43° | 11.9% |
| Train | 26° | 7.2% |
| Cycle | 13° | 3.6% |
| Walk | 30° | 8.3% |
| Other | 7° | 1.9% |
| Car | 162° | 45.0% |

You may have noticed that the total percentage does not add up to 100%; this is due to rounding all the values to 1 d.p.
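The workings behind these percentages can be sketched in a few lines of Python. Each percentage is the sector angle divided by 360° and multiplied by 100. One value here is an assumption: the angle for “Other” is taken as 7°, inferred because the angles must add up to 360° and because 7° corresponds to the stated 1.9%. The variable names are our own.

```python
# Angles at the center of the pie chart, in degrees.
# The 7 degrees for "Other" is inferred so that the angles total 360.
angles = {
    "Passengers in cars": 79,
    "Bus": 43,
    "Train": 26,
    "Cycle": 13,
    "Walk": 30,
    "Other": 7,
    "Car": 162,
}

# A full circle is 360 degrees, so each sector is angle / 360 of the whole.
percentages = {mode: round(angle / 360 * 100, 1) for mode, angle in angles.items()}

print("Total angle:", sum(angles.values()))  # 360
for mode, pct in percentages.items():
    print(f"{mode}: {angles[mode]} degrees is {pct}%")

# The rounded percentages add up to slightly less than 100.
print("Total of rounded percentages:", round(sum(percentages.values()), 1))  # 99.8
```

The total of 99.8% shows the effect of rounding each value to 1 d.p. before adding.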
For the calculation of the percentages, it is better to include the workings if you can.

Don’t worry if you didn’t get exactly the same values; there are bound to be some small differences in the measurement of the angles which will affect this.

For more practice interpreting pie charts, try the following lesson:

## 11.6 Reading and Drawing Graphs and Charts: A Summary

You have had plenty of practice at reading graphs and charts in this unit, and you’ll meet lots more graphs and charts as you study, or in newspapers, magazines, or on the Internet. Take time to make sure that you understand what the graph is telling you. Here are the steps summarized so that you can refer to them later:

### Reading a Chart or Graph

• Read the title of the chart and check the source of the data.
• Check the axes. What is being measured? What units have been used?
• Examine the scales. How should these be interpreted? What does each interval represent?
• Check the key or legend. What does the shading or line represent?
• Finally, examine the chart or graph itself, reading off the information that you need. Are you looking for a specific value or a trend or some other information?

You have also created graphs by hand. You may prefer to use a computer to draw your graphs and charts. There are many exciting ways to present data, but the important points listed below still apply whether you are drawing graphs by hand or by using software.

### Drawing Graphs and Charts

• Include a clear title, so that your reader can see at a glance whether this chart is the one they are interested in.
• Include the source of the data, so that your reader can make further checks on how the data was collected if they wish.
• Choose the simplest graph or chart that illustrates the point you wish to emphasize.
• Label axes with both words and units.
• Mark scales clearly, choosing a scale that is easy to interpret.
• Include a legend if you have used more than one type of shading or line.
Look out for real-life graphs and charts in newspapers, magazines, on television, or on the Internet. Do these graphs follow the guidelines? Can you understand what the graphs are illustrating?

### 11.6.1 Reading a Graph with Caution!

#### Activity: Cutting the Cost of Heating

The graph below shows the monthly heating bills of a house, before and after attic insulation was installed. Use this graph to answer the following questions.

(a) Which of the principles in the Drawing Graphs and Charts box have not been followed?

##### Answer

(a) No source is stated, so you have no idea how the data were collected or how reliable the information may be. Was just one house used in the survey, or were many houses used?

The axes are labeled but the scales are not marked. Suppose the scale on the vertical axis was from \$120 to \$121. Then the apparent drop in the bill after insulation would be negligible. However, if the scale went from \$0 to \$130, the drop might be of more interest.

No scale is marked on the horizontal axis, either. In fact, the data was collected from November to September, so the horizontal scale should have indicated this.

(b) What does the graph appear to show?

##### Answer

(b) The graph appears to show a large drop in the heating bills after insulation was installed. However, without the scale on the vertical axis, it is impossible to say what kind of drop this is.

From the data, the drop in the monthly heating cost occurred in the May bill just as the weather was warming up for the summer. The reduced bills could simply be due to less heating being used in the summer.

Overall, no conclusions can be made from this graph about the effectiveness of the insulation.

The last activity illustrated an important point: when you are comparing two sets of data, you need to compare like with like. You would expect the bills for the summer to be less than those in the winter anyway, regardless of the presence or absence of attic insulation. It would be more appropriate to consider the amount of energy used for heating over two periods with similar weather.

Alternatively, you could compare two similar groups of houses, one group with the insulation and the other without, over the same period of time. So although graphs and charts are very useful, it is important to read them critically, checking that all the information you need to interpret them is provided.

Another important point to note when interpreting a graph is that if the graph appears to show an association between two quantities, it does not prove that one has caused the other. For example, suppose the number of students in a town registering in a math course rose from year to year and the number of burglaries in the town also rose.

Does this mean that the math students have committed the burglaries? No, of course not! Both rises may be linked to some other factor such as the number of people who have recently moved into the town.

However, in other situations, it may be much more tempting to suggest that one thing has caused another—for example, if the number of cancer cases is higher around nuclear power plants than elsewhere. In order to establish cause and effect, very large statistical surveys and analyses have to be carried out.
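To see how two quantities can rise together without one causing the other, here is a small Python sketch based on the math-course example above. All the figures are made up purely for illustration, and the `pearson` helper is a hypothetical name for a standard correlation calculation. A result close to 1 means the two quantities rise together very closely.

```python
import math


def pearson(xs, ys):
    """Return the correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up yearly figures for an imaginary town (illustration only)
population = [20_000, 22_000, 25_000, 29_000, 34_000]
math_students = [205, 218, 254, 287, 345]
burglaries = [98, 112, 123, 147, 168]

# The two quantities are strongly associated with each other...
r = pearson(math_students, burglaries)
print(f"Math students vs. burglaries: r = {r:.2f}")

# ...but each is also strongly associated with the growing population,
# which is a more plausible explanation for both rises.
print(f"Population vs. math students: r = {pearson(population, math_students):.2f}")
print(f"Population vs. burglaries:    r = {pearson(population, burglaries):.2f}")
```

A high value of `r` here only tells you that the numbers move together. It says nothing about why they do.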

This short video clip describes how the link between smoking and lung cancer was investigated.

## 11.7 Data Visualization

In this unit, you have seen how to visualize data as graphs, bar charts, and pie charts. However, there are many other exciting ways of presenting data, including interactive displays. As an example, watch the video clip below. It summarizes 200 years of world development for 200 countries in just four minutes! Note that in this visualization, the scale on the horizontal axis is not a uniform scale. Each value marked on the scale is ten times the value marked to its left. This is called a logarithmic scale, and you will come across it again if you continue with your math studies.
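The idea of a scale where each mark is ten times the one to its left can be sketched in a few lines of Python. The tick values below are assumptions chosen for illustration, not taken from the video. The sketch shows that although the values jump by a factor of ten each time, their positions along the axis are equally spaced.

```python
import math

# Hypothetical axis labels: each is ten times the one to its left
ticks = [400, 4_000, 40_000]

# On a logarithmic scale, the position of a value along the axis
# is proportional to its base-10 logarithm
positions = [math.log10(t) for t in ticks]

# Distance between neighbouring marks (rounded to avoid tiny float errors)
gaps = [round(b - a, 6) for a, b in zip(positions, positions[1:])]

for tick, position in zip(ticks, positions):
    print(f"value {tick:>6} sits at position {position:.3f}")
print("gaps between marks:", gaps)
```

Every gap comes out the same, which is why equal steps along a logarithmic axis represent equal multiplications rather than equal additions.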

You can visit the Gapminder website if you would like to investigate this graph further. This gives you the opportunity to add different countries to the graph or just concentrate on one at a time—don't spend too much time playing, though!

This dynamic chart of world development used a huge amount of data. Recently, more and more government and public data have been made freely available online. For example, you can check how the U.S. government is spending its money at the USAspending.gov site.

Freely available data allow different data sets to be combined and then displayed with powerful results. These combined data displays are known as mashups. The Year Open Data Went Worldwide is a video clip which shows the inventor of the World Wide Web, Sir Tim Berners-Lee, describing how data mashups have helped in a court case, in the Haiti earthquake disaster, and in providing information about local areas.

Remember that the same principles for reading charts and graphs that you met earlier also apply to mashups.

In particular:

• Do you know what data have been used?
• Do you know the source of the data?
• Have the data been selected to support the author’s point of view?

In this unit, you have learned how to describe a data set by using an average value and the spread of the data set. You have also learned how to draw and interpret different charts and graphs. You will find these skills helpful in your everyday life, as well as in your future studies.

## 11.8 Self-Check

The more you practice, the more your skills improve. Below are some exercises that will help you continue to develop your ability and check to make sure you understand the concepts discussed in this unit. Be sure to write your work out in your math notebook using proper notation so that you can refer to it later if necessary.

### 11.8.1 Self-Check on Averages

#### Exercise 1: Finding the Mean and the Range

Ten children in a class measured how tall they were and want to work out their typical height.

Work out the mean height and range for these 10 children from these measurements:

134 cm, 140 cm, 164 cm, 136 cm, 143 cm, 152 cm, 150 cm, 155 cm, 149 cm, 160 cm.

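##### Answer

Mean = sum of all the values ÷ number of values

= (134 + 140 + 164 + 136 + 143 + 152 + 150 + 155 + 149 + 160) cm ÷ 10

= 1483 cm ÷ 10

= 148.3 cm

So the mean height of the 10 children is 148.3 cm.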

Range = Highest value – lowest value

= 164 cm – 134 cm

= 30 cm

So the range of heights for the 10 children is 30 cm (about 1 foot)!
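If you would like to check this kind of arithmetic on a computer, the following Python sketch repeats the calculation using the standard `statistics` module. The variable names are illustrative only.

```python
from statistics import mean

# Heights in cm of the ten children from Exercise 1
heights = [134, 140, 164, 136, 143, 152, 150, 155, 149, 160]

# Mean: add up all the values and divide by how many there are
mean_height = mean(heights)

# Range: highest value minus lowest value
height_range = max(heights) - min(heights)

print(f"Mean height: {mean_height:.1f} cm")
print(f"Range: {height_range} cm")
```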

#### Exercise 2: Finding the Median Value

Using the data from Exercise 1, work out the median height for these 10 children.

##### Answer

Arranging all the values in ascending order:

134, 136, 140, 143, 149, 150, 152, 155, 160, 164

We have 10 values, so we need to calculate the mean of the middle two (the fifth and sixth values) to find the median.

So mean of middle two values = (149 cm + 150 cm) ÷ 2 = 149.5 cm.

So the median height for these 10 children is 149.5 cm.

In this case, the mean (148.3 cm) and the median (149.5 cm) are quite close.
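The steps for finding a median when there is an even number of values can be sketched in Python as follows. The sketch works through the steps by hand and then compares the result with the `median` function from the standard `statistics` module. The variable names are illustrative only.

```python
from statistics import median

# Heights in cm of the ten children from Exercise 1
heights = [134, 140, 164, 136, 143, 152, 150, 155, 149, 160]

# Step 1: arrange the values in ascending order
ordered = sorted(heights)

# Step 2: with an even number of values, pick out the middle two
middle = len(ordered) // 2
lower, upper = ordered[middle - 1], ordered[middle]

# Step 3: the median is the mean of those two values
median_by_hand = (lower + upper) / 2

# Cross-check against the library function
median_from_library = median(heights)

print(f"Middle two values: {lower} cm and {upper} cm")
print(f"Median by hand: {median_by_hand} cm")
print(f"Median from library: {median_from_library} cm")
```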

#### Exercise 3: Finding the Mode or Modal Value

The remaining 10 children in the class were then measured, giving a complete set of 20 values. Find the mode, or modal values, of the heights.

134 cm, 140 cm, 164 cm, 136 cm, 143 cm, 152 cm, 150 cm, 155 cm, 149 cm, 160 cm, 134 cm, 144 cm, 147 cm, 150 cm, 154 cm, 162 cm, 144 cm, 142 cm, 153 cm, 130 cm.

##### Answer

Arranging the values in ascending order:

130, 134, 134, 136, 140, 142, 143, 144, 144, 147, 149, 150, 150, 152, 153, 154, 155, 160, 162, 164.

So the modal values in this data set are 134 cm, 144 cm and 150 cm since they all occur the most often, in this case twice each.
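Counting how often each value occurs is easy to get wrong by eye, so here is a short Python sketch that does the tally. It uses `Counter` from the standard `collections` module and `multimode` from the `statistics` module (available in Python 3.8 and later). The variable names are illustrative only.

```python
from collections import Counter
from statistics import multimode

# Heights in cm of all 20 children in the class
heights = [134, 140, 164, 136, 143, 152, 150, 155, 149, 160,
           134, 144, 147, 150, 154, 162, 144, 142, 153, 130]

# Tally how many times each height occurs
counts = Counter(heights)

# multimode returns every value that shares the highest count
modes = sorted(multimode(heights))
highest_count = max(counts.values())

print(f"Modal values: {modes} cm")
print(f"Each occurs {highest_count} times")
```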

### 11.8.2 Self-Check on Bar Charts

#### Exercise 5: Bar Charts

Expected Median Height of Girls and Boys Aged Between 2 and 5 Years Old

| Age in years | Girls' height/cm | Boys' height/cm |
| --- | --- | --- |
| 2 | 85.7 | 87.1 |
| 2.5 | 90.7 | 91.9 |
| 3 | 95.1 | 96.1 |
| 3.5 | 99.0 | 99.9 |
| 4 | 102.0 | 103.3 |
| 4.5 | 106.2 | 106.7 |
| 5 | 109.4 | 110.0 |

Source of data: World Health Organisation, http://www.who.int/childgrowth/standards/sft_lhfa_girls_z_2_5.pdf, accessed September 6, 2012.

Draw a comparative bar chart for the data in the table.

##### Answer

Your bar chart should look something like this. Did you remember to give yours a title and label the axes appropriately?

Expected Median Height of Girls and Boys Aged Between 2 and 5 Years Old

Source: World Health Organisation

The difference between the heights of boys and girls at these ages does not show up clearly in the bar chart because the values are so close. Can you think of a way of making this clearer?
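One possible approach, offered as a sketch rather than the only answer, is to chart the difference between the two heights at each age instead of the heights themselves. The Python below calculates those differences from the table and prints a rough text bar for each one. The bar scale (one `#` per 0.1 cm) is an arbitrary choice made for this illustration.

```python
# Data from the table above
ages = [2, 2.5, 3, 3.5, 4, 4.5, 5]
girls = [85.7, 90.7, 95.1, 99.0, 102.0, 106.2, 109.4]
boys = [87.1, 91.9, 96.1, 99.9, 103.3, 106.7, 110.0]

# Difference in cm (boys minus girls), rounded to 1 decimal place
differences = [round(b - g, 1) for g, b in zip(girls, boys)]

# Rough text bar chart: one '#' for every 0.1 cm of difference
for age, diff in zip(ages, differences):
    bar = "#" * round(diff * 10)
    print(f"{age:>4} years  {bar} {diff} cm")
```

Plotting the differences uses a vertical scale of about 0 to 1.5 cm rather than 0 to 110 cm, so small gaps become visible. As with any chart, the title and axis labels would need to make clear that differences, not heights, are being shown.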

### 11.8.3 Self-Check on Pie Charts

#### Exercise 6: Pie Charts

Pie Chart Showing the Favourite Pies of 50 People

Look at the pie chart and estimate by eye the percentage of people for whom each pie was the favorite.

##### Answer

About 33% said that pumpkin pie was their favorite.

About 25% said that apple pie was their favorite.

Adding up the values for the first two sectors and subtracting the total from 100% gives 100% – (33% + 25%) = 42%, so about 42% said that key lime pie was their favorite.

## 11.9 Quiz Time

### Practice Quiz

Now that you have taken the time to work through these sections, consider giving this short quiz a try! You may find that it will help you to monitor your progress, particularly if you took the quiz at the start of the unit as well.

The quiz does not check all the topics in the unit, but it should give you some idea of the areas you may need to spend more time on. Remember, it doesn’t matter if you get some, or even all of the questions wrong—it just indicates how much time you may need to come back and review this unit!

## 11.10 Study Checklist

### Study Checklist

You should now be able to:

• Calculate the mean, median, and mode for a data set and understand when to use each.
• Interpret and construct tables.
• Critically interpret charts and graphs.
• Construct appropriate charts and graphs to display information.
• Describe some new types of data visualization.

### Activity: How Far Have You Come?

Well done for reaching the end of the final unit of the course—a great achievement.  Now is the time to look back over the whole course and think about how far you have come since you started: