Learn to code for data analysis

1 Life expectancy project

This week I want to see (literally, via a chart) whether life expectancy tends to be longer in richer countries.

Figure 1: A photograph of an adult hand holding the hand of a baby.

Richer countries can afford to spend more on healthcare and on road safety, for example, to reduce mortality. On the other hand, richer countries may have less healthy lifestyles.

The World Bank provides loans and grants to governments of middle- and low-income countries to help reduce poverty. As part of its work, the World Bank has put together hundreds of datasets on a range of issues, such as health, education, economy, energy and the effectiveness of aid in different countries. I will use two of these datasets, which you can see online by following the links below. You do not need to download the datasets.

One dataset lists the gross domestic product (GDP) for each country, in United States dollars and cents; the other lists the life expectancy, in years, for each country. The latest life expectancy data I can access is for 2013, so that will be the year I take for the GDP. The disadvantage of using the GDP and the life expectancy values for the same year is that they do not account for the time it takes for a country’s wealth to have an effect on lifestyle, healthcare and other factors influencing life expectancy.

While it is useful to have all GDPs in a common currency to compare different countries, it doesn’t make much sense to report the GDP of a whole country to a supposed precision of a US cent. I noted that the value for the USA is a round number, but the values for other countries are not. This is likely due in part to the conversion of local currencies to US dollars. It makes more sense to report the GDP values in a larger unit, e.g. millions of dollars. Moreover, for those who don’t live in a country using the US dollar as the official currency, it’s probably easier to understand GDP values in their own local currency.
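The two transformations just described, a change of currency and a change of unit, can be sketched in plain Python. This is only an illustration: the function names, the exchange rate and the GDP figure are all made up for the example and are not taken from the World Bank data.

```python
def usd_to_local(usd, usd_per_local_unit):
    """Convert a US dollar amount to a local currency.

    usd_per_local_unit is how many US dollars one unit of the local
    currency buys (an assumed rate, supplied by the caller).
    """
    return usd / usd_per_local_unit


def to_millions(amount):
    """Express an amount in millions, rounded to the nearest whole million."""
    return round(amount / 1_000_000)


# Placeholder values for illustration only, not real data.
ASSUMED_USD_PER_GBP = 1.5          # hypothetical exchange rate
gdp_usd = 2_500_000_000_000.75     # made-up GDP, reported to the cent

# Convert to pounds first, then drop the spurious precision.
gdp_gbp_millions = to_millions(usd_to_local(gdp_usd, ASSUMED_USD_PER_GBP))
print(gdp_gbp_millions)
```

Rounding is done last, after the currency conversion, so that rounding error is not multiplied by the exchange rate.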

To sum up, this week’s project will transform currency values and combine GDP and life expectancy data.

Note that the combination is made simple by the common country names in the two datasets, but in general care has to be taken that the common attribute really means the same thing in both. For example, if you combine two datasets on a common unemployment attribute, you must be sure that it was measured in the same way in each, as there are various ways of measuring unemployment.
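To make the idea of combining on a common attribute concrete, here is a minimal sketch using plain Python dictionaries keyed by country name. The country names and values are invented placeholders, and the approach (keep only countries present in both datasets, and list the rest for inspection) is one reasonable choice rather than the only one.

```python
# Invented placeholder data: GDP in millions and life expectancy in years.
gdp_millions = {"Country A": 1200, "Country B": 340, "Country C": 75}
life_expectancy = {"Country A": 81, "Country B": 68, "Country D": 74}

# Countries that appear in both datasets can be combined safely.
common = sorted(gdp_millions.keys() & life_expectancy.keys())
combined = [(name, gdp_millions[name], life_expectancy[name]) for name in common]

# Countries that appear in only one dataset are worth checking by hand:
# they may be genuinely missing, or just spelled differently.
unmatched = sorted(gdp_millions.keys() ^ life_expectancy.keys())

for row in combined:
    print(row)
print("Unmatched:", unmatched)
```

Listing the unmatched names is a cheap check that the common attribute really lines up between the two sources.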

I’m aware that the GDP is a crude way of comparing wealth across nations. For example, it doesn’t take population or the cost of living into account. Some of this week’s exercises will ask you to add the population data. Think of other ways to improve the analysis method, of other conversions that might be needed, and of other ways to investigate life expectancy factors.
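As one example of taking population into account, GDP can be divided by population to give GDP per person. The sketch below assumes both figures refer to the same country and year; the function name and the numbers are hypothetical.

```python
def gdp_per_capita(gdp, population):
    """Return GDP divided by population, in the same currency unit as gdp."""
    if population <= 0:
        raise ValueError("population must be a positive number")
    return gdp / population


# Made-up figures for illustration only.
print(gdp_per_capita(1_000_000_000, 2_000_000))
```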

Links: