Learn to code for data analysis

1.4 Applying functions

Having coded the three data conversion functions, I can now apply them to the GDP table.

I first select the relevant column:

In []:

column = gdp['Country']

column

Out[]:

0 UK

1 USA

2 China

3 Brazil

4 South Africa

Name: Country, dtype: object

Next, I use the column method apply(), which applies a given function to each cell in the column. It returns a new column in which each cell is the conversion of the corresponding original cell:

In []:

column.apply(expandCountry)

Out[]:

0 United Kingdom

1 United States

2 China

3 Brazil

4 South Africa

Name: Country, dtype: object
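To make the behaviour of apply() concrete, here is a minimal, self-contained sketch. The body of expandCountry below is a hypothetical stand-in, since the course's own definition is not shown in this section; the column is rebuilt by hand from the values displayed above.

```python
import pandas as pd

# Hypothetical stand-in for the course's expandCountry function,
# which was written earlier and is not reproduced in this section.
def expandCountry(name):
    if name == 'UK':
        return 'United Kingdom'
    elif name == 'USA':
        return 'United States'
    else:
        return name

# The 'Country' column, rebuilt from the values shown above.
column = pd.Series(['UK', 'USA', 'China', 'Brazil', 'South Africa'],
                   name='Country')

# apply() calls expandCountry once per cell and collects the results
# in a new column; the original column is left unchanged.
expanded = column.apply(expandCountry)

# The same values, computed by hand with a loop over the cells.
by_hand = [expandCountry(cell) for cell in column]
```

Both expanded and by_hand hold the same converted names, which suggests one way to think of apply(): as a loop over the cells that pandas writes for you.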

Finally, I add that new column to the dataframe, using a new column heading:

In []:

gdp['Country name'] = column.apply(expandCountry)

gdp

Out[]:

        Country     GDP (US$)    Country name
0            UK  2.678455e+12  United Kingdom
1           USA  1.676810e+13   United States
2         China  9.240270e+12           China
3        Brazil  2.245673e+12          Brazil
4  South Africa  3.660579e+11    South Africa

In a similar way, I can convert the US dollars to British pounds, then round to the nearest million, and store the result in a new column. I could apply the conversion and rounding functions in two separate statements, but with method chaining I can apply both functions in a single line of code. This is possible because the column returned by the first call of apply() is the context for the second call of apply(). Here’s how it’s written:

In []:

column = gdp['GDP (US$)']

result = column.apply(usdToGbp).apply(roundToMillions)

gdp['GDP (£m)'] = result

gdp

Out[]:

        Country     GDP (US$)    Country name  GDP (£m)
0            UK  2.678455e+12  United Kingdom   1711727
1           USA  1.676810e+13   United States  10716029
2         China  9.240270e+12           China   5905202
3        Brazil  2.245673e+12          Brazil   1435148
4  South Africa  3.660579e+11    South Africa    233937
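The following sketch compares the chained form with the equivalent two-statement form. The bodies of usdToGbp and roundToMillions, and in particular the exchange rate, are assumptions made for illustration; the course's own definitions are not shown in this section.

```python
import pandas as pd

# Assumed exchange rate, for illustration only.
ASSUMED_USD_PER_GBP = 1.564768

# Hypothetical stand-ins for the course's conversion functions.
def usdToGbp(usd):
    return usd / ASSUMED_USD_PER_GBP

def roundToMillions(value):
    return round(value / 1000000)

# Two of the GDP values shown in the table above.
column = pd.Series([2.678455e+12, 1.676810e+13], name='GDP (US$)')

# Two separate statements, with a named intermediate column.
pounds = column.apply(usdToGbp)
two_steps = pounds.apply(roundToMillions)

# Method chaining: the column returned by the first apply()
# is the context of the second apply().
chained = column.apply(usdToGbp).apply(roundToMillions)
```

Both forms produce the same column. Chaining simply avoids naming the intermediate result, which is convenient when that result is not needed elsewhere.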

Now it’s just a matter of selecting the two new columns, as the original ones are no longer needed.

In []:

headings = ['Country name', 'GDP (£m)']

gdp = gdp[headings]

gdp

Out[]:

     Country name  GDP (£m)
0  United Kingdom   1711727
1   United States  10716029
2           China   5905202
3          Brazil   1435148
4    South Africa    233937

Note that method chaining only works if each chained method returns the same type of value as its context. This is like chaining multiple arithmetic operators (e.g. 3+4-5): each one takes two numbers and returns a number that is used by the next operator in the chain. In this course, methods only have two possible contexts, columns and dataframes, so you can either chain column methods that return a single column (that is, a Series), like apply(), or dataframe methods that return dataframes. For example, gdp.head(4).tail(2) is a dataframe with just China and Brazil, i.e. the last two of the first four rows of the dataframe shown above. You’ll see further examples of chaining (and an easier way to select multiple rows) later this week.
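As a quick check of the head() and tail() example, the sketch below rebuilds the final dataframe by hand from the values shown above and chains the two dataframe methods.

```python
import pandas as pd

# The final dataframe, rebuilt from the values shown above.
gdp = pd.DataFrame({
    'Country name': ['United Kingdom', 'United States', 'China',
                     'Brazil', 'South Africa'],
    'GDP (£m)': [1711727, 10716029, 5905202, 1435148, 233937],
})

# head(4) returns a dataframe with the first four rows...
first_four = gdp.head(4)

# ...so tail(2) can be chained onto it, giving the last two of those four rows.
subset = gdp.head(4).tail(2)
```

Because head() returns a dataframe, tail() has a valid context to work on, and subset keeps the original row labels 2 and 3 for China and Brazil.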

This concludes the data transformation part. After applying functions in the next exercise, you’ll learn how to combine two tables.

Exercise 4 Applying functions

You can practise applying functions in Exercise 4 of your Exercise notebook 3.