Learn to code for data analysis

1.2 Defining functions

To make the GDP values easier to read, I wish to convert US dollars to millions of US dollars.

Figure 2 An image of a large shipping container.

I have to be precise about what I mean. For example, if the GDP is 4,567,890.1 (using commas to separate the thousands, millions, etc.), what do I want to obtain? Do I want always to round down to the nearest million, making it 4 million, round to the nearest million, making it 5, or round to one decimal place, making it 4.6 million? Since the aim is to simplify the numbers and not introduce a false sense of precision, let’s round to the nearest million.
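The three options can be compared side by side. This is only a small sketch using the example value above; the variable names are my own, chosen for illustration.

```python
import math

gdp = 4567890.1            # the example value from the text, in US dollars
millions = gdp / 1000000   # the same value expressed in millions

roundedDown = math.floor(millions)       # always round down
roundedNearest = round(millions)         # round to the nearest million
roundedOneDecimal = round(millions, 1)   # round to one decimal place

print(roundedDown, roundedNearest, roundedOneDecimal)   # 4 5 4.6
```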

I will define my own function to do such a conversion. It’s a generic function that takes any number and rounds it to the nearest million. I will later apply the function to each value in the GDP column. It’s easier to first show the code and then explain it.

In []:

def roundToMillions(value):
    result = round(value / 1000000)
    return result

A function definition always starts with def , which is a reserved word in Python.

After it comes the function’s name and arguments, surrounded by parentheses, and finally a colon (:). This function just takes one argument. If there’s more than one argument, use commas to separate them.

Next comes the function’s body, where the calculations are done, using the arguments like any other variables. The body must be indented, conventionally by four spaces.
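To illustrate a function with more than one argument, here is a hypothetical variant that rounds to any unit, not just millions. The name roundToUnit and its unit argument are my own invention for this sketch and are not used elsewhere in the course.

```python
def roundToUnit(value, unit):
    # hypothetical two-argument function: the arguments are separated by a comma
    # and used in the indented body like any other variables
    return round(value / unit)

print(roundToUnit(4567890.1, 1000000))   # nearest million: 5
print(roundToUnit(4567890.1, 1000))      # nearest thousand: 4568
```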

For this function, the calculation is simple. I take the value, divide it by one million, and call the built-in Python function round() to convert that number to the nearest integer. If the number is exactly mid-way between two integers, round() will pick the even integer, for example round(2.5) is 2 but round(3.5) is 4.

Finally, I write a return statement to pass the result back to the code that called the function. The return word is also reserved in Python.
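The mid-way rule is easy to check directly. This sketch assumes Python 3 and repeats the function definition so that it can be run on its own.

```python
def roundToMillions(value):
    return round(value / 1000000)

print(round(2.5), round(3.5))      # 2 4
print(roundToMillions(2500000))    # 2, because 2.5 is mid-way and 2 is even
print(roundToMillions(3500000))    # 4, because 3.5 is mid-way and 4 is even
```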

The result variable just stores the rounded value temporarily and has no other purpose. It’s better to write the body as a single line of code:

return round(value / 1000000)

Finally, I need to test the function by calling it with various argument values and checking that the returned value is equal to what I expect.

In []:

roundToMillions(4567890.1) == 5

Out[]:

True

The art of testing is to find as few test cases as possible that cover all bases. And I mean all, especially those where you think ‘Naaah, it’ll never happen’. It will, because data can be incorrect. Prepare for the worst and hope for the best.

So here are some more tests, even for the unlikely cases of the GDP being zero or negative, and you can probably think of others.

In []:

roundToMillions(0) == 0 # always test with zero...

Out[]:

True

In []:

roundToMillions(-1) == 0 #...and negative numbers

Out[]:

True

In []:

roundToMillions(1499999) == 1 # test rounding to the nearest

Out[]:

True
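The same checks can also be collected in one cell with assert statements, which stay silent when a check passes and raise an AssertionError when one fails. This is just one possible way to organise the tests, not necessarily how the course does it; the last two cases are extra ones I added.

```python
def roundToMillions(value):
    return round(value / 1000000)

assert roundToMillions(4567890.1) == 5
assert roundToMillions(0) == 0            # always test with zero...
assert roundToMillions(-1) == 0           # ...and negative numbers
assert roundToMillions(1499999) == 1      # test rounding to the nearest
assert roundToMillions(1500000) == 2      # extra case: 1.5 is mid-way, so the even integer
assert roundToMillions(-4567890.1) == -5  # extra case: a large negative value
print('all tests passed')
```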

Now for the next conversion, from US dollars to a local currency, for example British pounds. I searched the internet for ‘average yearly USD to GBP rate’, chose a conversion service and took the value for 2013. Here’s the code and some tests.

In []:

def usdToGbp(usd):
    return usd / 1.564768 # average rate during 2013

usdToGbp(0) == 0

Out[]:

True

In []:

usdToGbp(1.564768) == 1

Out[]:

True

In []:

usdToGbp(-1) == -1 / 1.564768

Out[]:

True
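Comparing floating-point results with == works for simple cases such as 0 and 1.564768, but division can leave tiny rounding errors for other values. A more tolerant check, sketched here, uses math.isclose() from Python’s standard library. The test amount of 100 dollars is an example I chose.

```python
import math

def usdToGbp(usd):
    return usd / 1.564768   # average rate during 2013

# converting to pounds and back should give (almost exactly) the original amount
pounds = usdToGbp(100)
assert math.isclose(pounds * 1.564768, 100)

# a negative amount should stay negative after conversion
assert usdToGbp(-1) < 0
```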

Defining functions is such an important part of coding that you should not skip the next exercise, where you will define your own functions.

Exercise 2 Defining functions

Complete Exercise 2 in the Exercise notebook 3 to practise defining your own functions.