Learn to code for data analysis

Week 2: Having a go at it Part 2

1 Enter the pandas

As you probably realised, this way of coding is not practical for large-scale data analysis.

Three lines of code were required for each country, to store the number of deaths, store the population, and calculate the death rate. With roughly 200 countries in the world, my trivial analysis would require 400 variables and typing almost 600 lines of code! Life’s too short to be spent that way.
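To make the scaling problem concrete, here is a minimal sketch of the one-variable-per-datum style, using the figures for three countries that appear in Table 1. The variable names are made up for this illustration, and the rate is simply assumed to be deaths divided by population.

```python
# One variable per datum: three lines for every country.
# Variable names are hypothetical; the rate formula (deaths / population)
# is an assumption made for this sketch.
deaths_angola = 6900
population_angola = 21472
rate_angola = deaths_angola / population_angola

deaths_brazil = 4400
population_brazil = 200362
rate_brazil = deaths_brazil / population_brazil

deaths_portugal = 140
population_portugal = 10608
rate_portugal = deaths_portugal / population_portugal

# The total needs one term per country, so this expression
# grows with every country that is added.
total_deaths = deaths_angola + deaths_brazil + deaths_portugal
print(total_deaths)  # 11440
```

Every extra country adds three more lines and one more term to the total, which is what makes this style impractical for roughly 200 countries.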

Instead of using a separate variable for each datum, it is better to organise data as a table of rows and columns.

Table 1

Country    Deaths  Population
Angola       6900       21472
Brazil       4400      200362
Portugal      140       10608

In that way, instead of 400 variables, I only need one that stores the whole table. Instead of writing a mile-long expression that adds 200 variables to obtain the total deaths, I’ll write a short expression that calculates the total of the ‘Deaths’ column, no matter how many countries (rows) there are.

To organise data into tables and do calculations on such tables, you and I will use the pandas module, which is included in Anaconda and CoCalc. A module is a package of various pieces of code that can be used individually. The pandas module provides very extensive and advanced data analysis capabilities to complement Python. This course only scratches the surface of pandas.

I have to tell the computer that I’m going to use a module.

In []:

from pandas import *

That line of code is an import statement: from the pandas module, import everything. In plain English: load into memory all pieces of code that are in the pandas module, so that I can use any of them. In the above statement, the asterisk isn’t the multiplication operator but instead means ‘everything’.

Each weekly project in this course will start with this import statement, because all projects need the pandas module.
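As a rough preview of what the import makes possible, the sketch below stores Table 1 in a single variable and totals the ‘Deaths’ column with one short expression. The variable names are hypothetical, and this sketch imports only the DataFrame name rather than everything, which is a choice made here to keep the example explicit.

```python
from pandas import DataFrame

# One variable holds the whole table (the name 'table' is hypothetical).
table = DataFrame({
    'Country': ['Angola', 'Brazil', 'Portugal'],
    'Deaths': [6900, 4400, 140],
    'Population': [21472, 200362, 10608],
})

# The same expression works no matter how many rows the table has.
total_deaths = table['Deaths'].sum()
print(total_deaths)  # 11440
```

Adding more countries only means adding rows to the table; the expression that computes the total stays the same.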

The words from and import are reserved words: they can’t be used as variable, function or module names. If you try to use one as a name, you will get a syntax error.

In []:

from = 100

  File "<ipython-input-23-6958f0ebc10d>", line 1
    from = 100
         ^
SyntaxError: invalid syntax

Jupyter notebooks show reserved words in boldface font to make them easier to spot. If you see a boldface name in an assignment (as you will for the code above), you must choose a different name.
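If you are not working in a notebook, or are unsure whether a name is reserved, Python’s standard keyword module can tell you. The following is a small sketch of that check; the names tested are just examples.

```python
import keyword

# iskeyword() reports whether a string is one of Python's reserved words.
print(keyword.iskeyword('from'))    # True
print(keyword.iskeyword('import'))  # True
print(keyword.iskeyword('pandas'))  # False: a module name, not a reserved word

# kwlist holds all the reserved words of the Python version you are running.
print(keyword.kwlist)
```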

Exercise 5 pandas

Use Exercise 5 in Exercise notebook 1 to help you answer these questions about errors you might come across.

1. What kind of error will you get if you misspell 'pandas' as 'Pandas'?

a. A syntax error

b. A name error, reported as an import error

The correct answer is b. The computer is expecting a name but there is no module with the name 'Pandas' in the Anaconda distribution. Remember that names are case-sensitive.


2. What kind of error will you get if you misspell 'import' as 'impart'?

a. A name error

b. A syntax error

The correct answer is b. The computer is expecting a reserved word and anything else will raise a syntax error.


3. What kind of error will you get if you forget the asterisk?

a. A name error

b. A syntax error

The correct answer is b. The statement cannot end with the reserved word 'import'; the computer is expecting an indication of what to import.
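If you would like to check these three answers without breaking your notebook, the sketch below triggers each mistake in a controlled way and records which kind of error Python raises. The helper function error_type is made up for this illustration; import_module and compile are standard Python, and compile only checks the syntax of a line without running it.

```python
import importlib

def error_type(action):
    """Run action() and return the class of the error it raises, or None."""
    try:
        action()
    except Exception as error:
        return type(error)
    return None

# 1. Wrong capitalisation: no module is called 'Pandas'.
wrong_case = error_type(lambda: importlib.import_module('Pandas'))

# 2. 'impart' is not a reserved word, so the line is not valid Python.
misspelt = error_type(lambda: compile('from pandas impart *', '<demo>', 'exec'))

# 3. Nothing after 'import': the statement is incomplete.
no_asterisk = error_type(lambda: compile('from pandas import', '<demo>', 'exec'))

print(issubclass(wrong_case, ImportError))  # True: a kind of import error
print(misspelt.__name__)                    # SyntaxError
print(no_asterisk.__name__)                 # SyntaxError
```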

