Learn to code for data analysis

# 1.5 Comparison operators

In Expressions, you learned that Python has arithmetic operators (+, /, - and *) and that expressions such as 5 + 2 evaluate to a value (in this case the number 7).

Figure 6

Python also has comparison operators. These are:

== equals

!= not equal

< less than

> greater than

<= less than or equal to

>= greater than or equal to

Expressions involving these operators always evaluate to a Boolean value, that is, True or False. Here are some examples:

2 == 2 evaluates to True

2 + 2 == 5 evaluates to False

2 != 1 + 1 evaluates to False

45 < 50 evaluates to True

20 > 30 evaluates to False

100 <= 100 evaluates to True

101 >= 100 evaluates to True
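These examples can be checked directly in a notebook cell. The following is a minimal sketch; the variable name `is_bigger` is only illustrative and does not come from the course.

```python
# Each comparison is an expression that evaluates to True or False.
print(2 == 2)        # True
print(2 + 2 == 5)    # False: the addition is done first, then 4 is compared with 5
print(2 != 1 + 1)    # False: both sides are 2, so they are not unequal
print(20 > 30)       # False
print(101 >= 100)    # True

# The result is an ordinary Boolean value, so it can be stored in a variable.
is_bigger = 20 > 30
print(type(is_bigger))  # <class 'bool'>
```

Note that == (two equals signs, no space between them) tests for equality, whereas a single = assigns a value to a variable.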

The comparison operators can be used with other types of data, not just numbers. Used with strings, they compare in alphabetical order. For example:

'aardvark' < 'zebra' evaluates to True
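A short sketch of string comparisons follows. The example strings are chosen only for illustration. One detail worth knowing is that Python compares strings character by character using their character codes, so every uppercase letter sorts before every lowercase letter.

```python
# Strings are compared character by character, in alphabetical order.
print('aardvark' < 'zebra')   # True: 'a' comes before 'z'

# A string that starts with 'K' and has more characters counts as later than 'K'.
print('Kenya' >= 'K')         # True
print('Japan' >= 'K')         # False: 'J' comes before 'K'

# Uppercase letters sort before lowercase ones, so case matters.
print('apple' < 'Banana')     # False: 'B' sorts before 'a'
```

This is why a comparison against 'K' works as expected only when the names being compared consistently start with a capital letter.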

In Calculating over columns you saw that when applied to whole columns, the arithmetic operators did the calculations row by row. Similarly, an expression like df['Country'] >= 'K' will compare the country names, row by row, against the string 'K' and record whether the result is True or False in a series like this:

0 False

1 False

2 False

3 False

4 False

5 False

...

Name: Country, dtype: bool
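To see the row-by-row comparison on a small scale, the sketch below builds a cut-down stand-in for df with only a 'Country' column and six rows. The dataframe contents and the name `is_k_or_later` are assumptions made for illustration; the course dataframe has many more rows and columns.

```python
import pandas as pd

# Illustrative stand-in for the course dataframe (only six countries).
df = pd.DataFrame({
    'Country': ['Brazil', 'China', 'Egypt', 'Japan', 'Mexico', 'Viet Nam']
})

# The comparison is applied to each row of the column in turn.
is_k_or_later = df['Country'] >= 'K'

# Mexico and Viet Nam give True; the other four rows give False.
print(is_k_or_later)
```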

If such an expression is put within square brackets immediately after a dataframe’s name, a new dataframe is obtained with only those rows where the result is True. So:

df[df['Country'] >= 'K']

returns a new dataframe with all the columns of df but with only the rows corresponding to countries starting with K or a letter later in the alphabet.
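A minimal sketch of this kind of row selection, using a four-row stand-in for df. The small dataframe and the name `k_or_later` are assumptions for illustration only.

```python
import pandas as pd

# Illustrative stand-in for the course dataframe.
df = pd.DataFrame({
    'Country': ['Brazil', 'Japan', 'Mexico', 'Viet Nam'],
    'Population (1000s)': [200362, 127144, 122332, 91680],
})

# Keep only the rows for which the comparison is True.
k_or_later = df[df['Country'] >= 'K']

# All columns are kept, but only the Mexico and Viet Nam rows remain.
print(k_or_later)
```

The selected rows keep their original row labels (here 2 and 3), and df itself is left unchanged.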

As another example, to see the data for countries with over 80 million inhabitants, the following code returns and displays a new dataframe with all the columns of df but only the rows where the value in the 'Population (1000s)' column is greater than 80000:

In []:

df[df['Population (1000s)'] > 80000]

Out[]:

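|     | Country | Population (1000s) | TB deaths |
| --- | --- | --- | --- |
| 23 | Brazil | 200362 | 4400 |
| 36 | China | 1393337 | 41000 |
| 53 | Egypt | 82056 | 550 |
| 58 | Ethiopia | 94101 | 30000 |
| 65 | Germany | 82727 | 300 |
| 77 | India | 1252140 | 240000 |
| 78 | Indonesia | 249866 | 64000 |
| 85 | Japan | 127144 | 2100 |
| 109 | Mexico | 122332 | 2200 |
| 124 | Nigeria | 173615 | 160000 |
| 128 | Pakistan | 182143 | 49000 |
| 134 | Philippines | 98394 | 27000 |
| 141 | Russian Federation | 142834 | 17000 |
| 185 | United States of America | 320051 | 490 |
| 190 | Viet Nam | 91680 | 17000 |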
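The same pattern works with any comparison operator and any threshold. The sketch below rebuilds five of the rows shown in the output as a small dataframe and applies two further filters; the thresholds and the names `over_a_billion` and `few_deaths` are illustrative assumptions, not part of the course notebook.

```python
import pandas as pd

# Five rows taken from the output, rebuilt as a small dataframe.
df = pd.DataFrame({
    'Country': ['Brazil', 'China', 'Germany', 'India', 'Japan'],
    'Population (1000s)': [200362, 1393337, 82727, 1252140, 127144],
    'TB deaths': [4400, 41000, 300, 240000, 2100],
})

# Populations are in thousands, so 1000000 means one billion people.
over_a_billion = df[df['Population (1000s)'] > 1000000]
print(over_a_billion)   # the China and India rows

# <= includes rows whose value equals the threshold exactly.
few_deaths = df[df['TB deaths'] <= 2100]
print(few_deaths)       # the Germany and Japan rows
```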

## Exercise 2 Comparison operators

You are now ready to complete Exercise 2 in Exercise notebook 2.

Remember to run the existing code in the notebook before you start the exercise. When you’ve completed the exercise, save the notebook.