1.5 Sorting on a column
One of the research questions was: which countries have the smallest and largest number of deaths?
Because the table is small, it is not too difficult to scan the TB deaths column and find those countries. However, such a process is prone to errors and impractical for large tables. It’s much better to sort the table by that column, and then look up the countries in the first and last rows.
As you may have guessed by now, sorting a table takes just one more line of code.
data.sort_values('TB deaths')
| | Country | Population (1000s) | TB deaths |
|---|---|---|---|
| 9 | Sao Tome and Principe | 193 | 18 |
| 3 | Equatorial Guinea | 757 | 67 |
| 7 | Portugal | 10608 | 140 |
| 11 | Timor-Leste | 1133 | 990 |
| 4 | Guinea-Bissau | 1704 | 1200 |
| 1 | Brazil | 200362 | 4400 |
| 0 | Angola | 21472 | 6900 |
| 8 | Russian Federation | 142834 | 17000 |
| 6 | Mozambique | 25834 | 18000 |
| 10 | South Africa | 52776 | 25000 |
| 2 | China | 1393337 | 41000 |
| 5 | India | 1252140 | 240000 |
The sort_values() method takes a column name as its argument and returns a new dataframe whose rows are in ascending order of the values in that column. Note that sorting doesn’t modify the original dataframe.
data # rows still in original order
| | Country | Population (1000s) | TB deaths |
|---|---|---|---|
| 0 | Angola | 21472 | 6900 |
| 1 | Brazil | 200362 | 4400 |
| 2 | China | 1393337 | 41000 |
| 3 | Equatorial Guinea | 757 | 67 |
| 4 | Guinea-Bissau | 1704 | 1200 |
| 5 | India | 1252140 | 240000 |
| 6 | Mozambique | 25834 | 18000 |
| 7 | Portugal | 10608 | 140 |
| 8 | Russian Federation | 142834 | 17000 |
| 9 | Sao Tome and Principe | 193 | 18 |
| 10 | South Africa | 52776 | 25000 |
| 11 | Timor-Leste | 1133 | 990 |
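Because the original dataframe is left untouched, the sorted table has to be stored under a new name if you want to use it later. The following is a minimal sketch of how the research question could then be answered directly. It assumes the table is rebuilt by hand from the values shown above; the names by_deaths, fewest and most are illustrative, and the iloc row lookup is not something this section has introduced.

```python
import pandas as pd

# Rebuild the table from the values shown above (typed in by hand here).
data = pd.DataFrame({
    'Country': ['Angola', 'Brazil', 'China', 'Equatorial Guinea',
                'Guinea-Bissau', 'India', 'Mozambique', 'Portugal',
                'Russian Federation', 'Sao Tome and Principe',
                'South Africa', 'Timor-Leste'],
    'Population (1000s)': [21472, 200362, 1393337, 757, 1704, 1252140,
                           25834, 10608, 142834, 193, 52776, 1133],
    'TB deaths': [6900, 4400, 41000, 67, 1200, 240000,
                  18000, 140, 17000, 18, 25000, 990],
})

# sort_values() returns a new dataframe, so keep it under its own name.
by_deaths = data.sort_values('TB deaths')

# First row of the sorted table: the smallest number of deaths.
fewest = by_deaths.iloc[0]['Country']
# Last row of the sorted table: the largest number of deaths.
most = by_deaths.iloc[-1]['Country']

print('Fewest TB deaths:', fewest)
print('Most TB deaths:', most)
```

With the twelve rows above, this reports Sao Tome and Principe as the country with the fewest deaths and India as the country with the most, matching the first and last rows of the sorted table.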
It’s also possible to sort on a column that has text instead of numbers; the rows will be sorted in alphabetical order.
data.sort_values('Country')
| | Country | Population (1000s) | TB deaths |
|---|---|---|---|
| 0 | Angola | 21472 | 6900 |
| 1 | Brazil | 200362 | 4400 |
| 2 | China | 1393337 | 41000 |
| 3 | Equatorial Guinea | 757 | 67 |
| 4 | Guinea-Bissau | 1704 | 1200 |
| 5 | India | 1252140 | 240000 |
| 6 | Mozambique | 25834 | 18000 |
| 7 | Portugal | 10608 | 140 |
| 8 | Russian Federation | 142834 | 17000 |
| 9 | Sao Tome and Principe | 193 | 18 |
| 10 | South Africa | 52776 | 25000 |
| 11 | Timor-Leste | 1133 | 990 |
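The table above happens to be in alphabetical order already, so sorting by country leaves it looking unchanged. The effect is easier to see on rows that start out of order. The sketch below assumes a small, hand-made dataframe with four of the countries entered in an arbitrary order; the name by_country is illustrative.

```python
import pandas as pd

# Four rows from the table, deliberately entered out of alphabetical order.
data = pd.DataFrame({
    'Country': ['Portugal', 'Angola', 'Timor-Leste', 'Brazil'],
    'TB deaths': [140, 6900, 990, 4400],
})

# Sorting on a text column puts the rows in alphabetical order.
by_country = data.sort_values('Country')
print(by_country)

# Each row keeps its original index label (the leftmost column).
names = list(by_country['Country'])
labels = list(by_country.index)
```

Here the countries come out as Angola, Brazil, Portugal, Timor-Leste, and the index labels move with their rows, which is why the leftmost column of a sorted table is usually no longer in numerical order.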
Exercise 8 Sorting on a column
Use Exercise notebook 1 to sort the table by population so that you can quickly see which countries are the least and the most populous. Remember to run all the code before doing the exercise.
In the next section you’ll learn about calculations over columns.
