1.4 Getting and displaying dataframe columns
You learned in Week 2 that you can get and display a single column of a dataframe by putting the name of the column (in quotes) within square brackets immediately after the dataframe’s name.
For example, like this:
df['TB deaths']
You then get output like this:
0 13000.00
1 20.00
2 5100.00
3 0.26
4 6900.00
5 1.20
6 570.00
...
Notice that although there is an index, there is no column heading. This is because what is returned is not a new dataframe with a single column but an example of the Series data type.

Each column in a dataframe is an example of a series
The Series data type is a collection of values with, by default, an integer index that starts from zero. In addition, the Series data type has many of the same methods and attributes as the DataFrame data type, so you can still execute code like:
df['TB deaths'].head()
0 13000.00
1 20.00
2 5100.00
3 0.26
4 6900.00
Name: TB deaths, dtype: float64
And
df['TB deaths'].iloc[2]
5100.00
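As a minimal sketch of the point above, the following builds a small stand-in dataframe by hand from the first five rows shown in this section (an assumption made for illustration; the course itself loads the data from a CSV file) and inspects the series that single square brackets return. The variable name deaths is hypothetical.

```python
import pandas as pd

# Hand-built stand-in for the course data: only the five rows shown in this section.
df = pd.DataFrame({
    'Country': ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola'],
    'Population (1000s)': [30552, 3173, 39208, 79, 21472],
    'TB deaths': [13000.00, 20.00, 5100.00, 0.26, 6900.00],
})

deaths = df['TB deaths']        # single square brackets return a Series

print(type(deaths).__name__)    # Series
print(deaths.name)              # TB deaths (the series keeps the column name)
print(deaths.dtype)             # float64
print(list(deaths.index))       # [0, 1, 2, 3, 4] (default integer index from zero)
print(deaths.max())             # 13000.0 (a method shared with dataframes)
```

The name and dtype printed here are the same details that appear in the last line of the head() output above.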
However, pandas does provide a mechanism for you to get and display one or more selected columns as a new dataframe in its own right. To do this you need to use a list. A list in Python consists of items separated by commas and enclosed within square brackets, for example ['Country'] or ['Country', 'Population (1000s)']. This list is then put within outer square brackets immediately after the dataframe’s name, like this:
df[['Country']].head()
| | Country |
|---|---|
| 0 | Afghanistan |
| 1 | Albania |
| 2 | Algeria |
| 3 | Andorra |
| 4 | Angola |
Note that the column is now named. The expression df[['Country']] (with two pairs of square brackets) evaluates to a new dataframe (which happens to have a single column) rather than a series.
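The difference between the two forms can be checked directly. The sketch below assumes a small hand-built dataframe containing the five countries shown above (not the full course data), and the names as_series and as_frame are hypothetical.

```python
import pandas as pd

# Hand-built stand-in containing only the five countries shown above.
df = pd.DataFrame({
    'Country': ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola'],
    'Population (1000s)': [30552, 3173, 39208, 79, 21472],
})

as_series = df['Country']       # column name in single brackets
as_frame = df[['Country']]      # one-item list inside the outer brackets

print(type(as_series).__name__)   # Series
print(type(as_frame).__name__)    # DataFrame
print(as_series.shape)            # (5,)   one dimension: rows only
print(as_frame.shape)             # (5, 1) two dimensions: rows and columns
print(list(as_frame.columns))     # ['Country']
```

Both objects hold the same values; only the type, and therefore how they are displayed, differs.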
To get a new dataframe with multiple columns you just need to put more column names in the list, like this:
df[['Country', 'Population (1000s)']].head()
| | Country | Population (1000s) |
|---|---|---|
| 0 | Afghanistan | 30552 |
| 1 | Albania | 3173 |
| 2 | Algeria | 39208 |
| 3 | Andorra | 79 |
| 4 | Angola | 21472 |
The code has returned a new dataframe with just the 'Country' and 'Population (1000s)' columns.
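Because the inner square brackets are an ordinary Python list, the list can also be stored in a variable first. The sketch below again assumes a small hand-built dataframe with the five rows shown in this section, and the names wanted and subset are hypothetical. It also illustrates that the columns of the result follow the order of the list and that the original dataframe is left unchanged.

```python
import pandas as pd

# Hand-built stand-in for the course data: only the five rows shown in this section.
df = pd.DataFrame({
    'Country': ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola'],
    'Population (1000s)': [30552, 3173, 39208, 79, 21472],
    'TB deaths': [13000.00, 20.00, 5100.00, 0.26, 6900.00],
})

# The list of column names can be built separately and reused.
wanted = ['Population (1000s)', 'Country']
subset = df[wanted]

print(list(subset.columns))   # ['Population (1000s)', 'Country'] (list order is kept)
print(subset.shape)           # (5, 2)
print(df.shape)               # (5, 3) (the original dataframe still has all columns)
```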
Exercise 1 Dataframes and CSV files
Now that you’ve learned about CSV files and more about pandas you are ready to complete Exercise 1 in the exercise notebook 2.
Open the exercise notebook 2 and the data file you used last week, WHO POP TB all.csv, and save them in the folder you created in Week 1.
If you’re using Anaconda instead of CoCalc, remember that to open the notebook you’ll need to navigate to it using Jupyter. Once it’s open, run the existing code in the notebook before you start the exercise. When you’ve completed the exercise, save the notebook. If you need a quick reminder of how to use Jupyter, watch again the video in Week 1 Exercise 1.
Except for third party materials and where otherwise stated, this content is made available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 Licence. Full copyright detail can be found in the acknowledgements section.
