Learn to code for data analysis

# 2.1 Practice project

Here’s a quick project for you: analysing TB deaths in all countries.

## Activity 1 The project

1. Download the data file WHO POP TB all.xls. Do not open or edit this file, to avoid changing its encoding. If you want to see the contents of the file, make a copy and look at the copy.
2. Open the project notebook.
3. If you’re using CoCalc do the following two steps:
• b. On your computer, rename the downloaded file so that it includes your name, e.g. ‘TB deaths all world – Michel Wermelinger.ipynb’. Then upload the renamed notebook to CoCalc and open it.
4. If you’re using Anaconda do the following two steps:
• a. Click on the File menu and select ‘Make a copy’.
• b. Click on the title of the new notebook (‘project 1-Copy1’) to rename it. Make sure to include your name in the file name, e.g. ‘TB deaths all world – Michel Wermelinger’.
5. In the new notebook, add your name to mine and update the date.
6. Edit the first code cell: change the file name to ‘WHO POP TB all.xls’, in order to load the data for all countries in the world.
7. Run all cells in the notebook. This might take a little while.
8. Add one line of code at the end to sort the table by the death rate, so that it’s easy to see the least and most affected countries.
9. Go through the notebook and change any text (in particular the conclusions) to reflect the new results.
10. Save and then close and halt the notebook.
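
Steps 6 and 8 could look roughly like the sketch below. This is only an illustration, not the notebook's actual code: the column name `TB deaths (per 100,000)`, the helper functions `load_data` and `sort_by_rate`, and the three example rows are assumptions made for this example (the rows are made-up placeholders, not WHO figures). Use the names that appear in your own notebook. Depending on your installation, pandas may need the `xlrd` package to read `.xls` files.

```python
import pandas as pd

# Assumed column name; replace it with the death-rate column in your notebook.
RATE_COLUMN = 'TB deaths (per 100,000)'

def load_data(file_name):
    """Step 6: load an Excel file, e.g. load_data('WHO POP TB all.xls')."""
    return pd.read_excel(file_name)

def sort_by_rate(table, column=RATE_COLUMN):
    """Step 8: return the table sorted from lowest to highest death rate."""
    return table.sort_values(column)

# Made-up placeholder rows (not real data), just to show the sort.
example = pd.DataFrame({
    'Country': ['A', 'B', 'C'],
    RATE_COLUMN: [12.5, 0.8, 40.1],
})
sorted_example = sort_by_rate(example)
print(sorted_example)
```

Because the sort is ascending by default, the least affected countries appear at the top of the table and the most affected at the bottom.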

If you happen to know how to use a spreadsheet application, then you can do a personal project: open the Excel file, remove all countries you are not interested in, and then do the analysis only for the remaining subset.

You might like to share your experience of working on this project with friends, family or colleagues.
