{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise notebook 3: Transforming and Combining Data\n",
    "\n",
    "This Jupyter notebook is for Part 3 of The Open University's _Learn to code for Data Analysis_ course.\n",
    "\n",
    "This notebook contains all the code examples and coding exercises for Part 3. Start by running the existing code cells in order, from top to bottom, because later cells rely on names defined in earlier ones. To complete a task, add a code cell below it.\n",
    "\n",
    "You'll come across steps in the course directing you to this notebook. Once you've done each exercise, go back to the corresponding step and mark it as complete."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import warnings\n",
    "warnings.simplefilter('ignore', FutureWarning)\n",
    "\n",
    "from pandas import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 1: Creating the data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A dataframe can be constructed from scratch from a list of column headings and a list of rows, each row being a list of values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>GDP (US$)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>UK</td>\n",
       "      <td>2.678455e+12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>USA</td>\n",
       "      <td>1.676810e+13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>China</td>\n",
       "      <td>9.240270e+12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Brazil</td>\n",
       "      <td>2.245673e+12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>South Africa</td>\n",
       "      <td>3.660579e+11</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        Country     GDP (US$)\n",
       "0            UK  2.678455e+12\n",
       "1           USA  1.676810e+13\n",
       "2         China  9.240270e+12\n",
       "3        Brazil  2.245673e+12\n",
       "4  South Africa  3.660579e+11"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "headings = ['Country', 'GDP (US$)']\n",
    "table = [\n",
    "  ['UK', 2678454886796.7],    # 1st row\n",
    "  ['USA', 16768100000000.0],  # 2nd row\n",
    "  ['China', 9240270452047.0], # and so on...\n",
    "  ['Brazil', 2245673032353.8],\n",
    "  ['South Africa', 366057913367.1]\n",
    "]\n",
    "gdp = DataFrame(columns=headings, data=table)\n",
    "gdp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The life expectancy of those born in 2013 is tabulated in the same way."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country name</th>\n",
       "      <th>Life expectancy (years)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>China</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Russia</td>\n",
       "      <td>71</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>United States</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>India</td>\n",
       "      <td>66</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>United Kingdom</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Country name  Life expectancy (years)\n",
       "0           China                       75\n",
       "1          Russia                       71\n",
       "2   United States                       79\n",
       "3           India                       66\n",
       "4  United Kingdom                       81"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "headings = ['Country name', 'Life expectancy (years)']\n",
    "table = [\n",
    "  ['China', 75],\n",
    "  ['Russia', 71],  \n",
    "  ['United States', 79],\n",
    "  ['India', 66],\n",
    "  ['United Kingdom', 81]\n",
    "]\n",
    "life = DataFrame(columns=headings, data=table)\n",
    "life"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task\n",
    "\n",
    "Create a dataframe with all five BRICS countries and their population, in thousands of inhabitants, in 2013. The values (given in the first exercise notebook) are: Brazil 200362, Russian Federation 142834, India 1252140, China 1393337, South Africa 52776."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 2: Defining functions\n",
    "\n",
    "The following function, written in two different ways, rounds a number to the nearest million. It calls the Python function `round()`, which rounds a decimal number to the nearest integer. If two integers are equally near, it rounds to the even one: for example, `round(2.5)` is 2 and `round(3.5)` is 4."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def roundToMillions (value):\n",
    "    # scale the value to millions, then round to the nearest integer\n",
    "    result = round(value / 1000000)\n",
    "    return result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def roundToMillions (value):\n",
    "    return round(value / 1000000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To test a function, write expressions that check, for a range of argument values, whether the function returns the expected value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "roundToMillions(4567890.1) == 5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "roundToMillions(0) == 0  # always test with zero..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "roundToMillions(-1) == 0 # ...and negative numbers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
    ],
   "source": [
    "roundToMillions(1499999) == 1 # test rounding to the nearest million"
   ]
  },
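  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Halfway cases are worth testing too. As a minimal sketch, assuming Python 3 (where `round()` sends a value exactly halfway between two integers to the even one), the next cell evaluates the expression used inside `roundToMillions()` for three illustrative amounts that lie exactly between two millions. The amounts are made up for this example and are not part of the course data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Assumption: Python 3, which rounds halfway cases to the even integer.\n",
    "# The three amounts are illustrative only.\n",
    "print(round(1500000 / 1000000))  # 1.5 is halfway between 1 and 2: prints 2\n",
    "print(round(2500000 / 1000000))  # 2.5 is halfway between 2 and 3: prints 2\n",
    "print(round(3500000 / 1000000))  # 3.5 is halfway between 3 and 4: prints 4"
   ]
  },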
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next function converts US dollars to British pounds."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def usdToGBP (usd):\n",
    "    return usd / 1.564768 # average 2013 exchange rate, in US dollars per British pound\n",
    "\n",
    "usdToGBP(0) == 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "usdToGBP(1.564768) == 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "usdToGBP(-1) < 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tasks\n",
    "\n",
    "1. Define a few more test cases for both functions.\n",
    "2. Why can't you use `roundToMillions()` to round the population to millions of inhabitants? Write a new function and test it. **You need to write this function in preparation for Exercise 4.**\n",
    "3. Write a function to convert US dollars to your local currency. If your local currency is USD or GBP, convert to euros. Look up the average exchange rate for 2013 online."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 3: What if...?\n",
    "\n",
    "The next function uses the full form of the conditional statement to expand the abbreviated country names UK and USA and leave other names unchanged."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def expandCountry (name):\n",
    "    if name == 'UK':\n",
    "        return 'United Kingdom'\n",
    "    elif name == 'USA':\n",
    "        return 'United States'\n",
    "    else:\n",
    "        return name\n",
    "\n",
    "expandCountry('India') == 'India'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is the same function, written differently, using the simplest form of the conditional statement, without the `elif` and `else` parts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def expandCountry (name):\n",
    "    if name == 'UK':\n",
    "        name = 'United Kingdom'\n",
    "    if name == 'USA':\n",
    "        name = 'United States'\n",
    "    return name"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tasks\n",
    "\n",
    "1. Write more tests.\n",
    "2. Explain why the second version of the function works. Note how the code is indented.\n",
    "3. Extend both versions to expand 'St. Lucia' to 'Saint Lucia'.\n",
    "4. Write a function to translate some country names from their original language to English, e.g. 'Brasil' to 'Brazil', 'España' to 'Spain' and 'Deutschland' to 'Germany'.\n",
    "5. Can you think of a different way of expanding abbreviated country names? You're not expected to write any code. Hint: this is a course about data tables."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 4: Applying functions\n",
    "\n",
    "A one-argument function can be applied to each cell in a column with the column method `apply()`, which returns a new column with the converted values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>GDP (US$)</th>\n",
       "      <th>Country name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>UK</td>\n",
       "      <td>2.678455e+12</td>\n",
       "      <td>United Kingdom</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>USA</td>\n",
       "      <td>1.676810e+13</td>\n",
       "      <td>United States</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>China</td>\n",
       "      <td>9.240270e+12</td>\n",
       "      <td>China</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Brazil</td>\n",
       "      <td>2.245673e+12</td>\n",
       "      <td>Brazil</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>South Africa</td>\n",
       "      <td>3.660579e+11</td>\n",
       "      <td>South Africa</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        Country     GDP (US$)    Country name\n",
       "0            UK  2.678455e+12  United Kingdom\n",
       "1           USA  1.676810e+13   United States\n",
       "2         China  9.240270e+12           China\n",
       "3        Brazil  2.245673e+12          Brazil\n",
       "4  South Africa  3.660579e+11    South Africa"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gdp['Country name'] = gdp['Country'].apply(expandCountry)\n",
    "gdp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because `apply()` is a column method that returns a column, calls to it can be **chained** to apply several conversions in one go."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country</th>\n",
       "      <th>GDP (US$)</th>\n",
       "      <th>Country name</th>\n",
       "      <th>GDP (£m)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>UK</td>\n",
       "      <td>2.678455e+12</td>\n",
       "      <td>United Kingdom</td>\n",
       "      <td>1711727</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>USA</td>\n",
       "      <td>1.676810e+13</td>\n",
       "      <td>United States</td>\n",
       "      <td>10716029</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>China</td>\n",
       "      <td>9.240270e+12</td>\n",
       "      <td>China</td>\n",
       "      <td>5905202</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Brazil</td>\n",
       "      <td>2.245673e+12</td>\n",
       "      <td>Brazil</td>\n",
       "      <td>1435148</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>South Africa</td>\n",
       "      <td>3.660579e+11</td>\n",
       "      <td>South Africa</td>\n",
       "      <td>233937</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        Country     GDP (US$)    Country name  GDP (£m)\n",
       "0            UK  2.678455e+12  United Kingdom   1711727\n",
       "1           USA  1.676810e+13   United States  10716029\n",
       "2         China  9.240270e+12           China   5905202\n",
       "3        Brazil  2.245673e+12          Brazil   1435148\n",
       "4  South Africa  3.660579e+11    South Africa    233937"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gdp['GDP (£m)'] = gdp['GDP (US$)'].apply(usdToGBP).apply(roundToMillions)\n",
    "gdp"
   ]
  },
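  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Applying the conversion functions in a different order can lead to a different result. Rounding to millions of dollars before converting discards precision, so the values for China and South Africa differ by 1 from those in the table above."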
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     1711727\n",
       "1    10716029\n",
       "2     5905201\n",
       "3     1435148\n",
       "4      233938\n",
       "Name: GDP (US$), dtype: int64"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gdp['GDP (US$)'].apply(roundToMillions).apply(usdToGBP).apply(round)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The original columns can be discarded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country name</th>\n",
       "      <th>GDP (£m)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>United Kingdom</td>\n",
       "      <td>1711727</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>United States</td>\n",
       "      <td>10716029</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>China</td>\n",
       "      <td>5905202</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Brazil</td>\n",
       "      <td>1435148</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>South Africa</td>\n",
       "      <td>233937</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Country name  GDP (£m)\n",
       "0  United Kingdom   1711727\n",
       "1   United States  10716029\n",
       "2           China   5905202\n",
       "3          Brazil   1435148\n",
       "4    South Africa    233937"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "headings = ['Country name', 'GDP (£m)']\n",
    "gdp = gdp[headings]\n",
    "gdp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task\n",
    "\n",
    "Take the dataframe you created for Exercise 1, and apply to its population column the rounding function you wrote in Exercise 2."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 5: Joining left, right and centre\n",
    "\n",
    "At this point, both tables have a common column, 'Country name', with fully expanded country names."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country name</th>\n",
       "      <th>Life expectancy (years)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>China</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Russia</td>\n",
       "      <td>71</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>United States</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>India</td>\n",
       "      <td>66</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>United Kingdom</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Country name  Life expectancy (years)\n",
       "0           China                       75\n",
       "1          Russia                       71\n",
       "2   United States                       79\n",
       "3           India                       66\n",
       "4  United Kingdom                       81"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "life"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country name</th>\n",
       "      <th>GDP (£m)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>United Kingdom</td>\n",
       "      <td>1711727</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>United States</td>\n",
       "      <td>10716029</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>China</td>\n",
       "      <td>5905202</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Brazil</td>\n",
       "      <td>1435148</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>South Africa</td>\n",
       "      <td>233937</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Country name  GDP (£m)\n",
       "0  United Kingdom   1711727\n",
       "1   United States  10716029\n",
       "2           China   5905202\n",
       "3          Brazil   1435148\n",
       "4    South Africa    233937"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gdp"
   ]
  },
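  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A **left join** takes the rows of the left table and adds the columns of the right table. Where a left row has no match in the right table, the added columns hold `NaN` (a missing value); because `NaN` is a floating-point value, the life expectancy column is converted to decimal numbers."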
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country name</th>\n",
       "      <th>GDP (£m)</th>\n",
       "      <th>Life expectancy (years)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>United Kingdom</td>\n",
       "      <td>1711727</td>\n",
       "      <td>81.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>United States</td>\n",
       "      <td>10716029</td>\n",
       "      <td>79.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>China</td>\n",
       "      <td>5905202</td>\n",
       "      <td>75.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Brazil</td>\n",
       "      <td>1435148</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>South Africa</td>\n",
       "      <td>233937</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Country name  GDP (£m)  Life expectancy (years)\n",
       "0  United Kingdom   1711727                     81.0\n",
       "1   United States  10716029                     79.0\n",
       "2           China   5905202                     75.0\n",
       "3          Brazil   1435148                      NaN\n",
       "4    South Africa    233937                      NaN"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "merge(gdp, life, on='Country name', how='left')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A **right join** takes the rows of the right table and adds the columns of the left table."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country name</th>\n",
       "      <th>GDP (£m)</th>\n",
       "      <th>Life expectancy (years)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>United Kingdom</td>\n",
       "      <td>1711727.0</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>United States</td>\n",
       "      <td>10716029.0</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>China</td>\n",
       "      <td>5905202.0</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Russia</td>\n",
       "      <td>NaN</td>\n",
       "      <td>71</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>India</td>\n",
       "      <td>NaN</td>\n",
       "      <td>66</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Country name    GDP (£m)  Life expectancy (years)\n",
       "0  United Kingdom   1711727.0                       81\n",
       "1   United States  10716029.0                       79\n",
       "2           China   5905202.0                       75\n",
       "3          Russia         NaN                       71\n",
       "4           India         NaN                       66"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "merge(gdp, life, on='Country name', how='right')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An **outer join** takes the union of the rows, i.e. it has all the rows of the left and right joins."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country name</th>\n",
       "      <th>GDP (£m)</th>\n",
       "      <th>Life expectancy (years)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>United Kingdom</td>\n",
       "      <td>1711727.0</td>\n",
       "      <td>81.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>United States</td>\n",
       "      <td>10716029.0</td>\n",
       "      <td>79.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>China</td>\n",
       "      <td>5905202.0</td>\n",
       "      <td>75.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Brazil</td>\n",
       "      <td>1435148.0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>South Africa</td>\n",
       "      <td>233937.0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Russia</td>\n",
       "      <td>NaN</td>\n",
       "      <td>71.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>India</td>\n",
       "      <td>NaN</td>\n",
       "      <td>66.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Country name    GDP (£m)  Life expectancy (years)\n",
       "0  United Kingdom   1711727.0                     81.0\n",
       "1   United States  10716029.0                     79.0\n",
       "2           China   5905202.0                     75.0\n",
       "3          Brazil   1435148.0                      NaN\n",
       "4    South Africa    233937.0                      NaN\n",
       "5          Russia         NaN                     71.0\n",
       "6           India         NaN                     66.0"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "merge(gdp, life, on='Country name', how='outer')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An **inner join** takes the intersection of the rows (i.e. the common rows) of the left and right joins."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Country name</th>\n",
       "      <th>GDP (£m)</th>\n",
       "      <th>Life expectancy (years)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>United Kingdom</td>\n",
       "      <td>1711727</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>United States</td>\n",
       "      <td>10716029</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>China</td>\n",
       "      <td>5905202</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Country name  GDP (£m)  Life expectancy (years)\n",
       "0  United Kingdom   1711727                       81\n",
       "1   United States  10716029                       79\n",
       "2           China   5905202                       75"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gdpVsLife = merge(gdp, life, on='Country name', how='inner')\n",
    "gdpVsLife"
   ]
  },
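  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The four join types can be compared side by side on a small example. The sketch below is not part of the course code: it assumes two made-up tables (`demoLeft` and `demoRight`, hypothetical names and values) and uses the `indicator=True` argument of `merge()`, which adds a `_merge` column saying whether each row came from the left table only, the right table only, or both.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy tables, not the notebook's gdp and life dataframes.\n",
    "demoLeft = pd.DataFrame({'Country name': ['UK', 'China', 'Brazil'],\n",
    "                         'GDP': [1, 2, 3]})\n",
    "demoRight = pd.DataFrame({'Country name': ['UK', 'China', 'India'],\n",
    "                          'Life': [81, 75, 66]})\n",
    "\n",
    "rowCounts = {}\n",
    "for how in ['left', 'right', 'outer', 'inner']:\n",
    "    joined = pd.merge(demoLeft, demoRight, on='Country name', how=how, indicator=True)\n",
    "    # _merge holds 'left_only', 'right_only' or 'both' for each row.\n",
    "    rowCounts[how] = len(joined)\n",
    "    print(how)\n",
    "    print(joined)\n",
    "```\n",
    "\n",
    "With these made-up tables, the outer join keeps every country from either table, while the inner join keeps only the countries that appear in both."
   ]
  },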
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task\n",
    "\n",
    "Join your population dataframe (from Exercise 4) with `gdpVsLife` using each of the four join types (left, right, outer and inner), and note the differences."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 6: Constant variables\n",
    "\n",
    "Constants are used to represent fixed values (e.g. strings and numbers) that occur frequently in a program. Constant names are conventionally written in uppercase, with underscores to separate multiple words."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'GDP (US$)'"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "GDP_USD = 'GDP (US$)'\n",
    "GDP_GBP = 'GDP (£m)'\n",
    "GDP_USD"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task\n",
    "\n",
    "Look through the code you have written so far and rewrite it using constants where appropriate."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 7: Getting real\n",
    "\n",
    "It is possible to directly download data from the World Bank, for a particular time period and indicator, like the GDP in current US dollars. The indicator name is given in the URL of the webpage about the dataset.\n",
    "\n",
    "Getting the data directly from the World Bank only works with Anaconda (or a paid CoCalc account) and requires an Internet connection. It can take some time to download the data, depending on the speed of your connection and the load on the World Bank server. Moreover, the World Bank occasionally changes the layout of the data, which could break the code in the rest of this notebook. \n",
    "\n",
    "To avoid such problems I have saved the World Bank data into CSV files. The data is in a column with the same name as the indicator. Hence I declare the indicator names as constants, to be used later when processing the dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "GDP_INDICATOR = 'NY.GDP.MKTP.CD'\n",
    "gdpReset = read_csv('WB GDP 2013.csv')\n",
    "\n",
    "LIFE_INDICATOR = 'SP.DYN.LE00.IN'\n",
    "lifeReset = read_csv('WB LE 2013.csv')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The CSV files were obtained in two steps, which are shown next in commented code because we already have the CSV files.\n",
    "\n",
    "First the data was obtained directly from the World Bank using the `download()` function of the `pandas_datareader` module, specifying the desired indicator and time period. Note that you may have to install the `pandas_datareader` module first, using Anaconda Navigator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "# from pandas_datareader.wb import download\n",
    "\n",
    "# YEAR = 2013\n",
    "# gdpWB = download(indicator=GDP_INDICATOR, country='all', start=YEAR, end=YEAR)\n",
    "# lifeWB = download(indicator=LIFE_INDICATOR, country='all', start=YEAR, end=YEAR)\n",
    "# lifeWB.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The downloaded dataframe has descriptive row names instead of the usual 0, 1, 2, etc. In other words, the dataframe's index is given by the country and year instead of integers. Hence the second step was to reset the index. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "# gdpReset = gdpWB.reset_index()\n",
    "# lifeReset = lifeWB.reset_index()"
   ]
  },
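  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The effect of `reset_index()` can be seen on a small dataframe. This is a minimal sketch with made-up values (`demoIndexed` and `demoFlat` are hypothetical names), assuming, as described above, an index made of the country and the year.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Made-up values; only the shape mimics the downloaded data.\n",
    "demoIndex = pd.MultiIndex.from_tuples([('Albania', 2013), ('Algeria', 2013)],\n",
    "                                      names=['country', 'year'])\n",
    "demoIndexed = pd.DataFrame({'SP.DYN.LE00.IN': [77.5, 74.6]}, index=demoIndex)\n",
    "\n",
    "# reset_index() moves the index levels into ordinary columns\n",
    "# and numbers the rows 0, 1, 2, etc.\n",
    "demoFlat = demoIndexed.reset_index()\n",
    "print(list(demoFlat.columns))\n",
    "print(list(demoFlat.index))\n",
    "```\n",
    "\n",
    "After the reset, `country` and `year` can be selected like any other column, which is what the cleaning steps later in this notebook rely on."
   ]
  },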
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Resetting the index put the dataframes into the usual form, which was saved to CSV files. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>year</th>\n",
       "      <th>SP.DYN.LE00.IN</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Arab World</td>\n",
       "      <td>2013</td>\n",
       "      <td>70.631305</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Caribbean small states</td>\n",
       "      <td>2013</td>\n",
       "      <td>71.901964</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Central Europe and the Baltics</td>\n",
       "      <td>2013</td>\n",
       "      <td>76.127583</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>East Asia &amp; Pacific (all income levels)</td>\n",
       "      <td>2013</td>\n",
       "      <td>74.604619</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>East Asia &amp; Pacific (developing only)</td>\n",
       "      <td>2013</td>\n",
       "      <td>73.657617</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                   country  year  SP.DYN.LE00.IN\n",
       "0                               Arab World  2013       70.631305\n",
       "1                   Caribbean small states  2013       71.901964\n",
       "2           Central Europe and the Baltics  2013       76.127583\n",
       "3  East Asia & Pacific (all income levels)  2013       74.604619\n",
       "4    East Asia & Pacific (developing only)  2013       73.657617"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lifeReset.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tasks\n",
    "\n",
    "1. Create a dataframe with the World Bank's data on population, using the CSV file provided. **This dataframe will be used in the remaining exercises.**\n",
    "2. If you're using Anaconda, uncomment the code above and run it to check that you can get the GDP and life expectancy data directly from the World Bank. **Don't forget to comment the code out again afterwards.**\n",
    "3. If you have extra time, you can alternatively obtain the population data directly from the World Bank: go to their [data page](http://data.worldbank.org/), search for population, select the total population indicator, note its name in the URL, copy the commented code above and adapt it to get the data and reset its index. Note that the World Bank may have changed its data format since this was written, so you may have to take extra steps to get a dataframe in the same shape as the CSV file we provide, with three columns for country name, year and population."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 8: Cleaning up\n",
    "\n",
    "The expression `frame[m:n]` represents a dataframe with only row `m` to row `n-1` (or until the end if `n` is omitted) of `frame`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>year</th>\n",
       "      <th>SP.DYN.LE00.IN</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Arab World</td>\n",
       "      <td>2013</td>\n",
       "      <td>70.631305</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Caribbean small states</td>\n",
       "      <td>2013</td>\n",
       "      <td>71.901964</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Central Europe and the Baltics</td>\n",
       "      <td>2013</td>\n",
       "      <td>76.127583</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                          country  year  SP.DYN.LE00.IN\n",
       "0                      Arab World  2013       70.631305\n",
       "1          Caribbean small states  2013       71.901964\n",
       "2  Central Europe and the Baltics  2013       76.127583"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lifeReset[0:3]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>year</th>\n",
       "      <th>SP.DYN.LE00.IN</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>240</th>\n",
       "      <td>Vanuatu</td>\n",
       "      <td>2013</td>\n",
       "      <td>71.669244</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>241</th>\n",
       "      <td>Venezuela, RB</td>\n",
       "      <td>2013</td>\n",
       "      <td>74.074415</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>242</th>\n",
       "      <td>Vietnam</td>\n",
       "      <td>2013</td>\n",
       "      <td>75.756488</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>243</th>\n",
       "      <td>Virgin Islands (U.S.)</td>\n",
       "      <td>2013</td>\n",
       "      <td>79.624390</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>244</th>\n",
       "      <td>West Bank and Gaza</td>\n",
       "      <td>2013</td>\n",
       "      <td>73.203341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>245</th>\n",
       "      <td>Yemen, Rep.</td>\n",
       "      <td>2013</td>\n",
       "      <td>63.583512</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>246</th>\n",
       "      <td>Zambia</td>\n",
       "      <td>2013</td>\n",
       "      <td>59.237366</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>247</th>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>2013</td>\n",
       "      <td>55.633000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   country  year  SP.DYN.LE00.IN\n",
       "240                Vanuatu  2013       71.669244\n",
       "241          Venezuela, RB  2013       74.074415\n",
       "242                Vietnam  2013       75.756488\n",
       "243  Virgin Islands (U.S.)  2013       79.624390\n",
       "244     West Bank and Gaza  2013       73.203341\n",
       "245            Yemen, Rep.  2013       63.583512\n",
       "246                 Zambia  2013       59.237366\n",
       "247               Zimbabwe  2013       55.633000"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lifeReset[240:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first rows of World Bank dataframes are aggregated data for country groups, and are thus discarded. There were 34 country groups when I generated the CSV files, but the World Bank sometimes adds or removes groups. Therefore, if you obtained the data directly from the World Bank, you may need to discard more or fewer than 34 rows to get a dataframe that starts with Afghanistan."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>year</th>\n",
       "      <th>NY.GDP.MKTP.CD</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>2013</td>\n",
       "      <td>2.045894e+10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>Albania</td>\n",
       "      <td>2013</td>\n",
       "      <td>1.278103e+10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>2013</td>\n",
       "      <td>2.097035e+11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>American Samoa</td>\n",
       "      <td>2013</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>38</th>\n",
       "      <td>Andorra</td>\n",
       "      <td>2013</td>\n",
       "      <td>3.249101e+09</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           country  year  NY.GDP.MKTP.CD\n",
       "34     Afghanistan  2013    2.045894e+10\n",
       "35         Albania  2013    1.278103e+10\n",
       "36         Algeria  2013    2.097035e+11\n",
       "37  American Samoa  2013             NaN\n",
       "38         Andorra  2013    3.249101e+09"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gdpCountries = gdpReset[34:]\n",
    "lifeCountries = lifeReset[34:]\n",
    "gdpCountries.head()"
   ]
  },
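  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Rather than counting the country groups by hand, the position of the first country can be looked up. This is a minimal sketch, not part of the course code: it assumes a made-up miniature dataframe (`demoFrame`, a hypothetical name) with the default 0, 1, 2, etc. index, and assumes Afghanistan is still the first country listed.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Made-up miniature of a World Bank dataframe: two groups, then countries.\n",
    "demoFrame = pd.DataFrame({'country': ['Arab World', 'World', 'Afghanistan', 'Albania'],\n",
    "                          'year': [2013, 2013, 2013, 2013]})\n",
    "\n",
    "FIRST_COUNTRY = 'Afghanistan'\n",
    "# idxmax() returns the index label of the first True value; with the\n",
    "# default index that label is also the row position.\n",
    "start = int((demoFrame['country'] == FIRST_COUNTRY).idxmax())\n",
    "demoCountries = demoFrame[start:]\n",
    "print(start)\n",
    "print(demoCountries)\n",
    "```\n",
    "\n",
    "If the name does not occur in the column, `idxmax()` returns the first label instead, so it is worth checking that the first row of the result really is the expected country."
   ]
  },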
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Rows with missing data are dropped."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>year</th>\n",
       "      <th>NY.GDP.MKTP.CD</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>2013</td>\n",
       "      <td>2.045894e+10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>Albania</td>\n",
       "      <td>2013</td>\n",
       "      <td>1.278103e+10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>2013</td>\n",
       "      <td>2.097035e+11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>38</th>\n",
       "      <td>Andorra</td>\n",
       "      <td>2013</td>\n",
       "      <td>3.249101e+09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39</th>\n",
       "      <td>Angola</td>\n",
       "      <td>2013</td>\n",
       "      <td>1.383568e+11</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        country  year  NY.GDP.MKTP.CD\n",
       "34  Afghanistan  2013    2.045894e+10\n",
       "35      Albania  2013    1.278103e+10\n",
       "36      Algeria  2013    2.097035e+11\n",
       "38      Andorra  2013    3.249101e+09\n",
       "39       Angola  2013    1.383568e+11"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gdpData = gdpCountries.dropna()\n",
    "lifeData = lifeCountries.dropna()\n",
    "gdpData.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The year column is discarded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>SP.DYN.LE00.IN</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>60.028268</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>35</th>\n",
       "      <td>Albania</td>\n",
       "      <td>77.537244</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>74.568951</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39</th>\n",
       "      <td>Angola</td>\n",
       "      <td>51.866171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>40</th>\n",
       "      <td>Antigua and Barbuda</td>\n",
       "      <td>75.778659</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                country  SP.DYN.LE00.IN\n",
       "34          Afghanistan       60.028268\n",
       "35              Albania       77.537244\n",
       "36              Algeria       74.568951\n",
       "39               Angola       51.866171\n",
       "40  Antigua and Barbuda       75.778659"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "COUNTRY = 'country'\n",
    "headings = [COUNTRY, GDP_INDICATOR]\n",
    "gdpClean = gdpData[headings]\n",
    "headings = [COUNTRY, LIFE_INDICATOR]\n",
    "lifeClean = lifeData[headings]\n",
    "lifeClean.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task\n",
    "\n",
    "Clean the population dataframe you created in Exercise 7.\n",
    "\n",
    "If in Exercise 7 you chose to get the population data directly from the World Bank instead of using the provided CSV file, you may need to remove more (or fewer) than 34 rows at the start of the dataframe, because of changes the World Bank has made to its data reporting."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 9: Joining and transforming\n",
    "\n",
    "The two dataframes can now be merged with an inner join."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>NY.GDP.MKTP.CD</th>\n",
       "      <th>SP.DYN.LE00.IN</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>2.045894e+10</td>\n",
       "      <td>60.028268</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>1.278103e+10</td>\n",
       "      <td>77.537244</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>2.097035e+11</td>\n",
       "      <td>74.568951</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Angola</td>\n",
       "      <td>1.383568e+11</td>\n",
       "      <td>51.866171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Antigua and Barbuda</td>\n",
       "      <td>1.200588e+09</td>\n",
       "      <td>75.778659</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "               country  NY.GDP.MKTP.CD  SP.DYN.LE00.IN\n",
       "0          Afghanistan    2.045894e+10       60.028268\n",
       "1              Albania    1.278103e+10       77.537244\n",
       "2              Algeria    2.097035e+11       74.568951\n",
       "3               Angola    1.383568e+11       51.866171\n",
       "4  Antigua and Barbuda    1.200588e+09       75.778659"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gdpVsLifeAll = merge(gdpClean, lifeClean, on=COUNTRY, how='inner')\n",
    "gdpVsLifeAll.head()"
   ]
  },
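  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The GDP values, given in US dollars, are converted to millions of pounds by applying the `usdToGBP()` and `roundToMillions()` functions to the column."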
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>NY.GDP.MKTP.CD</th>\n",
       "      <th>SP.DYN.LE00.IN</th>\n",
       "      <th>GDP (£m)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>2.045894e+10</td>\n",
       "      <td>60.028268</td>\n",
       "      <td>13075</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>1.278103e+10</td>\n",
       "      <td>77.537244</td>\n",
       "      <td>8168</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>2.097035e+11</td>\n",
       "      <td>74.568951</td>\n",
       "      <td>134016</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Angola</td>\n",
       "      <td>1.383568e+11</td>\n",
       "      <td>51.866171</td>\n",
       "      <td>88420</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Antigua and Barbuda</td>\n",
       "      <td>1.200588e+09</td>\n",
       "      <td>75.778659</td>\n",
       "      <td>767</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "               country  NY.GDP.MKTP.CD  SP.DYN.LE00.IN  GDP (£m)\n",
       "0          Afghanistan    2.045894e+10       60.028268     13075\n",
       "1              Albania    1.278103e+10       77.537244      8168\n",
       "2              Algeria    2.097035e+11       74.568951    134016\n",
       "3               Angola    1.383568e+11       51.866171     88420\n",
       "4  Antigua and Barbuda    1.200588e+09       75.778659       767"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "GDP = 'GDP (£m)'\n",
    "column = gdpVsLifeAll[GDP_INDICATOR]\n",
    "gdpVsLifeAll[GDP] = column.apply(usdToGBP).apply(roundToMillions)\n",
    "gdpVsLifeAll.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The life expectancy is rounded to the nearest year by applying the `round()` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>NY.GDP.MKTP.CD</th>\n",
       "      <th>SP.DYN.LE00.IN</th>\n",
       "      <th>GDP (£m)</th>\n",
       "      <th>Life expectancy (years)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>2.045894e+10</td>\n",
       "      <td>60.028268</td>\n",
       "      <td>13075</td>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>1.278103e+10</td>\n",
       "      <td>77.537244</td>\n",
       "      <td>8168</td>\n",
       "      <td>78</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>2.097035e+11</td>\n",
       "      <td>74.568951</td>\n",
       "      <td>134016</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Angola</td>\n",
       "      <td>1.383568e+11</td>\n",
       "      <td>51.866171</td>\n",
       "      <td>88420</td>\n",
       "      <td>52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Antigua and Barbuda</td>\n",
       "      <td>1.200588e+09</td>\n",
       "      <td>75.778659</td>\n",
       "      <td>767</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "               country  NY.GDP.MKTP.CD  SP.DYN.LE00.IN  GDP (£m)  \\\n",
       "0          Afghanistan    2.045894e+10       60.028268     13075   \n",
       "1              Albania    1.278103e+10       77.537244      8168   \n",
       "2              Algeria    2.097035e+11       74.568951    134016   \n",
       "3               Angola    1.383568e+11       51.866171     88420   \n",
       "4  Antigua and Barbuda    1.200588e+09       75.778659       767   \n",
       "\n",
       "   Life expectancy (years)  \n",
       "0                       60  \n",
       "1                       78  \n",
       "2                       75  \n",
       "3                       52  \n",
       "4                       76  "
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "LIFE = 'Life expectancy (years)'\n",
    "gdpVsLifeAll[LIFE] = gdpVsLifeAll[LIFE_INDICATOR].apply(round)\n",
    "gdpVsLifeAll.head()"
   ]
  },
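  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of the same rounding step, the cell below applies `round()` to a column of made-up numbers. The dataframe `toyLife` and its column names are assumptions for illustration only and are not part of the course data. Note that Python 3's `round()` rounds exact halves to the nearest even whole number, so 0.5 becomes 0 and 1.5 becomes 2."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pandas import DataFrame\n",
    "\n",
    "# Made-up values for illustration only (not World Bank data).\n",
    "toyLife = DataFrame({'value': [60.028268, 77.537244, 74.568951]})\n",
    "\n",
    "# apply() calls round() on each value of the column in turn.\n",
    "toyLife['rounded'] = toyLife['value'].apply(round)\n",
    "toyLife"
   ]
  },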
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The original GDP and life expectancy columns are dropped by selecting only the country name and the two new columns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>GDP (£m)</th>\n",
       "      <th>Life expectancy (years)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Afghanistan</td>\n",
       "      <td>13075</td>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Albania</td>\n",
       "      <td>8168</td>\n",
       "      <td>78</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>134016</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Angola</td>\n",
       "      <td>88420</td>\n",
       "      <td>52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Antigua and Barbuda</td>\n",
       "      <td>767</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "               country  GDP (£m)  Life expectancy (years)\n",
       "0          Afghanistan     13075                       60\n",
       "1              Albania      8168                       78\n",
       "2              Algeria    134016                       75\n",
       "3               Angola     88420                       52\n",
       "4  Antigua and Barbuda       767                       76"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "headings = [COUNTRY, GDP, LIFE]\n",
    "gdpVsLifeClean = gdpVsLifeAll[headings]\n",
    "gdpVsLifeClean.head()"
   ]
  },
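  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same column selection can be tried on a small made-up dataframe. This is only a sketch: `toyTable`, `toyHeadings` and the column names below are hypothetical and chosen for illustration. Selecting with a list of headings returns a new dataframe with just those columns, in the order given, and leaves the original dataframe unchanged."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pandas import DataFrame\n",
    "\n",
    "# Made-up table with one 'raw' column that is no longer needed.\n",
    "toyTable = DataFrame({\n",
    "    'name': ['A', 'B', 'C'],\n",
    "    'raw': [1.2, 3.4, 5.6],\n",
    "    'clean': [1, 3, 6]\n",
    "})\n",
    "\n",
    "# Keep only the listed columns; 'raw' is left out of the result.\n",
    "toyHeadings = ['name', 'clean']\n",
    "toyTableClean = toyTable[toyHeadings]\n",
    "toyTableClean"
   ]
  },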
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tasks\n",
    "\n",
    "1. Merge `gdpVsLifeClean` with the population dataframe obtained in the previous exercise.\n",
    "2. Round the population value to the nearest million.\n",
    "3. Remove the original population column."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 10: Correlation\n",
    "\n",
    "The Spearman rank correlation coefficient between GDP and life expectancy, and the corresponding p-value, are calculated as follows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The correlation is 0.501023238967\n",
      "It is statistically significant.\n"
     ]
    }
   ],
   "source": [
    "from scipy.stats import spearmanr\n",
    "\n",
    "gdpColumn = gdpVsLifeClean[GDP]\n",
    "lifeColumn = gdpVsLifeClean[LIFE]\n",
    "(correlation, pValue) = spearmanr(gdpColumn, lifeColumn)\n",
    "print('The correlation is', correlation)\n",
    "if pValue < 0.05:\n",
    "    print('It is statistically significant.')\n",
    "else:\n",
    "    print('It is not statistically significant.')"
   ]
  },
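  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get a feel for what the coefficient measures, here is a small sketch with made-up numbers (the names `toyX`, `toyY`, `toyCorrelation` and `toyPValue` are assumptions for illustration). The second list always increases when the first does, so the two rankings agree exactly and the coefficient should come out as 1 (up to floating-point rounding), even though the relationship is not a straight line."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import spearmanr\n",
    "\n",
    "# Made-up values: toyY is the square of toyX, so it rises whenever toyX rises.\n",
    "toyX = [1, 2, 3, 4, 5]\n",
    "toyY = [1, 4, 9, 16, 25]\n",
    "\n",
    "# spearmanr() compares the rankings of the values, not the values themselves.\n",
    "(toyCorrelation, toyPValue) = spearmanr(toyX, toyY)\n",
    "print('The correlation is', toyCorrelation)"
   ]
  },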
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task\n",
    "\n",
    "Calculate the correlation between GDP and population."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 11: Scatterplots\n",
    "\n",
    "The dataframe method `plot()` can also produce scatterplots. The `logx` and `logy` arguments set a logarithmic scale on the corresponding axis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x1a11a13ba8>"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAmEAAAEOCAYAAADBv8BZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XucXGWd5/HPrzudC+lgYgI9COGiUUZQEodeESNuAo4z\n4zioI17GGVFxjcwKOs4qwdnxys6uCA4vdJhlWC/gZcxgUFC8rA7YICwyJtAJN4UAShIgkDaBdEg6\nne7f/lFVobq6LqeqzuWpOt/364VJVZ06z+88v1OVx3OeXz3m7oiIiIhIunqyDkBEREQkjzQIExER\nEcmABmEiIiIiGdAgTERERCQDGoSJiIiIZECDMBEREZEMaBAmIiIikgENwkREREQyoEGYiIiISAY0\nCBMRERHJwIwkd25mHwb+C+DAXcB7gMuB/ww8Vdzs3e4+XG8/ixYt8qOPPjr2+Hbv3s3cuXNj36+0\nR3kJj3ISHuUkTMpLeLLIyfr167e7+yGNtktsEGZmhwMfBI5z9z1mdjXw9uLLH3X3tVH3dfTRR7Nu\n3brYYxwaGmLFihWx71fao7yERzkJj3ISJuUlPFnkxMx+G2W7pG9HzgDmmNkM4CDg0YTbExEREekI\niQ3C3H0rcDHwCPAY8JS7/6T48j+Y2UYzu8TMZiUVg4iIiEiozN2T2bHZAuAa4G3ATuDbwFrgBuBx\nYCZwBfCgu3+myvtXAasABgYGTlyzZk3sMY6OjtLf3x/7fqU9ykt4lJPwKCdhUl7Ck0VOVq5cud7d\nBxttl+TE/NcAD7v7kwBm9h3gle7+jeLrY2b2VeAj1d7s7ldQGKQxODjoSdzP1b37MCkv4VFOwqOc\nhEl5CU/IOUlyTtgjwCvM7CAzM+A04D4zOwyg+NwbgbsTjEFEREQkSIldCXP3281sLXAHsB+4k8KV\nrR+Z2SGAAcPA2UnFICIirRkZHWPLjj0csWAOC/s1dVckCYn+Tpi7fxL4ZMXTpybZpoiItOe64a2s\nvmYjfT09jE9O8rk3n8Dpyw7POiyRrqNfzBcRkQNGRsdYfc1G9o5PsmtsP3vHJznvmo2MjI5lHZpI\n19EgTEREDtiyYw99PVP/aejr6WHLjj0ZRSTSvTQIExGRA45YMIfxyckpz41PTnLEgjkZRSTSvTQI\nE5GaRkbH2LB5JxOTyfyeoIRnYf8sPvfmE5jd18O8WTOY3dfD5958QnCT80vnpm6Txk99m55EJ+aL\nSOcqn5x99rFj7B7eqsnZOXH6ssNZvmRRsNWRKhxIjvo2XboSJiLTVE7OnnTX5OycWdg/i6WL5wc3\nAFPhQHLUt+nTIExEptHkbAmVzs3kqG/Tp0GYiEyjydkSKp2byVHfpk+DMBGZpnJydo9ZkJOzJX86\npXCgE6lv06eJ+SJSVfnk7O0P3MlpOZqcqyV7phoZHeOeR58GnOOf95zU+qRWHrIuHKgV18joGHvG\nJxgZHYscUyvnWpLnZ9Z9mzcahIlITQv7Z7GwfxZDD1rWoaRG1WFTXTe8lf929TD7i3ep+nqNz79l\naeJ90igPpXMzbbXiKj3/wReP8+ELb4x03rRyrqVxfmbVt3mk25EiIkWqDptqZHSM89ZuODAAAxif\ncD66Ntk+CTUPteLatG3Xgecn3CPF28oxhtov0joNwkREilQdNtWWHXvoten/TPT2WKJ9EmoeasU1\nvHln0/G2coyh9ou0ToMwEZEiVYdNdcSCOUz45LTnJyY90T4JNQ+14lq2eH7T8bZyjKH2i7ROgzDJ\nhVCW4Ug6jlCOs1N1S3VYM+fBpm27WLtuM5u27Zry/pvvf4J7Hn2aT/zZ8cwo+5eir9e46Izm+6RR\nTOWvx52HOD8XH1ixhFkzpsa1ZGDegXh7zarGWxlDq8dYaN+aeo++F8KlifnS9UKZaJ10HKEcZ6fr\n9OqwZs6DT1x7F1/7xSMHHp958pGceN
Rz+ci3NzA+UVgvdEYPfPr0l7D4uQfRanVko5hqvR5HHuL6\nXJTvB5xVr34+7zjpyANxleL9j9tu4dbTXzUl3loxNHOMU9u3ae0nffySDF0Jk64WykTWpOMI5Ti7\nRahL9jTSzHmwaduuKQMwgK/d9ggfuXr4wAAMYP8kfOb6ezj+eQfz6hcd2tIVsHox1Xu93TzE9bmo\n3M/YfueyoU3TtlvYP4s5fb3TroDViyHKMU5vf7Jq+0kdvyRHgzDpaqFMZE06jlCOU7LVzHkwvHln\n9Z3Y9J8j6bXWz6VGMSV57sa173b2E0cMre5D3wvh0yBMulooE1mTjiOU45RsNXMeLFs8v/pO3Kc9\nNeGtn0uNYkry3I1r3+3sJ44YWt2HvhfCl+ggzMw+bGb3mNndZvYtM5ttZseY2e1m9oCZ/ZuZzUwy\nBsm3UCZaJx1HKMcp2WrmPFgyMI8zTz5yynNnnnwkn3/rMvp6n70aNqMHLjpjacvnUqOYkjx349p3\nO/uJI4ZW96HvhfCZV/l/PbHs2Oxw4BbgOHffY2ZXAz8EXgd8x93XmNnlwAZ3/9/19jU4OOjr1q2L\nPcahoSFWrFgR+36lPUnkJZRlaJKOI6n9x5WTNPNQ3hZQ9e+d/I9RvZxE6efSNuP7J/jNyDMsWzyf\nJQPzDrx2z6NPAcbxzzu44T6i9GWjbTdt28Xw5p1T4ohLXOddlP3UykscMbS6j1C+/7KSxb/1Zrbe\n3QcbbZd0deQMYI6ZjQMHAY8BpwLvKL5+FfApoO4gTKRdoSzDkXQcoRxnNWlWaZW3tWd8P2bG7Bm9\n7N0/gbszp29GV1eKNToPquWifOCzsH8Wr37RoXXbaDaf9WJK+tyI63PRzn7iiKHVfYT8vZB3id2O\ndPetwMXAIxQGX08B64Gd7r6/uNkWoPu+AUVkijSrtCrb2j9ZWGpn19h+xiec/ZPkulIsjlzEmU9V\n8EmeJXk7cgFwDfA2YCfw7eLjT7r7kuI2i4EfuvtLq7x/FbAKYGBg4MQ1a9bEHuPo6Cj9/f2x71fa\no7yEp92c7Bmf4OEndzNR9n3Ta8Yxh8xlTl9vHCHWbauWpGJIQ6s5iSMXceYzzXMjDfr+Ck8WOVm5\ncmXmtyNfAzzs7k8CmNl3gFcC881sRvFq2BHAo9Xe7O5XAFdAYU5YEvdzNScsTMpLeNrNycjoGB++\n8Eb2jj9bqTW7r2faj1rGoVpbtSQVQxpazUkcuYgzn2meG2nQ91d4Qs5JktWRjwCvMLODzMyA04B7\ngZ8BZxS3eRdwXYIxiEgA0qzSqmxrRk9hqZ15s2bQ12vM6CHXlWJZVuslvS+RTpPYlTB3v93M1gJ3\nAPuBOylc2foBsMbM/kfxuS8nFYOIxGNi0tmweWdb1VVpLgdU2RYUKiLnzuzl0af20uryO+VaqTgL\npUotjlzEmc9OXypKOkMon79yiVZHuvsngU9WPP0Q8PIk2xWR+Fw3vJUtj+/i8ptub7tyLc0qrcq2\nbtm0PbYKvFaq+UJbwy/Lar2k9yVSKbTPX4l+MV9EaipVrk26d3TlWtbVfKoAFMnOxKQH+/nTIExE\nauqWtefiPI5W9tUt/SjSifZNTAb7+dMgTERq6pa15+I8jlb21S39KNKJZvb2BPv50yBMJMdGRsfY\nsHlnzcvypcq1HrOOrlzLupov6wrAkdExbr7/CW6+/8kgbsGIpKm3x4KtwE162SIRCVTUiaqnLzuc\nG353P9845WVBVRU1K+tqvqwqAK8b3spHvr2B8YnCj6HO6IF/fOuyICYli6Ql1ApcDcJEcqh8ovhe\nCpfpz7tmI8uXLKr65dTbYyxdPD/tMGOXdTVf2hWAI6NjnLd244EBGMD+Sfjo2g01cy3SrUKswNXt\nSJEc0kTxfNiyYw+9PTbt+V5TrkVCoEGYSA5pong+HLFgDhOT09fQnHDlWiQEGoSJ5FDWE8UlHQv7\nZ3
HRGSfQ1/vs1bAZPXDRGUuVa5EAaE6Y5E6IS1dkIdSJqtU0k7NWlxO659GnAOP45x3cccsQ1VPK\ncyvH16pW+yXU/gw1Lul8GoRJroS6dEVWQpyoWqmZnLW6nFAr1YOddC4t7J/Fq190aCpttdovofZn\nqHFJd9DtSMkNLR3TeZrJWavLCdWqHtQyRM1rtV9C7c9Q45LuoUGY5IYqAjtPMzlrdTmhVqoHdS5V\n12q/hNqfocYl3UODMMkNVQR2nmZy1upyQq1UD+pcqq7Vfgm1P0ONS7pHw0GYmR1qZm8ysw+Y2Vlm\n9nIz0+BNOo4qAjtPMzlrdTmhVqoHdS5V12q/hNqfocYl3aPmxHwzWwmcDzwXuBN4ApgNvBF4gZmt\nBT7v7k+nEah0ptCqijqpIjB07eS2mfc2k7N2lhNqtnowSlul45w7s5fd+yYSO+dC+py1+hlr9L6s\njlHfGZKketWRrwPe5+6PVL5gZjOA1wN/CFyTUGzS4UKtKuqEisDQtZPbVt7bTM5aXU6olerBem2V\njtMnnbEJZ3Zf4QZC3J+DED9nrX7Gar0v62PUd4YkpeZtRXf/aLUBWPG1/e5+rbtrACZVqaqoe7WT\n27ycF+XHOVasvNw7Phn78eahP/NwjJJfUeaEfcjMDraCL5vZHWb22jSCk86lqqLu1U5u83JeVDvO\nkjiPNw/9mYdjlPyKMsH+rOK8r9cChwDvAT7b6E1mdqyZDZf997SZ/Y2ZfcrMtpY9/7o2j0ECpKqi\n7tVObvNyXlQ7zpI4jzcP/ZmHY5T8ijIIK5UNvQ74qrtvKHuuJnf/tbsvc/dlwInAM8B3iy9fUnrN\n3X/YSuAStnpVRSOjY2zYvLMrbieUjmXTtl2pHlMzfbhp2y6uvPUhrt/waCzxtVMxlna1WVbnWvlx\nzipWXs7u62n6eBvF30x/xtkXSfRrrX22es500/dMHuQ1X1GWLVpvZj8BjgE+ZmbzgOr/F6+204AH\n3f23Zg3Hb9IlqlUVZT3BNk5pTbyu1W6UPvzEtXfxtV88O7XTgEvf3nhJnkbaqRhLq9os63Ot/Dhb\nqY6MGn+U/oyzL5Lo10b7bPacyTr30pw856vulTArjJg+QeGnKv6Tuz8DzKRwS7IZbwe+Vfb4HDPb\naGZfMbMFTe5LOsjC/lksXTz/wBWwbplgm9bE63rtNurDTdt2TRmAATjwkW8Px3ZFrJTbNN8bRSjn\nWuk4lwzMa+p4m42/Xn/G2RdJ9GvUfUY9Z0LJvUST93yZ+/Rfi56ygdl6dz+x5QbMZgKPAse7+zYz\nGwC2U/j34ALgMHc/q8r7VgGrAAYGBk5cs2ZNqyHUNDo6Sn9/f+z7ler2jE/w8JO7mSg753rNOOaQ\nuczp6z3wXCfkpdqxlFQ7piTbrdXejmfG2bLjmWn76DHj+U3G1wk5KddMP4UoSvxRcxJnXyTRr3Hv\nM+vcd9pnJWtp5CuLnKxcuXK9uw822i7KIOwy4Ep3/2UrgZjZG4APuPu0ikozOxq43t1fUm8fg4OD\nvm7dulaar2toaIgVK1bEvl+pbmR0jOUX3sje8WfvZs/u6+HW1adO+X+3nZCXasdSUu2Ykmy3Vnub\ntu3iNZfcPG0fM3vhto+9pqn4OiEn5ZrppxBFiT9qTuLsiyT6Ne59Zp37TvusZC2NfGWRk+IFrIaD\nsCgT81cCt5nZg8VbiHeZ2cYmYvkLym5FmtlhZa+9Cbi7iX1JB+umJUDimnjdTruN+nDJwDzOPPnI\nKc8ZcPFblnVknzej08+1OOMPdV9J7bPTc583ec9XlCthR1V73t1/23DnZgcBm4Hnu/tTxee+Diyj\ncDvyN8D73f2xevvRlbDu0mgpl07KS1rL0tRqN0p7m7bt4pZNT7KofzYnv2AhQNOT4m+48WcseuHL\nMj2+VpatGRkd47YHt7N9dB+vWrKIJQPzIrXVSnzNiPr+ettVfk4q
ty1/DDS9LFOr8beau7j7NKtl\njuL4/gppGaq0JHnMIV8Ja1gdWRpsmdmhFNaOjKw4kX9hxXPvbGYf0n0W9s/ilk3bu6IaJqvlTJpp\nd8nAvAODj1aqkK4b3sqWx3dx+U23p5KrajE6tHS+XPrv908pTjjz5CP5zBteWretKP3RzrnbzPuj\n5rlyn2898QiuXr+Fvp4e9u6fwN2Z0zcjtvw1s7xQ1Ny181mq1aedOIDJa6VgXpeGivKL+aeb2QPA\nw8BNFK5e/SjhuKSL5b0aJiut9HvpPZPuqeSqWowfXbuR89ZuaPp8qVYd+rXbHmHTtl0124raH62e\nu2lVF37tF48ceDw+4eyfJPH8Vc/dBs5bm+xnvZu+T7rpWCSaKHPCLgBeAdzv7sdQ+M2vWxONSrqa\nliHJRiv9nnauqrXX22P0WvMxDG/eWff5LPojif6st0RSNUnlr2rurIfenqm/DRl3+930fdJNxyLR\nRPnkjrv7CNBjZj3u/jMKc7pEWqJlSLLRSr+nnatq7U1MOhPefAzLFs+v+3wW/ZFEf9ZbIqmapPJX\nNXc+ycTk1HnHcbffTd8n3XQsEk2UQdhOM+sHfg5808wuBfYnG5Z0s7xXw2SllX4vvafHLJVcVYvx\nojNO4KIzljZ9vlSrDj3z5CMPzI9rpz9aPXfTqi488+QjDzzu6zVm9JB4/qrnbikXnZHsZ72bvk+6\n6VgkmijVkXOBPRQGbH8JPAf4ZvHqWCpUHVmQZPVIrcqiJKv+ah1PrbzksWIoCY1yW62fa1VHxp2T\nerG12tambbsY3ryTZcVfrq/VZjNVdWlVR9bTbHVkWp+duCpb42g3C6qODE+nV0fuLv5MxQvd/ari\nz06E/5PTXSbJiplplVWDR3D1ui1AYSmeWb2G9VjsVTrNVMPktWIoCfWqU2v1c2+PsbTi9l7cOam2\nv/I2W62eKq8OraZ8v1GPqd1KriQqwSr3We1xGqodWxqVb91UXddNxyL1RamOfB+wFviX4lOHA9cm\nGZRMlWTFTNXKqtseObAOIsDYhGdapaOKoXjV6s9N23ZF7ue4cxJCjkOIQUTyJcqcsA8Ay4GnAdz9\nAeDQJIOSqZKsmGmmsiqrKh1VDMWrVn8Ob94ZuZ/jzkkIOQ4hBhHJlyj/+o65+77SAzObQeHX7iUl\nSVbMNFNZlVWVjiqG4lWrP5ctnh+5n+POSQg5DiEGEcmXKIOwm8zs74A5ZvaHwLeB7ycblpRLsmKm\nVmXVrBnGzOKaiLN6LdMqHVUMxatWfy4ZmBe5n9vJycjoGBs275xymy+EHFeL4eN/ehxbduyZEmu1\n+EVEWtFwYj5wPvBe4C7g/cAPgS8lGZRMd/qyw1m+ZFEiFTOV+75l03b+7ZdbmNFrmDnnrFzCO046\nMtNBT5LHn0e1+rOZfm4lJ/UmvoeQ4/IY7t76FBf84N5Ylk8SEakmyiDsdcCX3f3/JB2M1JdkxUxp\n36XJyWP7n70tc9nQJt5x0pF13p0OVQzFq1Z/NtPPzWxbPvF9L4Xz67xrNrJ8yaID+wghx6X233bF\nbVNi/ejajYAztt9rxi8i0owotyPfDjxgZp8zsxcnHZBkS5OTJSmddG7FuXySiEgtDQdh7v5XwMuA\nB4GvmtltZrbKzGr/8I50LE1OlqR00rkV5/JJIiK1RPptAnd/GrgGWAMcBrwJuMPMzk0wNslACBOk\npTt10rkV5/JJIiK1NJwTZmZ/BpwFvAD4OvByd3+i+Mv59wFfTDZESVsIE6S7RavLj3TrsiUhnVuN\n+rhWrMcddnDdZZCSVoq7cmHsytfb7d9uPQdDoj6WKBPz3wJc4u43lz/p7s+Y2VnJhCVZC2GCdKdr\ndVmfbl+iKYRzq9XlibLOTXn7Zx87xu7hrVPajyu+rI8zD9THAnVuR5qZAbj7mZUDsDI3JhKVSIdr\ndQkcLZ2TvE7NTWX7k+5T2o8r
vqyPMw/Ux1JSb07Yz8zsXDOb8tsEZjbTzE41s6uAdyUbnkhnarUS\nsJMqCDtVp+amUftxxZf1ceaB+lhK6t2O/GMKc8G+ZWbHADuB2UAv8BMKtyiHkw9RpPO0WgnYSRWE\nnapTc9Oo/bjiy/o480B9LCU1r4S5+153/2d3Xw4cBZwG/IG7H+Xu72s0ADOzY81suOy/p83sb8zs\nuWb2UzN7oPjngpiPSSRzrVYCdlIFYafq1NxUtt9jNqX9uOLL+jjzQH0sJVEm5uPu48BjzezY3X8N\nLAMws15gK/BdCssg3eDunzWz84uPVzezb+luE5POhs07g68YarW6rpHy982d2cvufROMjI4F3Rft\nyKJCLI7cZHF+lre//YE7Oa1iInez8dXq+6yPMw/UxwIRB2ExOA140N1/a2ZvAFYUn78KGEKDMCm6\nbngrWx7fxeU33R50xVCr1XVRLeyfxS2btnd99VSWFWLt5CbLfzBL7Q89aHVfb6RR32d9nHmgPpZI\nP9Yag7cD3yr+fcDdHwMo/nloSjFI4EoVQ5PuQVcMpVHZlIfqqTwcY6jU9yJhMPfqP/h3YAOzc4Bv\nuvuOlhowmwk8Chzv7tvMbKe7zy97fYe7T5sXZmargFUAAwMDJ65Zs6aV5usaHR2lv78/9v1Ka/aM\nT/Dwk7tZNNvZViwS6jXjmEPmMqevN9vgypTinCj77MQdZxptNCOJz0pox9hp2smJ+j45+nclPFnk\nZOXKlevdfbDRdlFuR/4e8EszuwP4CvB/vdHIbao/Ae5w923Fx9vM7DB3f8zMDgOeqPYmd78CuAJg\ncHDQV6xY0UST0QwNDZHEfqU1I6NjfPjCG/nA7+/j83cVTs3ZfT3cevqrgrpkX4pz7/iz1U1xx5lG\nG81I4rMS2jF2mnZyor5Pjv5dCU/IOYmygPffAy8Evgy8G3jAzP6nmb0gYht/wbO3IgG+x7O/L/Yu\n4LrI0XaJkdExNmzeqUv/FUoVQz1m0yqGQuqzNCqbqrXx8T89ji079jAyOhZLf8TZp63sSxVi2Sn1\n/awZPRw0s5dZM/LR9yF9j4hA9OpIN7PHgceB/cACYK2Z/dTdz6v1vuL6kn8IvL/s6c8CV5vZe4FH\nKCyLlBtaqqK+05cdzg2/u59vnPKyAxVDIfZZGpVN5W3cvfUpLvjBvfT19LBnfD9mxuwZvS33R5x9\n2s6+VCGWHS/9r9uBR90sxO8RkYZXwszsg2a2HvgccCvwUnf/a+BE4M313uvuz7j7Qnd/quy5EXc/\nzd1fWPzzd20eQ8fQZNhoenuMpYvnH7gCFmqfLeyfdSDOJNs4YsEcLvjBvQf6YP8kjE+0XrwQZ5/G\nsa80+lGmKuVtbL/zzPgEY/s9mM9VEkL+HpF8i1IduQj4c3f/I3f/dvE3w3D3SeD1iUbXZbRURfPU\nZ9X7oFyz/RFnnyo/nSlvecvb8UrniDII+yFw4GqVmc0zs5MA3P2+pALrRlqqonnqs+p9UK7Z/oiz\nT5WfzpS3vOXteKVzRBmE/W9gtOzx7uJz0iRNRG6e+mx6H8zogb7e6cULre6vnT5VfjpT3vKWt+OV\nzhFlYr6V/ySFu0+aWVq/tN91unkicpTlZ1pZoqab+yyqyj4A2uqPqH1aytfEZO2J20nlJ83ljLJY\nOilrjfLWbX2S9fdIt/WnxCPKYOohM/sgz179+q/AQ8mF1P26camKKJVH7VQndWOfNauyD9rtj0Z9\nWp6vs48dY/fw1pr5ijs/aVay5blqrlbeurVPsvoe6db+lPZFuR15NvBKCgtwbwFOovhL9iIQrfJI\n1UmdpTJfk55e9Vya54rOy+nUJ/FSf0o9UX6s9Ql3f7u7H+ruA+7+Dnev+iv3kk9RKo9UndRZssxX\nmm3rvJxOfRIv9afU0/B2pJkdArwPOLp8e3c/K7mwpJNEqTxSdVJnyTJfabat83I69Um81J9ST5
Tb\nkdcBzwH+HfhB2X8iQLTKI1UndZbKfPWYpZavNM8VnZfTqU/ipf6UeqJMzD/I3VcnHol0tCiVR1lX\nJ0lzyvO1/YE7OS3FicRpnis6L6dTn8RL/Sm1RBmEXW9mr3P3HyYejXS0KJVHqnLsLKV8DT1ombXd\nbW11CvVJvNSfUk2U25EfojAQ22NmT5vZLjN7OunARERERLpZwyth7j4vjUBERERE8iTSL9+b2QLg\nhcDs0nPufnNSQYmIiIh0uyg/UfFfKNySPAIYBl4B3Aacmmxo6UprSQktXRGWvOSjleMsf08e5OVc\nKJfHYxYJSZQrYR8C/hPwC3dfaWa/D3w62bDSldaSElq6Iix5yUcrx1n5ngtf2d3LxeblXCiXx2MW\nCU2Uifl73X0vgJnNcvdfAccmG1Z60lpSQktXhCUv+WjlOKu9Z8uOPV3XNyV5ORfK5fGYRUIUZRC2\nxczmA9cCPzWz64BHkw0rPWktKaGlK8KSl3y0cpzV3mPF57tRXs6Fcnk8ZpEQRamOfFPxr58ys59R\n+PX8HyUaVYrSWlJCS1eEJS/5aOU4q73Hi893o7ycC+XyeMwiIWp4JczMvl76u7vf5O7fA74SZedm\nNt/M1prZr8zsPjM72cw+ZWZbzWy4+N/r2oi/bWktKaGlK8KSl3y0cpzV3tPNE7fzci6Uy+Mxi4Qo\nymzb48sfmFkvcGLE/V8K/NjdzzCzmcBBwB8Bl7j7xU1FmqC0lpTQ0hVhyUM+RkbHOGrhXK4/51Xs\n3jcR+Tgr++audbelEG172qn0y8O5UCmPxywSmpqDMDP7GPB3wJyyX8g3YB9wRaMdm9nBwKuBdwO4\n+z5gn1n6y59EkdaSElq6IizdnI9q1W9LF8+P/P5O6ps4Kv066XjjksdjFglJzduR7v6/ir+Wf5G7\nH1z8b567L3T3j0XY9/OBJ4GvmtmdZvYlM5tbfO0cM9toZl8p/hCsiMQoT9VveTpWEeku5u71NzB7\nE3Cjuz9VfDwfWOHu1zZ43yDwC2C5u99uZpcCTwP/BGynMNf3AuAwdz+ryvtXAasABgYGTlyzZk2z\nx9bQ6Ogo/f39se9X2qO8tG/P+AQPP7mbibLPd68Zxxwylzl9vU3vL+ScxH2snSLknOSZ8hKeLHKy\ncuXK9e4+2Gi7KIOwYXdfVvHcne7+sgbv+z0KP/B6dPHxKcD57v6nZdscDVzv7i+pt6/BwUFft25d\n3ThbMTQ0xIoVK2Lfr7RHeWnfyOgYyy+8kb3jz1bAze7r4dbVp7Z0+ynknMR9rJ0i5JzkmfISnixy\nYmaRBmH57HoOAAAW2ElEQVRRfies2jZRftricWCzmZV+2PU04F4zO6xsszcBd0eIQUSaELX6bWR0\njA2bd1a9ddfqa2kob7/WsQKZxigi0kiU6sh1ZvaPwGUUbiGeC6yPuP9zgW8WKyMfAt4DfMHMlhX3\n9Rvg/c0GLSKNNap+qzeZvd6yRVkvd1Or/fJjvWXTdpZfeKOW5BGRoEUZhJ0LfBz4t+LjnwB/H2Xn\n7j4MVF6Oe2fk6ESkLbWq38ons++lcBvvvGs2snzJIoBpr5UvW1TrfWnc+qsXd+lYG20jIhKKKLcV\ndwPnm1m/u4+mEJOIJKy0bE1pkAJTl62pfK182aJa70tjgFMv7lL7UbYREQlBlF/Mf6WZ3QvcW3y8\n1Mz+OfHIRCQx9ZatqbdsUdbL3URpP+sYRUSiijIx/xIKv3I/AuDuGyj8CKuIdKh6E/frLVuU9XI3\nUdrPOkYRkaiizAnD3TdX/NL9RDLhSBLaWc5FWhdXvyeVv3oT9+stW5T1cjdR2q/cBgqVkq3Eq8+P\niCQlyiBss5m9EvBileMHgfuSDUviknUlW17F1e9J56/esjWtvpaGKO2XtmmnD/X5EZEkRbkdeTbw\nAeBwYCuwrPhYAqflXLIRV78rf+1rpw/V/yKStIaDMHff7u
5/6e4D7n6Iu/+Vu4+kEZy0p1QlVq68\nAk6SEVe/K3/ta6cP1f8ikrQo1ZHPN7Pvm9mTZvaEmV1nZs9PIzhpj6rEshFXvyt/7WunD9X/IpK0\nKLcj/xW4GjgMeB7wbeBbSQYl8VCVWDbi6nflr33t9KH6X0SSFmVivrn718sef8PMzkkqIIlX1pVs\nUXRj9Vlc/V5rP0n1WeV+S48nJj22NtLWTi464fMjIp0ryiDsZ2Z2PrCGwm82vg34gZk9F8Ddf5dg\nfBKDrCvZ6qlWfXZw1kHFJK5+r9xPUhV7lft964lHcPX6LfT19HD2sWPsHt7asZWB7eQi5M+PiHS2\nKIOwtxX/rFxo+ywKgzLND5OW1Frj77KVszOOLFxJrYtYbb9f+8UjAOxlkkl3rb8oIhKzKGtHHpNG\nIJI/tdb42zcxWedd+ZbUuojV9ltJ6y+KiMQrSnXkBWbWW/b4YDP7arJhSR7Uqj6b2RulXiSfkqrY\nq7bfSqoMFBGJV5R/7WYA/2FmJ5jZa4FfAuuTDUvyoFb1WW+P1X3fyOgYGzbvzORHM7Nqu9QukEjF\nXrVcnHnykQce95h1dGVgrbxleS6JiES5HfkxM7sBuB3YAbza3TclHpnkQrXqs6GhB2pun+UyMlm1\nXa3dW1efGnvFXrVcfOi0F7Flxx62P3Anp3XopPxaedOSRCKStSi3I18NXAp8BhgC/snMnpdwXJIj\nC/tnsXTx/IaDiSyXkcmq7VrtApH6rFmVuSg9bnR1MlS1+m/Ttl1akkhEMhflduTFwFvc/X+5+zuA\nK4Abkw1LZLosl5HJqm0tndOeWv03vHmn+lVEMhflJypOdveJ0gN3/46Z3ZRgTCJVZbmMTFZta+mc\n9tTqv2WL56tfRSRzUa6ELTKzL5vZjwHM7DjgjVF2bmbzzWytmf3KzO4zs5PN7Llm9lMze6D454J2\nDkDyI8tlZLJqW0vntKdW/y0ZmKd+FZHMRbkSdiXwVeC/Fx/fD/wb8OUI770U+LG7n2FmM4GDgL8D\nbnD3zxZ/if98YHWzgUs2sl5iKMtlZLJqu9OWzsn6HKlUq/86rV9FpPtEGYQtcverzexjAO6+38wm\nGr3JzA4GXg28u/i+fcA+M3sDsKK42VUUJvtrENYBQqkmy3IZmaza7pSlc0I5RyrV6r9O6VcR6U5R\nbkfuNrOFFJYowsxeATwV4X3PB54Evmpmd5rZl8xsLjDg7o8BFP88tLXQJU1ZViZKZ9A5IiLSHHP3\n+huY/QHwReAlwN3AIcAZ7r6xwfsGgV8Ay939djO7FHgaONfd55dtt8Pdp80LM7NVwCqAgYGBE9es\nWdPUgUUxOjpKf39/7PvtRnvGJ3j4yd1MlJ0vvWYcc8hc5vT11nln85SX8ETJSZrniOhzEirlJTxZ\n5GTlypXr3X2w0XYNB2EAZjYDOBYw4NfuPh7hPb8H/MLdjy4+PoXC/K8lwAp3f8zMDgOG3P3Yevsa\nHBz0devWNYyzWUNDQ6xYsSL2/XajkdExll94I3vHn60om93Xw62rT439do7yEp4oOUnzHBF9TkKl\nvIQni5yYWaRBWKRF+tx9v7vf4+53RxmAFd/zOLDZzEoDrNOAe4HvAe8qPvcu4Loo+5NsqUpPGtE5\nIiLSnCgT89txLvDNYmXkQ8B7KAz8rjaz9wKPAG9JOAaJiarJqsu6GjDr9su1co6EFL+ISJoSHYS5\n+zBQ7XLcaUm2K8lRNdlUWVcDZt1+Nc2cIyHGLyKSlihrR5qZ/ZWZfaL4+Egze3nyoYmELetqwKzb\nb1enxy8i0q4oc8L+GTgZ+Ivi413AZYlFJNIhsl7XMev229Xp8YuItCvK7ciT3P0PzOxOAHffUZzj\nJZJrWa/rmHX77er0+EVE2hXlSti4mfXy7I+1HgJM1n+LSPeLqxpwZHSMDZt3Nn0bLmr71fbfaptx\nKLUN1Iw/y/hERNIS5U
rYF4DvAoea2T8AZwB/n2hUIh2i3YrRdiemN2q/2v4dMpsMXy2eW1efOiV+\nTdYXkbyoOQgzs2Pc/WF3/6aZradQ0WjAG939vtQiFAlcqxWj5RPT9xYvLp93zUaWL1nU1P5qtV9t\n/x9duxFwxvZ7W222otbx3rr6VJYunl93mzTiExFJW73bkWsBzOwGd/+Vu1/m7v+kAZhIPJKemF5t\n/709Rq9lMxk+yvFqsr6I5Em925E9ZvZJ4EVm9reVL7r7PyYXlkj3S3pierX9T0w6xemdibTZbDyV\nbWuyvojkSb0rYW8H9lIYqM2r8p+ItCHpZX6q7f+iM07gojOWZrK0UJTj1dJHIpInNa+EufuvgQvN\nbKO7/yjFmERyI+mloGrtv50221lmKMrxanmszqUlqESaU29i/l+5+zeA48zsxZWv63akSDySXgqq\n2v5bbTOOysUobWt5rM6jqlaR5tW7HTm3+Gc/029F9iccl4gERssMSS06N0RaU+925L8U//x05Wtm\n9jdJBiUi4SlVLu4t+63mUuWirlrlm84NkdZE+cX8aqZVS4pId1PlotSic0OkNa0OwizWKEQkeKpc\nlFp0boi0JsqyRdV4401EwqPqrdqi9I0qF6UWnRsizatXHbmL6oMtA3SNWTqOqrdqa6ZvVLkotejc\nEGlOzduR7j7P3Q+u8t88d2/1CppIJlS9VZv6RkQkG63OCRPpKFqTsDb1jYhINhIdhJnZb8zsLjMb\nNrN1xec+ZWZbi88Nm9nrkoxBBFS9VY/6RkQkG2lcCVvp7svcfbDsuUuKzy1z9x+mEIPEbGR0jA2b\nd3bMLausq7dC7q+F/bP4+OuPY+aMHubO6q3aNxOTnlr8IfeViEicNLdLmtapE9yzqt4Kvb+uG97K\nBdffS1+PMb5/kk/+2fFT4rtueCtbHt/F5Tfdnnj8ofeViEickr4S5sBPzGy9ma0qe/4cM9toZl8x\nswUJxyAx6vRJ3Av7Z7F08fxUr4CF3F/l8e3eN8G+CeeCH9x7IL7S65Puiccfel+JiMTN3JP7yS8z\ne567P2pmhwI/Bc4Ffg1spzBAuwA4zN3PqvLeVcAqgIGBgRPXrFkTe3yjo6P092sZzGbsGZ/g4Sd3\nM1F23vSaccwhc5nT1xtLG92UlzT6qx2N4iu9vmi2s23P9NfTjEWm6qbPSTdRXsKTRU5Wrly5vmIa\nVlWJDsKmNGT2KWDU3S8ue+5o4Hp3f0m99w4ODvq6detij2loaIgVK1bEvt9uNjI6xvILb2Tv+LMT\nuWf39XDr6lNju7rUTXlJo7/a0Si+0usf+P19fP6uGdNeTzMWmaqbPifdRHkJTxY5MbNIg7DEbkea\n2Vwzm1f6O/Ba4G4zO6xsszcBdycVg8Qv6wnunSb0/moUX+n1HrPE4w+9r0RE4pbkxPwB4LtmVmrn\nX939x2b2dTNbRuF25G+A9ycYgyRAy5M0J/T+ahTf6csO54bf3c83TnlZ4vGH3lciInFKbBDm7g8B\nS6s8/86k2pT0aHmS5oTeX43i6+0xli6eH0QsIiLdQr+YLyIiIpIBDcJEREREMqBBmIiIiEgGNAgT\nERERyYAGYSIiIiIZ0CBMREREJAMahImIiIhkQIMwERERkQxoECYiIiKSAQ3CREQCNjI6xobNOxkZ\nHcs6lETl5ThFyiW5dqSIiLThuuGtrL5mI309PYxPTvK5N5/A6csOzzqs2OXlOEUq6UqYiEiARkbH\nWH3NRvaOT7JrbD97xyc575qNXXelKC/HKVKNBmEiIgHasmMPfT1Tv6L7enrYsmNPRhElIy/HKVKN\nBmEiIgE6YsEcxicnpzw3PjnJEQvmZBRRMvJynCLVaBAmIhKghf2z+NybT2B2Xw/zZs1gdl8Pn3vz\nCSzsn5V1aLHKy3GKVKOJ+SI5NzI6xpYdezhiwZzg/uELObY0nL7scJYvWdT1fZCX4xSp
pEGYSI6F\nXJUWcmxpWtg/KxeDkrwcp0g53Y4UyamQq9JCjk1EJC4ahInkVMhVaSHHJiISFw3CRHIq5Kq0kGMT\nEYlLooMwM/uNmd1lZsNmtq743HPN7Kdm9kDxzwVJxiCdo7RsycSkZx1KWzpl+ZWQq9JCjk1EJC5p\nTMxf6e7byx6fD9zg7p81s/OLj1enEIcErHwS9tnHjrF7eGtHTsLutMnkIVelhRybiEgcsrgd+Qbg\nquLfrwLemEEMEpDKSdiT7h05CbtTJ5Mv7J/F0sXzgxzkhBybiEi7zD25Wz9m9jCwA3DgX9z9CjPb\n6e7zy7bZ4e7Tbkma2SpgFcDAwMCJa9asiT2+0dFR+vv7Y9+vNGfP+AQPP7mbieK5ODAHtu81jjlk\nLnP6ejOOLrrK4wDotc47jmr0WQmPchIm5SU8WeRk5cqV6919sNF2Sd+OXO7uj5rZocBPzexXUd/o\n7lcAVwAMDg76ihUrYg9uaGiIJPYrzRkZHePDF97I3vHCROz/9tL9XParPm49/VUddQWk8jgAZvf1\ndNxxVKPPSniUkzApL+EJOSeJ3o5090eLfz4BfBd4ObDNzA4DKP75RJIxSPgqJ2H3mHXkJGxNJhcR\nkWYkdiXMzOYCPe6+q/j31wKfAb4HvAv4bPHP65KKQTpH+STs7Q/cyWktTGbPeombkdExjlo4l+vP\neRW7901oMrmIiNSV5O3IAeC7ZlZq51/d/cdm9kvgajN7L/AI8JYEY5AOUlq2ZOhBa/q9WVclVmt/\n6eL5jd8oIiK5ldggzN0fApZWeX4EOC2pdiV/yqsS91KYj3XeNRtZvmRRKleism5fREQ6k34xXzpe\n1kvcZN2+iIh0Jg3CpONlvcRN1u2LiEhn0iBMOl7WVYlZty8iIp0pjWWLRBKX9RI3WbcvIiKdR4Mw\n6Rql6sq8ti8iIp1FtyNFREREMqBBmIiIiEgGNAgTERERyYAGYSIiIiIZ0CBMREREJAPm7lnH0JCZ\nPQU8UOPl5wBPRXy+8rlFwPa2A2xNrbiT3k/U7RttV+/1dnIC2eUlq5w0855W89KpOYF48hJiTuq9\npu+v9rbX91e6+9H313RHufshDbdy9+D/A65o9rVqz1c+B6wL8ZiS3E/U7Rttl1ROssxLVjlJIy+d\nmpO48hJiTtrNS6fnpJX9hJ6TLPOi76/wchLlv065Hfn9Fl6r9ny9/aQtrlia3U/U7Rttp5zEu5+k\n89KpOYF44gkxJ/VeCz0v+v6K1k6a9P3VXCxB6IjbkUkxs3XuPph1HDKV8hIe5SQ8ykmYlJfwhJyT\nTrkSlpQrsg5AqlJewqOchEc5CZPyEp5gc5LrK2EiIiIiWcn7lTARERGRTGgQJiIiIpIBDcJERERE\nMqBBWBkze6OZ/R8zu87MXpt1PAJm9mIzu9zM1prZX2cdjzzLzOaa2Xoze33WsQiY2Qoz+3nx87Ii\n63gEzKzHzP7BzL5oZu/KOh4pMLNTip+TL5nZ/8sylq4fhJnZV8zsCTO7u+L5PzazX5vZJjM7H8Dd\nr3X39wHvBt6WQbi50GRO7nP3s4G3AkGWGHeLZvJStBq4Ot0o86XJnDgwCswGtqQda140mZM3AIcD\n4ygniWry35WfF/9duR64Kot4S7p+EAZcCfxx+RNm1gtcBvwJcBzwF2Z2XNkmf198XZJxJU3kxMxO\nB24Bbkg3zNy5koh5MbPXAPcC29IOMmeuJPpn5efu/icUBsefTjnOPLmS6Dk5FrjN3f8W0JX8ZF1J\n8//WvwP4VloBVtP1gzB3vxn4XcXTLwc2uftD7r4PWAO8wQouBH7k7nekHWteNJOT4vbfc/dXAn+Z\nbqT50mReVgKvoPAl9j4z6/rvkiw0kxN3nyy+vgOYlWKYudLk52QLhXwATKQXZf40+++KmR0JPOXu\nT6cb6VQzsmw8Q4cDm8sebwFOAs4FXgM8x8yWuPvl
WQSXU1VzUpzb8ucU/lH5YQZx5V3VvLj7OQBm\n9m5ge9kAQJJX67Py58AfAfOBf8oisByr9W/KpcAXzewU4OYsAsu5WnkBeC/w1dQjqpDXQZhVec7d\n/QvAF9IORoDaORkChtINRcpUzcuBv7hfmV4oUlTrs/Id4DtpByNA7Zw8Q+Efe8lGze8vd/9kyrFU\nlddbCFuAxWWPjwAezSgWKVBOwqS8hEc5CY9yEqbg85LXQdgvgRea2TFmNhN4O/C9jGPKO+UkTMpL\neJST8CgnYQo+L10/CDOzbwG3Acea2RYze6+77wfOAf4vcB9wtbvfk2WceaKchEl5CY9yEh7lJEyd\nmhct4C0iIiKSga6/EiYiIiISIg3CRERERDKgQZiIiIhIBjQIExEREcmABmEiIiIiGdAgTERERCQD\nGoSJSNDMbMDM/tXMHjKz9WZ2m5m9qfjaCjN7yszuNLNfm9nNZvb6svd+ysy2mtmwmd1tZqfXaOON\nZvaJssdHmdktZnaXmX2/yXjPMbP3tHq8IpIfeV07UkQ6gJkZcC1wlbu/o/jcUUD5YOrn7v764mvL\ngGvNbI+731B8/RJ3v9jMXgz83MwOrbLg+HkV+/xbCj/s+AUzO6TJsL8C3EoAiwOLSNh0JUxEQnYq\nsM/dLy894e6/dfcvVtvY3YeBz1D4lezK1+4D9gOLyp83sxcBY+6+vezpY4F1xfc9WdxuhZndZGZX\nm9n9ZvZZM/tLM/uP4hWzFxS3fwb4jZm9vI3jFpEc0CBMREJ2PHBHk++5A/j9yifN7CRgEniy4qXl\nVdqYDeyrsu+lwIeAlwLvBF7k7i8HvgScW7bdOuCUJuMWkZzRIExEOoaZXWZmG8zsl/U2q3j8YTMb\nBi4G3ubT12o7jLKBmZldBgwC3yzOJVtctu0v3f0xdx8DHgR+Unz+LuDosu2eAJ4X9bhEJJ80J0xE\nQnYP8ObSA3f/gJktonirsIaXUVist+QSd7+4zvZ7gOdUtHE88BF3r2xnrOzvk2WPJ5n6fTq7uF8R\nkZp0JUxEQnYjMNvM/rrsuYNqbWxmJwAfBy5roo37gCWthVfTi4C7Y96niHQZXQkTkWC5u5vZG4FL\nzOw8CrcNdwOryzY7xczupDA4ewL4YFllZBQ3A583Myu29+88eztyD4VJ+yc1Gfpy4NNNvkdEcsam\nT48QEckXM7sU+L67/3sM+3oZ8Lfu/s72IxORbqbbkSIi8D+pc5uzSYso3BIVEalLV8JEREREMqAr\nYSIiIiIZ0CBMREREJAMahImIiIhkQIMwERERkQxoECYiIiKSAQ3CRERERDLw/wFM7gCpd3aXkAAA\nAABJRU5ErkJggg==\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x107f8c908>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%matplotlib inline\n",
    "gdpVsLifeClean.plot(x=GDP, y=LIFE, kind='scatter', grid=True, logx=True, figsize=(10, 4))"
   ]
  },
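  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The effect of a logarithmic axis is easier to see on a few made-up points. In the sketch below, the dataframe `toyPoints` and its column names are hypothetical. The x values grow tenfold each time, so with `logx=True` they should appear evenly spaced along the x axis instead of being squashed to the left."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "from pandas import DataFrame\n",
    "\n",
    "# Made-up points for illustration only: x grows tenfold at each step.\n",
    "toyPoints = DataFrame({\n",
    "    'x': [1, 10, 100, 1000, 10000],\n",
    "    'y': [50, 58, 66, 72, 80]\n",
    "})\n",
    "\n",
    "# logx=True puts the x axis on a logarithmic scale; the y axis stays linear.\n",
    "toyPoints.plot(x='x', y='y', kind='scatter', grid=True, logx=True, figsize=(10, 4))"
   ]
  },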
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tasks\n",
    "\n",
    "- Swap the axes of the scatterplot, i.e. show the GDP on the y axis and the life expectancy on the x axis.\n",
    "- Display a scatterplot of the GDP and the population."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Now go back to the course step and mark it complete.**"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}