Learn to code for data analysis

Week 7: Further techniques Part 1


Welcome to Week 7

Please note: in the following video, where reference is made to a study ‘week’, this corresponds to Weeks 7 and 8 of this course.

Video transcript:


The ability to make comparisons between different things lies at the heart of statistics. Is this number bigger than that number? Is the average of this group more or less than the average of that group? Or are these different things correlated in some way? Combining data sets is one way of getting data into a form that makes comparisons possible. But we might also be able to look within a particular set for groups of elements that we can compare with each other.
For example, when it comes to totting up the bills each month, if the items on our bank and credit card statements were tagged with a category such as food, transport, utilities or entertainment, we could more easily find out just how much we're spending on one aspect of our lives compared to another. Or if you were in a sales team you might find it useful to compare sales by month, by region or by sales rep. Ordering large data sets, grouping them into sets of related items and treating each group as a small data set in its own right is another of those things that's hard for us but easy for a program to do.
And as well as processing chunks of a data set as separate groups of items we can also twist and contort a data set, making new rows from columns or many columns from one column and bending it into a shape where we can group it, summarise it, or generate interesting data visualisations directly from it. So that's what we'll be covering in this final week of the course. How to work with groups of data in a single data set and how to get our data sets doing data gymnastics.

In Week 6 you saw how to merge two datasets containing a common column to create a single, combined dataset. Combining datasets allows us to make comparisons across datasets, as you discovered when looking for correlations between GDP and life expectancy.
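As a reminder of what merging on a common column looks like, here is a minimal sketch. It assumes the pandas library, and the country names and figures are invented purely for illustration; they are not taken from the course datasets.

```python
import pandas as pd

# Two small, made-up tables that share a 'Country' column
gdp = pd.DataFrame({
    'Country': ['Country A', 'Country B', 'Country C'],
    'GDP': [100, 200, 300],
})
life = pd.DataFrame({
    'Country': ['Country A', 'Country B', 'Country D'],
    'Life expectancy': [70, 80, 75],
})

# An inner join keeps only the countries that appear in both tables
combined = pd.merge(gdp, life, on='Country', how='inner')
print(combined)
```

Only 'Country A' and 'Country B' appear in both tables, so the combined table has two rows.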

This week, you’ll learn how to go the other way, separating out distinct ‘subsets’ or groups of data, before summarising them individually.
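The idea of splitting a dataset into groups and then summarising each group can be sketched as follows. This assumes the pandas library; the column names and spending amounts are hypothetical, loosely modelled on the tagged bank statement example from the video.

```python
import pandas as pd

# Made-up statement items, each tagged with a category
spending = pd.DataFrame({
    'Category': ['food', 'transport', 'food', 'utilities', 'transport'],
    'Amount': [25.0, 12.5, 40.0, 60.0, 7.5],
})

# Split the rows into one group per category, then total each group
totals = spending.groupby('Category')['Amount'].sum()
print(totals)
```

Each category becomes a small dataset in its own right, and `sum()` reduces it to a single figure. Other summaries, such as `mean()` or `max()`, could be applied in the same way.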

As well as splitting out different groups of data, row and column values can be rearranged to reshape a dataset and allow the creation of a wide range of pivot table style reports from a single data table.

In this week’s exercises, you’ll learn how a single line of code can be used to generate a wide variety of pivot table style reports of your own.
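To give a flavour of what such a report might look like, here is a minimal sketch of a pivot table built with one function call. It assumes the pandas library and uses invented sales figures; the column names are hypothetical and the exercises may use a different approach.

```python
import pandas as pd

# Made-up sales records: one row per sale
sales = pd.DataFrame({
    'Month': ['Jan', 'Jan', 'Feb', 'Feb', 'Feb'],
    'Region': ['North', 'South', 'North', 'South', 'South'],
    'Sales': [100, 150, 120, 80, 50],
})

# One line reshapes the table: regions become rows, months become columns,
# and each cell holds the total sales for that region and month
report = pd.pivot_table(sales, index='Region', columns='Month',
                        values='Sales', aggfunc='sum')
print(report)
```

Changing the `index`, `columns` or `aggfunc` arguments produces a different report from the same underlying table, for example average sales by month instead of totals by region.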

