Learn to code for data analysis

1.3 Summary operations

Summary, or aggregation, operations are used to produce a single summary value or statistic, such as the group average, for each separate group.

Find the ‘total’ amount within each group using a summary operation:

Figure 4 Applying a summary operator to the rows contained within each group, separately for each group. Three tables, one each for Commodity A, B and C, list the amounts; the amounts for each commodity are then totalled.

To apply a summary operator to each group, such as a function that finds the mean value of each group, and automatically combine the results into a single output dataframe, pass the name of the function to the aggregate() method. If the grouped rows contain more than one column that can be summarised, pandas will try to apply the operator to each of those columns separately. For example, if there were also a ‘Volume’ column, summing would return the total volumes as well.
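The column-by-column behaviour can be sketched as follows. The dataframe name trade, the ‘Volume’ column and all of the values are hypothetical, made up only for illustration. The operation is passed as the string 'sum', which pandas also accepts as the name of a summary operation.

```python
import pandas as pd

# Hypothetical data: the 'Volume' column and all values are invented for this sketch.
trade = pd.DataFrame({
    'Commodity': ['A', 'A', 'B', 'B', 'C'],
    'Amount': [10, 15, 10, 5, 20],
    'Volume': [1, 2, 3, 4, 5],
})

# One row per commodity; every summarisable column is totalled separately.
totals = trade.groupby('Commodity').aggregate('sum')
print(totals)
```

The result has one ‘Amount’ total and one ‘Volume’ total for each commodity.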

Let’s reuse the example dataframe defined earlier:

In []:

df

Out[]:

	Commodity	Amount
0	A	10
1	A	15
2	A	5
3	A	20
4	B	10
5	B	10
6	B	5
7	C	20
8	C	30
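The code that defines this dataframe is not reproduced in this section. A minimal sketch that rebuilds it from the values shown above, assuming it is held in a variable named df (the name used in the code that follows), might look like this:

```python
import pandas as pd

# Rebuild the example dataframe from the values in the table above.
df = pd.DataFrame({
    'Commodity': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'Amount': [10, 15, 5, 20, 10, 10, 5, 20, 30],
})
print(df)
```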

Group the data by commodity type and then apply the sum operation and combine the results in an output dataframe. The grouping elements are used to create index values in the output dataframe.

In []:

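# Split the rows into one group per commodity
grouped = df.groupby('Commodity')

# Sum each group and combine the results into a single dataframe
grouped.aggregate(sum)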

Out[]:

	Amount
Commodity
A	50
B	25
C	50

In this case, the aggregate() method applies the sum summary operation to each group and then automatically combines the results. For a summary operation such as this, the resulting combined dataframe contains as many rows as there were groups created by the splitting .groupby() operation.
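As a quick check of the point about row counts, the sketch below rebuilds the example dataframe and compares the number of rows in the summary with the number of groups. It passes the string 'sum' rather than the built-in sum function. This is a choice made here because some recent pandas releases emit a warning for the built-in, and the totals are the same either way.

```python
import pandas as pd

# The example dataframe, rebuilt from the table shown earlier.
df = pd.DataFrame({
    'Commodity': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'Amount': [10, 15, 5, 20, 10, 10, 5, 20, 30],
})

grouped = df.groupby('Commodity')
totals = grouped.aggregate('sum')

# One output row for each group created by the split.
print(len(totals), grouped.ngroups)

# The grouping values have become the index of the output dataframe.
print(list(totals.index))
```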

Figure 5 The individual summaries for each group (Commodity A, B and C), the results shown in the previous figure, are combined into a single dataframe.

The slightly more general apply() method can also be substituted for the aggregate() method and will similarly take the rows associated with each group, apply a function to them, and return a combined result.
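A small sketch of that substitution: the same totals are computed once with aggregate() and once with apply(). The ‘Amount’ column is selected first so that the function passed to apply() only ever receives amounts. The lambda is just an illustrative stand-in for any function that reduces a group to a single value.

```python
import pandas as pd

# The example dataframe, rebuilt from the table shown earlier.
df = pd.DataFrame({
    'Commodity': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'Amount': [10, 15, 5, 20, 10, 10, 5, 20, 30],
})

amounts_by_commodity = df.groupby('Commodity')['Amount']

# Summary operation via aggregate()
by_aggregate = amounts_by_commodity.aggregate('sum')

# The same summary via the more general apply()
by_apply = amounts_by_commodity.apply(lambda amounts: amounts.sum())

print(by_aggregate)
print(by_apply)
```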

The apply() method can be really handy if you have defined a function of your own that you want to apply to just the rows associated with each group. Simply pass the name of the function to the apply() method and it will then call your function, once per group, on the sets of rows associated with each group.

For example, find the top two items by ‘Amount’ in each group:

In []:

def top2byAmount(g):
    # Sort the rows of the group by 'Amount', largest first, and keep the top two
    return g.sort_values('Amount', ascending=False).head(2)

grouped.apply(top2byAmount)

Out[]:

		Amount
Commodity
A	3	20
	1	15
B	4	10
	5	10
C	8	30
	7	20

The second index column, containing the numbers 3, 1, 4 and so on, holds the original index value of each row.
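The two index levels can be inspected directly. The following is a self-contained sketch of the same top-two example under one assumption: the ‘Amount’ column is selected with [['Amount']] before calling apply(), so that the function sees only that column. Depending on the pandas version, calling apply() on the whole grouped dataframe may also pass the grouping column to the function or emit a warning.

```python
import pandas as pd

# The example dataframe, rebuilt from the table shown earlier.
df = pd.DataFrame({
    'Commodity': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'Amount': [10, 15, 5, 20, 10, 10, 5, 20, 30],
})

def top2byAmount(g):
    # Sort the rows of the group by 'Amount', largest first, and keep the top two
    return g.sort_values('Amount', ascending=False).head(2)

top2 = df.groupby('Commodity')[['Amount']].apply(top2byAmount)
print(top2)

# First index level: the group each row came from
print(top2.index.get_level_values(0))

# Second index level: the row's index in the original dataframe
print(top2.index.get_level_values(1))
```

Because the second level keeps the original index values, any row of the result can be traced back to the original dataframe, for example with df.loc[3].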

In Week 3 the apply() method was called on a column, to apply the given function to each cell. Here it was called on a grouped dataframe, to apply the given function to each group.
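The contrast between the two uses of apply() can be sketched as follows. Both functions are hypothetical examples invented here: one doubles each cell, the other computes the spread (largest minus smallest amount) of each group. For simplicity the group-wise call is made on the grouped ‘Amount’ column rather than on the whole grouped dataframe.

```python
import pandas as pd

# The example dataframe, rebuilt from the table shown earlier.
df = pd.DataFrame({
    'Commodity': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'Amount': [10, 15, 5, 20, 10, 10, 5, 20, 30],
})

# Called on a column: the function receives one cell at a time.
doubled = df['Amount'].apply(lambda amount: amount * 2)
print(doubled)

# Called on grouped data: the function receives all the amounts of one group at a time.
spread = df.groupby('Commodity')['Amount'].apply(
    lambda amounts: amounts.max() - amounts.min()
)
print(spread)
```

The first result has one value per original row, while the second has one value per group.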

Exercise 3 Experimenting with split-apply-combine

Work through Exercise 3 in your Exercise notebook 4 to practise the summary operations.

As you complete the tasks, think about these questions:

For your dataset, which months saw the highest and lowest levels of trade activity? Did there appear to be any seasonal behaviour?
When graphically comparing total trade flows from the leading partner countries to the World total, did it look as if any partners particularly dominated that area of trade? If you have time, find news reports discussing why this should be the case.
