Learn to code for data analysis

4.1 Week 7 and 8 glossary

Here are alphabetical lists, for quick lookup, of what Weeks 7 and 8 introduced.

Concepts

An API, or application programming interface, provides a way for computer programs to request functions or resources from a software application. Many online applications provide a web API that allows requests to be made over the internet using web addresses, or URLs (uniform resource locators). A URL may include several parameters, which act as arguments that pass information into a function provided by the API. To prevent ambiguity, punctuation characters that have a special meaning in a URL are not written literally inside parameter values. Instead, ‘websafe’ encodings are typically used, in which each such character is replaced by a percent sign followed by its ASCII code in hexadecimal (for example, %20 for a space).
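As a minimal sketch of how parameters are encoded into a URL, the following uses Python's standard urllib.parse module. The web address and the parameter names (q and max) are hypothetical and chosen only for illustration; a real API documents its own.

```python
from urllib.parse import urlencode, quote

# Hypothetical endpoint and parameter names, for illustration only.
base_url = "https://api.example.com/search"
params = {"q": "fish & chips", "max": 10}

# urlencode() makes each value websafe and joins the pairs with '&'.
# Spaces become '+' and the literal '&' becomes '%26'.
query = urlencode(params)
url = base_url + "?" + query
print(url)

# quote() encodes a single piece of text, using '%20' for a space.
encoded = quote("50% off, today!")
print(encoded)
```

Note that the ampersand inside the value is encoded, so it cannot be confused with the ampersand that separates one parameter from the next.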

The notion of grouping refers to the collecting together of sets of rows based on some defining characteristic. Grouping on one or more key columns splits the dataset into separate groups, one group for each unique combination of values that appears in the dataset across the key columns. Note that not all possible combinations of cross-column key values will necessarily exist in a dataset.
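The point about missing combinations can be illustrated with a small, made-up dataframe (the column names and values are assumptions, not course data). Two countries and two years could give four combinations, but only three appear in the rows, so only three groups are formed.

```python
import pandas as pd

# Made-up data: there is no row for France in 2014.
df = pd.DataFrame({
    "country": ["UK", "UK", "France", "France", "UK"],
    "year": [2013, 2014, 2013, 2013, 2014],
    "value": [10, 20, 30, 40, 50],
})

# Group on two key columns.
grouped = df.groupby(["country", "year"])

# One group per combination that actually occurs in the data.
n_groups = grouped.ngroups
group_sizes = grouped.size()
print(n_groups)
print(group_sizes)
```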

The split-apply-combine pattern describes a process in which a dataset is split into separate groups, some function is applied to the members of each separate group, and the results then combined to form an output dataset.
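To make the three stages visible, here is a sketch that carries out each one by hand on made-up data. In practice a single groupby expression does all three at once; the explicit version is only meant to show what happens at each stage.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "sales": [5, 7, 11, 13],
})

# Split: one smaller dataframe per region.
pieces = {name: group for name, group in df.groupby("region")}

# Apply: total the sales within each piece.
totals = {name: group["sales"].sum() for name, group in pieces.items()}

# Combine: collect the per-group results into one output.
result = pd.Series(totals, name="sales")
print(result)
```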

Functions and methods

df.to_csv(filename,index=False) writes the contents of the dataframe df to a CSV file with the specified filename in the current folder. The index parameter controls whether the dataframe index is included in the output file.
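A short sketch of writing a dataframe to a CSV file follows. The data and the file name scores.csv are made up, and the file is written to a temporary folder rather than the current folder so that the example leaves nothing behind in your working directory.

```python
import os
import tempfile

import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]})

# Write to a temporary folder (an assumption made to keep the example tidy).
folder = tempfile.mkdtemp()
path = os.path.join(folder, "scores.csv")

# index=False leaves the row index (0, 1, ...) out of the file.
df.to_csv(path, index=False)

with open(path) as f:
    text = f.read()
print(text)
```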

read_csv(URL,dtype={}) can be used to read a CSV file from a web location, given the web address or URL of the file. We also made use of an additional parameter, dtype, to specify the data types of particular columns in the dataframe created from the CSV file.
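The effect of dtype can be sketched as follows. So that the example runs without a network connection, a string of CSV text wrapped in io.StringIO stands in for the URL; that substitution, and the column names, are assumptions made for illustration.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a file at a web address.
csv_text = "code,amount\n007,1.5\n042,2.5\n"

# Without dtype, the codes are read as numbers and lose their leading zeros.
default = pd.read_csv(io.StringIO(csv_text))

# With dtype, the code column is kept as text.
typed = pd.read_csv(io.StringIO(csv_text), dtype={"code": str})

print(default["code"].tolist())
print(typed["code"].tolist())
```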

df.groupby(columnName) or df.groupby(listOfColumnNames) is used to split a dataframe into separate groups indexed by the unique values of columnName or unique combinations of the column values specified in the listOfColumnNames.

grouped.get_group(groupName) is used to retrieve a particular group of rows by group name from a set of grouped items.

grouped.groups.keys() is used to retrieve the names of groups that exist within a set of grouped items.
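The three grouping functions above can be tried together on a small, made-up dataframe. When grouping on a list of columns, each group name is a tuple of values, one per key column.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "country": ["UK", "France", "UK", "France"],
    "year": [2013, 2013, 2014, 2014],
    "value": [1, 2, 3, 4],
})

# Group on a single column and list the group names.
grouped = df.groupby("country")
names = sorted(grouped.groups.keys())
print(names)

# Retrieve the rows belonging to one named group.
uk_rows = grouped.get_group("UK")
print(uk_rows)

# Group on a list of columns: each name is a (country, year) tuple.
by_pair = df.groupby(["country", "year"])
pair_names = sorted(by_pair.groups.keys())
print(pair_names)
```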

grouped.aggregate(operation) applies a specified operation (such as sum) to each group and then combines the results into a single dataframe indexed by group.
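As a brief sketch with made-up data, the following totals every numeric column within each group. The operation is given here by its name, "sum", which is one of the ways pandas accepts it.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "country": ["UK", "UK", "France"],
    "exports": [10, 20, 5],
    "imports": [1, 2, 4],
})

grouped = df.groupby("country")

# One row per group, one column per summed column.
totals = grouped.aggregate("sum")
print(totals)
```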

grouped.apply(myFunction) applies a user-defined function to the rows associated with each group in a set of grouped items and returns a dataframe containing the combined rows returned by that function.
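Here is a sketch of apply with a user-defined function that keeps the single highest-value row of each group. The data and the function name top_by_value are made up. The columns to work on are selected explicitly after grouping, a choice made here so that the function receives the same columns whichever version of pandas is installed.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "country": ["UK", "UK", "France", "France"],
    "partner": ["US", "DE", "US", "DE"],
    "value": [5, 9, 7, 3],
})

def top_by_value(group):
    # Return the one row of this group with the largest value.
    return group.sort_values("value", ascending=False).head(1)

# The rows returned for each group are combined into one dataframe.
top = df.groupby("country")[["partner", "value"]].apply(top_by_value)
print(top)
```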

grouped.filter(myFilterFunction) applies a user-defined filter function to each group in a set of grouped items. The filter function tests a group and returns a Boolean True or False value to say whether that group has passed the test. The .filter() method then returns a single dataframe that contains the combined rows from the groups that the filter function let through.
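A minimal sketch with made-up data: the filter function below passes a group only if its values total more than 10, so every row of a passing group is kept and every row of a failing group is dropped. The function name and the threshold are assumptions for illustration.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "country": ["UK", "UK", "France", "France"],
    "value": [5, 9, 7, 3],
})

def total_over_ten(group):
    # True if the whole group passes the test, False otherwise.
    return group["value"].sum() > 10

# UK totals 14 and passes; France totals 10 and does not.
big = df.groupby("country").filter(total_over_ten)
print(big)
```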

pivot_table(df, index=indexColumnNames, columns=columnsColumnNames, values=valueColumnName, aggfunc=aggregationFunction) generates a pivot table from a dataframe using unique combinations of values from one or more columns specified by the indexColumnNames list to define the row index and unique combinations of values from one or more columns specified by the columnsColumnNames list to define the columns. The pivot table cells are calculated by applying the aggfunc function to the valueColumnName column in the group of rows associated with each cell.
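The following sketch builds a small pivot table from made-up data, with countries as the row index, years as the columns and the summed value in each cell. The aggregation function is given by its name, "sum".

```python
import pandas as pd

# Made-up data: the UK has two rows for 2013.
df = pd.DataFrame({
    "country": ["UK", "UK", "UK", "France", "France"],
    "year": [2013, 2013, 2014, 2013, 2014],
    "value": [1, 2, 3, 4, 5],
})

table = pd.pivot_table(
    df,
    index=["country"],   # one row per country
    columns=["year"],    # one column per year
    values="value",      # the column to aggregate
    aggfunc="sum",       # how to combine the rows behind each cell
)
print(table)
```

The cell for the UK in 2013 combines two source rows, which is why an aggregation function is needed.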

