Learn to code for data analysis
Learn to code for data analysis

Start this free course now. Just create an account and sign in. Enrol and complete the course for a free statement of participation or digital badge if available.

Free course

Learn to code for data analysis

4.1 Week 4 glossary

Here is an alphabetical list of the terms introduced this week, for quick look-up.

Programming and data analysis concepts

The bitwise operators & (and) and | (or) are used in pandas to build more complicated expressions from two comparison expressions (typically involving column comparisons).

A Boolean has one of two possible values: True or False .

A Comma Separated Values (CSV) file is a plain text file that is used to hold tabular data.

A list is a sequence of values, separated by commas, and written within square brackets.

There are six comparison operators that can be used to compare number, string and date values. Expressions composed of these operators evaluate to True or False . These operators can also be used to compare every value in a column, row by row, against some number, string or date value. When used in this manner the operators return a series of Boolean values.

The ‘dot’ notation is used to access a dataframe’s methods and attributes.

The Series data type is a collection of values with an integer index that starts from zero. Each column in a dataframe is an example of the Series data type. The Series data type has many of the same methods as the DataFrame data type.

The object data type is how pandas represents strings.

The datetime64 data type is how pandas represents dates.

The int64 data type is how pandas represents integers (whole numbers).

The float64 data type is how pandas represents floating point numbers (decimals).

Functions and methods

asType(aType) when applied to a dataframe column, the method changes the data type of each value in that column to the type given by the string aType .

datetime(yyyy, mm, dd) the function takes three arguments, yyyy a four digit integer representing a year, mm a two digit integer representing a month and dd a two digit integer representing a day. From these arguments the function creates and returns a value of datetime64 .

dropna() when applied to a dataframe returns a new dataframe without the rows that have at least one missing value.

head() gets and displays the first five rows of a dataframe. Optionally the method can take an integer argument to specify how many rows (from and including row 0) to get and display.

iloc[index] gets and displays the row in the dataframe indicated by the integer argument index .

isnull() is a series method that checks which rows in that series have a missing value.

fillna(value) is a series method that returns a new series in which all missing values have been filled with the given value.

plot() when applied to a dataframe column of numeric values, the method displays a graph of those values. The x-axis shows the dataframe’s index and the y-axis the range of the column’s values. Before the method is called you first need to execute %matplotlib inline .

read_csv(csvFile) creates a dataframe from the dataset in the CSV file.

rename(columns={oldName : newName}) renames the column oldName to newName .

str.rstrip(suffix) when applied to a dataframe column of string values, the method removes the argument suffix from the end of each string value in the column.

tail() gets and displays the last five rows of a dataframe. Optionally the method can take an integer argument to specify how many rows (until and including the last row) to get and display.

to_datetime(aSeries) when applied to a series, typically a column from a dataframe, this function returns a new series in which each value in aSeries has been changed to type datetime64 .

LCDAB_1

Take your learning further

Making the decision to study can be a big step, which is why you'll want a trusted University. The Open University has 50 years’ experience delivering flexible learning and 170,000 students are studying with us right now. Take a look at all Open University courses.

If you are new to University-level study, we offer two introductory routes to our qualifications. You could either choose to start with an Access module, or a module which allows you to count your previous learning towards an Open University qualification. Read our guide on Where to take your learning next for more information.

Not ready for formal University study? Then browse over 1000 free courses on OpenLearn and sign up to our newsletter to hear about new free courses as they are released.

Every year, thousands of students decide to study with The Open University. With over 120 qualifications, we’ve got the right course for you.

Request an Open University prospectus371