3.2 How did it work?
Judging by the large fine and the harsh comment from the ICO, you may have guessed that the TalkTalk attack was a relatively simple one. It was a type of attack known as a Structured Query Language injection (SQLi) – which, at the time of writing, has been well known and understood within the security field for over a decade.
The Structured Query Language (SQL) is a programming language that is used for managing relational databases and their data. As the contents of most modern commercial websites are database-driven, many web pages are dynamically created based on templates, user inputs, the data in the database and other information. This method enables web pages to be more easily personalised. However, if the designer of the template – which is usually a script or program that can access the databases – does not consider SQLi prevention, an attacker can append SQL codes to their input fields in the web page to manipulate data in the database, even if they are not authorised to do so.
To give you an insight into how the SQLi attack works, Box 1 gives a simplified example.
Box 1 How the SQLi attack works
Suppose there is a landing web page that asks you to enter your username and password, as shown in Figure 4.
This web page is controlled by a script, which will create a personalised web page if the user logs in successfully. The script takes the entered username and password from the input fields of the web page and uses them to construct a SQL query statement. For this example, the query statement is to ask the database to return the user’s stored information, including the password, so that it can compare it with the entered password. As ‘John’ and ‘myPass’ were entered, as shown in Figure 4, the script will create a query statement like the one below:
SELECT * FROM Users WHERE Name = "John" AND Password = "myPass"
Don’t worry if you don’t understand what the above SQL query statement means, as this will be explained now. The first part:
SELECT * FROM Users
asks the database to select all the fields (as the * symbol means all the fields) in the data table named Users. Table 1 shows the contents of the Users data table.
ID | Name | Address | Phone | Password | |
1 | Faisal | 10 ABC Street, Some Town | faisal@abc.com | 01234 567890 | hisPass |
2 | John | 20 ABC Street, Some Town | john@abc.com | 01234 123456 | myPass |
3 | Mei-ling | 30 ABC Street, Some Town | meiling@abc.com | 01234 098765 | herPass |
… | … | … | … | … | … |
156959 | Bert | 1 DEF Drive, Another Town | Bert99@abc.com | 01567 987654 | BertHasAtleastTriedtoUseA securePassword^3 |
The second part:
WHERE Name = "John" AND Password = "myPass"
is a condition statement, which determines which row(s) in the data table are affected. In this example in Table 1, it is the second row as the contents in the Name and Password field match with those in the condition statement. In other words, the SQL statement asks the database to return all the data of the user whose name is ‘John’ and password is ‘myPass’.The following row of data will hence be returned:
ID | Name | Address | Phone | Password | |
2 | John | 20 ABC Street, Some Town | john@abc.com | 01234 123456 | myPass |
Now, if the attacker can find a way to make the database bypass the checking of the username and password, it can potentially obtain all the information in the database. One way to achieve this is to construct the SQL statement as follows:
SELECT * FROM Users WHERE TRUE
As the condition statement is now always TRUE regardless of what the entered username and password are, the database will return everything in the Users table.
An attacker cannot change the SQL query statement directly as they have no control of the script. Nevertheless, they may be able to influence what the constructed SQL query statement will be by carefully crafting and appending SQL codes to the ‘Username’ and ‘Password’ fields in the landing web page. Figure 5 shows an example.
These inputs look quite odd – for instance, they are missing the beginning and ending quotation marks. However, they are specially crafted such that when the script combines the entered username and password to construct the SQL query statement, it will become:
SELECT * FROM Users WHERE Name = "whatever" OR "a"="a" AND Password = "whatever" OR "a"="a"
The condition statement now contains two OR clauses and one AND clause. The OR operator will output TRUE if either of the conditions on the left and right sides of the OR operator is TRUE. As "a"="a" (two identical letters) will always be evaluated as TRUE, the query statement is in effect equivalent to:
SELECT * FROM Users WHERE TRUE AND TRUE
The AND operator will output TRUE if both of the conditions on the left and right sides of the AND operator are TRUE. This means the query statement is equivalent to:
SELECT * FROM Users WHERE TRUE
This query statement will make the database bypass the checking of the username and password and show all the information in the Users data table.
The TalkTalk attackers used a similar SQLi principle to steal TalkTalk’s customer information. SQLi can also be used to add or delete data or even to delete the whole database. The web page designer must therefore ensure that any user inputs obtained through fields in a web page are free of SQL codes. There are a number of ways of doing this validation, but they will not be described here as they are out of the scope of this course.
Activity 9
To reinforce your understanding of the SQLi attack watch Video 1, which explains the attack using an animation.
Transcript: Video 1 Animated explanation of the SQLi attack
[MUSIC PLAYING]