IT: e-government
IT: e-government

This free course is available to start right now. Review the full course description and key learning outcomes and create an account and enrol if you want a free statement of participation.

Free course

IT: e-government

3.3 Entities and attributes

A well-designed relational database overcomes the problems outlined in Section 3.2 by using two or more tables, rather than a single table, such as Table 1. This means that the data has to be divided in some way between the tables. The construction of tables is done according to several rules. I will just look briefly at one rule to give you an idea of the underlying principles.

The rule I shall look at states that there must be one table per entity. An entity is an item for which we want to store information. It can be tangible – for example a person or an object – or intangible – for example an event. (Typical events might be a sale, registration or renewal.) An entity can be a concept – for example a bank account.

In Table 1 the most obvious entity is 'student'. Any particular student, such as Colin Cherry, is an instance of this entity. Entities are the things we store data about in tables. Each piece of information we store is an attribute of that entity. An attribute is descriptive information about an entity. For the 'student' entity, attributes in Table 1 are name, address and postcode. In a properly constructed table there is one field per attribute.

You might argue that 'Course1', 'Course2' and 'Course3' are also attributes of 'student', and in some contexts that would be reasonable. In this case, though, there is descriptive information about courses, namely the number of sessions they run for. Courses therefore have attributes. In this context, then, 'course' is another entity. Whether something is regarded as an entity or not is dependent on the context.

Activity 7 (exploratory)

Does 'postcode' have any attributes in Table 1 (see screen 6)?

Discussion

No. There are no attributes of 'postcode'. There is no descriptive information, or label or other qualifying information about postcodes.

I said above that in a properly constructed relational database we have one table per entity. Table 2 is the table for the student entity and Table 3 is the one for the course entity. I have added a few more fields to the tables to make them more realistic. For instance, Table 3 has a column for 'Fee' and a column for 'Course_code'.

Table 2 Student entity table

Student_index Family_name Given_name Address Postcode
1 Lodhi Mona 22 The Grove, Newport AB12 3CD
2 Jones Bob 2 High Street, Stratford AB6 4PQ
3 Cherry ColinI 59 Acacia Avenue, Brompton AB2 12ZY
4 Cherry Colin 13 The Limes, Leighton XY6 7LR
5 Edwards Delia 40 Eldon Court, Hampton XY12 4TK
6 Roberts Albert 18 Mount Pleasant, Greenhill AB12 7UB
7 Singh Sara 7 Marina View, Sutton ABM 8WQ
8 Chang Patrick 21 Green Lane, Newport AB12 9TU
9 Evans Mary 13 The Limes, Leighton AB6 7LR

Table 3 Course entity table

Course-code Course Sessions Fee
E01 Yoga 10 £20
E02 Digital photography 10 £20
E03 Everyday maths 20 £40
E04 Oil painting 15 £30
E05 Creative writing 20 £40
E06 Holiday Spanish 15 £30
E07 Woodwork 15 £30
E08 IT for all 5 £10
E09 Drawing 15 £30
E10 Sewing 15 £30
E11 Ballroom dancing 30 £60

The next step is to link the records in these two tables, which I will do shortly. Linking the records in separate tables requires the use of keys. A key uniquely identifies each record of a table. Table 1 did not have any keys. ('Name' cannot be a key because, as Colin Cherry shows, names might not be unique.) In Table 2 I have introduced a new column named 'Student_index'. Each entry in this new column uniquely identifies a particular record. The 'Student_index' in fact consists of just the sequential numbering of the records. This is the simplest way to create a key when there is no other key readily available.

The ultimate intention is to link students to courses through the keys. By associating a particular student with a particular course we have a way of representing an enrolment. We need therefore to think about a suitable key for Table 3.

Activity 8 (self-assessment)

In Table 3 I have not numbered the records as a key, as I did in Table 2, because I would like to use the course code as a key. How valid is the course code as a key under the following circumstances?

  1. Courses are presented only once, and never repeated.

  2. Courses are run only once per year, and repeated the following year.

  3. Some courses are run twice or more per year.

Answer

  1. If the course is never repeated, then we can safely use the course code as a key, because the code identifies something unique.

  2. If courses are repeated annually there is a problem with using the course code as a key because it no longer represents something unique. For instance, students enrolling for Yoga in successive years are enrolling for different instances of the course. Each instance of the course would need its own record in the database. However, if the database is made afresh each year, and just covers one year's worth of data, then the problem does not arise and the course code can be used as a key.

  3. If courses are run twice or more per year, there is a problem with using the course code as a key for the same reasons as in (2). That is, each presentation of the course is a new instance (assuming the database has a lifetime of a year or more), and so the code does not represent something unique.

In the following I am going to assume that the course code can be used as a key in Table 3; that is, the conditions in (1) or (2) of Activity 8 apply, with the proviso in the case of (2) that the database holds only a year's worth of data.

Splitting Table 1 into Tables 2 and 3 has lost the relationship between the entities 'student' and 'course'. We can no longer tell who is enrolled for what. The essence of the relational approach to database design is to relate entities to each other by relating their keys. In this example we would do this by constructing a joining table to capture the relationship between entities. Table 4 is my joining table. Student keys are on the left, and the keys from the courses table are on the right. To keep things simple, I am using the course code as the key although, as you saw in Activity 8, this is acceptable only under certain conditions.

I have annotated the table with names so that you can identify the records; but the annotations are not part of the table.

Table 4

Joining table for students and courses

You will notice in Table 4 that student keys can appear more than once on the left, and courses can appear more than once on the right. This follows from the fact that the relationship between the student entity and the course entity is many-to-many. That is to say, one student can enrol for many courses, and each course can be taken by many students.

The entity tables and joining table taken together constitute the relational database. A real example might have many entity tables and many joining tables. It might also have no joining tables – for reasons I will come back to.

Notice how the new relational database solves the problems I identified with Table 1. If someone enrols for several courses, we just add more records to Table 4, as required. Each record in Table 4, therefore, represents the enrolment of a student on a course. We do not need to have empty fields available to accommodate additional enrolments by a student, as we did in Table 1. Also, if there is an administrative change to a course, such as the number of sessions, we need make only a single change, in Table 3. (You will recall that in Table 1, the repetition of the same piece of data in several places meant that administrative changes would require hunting out every occurrence of the data and modifying it.)

The process of organising data efficiently into tables, so that unnecessary repetition is avoided and so that each table represents a single entity and its attributes, is known as normalisation. The process we have gone through with Tables 1, 2, 3 and 4 is a very simple example of part of the normalisation process.

In the light of the normalisation we have just done, we can criticise Table 1 in more appropriate language than before. The trouble with Table 1 is that:

  • it contains data about more than one entity;

  • it attempts to capture the relationship between entities.

It is therefore mixing different functions, and this is why it is unsatisfactory.

Activity 9: (self-assessment)

Table 5 is a flat database relating to some regular committee meetings. What are the entities and their associated attributes?

Table 5 Committee information

Committee Meeting time Member Telephone number
Planning Every Monday a.m. Jones 1239
Patel 4728
Robinson 3589
Recreation Every Friday p.m. Jones 1239
Smith 4633
Education Final Wednesday of month, p.m. Patel 4728
Robinson 3589
Smith 4633

Answer

One entity is 'Committee'. The associated attribute is 'Meeting time'.

Another entity is 'Member'. The associated attribute is 'Telephone number'.

I mentioned that some relational databases might have no joining tables. The following activity is designed to show how this might come about.

Activity 10: (exploratory)

What would happen to the joining table, Table 4, if this particular local authority allowed people to enrol for only one course at a time? Write the first few rows of a new joining table on the assumption that each student takes only their 'Course1' as shown in Table 1.

Discussion

Table 6 shows the first five rows of the new joining table. Notice that in Table 6 each student key appears only once, reflecting the fact that students are allowed only one enrolment at a time.

Table 6: Modified joining table: one course per student

Student_index Course_code
1 E01
2 E08
3 E02
4 E05
5 E02

The relationship between the keys in Table 6 could be captured by a simple amendment to the student table, which would remove the need for a joining table. Table 7 shows the first few rows of the modified student table.

Table 7: Modified student entity table

Student_index Family_name Given_name Address Postcode Course_key
1 Lodhi Mona 22 The Grove, Newport AB12 3CD E01
2 Jones Bob 2 High Street, Stratford AB6 4PQ E08
3 Cherry Colin 59 Acacia Avenue, Brompton AB2 12ZY E02
4 Cherry Colin 13 The Limes, Leighton XY6 7LR E05
5 Edwards Delia 40 Eldon Court, Hampton XY12 4TK E02

You can see that Table 7 now has a new field, for the course key. This new field links records in Table 7 to those in Table 3, and lets us see enrolments at a glance. No joining table is needed.

The additional key in Table 7 (the course code) is known as a foreign key. Using a foreign key has enabled us to dispense with the joining table, but only because the relationship between students and courses is no longer many-to-many. When the relationship between entities is many-to-many, a joining table is needed.

Activity 11 (self-assessment)

Suppose Table 5 is to be normalised into separate tables for the 'Committee' entity and the 'Member' entity.

  1. Why is a joining table needed?

  2. Construct the joining table, using Committee as one key and Member as the other.

Answer

  1. A joining table is required because the relationship between 'Committee' and 'Member' is many-to-many. (A committee has several members, and each member can be on more than one committee.)

  2. See Table 8.

Table 8 Joining table for Activity 11

Committee Member
Planning Jones
Planning Patel
Planning Robinson
Recreation Jones
Recreation Smith
Education Patel
Education Robinson
Education Smith

This is as far as we shall go in constructing relational databases. The important point to appreciate from this brief introduction to the topic is that even a fairly simple example has called for a careful analysis of the nature of the information and how it is related. Many large-scale information systems succeed or fail on the quality of the analysis, and the quality of the design that follows from the analysis.

Some of the processes of database construction can be done automatically. For instance, if you construct a table like Table 1 in Microsoft Access, the program itself can make a reasonable attempt at creating entity tables and joining tables for you. However, for large-scale information systems, such as those involved in e-government, there is no substitute for a careful study of the nature of the information, its relationships, and its uses.

T175_5

Take your learning further

Making the decision to study can be a big step, which is why you'll want a trusted University. The Open University has 50 years’ experience delivering flexible learning and 170,000 students are studying with us right now. Take a look at all Open University courses.

If you are new to university level study, find out more about the types of qualifications we offer, including our entry level Access courses and Certificates.

Not ready for University study then browse over 900 free courses on OpenLearn and sign up to our newsletter to hear about new free courses as they are released.

Every year, thousands of students decide to study with The Open University. With over 120 qualifications, we’ve got the right course for you.

Request an Open University prospectus