IT: e-government
IT: e-government

Start this free course now. Just create an account and sign in. Enrol and complete the course for a free statement of participation or digital badge if available.

Free course

IT: e-government

3.3 Entities and attributes

A well-designed relational database overcomes the problems outlined in Section 3.2 by using two or more tables, rather than a single table, such as Table 1. This means that the data has to be divided in some way between the tables. The construction of tables is done according to several rules. I will just look briefly at one rule to give you an idea of the underlying principles.

The rule I shall look at states that there must be one table per entity. An entity is an item for which we want to store information. It can be tangible – for example a person or an object – or intangible – for example an event. (Typical events might be a sale, registration or renewal.) An entity can be a concept – for example a bank account.

In Table 1 the most obvious entity is 'student'. Any particular student, such as Colin Cherry, is an instance of this entity. Entities are the things we store data about in tables. Each piece of information we store is an attribute of that entity. An attribute is descriptive information about an entity. For the 'student' entity, attributes in Table 1 are name, address and postcode. In a properly constructed table there is one field per attribute.

You might argue that 'Course1', 'Course2' and 'Course3' are also attributes of 'student', and in some contexts that would be reasonable. In this case, though, there is descriptive information about courses, namely the number of sessions they run for. Courses therefore have attributes. In this context, then, 'course' is another entity. Whether something is regarded as an entity or not is dependent on the context.

Activity 7 (exploratory)

Does 'postcode' have any attributes in Table 1 (see screen 6)?

Discussion

No. There are no attributes of 'postcode'. There is no descriptive information, or label or other qualifying information about postcodes.

I said above that in a properly constructed relational database we have one table per entity. Table 2 is the table for the student entity and Table 3 is the one for the course entity. I have added a few more fields to the tables to make them more realistic. For instance, Table 3 has a column for 'Fee' and a column for 'Course_code'.

Table 2 Student entity table

Student_indexFamily_nameGiven_nameAddressPostcode
1LodhiMona22 The Grove, NewportAB12 3CD
2JonesBob2 High Street, StratfordAB6 4PQ
3CherryColinI59 Acacia Avenue, BromptonAB2 12ZY
4CherryColin13 The Limes, LeightonXY6 7LR
5EdwardsDelia40 Eldon Court, HamptonXY12 4TK
6RobertsAlbert18 Mount Pleasant, GreenhillAB12 7UB
7 SinghSara7 Marina View, SuttonABM 8WQ
8 ChangPatrick21 Green Lane, NewportAB12 9TU
9EvansMary13 The Limes, LeightonAB6 7LR

Table 3 Course entity table

Course-codeCourseSessionsFee
E01Yoga10£20
E02Digital photography10£20
E03Everyday maths20£40
E04Oil painting15£30
E05Creative writing20£40
E06Holiday Spanish15£30
E07Woodwork15£30
E08IT for all5£10
E09Drawing15£30
E10Sewing15£30
E11Ballroom dancing30£60

The next step is to link the records in these two tables, which I will do shortly. Linking the records in separate tables requires the use of keys. A key uniquely identifies each record of a table. Table 1 did not have any keys. ('Name' cannot be a key because, as Colin Cherry shows, names might not be unique.) In Table 2 I have introduced a new column named 'Student_index'. Each entry in this new column uniquely identifies a particular record. The 'Student_index' in fact consists of just the sequential numbering of the records. This is the simplest way to create a key when there is no other key readily available.

The ultimate intention is to link students to courses through the keys. By associating a particular student with a particular course we have a way of representing an enrolment. We need therefore to think about a suitable key for Table 3.

Activity 8 (self-assessment)

In Table 3 I have not numbered the records as a key, as I did in Table 2, because I would like to use the course code as a key. How valid is the course code as a key under the following circumstances?

  1. Courses are presented only once, and never repeated.

  2. Courses are run only once per year, and repeated the following year.

  3. Some courses are run twice or more per year.

Answer

  1. If the course is never repeated, then we can safely use the course code as a key, because the code identifies something unique.

  2. If courses are repeated annually there is a problem with using the course code as a key because it no longer represents something unique. For instance, students enrolling for Yoga in successive years are enrolling for different instances of the course. Each instance of the course would need its own record in the database. However, if the database is made afresh each year, and just covers one year's worth of data, then the problem does not arise and the course code can be used as a key.

  3. If courses are run twice or more per year, there is a problem with using the course code as a key for the same reasons as in (2). That is, each presentation of the course is a new instance (assuming the database has a lifetime of a year or more), and so the code does not represent something unique.

In the following I am going to assume that the course code can be used as a key in Table 3; that is, the conditions in (1) or (2) of Activity 8 apply, with the proviso in the case of (2) that the database holds only a year's worth of data.

Splitting Table 1 into Tables 2 and 3 has lost the relationship between the entities 'student' and 'course'. We can no longer tell who is enrolled for what. The essence of the relational approach to database design is to relate entities to each other by relating their keys. In this example we would do this by constructing a joining table to capture the relationship between entities. Table 4 is my joining table. Student keys are on the left, and the keys from the courses table are on the right. To keep things simple, I am using the course code as the key although, as you saw in Activity 8, this is acceptable only under certain conditions.

I have annotated the table with names so that you can identify the records; but the annotations are not part of the table.

Table 4

Joining table for students and courses

You will notice in Table 4 that student keys can appear more than once on the left, and courses can appear more than once on the right. This follows from the fact that the relationship between the student entity and the course entity is many-to-many. That is to say, one student can enrol for many courses, and each course can be taken by many students.

The entity tables and joining table taken together constitute the relational database. A real example might have many entity tables and many joining tables. It might also have no joining tables – for reasons I will come back to.

Notice how the new relational database solves the problems I identified with Table 1. If someone enrols for several courses, we just add more records to Table 4, as required. Each record in Table 4, therefore, represents the enrolment of a student on a course. We do not need to have empty fields available to accommodate additional enrolments by a student, as we did in Table 1. Also, if there is an administrative change to a course, such as the number of sessions, we need make only a single change, in Table 3. (You will recall that in Table 1, the repetition of the same piece of data in several places meant that administrative changes would require hunting out every occurrence of the data and modifying it.)

The process of organising data efficiently into tables, so that unnecessary repetition is avoided and so that each table represents a single entity and its attributes, is known as normalisation. The process we have gone through with Tables 1, 2, 3 and 4 is a very simple example of part of the normalisation process.

In the light of the normalisation we have just done, we can criticise Table 1 in more appropriate language than before. The trouble with Table 1 is that:

  • it contains data about more than one entity;

  • it attempts to capture the relationship between entities.

It is therefore mixing different functions, and this is why it is unsatisfactory.

Activity 9: (self-assessment)

Table 5 is a flat database relating to some regular committee meetings. What are the entities and their associated attributes?

Table 5 Committee information

CommitteeMeeting timeMemberTelephone number
PlanningEvery Monday a.m.Jones1239
Patel4728
Robinson3589
RecreationEvery Friday p.m.Jones 1239
Smith4633
EducationFinal Wednesday of month, p.m.Patel4728
Robinson3589
Smith4633

Answer

One entity is 'Committee'. The associated attribute is 'Meeting time'.

Another entity is 'Member'. The associated attribute is 'Telephone number'.

I mentioned that some relational databases might have no joining tables. The following activity is designed to show how this might come about.

Activity 10: (exploratory)

What would happen to the joining table, Table 4, if this particular local authority allowed people to enrol for only one course at a time? Write the first few rows of a new joining table on the assumption that each student takes only their 'Course1' as shown in Table 1.

Discussion

Table 6 shows the first five rows of the new joining table. Notice that in Table 6 each student key appears only once, reflecting the fact that students are allowed only one enrolment at a time.

Table 6: Modified joining table: one course per student

Student_indexCourse_code
1E01
2E08
3E02
4E05
5E02

The relationship between the keys in Table 6 could be captured by a simple amendment to the student table, which would remove the need for a joining table. Table 7 shows the first few rows of the modified student table.

Table 7: Modified student entity table

Student_indexFamily_nameGiven_nameAddressPostcodeCourse_key
1LodhiMona22 The Grove, NewportAB12 3CDE01
2JonesBob2 High Street, StratfordAB6 4PQE08
3CherryColin59 Acacia Avenue, BromptonAB2 12ZYE02
4CherryColin13 The Limes, LeightonXY6 7LRE05
5EdwardsDelia40 Eldon Court, HamptonXY12 4TKE02

You can see that Table 7 now has a new field, for the course key. This new field links records in Table 7 to those in Table 3, and lets us see enrolments at a glance. No joining table is needed.

The additional key in Table 7 (the course code) is known as a foreign key. Using a foreign key has enabled us to dispense with the joining table, but only because the relationship between students and courses is no longer many-to-many. When the relationship between entities is many-to-many, a joining table is needed.

Activity 11 (self-assessment)

Suppose Table 5 is to be normalised into separate tables for the 'Committee' entity and the 'Member' entity.

  1. Why is a joining table needed?

  2. Construct the joining table, using Committee as one key and Member as the other.

Answer

  1. A joining table is required because the relationship between 'Committee' and 'Member' is many-to-many. (A committee has several members, and each member can be on more than one committee.)

  2. See Table 8.

Table 8 Joining table for Activity 11

CommitteeMember
PlanningJones
PlanningPatel
PlanningRobinson
RecreationJones
RecreationSmith
EducationPatel
EducationRobinson
EducationSmith

This is as far as we shall go in constructing relational databases. The important point to appreciate from this brief introduction to the topic is that even a fairly simple example has called for a careful analysis of the nature of the information and how it is related. Many large-scale information systems succeed or fail on the quality of the analysis, and the quality of the design that follows from the analysis.

Some of the processes of database construction can be done automatically. For instance, if you construct a table like Table 1 in Microsoft Access, the program itself can make a reasonable attempt at creating entity tables and joining tables for you. However, for large-scale information systems, such as those involved in e-government, there is no substitute for a careful study of the nature of the information, its relationships, and its uses.

Skip Your course resources
T175_5

Take your learning further

Making the decision to study can be a big step, which is why you'll want a trusted University. The Open University has 50 years’ experience delivering flexible learning and 170,000 students are studying with us right now. Take a look at all Open University courses.

If you are new to University-level study, we offer two introductory routes to our qualifications. You could either choose to start with an Access module, or a module which allows you to count your previous learning towards an Open University qualification. Read our guide on Where to take your learning next for more information.

Not ready for formal University study? Then browse over 1000 free courses on OpenLearn and sign up to our newsletter to hear about new free courses as they are released.

Every year, thousands of students decide to study with The Open University. With over 120 qualifications, we’ve got the right course for you.

Request an Open University prospectus371