4 Data flow diagrams
4.1 What is a data flow diagram?
A data flow diagram (DFD) is a graphical description of how data moves through a system in a given context. A DFD allows you to identify the transformations that take place on data as it moves from input to output in the system. (DFDs pre-date UML diagrams, but still have a complementary role to play in describing systems.)
The Case Study below provides an example of a DFD used to describe the Open University's eTMA system (electronic Tutor Marked Assignment system). It uses the notation described in Table 1 below.
Case study: An electronic tutor-marked assignment (eTMA) system
As you may already know, the Open University (OU) operates what we call the eTMA system – a system that, among other things, enables students to submit their assignments electronically. The eTMA system came about through pressure, in the mid-1990s, from both students and the University. Students had begun to use word processors to prepare their assignments and they wanted to submit them electronically rather than print out the assignments and submit them using the postal system. At the same time, the University wanted to extend its reach to non-UK residents in countries where the postal system was less reliable and less efficient than that in the UK. Both sets of goals could be satisfied by an electronic system. At much the same time, demand for an electronic system was also coming from a small number of course teams in different academic units (the Faculty of Mathematics and Computing, the Faculty of Technology and the Business School). Although there was only a small number of courses involved, these were among the largest courses in the University (one of the Technology courses had 12,000 students on its first presentation!).
Prior to the development of the eTMA system, the University's assignment handling system was entirely paper-based with assignments written (often by hand) on paper and posted to tutors. Paper-based TMAs were accompanied by a multipart control form (known as a PT3 form), the naming of which was lost in the sands of time, but we do know that PT stood for ‘part time’ and 3 indicated the third in a sequence of forms dealing with different aspects of assignment handling. The tutors mark the paper-based assignments, fill in the PT3 forms and send the amended documents to the University by post.
The University operates a large Assignment Handling Office headed by the Head of Assignment Handling who is responsible for ensuring that all University policies and functions relating to assignments are carried out appropriately. Among other things, assignment handling clerks (a) enter the marks into the student records database, (b) copy some assignments for monitoring (quality assurance) purposes, and (c) send the marked paper-based assignments back to the student using the postal service.
The paper-based system still runs in parallel with the eTMA system, for two reasons: some courses use types of assessment material for which the eTMA system is not suitable, and the paper-based system acts as a back-up should the eTMA system be unavailable. However, the University wants more courses to use the eTMA system because it provides a better service to students, gives more management information and is potentially less expensive.
The eTMA system
The eTMA system allows students to submit their answers to tutor-marked assignments (TMAs) electronically, as computer files, to the University via a website. Whenever a TMA file is submitted, it is stored in a central database and a ‘receipt’ (a simple message containing a unique number) is sent to the student to acknowledge that the TMA has been received. Tutors (Associate Lecturers) are informed, by email, that a TMA is waiting for them to be marked.
The system enables tutors to download their students' submissions, mark and comment on the assignments ‘on-screen’ and submit the marked TMAs back to the University. A marked TMA is stored in a database and the student is informed, by email, that their TMA has been marked and is available to be retrieved electronically.
When the tutor downloads an unmarked TMA, she also receives an electronic version of the PT3 form on which the marks awarded for each question and the overall comments on the TMA must be entered. The completed PT3 form accompanies the TMA when it is sent to the OU database, and eventually both of them are sent electronically to the student.
The data flow diagram given in Figure 1 summarises the basic system.
Whilst not shown in Figure 1, the fact that both the unmarked and marked assignments are saved by the University enables it to implement a number of web-based reports that provide summary information to students and tutors on the current status of their assignments within the system, as well as providing management information for the University's Assignment Handling Office.
To match the existing paper-based TMA system, marks and tutor comments are extracted from marked assignments and added to the students' records. Since the number and size of eTMAs are potentially huge, the University has decided that complete marked and unmarked assignments will be stored only for a short period of time.
From the start, the aim was to provide a web browser interface for both students and tutors. However, for security reasons, there would be two subsystems: one for students and one for tutors.
Should the eTMA system ever fail and be unavailable to students, a back-up system has been implemented in which students are able to submit an assignment as an attachment to an email. The attachment is extracted by the University and fed into the eTMA system once the eTMA system is operational again.
The basic system has been gradually enhanced and now provides a number of facilities not shown in Figure 1. For example, one of the OU's quality control systems involves monitoring, that is, examining samples of each tutor's work to ensure that standards of marking are being maintained. In the paper-based TMA system, clerks in the central Assignment Handling Office take samples of marked TMAs, photocopy them, and send the copies to the monitors (academic staff who review the work of tutors and report back on the quality of the work). In the current version of the eTMA system, electronic copies of the tutors’ work are sent to the monitors who will send back electronic versions of their reports.
The eTMA system also checks a number of business rules regarding the submission of TMAs. For example, a student can make as many submissions as he likes, but only the last one will be marked (provided the submissions are made before the cut-off date or before the tutor has downloaded the eTMA).
The system also rejects any submission that either contains a virus or is too large.
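The business rules above can be sketched in a few lines of Python. This is only an illustration, not the University's implementation: the names `Submission`, `is_accepted` and `submission_to_mark` and the size limit `MAX_SIZE_BYTES` are all hypothetical, and the sketch assumes one reading of the rule, namely that a submission counts only if it arrives before the cut-off date and before the tutor has downloaded the eTMA.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

# Hypothetical limit: the case study says only that oversized files are rejected.
MAX_SIZE_BYTES = 10 * 1024 * 1024


@dataclass
class Submission:
    submitted_at: datetime
    size_bytes: int
    has_virus: bool = False


def is_accepted(sub: Submission, cutoff: datetime,
                downloaded_at: Optional[datetime] = None) -> bool:
    """Apply the rejection rules to a single submission."""
    if sub.has_virus or sub.size_bytes > MAX_SIZE_BYTES:
        return False
    if sub.submitted_at > cutoff:
        return False
    # Assumed reading: nothing is accepted once the tutor has downloaded the eTMA.
    if downloaded_at is not None and sub.submitted_at > downloaded_at:
        return False
    return True


def submission_to_mark(subs: List[Submission], cutoff: datetime,
                       downloaded_at: Optional[datetime] = None
                       ) -> Optional[Submission]:
    """Return the latest accepted submission, or None if there is none."""
    accepted = [s for s in subs if is_accepted(s, cutoff, downloaded_at)]
    return max(accepted, key=lambda s: s.submitted_at, default=None)
```

A student who submits three times before the cut-off date would therefore have only the third file passed on for marking, provided it passes the virus and size checks.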
Table 1 DFD notation
| Element | Symbol | Description |
|---|---|---|
| External entity | A rectangle | A producer or consumer of information that is outside the process being modelled. |
| Process or activity | A circle | A transformer of information that is within the scope of the system being modelled. |
| Data (flow) | A line with an arrowhead that indicates the direction of data flow | The input to, or output from, a given process, which is associated with each arrow in a DFD. |
| Data store | Two parallel lines with the name of the data store between them | A repository of data for use by one or more processes. |
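If you want to hold a DFD in a program, the notation in Table 1 maps naturally onto a small data model. The following Python sketch is one possible representation, not a standard one: the class names `Kind`, `Node` and `Flow`, the helper `touches_process` and the data store name 'Assignments' are assumptions made for illustration. The helper checks the property stated in the table, that every data flow is an input to, or an output from, a process.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The three kinds of node in Table 1, with the symbol used for each."""
    EXTERNAL_ENTITY = "rectangle"
    PROCESS = "circle"
    DATA_STORE = "parallel lines"


@dataclass(frozen=True)
class Node:
    name: str
    kind: Kind


@dataclass(frozen=True)
class Flow:
    source: Node
    target: Node
    data: str  # the label on the arrow


def touches_process(flow: Flow) -> bool:
    """A data flow must be the input to, or output from, a process."""
    return Kind.PROCESS in (flow.source.kind, flow.target.kind)


student = Node("Student", Kind.EXTERNAL_ENTITY)
submit = Node("Submit assignment", Kind.PROCESS)
store = Node("Assignments", Kind.DATA_STORE)  # hypothetical store name

flows = [
    Flow(student, submit, "assignment"),
    Flow(submit, store, "unmarked assignment"),
    Flow(submit, student, "receipt"),
]
```

An arrow drawn directly from the student to the data store would fail the `touches_process` check, because no process transforms the data along the way.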
The description inside each process bubble should be as terse and as meaningful as possible. The use of an imperative verb and a simple object can easily indicate the desired transformation. In the Case Study above for example, you can find processes called ‘Submit assignment’ and ‘Download unmarked assignment’.
It is common to use a decimal numbering system to identify each process or activity at its given level of abstraction. For example, in Figure 1, we could label the six process bubbles from 1 to 6 with ‘Submit assignment’ being number 1, say. Then, if we wanted to show the details of ‘Submit assignment’ in another DFD, we would label the processes in the new diagram 1.1, 1.2, and so on to show the connection with the original DFD.
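The numbering convention is simple enough to express as two small functions. In this Python sketch the function names `child_numbers` and `parent_of` are hypothetical; the sketch shows only how a child label is derived from its parent's label and how the parent can be recovered again.

```python
from typing import List, Optional


def child_numbers(parent: str, count: int) -> List[str]:
    """Labels for the processes in the DFD that refines process `parent`."""
    return [f"{parent}.{i}" for i in range(1, count + 1)]


def parent_of(number: str) -> Optional[str]:
    """Label of the process that `number` refines, or None at the top level."""
    head, sep, _ = number.rpartition(".")
    return head if sep else None
```

For example, `child_numbers("1", 3)` gives the labels 1.1, 1.2 and 1.3, and `parent_of("1.2")` returns `"1"`, which is the link back to ‘Submit assignment’ in the original DFD.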
When you decompose or refine an activity and show the result in a new DFD it is essential, for the development of a consistent set of models, that the inputs and outputs of each successive decomposition or refinement remain the same as those of the activity being refined. In addition to the set of DFDs that describe something like the eTMA system, you should also prepare a dictionary or glossary of the terms you have used in the diagrams, such as ‘assignment’ and ‘tutor identifier’. In particular, you need to record your understanding of the content of the data associated with each arrow and the stores.
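Both checks can be automated once a diagram is held as data. The Python sketch below is a minimal illustration under stated assumptions: a flow is a plain (source, target, data) tuple, and the function names `boundary_flows`, `is_balanced` and `undefined_terms`, the child processes 1.1 and 1.2, the intermediate flow ‘checked assignment’ and the store name ‘Assignments’ are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

FlowTuple = Tuple[str, str, str]  # (source, target, data label)


def boundary_flows(flows: Iterable[FlowTuple],
                   inner: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Data entering and leaving the group of processes named in `inner`."""
    flows = list(flows)
    inputs = {d for s, t, d in flows if s not in inner and t in inner}
    outputs = {d for s, t, d in flows if s in inner and t not in inner}
    return inputs, outputs


def is_balanced(parent_inputs: Set[str], parent_outputs: Set[str],
                flows: Iterable[FlowTuple], inner: Set[str]) -> bool:
    """True if the refinement has the same inputs and outputs as its parent."""
    inputs, outputs = boundary_flows(flows, inner)
    return inputs == set(parent_inputs) and outputs == set(parent_outputs)


def undefined_terms(flows: Iterable[FlowTuple],
                    dictionary: Dict[str, str]) -> Set[str]:
    """Data labels used on arrows but missing from the data dictionary."""
    return {d for _, _, d in flows} - set(dictionary)


# Hypothetical refinement of process 1, 'Submit assignment'.
child_processes = {"1.1", "1.2"}
child_flows: List[FlowTuple] = [
    ("Student", "1.1", "assignment"),
    ("1.1", "1.2", "checked assignment"),
    ("1.2", "Assignments", "unmarked assignment"),
    ("1.1", "Student", "receipt"),
]
dictionary = {
    "assignment": "The file a student submits for a TMA.",
    "unmarked assignment": "An assignment stored but not yet marked.",
    "receipt": "A simple message containing a unique number.",
}
```

Here the refinement is balanced against a parent with input ‘assignment’ and outputs ‘unmarked assignment’ and ‘receipt’, while `undefined_terms(child_flows, dictionary)` reports that ‘checked assignment’ still needs a dictionary entry. Flows between 1.1 and 1.2 are internal to the refinement, so they do not count towards the balance.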
An important aspect of a DFD, and its main benefit in a requirements process, is that there is no explicit indication of the sequence of processing in the notation. A DFD identifies what is happening and what is being passed in and out of each activity, but it does not specify the order in which things happen. In other words, you can identify the activities that take place at a given level of abstraction, such as Figure 1 above, but some other technique is needed to indicate the time-ordering of those activities. Some kind of sequence may be implied by the naming of activities, as in Figure 1, although any combination of the activities may be somewhere between their starting and finishing points at a given point in time.
The convention followed in DFDs to minimise the number of overlapping lines is to allow external entities and data stores to be duplicated.