1 Representing data in computers: introduction
A computer is designed to do the following things:
receive data from the outside world;
store that data;
manipulate that data, probably creating and storing more data while doing so;
present data back to the outside world.
In the next few sections I am going to examine in more detail the data that a computer receives, stores, manipulates and presents. In particular, I want to explore the idea that in a computer the data represents something in the outside world.
Here are a couple of examples you are probably familiar with from using your PC. The data will represent text and punctuation marks if you are using your PC to do word processing. The data will represent numbers if you are using your PC to do calculations on a spreadsheet. Many applications, not just word processors and spreadsheets, require the representation of text and/or numbers, but there are also other types of data that need to be represented.
Activity 1 (Exploratory)
Here are three examples of computers: electronic kitchen scales, a digital camera and the PC. Use these examples to suggest what else will need to be represented in these computers. For instance, weights will need to be represented in the electronic kitchen scales, where they are an input.
You may have suggested the following:
numbers on a display panel (outputs in the kitchen scales)
sound produced by a beeper (output when the kitchen scales’ timer facility is used)
scenes that will be turned into still pictures (inputs in the digital camera)
still pictures (outputs in the digital camera and in PCs)
scenes that will be turned into moving pictures (inputs in PCs with web cams)
moving pictures (outputs in PCs)
music, spoken words and other types of sound (inputs and outputs in PCs).
Another, more subtle, input that you may have mentioned is the input from a button on the kitchen scales or digital camera. Think, for example, of the button on the digital camera that switches the flash on or off. A representation of the input from this button is an important data item in the camera's computer: it tells the computer whether flash is to be used.
Data is an important part of any computer system, and Sections 2 to 4 will discuss the ways in which various types of data can be represented in a computer, focusing on the three example computers: the kitchen scales in Section 2, the digital camera in Section 3 and the PC in Section 4.
A danger with using specific examples to introduce a general idea like data representation is that the examples may not demonstrate all the principles that need to be introduced. I have dealt with this potential problem by inserting ‘boxes’ at various points in the text. These discuss ideas about data representation which are related to those in the main text but either are not relevant to the particular example under discussion, or apply more widely. You should note that the material in these boxes is assessable.
The word ‘data’ is Latin, and its root meaning is ‘things given’, hence ‘facts’. In the context of computers, however, the word has a subtly different meaning: a formalised representation of facts, entities or ideas such that they can be manipulated, transmitted and/or received. ‘Data’ is therefore being given a different meaning from ‘information’. Information also consists of facts and ideas, but not in a formalised representation. For use by a computer, information must be converted to, or expressed in, a suitable formalised format, and it is this formalised format which is called data.
In computers all data is represented as binary codes. That is, all data is represented as strings of 0s and 1s. A single binary digit (that is, a 1 or a 0) is called a bit, and the term byte is used to refer to a group of eight bits.
‘Byte’ is the term traditionally used for a group of 8 bits in the context of computers. But ‘octet’ is the term traditionally used for a group of 8 bits in the context of communications. Now that computers and communications are converging you may meet either term in either context. You therefore need to be familiar with both terms.
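The grouping of bits into bytes can be sketched in a few lines of Python. This is only an illustration: the function name split_into_bytes is hypothetical, and the binary code is held as a text string of the characters 0 and 1 purely so that it is easy to read.

```python
def split_into_bytes(bits: str) -> list[str]:
    """Split a string of 0s and 1s into groups of eight bits (bytes, or octets)."""
    # A binary code may contain nothing but the digits 0 and 1.
    if set(bits) - {"0", "1"}:
        raise ValueError("a binary code may contain only 0s and 1s")
    # This sketch assumes the code is a whole number of bytes long.
    if len(bits) % 8 != 0:
        raise ValueError("the number of bits must be a multiple of eight")
    return [bits[i:i + 8] for i in range(0, len(bits), 8)]


print(split_into_bytes("0100101000000001"))  # ['01001010', '00000001']
```

A string of sixteen bits therefore yields two bytes, each of which could equally be called an octet.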
As with all codes, the user must know what the coding convention is in order to be able to assign meaning to a code. For instance, on one occasion in a computer it may be appropriate to assign the meaning ‘the letter J’ to the code 01001010; on another occasion it may be appropriate to assign the meaning ‘the number 74’ to the same code, and so on. Does this sound faintly alarming? How does the computer ‘know’ whether to treat 01001010 as the letter J or the number 74 or something else? The answer is that context is crucial: if the computer has been programmed to treat the next code it receives as a letter, it will treat the code 01001010 as the letter J; if the computer has been programmed to treat the next code it receives as a number, then it will treat the code 01001010 as the number 74; and if it has been programmed to treat the next code it receives as something else, then it will treat the code 01001010 as that something else.
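The role of context can be sketched in Python. The function names as_number and as_letter are hypothetical, and the sketch assumes that letters follow the ASCII coding convention, which the text above does not name; under that assumption the code 01001010 does stand for the letter J.

```python
def as_number(code: str) -> int:
    """Treat a string of 0s and 1s as an unsigned binary number."""
    return int(code, 2)


def as_letter(code: str) -> str:
    """Treat the same code as a character (assumes the ASCII convention)."""
    return chr(int(code, 2))


code = "01001010"
print(as_number(code))  # 74
print(as_letter(code))  # J
```

The code itself never changes. Only the function that the program chooses to apply, which plays the part of the context, decides whether it is read as a number or as a letter.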
Inside a computer, the codes are grouped into pre-defined numbers of bits. Sometimes, particularly in relatively simple computers such as the kitchen scales, these pre-defined groups are bytes. But many computers are designed to handle a longer group of bits as a single entity. Modern PCs handle 32-bit groups, and it is likely that they will be handling 64-bit groups in the near future.
A fixed length group of bits that is handled in a computer as a single entity is called a data word, or simply a word, and the number of bits in the word is referred to as the word length. Thus most PCs work with a word length of 32 bits.
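As a rough illustration of word length, the Python sketch below writes the same number as an 8-bit word and as a 32-bit word. The function name to_word is hypothetical, and the sketch assumes that the word holds an unsigned (non-negative) whole number, which is only one of the ways a word may be interpreted.

```python
def to_word(value: int, word_length: int) -> str:
    """Write a non-negative whole number as a fixed-length group of bits."""
    # A word of n bits can hold 2**n different codes, so larger values do not fit.
    if not 0 <= value < 2 ** word_length:
        raise ValueError("value does not fit in a word of this length")
    # Pad with leading 0s so that the result always has word_length bits.
    return format(value, f"0{word_length}b")


print(to_word(74, 8))        # 01001010
print(len(to_word(74, 32)))  # 32
print(2 ** 8)                # 256 different codes in an 8-bit word
print(2 ** 32)               # 4294967296 different codes in a 32-bit word
```

The longer the word, the more different codes it can hold, which is one reason why many computers are designed to handle groups longer than a byte as a single entity.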