Representing and manipulating data in computers


3.3 Compression

The previous section noted how large bit-map files are, even for small pictures, so just a few images can use up a great deal of storage space. This can be inconvenient for PC users, but in the case of a digital camera it presents a real problem. In addition, it is becoming increasingly popular to send digital pictures as email attachments, or via mobile phones using multimedia messaging services (MMS), and large files take a long time to transmit.

The way round the problem of inconveniently large files is to manipulate the data file that represents the image so as to make the file smaller. Put bluntly, some of the 0s and 1s are removed from the file. Of course this is not done arbitrarily; it is done very carefully, so that the original image, or a close approximation to it, can be reproduced when desired.

The process of re-coding data into a more compressed form is called compression, and almost any sort of binary data can be compressed. There are various algorithms (sets of rules) for carrying out the compression, each designed to work effectively with a particular type or types of data.

Notice that the advantage of a smaller file is gained at the expense of more processing, because the computer has to perform the algorithm needed to compress the data. Very likely it will also, at some later date, need to perform the algorithm needed to decompress the data.

The size of the original file divided by the size of the compressed file is known as the compression ratio. So if, for example, compression techniques could reduce the 540 000 bytes needed for the picture in Example 4 to just 54 000 bytes the compression ratio would be 10. Such compression ratios for pictures are clearly worth having, and they are achievable.
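The arithmetic behind the compression ratio can be sketched in a few lines of Python. The function name `compression_ratio` is purely illustrative; the byte counts are the ones quoted above.

```python
def compression_ratio(original_size, compressed_size):
    """Return the original size divided by the compressed size (both in bytes)."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return original_size / compressed_size


# Figures from the text: 540 000 bytes reduced to 54 000 bytes.
ratio = compression_ratio(540_000, 54_000)
print(ratio)  # 10.0
```

A ratio greater than 1 means the file has shrunk; a ratio below 1 would mean the 'compressed' file is actually larger than the original.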

For still pictures such as those in the camera, a very common compression technique known as JPEG is used; it is also widely used for images on the Web. JPEG is pronounced ‘jay-peg’ and stands for Joint Photographic Experts Group, the group that devised this standard for compression. The technique divides a picture up into small blocks of pixels and performs complex calculations to arrive at a reasonably accurate but concise description of each block.

One interesting point about JPEG is that the original data can never be recovered exactly – only an approximation to the original can be recovered. This might sound alarming, but in fact it exploits human physiological characteristics. The human eye simply does not detect some degradation in images, and so is unaware of the effects of the compression process. JPEG can achieve compression ratios of 10 to 20 with no visible loss of quality and ratios of 30 to 50 if some loss of quality is acceptable.

A compression technique like JPEG, where the original cannot be recovered exactly afterwards, is known as a lossy compression technique.
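To give a feel for what 'lossy' means, here is a deliberately simple sketch. It is not the JPEG algorithm, which is far more elaborate; it merely stores each 8-bit brightness value as one of 16 bands (so 4 bits would suffice) and then rebuilds an approximation. The function names and the sample pixel values are assumptions made for illustration.

```python
def quantise(pixels, step=16):
    """Lossy step: record only which band of width `step` each value falls in."""
    return [p // step for p in pixels]


def reconstruct(levels, step=16):
    """Rebuild approximate values using the midpoint of each band."""
    return [level * step + step // 2 for level in levels]


original = [12, 13, 200, 201, 255]   # hypothetical 8-bit pixel values
levels = quantise(original)          # each level is in the range 0..15
approx = reconstruct(levels)

print(levels)  # [0, 0, 12, 12, 15]
print(approx)  # [8, 8, 200, 200, 248]
```

The reconstructed values are close to the originals but not identical, and nothing in the stored levels allows the exact originals to be recovered. That is the defining property of a lossy technique.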

Box 7: Lossless compression

Some compression techniques allow the original data to be recovered exactly; these are known as lossless compression techniques. They are used, for example, for text files, where it is important that no data is lost.
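Exact recovery can be checked directly. The sketch below uses Python's standard `zlib` module, chosen here simply because it is readily available; the sample text is made up for the demonstration.

```python
import zlib

# Hypothetical sample: a short sentence repeated many times.
text = ("the quick brown fox jumps over the lazy dog " * 50).encode("ascii")

compressed = zlib.compress(text)
restored = zlib.decompress(compressed)

print(len(text), len(compressed))  # original and compressed sizes in bytes
print(restored == text)            # True: every byte is recovered
```

Because the decompressed bytes are identical to the original, this is a lossless round trip. The very high ratio seen here comes from the artificial repetition in the sample and should not be expected of ordinary text.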

Two common lossless compression techniques are called run-length encoding and the LZ algorithm. These techniques can achieve compression ratios of around 3 or 4 on text.
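Run-length encoding is simple enough to sketch in full. The version below is a minimal illustration that works on a string of characters, with each run stored as a (character, count) pair; real file formats pack runs into bytes in their own ways. The names `rle_encode` and `rle_decode` are illustrative.

```python
def rle_encode(data):
    """Collapse each run of identical characters into a (character, count) pair."""
    runs = []
    for ch in data:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((ch, 1))               # start a new run
    return runs


def rle_decode(runs):
    """Expand (character, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in runs)


# One row of a hypothetical black-and-white image: W = white, B = black.
row = "W" * 8 + "B" * 3 + "W" * 4
encoded = rle_encode(row)

print(encoded)                     # [('W', 8), ('B', 3), ('W', 4)]
print(rle_decode(encoded) == row)  # True
```

Fifteen characters are described by three pairs, and decoding gives back exactly the original row. The technique only pays off when the data contains long runs; on data where neighbouring values keep changing, the list of pairs is longer than the data itself.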

Simple graphics files can be compressed with a compression technique known as GIF. This is lossless provided the graphics file uses only 8-bit colour; otherwise the GIF algorithm first reduces the colours in the image to the 256 possible with 8-bit colour, and that reduction cannot be undone. With GIF, compression ratios of well over 10 can be achieved for simple images. But an attempt to use GIF to compress more complex graphics files may actually increase the file size! In such cases JPEG – and hence lossy compression – is needed.
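The way a lossless technique can make a file bigger is easy to reproduce. The sketch below does not use GIF; as a stand-in it uses Python's standard `zlib` module on two made-up 1000-byte inputs, one highly regular and one with no pattern at all.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

simple = bytes([0, 255] * 500)                          # regular pattern, 1000 bytes
noisy = bytes(rng.getrandbits(8) for _ in range(1000))  # patternless, 1000 bytes

simple_out = zlib.compress(simple)
noisy_out = zlib.compress(noisy)

print(len(simple), len(simple_out))  # the regular data shrinks dramatically
print(len(noisy), len(noisy_out))    # the patternless data comes out slightly larger
```

The patternless input offers nothing for the algorithm to exploit, so the output is the original data plus a little bookkeeping overhead. Complex images behave in a similar way under lossless compression, which is why a lossy technique is used for them instead.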

