The previous section mentioned the large file size of bit-map representations of even small pictures. As a result, just a few images use up a great deal of storage space. This can be inconvenient for PC users, but in the case of a digital camera it presents a real problem. In addition, it is becoming increasingly popular to send digital pictures as email attachments, or via mobile phones using multimedia messaging services (MMS), but large files take a long time to transmit.
The way round the problem of inconveniently large files is to perform manipulations on the data file which represents the image so as to make the file smaller. Put bluntly, some of the 0s and 1s are removed from the file. But of course this is not done arbitrarily; instead it is done very carefully so that the original image, or something very close to it, can be reproduced when desired.
The process of re-coding data into a more compressed form is called compression, and almost any sort of binary data can be compressed. There are various algorithms (sets of rules) for carrying out the compression, each designed to work effectively with a particular type or types of data.
Notice that the advantage of a smaller file is gained at the expense of more processing, because the computer has to perform the algorithm needed to compress the data. Very likely it will also, at some later date, need to perform the algorithm needed to decompress the data.
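The trade-off can be seen in a minimal sketch using Python's standard zlib module. This is one general-purpose compressor, chosen here only for illustration, and the sample data is a made-up repetitive byte string rather than a real image. One call does the work of compressing and a second call does the work of decompressing.

```python
import zlib

# Hypothetical sample data: a highly repetitive byte string,
# chosen only because it compresses well.
original = b"compression " * 1000

# Processing step 1: run the compression algorithm to get a smaller file.
compressed = zlib.compress(original)

# Processing step 2 (at some later date): run the decompression algorithm.
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("restored matches original:", restored == original)
```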
The size of the original file divided by the size of the compressed file is known as the compression ratio. So if, for example, compression techniques could reduce the 540 000 bytes needed for the picture in Example 4 to just 54 000 bytes the compression ratio would be 10. Such compression ratios for pictures are clearly worth having, and they are achievable.
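The arithmetic can be written out as a short sketch. The function name compression_ratio is simply a label chosen for this illustration.

```python
def compression_ratio(original_size, compressed_size):
    """Return the original file size divided by the compressed file size."""
    return original_size / compressed_size

# The figures from the text: 540 000 bytes reduced to 54 000 bytes.
print(compression_ratio(540_000, 54_000))  # prints 10.0
```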
For still pictures such as those in the camera, a very common compression technique known as JPEG is used. (JPEG is very common because it is also used for images on the Web.) JPEG is pronounced ‘jay-peg’ and stands for Joint Photographic Experts Group, the group who devised this standard for compression. This compression technique divides a picture up into small blocks of pixels and performs complex calculations to arrive at a reasonably accurate but concise description of the block. One interesting point about JPEG is that the original data can never be recovered exactly – only an approximation to the original can be recovered. This might sound alarming, but in fact it exploits human physiological characteristics. The human eye simply does not detect some degradation in images, and so is unaware of the effects of the compression process. JPEG can achieve compression ratios of 10 to 20 with no visible loss of quality and ratios of 30 to 50 if some loss of quality is acceptable.
A compression technique like JPEG, where the original cannot be recovered exactly afterwards, is known as a lossy compression technique.
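The idea of lossy compression can be illustrated with a toy sketch. This is not the JPEG calculation, which is far more elaborate; it is a deliberately simple stand-in that stores one average value for each block of four pixels. The function names and pixel values are made up for the illustration, and the list of pixels is assumed to divide evenly into blocks.

```python
def lossy_compress(pixels, block_size=4):
    """Toy lossy scheme: keep only the rounded average of each block."""
    averages = []
    for start in range(0, len(pixels), block_size):
        block = pixels[start:start + block_size]
        averages.append(round(sum(block) / len(block)))
    return averages

def lossy_decompress(averages, block_size=4):
    """Rebuild an approximation by repeating each average across its block."""
    return [value for value in averages for _ in range(block_size)]

# Hypothetical brightness values for eight pixels (two blocks of four).
pixels = [100, 102, 101, 101, 200, 198, 202, 200]

compact = lossy_compress(pixels)      # 2 values stored instead of 8
approx = lossy_decompress(compact)    # close to, but not equal to, the original

print(compact)           # [101, 200]
print(approx)            # [101, 101, 101, 101, 200, 200, 200, 200]
print(approx == pixels)  # False: the exact original is gone
```

Each recovered pixel is within a couple of brightness levels of the original, which is the kind of small degradation the eye may not notice, but the exact original values cannot be reconstructed from the two stored numbers.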
Box 7: Lossless compression
Some compression techniques allow the original data to be recovered exactly. These are known as lossless compression techniques. These techniques are used, for example, for text files where it is important that no data is lost.
Two common lossless compression techniques are called run-length encoding and the LZ algorithm. These techniques can achieve compression ratios of around 3 or 4 on text.
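Run-length encoding can be sketched in a few lines. This is a minimal illustration in which each run of repeated characters is stored as a (character, count) pair; the function names and the sample string are assumptions made for the example, and practical formats pack the runs more compactly.

```python
def rle_encode(text):
    """Collapse each run of identical characters into a (character, count) pair."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            # Same character as the current run: extend the run by one.
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            # A different character starts a new run.
            runs.append((ch, 1))
    return runs

def rle_decode(runs):
    """Expand the (character, count) pairs back into the original text."""
    return "".join(ch * count for ch, count in runs)

sample = "WWWWWWBBBWWWW"
encoded = rle_encode(sample)

print(encoded)                        # [('W', 6), ('B', 3), ('W', 4)]
print(rle_decode(encoded) == sample)  # True: nothing has been lost
```

Decoding returns exactly the original string, which is what makes the technique lossless. It pays off only when the data contains long runs; text with few repeated characters would produce more pairs than it saves.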
Simple graphics files can be compressed with a compression technique known as GIF. This is a lossless compression provided the graphics file uses only 8-bit colour; otherwise the GIF algorithm reduces the colours in the image to the 256 possible with 8-bit colour. With GIF, compression ratios of well over 10 can be achieved for simple images. But an attempt to use GIF to compress more complex graphics files may actually increase the file size! In such cases JPEG – and hence lossy compression – is needed.