An introduction to data and information

5.1.1 What is DNA?

DNA (deoxyribonucleic acid) is frequently in the news for four main reasons.

  1. DNA can be used in crime detection to eliminate innocent suspects from enquiries or, conversely, to identify the guilty with a very high degree of probability.

  2. DNA is now used in medicine to detect whether an individual is at risk of developing a disease that has a genetic origin. This enables doctors to prescribe preventative treatments.

  3. It is hoped that discoveries about DNA will yield important new treatments for hitherto intractable diseases and conditions.

  4. DNA can be used to identify victims of disasters, and establish whether people are related.

Figure 12 illustrates the following characteristics of DNA.

  • DNA has the shape of an immensely long twisted ladder (the famous double helix) in which each pair of chemical bases in the strand can be thought of as a rung in the ladder.

  • It consists of pairs of chemical bases called adenine (A), cytosine (C), guanine (G) and thymine (T).

  • The bases (which in Figure 12 are colour coded) can only be paired according to the rules: A to T and C to G.

  • A ‘rung’ or pair of bases (e.g. A–T) is called a base pair.

  • A nucleotide is a single base plus its attached ‘structural’ molecules (i.e. its share of the sides of the ladder), so each rung joins two nucleotides.

  • Sequences of base pairs constitute genes, which are the sections of a DNA strand that form discrete units of heredity (determining, for example, eye colour).

  • A complete DNA strand constitutes a chromosome (a human being has 46 of these combined into 23 pairs).

  • The four letters (A, C, G, and T) representing the DNA bases constitute ‘signs’ symbolising the building blocks of DNA. You can think of a set of signs as a code.
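The pairing rules in the list above (A to T, C to G) mean that the bases on one side of the ladder fully determine the bases on the other. The following is a minimal Python sketch of that idea. The names `PAIRING` and `partner_strand` are illustrative and do not come from the course, and the sketch deliberately ignores strand direction, which real sequence-analysis tools take into account.

```python
# Pairing rules from the list above: A pairs with T, C pairs with G.
# PAIRING and partner_strand are hypothetical names chosen for this sketch.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def partner_strand(strand: str) -> str:
    """Return the bases on the opposite side of the ladder, rung by rung."""
    try:
        return "".join(PAIRING[base] for base in strand.upper())
    except KeyError as err:
        # Any letter other than A, C, G or T is not one of the four signs.
        raise ValueError(f"not a DNA base: {err.args[0]!r}") from None


if __name__ == "__main__":
    one_side = "ATCGGA"
    other_side = partner_strand(one_side)
    for left, right in zip(one_side, other_side):
        print(f"{left}-{right}")  # one base pair (rung) per line
```

Applying `partner_strand` twice returns the original sequence, which reflects the fact that each rung can be read from either side.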

Figure 12 A DNA strand, bases, nucleotides, genes, and a chromosome. (a) A small section of a DNA strand as though it were untwisted. Each box represents a base (A, C, G or T). Each rung joins two nucleotides, one for each base. Several nucleotides make up a gene (shown by brackets). (b) How the strand of DNA in (a) is twisted into the famous double helix. (c) A chromosome formed from one DNA strand.

Exercise 11

The English alphabet is a system of signs that consists of 26 letters, from A to Z. There are rules that govern how letters can form words in English. For example, the combination ‘m-s’ cannot be used to begin a word, but is acceptable within or at the end of a word. This limits the number of English words it is possible to form.

Words, and parts of words, can be combined to make longer words. For example, adding an ‘s’ to ‘dog’ makes ‘dogs’, and preceding ‘mill’ with ‘wind’ gives ‘windmill’. Rules also determine that ‘windmill’ is all right, but ‘millwind’ is not.

  1. Considering these facts, how many words do you think the English language has?

    Now think about things that can be said using the English language: utterances. These consist of words strung together according to a set of rules known as grammar.

  2. How many utterances do you think it's possible to make in English?

Answer


  1. A standard, reputable dictionary will have between 30,000 and 50,000 entries. Even this is only part of the story since most dictionaries do not include slang, dialect words or words that exist for only a very short period of time. Neither do they contain specialised vocabularies that exist in certain professional and trade groups (e.g. among doctors). Thus the likely total vocabulary of English is (at a guess) in excess of 100,000 words.

  2. The number of utterances possible in English is virtually infinite. This is because, even within the rules of grammar, utterances can vary in length and word order.

Exercise 11 shows how the signs of a relatively simple code (such as the letters of the alphabet) can be combined in simple and complex ways to produce an enormous variety of possible ‘products’ (utterances in English).

Exercise 12

Think of the DNA bases (A, C, G, and T) as forming a code similar to the alphabet, i.e. four ‘signs’ that can be combined according to rules to form genes. The genes in turn are combined into structures called chromosomes (i.e. DNA strands) of which the human being has 46 in 23 pairs. Given this structure, a gene is analogous to an English word, a chromosome to a volume of English utterances, and all 23 pairs of chromosomes to the volumes of an encyclopedia.

  1. At a guess, how many base pairs, like A–T, do you think the 23 pairs of human chromosomes have?

  2. What might that answer tell you about how difficult a problem it is to develop a full understanding of the human genetic structure?

Answer


  1. The longest human chromosome has about 263 million base pairs, the shortest 50 million. For all 23 pairs the total exceeds 3.2 billion (i.e. 3,200,000,000).

  2. The base pairs in a gene can vary, which is what gives us genetic diversity. So the problem of trying to understand the genetic structure of humans is roughly analogous to trying to read and understand all the sentences in a huge, multi-volume encyclopedia!
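To put the figure of 3.2 billion base pairs in data terms, the rough calculation below estimates how much storage the sequence would need. It is a sketch under stated assumptions: the four bases are treated as equally likely, no compression is used, and only one base per rung is recorded, because the pairing rules fix its partner. The names `BASE_PAIRS` and `storage_megabytes` are illustrative.

```python
import math

BASES = "ACGT"
BASE_PAIRS = 3_200_000_000  # total quoted in the answer above

# Distinguishing between 4 equally likely signs takes log2(4) = 2 bits.
bits_per_base = math.log2(len(BASES))


def storage_megabytes(base_pairs: int) -> float:
    """Rough size, in megabytes (10**6 bytes), of the base sequence.

    Assumes one base is stored per rung, since A-T and C-G pairing
    determines the base on the other side of the ladder.
    """
    total_bits = base_pairs * bits_per_base
    return total_bits / 8 / 1_000_000


if __name__ == "__main__":
    print(f"{bits_per_base:.0f} bits per base")
    print(f"about {storage_megabytes(BASE_PAIRS):,.0f} MB")
```

Under these assumptions the sequence fits in roughly 800 MB. Storing the letters is therefore not the hard part; understanding what they mean is.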

These two exercises demonstrate that having a simple code is no guarantee of a simple system! The range of what can be produced depends not on the simplicity or complexity of the code, but on the possibilities for combining and stringing together small parts to form larger products. In other words, simple elements of data can generate a huge amount of information.
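The growth in possible combinations can be made concrete with a short calculation. The sketch below counts how many distinct sequences of a given length can be written with the four DNA signs. It is an upper bound that ignores any rules about which sequences form working genes, and the function name `count_sequences` is illustrative.

```python
def count_sequences(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of `length` signs from an alphabet."""
    return alphabet_size ** length


if __name__ == "__main__":
    # Four signs: A, C, G and T.
    for length in (1, 2, 3, 10, 20):
        total = count_sequences(4, length)
        print(f"length {length:>2}: {total:,} possible base sequences")
```

Every extra position multiplies the count by four, so a sequence of only 20 bases already has more than a trillion possible forms.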

