1.1 DNA and the genome
1.1.1 The chemical structure of DNA
This course explores the chemical nature of the genome. Genomes are composed of DNA, and a knowledge of the structure of DNA is essential to understand how it can function as hereditary material. DNA is remarkable: breathtakingly simple in its structure, yet capable of directing all the living processes in a cell, the production of new cells and the development of a fertilized egg into an individual adult.
DNA illustrates beautifully the precise relationship between molecular structure and biological function. DNA has three key properties: it is relatively stable; its structure suggests an obvious way in which the molecule can be duplicated, or replicated; and it carries a store of vital information that is used in the cell to produce proteins. In this course, we will examine the chemical nature of DNA, which accounts for both its stability and the way it can be replicated – the first two of these three key properties.
It was in 1953 that James Watson and Francis Crick (Figure 1), working in the UK, deduced and published the three-dimensional structure of DNA. That year might be described as the dawn of molecular biology, for their publication was to have far-reaching consequences for our understanding of the nature of life, i.e. how cells function at the molecular level. So monumental was this work that they were awarded, together with Maurice Wilkins, the Nobel Prize in Physiology or Medicine in 1962. In this section we shall examine in some detail the structure of the DNA molecule.
Watson and Crick showed that DNA, or deoxyribonucleic acid, has a double helix structure: two helical strands, like spiral staircases, coiled round one another. It is this structure that accounts for the stability of DNA, one of its key features. Its simplest representation is shown in Figure 2. We will first consider the composition of each separate strand by taking it apart to reveal the building blocks of this molecule. Then we will show how the two strands interact to form the characteristic double helix structure of DNA.
Much as a protein is a string of amino acids, so each strand of the double helix is a string of building blocks, called nucleotides. Each nucleotide consists of three components: phosphate, a sugar molecule and a base (Figure 3). (The type of sugar is deoxyribose, hence the molecule is known as deoxyribonucleic acid.) The phosphate and the sugar are the same in each nucleotide but the base can differ. There are four different bases in DNA: adenine, guanine, cytosine and thymine. When illustrating a length of DNA we can simplify the name of each base to a single capital letter, so that A=adenine, G=guanine, C=cytosine and T=thymine, as shown in Figure 4.
In each strand, the phosphate of one nucleotide is joined to the sugar of another nucleotide, and so on down the strand, as shown on the left (in white) of the single strand in Figure 4. The strand of alternating phosphates and sugars is known as the sugar–phosphate backbone, and the bases protrude away from this towards the other strand of the helix.
How many nucleotides are shown in the DNA segment in Figure 4?
The simplest way of answering this question is to count the number of rectangles that represent the bases; there are eight.
There is a simplified way of describing the structure of DNA in text, and that is to write out the sequence of bases in a single strand, representing each base by its initial letter A, G, C or T. We will use this shorthand extensively from now on.
What is the base sequence in the portion of the DNA molecule shown in Figure 4, starting from the top?
These sequences are usually written or printed in lines like the letters or characters on this page. This is a convention, like reading left to right and top to bottom of this page. There may be many hundreds of thousands of bases within a single strand of a DNA molecule, in a long linear sequence. Such a sequence might appear as follows (reading from left to right):
You may be surprised to learn that DNA molecules are by far the largest known molecules on Earth. One DNA molecule runs the full length of the chromosome. If the DNA molecules of all the chromosomes in the nucleus of a single human cell were uncoiled, stretched out straight and laid end to end, they would measure about two metres. If all the DNA in all your cells were stretched out end to end, it would reach to the Moon and back about 10 000 times!
If you look at the comparatively short sequence of 50 bases printed above, and think about how many ways a simple coding language of just four letters could be rearranged in such a sequence, you will gain some appreciation of the huge variety of sequences that is possible. Take, for example, a short chain of just eight nucleotides. Since there are four different bases, there are four options for each position, and therefore 4×4×4×4×4×4×4×4=65 536 possible different sequences for a DNA molecule of just eight nucleotides! A DNA molecule consisting of thousands of nucleotides therefore represents a vast store of potential information.
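The counting argument above can be checked in a few lines of Python. This is only an illustrative sketch: the function name `count_sequences` is our own and is not part of the course material.

```python
BASES = "AGCT"  # the four DNA bases named in the text

def count_sequences(length: int) -> int:
    """Number of distinct base sequences of a given length.

    Each position can hold any one of the four bases, so the total is
    4 multiplied by itself `length` times.
    """
    if length < 0:
        raise ValueError("length must not be negative")
    return len(BASES) ** length

print(count_sequences(8))   # 65536, matching the figure in the text
print(count_sequences(50))  # the count for a 50-base sequence
```

For a sequence of 50 bases the result is already a 31-digit number, which gives a sense of how quickly the number of possible sequences grows with length.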
So far we have considered a single strand of DNA, but Figure 2 showed that DNA has a double helix structure. Here each of the two ‘ribbons’ spiralled around each other represents the sugar–phosphate backbone of Figure 4, whilst the horizontal bars represent the bases of the two strands.
The key to understanding the structure of DNA and how it functions in the cell lies in the interaction between the bases in each strand at the core of the molecule. Along the length of a strand within the double helix, each base makes a specific pairing with a corresponding base in the other strand. These interactions are known as base-pairing, for which there are very precise rules.
T pairs only with A, and C pairs only with G. These pairs are called complementary base pairs.
These pairs of complementary bases sit ‘flat’ within the double helix, rather like the steps of a spiral staircase, as shown in Figure 2. The relative proportions of the four bases in DNA are related to the base-pairing rules just outlined.
If you were to extract some DNA from cells, and isolate and purify the four bases, how many A bases would you expect to find relative to T bases? Similarly, how many C bases would you find relative to G?
A consequence of the base-pairing rules is that the amount of A in a DNA molecule is always equal to the amount of T; the same applies to the amount of C relative to that of G.
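This consequence can be illustrated with a short Python sketch. The five-base strand used here is invented purely for the example, and the names `PAIRS` and `double_strand_counts` are our own. The code builds the partner strand from the pairing rules and then counts every base across both strands.

```python
from collections import Counter

# Base-pairing rules from the text: T pairs only with A, C pairs only with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def double_strand_counts(strand: str) -> Counter:
    """Count each base across both strands of a double-stranded fragment."""
    partner = "".join(PAIRS[base] for base in strand)
    return Counter(strand + partner)

counts = double_strand_counts("AAGCA")  # made-up five-base strand
print(counts["A"], counts["T"])  # 3 3
print(counts["C"], counts["G"])  # 2 2
```

Whatever strand is supplied, every A on one strand adds a T on the other and every C adds a G, so the totals of A and T, and of C and G, must always match.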
The alignment of base pairs within a DNA molecule is shown in Figure 5. Here the helix is shown unwound, with the two sugar–phosphate backbones now parallel but still on the outside; the complementary base pairs form the core of the molecule. Since the sequence of bases on one strand is complementary to the sequence of bases on the other strand, the two strands of the double helix are described as complementary.
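Because the pairing rules are fixed, the base sequence of one strand determines the sequence of the other. The following minimal Python sketch makes this concrete; the function names are hypothetical, and it simply pairs bases position by position in the order the two strands are drawn side by side in the unwound figure.

```python
# Base-pairing rules from the text: T pairs only with A, C pairs only with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementary_strand(strand: str) -> str:
    """Return the partner strand, one base per position of the input.

    Raises KeyError if the input contains a letter other than A, G, C or T.
    """
    return "".join(COMPLEMENT[base] for base in strand.upper())

def are_complementary(strand_1: str, strand_2: str) -> bool:
    """Check that every base in strand_1 faces its correct partner in strand_2."""
    if len(strand_1) != len(strand_2):
        return False
    return all(
        COMPLEMENT.get(a) == b
        for a, b in zip(strand_1.upper(), strand_2.upper())
    )

print(complementary_strand("ATGC"))       # TACG
print(are_complementary("ATGC", "TACG"))  # True
print(are_complementary("ATGC", "TACC"))  # False
```

Applying `complementary_strand` twice returns the original sequence, which reflects the fact that each strand carries enough information to specify the other.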
The structure of DNA can be summarized as follows. The complementary base pairs sit at the core of the molecule, and are arranged flat like the steps of a spiral staircase. The two sugar–phosphate backbones lie to the outside of the helix, each spiralled around the other.
Watson and Crick, in their famous 1953 paper published in the journal Nature, wrote:
We wish to suggest a structure for … deoxyribose nucleic acid (DNA). This structure has novel features which are of considerable biological interest…. It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.
On the following page we consider how new DNA molecules are produced (the scientific term for this is synthesized), and show how true Watson and Crick's prediction was. But first watch the following video then try question 1.
In a fragment of double-stranded DNA, there is a total of 100 bases, of which 30 are C. Calculate the total number of each of the following items in the DNA fragment: (a) complementary base pairs; (b) G bases; (c) T bases; (d) A bases.
(a) 100 bases would form 50 complementary base pairs.
(b) Since C always pairs with G, the number of G bases is the same as the number of C bases, i.e. 30.
(c) and (d) 60 of the 100 bases are either C or G, so the remaining 40 bases are either T or A. As A always pairs with T, half of this number, i.e. 20, are T and the other 20 are A.
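The same reasoning can be written as a short calculation. The Python sketch below is illustrative only; the function name `base_composition` and its parameters are our own, and it assumes a fully double-stranded fragment that obeys the base-pairing rules.

```python
def base_composition(total_bases: int, c_bases: int) -> dict:
    """Derive pair and base counts for double-stranded DNA from the C count."""
    if total_bases % 2 != 0 or not 0 <= 2 * c_bases <= total_bases:
        raise ValueError("inconsistent counts for double-stranded DNA")
    g_bases = c_bases                            # C always pairs with G
    a_plus_t = total_bases - c_bases - g_bases   # the remaining bases are A or T
    return {
        "pairs": total_bases // 2,               # two bases per base pair
        "C": c_bases,
        "G": g_bases,
        "A": a_plus_t // 2,                      # A always pairs with T
        "T": a_plus_t // 2,
    }

print(base_composition(100, 30))
# {'pairs': 50, 'C': 30, 'G': 30, 'A': 20, 'T': 20}
```

Changing the two arguments lets you try the same question with other fragment sizes and C counts.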