6.3 The recognition of specific DNA sequences by proteins
Transcription factors act by binding to specific DNA regions, dependent upon the recognition of particular sequences of bases, usually through direct interactions in the major groove. They are known to use certain motifs for DNA binding and many contain the helix–turn–helix (HTH) or zinc finger motifs. We will now discuss the molecular properties of these two protein motifs that confer DNA binding ability on the proteins that contain them.
The classical HTH motif consists of two α-helical segments connected by a turn in the peptide chain that allows the helices to cross over each other. One of the helices serves to position the second recognition helix into the major groove of DNA, wherein specific hydrogen-bond formation occurs with the bases exposed in the groove. An example of this interaction is shown in Figure 22a and b for the Ultrabithorax (Ubx) homeodomain protein, a member of a family of proteins that play a critical role in Drosophila development. In this case, the protein utilises two supporting helices to direct the recognition domain, comprising a third helix, into the major groove. The two views of this interaction show how the recognition domain is positioned within the major groove. The interactions between the Ubx recognition domain and DNA are mediated by hydrogen bonds and non-polar interactions, as shown in Figure 22c.
The zinc finger motif consists of a protein domain within which either a pair of cysteines and a pair of histidines (Cys–His) or two pairs of cysteines (Cys–Cys) act together to coordinate a zinc ion (Figure 23a). The stretch of polypeptide in between these pairs of amino acids loops out and folds into a finger-shaped configuration. Much of the specificity of binding within the major groove resides in several amino acid residues in each finger, each finger recognising a three-base ‘target’ sequence in the DNA. An example of this interaction is shown in Figure 23b, where the mouse Zif 268 protein is shown interacting with a DNA helix. This protein carries three Cys–His fingers which can be seen lying within the major groove.
Within each finger, three specific amino acids, Arg, Glu and Arg, are responsible for the specificity of the recognition site – in this case, three fingers interact with three sequential DNA triplets of 3′-GCG-5′, as shown in Figure 23c.
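The idea that each finger reads one triplet, so that a three-finger protein reads a nine-base site, can be illustrated with a short Python sketch. It is not taken from the original material: the function names and the test sequence are hypothetical, and the model is deliberately simplified (exact matches only, with no account of finger order on the DNA or of binding affinity).

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_sites(dna: str, finger_triplets: list[str]) -> list[tuple[str, int]]:
    """Find exact matches to the site formed by joining one triplet per finger.

    Assumption: the triplets are listed in the order they appear along the
    strand being searched. Positions reported for the "-" strand are indices
    into the reverse complement of the input, not into the input itself.
    """
    site = "".join(finger_triplets).upper()
    hits = []
    for strand, sequence in (("+", dna.upper()), ("-", reverse_complement(dna))):
        start = sequence.find(site)
        while start != -1:
            hits.append((strand, start))
            start = sequence.find(site, start + 1)
    return hits


# Three fingers, each recognising GCG as in the example above.
# The surrounding sequence is invented for illustration.
example_dna = "TTAGCGGCGGCGAT"
print(find_sites(example_dna, ["GCG", "GCG", "GCG"]))
```

Because each finger contributes three bases, adding fingers lengthens the site and makes a chance match in a genome less likely, which is one reason multi-finger proteins can be highly sequence-specific.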
Extensive studies have been undertaken to determine a code that links specific amino acids to specific target triplets so that the DNA target sequences for any zinc finger-containing protein can be identified more easily. This would prove useful, as zinc finger-containing proteins are extremely common. For example, the Drosophila genome encodes 230 such proteins with an average of three fingers per protein and the human genome encodes almost 600 zinc finger-containing proteins with an average of eight fingers per protein.
What technique that you have already come across could be used to generate zinc finger proteins and/or target DNAs to investigate this specific interaction?
Genetic engineering techniques such as site-directed mutagenesis could be used to create both novel zinc fingers and target DNAs for such studies.
There are now computer programs that can apply DNA recognition rules for contacts between amino acid side-chains in a recognition domain and DNA bases in the major groove, to analyse transcription factor families, including the helix–turn–helix and the zinc finger proteins. Whilst these programs cannot predict the binding sites for engineered proteins, they can be used to create a generalised framework on the basis of which mutagenesis experiments can be designed. The strength of binding between an engineered protein and its target DNA is then assessed directly by experiment.
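As a rough illustration of what 'applying recognition rules' might look like, the following Python sketch looks up an expected base for each contacting amino acid. Everything in it is an assumption made for illustration: the rule table contains only the two pairings suggested by the Arg, Glu, Arg and GCG example in this section, the function names are hypothetical, and real programs use far richer, context-dependent rules. Its output is a starting hypothesis for designing mutagenesis experiments, not a prediction of binding.

```python
# Toy rule table (an assumption for illustration, not a real recognition code):
# only the pairings implied by the Arg-Glu-Arg / GCG example are included.
TOY_RULES = {"Arg": "G", "Glu": "C"}


def expected_triplet(contact_residues: list[str], rules: dict[str, str] = TOY_RULES) -> str:
    """Map each contacting residue to its expected base; 'N' means no rule applies."""
    return "".join(rules.get(residue, "N") for residue in contact_residues)


def expected_site(fingers: list[list[str]], rules: dict[str, str] = TOY_RULES) -> str:
    """Join the expected triplets of several fingers.

    Assumption: fingers and residues are listed in the same direction
    in which the resulting site is read.
    """
    return "".join(expected_triplet(finger, rules) for finger in fingers)


three_fingers = [["Arg", "Glu", "Arg"]] * 3
print(expected_site(three_fingers))                # site suggested by the toy rules
print(expected_triplet(["Arg", "His", "Arg"]))     # His has no rule here, so 'N'
```

The 'N' placeholder marks positions where the table has nothing to say; in practice these are the positions where an experimental test of a mutated finger would be most informative.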
What techniques could be used to determine the strength of interaction (binding) between an engineered protein and its target DNA?
Isothermal titration calorimetry (ITC) and differential scanning calorimetry (DSC) are two such techniques.
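Binding strength measured in this way is commonly reported as a dissociation constant, Kd, which is related to the standard free energy of binding by ΔG° = RT ln Kd (with Kd expressed relative to a 1 M standard state). The short Python sketch below applies this relationship. The Kd values are hypothetical and chosen only to show the scale of the numbers; they are not measurements for any protein discussed here.

```python
import math

GAS_CONSTANT = 8.314  # J mol^-1 K^-1


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard free energy of binding in kJ/mol, from Kd in mol/L.

    Uses delta G = R * T * ln(Kd); a smaller Kd (tighter binding)
    gives a more negative value.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0


# Hypothetical dissociation constants for a tight and a weak complex.
for label, kd in (("nanomolar", 1e-9), ("micromolar", 1e-6)):
    print(f"{label}: Kd = {kd:.0e} M, delta G = {delta_g_from_kd(kd):.1f} kJ/mol")
```

At 25 °C, a nanomolar Kd corresponds to roughly −51 kJ/mol, and each tenfold tightening of binding adds about −5.7 kJ/mol, so modest gains in the number of non-covalent contacts can change affinity substantially.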
In both zinc finger- and HTH-containing proteins, specific protein domains are utilised to increase the non-covalent bonding opportunities and hence the affinity and specificity of the protein–DNA interaction. You should notice, however, that the protein–DNA interface is not restricted to these domains, but that much of the rest of the DNA binding protein is in close proximity to the DNA chains. Interactions between these parts of the protein and the DNA, though non-specific, contribute to the overall binding affinity of the protein.