4.3 Conserved protein domains
By comparing the sequences held in the extensive protein databases, it is possible to identify many thousands of conserved domains. For example, within eukaryotes, over 600 domains have been identified with functions related to nuclear, extracellular and signalling proteins. The majority of conserved domains are evolutionarily ancient, with fewer than 10% being unique to vertebrates.
Figure 30 shows the amino acid sequences of the SH2 domains of several tyrosine kinases belonging to the so-called Src family. Src, described in the previous section, was the first tyrosine kinase to be discovered and has given its name to a family of nine homologous proteins. The SH2 domain is common to all these kinases and is highly conserved between species. This domain is found only within eukaryotic proteins, and the number of SH2-containing proteins is greater in more recently evolved eukaryotes than in evolutionarily more ancient species. Thus only one SH2-containing protein has been found in the yeast Saccharomyces cerevisiae and two in Arabidopsis thaliana, whereas Caenorhabditis elegans and Drosophila melanogaster have about 80 proteins with SH2 domains, and mammals have over 330.
In Figure 30, the sequences of the SH2 domain of four different Src family kinases, as well as homologous proteins in other species, have been aligned; that is, equivalent residues are shown in the same vertical column, with regions of sequence identity highlighted. Notice that the sequences of the Src SH2 domain from human, mouse and chicken are identical apart from residue 85 in chicken, whereas that of Drosophila has 55 residues out of the 108 in the sequence that are identical to those in human Src (51% identity). A sequence identity level of 30% or more is generally indicative of homology between two proteins. In some cases, residues are conservatively ‘substituted’ in a homologous protein by others that have similar physical and chemical properties. Thus the structures that such proteins adopt may not differ greatly.
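The percentage identity quoted above is simply the number of aligned columns that hold the same residue in both sequences, divided by the number of columns compared. The following Python sketch illustrates the arithmetic. The function name percent_identity and the two short sequences are hypothetical and are not taken from Figure 30; dividing by the full alignment length is one common convention and is assumed here.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned columns holding the same residue in both sequences.

    Assumes the two sequences are already aligned (equal length, '-' for gaps)
    and that the denominator is the full alignment length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


# The figure given in the text: 55 identical residues out of 108.
print(round(100 * 55 / 108))  # 51

# Two made-up aligned fragments (illustrative only, not real SH2 sequences).
toy_a = "WYFGKITRRE"
toy_b = "WYFGKLSRKD"
print(percent_identity(toy_a, toy_b))  # 60.0
```

Note that percentage identity counts only exact matches; conservative substitutions, discussed next, do not contribute to it.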
Tyrosine (Y) has a bulky aromatic ring in its side-chain and is uncharged at neutral pH. Which other amino acids might be conservatively substituted for tyrosine?
Phenylalanine (F) or tryptophan (W) might replace tyrosine without significantly affecting the folding of the protein.
Consider the first 15 amino acids of the human and Drosophila Src SH2 domains in Figure 30. Try to identify some of those amino acid residues that are nonidentical in the two sequences but that are similar in terms of their chemical properties and structure.
The ‘+’ signs denote those amino acids that are similar but not identical.
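The idea of marking similar but nonidentical residues with a ‘+’ can be sketched in a few lines of Python. The grouping of amino acids used below (aromatic, aliphatic, positively charged, negatively charged, polar uncharged) is a simplified assumption made for illustration; alignment programs generally score similarity with a substitution matrix rather than fixed groups. The names SIMILAR_GROUPS and similarity_line, and the two short sequences, are hypothetical and are not taken from Figure 30.

```python
# Simplified physicochemical groups (an illustrative assumption, not a standard).
SIMILAR_GROUPS = {
    "aromatic": "FWY",
    "aliphatic": "AVLIM",
    "positive": "KRH",
    "negative": "DE",
    "polar_uncharged": "STNQ",
}

# Map each residue to the name of its group; C, G and P are left ungrouped.
GROUP_OF = {
    residue: name
    for name, members in SIMILAR_GROUPS.items()
    for residue in members
}


def similarity_line(seq_a: str, seq_b: str) -> str:
    """Return one character per aligned column:
    the residue letter if identical, '+' if similar, a space otherwise."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    marks = []
    for a, b in zip(seq_a, seq_b):
        if a == b and a != "-":
            marks.append(a)    # identical residues
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            marks.append("+")  # different residues from the same group
        else:
            marks.append(" ")  # dissimilar residues or a gap
    return "".join(marks)


# Two made-up aligned fragments (illustrative only).
toy_a = "WYFGKITRRE"
toy_b = "WYFGKLSRKD"
print(toy_a)
print(similarity_line(toy_a, toy_b))  # WYFGK++R++
print(toy_b)
```

With this grouping, tyrosine (Y) paired with phenylalanine (F) or tryptophan (W) is marked ‘+’, in line with the conservative substitutions described above.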
In some cases, conservation between two proteins extends across the entire length of the protein, but in many cases the conserved regions are scattered along the length of the protein. This distribution results from domain shuffling, wherein segments of genes are duplicated, rearranged or swapped, as discussed in Section 1.4 and illustrated in Figure 16.