Lecture V - Supersecondary Structure and Protein Morphology
Reading assignment: Branden-Tooze II, chapter 3 (α-domain structures); 10, pp. 191-201 (helices in factor dimerization and DNA recognition); 12 (membrane proteins); 14 (fibrous proteins); Creighton II, 5.5 (fibrous proteins) [Next: Branden-Tooze II, Chapters 4 and 5]
Structures arising from Inter-Chain Hydrogen Bonds: β Sheets
When the orientation of peptide bonds alternates, the polypeptide chain becomes almost maximally stretched out (i.e. not only ω, but also φ and ψ are close to 180°), and peptide bonds from separate chains can readily form H-bonds between chains. Both parallel and antiparallel alignments are possible. More than 2 chains can align and form sheets in which the α-carbon atoms are alternately above and below the plane of the sheet, giving the appearance of creases or pleats. This structure therefore is often referred to as the "pleated sheet structure". It is also called β-structure after β-keratin, a fibrous protein in which it predominates. The repeat length corresponds to the length of an outstretched dipeptide, 7.0 Å; therefore the rise per residue is 3.5 Å. Occasionally an extra residue is accommodated in one of the strands, causing a so-called β bulge. Note that proline would interrupt H-bond patterns here also.
Structures involving Proline
Reverse turns: Is there room for proline then? There certainly is. Most proteins have a roughly globular shape and contain α and β structural elements as well as portions without secondary structure. In principle α and β structures are linear features (cylinders, strands) that need to be interrupted if a globular shape is to arise. Frequently the chain, on reaching the surface, must turn back, which it accomplishes by making reverse or β turns. It is not surprising that a substantial fraction, ca. 30% of all amino acid residues in the average globular protein, occurs in this structure. In terms of H-bond patterns, the reverse turn can be considered as a fragment of 3₁₀ helix that connects two antiparallel β strands, with carbonyl 1 bonding with amide 4; in reality only one of more than a dozen different versions has bond angles similar to those of the 3₁₀ helix. The sharp turn imposes spatial restrictions; we therefore frequently find G in the third and fourth position. In addition, there is a need to avoid or accommodate H-bonds; this appears to be the reason for the prevalence of P in the second position and the frequent occurrence of aspartic acid and asparagine in the turn.
Polyproline helix: We have encountered proline as a residue that is difficult to accommodate in H-bonded structures; however, it can make a peculiar helix by itself. As stressed earlier, the P-P bond is sterically limited to φ ≈ -80° and ψ ≈ +140°. A repetition of this particular set of angles automatically leads to a helical structure, the polyproline helix, which is also called the collagen helix after the central role it plays in the structure of this important fibrous protein (since in collagen several polyproline-like chains are wound around each other, the individual chains are not as stretched out, and the dihedral angles deviate more from the fully stretched 180°). The repeat length of the polyproline helix is 10 Å and there are 3 residues per repeat, or 3.3 Å per residue. Since peptide bonds involving P fairly readily adopt the cis configuration, we can distinguish 2 kinds of polyproline helices: polyproline I, with cis bonds and right-handed screw sense, and polyproline II, with trans bonds and left-handed screw sense; the latter predominates in aqueous media.
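The rise-per-residue figures quoted for the β strand and the polyproline helix both follow from dividing the repeat length by the number of residues per repeat. A minimal sketch of that arithmetic, using only the numbers given in the text (the function name is an illustrative choice, not part of the lecture):

```python
def rise_per_residue(repeat_length_angstrom, residues_per_repeat):
    """Axial advance per residue, in Angstrom."""
    return repeat_length_angstrom / residues_per_repeat

# Values quoted in the text: (repeat length in Angstrom, residues per repeat).
beta_strand_rise = rise_per_residue(7.0, 2)    # outstretched dipeptide
polyproline_rise = rise_per_residue(10.0, 3)   # polyproline helix

print(f"beta strand: {beta_strand_rise:.2f} A per residue")
print(f"polyproline: {polyproline_rise:.2f} A per residue")
# Ratio of the two rises: how extended the polyproline chain is
# relative to a beta strand.
print(f"polyproline / beta strand: {polyproline_rise / beta_strand_rise:.2f}")
```

On these numbers the polyproline chain is only slightly less extended than a β strand.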
Dihedral angles in secondary structures
Structure                       φ                ψ
poly-G (fully extended)         180°             180°
β, antiparallel                 -140°            +135°
β, parallel                     -120°            +120°
β, twisted sheet                -120°            +135°
poly-P (also helical poly-G)    -80°             +150°
collagen                        (-45 to -76°)    (+127 to +153°)
3₁₀ helix                       -50°             -30°
α helix                         -57°             -47°
π helix                         -57°             -70°
Secondary Structure and the Ramachandran Plot
All secondary structures are periodic and consequently characterized by a unique set of φ and ψ values (see handout). As a corollary, secondary structures can be represented as points on the plane of the Ramachandran plot (show on handout).
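Because each periodic structure is a single point on the Ramachandran plane, a measured (φ, ψ) pair can be compared with the tabulated reference points. The following is a minimal sketch, not a validated assignment method: it takes the single-valued rows of the table of dihedral angles (the collagen row is omitted because it is a range) and picks the nearest reference point, treating angles as periodic so that -179° and +180° count as neighbors. The names `REFERENCE_POINTS`, `angular_difference` and `nearest_structure` are illustrative assumptions.

```python
import math

# Reference (phi, psi) points in degrees, taken from the table of
# dihedral angles in secondary structures.
REFERENCE_POINTS = {
    "poly-G (fully extended)": (180.0, 180.0),
    "beta, antiparallel": (-140.0, 135.0),
    "beta, parallel": (-120.0, 120.0),
    "beta, twisted sheet": (-120.0, 135.0),
    "poly-P": (-80.0, 150.0),
    "3-10 helix": (-50.0, -30.0),
    "alpha helix": (-57.0, -47.0),
    "pi helix": (-57.0, -70.0),
}


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_structure(phi, psi):
    """Name of the reference point closest to (phi, psi) on the periodic plane."""
    def distance(item):
        ref_phi, ref_psi = item[1]
        return math.hypot(angular_difference(phi, ref_phi),
                          angular_difference(psi, ref_psi))

    name, _ = min(REFERENCE_POINTS.items(), key=distance)
    return name


for phi, psi in [(-60.0, -45.0), (-135.0, 140.0), (-179.0, 179.0)]:
    print((phi, psi), "->", nearest_structure(phi, psi))
```

A plain Euclidean distance is used here only for simplicity; real assignment programs also consider hydrogen-bond patterns over several consecutive residues.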
Supersecondary structure
The way secondary structure elements form specific, recurring, discernible assemblies is called supersecondary structure. We will now consider how elements of secondary structure contribute to the formation of larger structures, domains and proteins. Overall, proteins fall into 3 main classes: α domains with cores consisting entirely of α helices; β domains comprised of antiparallel β sheets; and α/β domains mostly made up of a parallel β sheet surrounded by α helices. In addition there are proteins not easily classified, such as small secretory proteins which are mostly held together by disulfide bridges.
α-Domains
Let us take a look at α-helix-derived structures first. α-Domains are made up of aggregates of α-helices. An isolated helix rarely forms a major protein component because it readily unfolds when surrounded by an aqueous medium; this is because helix stability relies primarily on hydrogen bonds, and hydrogen bonds can of course be as readily established with the solvent. Additional stabilizing elements are required, such as an apolar environment with which H-bonds cannot be made. Therefore, individual helices completely surrounded by solvent are rare in proteins. Examples are found in the calcium-binding proteins calmodulin and troponin C. Here the stabilizing factor may be the binding of 2 calcium ions near the carboxy terminal of the helix. As you may remember, the peptide bond has a fairly strong dipole moment (3.5 D, where 1 D, the debye, equals 10⁻¹⁸ e.s.u.·cm), and the α helix, in which all peptide bonds are oriented the same way, can itself be considered as a dipole, with a positive charge at the N-terminal and a negative charge at the C-terminal end (show). The helix dipole is exploited for stabilization even in helices that are embedded in globular proteins: a survey of the frequency of specific amino acids at the 2 ends of helices shows that acidic ones preferentially occur at the N terminal, basic ones preferentially at the C terminal. Stabilization of a wildtype protein can also be accomplished by site-directed mutagenesis. (Perhaps low temperature should be mentioned here as a stabilizing factor; antifreeze proteins, which protect the blood of arctic fish, crystallize as single α helices.)
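The size of the helix dipole can be estimated from the peptide-bond dipole quoted above. The sketch below is a rough upper-bound estimate under stated assumptions, not a measured value: it assumes that every peptide dipole is aligned perfectly with the helix axis (the `alignment` factor is a hypothetical knob for relaxing this) and that the helix rises 1.5 Å per residue, a standard α-helix figure that is not given in this section. Dividing the summed dipole by the helix length gives the point charge that would have to sit at each end to produce the same dipole.

```python
# 1 debye expressed in units of (elementary charge x Angstrom).
DEBYE_IN_E_ANGSTROM = 0.20819
PEPTIDE_DIPOLE_DEBYE = 3.5   # value quoted in the text
RISE_PER_RESIDUE = 1.5       # Angstrom; assumption, standard alpha-helix rise


def helix_dipole(n_peptide_units, alignment=1.0):
    """Summed dipole in e*Angstrom, assuming parallel peptide dipoles."""
    return (n_peptide_units * PEPTIDE_DIPOLE_DEBYE
            * DEBYE_IN_E_ANGSTROM * alignment)


def equivalent_end_charge(n_peptide_units, alignment=1.0):
    """Charge (in units of e) at each helix end giving the same dipole."""
    length = n_peptide_units * RISE_PER_RESIDUE
    return helix_dipole(n_peptide_units, alignment) / length


for n in (10, 20, 30):
    print(n, "units:", round(helix_dipole(n), 1), "e*A,",
          "end charge", round(equivalent_end_charge(n), 2), "e")
```

Under these assumptions the equivalent end charge does not depend on helix length, because both the dipole and the length grow linearly with the number of residues; it comes out at roughly half an elementary charge per end.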
For the analysis of the interaction of several helices it is advisable to get acquainted with several types of representation: (i) Helical wheel: This is a projection of the α-carbon atoms and their attached side chains onto a circular or top view of the helix. Side chains are spaced 100° apart and recorded on the perimeter of the wheel, until after 5 complete turns (= 18 residues) they return to the origin. This kind of representation rapidly reveals the presence of a hydrophobic face that one would expect in a helix that either interacts with another helix or is partly embedded in the apolar core of the protein. (ii) Cylindrical plot: Imagine the peptide chain projected onto a paper cylinder wrapped around it, the cylinder cut in the back parallel to the helix axis, and the surface flattened. The peptide backbone then appears not as a helix but as several disconnected straight lines that slope from lower left to upper right in the case of a right-handed helix such as the 3₁₀ or α helix. Helix interactions can be analyzed by superimposing on this plot another in which the cylinder is sliced open in front and flattened to reveal the interior surface. In this presentation the right-handed helix appears as a series of lines sloping from lower right to upper left. By positioning every 4th and 7th side chain (which in coiled coils or superhelices tend to be hydrophobic) of one helix as knobs into the corresponding holes on the other helix, one obtains a helix crossing at 18°, i.e. the 2 elementary helices wind around each other (in left-handed fashion!) to form a superhelix whose repeat length is 360°/18° = 20 α-helical turns, totaling ~110 Å. Other crossings of α-helices are possible; these involve ridges and grooves generated by the alignment of either every 3rd or every 4th side chain. Crossing angles can be 20°, 50° and 80°.
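The helical wheel and the superhelix repeat can both be checked with a few lines of arithmetic. The sketch below is illustrative only. It places residues at 100° intervals, finds the number of residues after which the wheel returns to its origin, lists the wheel angles of the hydrophobic core positions, and converts the 18° crossing angle into a repeat length. Two assumptions are not from the text: the core positions are taken as indices 0 and 3 of each 7-residue repeat (the usual "a" and "d" positions), and the α-helix pitch is taken as 5.4 Å per turn.

```python
ANGLE_PER_RESIDUE = 100.0   # degrees between successive side chains
HELIX_PITCH = 5.4           # Angstrom per alpha-helical turn (assumption)


def wheel_angle(residue_index):
    """Position of a residue on the helical wheel, in degrees (0 to 360)."""
    return (residue_index * ANGLE_PER_RESIDUE) % 360.0


def wheel_period(max_residues=100):
    """Smallest number of residues after which the wheel is back at 0 degrees."""
    for n in range(1, max_residues + 1):
        if wheel_angle(n) == 0.0:
            return n
    return None


def core_position_angles(n_heptads):
    """Wheel angles of indices 0 and 3 in each 7-residue repeat (assumed core)."""
    indices = []
    for h in range(n_heptads):
        indices.extend([7 * h, 7 * h + 3])
    return [wheel_angle(i) for i in indices]


def superhelix_repeat(crossing_angle=18.0, pitch=HELIX_PITCH):
    """Return (alpha-helical turns per superhelix repeat, length in Angstrom)."""
    turns = 360.0 / crossing_angle
    return turns, turns * pitch


print("wheel period:", wheel_period(), "residues")
print("core angles over 3 heptads:", core_position_angles(3))
print("superhelix repeat (turns, Angstrom):", superhelix_repeat())
```

The core positions all fall within one narrow arc of the wheel, which is the hydrophobic face described above; the arc drifts slowly from one heptad to the next, in keeping with the two helices winding around each other rather than running strictly parallel.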
Examples of coiled coils:
(i) Globular proteins: In globular proteins pairs of helices are usually aligned in antiparallel fashion. An example is the rop protein, a small RNA-binding protein involved in plasmid replication; each helix has about seven turns. Here 2 proteins form a dimeric structure in which all adjacent helices are aligned in antiparallel fashion. Sometimes the same motif is seen in a single polypeptide chain; examples are the electron carrier cytochrome b562 and the invertebrate oxygen carrier myohemerythrin. An extremely long α-helical coiled coil is found in E. coli seryl-transfer RNA synthetase, where each helix is about 10 turns long, giving rise to an unusual structure.
(ii) Fibrous proteins: A helical wheel analysis shows that many fibrous proteins also dimerize to give rise to α-helical coiled coils. [The prototype here is myosin, the major component of the contractile apparatus (the thick filament) in muscle. Two long polypeptide chains form a double-headed globular region and a rod-like coiled coil with an overall length of 1340 Å or nearly 1000 residues; the 2 α-helices form a left-handed superhelix with a 110-Å repeat (26 right-handed turns per left-handed turn, ca. 3.5 degrees or 1% per residue; i.e. the original overtwist is relaxed by about a third in this structure). Here as well as in tropomyosin the alignment is parallel. In addition, the intermediate filaments, a large family of tissue-specific filamentous proteins, display this structural feature. They occur exclusively in animal cells, consist of 2-stranded coiled coils and are on average 300 residues long. The most obvious is α-keratin, but there are also desmin in muscle; vimentin in connective tissue; neurofilament proteins in nerve cells; fibrinogen in plasma; and lamins. Triple helices are also possible. Examples are keratin, the cytoskeletal protein α-spectrin and influenza virus hemagglutinin; but they may even occur with leucine zippers!]
(iii) Dimerization motif in the bHLH and bZip transcription factors: Oftentimes the coiled coil is used to allow the reversible interaction of two proteins. [A motif referred to as the leucine zipper was first described in CCAAT box/Enhancer Binding Protein (CEBP), a particular eukaryotic transcription factor. It was later recognized that a very large gene family, the myc family or basic helix-loop-helix family of transcription factors, makes use of coiled coils for purposes of dimerization. In all of these cases a basic DNA-binding region is followed by a region capable of making α helices that engage in supercoiling. The structures of the DNA binding and dimerization regions of several representatives of this family have been determined (the yeast protein GCN4 - Ellenberger et al., Cell 71, 1223-1237, 1993; MyoD, Ma et al., Cell 77, 451-459, 1994). The possibility to reversibly form homo- and heterodimers is very important in higher organisms as it allows the design of a large number of regulatory elements from a limited number of proteins.]
(iv) Membrane proteins: α-helices are thought to play a prominent role in the structure of membrane proteins. There the α-helix provides a simple means of passing a polypeptide chain through the hydrophobic core of the membrane without leaving a single hydrogen bond partner in the polypeptide backbone unpaired (show); it goes without saying that the side chains cannot have any H-bond partners either, i.e. they must be hydrophobic. A large family of membrane receptors, of which the prototypes are the EGF receptor (epidermal growth factor receptor) and the insulin receptor, is believed to contain a single α-helical transmembrane segment. Note that about 20 hydrophobic residues are required to build a transmembrane helix. In contrast, α-helices in globular proteins, even if they are embedded in the interior of the protein and consequently hydrophobic, seldom exceed a length of about 10 residues. Of course the polypeptide chain can be threaded repeatedly through the membrane. The first membrane protein that was shown experimentally to contain α-helices in its transmembrane portion was bacteriorhodopsin, a light-driven proton pump from the bacterium Halobacterium halobium; it contains a total of seven such helices, oriented approximately perpendicular to the plane of the membrane. The structure was obtained by 3D reconstruction of electron-microscopic images. The vertebrate visual pigment rhodopsin has a very similar structure, and the same holds for a large number of receptors for hormones and neurotransmitters, which consequently have been referred to as the opsin family (opsin is the apoprotein of rhodopsin), and are now often called seven-pass receptors. The binding site for the visual pigment retinal in the case of rhodopsin, and for ligands in the case of chemoreceptors, is between the membrane-spanning helices.
The α-helical structure of the transmembrane portion of at least one type of protein, namely the photosynthetic reaction center of Rhodopseudomonas viridis, has been established by X-ray crystallography. Here again the photopigments are bound to transmembrane helices. An even more impressive example is the gap junction assembly (Unger et al., Science 283, 1176-1180, 1999) consisting of 6 subunits of 4 helices each on either of 2 apposed membranes, i.e. 48 helices in all!
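The requirement of about 20 hydrophobic residues per transmembrane helix suggests a simple way of scanning a sequence for candidate membrane-spanning segments: average a hydrophobicity score over a sliding window of 20 residues and flag windows with a high mean. The sketch below uses the Kyte-Doolittle hydropathy scale, which is a technique brought in for illustration and not part of the lecture; the threshold of 1.6 and the test sequence are likewise assumptions, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (illustrative choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_averages(sequence, window=20):
    """Mean hydropathy of every window of the given length."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_segments(sequence, window=20, threshold=1.6):
    """Merged (start, end) spans, 0-based and half-open, of windows whose
    mean hydropathy reaches the threshold (threshold is an assumption)."""
    segments = []
    for start, mean in enumerate(window_averages(sequence, window)):
        if mean < threshold:
            continue
        end = start + window
        if segments and start <= segments[-1][1]:
            # Overlaps or touches the previous span: extend it.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments


# Made-up test sequence: polar flanks around a 20-residue apolar stretch.
example = "MKDESRKNGE" + "LIVALLAVILGALLVIAALL" + "RKKDEESGNK"
print(candidate_tm_segments(example))
```

Because neighboring windows overlap, the reported span is somewhat wider than the apolar stretch itself; a seven-pass receptor would be expected to yield seven separate spans in such a scan.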
Other α-helical structures:
(i) Helix-turn-helix: In the vast majority of cases, helices aggregate to form higher order motifs. Thus two short α helices can be connected with a short turn or loop to form an angular structure. Such a structure forms the metal binding site in the calcium-binding proteins. The motif was first discovered in parvalbumin, where the fifth and sixth helices, labeled E and F, contribute to it. Since the 2 helices and the intervening loop region can be modeled with thumb, index finger, and middle finger, the motif is also referred to as the EF hand. Helix-turn-helix motifs are important components of bacterial repressors and activators. They are characterized by 2 successive α helices juxtaposed at about 90 degrees by a turn of 4 amino acids; dimerization of this class of bacterial gene-regulatory proteins arranges the α helix of each monomer with its analog so that they fit into successive major grooves of DNA.
(ii) The globin fold: Globular proteins can be made up to a large extent from α-helical segments. Such a structure is the globin fold, first discovered in myoglobin (as an aside, one might point out the "heretical" features of the F helix: an internal proline and the π turn at the C terminal). Hemoglobin can be considered as a myoglobin tetramer; at any rate the α and β globin subunits of hemoglobin display considerable sequence homology and the same folding as myoglobin. β-globin contains a famous helix crossing (helices B and E) in which replacement of G24 in helix B by any other residue causes destabilization.