Lecture VI Supersecondary Structure continued. β Domains and α/β
Domains. The Collagen Helix.
Reading assignment: Branden-Tooze II, Chapters 4 and 5
[Next 3 lectures: Dan Raleigh on Protein Folding]
B. β Domain Structures
We now turn to β structures in proteins. They are about as frequent as α helices, involving approximately 30% of all amino acyl residues. (Recall that an equal fraction of amino acids are components of reverse turns.) β structures occur in several distinct subgroups. A familiar example is Anfinsen's RNase, whose core is made up of an antiparallel β sheet. As we stressed in the case of helices, it takes more than an individual helix to form a protein core, and to a large extent this is also true for β sheets; usually the core is formed by packing two sheets together. Individual β strands have a right-handed twist; therefore sheets made of β strands in either the parallel or antiparallel fashion also show this torsion. Viewed across, perpendicular to the strands, the twist is a left-handed one.
1. Antiparallel β structures:
The simplest way to build a larger structure from individual β strands is to align antiparallel strands that are connected by reverse turns. (If other connections are allowed, even a simple sheet containing only 4 strands can be built in 24 different ways!)
(a) β-Barrels:
Up/down barrels: These are special cases of the antiparallel sheet. The simplest topology arises from the addition of successive β strands connected by hairpin loops, until the barrel is closed and the last strand is joined by hydrogen bonds to the first strand. Alignment of individual strands is antiparallel. The barrel can also be viewed as composed of two β sheets that cling together through hydrophobic interactions, much like 2 slices of bread in a peanut butter sandwich. [An example is retinol-binding protein. This protein transports retinol (vitamin A) from the liver to the tissues requiring it. It carries a single molecule of retinol to the periphery and is degraded after a single use. The hydrophobic interior of the barrel provides a binding site for the hydrophobic ligand, solubilizing it in the bloodstream. In general this kind of barrel seems to be particularly suited to act as a container for a variety of hydrophobic ligands. In most instances 2 antiparallel sheets constitute the barrel, but barrels can be more complex, as in the case of neuraminidase from influenza virus, in which six sheets participate (show).]
Greek key motifs: When loops not only connect directly adjacent β strands, but also cross over the top of the barrel, Greek key structures arise. Two such keys make up an 8-strand barrel. [This is the case in gamma crystallin, a major eye lens protein. It is perhaps of interest that the junctions between the key motifs coincide with exon boundaries.] Probably genes for large proteins evolved by the accidental juxtaposition of exons coding for specific functions or structural motifs or domains; this process is sometimes referred to as exon shuffling.
Jelly roll: An antiparallel β barrel can also be constructed by twisting an elongated hairpin a sufficient number of times around a cylindrical core. "It is called a jelly roll because the polypeptide chain is wrapped around a barrel core like a jelly roll" (my knowledge of British cuisine is inadequate for a better explanation). It has been proposed that during folding of the protein the jelly roll structure nucleates with a hairpin loop in the middle of the sequence, first forming a long hairpin which then curls up to give rise to the barrel structure. [This structure is prominent in influenza hemagglutinin, the bacterial DNA binding protein CAP (catabolite gene activating protein) and coat proteins of small spherical viruses (picorna, T=3 plant).]
(b) Transmembrane barrels: Barrels consist of β sheets whose hydrophobic faces interact with each other and whose hydrophilic surfaces face the outside of the protein. If a barrel is wide enough, this design can be turned inside out to yield a transmembrane channel. In fact such a channel has been found: porin, a protein from the outer membrane of E. coli, which as its name implies provides a pore through which solutes can passively diffuse across the membrane. (Here show porin: all strands are inside the membrane, all β turns outside. There are about 13 to 14 hydrophobic residues, spanning 28 to 35 Å; only 8 to 10 would be required if the strands were completely perpendicular to the plane of the membrane.)
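The residue counts quoted for porin can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: a rise of about 3.3 Å per residue along an extended β strand (a standard textbook value, not given in these notes), a hydrophobic layer 30 Å thick (chosen from the middle of the 28 to 35 Å range above), and a hypothetical helper name residues_to_span. It asks how many residues a strand needs to cross the layer when it is tilted away from the membrane normal.

```python
import math

# Assumption (not from the lecture): rise per residue along an
# extended beta strand is roughly 3.3 Angstrom.
RISE_PER_RESIDUE_A = 3.3


def residues_to_span(thickness_A, rise_A=RISE_PER_RESIDUE_A, tilt_deg=0.0):
    """Residues needed for one strand to cross a layer of given thickness.

    tilt_deg is the angle between the strand axis and the membrane normal;
    0 means the strand runs straight across the membrane.
    """
    projected_rise = rise_A * math.cos(math.radians(tilt_deg))
    return thickness_A / projected_rise


if __name__ == "__main__":
    for tilt in (0, 30, 45):
        n = residues_to_span(30.0, tilt_deg=tilt)
        print(f"tilt {tilt:2d} degrees: about {n:.1f} residues")
```

With these assumed numbers an untilted strand needs roughly 9 residues and a strand tilted by 45 degrees roughly 13, which is at least consistent with the idea that the extra residues seen in porin reflect strands that cross the membrane at an angle.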
(c) The β-propeller:
The β-propeller was also recognized fairly recently as a special β motif. It derives its name from the radial arrangement of six or seven small up-down sheets that are arranged like the blades of a propeller. Some of these proteins are enzymes (e.g. neuraminidase, covered in Branden and Tooze); others function in protein-protein interactions (e.g. the β-subunit of heterotrimeric G proteins such as transducin, or clathrin; ter Haar et al., Cell 95, 563, 1998).
2. Parallel β structures:
In a parallel sheet, strands cannot be connected with short segments such as the γ and β reverse turns; instead the connections need to be made over longer distances, and therefore we often find parallel sheets in combination with α helices in so-called α/β or mixed domains. Regarding these connections, it has been observed that they are almost invariably right-handed. This seems to be a consequence of the right-handed twist that even an isolated β strand exhibits. Since the individual extended polypeptide chain has a slight right-handed screw sense, sheets made up of several strands running in parallel likewise display a torsion. This causes right-handed connections between strands to be shorter and less strained, and presumably for this reason to be preferred. Very few violations of this "right-hand crossing rule" have been observed in proteins. In a parallel β sheet connections are often made by α helices, thus causing an alternation of α and β structures. Nevertheless several examples have been found in which the connecting chains are themselves β structures. These are the
β-Helices:
The parallel β-helix: Several years ago a new domain motif was described, the parallel β helix (Yoder et al., Science 260, 1503-1507, 1993); it has been found in the enzyme pectate lyase from the plant pathogen Erwinia chrysanthemi and from B. subtilis, and in the P22 tailspike protein. It can be pictured as a parallel β sheet whose ends are rolled into a cylinder. The helix sense is right-handed. There are 22 residues per turn, and consequently the rise per residue is minimal, 0.22 Å. Although theoretically this helix could have a very wide lumen (ca 15? Å inner diameter) and therefore might be expected to serve as a channel-like structure, it is actually a dimpled or compacted structure, a crushed cylinder whose interior is filled in with the side chains of the β structure, just as in the simple β barrel.
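The figure of 0.22 Å per residue follows from dividing the rise per turn by the number of residues per turn. The sketch below assumes that successive turns of the β-helix stack at the usual spacing between hydrogen-bonded β strands, about 4.8 Å; that value, the α-helix values used for comparison (5.4 Å pitch, 3.6 residues per turn), and the function name rise_per_residue are assumptions brought in for illustration, not numbers stated in these notes.

```python
def rise_per_residue(pitch_A, residues_per_turn):
    """Axial advance per residue for a helix with the given pitch."""
    return pitch_A / residues_per_turn


# Assumption: one turn of the beta-helix advances by the spacing between
# neighbouring hydrogen-bonded beta strands, about 4.8 Angstrom.
BETA_HELIX_PITCH_A = 4.8
BETA_HELIX_RESIDUES_PER_TURN = 22  # value quoted in the lecture

# Standard textbook alpha-helix values, used only for comparison.
ALPHA_HELIX_PITCH_A = 5.4
ALPHA_HELIX_RESIDUES_PER_TURN = 3.6

if __name__ == "__main__":
    beta = rise_per_residue(BETA_HELIX_PITCH_A, BETA_HELIX_RESIDUES_PER_TURN)
    alpha = rise_per_residue(ALPHA_HELIX_PITCH_A, ALPHA_HELIX_RESIDUES_PER_TURN)
    print(f"parallel beta-helix: {beta:.2f} A per residue")
    print(f"alpha helix:         {alpha:.2f} A per residue")
```

Under these assumptions the β-helix advances about one seventh as far per residue as an α helix, which is why each turn can accommodate so many residues.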
The left-handed parallel β-helix: After this another surprise: a left-handed parallel β helix has been found in the structure of UDP-N-acetylglucosamine acetyltransferase from E. coli (Raetz and Roderick, Science 270, 997-1000, 1995); the structure is unique in that it violates the almost universal right-handed crossing rule (see below?).
C. Mixed α-β Motifs
A majority of proteins contain both α and β motifs, and a number of regular supersecondary features can be discerned.
β-α-β motif: The simplest element of this kind is a β-α-β motif. Almost invariably these three elements are disposed in such a way that the loop they form has a right-handed screw sense (show). It has been suggested that this is a consequence of the individual stretched peptide chain's tendency to assume a right-handed helical conformation. Similarly, a β sheet will appear to have a right-handed twist when viewed along one of the extended peptide chains; as a corollary, in a perpendicular view the sheet appears to have a left-handed twist. This rule of the "right-handed cross-over", however, should be seen as entirely empirical; its origin is not really understood. At any rate, the recent discovery of a left-handed β-helix is a clear violation.
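One way to make the screw sense of a β-α-β unit concrete is a signed triple product. The sketch below is a simplified geometric model, not a method taken from the lecture or from Branden and Tooze: each strand and the connecting helix is reduced to a single point, the two strands are assumed to be exactly parallel, and all coordinates and function names are hypothetical. The chain runs up strand 1, swings out of the sheet toward the connector, and comes back into the sheet at strand 2; if that rotation and the advance from strand 1 to strand 2 obey the right-hand rule, the crossover is called right-handed.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    """Component-wise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))


def crossover_handedness(strand_dir, strand1_pos, strand2_pos, connector_pos):
    """Classify a beta-connector-beta crossover as 'right' or 'left'.

    strand_dir:    common N-to-C direction of the two parallel strands
    strand1_pos:   a point on the first strand
    strand2_pos:   a point on the second strand
    connector_pos: a point on the connecting helix, off the sheet plane
    """
    out_of_sheet = sub(connector_pos, strand1_pos)
    advance = sub(strand2_pos, strand1_pos)
    # Rotation from the strand direction toward the connector defines an
    # axis; the unit is right-handed if the chain advances along that axis.
    triple = dot(cross(strand_dir, out_of_sheet), advance)
    if triple > 0:
        return "right"
    if triple < 0:
        return "left"
    return "undefined"


if __name__ == "__main__":
    up = (0.0, 0.0, 1.0)        # strands run along +z
    s1 = (0.0, 0.0, 0.0)
    s2 = (4.8, 0.0, 0.0)        # neighbouring strand along +x
    print(crossover_handedness(up, s1, s2, (2.4, -10.0, 0.0)))
    print(crossover_handedness(up, s1, s2, (2.4, 10.0, 0.0)))
```

Moving the helix to the opposite face of the sheet flips the sign of the triple product, which is the geometric content of saying that the two possible crossovers are mirror-related and that proteins almost always use only one of them.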
α/β barrels: As a rule, mixed structures have a parallel β sheet at their core. Individual strands all point in one direction and are connected with each other by helices. A relatively simple example is the Rossmann fold β-α-β-α-β, which is found as the nucleotide binding site in many enzymes. A more extreme case is that of the RNase inhibitor, in which 17 parallel β strands alternate with 16 helices to form an open horseshoe-like structure. Finally, when a sheet completely folds back on itself it forms a barrel. As a rule such a barrel is composed of eight strands, which in this case one might refer to as staves. Several examples of this structure are found among glycolytic enzymes, a classical case being triose phosphate isomerase, after which the structure is also called the TIM barrel (show). The core of the barrel is filled with branched-chain hydrophobic side chains. In general, the barrel provides stability, while the loop regions, which are not required for holding the protein together, provide diversity. Thus the active site of such a barrel enzyme invariably is in a pocket formed by the loop regions that connect the carboxy ends of the β strands with the adjacent α helices (β arrows point at active site!!). Occasionally, as in the case of pyruvate kinase, another enzyme in the glycolytic pathway, the α/β barrel is only one of several domains; one of the connecting loops there forms a separate domain made up of antiparallel β strands.
Placement of active sites in mixed structures
Mixed structures often contain α helices on both sides of the β sheet. Note that active sites or ligand binding sites are almost always located at the carboxy edge of the β sheet (again here the active site is pointed at by the β arrows). Since there is no barrel, the active site does not have the form of a funnel, but rather the form of a cleft or crevice. The position of these crevices can be recognized on a topology diagram. They occur at "switch points" where connections from the carboxy ends of two adjacent β strands go in opposite directions, i.e. the two loops are on opposite sides of the β sheet (model with 2 hands). Mixed sheets also occur where an active site formed from parallel strands is flanked by antiparallel strands.
D. Polyproline and the collagen helix
Recall that P is difficult to accommodate within an α helix. This difficulty is caused by steric hindrance and the absence of an H-bond donor function; instead, polyproline has a tendency to form a helix of its own. Steric repulsion between pyrrolidine rings leaves the backbone of poly-P only very little space on the Ramachandran plot; it is limited to φ = -80 degrees and ψ = +150 degrees. This results in the formation of an almost completely stretched-out chain, with an inter-residue distance of 3.1 Å. The bulky proline rings avoid each other by twisting the chain into a left-handed helix, with 3 residues per turn. Three such strands are packed together into a right-handed superhelix in the protein collagen. To enable the 3 strands to wind around each other, a G residue is required in every 3rd position because it is the only amino acid small enough to fit in the core of the triple helix. This is a good example of the unique flexibility of protein structure provided by G. The repeat length of the superhelix is about 100 Å, or 10 primary turns. The amide hydrogen of the glycine peptide bond forms a hydrogen bond to the neighboring strand, thereby establishing cross-links perpendicular to the helical axis. In the extracellular space several such superhelices are then bundled together to form higher-order microfibrils. [An excursion on the topic of structure-function relationship: The function of collagen is to provide tensile strength for mechanical support. To this end (a) the chain is nearly fully extended, allowing the tension to be borne by the very strong covalent bonds; (b) the helix is triple-stranded, preventing individual strands from rotating independently around each other; (c) the screw senses of the elementary helix and the superhelix are opposite to each other, so that if one tries to unravel one, one tightens the other.
All of these features are also features of man-made ropes as I realized some years ago when I visited the rope walk in Mystic Seaport. This is, I may add, an example of convergent evolution, namely the appearance of a similar design in the absence of a phylogenetic relationship.]
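The numbers given above for the collagen helix can be tied together with a short calculation. The sketch below uses only the values quoted in this section (3 residues per turn, 3.1 Å between residues, 10 primary turns per superhelix repeat); the function names and the example sequences are hypothetical, and the glycine check is a deliberately naive test of the "G in every 3rd position" rule rather than a real sequence-analysis method.

```python
RESIDUES_PER_TURN = 3          # primary (left-handed) helix, from the lecture
RISE_PER_RESIDUE_A = 3.1       # inter-residue distance in Angstrom, from the lecture
PRIMARY_TURNS_PER_REPEAT = 10  # primary turns per superhelix repeat, from the lecture


def primary_pitch_A():
    """Axial length of one turn of the primary helix."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE_A


def superhelix_repeat_A():
    """Axial length of one superhelix repeat."""
    return PRIMARY_TURNS_PER_REPEAT * primary_pitch_A()


def glycine_every_third(sequence, offset=0):
    """True if every third residue, starting at offset, is glycine (G)."""
    positions = range(offset, len(sequence), 3)
    return all(sequence[i] == "G" for i in positions)


if __name__ == "__main__":
    print(f"primary pitch:     {primary_pitch_A():.1f} A")
    print(f"superhelix repeat: {superhelix_repeat_A():.0f} A")
    print(glycine_every_third("GPPGPAGPK"))   # hypothetical collagen-like stretch
    print(glycine_every_third("GPPAPPGPP"))   # glycine missing at position 4
```

The product of the three quoted values comes to a little over 90 Å, so the 100 Å repeat should be read as a round figure; each strand contributes 30 residues, 10 of them glycine, to one repeat.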
Postscript: Protein structure presentations
For the analysis of these structures it is useful to learn how to interpret specific protein structure presentations: Schematic diagrams present α helices as corkscrew structures and β strands as flat arrows; this type of presentation was popularized by Jane Richardson and is widely used by crystallographers in general and in the Branden-Tooze book in particular; it provides a good view of the secondary structure elements of a protein. Topology diagrams are simplifications which can reveal structural similarities that may remain concealed in the more realistic schematic representations; as much as possible the structure is reduced to the plane of the paper, with flat arrows depicting β strands, and rectangles or cylinders denoting α helices (show examples from Branden/Tooze, p.21). Topology diagrams are especially helpful for analysing β and mixed α/β structures.