In this chapter we turn to the experimental methods for determining the structures of ligands and proteins. Two techniques in particular deliver information about the three-dimensional structure of everything from small organic molecules to proteins: crystal structure analysis and high-resolution NMR spectroscopy. The first technique is the older method. It goes back to an experiment of Max von Laue in 1912. Just 17 years earlier Wilhelm Röntgen had discovered a form of electromagnetic radiation that was later named X-rays, or “Roentgen rays” in German in his honor. Together with his collaborators Walter Friedrich and Paul Knipping, Laue was able to demonstrate the wave nature of X-rays with a copper sulfate crystal. At the same time they proved the lattice structure of crystals. Only one year later William Lawrence Bragg and his father William Henry Bragg reaped the rewards of these experiments. They determined the crystal structure of sodium chloride. The technique has grown over the years. Today the structures of proteins with 4,000 amino acids have been determined. In recent years electron microscopy has proven to be a very powerful tool for the structure elucidation of membrane-bound proteins and viruses. NMR spectroscopy is, by comparison, a relatively young technique. In 1945 the research groups of Felix Bloch and Edward Purcell in the USA independently observed the resonance absorption of hydrogen atom nuclei in a magnetic field for the first time. From this experiment, the technique has grown, mostly due to progress with the instrumentation, to the extent that the structure determination of proteins with more than 800 amino acids has been accomplished. For this purpose, however, the protein must be extensively labeled with different isotopes.

1 Crystals: Aesthetic on the Outside, Periodic on the Inside

The term “crystal” immediately brings to mind well-formed minerals or sparkling gemstones with a magnificent cut. The association of crystals with the structures of the molecules that determine our lives only occurs to us as a second thought. The crystal is typically associated with “dead” material. When Jack Dunitz took over his chair as professor of organic chemistry at the ETH in Zurich at the end of the 1950s, the famous natural product chemist Leopold Ruzicka dismissively told him that crystals are a “chemical graveyard.” Nonetheless, Dunitz and his research group showed over many years that a crystal in no way belongs in a “graveyard,” but rather is the key to understanding the structure, dynamics, and reactivity of molecules.

If a mineral is considered, the regular construction of the single crystals stands out. Even organic materials have the ability to form shapely crystals. One need only think of the fascinating crystals of rock candy. Is this external regularity a representation of the inner structure? Before this question is answered, the way that crystals are obtained should be clarified. A mineralogist has it easy. Nature has already provided well-formed crystals over thousands or millions of years. Organic molecules and proteins rarely occur in Nature in a crystalline state. Conditions must be found under which they crystallize.

In general, crystals are grown from a solution. For simple organic substances this can also be accomplished from the melt or by sublimation. Both crystallization methods are familiar from water: a lake freezes to ice, and frost deposits as beautiful crystals from the vapor. For crystallization from solution a solvent is sought in which the compound is adequately soluble. By changing the conditions, the saturation point of the solution is exceeded. If this occurs slowly, small crystal nuclei form that can grow to large crystals. As a rule the solubility of the compound decreases with falling temperature. The saturation point of the solution can therefore be exceeded by lowering the temperature. The solution can also be concentrated, that is, some of the solvent is removed. Another possibility is the addition of a second solvent in which the compound is less soluble. If the ratio of the two solvents is correctly chosen, the saturation point can be slowly approached. For compounds with acidic or basic groups, pH conditions can be found under which the compound exists as a salt. Because of strong ionic interactions the salts often form better crystals. Compounds can also be “salted out.” For this, a salt, for example, sodium chloride, is added to an aqueous solution of the compound. The salt “uses up” the water molecules as it goes into solution. It becomes surrounded by a solvation sphere of water molecules. In doing so, water is withdrawn from the organic compound, which is itself surrounded by a solvation sphere of water. The saturation point of the compound is exceeded, and the crystallization begins.

Proteins are complex entities that, as a general rule, are only soluble in water. Because of their amino acid composition, they carry charged ionic groups on their surfaces. For proteins, too, conditions must be found under which they associate in a periodic array. This is accomplished by slowly changing the amount of water in which the protein is dissolved. This can work in both directions. Hydrophobic proteins begin to aggregate when the amount of water increases. Proteins that have more strongly polar groups on their surfaces aggregate when the water molecules are removed from their surfaces. The pH value, the choice of a suitable salt for salting out, and the temperature are the conditions that must be optimized. In addition to salts, surface-active substances (detergents) can also influence the solvent shell and support the crystallization. Despite this, crystallization is a kind of fine art. The search for suitable conditions requires creativity and diligence. Today, however, the crystallization methods are so sophisticated that the tedious work of setting up thousands of different test conditions is carried out by robots.

Sometimes considerable effort is invested in structure determination. In 1995, the crystallization and structure determination of HIV integrase, one of the key enzymes in the replication cycle of the virus, was accomplished only after the 40th point mutation of the original protein. This point mutation was made with the goal of changing the surface properties of the protein so that an orderly aggregation to a crystal could occur.

Let us return to the original question of whether the orderly outward appearance of a crystal is a reflection of the internal construction. Chemically, a crystal is homogeneously composed. The organic molecule or the protein represents the basic building block. It is only when these building blocks are neatly organized in space that a periodic array occurs that optimally fills the space. In daily life, many solutions to these packing problems are easily seen, for example, sugar cubes that only fit into the box if they are layered in the right direction, or paving stones that must be neatly laid in a periodic fashion to completely cover the path without gaps (Fig. 13.1).

Fig. 13.1

Paving stones cover a surface without leaving holes (a). This is only possible if they are derived from a particular basic geometric pattern, for instance a parallelogram, rectangle, square, triangle, or hexagon. This basic pattern can be modulated by complementary bulges and recesses. A path cannot be covered without holes if regular pentagons or octagons are used. If an octagonal stone is combined with a square stone, however, the surface is completely covered. It is immediately clear that if a square stone is cut along its two diagonals, four triangles result. By adding four such pieces, an octagon can be completed to a square (b).

A single paving stone, when correctly fitted to the next, represents a repeating unit in the lattice. A crystallographer refers to this unit as the elementary cell or unit cell, and to the orderly setting of one unit upon another as periodic translation. In the simplest organic crystal structure, the elementary cell contains one molecule (Fig. 13.2).

Fig. 13.2

In the simplest case, the molecular packing is accomplished purely by shifting the molecule in all three spatial directions. The resulting unit, the elementary cell, is derived from an irregularly angled body, a parallelepiped (above right, violet). If a point near the molecule is picked out and the equivalent points of all molecules in the crystal packing are connected, a three-dimensional lattice results.
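The construction of a lattice by pure translation can be sketched in a few lines of Python. The cell edges `a`, `b`, and `c` below are hypothetical values chosen only to give an irregularly angled parallelepiped; they are not taken from any real crystal structure.

```python
from itertools import product

def lattice_points(a, b, c, n=2):
    """Return all points i*a + j*b + k*c for 0 <= i, j, k < n."""
    points = []
    for i, j, k in product(range(n), repeat=3):
        # Shift the origin by whole multiples of the three cell edges.
        points.append(tuple(i * a[d] + j * b[d] + k * c[d] for d in range(3)))
    return points

# Hypothetical, non-orthogonal cell edges in angstroms (illustration only).
a = (5.0, 0.0, 0.0)
b = (1.0, 6.0, 0.0)
c = (0.5, 1.0, 7.0)

points = lattice_points(a, b, c, n=2)
print(len(points), "lattice points span one elementary cell")
```

With `n=2` the function returns the eight corners of a single cell; larger values of `n` stack further cells in all three directions.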

2 Just Like Wallpaper: Symmetries Govern Crystal Packings

The contents of an elementary cell can also be composed in a more complex way, for example, like a wallpaper pattern. A basic motif is repeated so that it fills the surface area. Crystallographers call the basic motif the asymmetric unit. In Fig. 13.3 this motif is a flower branch. Not all of the motifs can be generated simply by shifting the branch; some must additionally be reflected. A pair of image and mirror-image branches represents the elementary cell. The surface can now be filled with this building block by simply shifting it. In addition to being reflected, basic motifs can also be rotated. By using reflections and rotations, both so-called symmetry operations, the contents of the elementary cell are generated from the asymmetric unit. This cell is stacked on itself in all three spatial directions to form an orderly crystal lattice. Even as a three-dimensional entity, the elementary cell must take on a particular form to completely fill all of the space. If the basic types of elementary cells are combined with all of the possible symmetry operations, 230 possibilities result for the basic motif to fill the space. The crystallographer calls them the 230 space groups.

Fig. 13.3

An area can be covered not only by purely shifting an object, the asymmetric unit. Additional symmetry operations such as reflection and rotation can also be used. This way multiple copies of the object are generated. In the presented case, the flower branch along with its mirror image makes up the unit (the elementary cell is outlined in red) that can be used to cover the surface simply by shifting it regularly.

For chiral molecules, and proteins belong to this group, mirror reflection does not occur. Therefore proteins only crystallize in 65 space groups.

3 Crystal Lattices Diffract X-Rays

Max von Laue used crystals to prove the wave nature of X-rays (Roentgen rays) by diffracting them. For illustration, we shall consider a water wave. When a drop of rain strikes a puddle, circular waves form that propagate from the center outward. The drop generates a so-called elementary wave upon submersion. If two drops that are separated by a particular distance simultaneously strike the water’s surface, circular waves propagate outwardly from both submersion points. This experiment is easier to observe if the water’s surface is constantly being “excited,” for instance, with a constantly dripping tap. The circular outwardly spreading wave fronts meet each other at some point. What happens? A lamellar pattern forms: parts of the water’s surface remain at rest and other parts seem to move vigorously (Fig. 13.4). In cross section the water surface moves sinusoidally (Fig. 13.5). How do two waves behave that collide and superimpose with one another? If a wave peak meets another wave peak, or a wave trough meets another wave trough, the wave is amplified. If, on the other hand, a wave peak meets a trough, they cancel one another out. The water surface remains calm. The lamellar pattern of moving and still water is caused by this superimposition. It is called interference. The band density depends on the distance between the submersion points of the drops. The ensuing interference pattern therefore contains information about the relative position of the points from which the elementary waves were generated.

Fig. 13.4

Two raindrops strike the surface of the water and form circular, outwardly moving water waves. These superimpose on one another to give a band-formed interference pattern. There are areas along these bands where the water surface is quiet. In other areas it moves that much more strongly.

Fig. 13.5

The waves run in a sinusoidal manner in cross section. The distance between two wave peaks is called the wavelength. The height of the water wave at the summit is called the amplitude. The position at which the wave crosses the resting position determines the phase. (a) If two wave trains with the same phase meet, they add to one another and the amplitude doubles. This situation is found at the places in Fig. 13.4 where the water’s surface moves more strongly. (b) If there is a phase difference of exactly one half of a wavelength, the wave peaks meet the troughs. Both waves cancel one another out. This represents the parts of Fig. 13.4 where the water surface is very still. (c) Any other phase shift causes a wave whose amplitude is somewhere between the extremes in (a) and (b).
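The three cases (a) to (c) can be checked numerically. The following minimal sketch assumes two sine waves of equal amplitude and wavelength, for which the amplitude of the sum is 2A·|cos(φ/2)| at a phase shift φ. The function names are chosen for illustration only.

```python
import math

def resulting_amplitude(amplitude, phase_shift):
    """Amplitude of A*sin(x) + A*sin(x + phase_shift), from the sum formula."""
    return 2.0 * amplitude * abs(math.cos(phase_shift / 2.0))

def sampled_amplitude(amplitude, phase_shift, steps=3600):
    """Numerical cross-check: largest value of the summed wave over one period."""
    return max(
        amplitude * math.sin(x) + amplitude * math.sin(x + phase_shift)
        for x in (2.0 * math.pi * s / steps for s in range(steps))
    )

for label, shift in (("(a) in phase", 0.0),
                     ("(b) half a wavelength", math.pi),
                     ("(c) quarter wavelength", math.pi / 2.0)):
    print(label, round(resulting_amplitude(1.0, shift), 3))
```

A shift of half a wavelength corresponds to φ = π and gives complete cancellation; every other shift falls between zero and twice the original amplitude.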

If parallel water waves (e.g., a wave front at the coast) collide with a barrier that has a small opening (e.g., a harbor entrance), semicircular waves spread outward behind the barrier. If this barrier has two neighboring openings (double slit), a semicircular wave develops behind each opening. The same picture results as with the two raindrops (Fig. 13.4). The waves interfere with one another behind the double-slit barrier, and a diffraction pattern forms. The density of this pattern, that is, the spacing of the bands, depends on the geometry of the double slit.

Formally, the diffraction process at the crystal lattice is analogous. The same principles are valid, but the superimposition is more complex. A very simple lattice shall be considered that only has one type of atom. An X-ray beam travels as a plane wave toward this crystal. It collides with an array of atoms and initiates an interaction that is comparable to that between the raindrop and the puddle. Each atom generates a spherical wave because of the interaction between the atom’s electrons and the X-ray. The circular wave on the water’s surface therefore corresponds to the spherical wave in space. The spreading spherical waves superimpose on one another and form a wave that leaves the crystal in a changed direction (Fig. 13.6). Formally seen, the incoming and outgoing waves have an angular relationship to one another that is equivalent to the reflection of the wave in a plane perpendicular to the considered atom row. Therefore, the diffraction of the three-dimensional crystal lattice can be treated formally as a reflection at a plane in the lattice.

Fig. 13.6

If a wave front (blue) in one plane meets a row of atoms (black points on the dotted lines), each atom in this row becomes the starting point for a circular wave. This is analogous to the waves created when a raindrop hits the surface of a puddle. The circular waves that formed from the back row of atoms superimpose upon one another just as in the case of the water waves (Fig. 13.4). All circular waves are generated with the same phase in the indicated direction of the incoming wave (a). As a result of this superimposition, a new wave front forms (red) that leaves the crystal in an altered direction. Relative to the direction of the incoming wave, it leaves at an angle that formally corresponds to a reflection of the incoming wave front at the atom row that is marked with the green line. If a different incoming direction is chosen, the circular waves no longer start out in step with one another (b), that is, there is a phase difference between them. Their superimposition does not lead to a new wave front.

Many parallel sets of such lattice planes can be inscribed on a crystal with differing relative separation from one another and relative occupation density with atoms (Fig.13.7). The reflected waves contain the information about the geometry (distance) and the relative occupancy (scattering power) in this plane. To record the diffraction properties of a crystal, each set of parallel planes of the crystal must be oriented in the X-ray beam so that a reflection is possible. This laborious work is taken over by a computer-controlled diffractometer.
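The orientation at which a given set of planes “reflects” follows from a relation that the text does not state explicitly, Bragg's law, nλ = 2d·sin θ, where d is the distance between the planes, λ the wavelength, and θ the angle between the beam and the planes. The sketch below assumes this relation. The function name is illustrative, and the example wavelength is the commonly quoted value for Cu Kα radiation.

```python
import math

def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Glancing angle theta (degrees) that satisfies n*lambda = 2*d*sin(theta).

    Returns None when the planes lie too close together to reflect
    radiation of this wavelength (sin(theta) would exceed 1).
    """
    s = order * wavelength / (2.0 * d_spacing)
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))

CU_K_ALPHA = 1.5418  # angstroms, copper K-alpha radiation

for d in (10.0, 3.0, 1.5, 0.7):
    print(d, bragg_angle_deg(d, CU_K_ALPHA))
```

Closely spaced planes reflect at large angles, and planes separated by less than half the wavelength cannot reflect at all. This is one reason why each set of planes needs its own crystal orientation.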

Fig. 13.7

A set of parallel planes can be laid through the atoms of a crystal lattice (a, b, c). Their relative distance from one another and their atomic occupation density vary. Each set can give rise to “reflections” in an X-ray diffraction experiment. For this the crystal must be brought into the correct orientation for the incoming beam each time. The X-ray counter is positioned so that it captures the outgoing X-ray beam. It is from this geometry that the spatial orientation of the set of planes in the crystal is determined. The occupation density of the atoms decides how “well” a particular set of planes reflects. This information is contained in the intensity (amplitude) of the outgoing wave. (d) Different types of atoms in a molecular crystal have different spatial relationships to one another. A parallel set of planes can be placed through each atom in the molecule (here a three-atom molecule). The amplitude of the outgoing beam results from the superimposition of the wave trains that are reflected at these planes.

4 Crystal Structure Analysis: Evaluating the Spatial Arrangement and Intensity of Diffraction Patterns

To demonstrate that different lattices indeed generate different diffraction patterns, a simple experiment should be considered. For this purpose a laser pointer and different pinhole masks are needed. The pinhole masks can easily be made. A black-and-white printout of the periodic arrangement shown in Fig. 13.8 can be reduced and transferred to high-resolution photographic film. This homemade aperture represents a two-dimensional periodic lattice. The laser beam is diffracted by the pinhole mask and generates the diffraction pattern on a screen that is shown in Fig. 13.8. In the first two masks the distance and symmetry of the pinholes are changed. In the third and fourth masks the repeating motif of three or five differently sized holes represents a molecule that has two types of atoms. These motifs produce a periodic lattice when lined up next to each other. The lattice has the same dimensions as that in the first image on the left. If the diffraction pictures are compared, the distribution of the intensity of the light points is different. That is what contains the information about the construction of the motif that generated the lattice. It is just this information that is used to determine the crystal structure.

Fig. 13.8

A perforated mask can be used for a diffraction experiment with a laser pointer. For this the displayed hole patterns (above) must be brought to the size of the wavelength of laser light. The diffraction patterns below were generated from the masks. The holes in the two left masks are all the same size, which is comparable to having only one type of atom. The hole pattern changes from wide-meshed squares to an angular orientation. The diffraction patterns reflect the symmetry and distance of the holes to one another. In the two masks on the right, the distance between the repeating units is identical to the first masks. The composition of the motif in the repeating unit, however, varies. It is made up of multiple holes and can be compared to the different atoms in a molecule. The distance between the diffracted light reflections (lower row) is identical for the first, third, and fourth masks. The intensity of the diffracted radiation, however, varies from reflection to reflection. It contains the information about the composition and the geometry of the original motif.
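How the motif determines the intensities while the lattice determines the positions of the reflections can be illustrated with a one-dimensional sketch. It assumes the standard structure-factor sum F(h) = Σ f·exp(2πi·h·x) over the atoms of the motif, with x as a fractional coordinate in the cell and f as the scattering power. The two motifs and their values are hypothetical.

```python
import cmath
import math

def structure_factor(motif, h):
    """Sum the waves scattered by every atom (f, x) of the motif for reflection h."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in motif)

def intensities(motif, h_max=4):
    """Intensity |F(h)|^2 of reflections h = 0 .. h_max."""
    return [abs(structure_factor(motif, h)) ** 2 for h in range(h_max + 1)]

# Hypothetical motifs in the same cell: (scattering power, fractional position).
one_atom = [(6.0, 0.0)]
two_atoms = [(6.0, 0.0), (8.0, 0.3)]

print([round(i, 1) for i in intensities(one_atom)])
print([round(i, 1) for i in intensities(two_atoms)])
```

Both motifs give reflections at the same indices h because the cell is the same. Only the two-atom motif modulates the intensities from reflection to reflection, as seen for the third and fourth masks.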

The reflections, that is, the intensities of the individual light points in the diffraction pattern, contain the information about the form of the molecule. There is a mathematical technique, the Fourier transform, which can be used to translate the diffraction pattern back to the generating motif. A Fourier transform is the superimposition of many sine and cosine functions. The intensity of the diffraction reflections determines the contribution of the functions, as does their phase. The importance of the phase was already underscored in the interference of the waves (Fig. 13.5). Unfortunately, just this information about the relative phases is lost in the diffraction experiment. The diffractometer only registers the intensity of the reflections. The missing information is referred to as the phase problem of crystal structure determination. It must be reconstructed for the individual reflections by computational methods and by using appropriate measuring conditions. Frequently large electron-rich elements (e.g., heavy-metal ions) are embedded in the protein (e.g., by coordination to histidine or cysteine). These heavy atoms dominate the diffraction pattern, and in doing so, they betray their position in the crystal lattice. Another method takes advantage of the so-called anomalous scattering. This effect is based on the interaction of X-rays with the electrons of heavy atoms in particular. It leads to the situation that a spherical wave that is propagating toward an atom is reflected with a phase shift. Simply stated, it is returned with a delay. The effect is dependent on the wavelength and can be exploited to determine the phases. The crystal is measured at a synchrotron (a particle accelerator that also produces electromagnetic radiation in a broad wavelength range, including X-rays) and the diffraction experiment is carried out at several different wavelengths. Anomalous scattering requires that a heavy atom is contained in the protein structure. This is already the case for metalloproteins. Often another approach is taken. Proteins that are produced in a special expression system (Sect. 12.6) can be generated with selenomethionine instead of methionine. The heavier selenium serves as an anomalous scatterer in the diffraction experiment. There are methods for small molecules that allow a straightforward reconstruction of the phase information from the intensity distribution, the so-called “direct methods.” The development of such methods is being worked on for protein structure determination. Often an already-solved, related protein structure can be utilized as a starting model for a structure determination (molecular replacement method). The model is translated and rotated in the elementary cell by computer simulations until a calculated diffraction pattern is produced that matches the diffraction pattern of the unknown protein.
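The consequence of losing the phases can be made tangible with a small numerical experiment. The sketch below uses a hypothetical one-dimensional “electron density” with two atoms on a grid of 16 points and a plain discrete Fourier transform written out with the standard library. It rebuilds the density once with the correct phases and once with all phases set to zero, which stands in for the information the detector does not record.

```python
import cmath
import math

def dft(values):
    """Discrete Fourier transform: density -> complex Fourier coefficients."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * h * x / n) for x, v in enumerate(values))
            for h in range(n)]

def inverse_dft(coeffs):
    """Fourier synthesis: superimpose all waves to get the density back."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * h * x / n) for h, c in enumerate(coeffs)).real / n
            for x in range(n)]

# Hypothetical 1D density: two point atoms with 6 and 8 electrons.
density = [0.0] * 16
density[3] = 6.0
density[9] = 8.0

coeffs = dft(density)
amplitudes = [abs(c) for c in coeffs]          # all that the experiment records
with_phases = inverse_dft(coeffs)              # amplitudes and phases
without_phases = inverse_dft([complex(a, 0.0) for a in amplitudes])  # phases lost

print([round(v, 1) for v in with_phases])
print([round(v, 1) for v in without_phases])
```

With the correct phases the two atoms reappear at their positions. With the phases discarded the synthesis places its largest maximum at the origin, so the amplitudes alone do not return the motif.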

The phases obtained at the beginning of the structural analysis with this method are only approximate. Altogether the regeneration of the phase information is not trivial. Even in the 1960s, phasing calculations kept one scientist busy for several years. Methodological progress and the increased performance of computers now allow this to be accomplished in a few minutes. Even today, however, this step can still be very challenging for proteins. It is apparent, though, that the structure determination of medium-sized proteins is becoming routine. Historically, the time span from crystallization to structure determination could be quite long. Urease is certainly a curiosity. It was the first protein to be successfully crystallized. James B. Sumner accomplished this back in 1926. Its 3D structure, however, was first elucidated in 1995, that is, almost 70 years later!

5 Diffraction Power and Resolution Determine the Accuracy of a Crystal Structure

A picture of the contents of the unit cell is the result of the Fourier transform. It is portrayed in terms of the electron density in space (Fig. 13.9). The detail with which the electron density can be determined depends on the spatial resolution with which the diffraction pattern was measured. In terms of the Fourier transform, this is a question of the number of different waves that were superimposed upon one another with the correct amplitude and phase. It can be seen in the diffraction pattern created with the laser beam (Fig. 13.8) that the intensity clearly weakens toward the edges. How far out toward the edges the diffraction pattern can still be detected limits the accuracy with which the generating motif can be spatially resolved. For small organic molecules, the resolution is usually high enough that the atoms are visible as distinct maxima in the electron density. If the crystal’s quality is diminished due to lattice defects or disorder, the resolution is poorer. The resolution of protein crystals is usually between 1.5 and 3 Å. At the better end of this range, the resolution is on the order of magnitude of a bond length. The poorer end falls into the range of the diameter of a benzene ring. Resolutions better than 1 Å, however, have also been achieved (Fig. 13.9). In those cases many details are recognizable, such as single hydrogen atoms or multiple arrangements of side chains.
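The link between the number of superimposed waves and the resolution can be sketched with a one-dimensional Fourier synthesis. The cell length, the two atom positions, and the function names below are hypothetical. The highest index `h_max` that is included corresponds to a resolution of roughly `CELL / h_max`.

```python
import math

CELL = 20.0             # hypothetical cell length in angstroms
ATOMS = (9.25, 10.75)   # two equal point atoms, 1.5 angstroms apart

def density(x, h_max):
    """Fourier synthesis of the two atoms using only waves up to index h_max."""
    total = 0.0
    for atom in ATOMS:
        total += 1.0  # the h = 0 term
        for h in range(1, h_max + 1):
            total += 2.0 * math.cos(2.0 * math.pi * h * (x - atom) / CELL)
    return total

def atoms_resolved(h_max):
    """True if the density dips between the atoms, i.e., two maxima are seen."""
    midpoint = sum(ATOMS) / 2.0
    return density(midpoint, h_max) < density(ATOMS[0], h_max)

for h_max in (5, 20):
    print(f"resolution ~{CELL / h_max:.1f} A: resolved = {atoms_resolved(h_max)}")
```

With waves down to only 4 Å the two atoms merge into a single maximum. With waves down to 1 Å they appear as separate peaks, in line with the trend shown for the tryptophan residue in Fig. 13.9.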

Fig. 13.9

View of a crystal structure of aldose reductase (Sect. 27.4). The electron density (the so-called 2Fo–Fc density at the 1σ level) is displayed as a blue mesh at the predefined contour level around a tryptophan residue. In each case the diffraction data were obtained at the stated resolution, and a Fourier transform was used to calculate the electron density. The resolution increases from (a) 4 Å to (b) 3 Å, to (c) 2 Å, and to (d) 0.66 Å. The resolution in the last contour density is so high that hydrogen atoms can be recognized as single density peaks in the difference density map (Fo–Fc difference density at the 2σ level; positive is yellow, negative is violet). The electron density is so clearly structured at 2 Å that it is simple to fit the indole building block in place. At 4 Å resolution this assignment is problematic and can easily lead to errors.

At higher resolution the electron density maxima are directly assigned to the atoms in the molecule (Fig. 13.10). In the beginning this assignment is crude because the phases used in the Fourier transform are only approximate. The positions of the detected maxima must still be optimized. This is called “refinement of the structure.” For this the experimentally observed diffraction pattern is compared with the diffraction pattern that is calculated from the atomic positions of the preliminary model. If the measurement is very accurate, the density of a “pseudomolecule” with spherical atoms can be subtracted from the observed electron density at the end of the structure determination. What remains is the electron distribution of the bonds between the atoms in the molecule (Fig. 13.10). This is, however, only possible with very high-resolution measurements. At lower resolution, as is the case in moderately resolved protein structure determinations, a direct assignment of the atoms of the protein to the electron density maxima cannot be made (Fig. 13.11). More commonly the course of the chains is fitted to the electron density. Because proteins are constructed from 20 different amino acids that prefer to take on typical geometries, the interpretation of the electron density is simplified (Fig. 13.11). As with low-molecular-weight structures the model is iteratively refined, and the structural data are improved.
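The comparison between observed and calculated diffraction patterns is usually condensed into a single number. The text does not name such a measure; the sketch below assumes the widely used crystallographic residual R = Σ| |Fo| − |Fc| | / Σ|Fo|, computed from observed and calculated amplitudes. The amplitude lists are made-up values for illustration.

```python
def r_factor(observed, calculated):
    """Residual between observed and calculated structure-factor amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(observed, calculated))
    denominator = sum(abs(o) for o in observed)
    return numerator / denominator

# Made-up amplitudes for a handful of reflections (illustration only).
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc_start = [14.0, 15.0, 36.0, 35.0]    # crude preliminary model
f_calc_refined = [10.5, 19.0, 30.5, 41.0]  # after shifting the atoms

print(round(r_factor(f_obs, f_calc_start), 3))
print(round(r_factor(f_obs, f_calc_refined), 3))
```

During refinement the atomic positions are adjusted so that this residual decreases; a value of zero would mean perfect agreement between model and measurement.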

Fig. 13.10

Crystals with an edge length of 0.1–0.3 mm are needed for the structure determination of small organic molecules. (a) A diffraction pattern is obtained in an X-ray beam (compare Fig. 13.8) that is recorded on a photographic plate or (b) is registered with a diffractometer counter. The molecule that generated this diffraction pattern, which is periodically arranged in the crystal, is back-calculated from the reflections. (c) A Fourier transform is carried out with approximate phases, and a map of the electron density in space is obtained that is contoured according to its height. The maxima are assigned to the atoms in the molecule (here oxalic acid). (d) The spatial blurring of the electron density is associated with the thermal motion of the atoms. It is displayed with ellipsoids that enclose each atom with 50% probability. (e) Crystals that scatter well allow the determination of the electron density in the bonds between atoms. (f) The application of symmetry operations generates the molecular packing in the crystal lattice. It delivers information about noncovalent interactions between molecules.

Fig. 13.11

(a) The diffraction pattern of a protein crystal shows clearly more reflections than that of a small molecule. Because the crystals are made up of larger molecules, the unit cells have a larger volume and exhibit more lattice planes and therefore more reflections. However, because of the high solvent content and the inherent flexibility of the more complicated macromolecules, the crystals diffract more poorly and the data are recorded to a lower resolution. (b) The enormous flood of data is registered with an area detector on a diffractometer. This allows the simultaneous registration of many diffracted intensities. (c) A Fourier transform performed with phases from the first model delivers the distribution of the electron density in space (blue mesh). Because no atomic centers are resolved in this density, the trace of the protein chain (here a segment from a β sheet of tumor necrosis factor, TNF) is fitted to the electron density distribution. (d) Similarly to small molecules, the obtained model is refined until all of the atoms of the protein fit optimally into the density. (e) The color-coded thermal motion is shown over the entire molecule. Color changes from blue to yellow to red show the transition from mild to severe movement. (f) Symmetry operations generate the molecular packing in the crystal lattice. There are “empty” areas that are occupied by numerous water molecules. Because of their strong thermal motion and the disorder that it causes, these water molecules are not found in the electron density map.

Electrons scatter X-rays. Therefore, the number of electrons around an atom determines how well it is detected in the resulting density. Hydrogen atoms have only one electron in their shell. As a consequence, they are often not located or are located with poor accuracy in the electron density. Hydrogen atoms can be recognized as densities in the structure determination of small molecules, but this is only possible in protein structures if the resolution is better than 1 Å. This is unproblematic as long as it only concerns hydrogen atoms at spatially fixed positions on a rigid molecular scaffold, for instance, hydrogen atoms on phenyl rings. It is more difficult if the hydrogen atom is on a conformationally flexible group or on groups that can be protonated or deprotonated. It is valuable to know whether a carboxyl group is ionized or exists as the free acid, and in which direction the hydrogen atom is oriented. This information can only be indirectly gleaned from the protein structure through an exact analysis of the spatial orientation of the surrounding hydrogen-bonding partners.

The accuracy of the structure determination depends on the resolution of the data that were obtained from a crystal. Even if the structure of the protein is displayed on the computer screen like that of an organic molecule, its geometry is much less accurately determined. The error margins in small molecule determinations are approximately 0.01 Å for bond lengths, 0.1° for bond angles, and 1°–2° for dihedral angles (Chap. 16, “Conformational Analysis”). For protein structures, significantly larger errors must be assumed, and they are difficult to quantify. They depend on how the structure was refined. The electron density does not allow individual atoms to be resolved. Therefore amino acids are placed with idealized bond lengths and angles in the electron density. Their geometry is left at the predefined knowledge-based values for the subsequent refinement. The assignment of atom types for the placement of the side chains is partially based on assumptions. Knowledge-based values are used, or attempts are made to keep the hydrogen-bonding network consistent. These aspects are to be considered when judging the accuracy of a protein structure. The result of the crystal structure determination is a spatially and time-averaged picture of one “mean” molecule that represents the whole crystal. Often it is discovered that the electron density in some areas indicates only a reduced occupancy of a side chain or a part of a bound ligand. Furthermore, alternative orientations (conformations) are recognizable. Sometimes the electron density from entire areas is missing. This is indicative of “disorder,” and argues for a distribution over multiple orientations in the crystal. This disorder can be dynamic, that is, the relevant groups jump back and forth between two or more orientations. Or the disorder is static, which means that several orientations are present side by side in the crystal. Because the structure is an averaged picture, these arrangements are scattered throughout the crystal with different orientations. If a part of the molecule is entirely disordered, that is, scattered over numerous orientations, the electron density is usually not visible. Today, to reduce the damage caused by radiation exposure, structures are measured at 100 K in a stream of cold nitrogen gas. At this temperature many movements in the crystal are frozen, and static disorder can be observed. Despite this, it has been shown that the determined structure corresponds well to the situation at room or body temperature. These conclusions can be drawn by comparing the results to the analogous determination from NMR spectroscopy (Sect. 13.7) and molecular dynamics simulations (Sect. 15.8).
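
The effect of reduced occupancy and disorder on the averaged density can be illustrated with a small numerical sketch. It is a one-dimensional toy model, not a crystallographic calculation: each atom is represented by a Gaussian peak of arbitrary width, and the function names and all numbers are assumptions chosen for illustration.

```python
import math

def gaussian(x, center, width=0.3):
    """Unnormalized Gaussian peak standing in for the density of one atom."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def averaged_density(x, positions, occupancies):
    """Crystal-averaged density at x: occupancy-weighted sum over orientations."""
    return sum(occ * gaussian(x, pos) for pos, occ in zip(positions, occupancies))

# One fully ordered atom at x = 0 (occupancy 1.0).
ordered_peak = averaged_density(0.0, [0.0], [1.0])

# Two alternative orientations, 60 % and 40 % occupied, 2 Å apart.
two_site_peaks = [averaged_density(x, [0.0, 2.0], [0.6, 0.4]) for x in (0.0, 2.0)]

# The same atom distributed evenly over ten positions spaced 1 Å apart.
positions = [float(i) for i in range(10)]
smeared_peak = max(averaged_density(x, positions, [0.1] * 10) for x in positions)

print(ordered_peak, two_site_peaks, smeared_peak)
```

In this model the two alternative orientations appear as two weaker peaks, whereas the density of the atom that is spread over ten positions drops to roughly a tenth of the ordered peak height, which in a real map would be hard to distinguish from noise.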

6 Electron Microscopy: Using Two-Dimensional Crystals to Trace Membrane Proteins

Cryoelectron microscopy represents an ideal complement to X-ray structure determination because it makes the structure of very large membrane-bound proteins accessible. Electrons are used as the radiation source. They penetrate the crystalline sample only slightly because they are absorbed more strongly than X-rays. Molecules also scatter electrons much more strongly than X-rays. Therefore much smaller crystals can be used. Even razor-blade-thin crystals that are made up of only a few molecular layers are sufficient. Single molecules can even be imaged, but their molecular mass must exceed several million Daltons. Smaller molecular weights make periodically organized arrays of multiple molecules necessary. Membrane proteins have since been grown successfully as two-dimensional crystals, that is, in a periodic arrangement within a single plane. The attempt to grow crystals of such proteins that are large enough for an X-ray structure analysis has only worked a few times and requires very special additives for the crystallization.

In recent times crystallization of membrane proteins has been successful in lipidic cubic phases. Sophisticated mixtures of lipid, water, and protein can form structured three-dimensional lipidic arrays that are pervaded by water channels. Protein molecules diffuse into this structured yet flexible matrix, which facilitates crystal nucleation and growth.

In addition to the work with readily obtained crystals, electron radiation has another advantage over X-rays. It can be used for a diffraction experiment as well as for the direct visualization of an object. The microscopic visualization is unfortunately not possible with X-rays because a convergent lens cannot be built for X-rays. This is successful for electrons because they can be focused by using magnetic fields. Why not use an electron microscope to visualize molecules in general? Despite the reduced radiation, electrons still damage the samples considerably. Furthermore the crystals that are used represent about a millionth of the sample size that is used for X-ray structure analysis. The data for an X-ray structure can be collected on one single crystal. In contrast, several hundred to several thousand tiny crystals, often only 5 μm in size, are needed for electron microscopy. They are shock-frozen and directly exposed to the electron beam under high vacuum. Proteins can only withstand these conditions after special preparation. Only a very low radiation dose is used. Because of this, the images are very noisy and must be averaged over many observations. To obtain a detailed resolution in the direction perpendicular to the crystal’s plane, the crystal must be measured in many orientations. Fine structural details are lost in doing this. The analogous patterns in the electron diffraction diagram, as would be obtained in an X-ray experiment, can be corrected by computational methods. With the help of the Fourier transform, an electron density map of the molecule is obtained. Its interpretation or refinement is accomplished in the same way as for the X-ray experiment. The phasing that is necessary for the transform can be determined from the images in electron microscopy.
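
Why averaging over many observations rescues a noisy low-dose image can be sketched numerically. The example below is a toy model under stated assumptions: the "image" is a one-dimensional list of pixels, the noise is independent Gaussian noise, and all names and numbers are hypothetical. Under these assumptions the noise of the average falls roughly with the square root of the number of images.

```python
import random

def noisy_image(signal, noise_sigma, rng):
    """Return one low-dose exposure: the true signal plus Gaussian noise."""
    return [s + rng.gauss(0.0, noise_sigma) for s in signal]

def average_images(images):
    """Pixel-wise mean over a stack of aligned images."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]

def rms_error(image, signal):
    """Root-mean-square deviation of an image from the true signal."""
    return (sum((a - b) ** 2 for a, b in zip(image, signal)) / len(signal)) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical object: a weak feature (height 1) buried in noise of sigma 2.
signal = [1.0 if 20 <= i < 30 else 0.0 for i in range(200)]

single = noisy_image(signal, 2.0, rng)
stack = [noisy_image(signal, 2.0, rng) for _ in range(400)]
averaged = average_images(stack)

error_single = rms_error(single, signal)
error_averaged = rms_error(averaged, signal)
print(error_single, error_averaged)
```

With 400 exposures the residual noise is about twenty times smaller than in a single exposure, so the feature becomes visible. Real image processing additionally has to align the individual images before they can be averaged, which this sketch takes for granted.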

The technique is relatively young and the methods are developing further. There is more work to be done. Structural determination still takes several years, and only a few laboratories have adequately powerful microscopes. Nonetheless, the knowledge that we have about the structure of membrane-bound receptors today is often based on the results that were achieved with this method (Chap. 30, “Ligands for Channels, Pores, and Transporters”).

7 Structures in Solution: The Resonance Experiment in NMR Spectroscopy

Many atomic nuclei have an angular momentum, or spin. The nuclei that occur in biological systems that have a nuclear spin are the hydrogen isotope 1H, the carbon isotope 13C, the nitrogen isotope 15N, the fluorine isotope 19F, and the phosphorus isotope 31P. Just as a top would, these nuclei rotate about their axes. As long as no magnetic field is applied, the tops orient in all possible spatial directions. In a magnetic field they are forced into alignment (Fig. 13.12). If a toy top is spun, it moves in the gravitational field. This field has one preferred direction. If the rotation axis of the top and the direction of the gravitational field, which is oriented toward the center of the Earth, are not exactly aligned, the top wobbles. The end of the rotation axis performs a circular movement, an arc, with a very precise rotational speed. This speed depends on the mass and geometry of the top. In physics this movement is known as precession.

Fig. 13.12

Atomic nuclei with a rotational momentum behave like a spinning top. In the absence of an external magnetic field, they orient in all possible directions randomly (a). Upon application of a magnetic field, they orient their rotation axes parallel or antiparallel to the direction of the field (b). The precession movement is oriented in an arc around the applied field direction. The two orientations, parallel or antiparallel, with respect to the direction of the field are energetically different. Because of this, there is a small difference in occupancy between the two states. By applying an electromagnetic field with a frequency that corresponds to the rotational speed of the top’s axis, the occupancy can be inverted. This resonance absorption, the exact frequency of which depends on the type of nucleus and its immediate chemical environment, is registered with a spectrometer.

Atomic nuclei with a spin behave in a very similar way. In contrast to the macroscopic top, they obey the laws of quantum mechanics. This means that the rotation axes that their precession movement takes on can only adopt very specific angles with respect to the applied field direction. The result for the 1H, 13C, 15N, 19F, and 31P nuclei is that the rotation axis for the precession arc can only be parallel or antiparallel to the direction of the field. The orientation in the direction of the field is energetically somewhat more favorable than the orientation antiparallel to the direction of the field. Statistically, therefore, more nuclear spins in the substance sample will align with the direction of the field. If an alternating electromagnetic field is applied in addition to the external magnetic field, and its frequency corresponds to the precession frequency of the nuclear spin, the occupancy of “parallel” and “antiparallel” spinning nuclei can be reversed and a resonance absorption for the sample can be registered. After a particular time span, the original situation is restored (relaxation).
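
How small the occupancy difference between the two orientations is can be estimated with a short calculation. The sketch below assumes a simple two-level Boltzmann model, which the text does not spell out, and uses rounded literature values for the gyromagnetic ratios of 1H and 13C. The function names and the chosen field strength of 14.1 T are illustrative assumptions.

```python
import math

PLANCK = 6.62607015e-34   # Planck constant in J s
BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K

# Gyromagnetic ratio divided by 2*pi, in MHz per tesla (rounded literature values).
GAMMA_OVER_2PI_MHZ_PER_T = {"1H": 42.577, "13C": 10.708}

def larmor_frequency_mhz(nucleus, field_tesla):
    """Precession (resonance) frequency of a nucleus in a given magnetic field."""
    return GAMMA_OVER_2PI_MHZ_PER_T[nucleus] * field_tesla

def population_excess(nucleus, field_tesla, temperature_k=298.0):
    """Fraction (N_parallel - N_antiparallel) / N_total for a spin-1/2 nucleus."""
    delta_e = PLANCK * larmor_frequency_mhz(nucleus, field_tesla) * 1e6
    return math.tanh(delta_e / (2 * BOLTZMANN * temperature_k))

field = 14.1  # tesla, assumed field strength of the magnet
for nucleus in ("1H", "13C"):
    print(nucleus, larmor_frequency_mhz(nucleus, field), population_excess(nucleus, field))
```

For 1H at this field the precession frequency is about 600 MHz, and only about five nuclei in 100,000 make up the excess in the favorable orientation. This tiny difference is the reason why NMR spectroscopy needs comparatively concentrated samples.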

The rotational speed of the top’s axis for precession movements is characteristic for each type of nucleus. It depends further on the composition of the chemical environment in which the nucleus resides. A carbon atom of a phenyl ring has a different resonance frequency than that of an aliphatic chain. The relative position of the resonance absorption in relation to a standard reference is also called the chemical shift. Furthermore the individual nuclei can perceive the spin orientation of the neighboring nuclei. An alignment in the same direction as a neighboring nucleus is energetically different from that of an antiparallel orientation. This influence also modulates the rotational speed of the spin on the observed nucleus. The information transfer regarding the orientation or the magnetic state of the nuclei in the vicinity can be transmitted over several bonds. This transfer can even occur through space without any direct covalent connection.
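
The chemical shift is usually quoted in parts per million (ppm) of the reference frequency, which makes it independent of the field strength of the magnet. The following lines are a minimal sketch of this conversion. The frequencies are hypothetical values, and the function name is chosen for illustration only.

```python
def chemical_shift_ppm(signal_hz, reference_hz):
    """Position of a resonance relative to a reference, in parts per million."""
    return (signal_hz - reference_hz) / reference_hz * 1e6

# Hypothetical values: a reference resonating at 600 MHz and a signal
# that is observed 4380 Hz above it.
reference_hz = 600.0e6
signal_hz = reference_hz + 4380.0

shift = chemical_shift_ppm(signal_hz, reference_hz)
print(shift)  # about 7.3 ppm
```

On a magnet with half the field strength, both frequencies would be halved, and the same nucleus would still be reported at the same ppm value.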

To measure an NMR (nuclear magnetic resonance) spectrum, a solution of the substance has to be placed in a strong magnetic field. In addition, a variable electromagnetic field is applied to the sample. The frequencies at which the nuclei in the sample are in resonance, meaning when they flip from parallel to antiparallel, are recorded. The resulting spectrum discloses information about the composition and the chemical environment around the studied nuclei. It contains information about the spatial structure of the molecules under investigation. Based on the work of Richard Ernst, multidimensional NMR techniques have been developed in the last 30 years. By using suitable measuring conditions and selectively irradiating electromagnetic fields, information about the mutual influence of resonance frequencies among individual nuclei is separated and analyzed. This mutual transfer of information about the magnetic state of neighboring nuclei is apparent in multidimensional spectra, where it is registered in the form of cross peaks. Only the hydrogen isotope 1H occurs in nearly 100% natural abundance. Therefore, it can be assumed that for statistical reasons, two 1H nuclei will always be adjacent to each other in a molecule. In contrast, the 13C and 15N isotopes are scarce. As a result, statistically they are only very rarely found in the direct vicinity of one another. Data on the mutual influence of the magnetization of these nuclei are required for the spectra. Therefore it is necessary to enrich the proteins with the appropriate isotopes. For this, bacteria are fed with isotopically labeled substrates such as glucose or ammonium chloride and will then produce proteins that are isotopically enriched. It is even necessary to produce deuterated proteins for the structural investigation of very large proteins. Today, by using numerous spectroscopic techniques, spectra from proteins of more than 800 amino acids have been successfully interpreted.
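
The statistical argument for isotope enrichment can be checked with a few lines of arithmetic. The sketch assumes that the isotopes are distributed independently over the positions of a molecule and uses rounded natural abundances. The 98% labeling level is a hypothetical value, and the names are chosen for illustration.

```python
# Approximate natural abundances as fractions (rounded literature values).
NATURAL_ABUNDANCE = {"1H": 0.9999, "13C": 0.011, "15N": 0.0037}

def pair_probability(isotope_a, isotope_b, abundance=NATURAL_ABUNDANCE):
    """Chance that two given neighboring positions both carry the NMR-active isotope."""
    return abundance[isotope_a] * abundance[isotope_b]

natural_hh = pair_probability("1H", "1H")
natural_cc = pair_probability("13C", "13C")
natural_cn = pair_probability("13C", "15N")

# Hypothetical uniform labeling of the protein to 98 % in both 13C and 15N.
labeled = {"1H": 0.9999, "13C": 0.98, "15N": 0.98}
labeled_cn = pair_probability("13C", "15N", labeled)

print(natural_hh, natural_cc, natural_cn, labeled_cn)
```

Without enrichment, only about one in 8,000 neighboring carbon pairs and about one in 25,000 carbon–nitrogen pairs would contribute to the spectrum, whereas nearly every pair does so in the labeled protein.
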
The following questions can be addressed by NMR analysis:

  • Which atomic nuclei occur in which chemical environment?

  • What is in the immediate, covalently connected neighborhood of these nuclei? Information about the spatial orientation of atoms in the vicinity is also contained within these spectral parameters.

  • Which geometric relationships are given between different segments of the polypeptide chain? This results from information transfer about magnetic states of nuclei that are not directly connected by covalent bonds.

8 From Spectra to Structure: Distance Maps Evolve into Spatial Geometries

This last-mentioned observation, which results from the nuclear Overhauser effect (NOE), yields intramolecular distances of spatially neighboring but not directly covalently bound atoms. The entire connectivity, that is, the list of all covalent bonds within a molecule, and a list of the recorded intramolecular noncovalent distances are applied to generate the structure for the molecule (Fig. 13.13). For this purpose, so-called distance geometry calculations are used to create the spatial coordinates of the atoms.
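
That a complete table of distances fixes the spatial arrangement of the atoms can be shown with a toy example for four atoms. The sketch below uses simple trilateration as a stand-in for a real distance geometry program, which must cope with thousands of atoms and with incomplete, imprecise distances. The function names and the coordinates are hypothetical.

```python
import math

def distance(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def embed_four_atoms(d):
    """Build 3D coordinates for four atoms from their distance table d[i][j]."""
    # Atom 0 defines the origin, atom 1 the x axis.
    x1 = d[0][1]
    # Atom 2 is placed in the xy plane.
    x2 = (d[0][2] ** 2 + x1 ** 2 - d[1][2] ** 2) / (2 * x1)
    y2 = math.sqrt(max(d[0][2] ** 2 - x2 ** 2, 0.0))
    # Atom 3 follows from its distances to atoms 0, 1, and 2.
    x3 = (d[0][3] ** 2 + x1 ** 2 - d[1][3] ** 2) / (2 * x1)
    y3 = (d[0][3] ** 2 - d[2][3] ** 2 + x2 ** 2 + y2 ** 2 - 2 * x2 * x3) / (2 * y2)
    # The sign of z3 is arbitrary: distances cannot tell mirror images apart.
    z3 = math.sqrt(max(d[0][3] ** 2 - x3 ** 2 - y3 ** 2, 0.0))
    return [(0.0, 0.0, 0.0), (x1, 0.0, 0.0), (x2, y2, 0.0), (x3, y3, z3)]

# Hypothetical geometry, used only to produce a consistent distance table.
true_coords = [(1.0, 2.0, 3.0), (2.5, 2.0, 3.0), (3.1, 3.3, 3.2), (4.0, 3.0, 4.5)]
d = [[distance(p, q) for q in true_coords] for p in true_coords]

model = embed_four_atoms(d)
max_error = max(
    abs(distance(model[i], model[j]) - d[i][j]) for i in range(4) for j in range(4)
)
print(model, max_error)
```

The rebuilt model reproduces all input distances although its coordinates differ from the starting ones, because position and orientation in space are not contained in the distances. The handedness is not contained either, so it has to come from additional knowledge such as the known chirality of the amino acids.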

Fig. 13.13

A multidimensional NMR spectrum contains information about the spatial vicinity of atomic nuclei in a molecule (here, the trypsin inhibitor from bovine pancreas). It is expressed in so-called cross peaks. Information can be extracted about the distance between non-covalently bound atoms in a molecule. The individual signals of the spectra are assigned to atoms in the molecule (e.g., A and B). The positions that these atoms have in the polypeptide chain are known from the sequence of the protein (above left). The intensity of the cross peak indicates which spatial distance is found between nuclei A and B in the folded polypeptide chain (above right). Just as was done for A and B, the many other cross peaks are evaluated and translated into distance conditions.
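
How a cross-peak intensity is translated into a distance condition can be sketched as follows. The example assumes the commonly used approximation that the NOE intensity falls off with the sixth power of the distance, and it calibrates against a proton pair of known separation. This relationship is not stated in the text above, and the reference distance and intensities are hypothetical.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate a distance from a cross-peak intensity, assuming I ~ r**-6."""
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)

# Hypothetical calibration: a proton pair with a known distance of 2.5 Å
# gives a cross peak of intensity 100 (arbitrary units).
REF_DISTANCE = 2.5
REF_INTENSITY = 100.0

for intensity in (100.0, 20.0, 2.0):
    print(intensity, round(noe_distance(intensity, REF_INTENSITY, REF_DISTANCE), 2))
```

Because of the steep distance dependence, a fifty-fold weaker peak corresponds to a distance that is not even twice as long. In practice the intensities are therefore often only sorted into classes with generous upper distance limits.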

Often multiple equally good structural models fulfill the experimentally determined distance conditions in complex molecules. If the spectral parameters for a section of the structure are too sparsely distributed or involve distances that are too large, it is very difficult to achieve a unique spatial configuration of the atoms. Therefore, the generation of a structural model is coupled with molecular dynamics simulations (Sect. 15.7). These calculations deliver geometries of molecules that represent energetically favorable 3D structures consistent with the spectral parameters. Multiple slightly divergent models are obtained in areas with few spectral conditions. Therefore, NMR spectroscopists always suggest a bundle of structural solutions (Fig. 13.14).

Fig. 13.14

The accuracy of an NMR structure depends on the density of the experimentally determined atomic distances. These come from experiments that deliver information about the exchange of the magnetic state of spatially adjacent, but not directly connected atoms (so-called nuclear Overhauser effect, NOE). With the connectivity list and the NOE conditions, multiple structural models are generated. These models represent the low-energy geometries that agree with the spectral parameters. In the left part of the figure (a) the experimentally measured NOEs (black dashed lines) are distributed over the 3D structure of a domain of the guanine nucleotide exchange factor. For the sake of clarity, only the long-range NOEs are shown. Most of the amino acid side chains are also suppressed; many of these NOEs therefore indicate the positions of atoms that are not shown. In areas in which very few distances could be determined (e.g., in the green loop areas or at the termini), the model is ambiguously defined. Multiple models are consistent with the experimental data (b). The main chain of the protein fans out. In areas where a large number of NOE conditions are found (e.g., the helices and the central β strand), the structural models diverge only slightly from one another.
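
How strongly a bundle of models fans out can be quantified per residue. The sketch below computes, for each residue, the root-mean-square deviation of its position from the mean over all models. It assumes that the models have already been superimposed and that each residue is represented by a single atom; the coordinates are invented for illustration.

```python
import math

def per_residue_spread(models):
    """RMS deviation of each residue from its mean position across the models."""
    n_models = len(models)
    spreads = []
    for residue_coords in zip(*models):
        mean = [sum(c[k] for c in residue_coords) / n_models for k in range(3)]
        msd = sum(
            sum((c[k] - mean[k]) ** 2 for k in range(3)) for c in residue_coords
        ) / n_models
        spreads.append(math.sqrt(msd))
    return spreads

# Three hypothetical models of a three-residue chain (one atom per residue, in Å).
models = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.1, 0.0), (7.6, 2.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, -0.1, 0.0), (7.6, -2.0, 0.0)],
]

spreads = per_residue_spread(models)
print([round(s, 3) for s in spreads])
```

In this invented example the first residue is identical in all models, the second varies slightly, and the third diverges by more than 1.5 Å, as would be expected for a terminus with few NOE conditions.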

Attempts are often made to compare the quality of X-ray and NMR structures. Both methods measure different properties, and the structures are derived from different measured variables. This must be considered when making a direct comparison. The accuracy of an NMR structure fluctuates with the density and frequency of spectral distance constraints, while that of an X-ray structure mainly depends on the resolution of the diffraction experiment.

9 How Relevant Are Structures in a Crystal or NMR Tube to a Biological System?

The discussed structure determination techniques investigate molecules in a crystal assembly or in solution in an NMR tube. Are these conditions at all relevant for the biological conditions in an organism? Small flexible molecules change their geometry depending on the environment. They will adopt a different shape in a crystal, in solution, or in the binding pocket of a protein. Therefore the question can be asked whether the data from a small-molecule crystal structure are suitable to deliver information about the molecular geometry in a binding pocket. From the numerous known crystal structures, which by now number more than 500,000, some general principles about the molecular architecture of organic compounds can be deduced. All of the published crystal structures are electronically archived at the Cambridge Crystallographic Data Centre in England. They can be retrieved and compared with one another. It will be shown in Chaps. 14, “Three-Dimensional Structure of Biomolecules” and 16, “Conformational Analysis” that valuable information about possible molecular and interaction geometries is available through a statistical evaluation of these data, which also provides insights relevant for the conditions in a protein binding pocket.

Nevertheless, are the structures in the crystal of the protein too remote from the conditions in a biological system, much further than, for instance, the solution-phase state? A good many structure determinations that were carried out in solution and in the crystal in parallel are available. Experience has shown that the agreement is usually very good. Deviations are found predominantly on the surface of proteins. There, the amino acid side chains form interactions with the environment. Therefore, these deviations are not surprising. The crystal packing of tumor necrosis factor (TNF) is presented in Fig. 13.11. Large holes are conspicuous in the crystal packing. These areas are filled with water molecules that are so loosely incorporated into the crystal that they can freely move to a large extent. Therefore, they are not locatable in the electron density. Channels filled with water in protein crystals can make up to 70% of the crystal’s mass! Therefore, the crystal can also be considered as a highly concentrated, ordered solution. NMR measurements also require high concentrations. They are considerably higher than in biological systems, but are still 10–100 times lower than in protein crystals.

The high water content of protein crystals offers the possibility to allow small molecules to diffuse into the crystals. In the water channels, they move as they would in an aqueous solution. In favorable cases, the binding pocket of the protein is directly accessible from one of these channels. By placing the protein crystal directly in a solution of the active substance (soaking), the latter can penetrate the crystal through the channels, diffuse into the binding pockets, and dock there. Then a new diffraction experiment is carried out with the loaded crystal. The reflections are measured, and, based on the known structure of the protein, the electron density map is generated. The density of the uncomplexed protein is subtracted from that map. The difference density of the incorporated ligand remains. This information is of essential importance for understanding the interactions between small molecules and proteins. This still leaves open the question of whether the experimental structure is really relevant for the biological conditions. Crystalline hemoglobin is able to reversibly take up and release oxygen. It could be shown on crystals of purine nucleoside phosphorylase (PNP) that the enzyme is still catalytically active in the crystal (Fig. 13.15).
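
The subtraction that leaves the difference density of the ligand can be illustrated on a tiny grid. This is a toy sketch on made-up numbers: real difference maps are usually calculated from differences of structure-factor amplitudes rather than by subtracting real-space grids, and the grid values, threshold, and function names here are assumptions.

```python
def difference_map(complex_map, apo_map):
    """Point-wise difference between the soaked and the uncomplexed density."""
    return [
        [c - a for c, a in zip(row_c, row_a)]
        for row_c, row_a in zip(complex_map, apo_map)
    ]

def peaks_above(density, threshold):
    """Grid points whose density exceeds the given threshold."""
    return [
        (i, j)
        for i, row in enumerate(density)
        for j, value in enumerate(row)
        if value > threshold
    ]

# Hypothetical 3 x 4 density grids (arbitrary units) of the same crystal form.
apo_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.1],
    [0.1, 0.2, 0.1, 0.0],
]
complex_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.7],
    [0.1, 0.2, 0.1, 0.1],
]

ligand_sites = peaks_above(difference_map(complex_map, apo_map), threshold=0.3)
print(ligand_sites)
```

The strong protein peak at grid point (1, 1) cancels out, and only the two neighboring points where the ligand adds density remain above the threshold.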

Fig. 13.15

The enzyme purine nucleoside phosphorylase (PNP) transforms guanosine and phosphate to guanine and ribose-1-phosphate. If a protein crystal is placed in a solution of the substrate, the reaction begins. This could also have been caused by a partial dissolution of the enzyme crystal. If the crystal is removed from the solution, the reaction stops. If the crystal is brought back into the solution, the reaction carries on. This experiment demonstrates that even crystalline enzymes are catalytically active. Therefore, a geometry must be present in the crystal that corresponds to the biologically active form.

The research group of Malcolm Walkinshaw at the University of Edinburgh could even show on the example of the enzyme Cyp3, a peptidylproline isomerase, that there is a quantitative agreement between the crystalline and solution states. Different concentrations of an inhibiting prolyldipeptide were allowed to diffuse into the crystal. Afterward, the occupancy of this inhibitor obtained from the differently concentrated soaking solutions was determined in a crystallographic experiment. The binding constants were then ascertained from this occupancy data. They quantitatively agreed with the inhibition constants that were determined in a functional assay in solution.
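
How a binding constant can be extracted from occupancies measured at different soaking concentrations is sketched below. The example assumes a simple single-site binding model, occupancy = c / (c + Kd), and fits it by a linear least-squares line through the origin. It is not the published evaluation procedure of the Edinburgh group, and the concentrations and occupancies are invented numbers.

```python
def fit_dissociation_constant(concentrations, occupancies):
    """Least-squares estimate of Kd from occupancy = c / (c + Kd).

    Rearranged, 1/occupancy - 1 = Kd * (1/c), a straight line through the origin.
    """
    xs = [1.0 / c for c in concentrations]
    ys = [1.0 / occ - 1.0 for occ in occupancies]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Hypothetical soaking concentrations (mM) and refined ligand occupancies.
concentrations = [0.5, 1.0, 2.0, 5.0, 10.0]
occupancies = [0.20, 0.33, 0.50, 0.71, 0.83]

kd = fit_dissociation_constant(concentrations, occupancies)
print(round(kd, 2))  # close to 2 mM for these invented data
```

The fit treats the ligand concentration inside the crystal channels as equal to that of the soaking solution, which is a further assumption of this sketch.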

The diffraction data can be very quickly collected with even more intense, so-called white X-rays from a synchrotron source (the so-called Laue technique). With this experiment, it was possible to observe stable intermediates of enzyme reactions. Structural changes of the two-dimensional crystals of the acetylcholine receptor (Sect. 30.4) could be observed with electron microscopy after loading with the natural ligand. This and other experiments have proven that proteins exist in a crystal lattice that must be, at the very least, very similar to the biologically active form.

10 Synopsis

  • The most powerful methods to determine the spatial structure of molecules are X-ray crystallography and NMR spectroscopy. The former requires the biomolecules to be arranged in periodic arrays in a crystal, and the latter studies them in solution, usually in an isotopically labeled form.

  • Crystals need special conditions to grow from saturated solutions. They spatially arrange in periodic arrays, and the molecules pack through translational symmetry in three dimensions. In addition to the pure shifting of basic motifs, usually one molecule that represents the asymmetric unit, symmetry operations such as mirror reflection, two-, three-, four-, and six-fold rotation, or inversion can be applied.

  • Crystal lattices diffract X-rays and the diffraction experiment can be understood as a three-dimensional interference of elementary spherical waves generated at the positions of the atoms in the lattice. The diffraction phenomenon at a 3D lattice can be treated formally as multiple reflections at crystal planes in the lattice.

  • Because the relative phases of the generated elementary spherical waves, superimposed in the various reflections, are not accessible by experiment, they must be regenerated by sophisticated phasing methods. Only then can a Fourier transform be calculated from the measured reflections that represents the spatial distribution of the electron density in the crystal. A model of the crystallized molecules is assigned to this electron density.

  • The diffraction power and resolution of the crystals determine the accuracy of the resolved structure. For proteins, a resolution of 1.5–3 Å is usually achieved. At the high-resolution end of this range (1.5 Å), molecular building blocks such as phenyl rings are well resolved, and individual water molecules are visible. At the low-resolution end (3 Å), only the overall topology is determined, and the water molecules usually cannot be assigned.

  • The crystal structure is an average structure over space and time. Enhanced B-factors give an estimate of the residual mobility of molecular portions in a molecule.

  • Cryoelectron microscopy is an alternative method to determine the structure of membrane-bound proteins in particular by diffraction experiments. Data are collected from thousands of tiny razor blade-thin crystals.

  • NMR spectroscopy records the resonance of magnetic nuclei such as 1H, 13C, or 15N oriented in a strong magnetic field. The transition between parallel and antiparallel orientation of the nuclear spins can be induced by additional fields. Because the frequency at which these transitions take place depends on the chemical environment in a molecule, the spectral parameters contain information about the 3D structure of the molecules in solution.

  • The multiplicity of the recorded spectral parameters can be transformed into distance maps. They can be translated into the spatial structure of the protein by using a distance geometry approach coupled with molecular dynamics simulations.

  • It could be shown for many cases that the NMR structure of a protein in solution and the X-ray structure in a crystal largely coincide with one another. Differences are observed for the surface-exposed residues.

  • Protein crystals contain up to 70% water and exhibit large water channels that pass through the crystal. Small molecules can diffuse and access binding sites through these channels, particularly if these sites are accessible from one of these channels. The binding modes of small-molecule ligands can be easily determined by using soaking techniques.

  • The significance of the architecture of proteins determined in a crystalline environment for biologically relevant conditions has been demonstrated. For example, enzyme reactions also take place when the protein is arranged in a crystalline state.