9.1 Introduction

The far-field diffraction pattern of a finite and isolated object is continuous, unlike that of an ideal perfect crystal, which consists of discrete spots called Bragg peaks. The arrangement of macromolecules in a crystal lattice provides an enormous amplification of the diffraction signal in these peaks over that of a single molecule. This has been the key strategy for structure determination using X-rays from the earliest days, since it makes possible the measurement of a diffraction signal above background scattering within the very low exposure that can be tolerated before radiation damage modifies the structure under investigation. However, measurements confined only to Bragg peaks represent but a fraction of the information about a structure that could possibly be acquired in a diffraction experiment. This information loss, due to the fact that the Bragg peaks are too sparse to measure the entire content of the object’s Fourier spectrum, usually makes it impossible to derive the diffraction phases that are needed to synthesize the electron density from that spectrum. This so-called phase problem is the usual state of affairs in crystallography, requiring additional measurements such as multiple-wavelength anomalous diffraction or isomorphous replacement to provide the missing information. The problem can also be overcome if extremely high resolution is available, such that the information content of the measurement exceeds the information needed to describe all atomic degrees of freedom (the positions of the atoms and their amplitudes of vibration). By contrast, the measurement of the fully sampled continuous diffraction pattern of an isolated non-periodic object does not suffer from the phase problem, since there are generally more independent measurements in the diffraction intensities than are needed to describe the object, independent of resolution. For this reason, and because it is often difficult to produce highly ordered crystals of macromolecules, many approaches have been considered that may give access to the continuous diffraction.

Since there are more crystallographers than those who carry out coherent diffractive imaging of non-periodic objects, we have assumed that most who read this chapter are more familiar with the concepts of coherent diffraction as applied to crystalline systems. We therefore introduce, in Sect. 9.2, coherent diffraction generally, our nomenclature and conventions, and examine some insights about the information content of diffraction patterns and phase retrieval that have been developed in the field somewhat independently of crystallography. In Sect. 9.3, various situations are described where high-resolution continuous diffraction of macromolecules can be observed, many of which have long fascinated crystallographers and diffraction microscopists alike. Given the recent success of utilizing the continuous diffraction of translationally disordered photosystem II crystals for structure determination [1], summarized in Sect. 9.4.8, we devote Sect. 9.4 to such cases and show that the continuous diffraction of rigid units is remarkably resilient to different forms of disorder and correlation that might occur in such crystals, especially if one considers the structural information that can be extracted from the autocorrelation function (the generalized Patterson function [10], equal to the Fourier transform of the diffraction intensities). The exposition laid out in this section might inspire those working in coherent diffractive imaging to apply concepts such as partial-coherent diffraction analysis to macromolecular structure determination. Finally, in Sect. 9.5, we describe procedures and lessons learned to accurately measure macromolecular continuous diffraction, especially using X-ray free-electron lasers, which is somewhat more challenging to do than for Bragg peaks.

9.2 Coherent Diffraction of a General Object

Consider a diffraction experiment where a collimated quasi-monochromatic X-ray beam is elastically scattered by an object with a three-dimensional electron density distribution ρ(r). The incident beam, of wavelength λ, can be described by a wave-vector k in pointing in the direction of beam propagation with |k in| = 1∕λ. The diffraction intensities are recorded in the far field on a pixelated detector. A particular detector pixel records radiation travelling from the object with a wave-vector k out as shown in Fig. 9.1a. In the Born approximation, in which multiple scattering is neglected, the counts measured in this pixel are given by

$$\displaystyle \begin{aligned} I(\mathbf{q}) = I_0 \, P \, \varOmega_p \, r_e^2 \left| \int \rho(\mathbf{r}) \exp(2 \pi i \mathbf{r} \cdot \mathbf{q}) \, d\mathbf{r} \right |{}^2 \;, \end{aligned} $$
(9.1)

where q = k out −k in is the photon momentum transfer vector, I 0 is the incident fluence (number of photons per unit area within the exposure time of the measurement), P the polarization factor, Ω p the solid angle subtended by the pixel from the sample, and r e the classical radius of the electron.

Fig. 9.1 (a) Far-field scattering can be described in terms of paths of rays scattered from atoms. (b) The Ewald-sphere construction

Equation (9.1) can be explained in terms of a ray description of scattering by considering the density ρ(r) to be given by the sum of point scatterers of strength f i located at positions r i: \(\rho (\mathbf {r}) = \sum _i f_i \, \delta (\mathbf {r} - {\mathbf {r}}_i)\) (Fig. 9.1). A ray scattered in a direction k out from atom 1 will acquire a path difference of \(\ell _1 = ({\mathbf {r}}_1 \cdot \hat {\mathbf {k}}_{\mathrm {out}} - {\mathbf {r}}_1 \cdot \hat {\mathbf {k}}_{\mathrm {in}})\) relative to a ray scattering from the origin O, where \(\hat {\mathbf {k}}\) are unit vectors. This is the difference of the lengths of the thick lines in Fig. 9.1a. The accumulated phase, relative to this arbitrary origin, will therefore be \(\phi _1 = (2\pi /\lambda )\, \ell _1 = 2 \pi \, {\mathbf {r}}_1 \cdot \mathbf {q}\). The point scatterer itself may cause a modification to the scattered wave by the complex value f 1, giving a scattering amplitude \(f_1 \exp (i\phi _1) = f_1 \exp (2 \pi i{\mathbf {r}}_1 \cdot \mathbf {q})\). Equation (9.1) is simply the coherent summation of all scattered waves as obtained by integrating over all point scatterers in the object. The measured distribution of counts depends strongly on the phases ϕ of the scattered waves (and thus on the three-dimensional arrangement of scatterers), since these may lead to complete constructive or destructive interference, or something in between.
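The coherent summation over point scatterers can be illustrated with a minimal sketch. It assumes unit-strength scatterers, drops the pre-factors of Eq. (9.1), and uses only the Python standard library; the function name `scattered_intensity` and the two-atom geometry are illustrative choices, not taken from the text.

```python
import cmath
import math

def scattered_intensity(positions, strengths, q):
    """Return |sum_i f_i exp(2*pi*i r_i . q)|^2 (pre-factors of Eq. 9.1 dropped)."""
    amplitude = 0j
    for r, f in zip(positions, strengths):
        # Phase accumulated relative to the origin: 2*pi * (r . q)
        phase = 2 * math.pi * sum(rc * qc for rc, qc in zip(r, q))
        amplitude += f * cmath.exp(1j * phase)
    return abs(amplitude) ** 2

# Two identical scatterers one length unit apart along x (illustrative values).
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
strengths = [1.0, 1.0]

# r . q = 1: the two waves are in phase.
constructive = scattered_intensity(positions, strengths, (1.0, 0.0, 0.0))
# r . q = 1/2: the two waves are exactly out of phase.
destructive = scattered_intensity(positions, strengths, (0.5, 0.0, 0.0))
```

In this toy case the intensity swings between four times the single-scatterer value and zero, depending only on the relative phase of the two waves.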

Equation (9.1) is further identified as the square modulus of the Fourier transform of the electron density of the object. The strength of the diffraction in a given direction k out only depends on the Fourier component \(\tilde {\rho }(\mathbf {q})\) where we use the definition of the Fourier transform, for any integrable function g, as

$$\displaystyle \begin{aligned} \tilde{g}(\mathbf{q}) \equiv \mathcal{F}_q \{ g(\mathbf{x}) \} \equiv \int g(\mathbf{x}) \exp (2 \pi i \mathbf{x} \cdot \mathbf{q}) \, d\mathbf{x}. \end{aligned} $$
(9.2)

This component is a particular spatial frequency in the object, which may be thought of as a volume grating of a particular wavenumber 2π|q| and direction \(\hat {\mathbf {q}}\). From Fig. 9.1b, it is seen that the magnitude of q is given simply by

$$\displaystyle \begin{aligned} |\mathbf{q}| = 2 |\mathbf{k}| \sin \theta = \frac{2}{\lambda} \sin \theta \end{aligned} $$
(9.3)

for a scattering angle 2θ, and that due to the conservation of |k| (that is, elastic scattering) the vector q lies on the surface of a sphere (called the Ewald sphere). We see from the diagram in Fig. 9.1b that the scattered ray appears to reflect at an angle θ from a plane normal to q. That is, the ray reflects from the volume grating which is tilted at the angle θ relative to the incoming wave-vector. The ray only reflects if the period of the volume grating, d = 1∕|q|, satisfies Eq. (9.3), which is to say \(d = \lambda /(2 \sin \theta )\) which is well recognized as Bragg’s law.
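As a small worked example of Eq. (9.3), the following sketch converts a wavelength and scattering angle 2θ into |q| and the corresponding grating period d = 1∕|q|. The function names and the numerical values are assumptions made for illustration.

```python
import math

def momentum_transfer(wavelength, two_theta_deg):
    """|q| = (2 / wavelength) * sin(theta), with the scattering angle 2*theta in degrees."""
    theta = math.radians(two_theta_deg) / 2
    return 2 * math.sin(theta) / wavelength

def grating_period(wavelength, two_theta_deg):
    """Period d = 1/|q| of the volume grating that reflects at this angle (Bragg's law)."""
    return 1 / momentum_transfer(wavelength, two_theta_deg)

wavelength = 1.0   # in angstrom, illustrative
q = momentum_transfer(wavelength, 60.0)   # theta = 30 degrees, sin(theta) = 1/2
d = grating_period(wavelength, 60.0)      # period in the same unit as the wavelength
```

With these values sin θ = 1∕2, so |q| equals 1∕λ and the period d equals the wavelength.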

For a given orientation of the object, there are only a subset of spatial frequencies (volume gratings) that can be observed by the diffraction measurement. These are the frequencies that have the right periods d to obey the Bragg condition for given scattering angle 2θ and which happen to be in the right orientation for this to occur (by “reflection”). These are described exactly by the vectors q that lie on the Ewald sphere. In order to measure other spatial frequencies, not present on the Ewald sphere, the object must be rotated to bring these into the reflecting condition. It should be stressed that although Bragg’s law and the Ewald sphere construction are well-known concepts in crystallography, there is no requirement of periodicity of the object in the derivation or application of these concepts. Diffraction from a crystal is a special case (a periodic object), as discussed next.

9.2.1 Diffraction of a Periodic Object

A crystal can be thought of as a special case of the general object (and thus crystallography as a special case of coherent diffractive imaging!). The electron density of an ideal finite crystal can be described as a sum of the unit cell contents convolved with a periodic lattice:

$$\displaystyle \begin{aligned} \rho(\mathbf{r}) = \sum_{b=1}^{N_b} \rho_b(R_b \mathbf{r} - {\mathbf{a}}_b) \otimes \sum_{c=1}^{N_c} \delta(\mathbf{r} - {\mathbf{a}}_c) \; , \end{aligned} $$
(9.4)

where ρ b(r) is the asymmetric unit or rigid body which occurs N b times in each unit cell in positions and orientations given by a b and R b, respectively (relative to an arbitrary cell origin), and a c are the positions of all the N c unit cells that make up the crystal. These positions are usually periodic in all three dimensions. Below, we use the simplified notation of ρ b representing each of the N b differently oriented and translated asymmetric objects in the unit cell:

$$\displaystyle \begin{aligned} \rho(\mathbf{r}) = \sum_{b=1}^{N_b} \rho_b(\mathbf{r}) \otimes \sum_{c=1}^{N_c} \delta(\mathbf{r} - {\mathbf{a}}_c) \; . \end{aligned} $$
(9.5)

The diffraction of a periodic object takes on a special form. Since the density is given by the convolution of the unit cell and the lattice, the Fourier transform \(\tilde {\rho }(\mathbf {q})\) is given by the product of the Fourier transform of the unit cell with the Fourier transform of the lattice. The diffraction pattern \(|\tilde {\rho }(\mathbf {q})|{ }^2\) is therefore also given by a product, where the diffraction intensities of the unit cell are modulated by Bragg peaks given by

$$\displaystyle \begin{aligned} L(\mathbf{q}) &= \left| \mathcal{F}_q \left \{\sum_{c=1}^{N_c} \delta(\mathbf{r}-{\mathbf{a}}_c) \right \} \right|{}^2 \end{aligned} $$
(9.6)
$$\displaystyle \begin{aligned} &= \left|\sum_{c=1}^{N_c} \exp(2 \pi i {\mathbf{a}}_c \cdot \mathbf{q})\right|{}^2\;. \end{aligned} $$
(9.7)

As the number of unit cells tends to infinity, L(q) approaches a sum of delta functions, the reciprocal lattice, with spacing inversely proportional to the real-space lattice spacing. The existence of peaks can be easily explained by considering a crystal of symmetry P1 (i.e., with no additional symmetry) with unit cell dimensions in all directions of w. The electron density of this crystal is given by Eq. (9.4) with N b = 1, ρ(r) =∑c ρ b(r −a c). The diffraction from each object in the crystal is coherently added, as described in Eq. (9.1). Since the phase of the diffracted wave of an object varies with its displacement a c as 2π a c ⋅q, the diffraction from the various cells c in the crystal will usually destructively interfere, since their relative phases would tend to uniformly take on values from 0 to 2π. The exception is in directions where constructive interference occurs, which is when a c ⋅q forms whole numbers, or at values of q spaced by Δq B = 1∕w for a unit cell spacing w. For a crystalline sample, one can therefore only make measurements of the unit cell transform at this minimum spacing of Δq B. As discussed below, this limits the information content of the diffraction pattern.
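The constructive and destructive interference between unit cells can be checked numerically with a one-dimensional version of Eq. (9.7). This is a sketch under simplifying assumptions: the lattice positions are taken as a c = c·w for c = 0, …, N c − 1, and the names `lattice_factor`, `n_cells`, `on_peak`, and `between` are hypothetical.

```python
import cmath
import math

def lattice_factor(n_cells, w, q):
    """L(q) = |sum_c exp(2*pi*i a_c q)|^2 for a 1D lattice with a_c = c * w."""
    total = sum(cmath.exp(2j * math.pi * c * w * q) for c in range(n_cells))
    return abs(total) ** 2

n_cells, w = 20, 5.0   # illustrative crystal: 20 cells of width 5

# q is a whole multiple of 1/w: every a_c * q is an integer, all terms add in phase.
on_peak = lattice_factor(n_cells, w, 3 / w)
# Halfway between two Bragg peaks: successive cells contribute +1, -1, +1, ...
between = lattice_factor(n_cells, w, 3.5 / w)
```

On a peak all N c terms add in phase, so the peak height is N c squared, while midway between peaks the contributions of an even number of cells cancel.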

The idealized situation of an infinite crystal and delta function Bragg peaks is not realized in practice. For a coherently illuminated finite crystal (such as discussed in Sect. 9.3.3) Bragg peaks have a width given by the transform of the shape of the crystal, with widths inversely proportional to the crystal width and a total integrated value proportional to the number of cells in the crystal. More usually, such as for measurements at synchrotron radiation beamlines, the transverse coherence length of the X-ray beam is smaller than the crystal size, and is what determines the width of peaks. The partially coherent diffraction pattern in this case can be well approximated by the convolution of the coherent diffraction I(q) of Eq. (9.1) with the angular extent of the source (as seen by the sample). At such beamlines this angular extent is usually governed by the divergence of the beam, and not surprisingly this has the effect of matching the width of Bragg peaks to this angular extent. There also may be a length scale of the crystal over which strict periodicity persists, discussed in Sect. 9.4, again giving rise to a convolution of intensities. The effect of convolving I(q) with a correlation function or coherence function of the incident radiation, Γ(q), is to modulate the autocorrelation function of the object (described below in Sect. 9.2.4) with the Fourier transform \(\tilde {\varGamma }\). As will be seen in this chapter, knowledge of the precise form of the correlation function over length scales of many unit cells is not necessarily required to determine the structures of the molecular constituents of crystals.

9.2.2 Diffraction from an Atomic Object

At the high spatial resolution that can be accessed by X-ray wavelengths, the electron density can be described in terms of the constituent atoms of the sample. Since the electron density is highest around the nuclei rather than the bonds, this density can be accurately modelled as a sum over all atoms as

$$\displaystyle \begin{aligned} \rho(\mathbf{r})=\sum_{i=1}^N \zeta_i(\mathbf{r}-{\mathbf{r}}_i)\;, \end{aligned} $$
(9.8)

where ζ i(r) is the density of the ith atom, and there are N atoms in the entire object. The coherent diffracted intensity from this collection of atoms is therefore

$$\displaystyle \begin{aligned} I(\mathbf{q}) = \left|\sum_i f_i(\mathbf{q}) \exp(2\pi i {\mathbf{r}}_i \cdot \mathbf{q})\right|{}^2 = \sum_{ii'} f_i(\mathbf{q}) f^*_{i'}(\mathbf{q}) \exp(2\pi i ({\mathbf{r}}_i-{\mathbf{r}}_{i'}) \cdot\mathbf{q})\;, \end{aligned} $$
(9.9)

where ∗ is the complex conjugate and we have dropped the pre-factors in Eq. (9.1). The structure factors f(q), equal to the Fourier transform of the atom density \(\tilde {\zeta }(\mathbf {q})\) of each atom, can be modelled as a sum of Gaussian functions [54] but are often considered to be constant (due to point-like atoms).

9.2.3 Information Content of Diffraction Data

The three-dimensional (3D) map of the electron density ρ(r) of any general object can be synthesized from its Fourier amplitudes through an inverse Fourier transform. This is simply a coherent sum of all the volume gratings in real space that combine to make up the object,

$$\displaystyle \begin{aligned} m(\mathbf{r}) = \mathcal{F}^{-1}_r\{\tilde{\rho}(\mathbf{q})\} \equiv \int \tilde{\rho}(\mathbf{q}) \exp (-2 \pi i \mathbf{q} \cdot \mathbf{r}) \, d\mathbf{q} \;. \end{aligned} $$
(9.10)

Each period and orientation of these gratings must be summed not only with the correct strength (or amplitude \(|\tilde {\rho }(\mathbf {q})|\)) but also with the correct shift (or phase \(\mathrm {arg}\{\tilde {\rho }(\mathbf {q})\}\)) with respect to other frequencies. While the modulus of the Fourier amplitudes can be obtained from the square root of the measured diffraction intensities, \(\sqrt {I}(\mathbf {q})\), the phases are missing.

This so-called phase problem is perhaps one of the most studied inverse problems in science, and can be generally overcome from complete measurements of the independent diffraction intensities (except in some pathological cases) in two and three dimensions [2] (see Sect. 9.2.4). Unfortunately, the arrangement of objects in crystal lattices does not allow the required complete measurements to be made. For the simple example of a single molecule in a crystal of P1 symmetry, the Bragg peaks only provide half of the possible independent diffraction measurements that can be made in each direction, or an under-representation by a factor of 8 for a three-dimensional object, as is described below. This spells the difference between the feasibility or infeasibility of recovering the electron density directly from intensity measurements when no other information about the object is available.

9.2.4 Shannon Sampling and the Constraint Ratio

The diffraction pattern of a coherently illuminated finite object is “band limited,” which is to say that the modulation of the diffraction intensity as a function of scattering angle θ or momentum transfer q has a certain minimum modulation period. This smallest period is inversely proportional to the width of the object. This is true even for diffraction of crystals, where the finest features in the pattern are the Bragg peaks themselves. As mentioned above, the width of a Bragg peak is inversely proportional to the width of the entire crystal, or at least the width that is coherently illuminated.

The frequency content of a diffraction pattern can be examined through Fourier analysis, by taking the Fourier transform of the diffraction intensities I(q). That is

$$\displaystyle \begin{aligned} \tilde{I}(\mathbf{u}) \propto \mathcal{F}^{-1}_{u} \left\{|\tilde{\rho}(\mathbf{q}) |{}^2 \right\} &= \rho(\mathbf{r}) \otimes_{u} \rho^*(-\mathbf{r}) \\ &= \int \rho(\mathbf{r}) \rho^*(\mathbf{r}-\mathbf{u})\, d\mathbf{r} \\ &\equiv A_\rho(\mathbf{u}). \end{aligned} $$
(9.11)

This is the autocorrelation function of the object, a 3D map of all pair correlations of points within the object, and is a function of the real-space difference vector u. This function is zero for all u that are larger than the maximum separation of any two points in the object. For an object that has a largest width w, for example, the autocorrelation function extends from the origin by w, as well as by − w. Its extent is thus 2w. Since the diffraction pattern I(q) is a Fourier transform of the autocorrelation function, we see that the pattern is band limited with a minimum period equal to 1∕w. In essence, this means that the pattern is smooth at that reciprocal length scale. This can be verified from the Fourier transform of two delta functions spaced apart by 2w:

$$\displaystyle \begin{aligned} \mathcal{F}^{-1}_q\{\delta(r-w) + \delta(r+w)\} = \exp(-2 \pi i w q) + \exp(2 \pi i w q) = 2 \cos{}(2 \pi w q)\;.\end{aligned} $$
(9.12)
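The relation between the diffraction intensities and the autocorrelation function in Eq. (9.11) can be verified on a small discrete example. The sketch below assumes a real-valued one-dimensional object of width w = 4 samples, zero-padded to N = 16 samples so that the autocorrelation does not wrap around, and uses a hand-written discrete Fourier transform (`dft`, a hypothetical helper) with the sign convention of Eq. (9.2).

```python
import cmath
import math

def dft(x, sign):
    """Unnormalized DFT with kernel exp(sign * 2*pi*i * j*k / N)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(sign * 2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]

w, n = 4, 16                                   # object width and padded array length
rho = [1.0, 2.0, 0.5, 1.5] + [0.0] * (n - w)   # illustrative densities

rho_tilde = dft(rho, +1)                       # forward transform, as in Eq. (9.2)
intensity = [abs(a) ** 2 for a in rho_tilde]   # diffraction intensities
# Inverse transform of the intensities (opposite sign, divided by N).
autocorr = [a.real / n for a in dft(intensity, -1)]

# Direct pair-correlation sum A(u) = sum_r rho(r) rho(r - u), with u taken modulo N.
direct = [sum(rho[r] * rho[(r - u) % n] for r in range(n)) for u in range(n)]
```

Both routes give the same values, and the result vanishes for every shift u whose magnitude reaches the object width w, which is the finite extent that makes the diffraction pattern band limited.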

Shannon’s theorem [43] states that a band-limited function can be completely specified from discrete samples of that function as long as there are at least two samples per smallest period. Thus, the diffraction pattern discussed above can be completely measured with samples spaced no more than Δq = 1∕(2w) apart. Measuring samples more finely than this does not increase our knowledge of the diffraction pattern, and so this defines the information content of the pattern. It specifies the quantity of independent measurements that can be made of the diffraction intensities. (In practice, a finer sampling than this may help overcome an effective decrease in coherence due to the finite pixel width, or to effectively increase dynamic range and signal to noise.) If the diffraction pattern is measured to a maximum resolution q max, or a range from − q max to q max, then \(N_S = 2 q_{\max }/\Delta q = 4 w q_{\max }\) samples across the diffraction pattern completely define it. Expressing the resolution as q max = 1∕d, we find \(N_S = 4w/d\). In two dimensions, there are thus \(N_S^2\) independent measurements possible for an object whose extent fits in a square of width w, and \(N_S^3\) for the corresponding case in three dimensions. Here we have assumed a complex-valued object, which gives a diffraction pattern that has no symmetry. If the object is real valued, then the diffraction pattern is centrosymmetric and the number of independent measurements is reduced by half.

How much information is needed to fully specify the complex-valued electron density ρ(r) to a specific resolution? This time the minimum period to be considered is that of the finest volume grating that makes up the map m(r) at resolution q max = 1∕d, which is d. This modulation must be sampled at a spacing no larger than d∕2. Since the object has an extent of w, then at least \(2w/d\) samples are required in each dimension. The number of “unknowns” in the object is two per independent sample, considering that the density is complex-valued, and so the total is given by \(2\,(2w/d)^n\) for an n-dimensional object, compared with \((4w/d)^n\) possible independent diffraction intensity measurements. Thus, for such an object (fitting within a cube of width w), the potential excess of measurements to unknowns is given by the ratio \(\Omega = 2^{n-1}\) [16, 30]. That is, for a one-dimensional object there are an equal number of measurements to unknowns, two times as many in two dimensions, and a four times excess in three dimensions.

Now coming back to the crystal of symmetry P1 with unit cell dimensions in all directions of w, measurements of the intensity can only be made at the Bragg peaks which are spaced apart by Δq B = 1∕w for a unit cell spacing w. These are twice as far apart as the minimum Shannon spacing Δq S = 1∕(2w) required to completely measure the intensities from one of the objects on its own. That is, in this case of P1 symmetry, the Bragg peaks undersample the single-object diffraction pattern by a factor of two in each dimension, leading to \(\Omega = 2^{n-1}/2^n = 1/2\).
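The counting argument of this section can be written out as a short calculation. The sketch below assumes a complex-valued object filling a cube of width w, as in the text, and compares Shannon sampling with Bragg-only sampling; the function names and the values of w and d are illustrative.

```python
def shannon_samples(w, d):
    """Samples per axis: N_S = 2 q_max / dq with q_max = 1/d and dq = 1/(2w)."""
    return 4 * w / d

def constraint_ratio(n_dims, w, d, bragg_only=False):
    """Ratio of independent intensity measurements to object unknowns."""
    per_axis = shannon_samples(w, d)
    if bragg_only:
        per_axis /= 2   # Bragg spacing 1/w is twice the Shannon spacing 1/(2w)
    measurements = per_axis ** n_dims
    unknowns = 2 * (2 * w / d) ** n_dims   # real and imaginary part per sample
    return measurements / unknowns

w, d = 100.0, 2.0   # object width and resolution in the same length unit
continuous = [constraint_ratio(n, w, d) for n in (1, 2, 3)]
bragg = [constraint_ratio(n, w, d, bragg_only=True) for n in (1, 2, 3)]
```

The continuous case gives ratios of 1, 2, and 4 in one, two, and three dimensions, whereas the Bragg-sampled P1 crystal stays at 1∕2 regardless of dimension.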

The ratio Ω of the excess of measured intensities to those required to describe the object has been termed the constraint ratio [16]. Obviously, the recovery of the structure of the object from the measurements alone requires Ω > 1. This condition is sufficient for reconstruction in the case of zero noise [2, 5], except in certain pathological cases where several different structures give rise to the same diffraction pattern. These cases are so-called homometric structures, defined as those with the same sets of interatomic distances (the same autocorrelation function, and hence the same diffraction pattern). Trivial examples are mirror images, but a homometric pair of objects can be constructed from the convolution of two non-centrosymmetric structures with one of the structures either being inverted or in its original position. The handedness of alpha helices in proteins means that such cases will not exist for macromolecules. Experience shows on the whole that the larger Ω, the easier it is to directly recover the electron density map.

Elser and Millane [16] have pointed out that since the diffraction intensities are equally represented by their Fourier transform, Ω is equal to the number of independent coefficients in the autocorrelation function divided by the number of independent object coefficients. Since the autocorrelation function of any complex-valued object ρ is Hermitian with \(A_\rho ^*(-\mathbf {u}) = A_\rho (\mathbf {u})\), the number of independent coefficients is equal to half the non-zero area (or volume) of the autocorrelation function divided by the area (or volume) of a resolution element in two (or three) dimensions. Since the resolution element is the same for the autocorrelation function and the object, Ω is equal to the ratio of areas or volumes of the autocorrelation and object, divided by two. For shapes other than the cuboid discussed above, this may differ from \(2^{n-1}\). For example, triangular objects have Ω = 3, making these structures potentially easier to determine directly from the diffraction observations. Crystal diffraction need not only result in Ω = 1∕2 as described above. If the volume of the asymmetric unit in the crystal is less than half the volume of the unit cell, there may indeed be sufficient measurements to determine the structure directly from Bragg intensities [22, 32].
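The geometric definition of Ω can be explored by counting pixels. The following sketch assumes a pixelated two-dimensional support and takes the autocorrelation support to be the set of all difference vectors between support pixels; the function name `constraint_ratio_from_support` and the side length `n` are hypothetical. Because of the discretization the ratios only approach the continuum values as `n` grows.

```python
def constraint_ratio_from_support(points):
    """Omega = (autocorrelation support size) / (2 * object support size)."""
    pts = list(points)
    # Support of the autocorrelation: every difference vector between two pixels.
    differences = {(ax - bx, ay - by) for ax, ay in pts for bx, by in pts}
    return len(differences) / (2 * len(pts))

n = 30   # pixels along each side, illustrative
square = [(i, j) for i in range(n) for j in range(n)]
triangle = [(i, j) for i in range(n) for j in range(n - i)]   # right triangle

omega_square = constraint_ratio_from_support(square)       # approaches 2 for large n
omega_triangle = constraint_ratio_from_support(triangle)   # approaches 3 for large n
```

The square reproduces the cuboid result for n = 2 dimensions, while the triangle, whose difference set is a hexagon of six times its area, gives the larger ratio.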

The utilization of the excess of measured intensities to determine the structure presumes a shape of the constraint region to be used. This need not exactly conform to the actual boundary of the object, but must fully contain the object. The constraint that is applied in the process of phase retrieval is that the density is known outside this constraint volume (for example, it may be uniform or zero), consistent with the premise that the information required to describe the object is finite. Allowing this constraint volume to exceed the actual extent of the object reduces Ω, but may avoid applying an incorrect constraint. Prior knowledge about the shape of the object may therefore be helpful, which may indeed be available from microscopy or solution scattering. However, it is possible in many cases to determine the shape of an object from the shape of its autocorrelation function [12], that is, from the diffraction intensities themselves. Another strategy is to gradually improve the estimate of the constraint volume (known as the “support” of the object) using the image obtained by phase retrieval with a previous, larger estimate [29]. This approach, known as “shrinkwrap,” has been extremely successful because a tighter, more constraining support produces an improved estimate of the image, which itself provides the means, by a simple threshold and blurring, to obtain an improved support constraint.
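The support update by "a simple threshold and blurring" might look like the following one-dimensional sketch. It is not the implementation of [29]: the Gaussian width `sigma`, the relative `threshold`, and the function name `update_support` are assumed values and names chosen only to illustrate the idea.

```python
import math

def update_support(magnitudes, sigma=1.5, threshold=0.2):
    """Blur |m| with a Gaussian, then keep samples above a fraction of the maximum."""
    n = len(magnitudes)
    radius = int(3 * sigma)
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    norm = sum(kernel)
    blurred = []
    for i in range(n):
        acc = 0.0
        for k, weight in zip(offsets, kernel):
            if 0 <= i + k < n:          # samples beyond the array edge count as zero
                acc += weight * magnitudes[i + k]
        blurred.append(acc / norm)
    cutoff = threshold * max(blurred)
    return [value > cutoff for value in blurred]

# Illustrative current image estimate: an object on samples 10..14 over a weak background.
image = [0.01] * 32
for i in range(10, 15):
    image[i] = 1.0
support = update_support(image)   # list of booleans, True inside the new support
```

The blurring makes the new support slightly larger than the thresholded object, so that a later iteration can still recover density near the boundary.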

9.2.5 Iterative Phasing Algorithms

The feasibility of phasing sufficiently sampled diffraction data, as discussed in the previous section, has led to a vibrant field of research in applied mathematics to create phasing algorithms. This activity has been pursued quite separately from developments in crystallography, where refinement of models constrained by the rules of chemistry or the use of anomalous diffraction are common approaches to solve structures. For continuous diffraction, the measured diffraction and the support constraint alone are sufficient to determine a 3D map of the electron density, without the need for a chemical model. Additional constraints that can be added, such as the positivity of electron density (if appropriate) or a presumed histogram of the electron density (generally known for protein structures for particular spatial resolutions), will improve the ability to phase the diffraction data and may make the solution more robust in the presence of noise. Recently, some of the ideas from phase retrieval of continuous diffraction have been successfully applied to crystal diffraction, including charge flipping [38] and the hybrid input–output algorithm [22, 23].

Much analysis of iterative phasing algorithms has been carried out in the context of images or maps of electron density m(r) as finite-dimensional vectors. A particular map is represented by a point in an N-dimensional space, with the value along each coordinate given by the complex value at each of the N voxels. Out of all possible maps that can be formed, only a particular volume of the vector space will contain maps that obey a particular constraint, such as all maps that have Fourier amplitudes equal to the square root of the measured diffraction intensities. A different volume contains all maps that have zero density outside the support. The intersection of these volumes gives the solution: a map that obeys both sets of constraints. One possible strategy would be to exhaustively calculate images in an N-dimensional sphere of the vector space whose radius is limited by the maximum total intensity of the map, and test if they are in one or both constraint sets. One would need only compute maps within the volume of the support constraint (those that are zero outside the support boundary) and test if the Fourier amplitudes of the map \(|\tilde {m}(\mathbf {q})|\) agree with the square root of the measured diffraction intensities, \(\sqrt {I(\mathbf {q})}\). Such an approach is obviously too computationally expensive for maps with more than a few voxels. A tractable approach would be, starting from a trial point in the vector space, to calculate the next map in a direction that minimizes the error \(\epsilon _M = \| |\tilde {m}| - \sqrt {I} \|\), where ∥⋅∥ is the Euclidean distance equal to the square root of the sum of the squares of the vector components. This can be easily achieved simply by setting the magnitudes \(|\tilde {m}|\) equal to \(\sqrt {I}\) at each reciprocal voxel q. However, such a step will tend to move m out of the support constraint, increasing the corresponding error 𝜖 S, so a correction will be needed to place the map back in that constraint space with the consequence of increasing 𝜖 M from zero.

The error 𝜖 M is the distance of the map m to the modulus constraint set. The error 𝜖 S, equal to the intensity outside the support, is also the distance of the map (a point in the vector space) to the support constraint set. By iterating the steps indicated above of bringing the point to first the modulus constraint set and then to the support constraint set, it may be possible to eventually converge to the intersection point that we seek. This indeed would be the case if the sets were convex volumes and if the point m was always brought to its closest point in each constraint set. The latter condition is accomplished using a projection operator. That is, an updated point \(m_{i+1}\) is obtained from the current estimate \(m_i\) as \(m_{i+1} = P_S P_M \, m_i\), where P S is the projection that brings a point in the vector space onto the support constraint set, and P M is the operator that brings a point onto the set of images that obey the Fourier modulus constraint. The repeated application of these operations approaches a fixed point \(m \rightarrow (P_S P_M)^n \, m\). This fixed point will be the global minimum of the distances between the sets if the former condition of the sets being convex is satisfied. However, the modulus constraint set is decidedly not convex, and so the procedure may become trapped in a local minimum. Nevertheless, this formalism of projections has proven valuable in developing robust algorithms that can recover the phases even when given noisy measured Fourier intensities.

The projection operators can be easily constructed for the constraints mentioned above. The support projector P S simply sets the values of all voxels outside the support to zero, which is the closest point in the vector space that satisfies the constraint. For the modulus constraint, \(\tilde {m}\) is brought into agreement with \(\sqrt {I}\) by rescaling the modulus of the complex value at each reciprocal voxel to equal \(\sqrt {I}\), leaving the phase unchanged. In the complex plane, for a particular reciprocal voxel q, this is the closest point to \(\tilde {m}(\mathbf {q})\) on the circle of radius \(\sqrt {I(\mathbf {q})}\). The modulus projection includes performing the Fourier transformation of the real-space map, and the inverse after rescaling:

$$\displaystyle \begin{aligned} P_M \, m(\mathbf{r}) = \mathcal{F}^{-1}_r\left\{\sqrt{\frac{I(\mathbf{q})}{|\tilde{m}(\mathbf{q})|{}^2}}\,\tilde{m}(\mathbf{q}) \right\}\;. \end{aligned} $$
(9.13)

The constraint errors can be seen to be equal to 𝜖 S = ∥P S m − m∥ and 𝜖 M = ∥P M m − m∥. In the latter case the error 𝜖 M is invariant to the Fourier transform (through Parseval’s theorem).

The algorithm \(m_n = (P_S P_M)^n \, m_0\) was introduced by Fienup [18] as a generalization of the first iterative phasing approach of Gerchberg and Saxton [21]. They considered the related problem of recovering the complex-valued image from the measured transmission image and measured diffraction pattern in an electron microscope. Fienup’s introduction of the support constraint brought the possibility of phasing diffraction data alone (without the need of a microscope). He called it the error reduction algorithm since the errors 𝜖 are non-increasing on each iteration step due to its equivalence to a steepest descent minimization [18]. However, due to the non-convexity of the constraint sets, it does not necessarily achieve the global minimum; often this simple algorithm gets trapped in a local minimum. This algorithm can be compared with density modification and solvent flattening that are used in crystal structure refinement, albeit without any structural model guiding the density in the volume inside the support. Perhaps a better analogy for the crystallographer is that this is an omit map where the entire molecule is omitted!
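The two projections and the error reduction iteration can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production implementation: it works on a small one-dimensional array (where uniqueness of the solution is not guaranteed, since Ω = 1), uses a hand-written discrete Fourier transform instead of an FFT library, and all names (`dft`, `project_support`, `project_modulus`, `distance`) and numerical values are hypothetical.

```python
import cmath
import math
import random

def dft(x, sign):
    """Unnormalized DFT with kernel exp(sign * 2*pi*i * j*k / N)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(sign * 2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]

def project_support(m, support):
    """P_S: set every voxel outside the support to zero."""
    return [v if inside else 0j for v, inside in zip(m, support)]

def project_modulus(m, sqrt_i):
    """P_M, Eq. (9.13): rescale each Fourier modulus to sqrt(I), keeping the phase."""
    m_tilde = dft(m, +1)
    rescaled = [s * a / abs(a) if abs(a) > 0 else complex(s)
                for a, s in zip(m_tilde, sqrt_i)]
    return [v / len(m) for v in dft(rescaled, -1)]

def distance(a, b):
    """Euclidean distance between two maps."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))

random.seed(0)
n, w = 32, 8                                   # array length and support width
support = [i < w for i in range(n)]
truth = [complex(random.random()) if inside else 0j for inside in support]
sqrt_i = [abs(a) for a in dft(truth, +1)]      # simulated noise-free "measurement"

# Start from a random map that already satisfies the support constraint.
m = project_support([complex(random.random(), random.random()) for _ in range(n)],
                    support)
errors = []
for _ in range(50):
    m_mod = project_modulus(m, sqrt_i)
    errors.append(distance(m_mod, m))          # eps_M = ||P_M m - m||
    m = project_support(m_mod, support)        # m_{i+1} = P_S P_M m_i
```

Tracking `errors` over the iterations shows the non-increasing behaviour that gives the algorithm its name; it does not by itself show that the true object has been recovered.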

Since the error reduction algorithm often stagnates, Fienup introduced concepts from control theory to design algorithms with “feedback” that improved their convergence properties. Elser expanded on these ideas with his difference map algorithm [14] which actively explores space away from the constraint sets in order to avoid stagnation. He first noted that an algorithm constructed as

$$\displaystyle \begin{aligned} m_n = (\mathbf{I} + \beta \varDelta)^n \, m_0 \end{aligned} $$
(9.14)

converges to a fixed point given by \(\varDelta\, m = 0\) for any constant \(\beta\) and operator \(\varDelta\). To ensure that this point is in the intersection of the two sets, \(\varDelta\) must take the form

$$\displaystyle \begin{aligned} \varDelta \, m = (P_M f_S - P_S f_M) \, m \end{aligned} $$
(9.15)

where \(f_S\) and \(f_M\) are any linear combination of operators for the support and modulus constraint, respectively. The key here is that the last operation in each term of Eq. (9.15) is \(P_M\) and \(P_S\), respectively, taking \(m\) to the surface of one or the other constraint set, and giving zero when these intersect. The other operators \(f_S\) and \(f_M\) can be designed to give optimum convergence properties, which Elser finds to be

$$\displaystyle \begin{aligned} f_S \, m &= [I+\alpha_S(P_S-I)]\, m \end{aligned} $$
(9.16)
$$\displaystyle \begin{aligned} f_M \, m &= [I+\alpha_M(P_M-I)]\, m \, . \end{aligned} $$
(9.17)

The real-valued parameter \(\alpha\) tunes these operators from the identity at \(\alpha = 0\) to a projection with \(\alpha = 1\) and a reflector (such as used in charge flipping) when \(\alpha = 2\). Some particular choices are \(\alpha_S = 0\) with \(\alpha_M = 2\) [44], or \(\alpha_S = 0\) with \(\alpha_M = 1 + 1/\beta\). The latter is Fienup’s hybrid input–output (HIO) algorithm [17]. The difference map algorithm of Eq. (9.14), like the HIO algorithm, tends to escape from local minima, sometimes by moving in a direction along the line of shortest approach between the two sets at the local minimum [28]. The solution is not the fixed point \(m\) to which the algorithm converges, but rather the nearest point on the constraint set, \(P_M m\).
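One update of Eqs. (9.14) to (9.17) can be written directly from the operator definitions. This is a sketch under stated assumptions: the helper names are hypothetical, the toy object is 2D, and the parameter values are simply those quoted in the text, passed through without any claim about optimal settings.

```python
import numpy as np


def P_S(m, support):
    """Support projector: zero outside the mask."""
    return np.where(support, m, 0)


def P_M(m, sqrt_I):
    """Modulus projector of Eq. (9.13)."""
    m_ft = np.fft.fftn(m)
    return np.fft.ifftn(sqrt_I * np.exp(1j * np.angle(m_ft)))


def difference_map_step(m, support, sqrt_I, beta, alpha_S, alpha_M):
    """Return (m + beta * Delta m, Delta m) following Eqs. (9.14)-(9.17)."""
    f_S = m + alpha_S * (P_S(m, support) - m)      # Eq. (9.16)
    f_M = m + alpha_M * (P_M(m, sqrt_I) - m)       # Eq. (9.17)
    delta = P_M(f_S, sqrt_I) - P_S(f_M, support)   # Eq. (9.15)
    return m + beta * delta, delta


# Toy data: random object in a 12 x 12 corner of a 32 x 32 grid
rng = np.random.default_rng(2)
support = np.zeros((32, 32), dtype=bool)
support[:12, :12] = True
truth = np.where(support, rng.random((32, 32)), 0.0)
sqrt_I = np.abs(np.fft.fftn(truth))

beta = 0.9
m = rng.random((32, 32))
for _ in range(20):
    m, delta = difference_map_step(m, support, sqrt_I, beta,
                                   alpha_S=0.0, alpha_M=1 + 1 / beta)

# A point in the intersection of both sets is a fixed point: Delta m = 0
_, delta_truth = difference_map_step(truth, support, sqrt_I, beta,
                                     alpha_S=0.0, alpha_M=1 + 1 / beta)
```

The final check illustrates the fixed-point property only; whether and how quickly a random start reaches such a point depends on the object and on the parameters.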

9.2.6 Phasing Twinned Data

As will be seen below, continuous diffraction often arises from ensembles of objects that are situated in several discrete orientations, without correlation of the relative positions of those objects. Examples are molecules aligned in a laser field (which may be oriented parallel or antiparallel to the polarization axis of the laser with equal probability, Sect. 9.3.1) or the four orientations of molecules in a crystal with \(P2_12_12_1\) symmetry and exhibiting displacement disorder (Sect. 9.3.7). This results in an incoherent sum of the diffraction intensities in \(N_b\) various orientations, \(I(\mathbf {q}) = \sum _{b=1}^{N_b} \left |\tilde \rho _b(\mathbf {q}) \right |{ }^2\), analogous to the diffraction from a twinned crystal. Such an incoherent sum cannot be represented by the square modulus of the Fourier transform of any single object (including the average over all orientations) and thus the application of the phasing algorithms described above will not succeed. Assuming that each of the \(\rho_b\) are differently oriented versions of the same rigid object as described in Eq. (9.4), the modulus constraint \(P_M\) of Eq. (9.13) must be modified simply as [16]

$$\displaystyle \begin{aligned} P_M \, m_1(\mathbf{r}) = \mathcal{F}^{-1}_r\left\{\sqrt{\frac{I(\mathbf{q})}{\sum_b |\tilde{m}_b(\mathbf{q})|{}^2}}\,\tilde{m}_1(\mathbf{q}) \right\}\;, \end{aligned} $$
(9.18)

where \(m_1\) is the iterate of the single object reconstruction and \(m_b\) are the rotated versions, \(m_b(\mathbf{r}) = m_1({\mathbf{R}}_b\, \mathbf{r})\).
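The modified projector of Eq. (9.18) can be illustrated with a 2D toy model. In the sketch below the four in-plane 90° rotations produced by `np.rot90` stand in for the operations \({\mathbf{R}}_b\); this choice, the function names, and the random test data are all assumptions made for illustration. Because these rotations form a group, the summed intensity is itself symmetric and the rescaled iterate reproduces it exactly.

```python
import numpy as np


def rotated_copies(m1, n_orient=4):
    """m_b(r) = m_1(R_b r) for in-plane rotations by multiples of 90 degrees."""
    return [np.rot90(m1, k) for k in range(n_orient)]


def twinned_intensity(m1, n_orient=4):
    """Incoherent sum over orientations of |FT(m_b)|^2."""
    return sum(np.abs(np.fft.fft2(mb)) ** 2
               for mb in rotated_copies(m1, n_orient))


def project_modulus_twinned(m1, I, n_orient=4, eps=1e-12):
    """Eq. (9.18): rescale FT(m_1) by sqrt(I / sum_b |FT(m_b)|^2)."""
    denom = np.maximum(twinned_intensity(m1, n_orient), eps)  # avoid 0/0
    scale = np.sqrt(I / denom)
    return np.fft.ifft2(scale * np.fft.fft2(m1))


# Toy data on a square grid (np.rot90 needs equal side lengths here)
rng = np.random.default_rng(3)
truth = rng.random((16, 16))
I_twinned = twinned_intensity(truth)

guess = rng.random((16, 16))
projected = project_modulus_twinned(guess, I_twinned)
```

After the projection, the twinned intensity computed from `projected` matches `I_twinned`, while the Fourier phases of the single-object iterate are left untouched.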

As emphasized by Elser and Millane [16], phasing the twinned data is feasible as long as the constraint ratio \(\Omega > 1\). Twinning will always reduce \(\Omega\) from the ratio \(\Omega_1\) for the single object, but it will only be as small as \(\Omega_1/N_b\) for \(N_b\) orientations if the support is invariant under the rotation operations \({\mathbf{R}}_b\). Generally it will lie between these bounds, as can be understood by constructing the union of the rotated supports of the object and determining the volume of the unique region as related by the symmetry operations. A molecule contained within a square prism support, for example, gives rise to a support autocorrelation that is also a square prism. If the length of this prism is greater than the width of the square face and the prism is not oriented along a symmetry axis of the crystal itself, then the symmetrized support autocorrelation has some non-overlapping regions, with the possibility that \(\Omega\) is considerably greater than unity.

9.3 Observations of Continuous Diffraction

Given that the lack of measured phases does not pose an obstacle to reconstructing an image of the electron density from a set of sufficiently sampled diffraction intensities, it is worth reviewing under which conditions the required continuous diffraction can be measured for the purpose of iterative phasing.

9.3.1 Single Object and a Gas of Aligned Identical Objects

The most obvious case for iterative phasing is that of a single non-periodic object illuminated with a coherent beam. The first demonstration of a fully sampled coherent diffraction pattern measured in the X-ray regime was by Miao et al. [31] after many years of effort led by David Sayre [41]. That the pattern was sufficiently sampled was proven by the fact that its Fourier transform gives an autocorrelation of limited extent. The real test of sufficiency, however, was that it could indeed be phased, by using an iterative phasing algorithm. The object was two-dimensional, fabricated by electron-beam lithography, and diffraction in only one view was needed to obtain the two-dimensional image. Such phasing is much more robust in three dimensions, for which diffraction must be measured also in three dimensions by rotating the sample. This is similar to data collection from a crystal, although still diffraction data frames are recorded as the object is rotated in steps. Each data frame, a measurement of I(q) on the Ewald sphere, is then interpolated into a three-dimensional array, as illustrated in Fig. 9.2. In this example [7], the object consisted of an indent in a silicon nitride membrane that contained a number of colloidal gold particles. The silicon nitride was practically invisible to the X-rays due to its low scattering power, giving a compact object whose diffraction could be phased directly using the Shrinkwrap algorithm [29].

Fig. 9.2

Diffraction data collected from a 3D test object, showing (a) a diffraction pattern recorded at a single orientation, (b) 3D diffraction intensities collected at orientations from −70° to +70°, and (c) the reconstructed volume image. Adapted from [7]

Diffractive imaging of single objects requires a high dose to the sample, which may exceed the tolerable dose to avoid damage at a particular resolution [25]. All methods to observe continuous diffraction at near atomic resolution from biological material therefore require measuring diffraction from many identical objects. One way to work around the dose limitation is by using femtosecond pulses from free-electron lasers, as described in the previous chapters in this book. Since the pulses are destructive, most schemes only give a two-dimensional snapshot diffraction pattern and so a supply of reproducible objects is still required to obtain a complete 3D diffraction dataset, as discussed in Chap. 14. Continuous diffraction can be readily combined from many objects if they share a common orientation. The diffraction will be the incoherent sum of the individual objects, equal to that of a single object multiplied by the number of objects, if the diffraction from these is mutually incoherent or the positions of the objects are random (as will be discussed below). For example, a gas of laser-aligned molecules [24] would give a diffraction pattern proportional to the single molecule, as would a long exposure made of a stream of aligned objects that sequentially pass across the beam [48]. Without alignment, enough signal is required per pattern to determine relative orientations of the particles, as detailed in Chap. 14.

9.3.2 Single Layers and Fibrils

Objects often come into alignment when placed in contact with each other. The most spectacular self-assembly of this kind is of course crystallization, but other examples include liquid crystals in the nematic phase where constituents are aligned in one of their dimensions. Such arrangements might give useful information about their cylindrically averaged density, for example. An early application of iterative phasing to such partially oriented systems was to biological membranes containing proteins. The arrangement of the proteins was disordered within the plane, with their orientations fixed only in the direction normal to the plane, but this gives a well-defined density thickness profile of the membrane. Such membranes can be layered, giving rise to a single column of Bragg peaks in the ordered direction, or a continuous rod of diffraction intensity, depending on the regularity of the spacings of the layers. In the latter case the Fourier transform of the intensity rod gives the entire autocorrelation of the thickness profile of the membrane. Stroud and Agard introduced the idea of phasing this with a compact support [50] although later understanding as discussed in Sect. 9.2.4 showed that the density profile is not uniquely specified by 1D Fourier magnitudes without additional information [2]. Spence et al. [47] showed this could be overcome in 2D crystals where the diffraction is in the form of a lattice of intensity rods. An analogous case is a one-dimensional crystal, such as a single fibril that is periodic along its axis. The diffraction from this consists of two-dimensional planes of continuous diffraction separated by the reciprocal lattice spacing. The information content of the diffraction is certainly higher than for a two-dimensional crystal, although Millane showed that the solution to the phase problem is not generally unique without an additional constraint such as the positivity of the electron density [33].

9.3.3 Finite Crystals and Finitely Illuminated Crystals

Deviations from an ideal infinite crystal can in principle give access to information additional to that restricted to the Bragg peaks. The diffraction of a finite crystal was considered by Laue [53] who derived the result that the diffracted wavefield is equal to the convolution of the Fourier transform of the shape of the coherently illuminated crystal with the delta-function Bragg peaks of the infinite crystal. That is, the 3D crystal “shape transform” is laid down on each lattice point. For a crystal with flat facets, this transform consists of continuous truncation rods in directions normal to the facets, giving an opportunity to measure the underlying molecular transform at locations away from the Bragg peaks. Elser suggested that this could be used to obtain, in addition to intensities at Bragg peaks I(q hkl), measurements of the gradient of the intensities, ∇I(q hkl). The extra three independent values per Bragg peak increase the constraint ratio Ω and were shown to be enough to solve the structure by iterative phasing [15]. Spence et al. [49] extended this idea to an ensemble of crystals measured by serial crystallography, as described in Chap. 8, by dividing out the shape transform from the 3D diffraction intensity map, allowing the electron density of the unit cell to be recovered by iterative phasing. A similar effect can be obtained when a small focused beam illuminates the crystals. If the relative position of the beam and crystal is known on each shot, the dataset can be interpreted by the method of ptychography [27]. However, if the diffraction intensities are summed without regard to this relative displacement, the result is merely a convolution of the diffraction pattern that would usually be observed (with a collimated beam) with the angular distribution of the focused probe. 
This convolution operation can also be carried out mathematically on data recorded with a collimated beam, or by using a detector with large pixels, and thus does not bring any new information. The sum described above also simulates the situation of illumination of a crystal with a beam of limited spatial coherence, showing that in that case no new information is revealed either, despite the hope that this might mask the periodicity of the crystal and hence provide continuous diffraction of a single unit cell [52].

9.3.4 Crystal Swelling

In the early days of protein crystallography it was noted that the unit cells of some crystals expand or contract in different states of hydration. Bernal et al. suggested that measurements of such crystals could be used to map out the molecular transform with a fine sampling [3]. This is easy to imagine for a crystal of P1 symmetry, where a change in a unit cell dimension made just by changing the distance between molecules will cause Bragg peaks to move over the transform of the single molecule. The measurement of diffraction from crystals in several states would allow mapping the molecular transform in steps along a direction set by the swelling (e.g., in the 111 direction for a crystal that uniformly swells in all directions). For the P1 crystal, this “one dimensional” fine sampling should be enough to provide complete information for phase retrieval, at least to the resolution that the molecules remain identical (and in the same orientation) in the different crystal forms.

The situation is not so straightforward when the arrangements of molecules change upon swelling in crystals of other symmetries. Bragg and Perutz carried out measurements of a set of hemoglobin crystals with differing amounts of salt content [4]. In that case the crystals underwent a shear through a change in the angle β, without any significant change in the unit cell lengths. This could happen if whole layers of molecules in ab planes slipped relative to each other in the direction of the b axis. This then allowed for fine sampling of layer lines of the diffraction pattern in the c direction. However, only in the 00l direction could the ensemble of measurements be easily interpreted. The 0kl central plane of the diffraction pattern is the Fourier transform of the projection of the crystal structure down the a axis, and since molecules were slipping only in the b direction, this projection would be unchanged along c. Even in simpler cases of expansion without shear, the change of displacements of molecules within a unit cell means that even though fine samples can be made from different crystal forms, each of these samples possesses a different unit cell transform. So far, a method to apply iterative phasing to such a dataset has not been found, but it is clear that the information content is twice that of a single crystal, which should allow de novo phasing. Structure refinement from multiple crystal forms can be carried out using density modification techniques, as in the programs phenix.multi_crystal_average and DMMulti [11].

9.3.5 Crystals with Large Solvent Fraction

As discussed in Sect. 9.2.4, iterative phasing should be feasible when the constraint ratio Ω exceeds unity, regardless of whether the measurements are from a continuous pattern or from Bragg spots. This ratio is equal to the number of independent coefficients in the autocorrelation function divided by the number of independent coefficients describing the object’s density. For a P1 crystal (without any disorder) this will be half, since the autocorrelation function has the same periodicity as the crystal and it is centrosymmetric. However, if the object only actually accounts for less than half the volume of the unit cell, and the rest consists of solvent of uniform average density, then we will have Ω > 1. Furthermore, if there are two or more identical objects in the unit cell that are not related by crystallographic symmetry, Ω can exceed unity even for solvent fractions smaller than 50% [34]. Iterative phasing can indeed succeed in this case, using only the Bragg intensities, as was recently demonstrated by He and Su [22] using Fienup’s hybrid input–output algorithm (α S = 0 with α M = 1 + 1∕β in Eqs. (9.16) and (9.17)).
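The counting argument for the P1 case can be written out as a rough worked estimate. It assumes, as an illustrative simplification rather than a formula quoted from the references, that the number of independent coefficients is proportional to the volume occupied. Writing \(V\) for the unit cell volume and \(s\) for the solvent fraction (a symbol introduced here only for this estimate), the centrosymmetric autocorrelation contributes \(V/2\) independent coefficients and the object \((1-s)V\), so that

$$\displaystyle \begin{aligned} \Omega \approx \frac{V/2}{(1-s)\,V} = \frac{1}{2\,(1-s)} \;. \end{aligned} $$

This gives \(\Omega = 1/2\) for \(s = 0\), \(\Omega = 1\) at \(s = 1/2\), and, for example, \(\Omega \approx 1.67\) for a solvent fraction of 70%.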

9.3.6 Crystals with Substitutional Disorder

Bragg peaks are a consequence of translational symmetry. Any deviation from that symmetry will disturb the constructive interference responsible for the peaks, reducing their intensities, and will also prevent the full cancellation of intensity between the peaks. One way to break this symmetry and still maintain molecular orientation is through substitutional disorder in the crystal. That is, a random occupation of lattice sites by a molecule or, more likely, sites randomly occupied by one or the other of two forms of a molecule can be described as the sum of a purely periodic density (given by the average structure \(\bar {\rho }(\mathbf {r})\)) and the difference Δρ(r) [10].

As an example consider a time-resolved experiment where an optical excitation pulse is set to a fluence where only half the molecules are isomerized. This will occur randomly throughout the crystal volume, and so the crystal can be considered as randomly occupied by two molecular structures, one in the ground state with a structure ρ 1(r) and one with a structure ρ 2(r). Considering for simplicity a P1 symmetry with just one molecule per unit cell, the density of this imperfect crystal can be described by a modification of Eq. (9.4) as

$$\displaystyle \begin{aligned} \rho(\mathbf{r}) &= \bar{\rho}(\mathbf{r})+\varDelta\rho(\mathbf{r}) \\ &= \left[\bar{\rho}_b(\mathbf{r}) \otimes \sum_c \delta(\mathbf{r}-{\mathbf{r}}_c) \right]+ \left[ \varDelta \rho_b(\mathbf{r}) \otimes \sum_c p_c \, \delta(\mathbf{r}-{\mathbf{r}}_c) \right]\;, {} \end{aligned} $$
(9.19)

where p c is equal to either + 1 or − 1 in the case of 50% excitation. The diffraction intensity is equal to the Fourier transform of the autocorrelation of the density ρ(r) as shown in Eq. (9.11). The autocorrelation of the sum Eq. (9.19) gives rise to four terms, given by the autocorrelation of \(\bar {\rho }(\mathbf {r})\), the autocorrelation of Δρ(r), and two cross correlation terms, \(\bar {\rho }(\mathbf {r}) \otimes \varDelta \rho (-\mathbf {r}) + \bar {\rho }(-\mathbf {r}) \otimes \varDelta \rho (\mathbf {r})\). Since \(\bar \rho \) is periodic, each of these cross correlations will essentially carry out a sum of Δρ over all unit cells which will be equal to zero since (by definition) \(\left <\varDelta \rho \right > = 0\). The autocorrelation is therefore a sum of just the autocorrelations of \(\bar {\rho }(\mathbf {r})\) and Δρ(r), showing that the diffraction pattern is the incoherent sum \(\left |\tilde {\bar {\rho }}(\mathbf {q})\right |{ }^2 + \left |\varDelta \tilde {\rho }(\mathbf {q})\right |{ }^2\), where \(\tilde {\bar {\rho }}\) is the Fourier transform of \(\bar {\rho }\) and \(\varDelta \tilde {\rho }\) is the Fourier transform of Δρ. Since \(\bar {\rho }(\mathbf {r})\) is strictly periodic, the first term will give rise to Bragg peaks. The residual density Δρ(r) consists of components that are at lattice positions but differ from each other in a random manner and hence are not periodic. Again, considering the autocorrelation of Δρ(r) (see Eq. (9.11)), for small differences u one obtains the sum of cross correlations of difference densities within common unit cells. If the occupancies p c are uncorrelated, once u crosses unit cell boundaries the terms \(p_cp_{c'}\) cancel out, leaving just the autocorrelation of the single “object” Δρ b. The Fourier transform of this correlation of limited extent is of course continuous. 
In general, for a fraction x of excited molecules in the crystal, the continuous diffraction will be weighted by x(1 − x) [10] giving

$$\displaystyle \begin{aligned} I(\mathbf{q})= \left|\tilde{\bar{\rho}}_b(\mathbf{q})\,\sum_c \exp(2 \pi i {\mathbf{r}}_c\cdot \mathbf{q})\right|{}^2 + x(1-x) \left|\varDelta \tilde{\rho}_b(\mathbf{q})\right|{}^2 \;. \end{aligned} $$
(9.20)
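The weighting \(x(1-x)\) of the continuous term can be checked with a small Monte Carlo sketch of a one-dimensional lattice whose sites are independently occupied by the excited form with probability \(x\). The function name, lattice size, and sampling grid below are hypothetical choices, and the check concerns only the occupancy statistics; the molecular factor is taken to be the transform of the difference between the two forms, which is an assumption about the normalization of \(\varDelta \tilde{\rho}_b\).

```python
import numpy as np


def diffuse_weight(x, n_cells=64, n_trials=400, seed=0):
    """Estimate <|sum_c (o_c - x) exp(2 pi i q c)|^2> / N_c for random occupancies.

    o_c is 1 for an excited cell (probability x) and 0 otherwise, so (o_c - x)
    is the zero-mean part of the occupancy that produces continuous diffraction.
    """
    rng = np.random.default_rng(seed)
    q = np.linspace(0.05, 0.95, 19)                 # reciprocal coordinate in units of 1/a
    c = np.arange(n_cells)
    phases = np.exp(2j * np.pi * np.outer(q, c))    # shape (n_q, n_cells)
    occupied = rng.random((n_trials, n_cells)) < x  # uncorrelated occupation
    fluctuation = occupied - x                      # shape (n_trials, n_cells)
    amplitude = fluctuation @ phases.T              # shape (n_trials, n_q)
    return np.mean(np.abs(amplitude) ** 2) / n_cells


w_half = diffuse_weight(0.5)
w_fifth = diffuse_weight(0.2)
```

Within the statistical error of the simulation, `w_half` is close to 0.25 and `w_fifth` to 0.16, that is, to \(x(1-x)\) per unit cell. Correlated occupancies, as in the example discussed later in this section, would modulate this flat weighting.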

For a crystal consisting of several objects per unit cell (following Eq. (9.5)), Eq. (9.20) generalizes to

$$\displaystyle \begin{aligned} I(\mathbf{q})= \left|\sum_b\tilde{\bar{\rho}}_b(\mathbf{q}) L(\mathbf{q})\right|{}^2 + x(1-x) \sum_b\left|\varDelta \tilde{\rho}_b(\mathbf{q})\right|{}^2 \;. \end{aligned} $$
(9.21)

The second term can be separated from the first by filtering out Bragg peaks (see Sect. 9.5) and then iteratively phased using a finite support constraint, and the modulus constraint of either Eq. (9.13) or (9.18) depending on the number of orientations of molecules in the crystal [51]. Since Δρ can be negative it is not appropriate to apply a positivity constraint.

Substitutional disorder exists in crystals of tris-t-butyl-benzene tricarboxamide [45]. This molecule crystallizes into a so-called two-component crystal with random occupation of one or the other component. In this case the two components consist of the molecule in one of two different orientations. These actually occur in columns, parallel to the c axis, of molecules of the same orientation. Looking down this axis one observes columns in a hexagonal close-packed lattice either pointing away or towards the observer. Although the occupational fraction x is 50%, there is a correlation between the positions of “up” and “down” columns due to a preference of antialignment of neighbors but a frustration in achieving this in a triangular lattice. This is revealed in the observed continuous diffraction as a characteristic honeycomb shape. The form of this correlation has been of interest to understand the solid state of the molecule, and the correlation function could be obtained by dividing the effect of the molecular contribution Δρ from the pattern [42] in a process somewhat similar to that described in Sect. 9.3.3. (Eq. (9.20) only considers uncorrelated occupation.) However, in a beautiful analysis, Simonov et al. did the opposite to extract \(|\varDelta \tilde {\rho }(\mathbf {q})|{ }^2\) which they then phased to obtain an atomic resolution image of the molecule [46].

9.3.7 Crystals with Displacement Disorder

The least amount of change that needs to be made to a crystal to disrupt translational symmetry is to randomly displace its elements. As with substitutional disorder (Sect. 9.3.6), if the mean displacement of the objects in the crystal is zero, then the imperfect crystal can be described as a repeat of an average unit cell \(\bar {\rho }\) with strict translational symmetry and a difference term. As before, the correlations between the differences and the average sum to zero (since, by definition the mean of the difference is zero), so that the general form of the autocorrelation (and hence the diffraction intensities) is an incoherent sum of the periodic part, which gives Bragg peaks, and the non-periodic difference which gives continuous diffraction. However, as compared with substitutional disorder, the average unit cell is blurred out. We can assume small normally distributed displacements, which leads to a blurring given by the convolution of the unit cell density with a Gaussian. The effect of this convolution is to modulate the Bragg peaks by the well-known Debye-Waller factor \(\exp (-4 \pi ^2 \sigma ^2 q^2)\), for a root-mean square displacement σ.

The form of the continuous (diffuse) scattering depends on the object undergoing the displacement, and the nature of correlations of those displacements over the volume of the illuminated crystal. Below we make a general derivation of the diffraction intensities for a crystal consisting of randomly displaced molecules which themselves have randomly displaced atoms, with different correlations between them. A particularly favorable condition is when whole molecules move as rigid units. Again, choosing for simplicity the case of P1 symmetry with just one molecule per unit cell the density of the imperfect crystal is given by

$$\displaystyle \begin{aligned} \rho(\mathbf{r})=\rho_b(\mathbf{r}) \otimes \sum_{c=1}^{N_c} \delta(\mathbf{r}-{\mathbf{r}}_c-\mathbf{\delta}_c) \;, \end{aligned} $$
(9.22)

where \(\boldsymbol{\delta}_c\) is the displacement of the molecule in the \(c\)th unit cell, with \(\left < \mathbf {\delta }_c \right > = 0\) and \(\left < \mathbf {\delta }_c^2 \right > = \sigma ^2\). The autocorrelation of Eq. (9.22) will be equal to the autocorrelation of \(\rho_b(\mathbf{r})\) convolved with the autocorrelation of the displaced lattice. This is the cross correlation of two blurred lattices (each with mean-square displacement \(\sigma^2\)). Since we assume here that there is no correlation between the displacements of different unit cells, the displacements \(\mathbf{u}\) between the cells will be spread by a distribution of variance equal to \(2\sigma^2\), for all peaks in this autocorrelation except for the lattice point at the origin (which has perfect correlation). The result, derived below in Sect. 9.4.2, is

$$\displaystyle \begin{aligned} I(\mathbf{q})=|\tilde{\rho}(\mathbf{q})|{}^2 \left[N_c (1-e^{-4 \pi^2 \sigma^2 q^2}) + e^{-4 \pi^2 \sigma^2 q^2}\, L(\mathbf{q}) \right]\;. \end{aligned} $$
(9.23)

The continuous part of the diffraction pattern consists of the single-object diffraction modulated by the so-called complementary Debye-Waller factor \((1-e^{-4 \pi ^2 \sigma ^2 q^2})\) which is zero at \(q = 0\) and approaches 1 for resolution lengths \(d = 1/q < 2\pi\sigma\). The continuous and Bragg portions of the diffraction pattern can be separated from each other as discussed in Sect. 9.5, but it is seen in Eq. (9.23) that both sample the same single-object diffraction pattern. This is only true for a P1 crystal however. In other cases, the continuous diffraction is twinned in a similar fashion to the case of substitutional disorder (see Sect. 9.4.4), requiring phasing to be carried out using the modulus constraint of Eq. (9.18). Phasing with the continuous diffraction alone misses low-resolution information due to the complementary Debye-Waller factor, as discussed in Sect. 9.4.8.
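The bracketed factor of Eq. (9.23) can be compared with a direct simulation of a one-dimensional lattice of point scatterers with independent Gaussian displacements. This is a sketch under simplifying assumptions: one dimension, \(\sigma\) taken as the RMS displacement along \(\mathbf{q}\), and hypothetical function names and parameter values.

```python
import numpy as np


def debye_waller(q, sigma):
    """Debye-Waller factor exp(-4 pi^2 sigma^2 q^2) for RMS displacement sigma."""
    return np.exp(-4 * np.pi**2 * sigma**2 * q**2)


def lattice_factor_expected(q, sigma, n_cells, a=1.0):
    """Bracket of Eq. (9.23): N_c (1 - DW) + DW * L(q) for an ideal lattice sum L."""
    c = np.arange(n_cells) * a
    L = np.abs(np.exp(2j * np.pi * np.outer(q, c)).sum(axis=1)) ** 2
    dw = debye_waller(q, sigma)
    return n_cells * (1 - dw) + dw * L


def lattice_factor_simulated(q, sigma, n_cells, a=1.0, n_trials=2000, seed=0):
    """Average |sum_c exp(2 pi i q (r_c + delta_c))|^2 over random displacements."""
    rng = np.random.default_rng(seed)
    ideal = np.arange(n_cells) * a
    pos = ideal + rng.normal(0.0, sigma, size=(n_trials, n_cells))
    amp = np.exp(2j * np.pi * pos[:, None, :] * q[None, :, None]).sum(axis=2)
    return np.mean(np.abs(amp) ** 2, axis=0)


q = np.array([0.5, 1.0, 2.5])   # q = 1.0 is a Bragg position for a = 1
expected = lattice_factor_expected(q, sigma=0.1, n_cells=32)
simulated = lattice_factor_simulated(q, sigma=0.1, n_cells=32)
```

At the two off-Bragg values of \(q\) the ideal lattice sum vanishes and only the continuous term \(N_c(1-e^{-4\pi^2\sigma^2q^2})\) survives, growing with \(q\); at the Bragg position both terms contribute.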

The complementarity of the Debye-Waller factors in the two terms within the square brackets of Eq. (9.23) shows that the strength of the continuous and Bragg diffraction is comparable. The integrated intensity of Bragg peaks (not including the molecular transform) scales as N c, as does the continuous term. The difference between these cases is that the Bragg counts are concentrated into easily measurable peaks whereas the continuous diffraction is spread over all detector pixels. By energy conservation the total counts in the pattern will not change as σ is varied, although the resolution at which one or the other dominates will. At a resolution where the continuous diffraction dominates, the counts per pixel would be equivalent to the average Bragg counts per pixel at that resolution, had the crystal been perfectly ordered. In such a case, if Bragg peaks were spaced apart by 10 pixels on average, for example, then the average count is actually only 1% of the peak height. Such levels are usually below the background noise, as discussed in Sect. 9.5.

It is interesting to note that the loss of Bragg peaks with resolution, and the corresponding increase in the continuous diffraction, does not need large displacements. For example, RMS displacements of 1 Å give a significant loss of Bragg intensity at a resolution of about 6 Å as can be seen from the expression for the Debye-Waller factor and the definition of q in Eq. (9.3). This large discrepancy is due to the fact that Bragg peaks are formed through constructive interference and are thus a phase effect. At the example resolution of 6 Å, a displacement of any one object by a c = 3 Å would cause its contribution, as seen in Eq. (9.7), to be completely out of phase and adding destructively. Displacements of 1 Å will already give contributions out of phase enough to suppress the formation of Bragg peaks. This extreme sensitivity of Bragg peaks to displacements may explain to a large degree the difficulty of recording high-resolution diffraction from protein crystals, especially those where the molecules are not highly constrained through crystal contacts. However, we will see in the following section that this very sensitivity can expose the continuously sampled Fourier transform which opens up exciting possibilities for de novo structure determination from crystal diffraction using iterative phasing.
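These figures can be verified by a short calculation, taking \(q = 1/d\) as used above, \(\sigma = 1\) Å and \(d = 6\) Å:

$$\displaystyle \begin{aligned} e^{-4\pi^2\sigma^2 q^2} = \exp\!\left(-\frac{4\pi^2\,(1\,\text{Å})^2}{(6\,\text{Å})^2}\right) \approx e^{-1.10} \approx 0.33 \;, \qquad 2\pi\,\frac{3\,\text{Å}}{6\,\text{Å}} = \pi \;. \end{aligned} $$

The Bragg intensity at 6 Å is thus reduced to roughly one third of its value for the perfectly ordered crystal, and a displacement of 3 Å corresponds to a phase shift of \(\pi\), that is, to a contribution exactly out of phase.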

9.4 Diffraction from Crystals with Displacement Disorder

After the survey in Sect. 9.3 of the different types of crystal imperfections which give the opportunity to reveal additional structural information about the molecular components, not present in Bragg peaks, in this section we expand upon the analysis of crystals with displacement disorder. This topic has been discussed previously [35, 36, 39], but here we will focus on the connection with continuous diffraction of the objects that undergo the displacements. After introducing a general formalism, we consider different types of random displacements which may be correlated or uncorrelated. We examine the cases of translation of entire unit cells in Sect. 9.4.2, the case of displacements of molecules which themselves exhibit atomic disorder in Sect. 9.4.3, multiple types of rigid object that are independently displaced (Sect. 9.4.4), and then two models which include correlations in the displacements. These are the liquid-like motions model of distortions within rigid bodies examined in Sect. 9.4.5 and displacements of rigid bodies that are influenced by their neighbors in Sect. 9.4.6. Finally, we look at the effects of rotational rigid-body disorder in Sect. 9.4.7. Each of these cases is illustrated with a simulation of crystals of the lysozyme molecule as described in Box 9.1 and depicted in Fig. 9.3.

Fig. 9.3

(a) Molecular transform intensities, I o(q), shown in a single planar slice passing through q = 0. (b) Ideal crystal intensities, \(I_o(\mathbf {q}) \sum L_{cc'}(\mathbf {q})\) when there are no displacements. In both cases, for clarity, the atomic form factors have been set to constants (see Box 9.1)

In order to give the most general formalism of a crystal with translational disorder we make use of the description of such an object as a collection of atoms, given by Eq. (9.8).

$$\displaystyle \begin{aligned} \rho(\mathbf{r})=\sum_{i=1}^N \zeta_i(\mathbf{r}-{\mathbf{r}}_i)\,. \end{aligned} $$
(9.8 revisited)

We separate the position of each atom in the entire crystal into the sum of the position of the unit cell, the position of the atom within the unit cell, and the displacement from that ideal position,

$$\displaystyle \begin{aligned}{\mathbf{r}}_i = {\mathbf{r}}_c + {\mathbf{r}}_a + \boldsymbol{\delta}_{ac} \; .\end{aligned}$$

Converting the sum of scattered waves from all atoms into a double sum over all unit cells and over all atoms within the unit cell, Eq. (9.9) becomes

$$\displaystyle \begin{aligned} I(\mathbf{q}) &= \sum_{cc'} e^{2\pi i ({\mathbf{r}}_c-{\mathbf{r}}_{c'})\cdot \mathbf{q}}\sum_{aa'}f_a(\mathbf{q}) f^*_{a'}(\mathbf{q})e^{2\pi i ({\mathbf{r}}_a-{\mathbf{r}}_{a'})\cdot \mathbf{q}}e^{2\pi i (\boldsymbol{\delta}_{ac}-\boldsymbol{\delta}_{a'c'})\cdot \mathbf{q}}\\ &= \sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) \sum_{aa'} \mathrm{I}_{aa'}(\mathbf{q})e^{2\pi i (\boldsymbol{\delta}_{ac}-\boldsymbol{\delta}_{a'c'})\cdot \mathbf{q}} \end{aligned} $$
(9.24)

where \(\sum_{cc'}\mathrm {L}_{cc'}\) is the lattice sum of the crystal, which converges to a set of delta functions at the reciprocal lattice points [35, 36]. \(\mathrm {I}_{aa'}\) is the contribution of each pair of atoms in their ideal positions (within one unit cell). The term \(\sum _{aa'} \mathrm {I}_{aa'}(\mathbf {q}) = |\sum _a f_a(\mathbf {q})\,e^{2\pi i {\mathbf{r}}_a\cdot \mathbf{q}}|{ }^2 = I_o(\mathbf {q})\) is commonly called the molecular transform of the molecule (Fig. 9.3a) since it is the squared modulus of the Fourier transform of the molecular electron density function. Note that we have made no assumption as to the form of correlations between atoms, rigid bodies, or unit cells, just that they are statistically random.

Since the displacements are statistically random (though not necessarily uncorrelated), one is really interested in the average phase contribution from all the atoms. Thus, we can rewrite the expression for the intensity as follows:

$$\displaystyle \begin{aligned} I(\mathbf{q}) = \sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) \sum_{aa'} \mathrm{I}_{aa'}(\mathbf{q}) \left<e^{2\pi i (\boldsymbol{\delta}_{ac}-\boldsymbol{\delta}_{a'c'})\cdot \mathbf{q}}\right>\,. \end{aligned} $$
(9.25)

Using the harmonic approximation for small displacements, the average over the exponentials can be simplified as

$$\displaystyle \begin{aligned}\left<e^{2\pi i (\boldsymbol{\delta}_{ac}-\boldsymbol{\delta}_{a'c'})\cdot \mathbf{q}}\right>= e^{-2\pi^2 \left<((\boldsymbol{\delta}_{ac}-\boldsymbol{\delta}_{a'c'})\cdot \mathbf{q})^2\right>}\,.\end{aligned}$$

With this simplification, we get the following general expression for the intensity distribution of a disordered crystal:

$$\displaystyle \begin{aligned} I(\mathbf{q}) = \sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) \sum_{aa'} \mathrm{I}_{aa'} (\mathbf{q}) e^{-2\pi^2 (\left<(\boldsymbol{\delta}_{ac}\cdot \mathbf{q})^2\right> + \left<(\boldsymbol{\delta}_{a'c'}\cdot \mathbf{q})^2\right>)} e^{4\pi^2 \left<(\boldsymbol{\delta}_{ac}\cdot \mathbf{q})(\boldsymbol{\delta}_{a'c'}\cdot \mathbf{q})\right>}\,. \end{aligned} $$
(9.26)

The non-trivial behavior is now encoded in the last term which represents the covariance of the displacements among different atoms. This can be seen more clearly with the following:

$$\displaystyle \begin{aligned}\left<(\boldsymbol{\delta}_{ac}\cdot \mathbf{q})(\boldsymbol{\delta}_{a'c'}\cdot \mathbf{q})\right> = {\mathbf{q}}^\intercal \left<\boldsymbol{\delta}_{ac}^\intercal \boldsymbol{\delta}_{a'c'}\right>\mathbf{q}\,.\end{aligned}$$

Here, the central term is the 3 × 3 covariance matrix of displacements between any pair of atoms, \({\mathbf {C}}_{ac\,a'c'}\) (see Footnote 2). One can also recognize the self terms \(\left <(\boldsymbol {\delta }_{ac}\cdot \mathbf {q})^2\right >\) as being the standard anisotropic B-factors (or Debye-Waller factors) for atom a, i.e.,

$$\displaystyle \begin{aligned}\left<(\boldsymbol{\delta}_{ac}\cdot \mathbf{q})^2\right> = {\mathbf{q}}^\intercal {\mathbf{U}}_a\mathbf{q}\,.\end{aligned} $$

Here we have made the assumption that the displacement distributions (not the actual displacements) for the same atom are identical among different unit cells. Putting it all together, we get

$$\displaystyle \begin{aligned} I(\mathbf{q}) = \sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) \sum_{aa'} \mathrm{I}_{aa'}(\mathbf{q}) e^{-2\pi^2 {\mathbf{q}}^\intercal({\mathbf{U}}_a + {\mathbf{U}}_{a'})\mathbf{q}} e^{4\pi^2 {\mathbf{q}}^\intercal {\mathbf{C}}_{ac\,a'c'}\mathbf{q}}\;. \end{aligned} $$
(9.27)
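
The Gaussian average behind the harmonic approximation can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a one-dimensional, zero-mean Gaussian distribution for the projected displacement difference, and the function names and integration settings are illustrative choices.

```python
import math


def averaged_phase_factor(sigma, q, n_steps=20001, span=10.0):
    """Average exp(2*pi*i*x*q) over a zero-mean Gaussian of width sigma.

    x stands for the displacement difference projected onto q. The imaginary
    part vanishes by symmetry, so only the cosine is integrated, using the
    trapezoidal rule over +/- span standard deviations.
    """
    x_max = span * sigma
    dx = 2.0 * x_max / (n_steps - 1)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(n_steps):
        x = -x_max + k * dx
        weight = 0.5 if k in (0, n_steps - 1) else 1.0
        pdf = norm * math.exp(-0.5 * (x / sigma) ** 2)
        total += weight * pdf * math.cos(2.0 * math.pi * x * q)
    return total * dx


def harmonic_closed_form(sigma, q):
    """exp(-2 pi^2 <x^2> q^2) with <x^2> = sigma^2."""
    return math.exp(-2.0 * math.pi ** 2 * sigma ** 2 * q ** 2)


if __name__ == "__main__":
    for sigma, q in [(0.3, 0.5), (0.8, 0.5)]:
        print(sigma, q, averaged_phase_factor(sigma, q), harmonic_closed_form(sigma, q))
```

For Gaussian-distributed displacements the two values agree to numerical precision; for other distributions the closed form holds only to second order in the displacement.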

It is now possible to consider several kinds of disorder and examine how the (continuous) diffracted intensities relate to the molecular transform. The first, in Sect. 9.4.1, is the displacement of every atom in the crystal in an uncorrelated manner, such as may happen in a Coulomb explosion or simply due to thermal motion. This will be compared with the displacements of whole unit cells as single rigid units (Sect. 9.4.2), keeping displacements among different unit cells uncorrelated. For crystals with more than one asymmetric unit per cell, this choice is somewhat artificial since the choice of unit cell is arbitrary. Nevertheless, this will provide a simpler route to understanding what happens in the general case when there are multiple rigid units in the unit cell which are considered in Sect. 9.4.4 after first checking the effect of atomic disorder with rigid-body motion. Following that we investigate motions that are correlated with distance between atoms and units.

Box 9.1

The simulated diffraction images in this section were calculated using the lysozyme molecule (PDB: 4ET8). Each image represents a planar slice (not an Ewald sphere) through the 3D intensity distribution of the crystal. The resolution at the center edge is 2 Å. When showing Bragg peaks, the crystal unit cell was assumed to be 32 × 32 Å in the dimensions reciprocal to the displayed diffraction plane. This cell is too small to fit 4 molecules as demanded by the \(P2_12_12_1\) space group simulated in Sect. 9.4.4. Nevertheless, the smaller unit cell leads to a larger Bragg peak spacing in reciprocal space, which results in a more esthetically pleasing image. In reality, the tetragonal lysozyme crystal has unit cell 79 × 79 × 38 Å (placing Bragg peaks closer together than simulated) with the space group \(P4_32_12\).

Another simplification applied was to ignore the q-dependence of the atomic form factors \(f_i(\mathbf {q})\) and consider them to be constant. This is equivalent to assuming point-like atoms. These simplifications from what one would see in a real experiment were made for the sake of clarity. Experimental details are discussed in Sect. 9.5.

9.4.1 Uncorrelated Random Disorder

For uncorrelated motions of atoms, the covariance matrix reduces to

$$\displaystyle \begin{aligned} {\mathbf{C}}_{ac\,a'c'} = \left<\boldsymbol{\delta}_{ac}^\intercal \boldsymbol{\delta}_{a'c'}\right> = \begin{cases} {\mathbf{U}}_a,& \text{if}\ c=c' \ \text{and}\ a=a'\\ 0,& \text{otherwise.} \end{cases} \end{aligned}$$

Separating the two cases of summing over identical or different atoms, from Eq. (9.27) we obtain

$$\displaystyle \begin{aligned} I(\mathbf{q}) = &\sum_{\substack{cc'\\c\neq c'}} \mathrm{L}_{cc'}(\mathbf{q}) \sum_{\substack{aa'\\a\neq a'}} \mathrm{I}_{aa'}(\mathbf{q}) e^{-2\pi^2 {\mathbf{q}}^\intercal({\mathbf{U}}_a + {\mathbf{U}}_{a'})\mathbf{q}} \\ &+\sum_c \mathrm{L}_{cc}(\mathbf{q}) \sum_a \mathrm{I}_{aa}(\mathbf{q}) \end{aligned} $$

where the two exponential terms cancel each other out in the second term. From Eq. (9.24), we can see that \(\mathrm {L}_{cc}(\mathbf {q}) = 1\) and \(\mathrm {I}_{aa}(\mathbf {q}) = \left |f_a(\mathbf {q})\right |{ }^2\). Completing the sum in the first term, we obtain the familiar Debye-Waller suppression of Bragg intensities along with structureless diffuse scattering,

$$\displaystyle \begin{aligned} I(\mathbf{q}) = &\sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) \sum_{aa'} \mathrm{I}_{aa'}(\mathbf{q}) e^{-2\pi^2 {\mathbf{q}}^\intercal({\mathbf{U}}_a + {\mathbf{U}}_{a'})\mathbf{q}}\\ &+ N_c\sum_a \left|f_a(\mathbf{q})\right|{}^2 \left(1-e^{-4\pi^2 {\mathbf{q}}^\intercal {\mathbf{U}}_a\mathbf{q}}\right)\,. {} \end{aligned} $$
(9.28)

In order to make the interpretation of this expression easier, let us assume isotropic displacement distributions. Thus, expressions of the form \({\mathbf {q}}^\intercal {\mathbf {U}}_a\mathbf {q}\) simplify to \(\sigma _a^2 q^2\) where \(q = \left |\mathbf {q}\right |\). One can always do the complete analysis with anisotropic distributions with some minor modifications. Applying this approximation and substituting in the full expression for \(\mathrm {I}_{aa'}\) from Eq. (9.24),

$$\displaystyle \begin{aligned} I(\mathbf{q}) = &\sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) \left|\sum_{a} f_a(\mathbf{q}) e^{2\pi i {\mathbf{r}}_a\cdot \mathbf{q}} e^{-2\pi^2 \sigma_a^2 q^2}\right|{}^2\\ &+ N_c\sum_a \left|f_a(\mathbf{q})\right|{}^2 \left(1-e^{-4\pi^2 \sigma_a^2 q^2}\right)\,. {} \end{aligned} $$
(9.29)

The two terms in this expression are conventionally called the Bragg term and the diffuse term, respectively. In the diffuse term, since each atomic form factor is spherically symmetric, the pattern as a whole is also spherically symmetric and just adds to the “background,” similar to the solvent scatter. The Bragg term here is just the lattice sum multiplied by the Fourier transform of the B-factor-blurred molecule as discussed in Sect. 9.3.7. Thus, if one subtracts the background at each Bragg peak position and phases the integrated intensity, the electron density obtained can be thought of as replacing each atom by a Gaussian blob with a width \(\sigma _a\). Note also that this is the average electron density over all unit cells. The total intensity is shown in Fig. 9.4 where one can see the reduced Bragg resolution as well as the structureless diffuse background.
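
The step from the pairwise sum of Eq. (9.27) to the Bragg and diffuse terms of Eq. (9.29) can be verified by brute force on a toy system. The sketch below is illustrative only: it assumes a one-dimensional crystal, constant real form factors (as in Box 9.1), and isotropic displacements; the function names and the example atom list are hypothetical.

```python
import cmath
import math


def pair_sum_intensity(q, n_cells, cell, atoms):
    """Brute-force Eq. (9.27) for uncorrelated atoms in a 1D crystal.

    atoms is a list of (f, r, sigma): constant form factor, position within
    the cell, and rms displacement. The covariance is sigma^2 only when both
    indices refer to the very same atom in the same cell, and zero otherwise.
    """
    total = 0.0 + 0.0j
    for c in range(n_cells):
        for c2 in range(n_cells):
            lattice = cmath.exp(2j * math.pi * (c - c2) * cell * q)
            for a, (f1, r1, s1) in enumerate(atoms):
                for a2, (f2, r2, s2) in enumerate(atoms):
                    cov = s1 ** 2 if (c == c2 and a == a2) else 0.0
                    pair = f1 * f2 * cmath.exp(2j * math.pi * (r1 - r2) * q)
                    damping = math.exp(-2 * math.pi ** 2 * (s1 ** 2 + s2 ** 2) * q ** 2)
                    total += lattice * pair * damping * math.exp(4 * math.pi ** 2 * cov * q ** 2)
    return total.real


def bragg_plus_diffuse(q, n_cells, cell, atoms):
    """Closed form of Eq. (9.29): returns (Bragg term, diffuse term)."""
    lattice = abs(sum(cmath.exp(2j * math.pi * c * cell * q) for c in range(n_cells))) ** 2
    blurred = sum(
        f * cmath.exp(2j * math.pi * r * q) * math.exp(-2 * math.pi ** 2 * s ** 2 * q ** 2)
        for f, r, s in atoms
    )
    bragg = lattice * abs(blurred) ** 2
    diffuse = n_cells * sum(
        f ** 2 * (1 - math.exp(-4 * math.pi ** 2 * s ** 2 * q ** 2)) for f, r, s in atoms
    )
    return bragg, diffuse


if __name__ == "__main__":
    atoms = [(6.0, 0.0, 0.3), (8.0, 1.7, 0.5), (7.0, 3.1, 0.4)]  # hypothetical atoms
    for q in (0.13, 0.2, 0.37):
        print(q, pair_sum_intensity(q, 6, 5.0, atoms), sum(bragg_plus_diffuse(q, 6, 5.0, atoms)))
```

The diffuse term in this sketch depends on q only through its magnitude, which is the structureless background described above.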

Fig. 9.4
figure 4

Total intensity calculated in Eq. (9.28) when each atom is displaced independently. Here all atoms are displaced by an average of 0.8 Å (B = 25.3 Å²) resulting in the Bragg peaks being suppressed at high resolution and featureless diffuse scattering. For details see Box 9.1

Thus, we have seen that in the absence of correlated displacements, the Bragg peaks just represent the Fourier transform of the average unit cell electron density. In fact, this turns out to be generally true even in the case of strongly correlated motion. This is why, when applying the conventional analysis pipeline, the crystallographer need not worry about correlated motion and can solve for the B-factor of each atom. In a sense, this has been a great boon for the field. One could even argue that the field may never have taken off as it did over the last 100 years if it were necessary to solve for the whole \(C_{ac\,a'c'}\) matrix to solve the structure.

9.4.2 Rigid-Body Translational Disorder of a Unit Cell

We consider now a case in which the whole unit cell moves as a single rigid unit, with no correlations across unit cells. This is identical to what was discussed in Sect. 9.3.7, and we will obtain the same result by using the atomistic formalism. Note that this disorder model is only likely when there is a single asymmetric unit per unit cell. In other cases, the choice of the original asymmetric unit is arbitrary, leading to different possible choices of unit cells for the equivalent crystal. Such choices would, however, lead to different continuous diffraction if randomly displaced.

The analysis is very similar to the previous section. The covariance matrix reduces to

$$\displaystyle \begin{aligned} {\mathbf{C}}_{ac\,a'c'} = \left<\boldsymbol{\delta}_{ac}^\intercal \boldsymbol{\delta}_{a'c'}\right> = \begin{cases} \mathbf{U},& \text{if}\ c=c' \ \text{for}\ \text{all}\ a\\ 0,& \text{otherwise.} \end{cases} \end{aligned}$$

For an isotropic displacement distribution, \(\mathbf {U} = \sigma ^2 {\mathbf {I}}_3\) where \({\mathbf {I}}_3\) is the 3 × 3 identity matrix. Starting again from Eq. (9.27), separating the cases, and then completing the sum results in

$$\displaystyle \begin{aligned} I(\mathbf{q}) = &e^{-4\pi^2 \sigma^2 q^2}\sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) \left|\sum_{a} f_a(\mathbf{q}) e^{2\pi i {\mathbf{r}}_a\cdot \mathbf{q}}\right|{}^2\\ &+ N_c \left(1-e^{-4\pi^2 \sigma^2 q^2}\right) \left|\sum_a f_a(\mathbf{q}) e^{2\pi i {\mathbf{r}}_a\cdot \mathbf{q}}\right|{}^2 \; . \end{aligned} $$
(9.30)

Once again, the first term represents the Bragg intensities. The Bragg peaks sample the molecular transform, multiplied by the Debye-Waller factor. In fact, this term is the same as in Eq. (9.28), showing that the Bragg peak intensities are insensitive to whether displacements are correlated or not. However, the diffuse term becomes something quite interesting. The same molecular transform term is present but now sampled everywhere. There is no lattice sum term reducing the sampling to just the reciprocal lattice points. In addition, there is the so-called complementary Debye-Waller term which grows with q. As q increases, the Bragg term vanishes and the intensity is just \(N_c\) times the intensity from a single unit cell. This can be seen by comparing Figs. 9.5 and 9.3a.
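
To get a feel for how quickly the weight shifts from the Bragg term to the continuous term in Eq. (9.30), the two prefactors can be evaluated directly. This is a small sketch under stated assumptions: q is taken as the reciprocal of the resolution (consistent with the \(e^{2\pi i \mathbf{r}\cdot\mathbf{q}}\) convention used here), σ = 0.8 Å is the value used in the figure captions, and the function names are illustrative.

```python
import math


def debye_waller(sigma, q):
    """Debye-Waller factor exp(-4 pi^2 sigma^2 q^2) multiplying the Bragg term."""
    return math.exp(-4.0 * math.pi ** 2 * sigma ** 2 * q ** 2)


def bragg_and_continuous_weights(sigma, resolution):
    """Return (Debye-Waller, complementary Debye-Waller) at a given resolution.

    resolution is a d-spacing in the same length units as sigma, with q = 1/d.
    """
    dw = debye_waller(sigma, 1.0 / resolution)
    return dw, 1.0 - dw


if __name__ == "__main__":
    for d in (20.0, 10.0, 5.0, 2.0):
        bragg_w, cont_w = bragg_and_continuous_weights(0.8, d)
        print(f"d = {d:5.1f} A  Bragg weight = {bragg_w:.4f}  continuous weight = {cont_w:.4f}")
```

With these assumed values the Bragg weight at 2 Å resolution is roughly 0.2%, so almost all of the scattered intensity at the edge of the simulated patterns is carried by the continuous term.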

Fig. 9.5
figure 5

Total intensity calculated in Eq. (9.30) when the entire unit cell is displaced as a rigid body. The Bragg peaks are identical to Fig. 9.4 but the diffuse scattering now represents the continuous diffraction of the molecule seen in Fig. 9.3a. For details see Box 9.1

Since the diffuse term represents the continuously sampled Fourier transform intensities of the electron density of the molecule, ρ(r), the constraint ratio Ω is greater than 1 and iterative phasing algorithms discussed in Sect. 9.2.5 can be used to recover the phases. It is as if there were \(N_c\) copies of perfectly aligned single molecules whose intensities were added on top of one another, as in Sect. 9.3.1.

9.4.3 Rigid-Body Translations Plus Uncorrelated Displacements

Now it is unlikely that the proteins are exactly rigid bodies with no internal motions. As a first approximation to address this, the atomic displacements can be modelled as a combination of rigid-body translation and uncorrelated displacements,

$$\displaystyle \begin{aligned}\boldsymbol{\delta}_a = \boldsymbol{\delta} + \boldsymbol{\eta}_a\;.\end{aligned}$$

The corresponding B-factor matrices assuming isotropic displacement distributions are \(\sigma ^2 {\mathbf {I}}_3\) and \(\beta _a^2{\mathbf {I}}_3\) for rigid-body motion and the individual atomic B-factor “vibrations,” respectively. Once again, we assume that there are no long range correlations across multiple unit cells. With these conditions, the covariance matrix can be written as

$$\displaystyle \begin{aligned} {\mathbf{C}}_{ac\,a'c'} = \left<\boldsymbol{\delta}_{ac}^\intercal \boldsymbol{\delta}_{a'c'}\right> = \begin{cases} (\sigma^2 + \beta_a^2){\mathbf{I}}_3,& \text{if}\ c=c' \ \text{and}\ a = a'\\ \sigma^2{\mathbf{I}}_3,& \text{if}\ c=c' \ \text{and}\ a \neq a'\\ 0,& \text{otherwise.} \end{cases} \end{aligned}$$

The first case describes identical atoms (in the same unit cell), the second different atoms in the same unit cell, and the third are atoms in different unit cells. Since there are three cases, Eq. (9.27) must be separated into three terms. After completing the sum twice we obtain the following expression for the total scattered intensity:

$$\displaystyle \begin{aligned} I(\mathbf{q}) = &\sum_{cc'}\mathrm{L}_{cc'}(\mathbf{q}) \left|\sum_{a} f_a(\mathbf{q}) e^{2\pi i {\mathbf{r}}_a\cdot \mathbf{q}}e^{-2\pi^2 (\sigma^2+\beta_a^2) q^2}\right|{}^2\\ &+ N_c \left(1-e^{-4\pi^2 \sigma^2 q^2}\right) \left|\sum_a f_a(\mathbf{q}) e^{2\pi i {\mathbf{r}}_a\cdot \mathbf{q}}e^{-2\pi^2 \beta_a^2 q^2}\right|{}^2 \\ &+ N_c \sum_a \left|f_a(\mathbf{q})\right|{}^2 \left(1 - e^{-4\pi^2\beta_a^2 q^2}\right)\,. {} \end{aligned} $$
(9.31)

The first term once again represents the Bragg intensities which are modulated by the Fourier transform of the average electron density. For this signal, each atom appears to be blurred by both the rigid body and the uncorrelated motion. The second term represents the same continuous diffraction as before modulated by the same complementary Debye-Waller factor of the rigid-body displacements, but now also modulated by the uncorrelated displacements which suppresses the signal at high resolution. The continuous diffraction sees an average molecule consisting of blurred atoms with structure factors given by \(f_a(\mathbf {q})\exp (-2\pi ^2\beta _a^2 q^2)\). The last term is just the remaining signal which appears as background containing no information about the structure, similar to Sect. 9.4.1. The calculated intensities are shown in Fig. 9.6.
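
The three-way split can be checked in the same brute-force manner as before. The following sketch assumes a one-dimensional crystal with constant real form factors; names and parameter values are hypothetical. The third (structureless) term is evaluated here as \(N_c\sum_a |f_a|^2 (1 - e^{-4\pi^2\beta_a^2 q^2})\), which is the form that reproduces the pairwise sum of Eq. (9.27) in this sketch and reduces to the diffuse term of Eq. (9.29) when σ = 0.

```python
import cmath
import math


def pair_sum_intensity(q, n_cells, cell, atoms, sigma):
    """Brute-force Eq. (9.27) for rigid-body plus uncorrelated displacements.

    atoms is a list of (f, r, beta); sigma is the rigid-body rms displacement.
    """
    total = 0.0 + 0.0j
    for c in range(n_cells):
        for c2 in range(n_cells):
            lattice = cmath.exp(2j * math.pi * (c - c2) * cell * q)
            for a, (f1, r1, b1) in enumerate(atoms):
                for a2, (f2, r2, b2) in enumerate(atoms):
                    if c == c2 and a == a2:
                        cov = sigma ** 2 + b1 ** 2   # same atom
                    elif c == c2:
                        cov = sigma ** 2             # same rigid body
                    else:
                        cov = 0.0                    # different unit cells
                    u_sum = 2 * sigma ** 2 + b1 ** 2 + b2 ** 2
                    pair = f1 * f2 * cmath.exp(2j * math.pi * (r1 - r2) * q)
                    total += (lattice * pair
                              * math.exp(-2 * math.pi ** 2 * u_sum * q ** 2)
                              * math.exp(4 * math.pi ** 2 * cov * q ** 2))
    return total.real


def three_terms(q, n_cells, cell, atoms, sigma):
    """Return (Bragg, continuous, structureless) contributions."""
    k = 2 * math.pi ** 2 * q ** 2
    lattice = abs(sum(cmath.exp(2j * math.pi * c * cell * q) for c in range(n_cells))) ** 2
    bragg_amp = sum(f * cmath.exp(2j * math.pi * r * q) * math.exp(-k * (sigma ** 2 + b ** 2))
                    for f, r, b in atoms)
    cont_amp = sum(f * cmath.exp(2j * math.pi * r * q) * math.exp(-k * b ** 2)
                   for f, r, b in atoms)
    comp_dw = 1 - math.exp(-2 * k * sigma ** 2)
    bragg = lattice * abs(bragg_amp) ** 2
    continuous = n_cells * comp_dw * abs(cont_amp) ** 2
    structureless = n_cells * sum(f ** 2 * (1 - math.exp(-2 * k * b ** 2)) for f, r, b in atoms)
    return bragg, continuous, structureless


if __name__ == "__main__":
    atoms = [(6.0, 0.4, 0.2), (7.0, 2.0, 0.35), (8.0, 4.1, 0.1)]  # hypothetical atoms
    for q in (0.07, 0.21, 0.4):
        print(q, pair_sum_intensity(q, 5, 6.0, atoms, 0.5), sum(three_terms(q, 5, 6.0, atoms, 0.5)))
```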

Fig. 9.6
figure 6

Total intensity calculated in Eq. (9.31) when the total displacement of each atom has two components, rigid and uncorrelated. The total average displacement is still 0.8 Å, so the Bragg peaks are unchanged: they are suppressed at high resolution and accompanied by featureless diffuse scattering. For details see Box 9.1

A striking feature of the expression of Eq. (9.31) is that the non-spherically symmetric part of the diffuse intensities can still be phased using iterative phasing techniques to give the average electron density of the rigid body. A method for separating the unstructured diffuse background from the structured continuous diffraction is described in Sect. 9.5. In addition, the q-dependence of these intensities is different from that of the average electron density of the unit cell probed by the Bragg peaks: the continuous diffraction can extend to higher resolution than the Bragg peaks, depending on the magnitude of the rigid-body mean square displacement \(\sigma ^2\).

9.4.4 Multiple Rigid Bodies in a Unit Cell

As mentioned earlier, in any realistic situation with more than one asymmetric unit, the whole unit cell would not move as a rigid body since the unit cell is an arbitrary construction. This is the case for any space group other than P1. Instead it is conceivable that each asymmetric unit is displaced as a rigid body, or even smaller domains of molecules move as such. Different rigid bodies may be different orientations of the same molecule, as discussed in Sect. 9.3.7. Some units could consist of identical structures in like orientations, in which case their continuous diffraction contributions simply sum.

We consider the diffuse scattering intensity from a crystal with \(N_b\) rigid bodies per unit cell, each of which is displaced by an uncorrelated amount with respect to the other with a variance of \(\left <\delta _r^2\right > = \sigma _r^2\). Each atom is assigned to a rigid body b and the position of any atom i can now be expressed as

$$\displaystyle \begin{aligned}{\mathbf{r}}_i = {\mathbf{r}}_c + {\mathbf{r}}_b + {\mathbf{r}}_a + \boldsymbol{\delta}_{abc}\;,\end{aligned}$$

representing the position of the atom a within the rigid body b that is located in cell c, as well as the random displacement of that atom, \(\boldsymbol {\delta }_{abc}\). The covariance matrix for displacements which are rigid within a body and uncorrelated otherwise is

$$\displaystyle \begin{aligned} {\mathbf{C}}_{abc\,a'b'c'} = \left<\boldsymbol{\delta}_{abc}^\intercal \boldsymbol{\delta}_{a'b'c'}\right> = \begin{cases} \sigma_r^2{\mathbf{I}}_3,& \text{if}\ c=c' \ \text{and}\ b = b' \ \text{for}\ \text{all}\ a\\ 0,& \text{otherwise,} \end{cases} \end{aligned}$$

where we have assumed the same \(\sigma _r\), written simply as σ in what follows, for all rigid bodies. This is not an unusual situation, since the rigid bodies may be symmetry-related units of the unit cell, connected by the space group symmetry. For the sake of clarity, we have ignored uncorrelated atomic displacements. Their inclusion leads to results analogous to Sect. 9.4.3.

Equation (9.27) now gains an extra double sum

$$\displaystyle \begin{aligned} I(\mathbf{q}) = \sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) \sum_{bb'} e^{2\pi i ({\mathbf{r}}_b-{\mathbf{r}}_{b'})\cdot \mathbf{q}} \sum_{aa'} \mathrm{I}_{aa'}(\mathbf{q}) e^{-4\pi^2 \sigma^2 q^2} e^{4\pi^2 {\mathbf{q}}^\intercal {\mathbf{C}}_{abc\,a'b'c'} \mathbf{q}} \;. \end{aligned}$$

Separating into the different cases and then completing the sums as before, we obtain the following simple relation:

$$\displaystyle \begin{aligned} I(\mathbf{q}) = &e^{-4\pi^2 \sigma^2 q^2}\sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) \left|\sum_be^{2\pi i {\mathbf{r}}_b \cdot \mathbf{q}} \sum_a f_a(\mathbf{q})e^{2\pi i {\mathbf{r}}_a \cdot \mathbf{q}}\right|{}^2 \\ &+N_c \left(1-e^{-4\pi^2 \sigma^2 q^2}\right) \sum_b \left|\sum_a f_a(\mathbf{q})e^{2\pi i {\mathbf{r}}_a \cdot \mathbf{q}}\right|{}^2 \;. {} \end{aligned} $$
(9.32)

The first term is once again the familiar Bragg intensity which can be interpreted as the Fourier transform of the average unit cell, sampled at Bragg peaks. One sums over the atoms in all the rigid bodies with the positions relative to the origin of the unit cell (\({\mathbf {r}}_b + {\mathbf {r}}_a\)).

The second term of Eq. (9.32), however, is seen as the incoherent sum of the continuous diffraction from each average rigid body. The relative positions of the bodies (\({\mathbf {r}}_b\)) disappear. This equation is identical to that of the initial example of Sect. 9.4.1 with the scattering factors of individual atoms \(f_a(\mathbf {q})\) of Eq. (9.29) replaced with the form factors of the individual rigid bodies, \(\sum _a f_a(\mathbf {q}) \exp (2 \pi i {\mathbf {r}}_a \cdot \mathbf {q})\). What was a background due to unstructured (almost point-like) atoms becomes an incoherent sum of structured molecules or molecular units. The distinction between the coherent sum of units in the Bragg term of Eq. (9.32) and the incoherent sum of the second term is important. For rigid units of different orientation, the incoherent sum is analogous to diffraction of a twinned crystal (see Sect. 9.3.7). For iterative phasing this requires the modulus constraint of Eq. (9.18).
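
The coherent and incoherent sums in Eq. (9.32) can be contrasted on a small example. The sketch below is an illustration under assumptions not in the text: a one-dimensional crystal, constant real form factors, and two rigid bodies per cell where the second is the mirror image of the first (the 1D stand-in for a different orientation). All names and numbers are hypothetical.

```python
import cmath
import math


def pair_sum_intensity(q, n_cells, cell, bodies, sigma):
    """Brute-force pairwise sum with rigid, mutually uncorrelated bodies.

    bodies is a list of (r_b, atoms) with atoms a list of (f, r_a).
    """
    scatterers = [(b, f, r_b + r_a)
                  for b, (r_b, atoms) in enumerate(bodies)
                  for f, r_a in atoms]
    dw = math.exp(-4 * math.pi ** 2 * sigma ** 2 * q ** 2)
    total = 0.0 + 0.0j
    for c in range(n_cells):
        for c2 in range(n_cells):
            for b, f1, x1 in scatterers:
                for b2, f2, x2 in scatterers:
                    # Displacements are fully correlated only within one body.
                    cov = sigma ** 2 if (c == c2 and b == b2) else 0.0
                    phase = cmath.exp(2j * math.pi * ((c - c2) * cell + x1 - x2) * q)
                    total += f1 * f2 * phase * dw * math.exp(4 * math.pi ** 2 * cov * q ** 2)
    return total.real


def bragg_and_continuous(q, n_cells, cell, bodies, sigma):
    """Closed form of Eq. (9.32): returns (Bragg term, continuous term)."""
    dw = math.exp(-4 * math.pi ** 2 * sigma ** 2 * q ** 2)
    lattice = abs(sum(cmath.exp(2j * math.pi * c * cell * q) for c in range(n_cells))) ** 2
    transforms = [sum(f * cmath.exp(2j * math.pi * r_a * q) for f, r_a in atoms)
                  for _, atoms in bodies]
    cell_amp = sum(cmath.exp(2j * math.pi * r_b * q) * F
                   for (r_b, _), F in zip(bodies, transforms))
    bragg = dw * lattice * abs(cell_amp) ** 2                           # coherent sum
    continuous = n_cells * (1 - dw) * sum(abs(F) ** 2 for F in transforms)  # incoherent sum
    return bragg, continuous


if __name__ == "__main__":
    body = [(6.0, 0.0), (8.0, 1.2), (7.0, 2.0)]
    mirrored = [(f, -r) for f, r in body]
    bodies = [(0.0, body), (5.0, mirrored)]
    for q in (0.08, 0.2, 0.33):
        print(q, pair_sum_intensity(q, 4, 10.0, bodies, 0.6),
              sum(bragg_and_continuous(q, 4, 10.0, bodies, 0.6)))
```

Moving one body within the cell changes the Bragg term but leaves the continuous term untouched, which is the disappearance of \({\mathbf{r}}_b\) noted above.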

The calculated intensities are displayed in Fig. 9.7. Close inspection reveals another crucial difference between the “untwinned” Bragg diffraction and the “twinned” continuous diffraction: even at reciprocal lattice points, the Bragg and continuous terms do not sample the same underlying transform. Here it is seen that absences in the Bragg peaks do not necessarily correspond to minima of the continuous diffraction, and in some places strong Bragg peaks overlay minima of the continuous diffraction. This situation is in stark contrast to the results of Sect. 9.4.2 where the unit cell transform could be factored out. In that case, the end result (Eq. (9.30)) is a more complicated lattice function than L(q), as can also be seen within the square brackets of Eq. (9.23). The difference between incoherent and coherent addition can also be illustrated with a simple toy problem of diffraction of a two-dimensional array of rod-like structures, shown in Fig. 9.8. Here there are two orientations of these independently displaced rigid units. The continuous diffraction consists of the incoherent sum of rod diffraction in the two different orientations, Fig. 9.8b, whereas the Bragg peaks sample the underlying unit cell transform of Fig. 9.8c.

Fig. 9.7
figure 7

Total intensity calculated in Eq. (9.32) when each rigid body in the unit cell is displaced. In this case, there are four rigid bodies, each of which is the asymmetric unit of the \(P2_12_12_1\) space group. One can see the systematic absences in the Bragg peaks but since the diffuse scattering is the incoherent sum of the intensities, there are no cancellations due to the space group at the forbidden positions. For details see Box 9.1

Fig. 9.8
figure 8

The continuous and Bragg diffraction do not necessarily sample the same underlying transform, as illustrated in a 2D crystal of C4 symmetry consisting of eight rods which are all randomly displaced independently in the crystal (a). (b) The continuous diffraction consists of equal weightings of the single-rod diffraction in the two unique orientations, whereas the Bragg peaks sample the transform of the average unit cell (c). The patterns are markedly different and even display different symmetries

In Sect. 9.2.4, we saw how the feature size in Fourier space (the Shannon voxel or speckle) is inversely related to the size of coherently diffracting volume. If one combines two objects incoherently (by adding the intensities) however, the feature size does not change and contrast is reduced [8]. This means that the observed speckle size gives a measure of the size of the rigid bodies that contribute to the diffraction pattern. This can be examined directly by taking the Fourier transform of the measured 3D diffraction intensities, giving the autocorrelation function of the disordered crystal. If the Bragg peaks are first filtered from the diffraction intensities, the transform \(\tilde {I}(\mathbf {u})\) of the second term of Eq. (9.32) is the sum of the complete autocorrelation functions of the rigid units (convolved with a narrow Gaussian given by the Fourier transform of the complementary Debye-Waller factor). More broadly, as expanded in the next section, the autocorrelation of the rigid units will be convolved with the correlation function of their displacements, making it possible to determine if the boundary between the rigid bodies is “soft,” for example. When small domains of molecules are displaced, a case of great interest to understand protein dynamics and function, the resulting speckle size will be large, and the incoherent sum of the continuous diffraction from the many independent domains will be of low contrast, perhaps not distinguishable from uncorrelated atomic disorder.

9.4.5 Liquid-Like Motions Within a Rigid Body

A popular model of disorder is liquid-like motions [9, 39]. Here the covariance matrix between two atoms a and a′ is a diagonal matrix which decays exponentially with the distance between them. This can be expressed as

$$\displaystyle \begin{aligned} {\mathbf{C}}_{aa'} = \left<\boldsymbol{\delta}_{ac}^\intercal \boldsymbol{\delta}_{a'c'}\right> = \sigma^2 \exp(-r_{aa'}/\gamma) \,{\mathbf{I}}_3\;. {}\end{aligned} $$
(9.33)

This is termed liquid-like because it treats the molecules like an ideal fluid. There is no shearing motion which would lead to off-diagonal terms in the covariance matrix and the correlations are strictly non-negative. One can also have different values for σ and γ for atom pairs within the same molecule and in different molecules. To investigate how the scattered intensity is related to the molecular transform in this case, we consider a P1 crystal with the following covariance matrix:

$$\displaystyle \begin{aligned} C_{ac\,a'c'} = \left<\boldsymbol{\delta}_{ac}^\intercal \boldsymbol{\delta}_{a'c'}\right> = \begin{cases} \sigma^2 \exp(-r_{aa'} / \gamma)\,{\mathbf{I}}_3,& \text{if}\ c=c' \\ 0,& \text{otherwise.}\\ \end{cases}\end{aligned} $$

This means that within the unit cell, correlations decay with distance but there are no correlations across unit cells. Using the now-familiar procedure, Eq. (9.27) becomes

$$\displaystyle \begin{aligned} I(\mathbf{q}) = &\sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) \sum_{aa'} \mathrm{I}_{aa'}(\mathbf{q}) e^{-4\pi^2 \sigma^2 q^2} \\ &+ N_c e^{-4\pi^2 \sigma^2 q^2} \sum_{aa'} \mathrm{I}_{aa'}(\mathbf{q}) \left(e^{4\pi^2 \sigma^2 q^2 \exp(-r_{aa'} / \gamma)} - 1\right) \;.\end{aligned} $$

As a sanity check, we see that as γ → ∞, the expression becomes the same as in Eq. (9.30) with a B-factor suppressed Bragg term and continuous diffraction in the second term. To evaluate the second term for finite γ, it is helpful to expand the \(I_{aa'}\) term

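$$\displaystyle \begin{aligned} \sum_{aa'} \mathrm{I}_{aa'}(\mathbf{q}) \left(e^{4\pi^2 \sigma^2 q^2 \exp(-r_{aa'} / \gamma)} - 1\right) &= \int P(\mathbf{u})\, e^{2\pi i\, \mathbf{u}\cdot \mathbf{q}}\, d\mathbf{u}\,, \\ P(\mathbf{u}) &= \sum_{aa'} f_a(\mathbf{q}) f^*_{a'}(\mathbf{q}) \left(e^{4\pi^2 \sigma^2 q^2 \exp(-r_{aa'} / \gamma)} - 1\right) \delta(\mathbf{u} - {\mathbf{r}}_a + {\mathbf{r}}_{a'})\,, \end{aligned} $$

where P(u) is the autocorrelation of the electron density weighted by the covariance terms and δ(u) is the 3D Dirac delta function. The advantage of this conversion to a continuous integral is that the Fourier transform is directly evident. The ideal autocorrelation function is the inverse Fourier transform of the molecular transform intensities,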

$$\displaystyle \begin{aligned}P_o(\mathbf{u}) = \mathcal{F}^{-1}_u\left[I_o(\mathbf{q})\right]\,. \end{aligned} $$

The weighted autocorrelation can be written as the product of the ideal one with the liquid-like motion weighting

$$\displaystyle \begin{aligned} P(\mathbf{u}) = P_o(\mathbf{u}) \times \left(e^{4\pi^2 \sigma^2 q^2 \exp(-u / \gamma)} - 1\right)\,. {}\end{aligned} $$
(9.34)

Using the fact that the Fourier transform maps products to convolutions and using a Taylor-series expansion, the diffuse term \(I_D\) can be written as

$$\displaystyle \begin{aligned} I_D(\mathbf{q}) &= N_c e^{-4\pi^2 \sigma^2 q^2} \mathcal{F}_q\left[P_o(\mathbf{u}) \times\left(e^{4\pi^2 \sigma^2 q^2 \exp(-u / \gamma)} - 1\right)\right] \\ &= N_c e^{-4\pi^2 \sigma^2 q^2} \sum_{n=1}^\infty\frac{(2\pi \sigma q)^{2n}}{n!} I_o(\mathbf{q}) \otimes \mathcal{F}_q\left[e^{-n u / \gamma}\right] \\ &= N_c e^{-4\pi^2 \sigma^2 q^2} \sum_{n=1}^\infty \frac{(2\pi \sigma q)^{2n}}{n!} I_o(\mathbf{q}) \otimes \left[\frac{8\pi n \gamma^3}{\left(n^2 + 4\pi^2\gamma^2q^2\right)^2}\right] {} \end{aligned} $$
(9.35)

where the last term is the 3D Fourier transform of the spherically symmetric function \(e^{-n|\mathbf {u}|/\gamma }\). Thus, the effect of liquid-like correlations within the rigid body results in a series of convolutions with incrementally broader kernels. In addition, the contribution of each successive term is weighted more at higher q. At very low resolution, the higher order terms can be neglected and the molecular transform is only slightly modified. At higher resolution, more terms have to be taken into account which blur the intensity more. This can be understood intuitively in that at low resolution (long length scales), the rigid body moves highly coherently since the deviations from rigid-body motion are small. At higher resolution, the effective correlation length decreases which broadens the features in Fourier space.
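
The convolution kernel in the last line of Eq. (9.35) can be confirmed by evaluating the 3D Fourier transform of the exponential numerically. The sketch below is illustrative: it reduces the transform of a spherically symmetric function to a radial integral and uses a plain trapezoidal rule, with function names, the cutoff, and the step count chosen for convenience rather than taken from the text.

```python
import math


def kernel_closed_form(q, gamma, n):
    """Kernel from Eq. (9.35): 8 pi n gamma^3 / (n^2 + 4 pi^2 gamma^2 q^2)^2."""
    return 8 * math.pi * n * gamma ** 3 / (n ** 2 + 4 * math.pi ** 2 * gamma ** 2 * q ** 2) ** 2


def kernel_numerical(q, gamma, n, n_steps=100000):
    """3D Fourier transform of exp(-n u / gamma) evaluated at |q| = q > 0.

    For a spherically symmetric function the transform reduces to
    (4 pi / k) * integral of u * exp(-a u) * sin(k u) du, with k = 2 pi q
    and a = n / gamma. The integrand is zero at u = 0 and negligible at the
    cutoff, so the trapezoidal end-point terms are omitted.
    """
    a = n / gamma
    k = 2 * math.pi * q
    u_max = 50.0 / a
    du = u_max / n_steps
    total = 0.0
    for i in range(1, n_steps):
        u = i * du
        total += u * math.exp(-a * u) * math.sin(k * u)
    return 4 * math.pi / k * total * du


if __name__ == "__main__":
    for q, gamma, n in [(0.05, 10.0, 1), (0.02, 10.0, 3), (0.01, 100.0, 2)]:
        print(q, gamma, n, kernel_numerical(q, gamma, n), kernel_closed_form(q, gamma, n))
```

The half width of the kernel scales as n/(2πγ), which is why each successive term in the series blurs the molecular transform more strongly.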

Figure 9.9 shows our standard example pattern modified for different values of the correlation length γ. In the first case γ is around 2.5 times the size of the rigid body so it can be approximated well as a rigid body, while in (b), the correlation length is only a quarter of the size. Thus, at higher resolution, the best approximation is that of uncorrelated displacements leading to structureless diffuse scattering.

Fig. 9.9
figure 9

Diffuse scattering evaluated from the liquid-like motions model, Eq. (9.35), with the parameters σ = 0.8 Å in both cases and γ = 100 Å and 10 Å in (a) and (b), respectively. The pattern in (a) is fairly similar to Fig. 9.3a, except near the corners of the image. But if the correlation length is significantly smaller than the particle, which is around 40 Å wide, the blurring will make phase retrieval difficult. For details see Box 9.1

The first line of Eq. (9.35) emphasizes the power of investigating the continuous diffraction (after filtering the Bragg peaks) in real space by Fourier transformation. After accounting for the Debye-Waller term, which could be obtained through a Wilson plot analysis [20], the transform of \(I_D(\mathbf {q})\) yields the weighted autocorrelation function of Eq. (9.34). It may be possible to determine the form of the correlation function through inspection or modelling, or to solve for an approximation of it during the phasing process, similar to procedures of partially coherent diffraction imaging [19, 58].

9.4.6 Crystal Growth Model

One can consider the case where the bodies, such as whole molecules, are rigid but there is an exponentially decaying correlation between the displacements of those rigid bodies from their ideal positions. This situation can be surmised to be more likely than uncorrelated displacements of molecules, since molecules are in contact in the crystal and the position of one molecule is influenced by the positions of its neighbors. The so-called growth models capture this dependence of the position of one molecule on the positions of those that came before it [55], which we simplify here. Again, for simplicity, staying with the P1 model of the whole unit cell being a rigid body, we obtain the following covariance matrix in analogy to Eq. (9.33):

$$\displaystyle \begin{aligned} C_{ac\,a'c'} = \left<\boldsymbol{\delta}_{ac}^\intercal \boldsymbol{\delta}_{a'c'}\right> = \sigma^2 \exp(-r_{cc'} / \gamma)\,{\mathbf{I}}_3 \;. \end{aligned}$$

Since the covariance only depends on the distance between the unit cell origins, this represents each unit cell moving as a rigid body. Now Eq. (9.27) does not separate into different terms, but can just be written as

$$\displaystyle \begin{aligned} I(\mathbf{q}) &= e^{-4\pi^2 \sigma^2 q^2}\sum_{cc'} \mathrm{L}_{cc'}(\mathbf{q}) e^{4\pi^2 \sigma^2 q^2 \exp(-r_{cc'} / \gamma)} \sum_{aa'} \mathrm{I}_{aa'}(\mathbf{q}) \\ &= e^{-4\pi^2 \sigma^2 q^2} I_o(\mathbf{q}) \sum_{cc'}\mathrm{L}_{cc'}(\mathbf{q}) e^{4\pi^2 \sigma^2 q^2 \exp(-r_{cc'} / \gamma)}\,. \end{aligned} $$

This is the first time we consider correlated displacements across unit cells, which has the effect of modifying the lattice sum itself. Now even for an infinite crystal, the diffraction is not expressed as an array of delta functions. The sum can, however, be expressed as a sum of convolutions as in Sect. 9.4.5, Eq. (9.35), but with \(I_o(\mathbf {q})\) replaced by the ideal lattice sum, which is the array of delta functions (the Dirac comb):

$$\displaystyle \begin{aligned} I(\mathbf{q}) = e^{-4\pi^2 \sigma^2 q^2} \sum_{n=1}^\infty \frac{(2\pi \sigma q)^{2n}}{n!} \left[\sum_{cc'}\mathrm{L}_{cc'}(\mathbf{q})\right] \otimes \left[\frac{8\pi n \gamma^3}{\left(n^2 + 4\pi^2\gamma^2q^2\right)^2}\right]\,. {} \end{aligned} $$
(9.36)

The Bragg peak is broadened by an increasing amount as q increases until it disappears at a high enough resolution and one obtains the direct continuous diffraction term \(I_o(\mathbf {q})\). This can again be understood intuitively in the following way. At low resolution (long length scales), a large chunk of the crystal is well-ordered, resulting in sharp, strong Bragg peaks. At smaller length scales, only a small region of the crystal has the necessary degree of periodicity to interfere coherently, which produces broad Bragg peaks, as discussed in Sect. 9.3.3. Two cases, with γ respectively bigger and smaller than the unit cell, are shown in Fig. 9.10a, b. A recent investigation discerned correlations over lengths of many unit cells in the merged experimental patterns of several protein crystals in which Bragg diffraction extended to high resolution [39]. The broadening and weakening of Bragg peaks with scattering angle is consistent with such growth models. As mentioned in Sect. 9.4.5 and summarized in Box 9.2, iterative algorithms to recover both the phases and the correlation function offer not only a route to obtain maps of electron density, but may also give new insights into the arrangements of molecules in crystals.
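
The modified lattice sum of the growth model can be explored directly in one dimension, without the series expansion. The following sketch makes several assumptions for illustration: a 1D row of unit cells, the double sum over cells collapsed to a sum over the cell separation m, and hypothetical function names and parameter values (the 32 Å cell and σ = 0.8 Å echo Box 9.1 and the figure captions).

```python
import math


def ideal_lattice_sum(q, n_cells, cell):
    """Sum over all cell pairs of exp(2 pi i (r_c - r_c') q) for a 1D crystal."""
    return sum((n_cells - abs(m)) * math.cos(2 * math.pi * m * cell * q)
               for m in range(-(n_cells - 1), n_cells))


def growth_model_lattice_factor(q, n_cells, cell, sigma, gamma):
    """Lattice factor multiplying I_o(q) when displacements of cells a distance
    |m| * cell apart have covariance sigma^2 * exp(-|m| * cell / gamma)."""
    x = 4 * math.pi ** 2 * sigma ** 2 * q ** 2
    total = 0.0
    for m in range(-(n_cells - 1), n_cells):
        multiplicity = n_cells - abs(m)          # number of cell pairs with separation m
        corr = math.exp(-abs(m) * cell / gamma)  # normalised displacement correlation
        total += multiplicity * math.cos(2 * math.pi * m * cell * q) * math.exp(x * corr)
    return math.exp(-x) * total


if __name__ == "__main__":
    n_cells, cell, sigma = 50, 32.0, 0.8
    for gamma in (100.0, 10.0):
        for h in (1, 5, 10, 15):
            peak = growth_model_lattice_factor(h / cell, n_cells, cell, sigma, gamma)
            print(f"gamma = {gamma:6.1f}  order h = {h:2d}  peak / N^2 = {peak / n_cells ** 2:.4f}")
```

In this sketch a very small γ recovers the rigid-body form of Eq. (9.30), a very large γ recovers the ideal lattice sum, and for intermediate γ the peak height falls steadily with Bragg order.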

Fig. 9.10

Scattered intensity evaluated from the growth model, Eq. (9.36), with the parameters σ = 0.8 Å and γ = 100 Å and 10 Å in (a) and (b), respectively. Even with a relatively high σ, since γ is so large in (a), Bragg peaks persist until the edge of the image since many unit cells move as a group. In (b), the situation is closer to rigid-body disorder of Sect. 9.4.2. For details see Box 9.1

9.4.7 Rotational Rigid-Body Disorder

If the rigid body undergoes rotational oscillations (librations), the diffuse scattering is, in general, not a direct modulation of the continuous diffraction intensities. This is because rotations displace different parts of the molecule by different amounts. The atoms on the rotation axis for axial rotations do not displace at all. This effectively makes the rotationally disordered electron density fundamentally different from the rigid-body density. The situation is more tractable if translational disorder is also present and the displacement of all atoms due to rotations is small compared to translations. In that case, one observes a rotationally blurred copy of the continuous diffraction, which can still be phased under certain conditions using partial coherence methods. Figure 9.11 shows the intensity distributions when there is both translational and rotational disorder. The tolerance to rotation is similar to that of the Crowther condition for the angular step size of tomography [13], given by the resolution divided by the molecule width, and thus smaller rigid bodies can tolerate larger degrees of rotational disorder.
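As a rough numerical illustration of the tolerance quoted above, the angle can be estimated as the resolution divided by the molecule width. The sketch below assumes this simple ratio (in radians) and uses hypothetical sizes; it is an order-of-magnitude guide only.

```python
import numpy as np

def rotation_tolerance_deg(resolution, width):
    """Approximate tolerable rotation, resolution / width, converted to degrees.

    resolution and width must be in the same length units (for example, angstroms).
    """
    return np.degrees(resolution / width)

# Hypothetical example: a 100 A wide rigid body examined at 2 A resolution
print(rotation_tolerance_deg(2.0, 100.0))  # about 1.15 degrees
# Halving the width doubles the tolerated rotation
print(rotation_tolerance_deg(2.0, 50.0))
```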

Fig. 9.11

Intensity distribution from both rotational disorder about the center of mass of the molecule with standard deviation of 4° and rigid-body translational disorder with σ = 0.8 Å and 0.2 Å in (a) and (b), respectively. The pattern in (a) is recognizably a slightly rotationally blurred version of Fig. 9.3a. As the relative contribution of rotational disorder is increased, additional features appear and the diffuse scattering cannot be interpreted as simply transformed continuous diffraction. For details see Box 9.1

Box 9.2

In Sect. 9.4, we have seen the effect of various forms of crystalline disorder on the scattered intensity. In general, several of these are likely to occur in all crystals to some extent. In conventional protein crystallography, uncorrelated disorder is assumed by default since Bragg intensities are insensitive to correlations in displacements. Comparing the figures in this section, it is striking how persistent the structural information is in the continuous diffraction. This stems from the fact that the correlation functions are themselves relatively unstructured and their main effect is to broadly modulate the pair correlation (autocorrelation function) rather than to change the distribution of those pair correlations. As such, the continuous diffraction, in combination with Bragg intensities and a suitable treatment of the correlation function, should provide robust information to obtain the molecular structure. This is akin to strategies of accounting for partial coherence, which have brought much success to the field of diffractive imaging [58].

9.4.8 An Example System: Photosystem II

Continuous diffraction from a disordered crystal was first identified as useful for structure determination in crystals of the membrane protein complex photosystem II (PS II) [1]. Membrane protein surfaces have both hydrophobic and hydrophilic parts. One way to generate a stable crystal in an aqueous environment is to use detergents which form micelles around the hydrophobic surfaces. These micelles mediate the crystal contacts, but since they are flexible, the contacts are relatively soft. The crystals are therefore not perfectly ordered and, in this experiment, produced observable Bragg peaks to a resolution of only around 4.5 Å. However, one can observe weak continuous scattering at higher scattering angles all the way to the edge of the detector (Fig. 9.12). After subtracting the background and combining the strongest diffraction patterns as described in Sect. 9.5, one can observe a striking continuous signal. This continuous diffraction data is available as CXIDB Entry 59 (http://cxidb.org/id-59.html).

Fig. 9.12

(a) An X-ray FEL snapshot “still” diffraction pattern of a PSII microcrystal shows a weak speckle structure beyond the extent of Bragg peaks, which is enhanced in this figure by limiting the displayed pixel values. (b) Structure factors obtained from Bragg peak counts from 25,585 still patterns, displayed as a precession-style pattern of the [001] zone axis. (c) A rendering of the entire 3D diffraction volume assembled from the 2848 strongest patterns. (d) A central section of the diffraction volume in c normal to the [100] axis. Speckles are clearly observed beyond the 4.5-Å extent of Bragg diffraction (indicated by the white circles in b and d) to the edge of the detector. Caption and figure from Ref. [1]

These molecules crystallize in the orthorhombic $P2_12_12_1$ space group, which means that there are four asymmetric units in each unit cell. Since the choice of which four units make up the unit cell is arbitrary, we do not consider that the whole unit cell moves as a single rigid body. In order to verify that this is indeed continuous diffraction from some rigid bodies, the sizes of the features (or speckles) in the pattern can be examined. This is done quantitatively by calculating the autocorrelation as described in Sect. 9.2.4 using small regions (masked with a Gaussian function) in the high-resolution parts of the 3D intensity distribution. It is clear from Fig. 9.13a that the autocorrelation is not only of finite extent (or support) but that it also has the same point-group symmetry as the crystal. From the previous sections, this suggests a strong possibility that rigid-body translations can explain the scattering and that each asymmetric unit is an independent rigid body. For PS II the asymmetric unit is itself a dimer, so it is possible that each monomer is independently displaced as a rigid body. However, the monomer is too small and the resulting symmetrized autocorrelation seen in Fig. 9.13c does not match the data. If the rigid body were smaller than the dimer, the features in the measured diffraction pattern would be bigger.
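The autocorrelation estimate from a Gaussian-masked region can be sketched as follows. This is an illustrative implementation under stated assumptions, not the analysis code of Ref. [1]: the intensity array is assumed to be stored in FFT layout (origin at index 0), the window centre and width are given in voxels, and the function name is hypothetical. The Gaussian window blurs the autocorrelation slightly but leaves its extent recognizable.

```python
import numpy as np

def local_autocorrelation(intensity, center, width):
    """Magnitude of the inverse FFT of a Gaussian-windowed intensity region.

    intensity : N-dimensional array of diffraction intensities
    center    : voxel indices of the window centre (one per dimension)
    width     : standard deviation of the Gaussian window, in voxels
    """
    grids = np.meshgrid(*[np.arange(s) for s in intensity.shape], indexing="ij")
    r2 = sum((g - c) ** 2 for g, c in zip(grids, center))
    window = np.exp(-r2 / (2.0 * width**2))
    # Shift so that zero displacement sits at the centre of the output array
    return np.abs(np.fft.fftshift(np.fft.ifftn(intensity * window)))

# Toy 2D check: an 8 x 8 object gives an autocorrelation confined near the centre
rng = np.random.default_rng(0)
obj = np.zeros((64, 64))
obj[:8, :8] = rng.random((8, 8))
ac = local_autocorrelation(np.abs(np.fft.fftn(obj)) ** 2, (32, 32), 5.0)
```

The extent of the non-negligible part of `ac` is about twice the object width, which is the property used above to judge the size of the rigid body.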

Fig. 9.13

(a) Electron density autocorrelation projected along the crystal c axis. (b) Point-group symmetrized autocorrelations calculated from the PS II dimer, and (c) the PS II monomer projected along the same axis. Reproduced from Ref. [1]

Using these parameters as justification, the authors proceeded to perform iterative phasing on these high-resolution diffuse intensities. As a proof-of-principle first step, the low-resolution data, where the Bragg peaks are visible, was replaced by the molecular transform obtained from the inverse Fourier transform of the phased Bragg peaks. Thus, the phase problem was reduced to just determining the high-resolution structure from the 222-twinned intensities. In addition, since the low-resolution model was assumed to be known, one could also obtain a good molecular envelope. The use of this envelope encodes the assumption that the protein is “compact,” i.e., it is a connected volume and it does not consist of thin tendrils of electron density far from the bulk of the molecule. Both of these are reasonable for almost any protein, especially if the object moves as a rigid body. In Fourier space, the modulus projection $P_M$ of Eq. (9.18) was applied to the high-resolution voxels while the entire complex Fourier amplitude was replaced by the Bragg model at low resolution. With this modified projection, the difference map algorithm was applied [14, 16] to obtain a higher resolution structure.
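The structure of this modified Fourier-space projection can be sketched as below. This is a simplified illustration and not Eq. (9.18) itself: it treats a single, untwinned molecular transform, whereas the 222-twinned case requires the symmetry-related amplitudes to be rescaled jointly. All array and function names are assumptions made for the example.

```python
import numpy as np

def modified_modulus_projection(density, measured_amp, bragg_model, low_res_mask):
    """Impose measured moduli at high resolution and the Bragg model at low resolution.

    density      : current real-space estimate (3D array)
    measured_amp : square root of the continuous intensities, in FFT layout
    bragg_model  : complex Fourier amplitudes derived from the phased Bragg data
    low_res_mask : boolean array, True where the Bragg model replaces the amplitude
    """
    F = np.fft.fftn(density)
    mod = np.abs(F)
    safe = np.where(mod > 0, mod, 1.0)
    # Keep the current phases; voxels with zero modulus get zero phase
    phase = np.where(mod > 0, F / safe, 1.0)
    F_new = measured_amp * phase
    # Replace the entire complex amplitude at low resolution
    F_new[low_res_mask] = bragg_model[low_res_mask]
    return np.fft.ifftn(F_new)
```

In an iterative scheme such as the difference map, this projection would alternate with a real-space projection that enforces the molecular envelope; that part is omitted here.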

Further analysis confirmed that the continuous diffraction consisted of the sum of four independent rigid bodies [8]. This was achieved by examining the statistics of the diffraction intensities in shells of reciprocal space. As for Bragg peaks, the intensities of the continuous diffraction of a molecule closely follow a negative exponential distribution that would be expected from the coherent diffraction of a set of atoms in random locations [59]. The most likely intensity value in the pattern is zero, which is seen in the surroundings of the rarer maxima of speckles. The distribution changes markedly when summing the intensities of two different patterns. In this case it is unlikely that all the zeros coincide, and the distribution changes to a gamma distribution with reduced contrast and variance. Such distributions form the basis for analyzing the diffraction of twinned crystals [40], and the same can be applied here. Additionally, this was found to give a way to estimate the contribution of a spherically symmetric unstructured background (such as the third term in Eq. (9.31) which gives intensities that are approximately normally distributed), as described in Sect. 9.5. This analysis showed that the total counts in the continuous diffraction were about four times that of the Bragg peaks, and the background contained 100 times more photons. The improved analysis also gave a Pearson correlation between the modelled diffraction and the measured continuous diffraction of about 0.77, as shown in Fig. 9.14. This degree of correlation was achieved by assuming 1° of rotational disorder. Without this rotational blurring of the modelled patterns a correlation of 0.67 was obtained.
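The change in statistics on summing independent patterns is easy to reproduce numerically. The sketch below draws synthetic intensities only (it uses no experimental data) and assumes each rigid body contributes an independent negative-exponential pattern of equal mean; the function names are illustrative. The contrast, defined here as variance divided by the squared mean, falls from 1 for a single body to 1/N for N bodies.

```python
import numpy as np

def summed_speckle(n_bodies, n_samples, mean=1.0, rng=None):
    """Incoherent sum of n_bodies independent negative-exponential intensities."""
    rng = np.random.default_rng(rng)
    parts = rng.exponential(mean / n_bodies, size=(n_bodies, n_samples))
    return parts.sum(axis=0)

def contrast(intensities):
    """Variance divided by squared mean; equals 1/N for N equal independent bodies."""
    return intensities.var() / intensities.mean() ** 2

for n in (1, 2, 4):
    sample = summed_speckle(n, 200_000, rng=0)
    print(n, contrast(sample), 1.0 / contrast(sample))
```

The inverse contrast printed in the last column is a simple estimator of the number of independently moving bodies, valid only when no background is present.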

Fig. 9.14

(a) Central slices of the merged volume of experimental continuous diffraction intensities, normal to the (010) lattice vector, compared with (b) the same section of the simulated continuous diffraction assuming a rotational disorder of 1° RMS and a translational disorder of 2 Å RMS. (c) The difference of the experimental and simulated intensities, shown on the same color scale as (a) and (b). (d) Plot of the Pearson correlation coefficient in resolution shells between the experimental and simulated data, confirming that rigid-body displacements of the Photosystem II dimer account for the majority of the observed continuous diffraction. Reproduced from Ref. [8]

As we have discussed in the previous section, the real crystal probably consists of other kinds of disorder than just uncorrelated rigid-body displacements. There may be some amount of uncorrelated atomic disorder (Sect. 9.4.1), liquid-like motion within the rigid body (Sect. 9.4.5), correlations between rigid bodies (Sect. 9.4.6), and rotational rigid-body disorder (Sect. 9.4.7). In addition, there may be biologically relevant conformational motions reflecting the dynamic behavior of the protein in its native state. The exploration of all these possibilities to improve the retrieved structure is ongoing and will hopefully lead to reliable structure improvement using this continuous diffraction. Finally, since the continuous diffraction is also visible at lower resolutions, one could also envision fully de novo phasing using both the Bragg peaks and the inter-Bragg intensities.

9.5 Measuring and Processing Continuous Scattering

Although the technique of crystallography is well advanced and data collection at many beamlines is routine, the accurate measurement of continuous diffraction requires extra preparation and care, and the data analysis is different to conventional measurements. Diffraction patterns are recorded with a quasi-monochromatic collimated beam following the same source requirements as for monochromatic macromolecular crystallography. However, diffraction patterns are ideally recorded as “stills” or “snapshots” (no sample rotation during exposure) so that the pixelated diffraction pattern recordings can be mapped into voxels in a 3D array in a similar fashion to tomography or coherent diffractive imaging (see Fig. 9.2). The angular step size between measurements is set by the Crowther condition of tomography [13], equivalent to the Shannon sampling of speckles on subsequent patterns at the highest scattering angles q. This in turn depends on the size of the rigid unit, which can readily be estimated from the speckle size in any given diffraction pattern. The required coverage of reciprocal space depends on the point group of the crystal. For a P1 crystal, this corresponds to the half space (in fact a little more to compensate Ewald sphere curvature), unless anomalous diffraction is to be measured. Diffraction can be recorded from one or several crystals, using an X-ray tube, synchrotron radiation beamline, or free-electron laser, depending on the crystal size and experiment need. The latter source naturally gives snapshot patterns, usually of unknown orientation. As with serial crystallography, the orientations of patterns can be determined by indexing the observable Bragg peaks [56, 57] as described in Chap. 7. Once relative orientations are known, patterns must be corrected for detector artifacts [61], background removed, and patterns scaled, before interpolating and summing them into a 3D array [60].

For the separate analysis of continuous diffraction from Bragg data, the Bragg peaks must be filtered from the data, which is best carried out prior to merging the patterns. There are several kinds of low-pass filters to remove sharp peaks and other features from the pattern, such as blurring with kernels of several pixels width or determining values that differ from the median within such a kernel. Since the locations of Bragg peaks are known, they can be further masked from the 3D array. Bragg peaks account for only a small fraction of all voxels, and one can quite aggressively mask out pixels at the peaks and surrounding them without losing information from the continuous diffraction. Indeed, when in doubt whether the peaks are influencing the continuous diffraction it is better to increase the mask of peaks rather than to phase with inappropriate data. The missing data can be allowed to float during the iterative phasing process, which will be heavily constrained by the support in any case.
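One of the filters mentioned above, flagging pixels that differ strongly from the local median, might look like the following sketch. The kernel size, the threshold, and the single global noise scale are assumptions chosen for brevity; real data would normally need a noise estimate that varies with scattering angle.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_median(pattern, size=5):
    """Median of each pixel's size x size neighbourhood (reflecting at the edges)."""
    pad = size // 2
    padded = np.pad(pattern, pad, mode="reflect")
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))

def bragg_mask(pattern, size=5, threshold=6.0):
    """Boolean mask, True where a pixel exceeds its local median by many noise widths."""
    resid = pattern - local_median(pattern, size)
    # Robust noise scale from the median absolute residual
    scale = 1.4826 * np.median(np.abs(resid)) + 1e-12
    return resid > threshold * scale

# Synthetic example: flat Poisson background with one sharp peak
rng = np.random.default_rng(1)
pattern = rng.poisson(50.0, size=(64, 64)).astype(float)
pattern[20, 20] += 5000.0
mask = bragg_mask(pattern)
```

Masked pixels can then be excluded before merging. Dilating the mask by a few pixels would follow the advice above to mask generously around each peak.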

The main difference between the measurement and treatment of Bragg and continuous diffraction of course stems from the diffuse nature of the continuous diffraction, which is much weaker per pixel than Bragg peaks since the counts are spread out over many more pixels. The continuous diffraction due to substitutional disorder in time-resolved measurements is presumably even more difficult to measure since it stems from smaller units than the entire molecule. Molecular continuous diffraction tends also to be much weaker than the structureless background that arises from air scatter, the medium containing the crystal, beamline optics, and atomic disorder in the crystal (possibly induced by the X-ray irradiation [6]). In the example of PS II (Sect. 9.4.8), the background was about 100 times higher than the diffuse scattering.

Whereas Bragg peaks can easily be distinguished and separated from any incoherent background, continuous diffraction is not so easily separated from it. For a crystal with a single rigid unit (P1 symmetry), it may be possible to use the local minima of the measured pattern or the merged intensities to estimate the level of the incoherent background. However, when the continuous diffraction consists of the incoherent sum of multiple rigid units the background estimation can no longer be guided by the local minima since intensity zeroes in the continuous diffraction are unlikely in this case. As long as the background is due solely to scattering from structureless components, such that it is spherically symmetric (after correction for the polarization of the incident beam), then the statistics of the intensities can be used to separate continuous diffraction of structured rigid objects from this unstructured background [8]. Briefly, as mentioned above, the distribution of intensities in shells of q of the continuous diffraction from a disordered crystal follows Wilson statistics. For a single rigid object per unit cell the distribution of intensities is the negative exponential distribution, but when the crystal consists of $N_b$ rigid bodies (which incoherently sum as in Bragg diffraction from a twinned crystal) the intensities follow the Gamma distribution with a probability function

$$\displaystyle \begin{aligned} p(I) = \frac{N_b^{N_b} I^{N_b-1}}{\varSigma^{N_b}(N_b-1)!}\exp\left(\frac{-N_b I}{\varSigma}\right),\;\; I>0, \end{aligned} $$
(9.37)

with a mean of Σ and a variance of $\varSigma^2 / N_b$. That is, in the absence of background, the number of rigid units can be determined by comparing the mean and variance of intensities in a q shell. As shown by Chapman et al. [8], the addition of an unstructured background with a normal distribution (or more accurately, Poisson when photon counts are low) alters the distribution of intensities and their corresponding moments. The mean of the structured diffraction (Σ), as well as the mean ($\mu_{\mathrm{back}}$) and variance ($\sigma_{\mathrm{back}}$) of the background can be determined from the mean, variance, and skewness of the measured intensities. This allows the background $\mu_{\mathrm{back}}$ to be estimated and subtracted in each q shell of every measured pattern, as well as the estimation of a scaling factor Σ that can be used when merging patterns into the 3D array. An improved variation of this method can be applied when photon counts are known [8].
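One way to see how three moments suffice is the following sketch, which assumes that the number of rigid bodies is known and that the background is normally distributed and independent of the signal. It is a method-of-moments illustration derived from Eq. (9.37), not necessarily the exact procedure of Ref. [8]. A gamma-distributed signal has a third central moment of 2Σ³/N_b², while a normal background has none, so the third moment isolates Σ; the mean and variance then give the background parameters. The function and variable names are hypothetical.

```python
import numpy as np

def separate_background(intensities, n_bodies):
    """Estimate signal mean, background mean, and background variance in one q shell.

    Assumes intensities = gamma signal (Eq. 9.37, n_bodies known) + normal background.
    """
    m1 = intensities.mean()
    centred = intensities - m1
    m2 = np.mean(centred**2)   # total variance
    m3 = np.mean(centred**3)   # third central moment, from the signal only
    signal_mean = np.cbrt(n_bodies**2 * m3 / 2.0)      # Sigma
    mu_back = m1 - signal_mean
    var_back = m2 - signal_mean**2 / n_bodies
    return signal_mean, mu_back, var_back

# Synthetic shell: 4 rigid bodies, Sigma = 10, background mean 100 and s.d. 3
rng = np.random.default_rng(2)
shell = rng.gamma(4, 10.0 / 4, size=1_000_000) + rng.normal(100.0, 3.0, size=1_000_000)
print(separate_background(shell, 4))
```

Because the third moment is sensitive to noise, a large number of pixels per shell is needed, and the estimate degrades as the background grows relative to the signal.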

This method of background estimation relies upon the fact that the background really is spherically symmetric. It is therefore very important to minimize any parasitic scattering in the experiment that is not symmetric, such as scatter from beamline optics and shadowing of that scattering by components downstream of the optics. The sensitivity of the analysis of continuous diffraction to this kind of artifact is much greater than for Bragg diffraction, so a beamline setup that is adequate for conventional crystallography is not necessarily suitable for continuous diffraction measurements. At synchrotron beamlines, such shadows can originate from a cryo or humidity jet nozzle, the beamstop holder, or the loop holder. The shadows are usually produced by the X-rays scattered upstream from the sample, for example from air. The beamstop and its holder must also be more carefully aligned than for conventional experiments to ensure that low-angle data is usable in subsequent analysis. If such shadowing is stable during the course of the measurement, then the affected regions of the detector should be masked and not used in further analysis. This may reduce the efficiency of data collection, or in the worst case may cause an incomplete measurement. A way to determine the detector mask to be used is to record long exposures with or without the sample or sum a large number of patterns together (a pixel-based sum in the detector frame of reference), and then subtract the rotationally averaged pattern after polarization correction. The resulting mask is then applied to the patterns to recompute the difference from the rotational average, and perhaps updated. This procedure is repeated several times to ensure the reliable detector pixels are identified.

Any non-symmetric background that varies over the set of measured patterns will not be correctable. This is most likely caused by the means to introduce the sample into the beam. Liquid jets used in X-ray FEL experiments (see Chap. 5) are particularly good since they tend only to give diffuse scattering from the liquid, although the tip of the nozzle may cause shadowing at high angles and misalignment of the jet to the focused X-ray beam causes streaks perpendicular to the liquid column at low resolution. Unstable jets can have different and unpredictable scattering from pattern to pattern. The recent double-flow focus jet is particularly stable [37]. “Fixed target” raster-scanning of crystals at X-ray FELs (Chap. 5) may give variable shadowing due to the movement of the sample support, and the support itself gives diffuse scattering. Chips made of Kapton can produce rather sharp rings, and single-crystal silicon chips can produce non-symmetric diffuse scattering due to strain or thermal disorder. For a tomographic series collected at a synchrotron radiation facility, crystals are often mounted in nylon loops which give different scattering depending on the angle of rotation of the loop. This can be avoided by mounting a large crystal sticking out of the loop or by measuring only in a limited angular range where the loop does not come close to occluding the beam. Other variable contributions to the dataset include diffraction from ice or salt, which produce strong Bragg peaks or Debye-Scherrer rings. These can be identified and removed by plotting a radial average curve (intensity vs detector radius or q), smoothing it, and analyzing the difference between the original and smoothed curves. This method works rather well because ice or salt rings are usually quite sharp.
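The ring-detection step at the end of this paragraph can be sketched as follows. The choices made here are assumptions for illustration: a running median is used as the smoother, the noise level is estimated from second differences of the profile, and the bin count, window width, and threshold are arbitrary. Function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def radial_profile(pattern, center, n_bins):
    """Mean intensity in n_bins equal-width bins of radius from center (row, col)."""
    y, x = np.indices(pattern.shape)
    r = np.hypot(y - center[0], x - center[1])
    bins = np.minimum((r / r.max() * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(bins.ravel(), weights=pattern.ravel(), minlength=n_bins)
    counts = np.bincount(bins.ravel(), minlength=n_bins)
    return sums / np.maximum(counts, 1)

def find_sharp_rings(profile, smooth_width=15, threshold=5.0):
    """Indices of radial bins that rise sharply above the smoothed profile."""
    pad = smooth_width // 2
    padded = np.pad(profile, pad, mode="edge")
    smooth = np.median(sliding_window_view(padded, smooth_width), axis=-1)
    resid = profile - smooth
    # Second differences cancel a linear trend; their spread is sqrt(6) times the noise
    d2 = np.diff(profile, n=2)
    scale = 1.4826 * np.median(np.abs(d2)) / np.sqrt(6.0) + 1e-12
    return np.flatnonzero(resid > threshold * scale)
```

Pixels falling in the flagged radial bins can then be masked in each pattern before merging. Because the noise in a radial average varies with radius, bins near the beam centre and in the detector corners are more prone to false positives with a single global threshold.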

Measurement of single crystals at synchrotron radiation beamlines must contend with radiation damage, and care must be taken not to exceed tolerable doses. While a thorough study of the effects of radiation damage on continuous diffraction has not yet been made, global damage will have the effect of increasing background and reducing contrast of the continuous diffraction, as indicated by an increase in $\beta_a$ in Eq. (9.31).

The final arrangement of data into a 3D volume is carried out by interpolating each diffraction pattern onto the voxels of that array that intersect with the corresponding Ewald sphere. Besides scaling by the mean signal, Σ, for each pattern, the voxel values must be divided by the number of observations in that particular voxel. This number tends to be large for low-resolution voxels and reduces approximately as 1∕q until the center of the detector edge is reached. For serial crystallography from randomly oriented crystals, this usually defines the boundary of accurately measured data since the detector corners give much lower coverage. In such experiments we find it is not usually the best strategy to include every indexed pattern in the 3D merge but to keep only the strongest with the best signal-to-noise ratio of the continuous diffraction. Adding a large number of weak patterns usually increases noise without the benefit of improved signal. Outlying patterns that have poor correlation with the constructed 3D volume can be excluded, although this process may introduce bias.
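The bookkeeping of this merge, accumulating scaled values and dividing by the number of observations per voxel, can be sketched as below. For brevity the sketch uses nearest-voxel assignment instead of interpolation and assumes that the mapping of each detector pixel to a voxel (from the pattern orientation and the Ewald sphere geometry) has already been computed. The names and the use of NaN for masked pixels are illustrative assumptions.

```python
import numpy as np

def merge_patterns(patterns, voxel_indices, scales, shape):
    """Average scaled pixel values into a 3D array.

    patterns      : list of 1D arrays of pixel values (NaN marks masked pixels)
    voxel_indices : list of (n_pixels, 3) integer arrays, one per pattern
    scales        : per-pattern scale factor (for example, the mean signal Sigma)
    shape         : shape of the 3D output volume
    """
    total = np.zeros(shape)
    counts = np.zeros(shape)
    for values, idx, scale in zip(patterns, voxel_indices, scales):
        ok = np.isfinite(values)
        i, j, k = idx[ok].T
        # np.add.at accumulates correctly when several pixels hit the same voxel
        np.add.at(total, (i, j, k), values[ok] / scale)
        np.add.at(counts, (i, j, k), 1)
    merged = np.full(shape, np.nan)
    seen = counts > 0
    merged[seen] = total[seen] / counts[seen]
    return merged, counts
```

Returning the `counts` array alongside the merged volume makes it straightforward to inspect coverage and to exclude voxels with too few observations from later analysis.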

Box 9.3

Processing of the measured data for further analysis consists of several steps:

  • Determination of the exact geometry of the experiment (relative position of the detector with respect to the sample) and masking out bad regions or pixels of the detector.

  • Masking of statistically outlying regions in each pattern (such as ice or salt rings) and the removal of Bragg peaks.

  • Correction of polarization and subtraction of symmetric background.

  • Scaling of each pattern, either using the Bragg peaks (for example, from XDS output [26]) or by estimating continuous signal level from Wilson statistics [8].

  • Merging of all diffraction patterns into a single 3D volume using geometrical information and beam parameters. For serial crystallography the orientation can be obtained by indexing [60].

  • Final subtraction of the radially symmetrical background from the 3D merged data, based on Wilson statistics.

  • If several datasets are merged, each 3D volume must be scaled, using again the procedure described in Chapman et al. [8].