We have an object, possibly a very small object, and we want to make a magnified image of it. There are various strategies open to us. For many years, the answer was to make one or more lenses as accurately as possible and arrange them as accurately as possible in a microscope column (Fig. 17.1a). New possibilities opened up with the advent of computers. If the image has minor systematic faults arising from the physical hardware, we can process it in the computer to improve upon it (Fig. 17.1b). Alternatively, we can build flexible adaptable optics that are controlled by a feedback loop, via detailed quantification of the distortion in the image. One of the biggest breakthroughs in transmission electron imaging relies on computationally measuring lens aberrations and then compensating for them using complicated nonround lenses driven by dozens of variable currents (Fig. 17.1c). Similar compensation strategies have been employed in astronomical telescopes and in many other fields of optics.

Fig. 17.1a-d

The computational imaging paradigm. (a) The conventional microscope. (b) Errors in aberrated or imperfect optics are corrected by postprocessing. (c) Errors in the optics are measured in a detector plane. Variable optics are then adjusted to improve the image via a feedback computation. (d) Ptychography: the detector measures something rich in information, but not the image. Decoding computation is used to form the final image. The system is a type of transmission line

A more radical conceptual leap is to realize that perhaps we should not worry about our image at all. All we need is a detector lying somewhere in an optical system that records something rich in information (Fig. 17.1d), whether an image or not. Now what we need is optics that encodes information about our object as efficiently as possible before it reaches the detector. Once we have these data, we can decode them (assuming we know something about the encoding optics) and computationally generate our image. This imaging paradigm is nowadays widely called computational imaging. A communication channel replaces the optical transfer function of the traditional lens; structural information is transferred in three steps: physical encoding, detection, and finally a decoding algorithm.

Ptychography is a method of computational imaging. It employs a source of radiation (light or matter waves of arbitrary wavelength), an object that scatters that radiation, and a detector. We will see that we can have all sorts of optical elements between the source and the object, and between the object and the detector: the variety of modern implementations of ptychography is enormous. But it has five defining properties:

1. There must be an optical component, usually, but not always, the object itself, that can move laterally relative to whatever is illuminating it.

2. The detector must be in an optical plane in which the radiation scattered from the optical component has intermixed to create an interference pattern: usually a diffraction pattern, but more generally any interference pattern, possibly even an image.

3. The detector collects at least two interference patterns, arising from at least two known, different lateral physical offsets of the illumination with respect to the object or other relevant optical component (modern implementations can use hundreds or thousands of lateral offsets). The offsets must be arranged so that adjacent areas of illumination overlap with one another.

4. The source of radiation must be substantially (but not necessarily wholly) coherent.

5. The image of the object is generated by a computer algorithm, which solves for the phase of the waves that arrived at the detector, even though the detector itself was only able to measure the intensity (flux or power) of the radiation that impinged upon it. If we were designing a normal communication channel, say a telephone transmission line, the very last thing we would ever choose to do is to have this catastrophic disposal of phase information right in the middle of the system. But that is what happens with light, x-ray, and electron detectors. An important strength of ptychography is that it can handle this regrettable phase problem with no effort at all.

One thing is clear. Unlike the immediacy of a conventional microscope, ptychography puts a huge obstruction between the microscopist and the image. First, we must wait while at least two interference patterns are recorded; the experiment takes time. Second, we have to rely on the computer to reconstruct the image from the data. The data usually look nothing at all like the object of interest; we must wholly trust a computer algorithm to deliver our results, something that unnerves quite a lot of scientists.

So, why would anyone want to use such a roundabout way of creating an image? There are three key benefits of the technique:

1. It does not need a lens. Most implementations of ptychography do, in fact, use a lens, but the lens can be of very poor quality; it will not affect the final (high) resolution of the ptychographic image. This has been the driving motive for x-ray ptychography, where high-resolution lenses are hard to make and very costly. X-ray ptychography nowadays routinely improves upon lens-based x-ray imaging resolution by a factor of between 3 and 10. Ironically, the original motive of ptychography was to overcome the resolution limit of electron lenses, but aberration correctors now provide such high resolution—fractions of an atomic diameter—that extra ptychographic resolution has little to offer, at least at the time of writing. Of course, at visible light wavelengths, lens optics is a mature field, already offering wavelength-limited resolution.

2. It produces a complex image (that is, both the modulus and phase of the exit wavefield at the back of the object). Image phase is an elusive thing, which is, nevertheless, crucial for imaging transparent objects like live biological cells that do not absorb radiation, but do introduce a phase change in the wave that passes through them. Consequently, all sorts of optical methods have been developed over the last century for expressing the phase in an image, for example Zernike contrast, differential phase contrast, holography, or processing through-focal series. However, it is a matter of fact that ptychography produces extraordinarily good phase images; the transfer function of the technique is, at least under most circumstances, almost perfect. This phase signal remains a pressing need over all wavelengths and is the main motive for ptychography at visible light and electron wavelengths. It is also the key to the success of high-resolution x-ray ptycho-tomography (Sect. 17.6.1).

3. We have stated that all sorts of optical components can be used in a ptychographical experiment. One would suppose that the characteristics of these components would have to be known exactly, or at least very well. After all, methods like holography need exquisite optical alignment, and even then various calibration steps must be undertaken to characterize the reference wave inhomogeneities. But a remarkable peculiarity of ptychography is that the method self-calibrates. It blindly characterizes the optical components in the experimental set up. It computationally provides a map of all the aberrations in any lens being used in the system, including apertures and slits. It can measure (and remove the effects of) any partial coherence in the source. It can find and correct for errors in the lateral displacements that are themselves the central source of the ptychographical information. It can infer the physical position of the detector. It can even correctly estimate the intensity of thousands of pixels that are inoperable in the detector and infer the intensity that would have been measured outside the edges of the detector had the detector been larger.

What is the secret of this remarkable technique? There are many inverse computational imaging methods that solve for extra information, say the phase of an image, using multiple images collected as a function of some variable or other. More images mean more measurements, and more measurements usually mean more overall diversity in the entire data set. It happens that the source of diversity in ptychography—lateral shift—is easy to implement experimentally; unlike, say, a through-focal series, ptychographical data can be collected in endless abundance; and the diversity of these data is large. In other words, the communication channel of ptychography (Fig. 17.1d) has a very wide bandwidth (Sect. 17.4). Because most of this bandwidth is redundant, any errors in the encoding system can be corrected; it is very hard for the message (the image) to be lost or corrupted by instrument noise (of course, the fundamental limits of counting statistics will always apply).

Any computational imaging strategy must have its decoding algorithm. Ptychography's decoding algorithm must solve the phase problem, which has historically been seen as extremely difficult. Wave amplitudes add linearly, their intensities do not, and so the solution space is highly nonlinear. We might suppose that the decoding algorithm in ptychography must be very complicated and highly ill-conditioned. This is not the case. Perhaps another key reason for its success is that the most popular reconstruction algorithms available today are both intuitive and very easy to code. They also invariably work without too much tweaking or insider knowledge. It is astonishing that any one of the core algorithms can be used for any of the very diverse range of ptychographical set ups, or for any wavelength—photon or electron. The only exception to this is that the WDD inversion method (Sect. 17.10) must have very densely sampled data; conversely, any of the iterative algorithms also works when the data are densely sampled.

1 Nomenclature

1.1 Ptychography/cCDI

History dictates that in certain communities ptychography is seen as a type of coherent diffractive imaging (CDI). Recent developments, especially Fourier ptychography, which records images but not diffraction patterns, perhaps render this classification outdated. Furthermore, CDI is inextricably linked with the term oversampling, which is not a fundamental constraint in ptychography. Here, we will therefore reserve the term conventional CDI (cCDI) for methods that recover structure from a single diffraction pattern; ptychography always uses data from more than one interference pattern and is probably best thought of in terms of Fig. 17.1d.

1.2 Illumination/Probe-Object/Specimen

The illumination function is very often called a probe, because the illumination is often made using a lens that converges a conical beam onto the specimen. We will use probe interchangeably with illumination, depending on context. The same applies to object and specimen.

1.3 Exit Wave or Transmission Function

There has been some confusion about whether ptychography solves for the exit wave of the object or the transmission function of the object. Very early work on ptychography sometimes used \(\psi_{\mathrm{e}}\) to represent the exit wave from a specimen illuminated by a plane wave. This is only valid if the object function is indeed identical to the exit wave under plane wave illumination, which is true if the object is infinitely thin. However, as soon as we introduce depth effects, say by solving for multiple layers of a specimen (Sect. 17.6.2), then the only interpretation of the object function is as a physical transmission function. The actual exit wave, from one probe position, bears no obvious relation to any of the layers in a 3-D object, and there is no single two-dimensional (2-D) function that can account for all the different exit waves that occur at every probe position. Propagation can also lead to features in the exit wave having higher amplitude than any part of the incoming wave. In short, we solve for transmission functions (and sometimes multiple layered transmission functions), not the exit wave. We do, however, have to solve for each different exit wave at each probe position.

1.4 Object Functions as Propagating Waves

In some configurations, ptychography solves for a wave while an aperture of some type acts mathematically in the role of the illumination. In Fourier ptychography (Sect. 17.5.2), the aperture lies in the back focal plane of the objective lens and the object function is the complex-valued diffraction pattern lying in the same plane. In selected area ptychography (SAP—Sect. 17.5.3), the aperture lies in an image plane, and now the object is the complex-valued image formed by the objective lens. The important point is that the mathematics of ptychography applies to any complex-valued function moved across another complex-valued function. Which one of these scatters (object/aperture) and which one illuminates (probe/image or diffraction pattern) is inconsequential.

1.5 Fourier/Detector Projection

Historically, the projection in the diffraction plane of an iterative reconstruction algorithm has been called the Fourier constraint. Because in ptychography this constraint can occur in the Fresnel near field or in the image (in the case of Fourier ptychography), we will call it here the detector constraint.

1.6 Bright and Dark-Field Data

We may occasionally use the term dark-field intensity. If the illumination is of the form of a convergent beam, then in the far field the aperture in the probe-forming optics appears as a disc. The intensity in the disc is what is used for bright-field imaging in scanning transmission microscopy (the intensity there is collected as a function of a continuous scan of the probe). The intensity outside the disc is then the dark-field intensity, which is invariably much weaker than the bright-field disc. By reciprocity, dark-field intensity in Fourier ptychography takes the same meaning as in conventional microscopy; i. e., when recording a dark-field image, the incident beam has been tilted far enough so that it is blocked off by the objective aperture. The resolution of a perfect lens can only be improved by ptychography if dark-field data are processed.

1.7 The Fat-H and the Trotters

We will later introduce these two informal terms, which are widely adopted by the community but have not been recorded in any published paper. We think this is timely because the subject to which they relate (Wigner distribution deconvolution (WDD)—see Sect. 17.10) has had a recent resurgence. They are compact terms for complicated data structures and are now regularly used in discussions at conferences, etc.

1.8 STEM/STXM

We will call both the scanning transmission electron microscope (STEM) configuration and the scanning transmission x-ray microscope (STXM) configuration STE/XM. This is because, being optically equivalent to one another, ptychography treats them both identically.

1.9 Ronchigram

This term is used in the electron literature but rarely in the x-ray imaging literature. It refers to the unscattered beam created by a convergent focused beam in the far field. This is usually circular in electron microscopy (being the shadow image cast by the condenser aperture). For an x-ray Fresnel zone plate lens, it is usually doughnut-shaped because of the central stop required to block undiffracted intensity. If Kirkpatrick–Baez (KB ) mirrors are used, it is rectangular. When the probe is defocused from the object plane, it is equivalent to a Gabor in-line hologram.

1.10 Circles

As a warning to the reader, we remark that the science of ptychography involves lots of diagrams of circles. The probe function (in real space) is often circular or represented by a circle. Diffraction discs (in reciprocal space) from a crystalline object are circular when a focused lens with an aperture in its back focal plane is used to form the probe. The Fat-H and the trotters are made out of parts of circles. Fourier ptychography and SAP ptychography have their own circular apertures. The modulus constraint in the complex plane is circular. One must know which circle is which; they are not all the same!

2 A Brief History of Ptychography

This chapter is not a historical review. However, for the benefit of those new to the subject, we now make one or two nonessential observations about its history.

First: where did the name come from? Ptychography derives from the Greek ptycho, meaning to fold. Hoppe and Hegerl [17.1] introduced it to describe a method of calculating the phase of the Bragg reflections from a crystal [17.2, 17.3, 17.4]. If a localized spot of radiation illuminates a crystal, the Fraunhofer diffraction pattern is a convolution (or in German Faltung, folding) of the crystal Bragg reflections with the Fourier transform of the illumination function. The latter is wide, because the illumination is narrow, and so the Bragg peaks, which are usually perfect spots, are made to overlap one another. If the radiation is coherent, the overlaps interfere with one another (as can be seen in Fig. 17.2c, which we discuss in detail in Sect. 17.10). Hoppe conjectured that this interference could be used to estimate the phase difference between any pair of overlapping discs, bar an ambiguity of a complex conjugate. By shifting the illumination to a second position, the Fourier shift theorem allows the ambiguity between the two diffracted beams to be resolved. He suggested that it might be possible to extend this idea to non-periodic objects, but he did not propose a general solution.

Fig. 17.2a-c

Ptychographic interference. (a) A lens brings a beam to a crossover in front of a periodic object. (b) In the far field we see a shadow image (experimental data using a laser incident on a TEM grid). (c) When the lens in (a) is stopped down by an aperture, we see explicit diffraction orders which interfere with one another

For an explicit description of this phenomenon, see [17.5]. Modern ptychography has very little in common with this original concept, but the name has stuck. In fact, it is still true to say that the word captures the essence of a diffraction pattern determined by a convolution, and a shift of illumination relative to the object. Note that in the context of near-field or full-field ptychography, which we will discuss in Sect. 17.5.4, the Fresnel integral can also be formulated as a convolution, but in this case there is no requirement for the illumination to be localized. In Fourier ptychography, the measured data are the convolution of the image (the Fourier transform of the object diffraction pattern) with the Fourier transform of the lens aperture, i. e., the impulse response function of the lens. Many workers in the field now interpret the folding of ptycho to mean the stitching together of localised areas of the object function. Although a logical inference, it was not the original intention of the term.

After having had one research student (Hegerl) work on the technique, Hoppe abandoned it, as he describes in [17.6] written at the time of his retirement in 1982. There was some work done on ptychography by a small group in Cambridge led by one of the authors during the 1990s (reviewed in [17.5]). Nearly all of this was based on noniterative ptychographic inversion algorithms, except for the work of Landauer, who developed iterative algorithms for Fourier ptychography [17.7]. Chapman successfully applied one of these techniques to soft x-ray ptychographic data [17.8], but this class of direct inversion algorithm required very large quantities of data, which could not easily be handled by the computers available at the time. The x-ray data took a long time to collect, because in those days, source brightness was relatively low. Certainly, the electron detectors available then were utterly dismal. Now that technology has moved on, there is a resurgence of interest in these techniques [17.10, 17.9], which will be discussed in Sect. 17.10, and which may yet prove to be very powerful.

The big explosion of interest in ptychography began in 2007, starting in the x-ray synchrotron microscopy community [17.11, 17.12, 17.13]. We can identify four reasons for this:

1. The development of third generation synchrotrons supplied very bright, spatially coherent sources, suitable for conventional coherent diffractive imaging (cCDI).

2. Following the first Coherence conference in 2001 [17.14], there developed a large community of scientists interested in both the physical implementation and iterative solutions of the cCDI x-ray phase problem as it relates to single diffraction patterns from finite objects [17.15, 17.16]. This meant that when the first real-space iterative solution to the ptychographical phase problem was demonstrated experimentally [17.11, 17.17], there were many workers who could immediately implement it on existing beamlines and instrumentation. It helped that the simplest experimental set up required only an aperture and a stepper stage and that the associated iterative solution [17.18], although very quickly superseded by more comprehensive approaches, was very simple to code. Furthermore, because of the diversity of ptychographical data, reconstruction algorithms for it are relatively robust, at least compared with those used for cCDI.

3. There was a strong demand for higher resolution in x-ray microscopy that could not be easily satisfied by improved optics, but which could be delivered easily by ptychography [17.12, 17.19]. Although ptychography was originally developed to overcome the electron lens resolution problem, by the time it came to maturity, aberration-corrected lenses could provide all the resolution one could usefully employ, although its ability to recover phase accurately is still in demand.

4. The phase sensitivity of ptychographical micrographs, measuring the projected optical potential quantitatively and linearly, meant it could be very effectively used for tomographic imaging [17.20, 17.21], which has become one of its most scientifically significant applications (Sect. 17.6.1).

These benefits were already established in the literature by 2010. Since then there have been numerous developments over many different wavelengths and in many different optical configurations. We will try to cover most of the important trends in this chapter, but as the rate of progress in the field accelerates, much of what we write here will quickly become out of date. Treat this chapter as an elementary introduction to the field.

3 How Ptychography Solves the Phase Problem

Chapter 20 (Spence) of this volume is dedicated to single-pattern diffractive imaging. In many situations of experimental importance, such as the strategy of diffract and destroy, we can only record one diffraction pattern, because after one exposure the object of interest has been damaged or completely destroyed. However, we can use some of the concepts in cCDI to work our way towards an understanding of ptychography, especially in how it solves the phase problem.

3.1 The Phase Problem

Let us consider a very simple version of ptychography, as shown in Fig. 17.3. A source of illumination passes through an aperture and then the object. The resulting exit wave from the object propagates to the far-field Fraunhofer diffraction plane, where it is recorded on a detector with \(N\times N\) pixels. For simplicity, we assume the aperture is so close to the object that there is no diffractive spreading of the beam between the aperture and the object. Figure 17.4 shows two \(N\times N\) arrays of complex (real and imaginary) numbers, related to one another by Fourier transformation. The leftmost array shows an estimate of our object where it has been illuminated by the round aperture. The rightmost array represents the pixels in our detector; this array is the data we measure. Because the Fraunhofer integral is a linear, invertible Fourier transform, we should be able to back Fourier transform the pixels in the detector array and then discover the exit wave coming from the object.

Fig. 17.3

The simplest type of diffractive imaging experiment. An aperture, which we will start by assuming lies right against the specimen, i. e., there is no propagation spreading of the wavefield between it and the specimen. A detector lies in the far-field Fraunhofer diffraction plane

Fig. 17.4

The Fourier relationship between a real-space object delineated by an aperture and its complex-valued Fourier transform lying in the far-field diffraction plane

The exit wave is also a complex function, intimately related to the structure of the object. Roughly speaking, its modulus represents the object's transmittance (1 equals total transmittance, i. e., free space; 0 equals total absorption), and its phase is the accumulated phase difference, relative to free space, caused by the real part of the refractive index of the object as the wave passes through its thickness. Ideally, we want to measure both the modulus and the phase of the exit wave to find out the most about the object.
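For concreteness, a common thin-object model (consistent with, though not spelled out in, the description above) writes the exit wave under unit-amplitude plane-wave illumination in terms of a complex refractive index \(n=1-\delta+\mathrm{i}\beta\) as

$$\psi_{\mathrm{e}}(x,y)=\exp\left[-\frac{2\pi}{\lambda}\int\beta\,\mathrm{d}z\right]\exp\left[-\mathrm{i}\frac{2\pi}{\lambda}\int\delta\,\mathrm{d}z\right],$$

where the integrals run through the thickness of the object; the first factor is the modulus (transmittance) and the second is the phase accumulated relative to free space (its sign depends on the wave convention adopted).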

For some radiations, for example electromagnetic radio waves, it is easy to build an \(N\times N\) detector that can measure both the modulus and the phase of the wave arriving at it. This is how aperture synthesis radio astronomy works. In our case, however, all the radiations that have short enough wavelengths to be useful for high resolution microscopy oscillate at correspondingly high frequencies: no detectors can sense the phase of these oscillations.

What we have here is a classic inverse problem. If we know the object wave, calculating its diffraction pattern—the forward calculation—is easy; it requires one 2-D Fourier transform from the left-hand side to the right-hand side of Fig. 17.4. However, the backward, or inverse, calculation—inferring the object from the recorded data—appears to be profoundly intractable; we can assign any phase at all to each pixel of the diffraction pattern, but how can we select the single set of phase assignments that corresponds to the actual phases lost in the experiment?
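As a minimal illustration of the forward calculation and of what the detector throws away, the following MATLAB sketch builds an invented complex object and aperture (none of these functions or variable names come from the chapter) and computes the far-field intensity:

% Minimal sketch of the forward model of Fig. 17.4, with invented test functions
N = 256;                                     % detector (and calculation box) is N x N pixels
[x, y] = meshgrid(-N/2:N/2-1);
aperture = double(sqrt(x.^2 + y.^2) < N/8);  % round support, well under half the box wide
object = exp(1i*0.5*cos(2*pi*x/32));         % arbitrary phase-only test object
exitWave = aperture .* object;               % wave leaving the specimen
detectorWave = fft2(exitWave);               % forward (Fraunhofer) propagation
intensity = abs(detectorWave).^2;            % what the detector actually records
% With the full complex field, inversion is trivial:
recovered = ifft2(detectorWave);             % equals exitWave to numerical precision
% With intensity alone, naive inversion fails: all phases are implicitly set to zero
wrong = ifft2(sqrt(intensity));              % bears little resemblance to exitWave

The last line makes the point of this subsection: without the detector-plane phases, the linear invertibility of the Fourier transform is of no direct help.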

3.2 Iterative Solution Methods

Like all inverse problems, we proceed by applying constraints, i. e., knowledge from the experiment (the data) and also a priori information, which we know about the object independently of the measurements we have made. In the 2-D image phase problem, the most powerful of these is knowing that the object is both 2-D and exists wholly within a delineated, finite area [17.22, 17.23]. This area is called the support of the object. In practice, everything outside this region is either free space or must be blocked off by an aperture, as in our experiment in Figs. 17.3 and 17.4.

Each pixel in the diffraction pattern corresponds to a single Fourier component in the object exit wave. Changing the phase of that pixel has the effect of laterally shifting the Fourier wave component corresponding to that frequency in the object function. The phase of all the Fourier components in the object must, therefore, be such that they add up to zero outside the support. One can imagine that if all the phases are correct except one, then that one component is bound to give an amplitude contribution to the object wave outside the support, because it no longer cancels against all the other, correct, amplitude contributions lying outside the support. The set of possible phases is now fantastically reduced, although it is still not obvious that there is only one unique combination of phases that gives rise to the localization. In fact, it turns out that this single constraint can very often imply a unique object wave solution [17.23], except for several unavoidable ambiguities, such as a shift of the whole object function, or the fact that the object and its Hermitian conjugate have the same diffraction pattern.

Even if there is a unique solution, this does not mean that we can construct a solution algorithm that will always find it. A key breakthrough in the generalized 2-D image phase problem occurred when Fienup [17.24, 17.25] modified a solution strategy originally pioneered by Gerchberg and Saxton [17.26, 17.27]. The method is intimately related to the iterative methods used in ptychography, so it is worth explaining it conceptually in some detail. With reference to Fig. 17.5, we set up an iterative computational loop. On the left-hand side, we have our computational array representing our estimate of the object function, and also our known aperture, which selects part of our object function. On the right-hand side, we have a computational array representing an estimate of the modulus and phase of our measured data. We also have the measured data itself (its modulus) in an array of identical size.

Fig. 17.5

Computation process for projection onto constraint sets. Real space is on the left, where the aperture constraint is applied; reciprocal space on the right, where the modulus of the measured diffraction pattern is applied but where phase remains untouched

In Fig. 17.5, A to B and C to D are forward and backward propagation transforms, respectively. These model the relationship between our object plane and our diffraction plane, normally via a Fourier transform or a Fresnel integral of some type. Between B and C, we enforce our knowledge of what we have measured in the detector plane: that in this plane the modulus of the wavefunction must be the square root of the intensity of our data. Between D and A we enforce our aperture constraint: there must be no amplitude lying outside the extent of our aperture. Where we start the iteration is a matter of taste, although if we know anything else about the object (more a priori knowledge)—say that it is likely to be mostly transparent, which would be true for many specimens of interest—then we could start at D with an image field made up of 1s (total transparency) in every image pixel.

Now we apply our aperture constraint, moving from D to A. We put all the values of our object function to zero everywhere outside the aperture (mathematically, this process is called a projection). We computationally propagate the result to B. The calculated estimate of the detector wave at B is complex, but its modulus will (most likely) bear no relationship to the measured modulus at the detector. Moving from B to C, we perform another projection, this time applying the constraint of the measured modulus (we replace the modulus that arrived at B with the measured modulus), but we do not touch the phase that came out of B. Now the modulus at C is correct, but the phase is almost certainly wrong.

When we back transform from C to D, the wrong phase from C will almost certainly result in the image having some amplitude over all the field of view, including outside the region of the aperture where we know there should be none. We get rid of this wrong result by simply reapplying the aperture constraint (D to A), forcing all those wrong pixels to zero. We then go around the iteration again, perhaps as many as \(\mathrm{10000}\) times. Sometimes, but certainly not always, this strategy will converge upon a reasonable estimate of the object. There are dozens of variations on this approach, some of which we will discuss later.
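In code, one pass of this loop is only a few lines. The MATLAB sketch below is a bare-bones version, assuming that measuredModulus holds the square root of the recorded intensity (stored in the unshifted ordering that fft2 produces) and that aperture is the known binary support; the variable names are ours, not from the text:

% Minimal sketch of the D -> A -> B -> C -> D loop of Fig. 17.5 (error reduction)
guess = ones(size(aperture));                 % start at D with a fully transparent field
for it = 1:1000
    guess = guess .* aperture;                % D -> A: support (aperture) projection
    G = fft2(guess);                          % A -> B: forward propagation
    G = measuredModulus .* exp(1i*angle(G));  % B -> C: impose measured modulus, keep phase
    guess = ifft2(G);                         % C -> D: back propagation
end

Fienup's modifications alter only how the D-to-A (support) step treats the pixels lying outside the aperture.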

Crucial to what follows is how much data we measure relative to the number of variables we are attempting to solve for. The Fourier transform of the object wave maps one-to-one onto the pixels in the diffraction plane. Once we have lost the phases of the diffraction pattern pixels, we still need enough measurements—at least twice the number of pixels in our object wave—to give the two numbers we require for the real and imaginary components of each object wave pixel.

The number of pixels in the detector fixes the size of our calculation box in the object space, which must be able to contain our aperture. Of course, as far as the computer is concerned, the two arrays—in the object space and the detector space—are only arrays of numbers, which are always of the same dimension. In reality, the detector pitch has a physical size corresponding to the angle subtended from the specimen. This size is inversely proportional to the field of view of our calculation box in the object space, such that, for small angle scattering

$$\Updelta\theta=\frac{\lambda}{D}\;,$$
(17.1)

where \(\Updelta\theta\) is the angular dimension of a detector pixel, \(D\) is the field of view in the object space, and \(\lambda\) is the wavelength of our radiation.

With reference to Fig. 17.6a,b, the upper sine waves in both detector arrays represent a modulus component of the highest frequency that can occur in the diffraction pattern determined by the corresponding physical width of the aperture in the object plane. The lower sine waves are the intensities of these modulus components, which have twice the periodicity of the underlying modulus (the periodicity of \(\sin^{2}\), say, is twice that of \(\sin\)). Clearly the sampling condition of the intensity is not the same as the sampling in our original complex function. For the same sized object (or in our case aperture), the Fourier transform array of intensity will be under-sampled by a factor of 2.

Fig. 17.6a,b

Intensity components of the Fourier transform of an object have half the periodicity of amplitude components. (a) When the aperture fills the real-space object estimate (left), its Fourier transform (right) is undersampled by a factor of 2. (b) Halving the size of the aperture, the intensity of the diffraction pattern can now be sampled adequately, at the Nyquist periodicity in reciprocal space

If we want to measure intensity properly, we have a choice. We can buy a new detector with four times as many pixels (\(2N\times 2N\)) in order to fulfil the Nyquist sampling in the detector plane, or stick with the same detector and make the (physical) diameter of the aperture less than half the width of the calculation box. Let us do the latter, as shown in Fig. 17.6a,b. We have now halved the periodicity of all intensity components in the diffraction pattern so that the detector pixel size can, indeed, record all the information in it. (We could also put the detector twice as far away from the object, but the resolution of our object pixel will then worsen by a factor of 2.)

The requirement for the object aperture to be half the lateral size of the calculation box in real space means that the majority of the unknowns in real space—the empty pixels—are in fact known; they are all zero. There are now more than enough numbers measured in the diffraction pattern to solve for the real and imaginary parts of the object within the smaller aperture area. Note that making the aperture smaller still will not give us any more information because this has the effect of sampling the diffracted intensity at more than the Nyquist frequency. By definition, the Nyquist condition has already captured all the information there is; a qualification is that higher sampling may help if the modulation transfer function (MTF) of the detector falls off quickly.
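To put numbers on these relationships, the short MATLAB sketch below works through (17.1) and the factor-of-2 rule for an invented soft x-ray geometry; all of the numerical values are illustrative assumptions, not taken from the chapter:

% Minimal numerical sketch of the sampling argument
lambda = 2e-9;            % wavelength (m), an assumed soft x-ray value
z      = 0.05;            % specimen-to-detector distance (m), assumed
p      = 30e-6;           % physical detector pixel pitch (m), assumed
N      = 512;             % detector has N x N pixels
dTheta = p / z;           % angle subtended by one detector pixel (small-angle)
D      = lambda / dTheta; % field of view of the calculation box, from (17.1)
dx     = D / N;           % real-space pixel size of the reconstruction
maxAperture = D / 2;      % aperture no wider than half the box, so that the
                          % *intensity* is sampled at the Nyquist rate
fprintf('Box %.2f um, pixel %.1f nm, aperture <= %.2f um\n', ...
        1e6*D, 1e9*dx, 1e6*maxAperture);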

3.3 Ptychography: Multiple Diffraction Patterns

Now comes the trick. Our detector fixes the size of the calculation box surrounding our aperture function. But there is nothing to stop us declaring an indefinitely large array in our computer in order to describe a much bigger object. We match the pixel size of this large array with that of our detector-defined calculation box in the object plane. We move the aperture from one position to the next over the object (or move the object with respect to the aperture) as shown in Fig. 17.7, and collect a diffraction pattern from each aperture position. We now run our iterative loop in Fig. 17.5 on each of these areas, one after the other.

Fig. 17.7

If we move the aperture over a larger field of view, we can collect a diffraction pattern from each aperture position. The constraints are still performed as before (Fig. 17.4), but an on-going estimate is maintained over the whole field of view (right). An estimate of the object function from the first aperture position is fed into the object estimate for the second position within the area of overlap

The first published example of such a calculation is shown in Fig. 17.8a-d [17.28]. Four calculations are run simultaneously, but the areas covered by any one aperture do not overlap with any of the others; the calculations are completely independent of one another and show some of the usual ambiguities inherent to the phase problem. Most noticeably, the cormorant in the phase part of the image also has a centrosymmetric inversion present, with its phase reversed. This is the complex conjugate ambiguity arising from the fact that the Fourier intensity is the same for both these centrosymmetric functions; these two solutions fight with one another because they are both equally valid given the recorded data.

Fig. 17.8a-d

When separate reconstructions are undertaken (via the method in Fig. 17.4), each using a single diffraction pattern from four entirely different areas of an object (a,b), the usual ambiguities of the phase problem arise. The modulus (a) and the phase of the reconstruction (b). In (b), the cormorant appears twice, once reflected and with opposite phase (i. e., its complex conjugate). When the four calculations are undertaken simultaneously with overlapping areas constrained to be identical ((c) and (d), modulus and phase respectively), the reconstruction loses the ambiguities. Reprinted with permission from [17.8]. Copyright 2004 by the American Physical Society

Now consider the lower half of Fig. 17.8a-d. There is one continuously updated estimate of the whole area of the object. At each iteration, a circle of the object corresponding to one of the aperture areas is removed and imbedded in a separate aperture box, as with the single diffraction pattern iteration. Once the object has been updated (a whole cycle from A to A in Fig. 17.5), this circular area is replaced from where it was removed. Now, when we begin our iteration for the adjacent aperture area, we already have a first estimate of the object in the overlap area. This information is fed into the second iterative loop, thus forcing the object solution to be consistent with all the diffraction patterns.

We see that the overlapping update of the object very quickly delivers a much better reconstruction than when we were processing each aperture area independently. The picture appears after just a few iterations. Ambiguities are destroyed. Centrosymmetric ambiguities cannot exist in adjacent aperture areas; there has to be only one value for both functions in the area of overlap, so the ambiguous polarity of both object estimates is forced to resolve itself. This is the power of ptychography. The degree of overlap between these simple aperture functions can be really very small, yet still the solution is forced to be unique. Ptychography provides new prior knowledge: the illumination positions, or at least their relative positions. It also provides more measurements than unknowns because some of the unknowns (object pixels) are expressed in more than one diffraction pattern. The subset of object functions that are consistent with two diffraction patterns—and with the exact known illumination positions and their precise area of overlap, where the object wave must be identical for both diffraction patterns—is drastically reduced.

Hoppe's original formulations of ptychography reached a similar conclusion, although by a rather different route. He thought about the solution strategy in reciprocal space, in terms of interfering diffraction beams [17.2, 17.5]. Sampling intensity between any two spots makes it possible to estimate their relative phase within an ambiguity of a complex conjugate. Changing that interference condition, by shifting the illumination to a new position, provides a second estimate of relative phase that resolves the ambiguity. The real space picture shown in Fig. 17.8a-d is probably much easier to understand.

So, ptychography can solve the phase problem easily because it folds together information from more than one diffraction (or scattering) pattern. Remember, the support-constrained cCDI problem is generally soluble with just one diffraction pattern, except for a few ambiguities; a little extra information from the illumination overlap constraint is a disproportionately powerful way to remove these ambiguities and improve the likelihood of finding a correct and unique solution.

Anything more than this—any extra information in our data over and above the need to solve the phase problem—can now be used for all sorts of different things. In Sect. 17.4, we discuss how it can be used to account for experimental errors and unknowns. Sections 17.5 and 17.7 will describe other uses for diversity: multislice volumetric imaging and multimodal decomposition of incoherent states in the illumination and/or object or detector. Describing ptychography as a solution of the phase problem is, therefore, perhaps an understatement. Yes, it solves the phase problem, but that is only the first step, and a tiny first step, of what it can achieve.

3.4 An Example Ptychographic Algorithm

Unlike cCDI, real-space ptychography rarely has a sharp support function. Having an aperture right up against an object is impractical. (Although Fourier ptychography and SAP do, indeed, employ sharp apertures.) In real-space ptychography, the illumination is not sharply defined, but is soft in the sense of an extended, slowly decaying or ringing amplitude, like an Airy disc or a wave propagated from an aperture to the object, which gives rise to Fresnel fringes. In this section, we discuss how an iterative reconstruction can cope with this type of soft illumination.

When we do the reconstruction for a top-hat, sharply defined illumination function, we can cut out the current estimate of the object and put it into a separate calculation box. After going around our iterative loop (Fig. 17.5), we then paste the new function back into the image from where it came. Of course, we only paste the area defined by the probe, not the whole calculation box function, most of which will contain zero amplitude. We do not touch the area of the object that was not illuminated at this probe position. The whole process is called the object update.

When we have a soft-edged illumination function, the update has to be subtler. Now we have to copy a box in the object that is big enough to contain most of the illumination. We multiply this copy of the small area of the object by the probe function (D to A) to get the exit wave function, \(\psi_{\mathrm{e}}\). Then, we go round the iterative loop. What comes out of C to D is a new estimate of the exit wave, which we can call a corrected version of \(\psi_{\mathrm{e}}\), namely \(\psi_{\mathrm{c}}\). It is corrected because the experimental data have been fed into the loop (B to C); \(\psi_{\mathrm{c}}\) will usually look substantially like \(\psi_{\mathrm{e}}\), certainly after the iteration has run over all the probe positions many times.

However, unlike the sharp aperture, we cannot just cut out a part of this function and paste it back into the image estimate, because it is unevenly modulated by the probe amplitude. Instead, we use the new estimate of the soft exit wave to alter, but not replace, the existing running estimate of the object. For example, there may be points within the illumination function (say the rings of an Airy disc function) that are zero. No photons or electrons went through those pixels of the object, so it is unreasonable to change our estimate at those pixels based on whatever we measured in the diffraction plane at that probe position; we just leave them alone. Conversely, areas that were strongly illuminated by the probe scattered most information into the diffraction pattern, so it makes sense to weight the alterations we make in the object estimate more heavily in those areas, and less so in weakly illuminated areas.

How can we do this in a consistent, reliable way for a complicated probe? We can develop a heuristic algorithm as follows [17.18]. A more formal treatment can show that this update approximates to Newton's method [17.29]; it is a very efficient and effective search algorithm, although many more complicated, but computationally more intensive, algorithms can improve upon it.

The two-dimensional exit wave is given by

$$\psi_{\mathrm{e}}=aq\;,$$
(17.2)

where \(a\) is a 2-D illumination function and \(q\) is a small area of our 2-D object function, located around the probe position. For brevity, we do not include the \(x,y\) coordinates of the functions. If these were 2-D arrays in MATLAB, for example, the multiplication would be pixel by pixel, coded as

$$\texttt{Exitwave=Illumination.*Specimen;}$$

All the arrays have the same pixel size, but the size of the box our probe is imbedded within is usually much smaller than the total object size.

We go round the right-hand side of our iterative loop, A to B to C to D, thus applying the detector projection constraint. The back propagation C to D gives us a new exit wave, which also corresponds to a new estimate of the object function, such that

$$\psi_{\text{NEW}}=aq_{\text{NEW}}\;.$$
(17.3)

We want to alter \(q\) in the light of \(\psi_{\text{NEW}}\), to give a better estimate of it, \(q_{\text{NEW}}\). \(\psi_{\text{NEW}}\) should be an improved estimate on \(\psi_{\mathrm{e}}\) because we have injected known experimental data during the detector projection constraint. Subtracting the equations and rearranging, we have

$$q_{\text{NEW}}=q+\frac{1}{a}(\psi_{\text{NEW}}-\psi_{\text{e}})\;.$$
(17.4)

The trouble with this equation is that when \(a\) is small or zero—which it certainly will be in places if it was something like an Airy disc—the second term will tend to infinity. A common way of dealing with this is via a Wiener filter. If we multiply top and bottom by the conjugate of \(a\), \(a^{*}\), the denominator is then real, so we can add a small real number, \(\varepsilon\), to avoid this catastrophe, giving

$$q_{\text{NEW}}=q+\frac{a^{*}}{(|a|^{2}+\varepsilon)}(\psi_{\text{NEW}}-\psi_{\mathrm{e}})\;.$$
(17.5)

However, we are still giving the same credence to the change we are going to make to \(q\) at any point spanned by the illumination. It would seem logical to change it most where the amplitude of \(a\) is large, as we postulated above. The simplest scheme is to multiply the second term by the magnitude of \(a\), scaled so that its maximum is unity. That is to say, we put

$$q_{\text{NEW}}=q+\frac{|a|}{|a_{\max}|}\frac{a^{*}}{(|a|^{2}+\varepsilon)}(\psi_{\text{NEW}}-\psi_{\mathrm{e}})\;,$$
(17.6)

where \(|a_{\max}|\) is a single number which is the value of the maximum modulus of the probe. All the other terms are 2-D functions, with the subtraction, multiplication, and addition being pixel by pixel. Now we are completely changing the object with the new estimate at the point where the probe has maximum modulus, and all other points are only being changed in proportion to the modulus of the probe incident at that point. Points not illuminated are not changed at all. A little thought will show that when \(a\) is the sharp aperture we first described, this update has the same effect as the cut-out-and-paste strategy. When the solution is correct, the object is not altered: an elementary requirement of any search algorithm.

Once the update has been applied at one probe position, it must be applied at all other probe positions spanning the desired field of view, continuously updating the same object function. The whole process is repeated, perhaps 50 times—i. e., \(\mathrm{5000}\) updates for a \(10\times 10\) array of probe positions—always refining the same estimate of the object. The algorithm is called the ptychographical iterative engine (PIE) [17.17]; a name that also playfully teases an eminent scientist who, in the early 1990s, described ptychography as pie in the sky. It can be altered in all sorts of ways by introducing various constants or raising the scaling factor to some power. We will discuss these changes further in Sect. 17.9.
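Gathering the pieces of this section together, the MATLAB sketch below shows one serial pass of the update of (17.6) over all probe positions. It follows the procedure described in the text but is not the authors' published code; probe (a small complex array), object (the large complex array being refined), positions (a K-by-2 list of the 1-based top-left pixel coordinates of each probe position), and data (the square roots of the K recorded diffraction intensities, stored in unshifted fft2 ordering) are all assumed to exist, and the names are ours:

% Minimal sketch of one pass of a PIE-style update, following (17.2)-(17.6)
[m, n]  = size(probe);
aMax    = max(abs(probe(:)));                   % |a_max|, a single number
epsilon = 1e-3 * aMax^2;                        % small regularizer for the denominator
for k = 1:size(positions, 1)
    r = positions(k,1) + (0:m-1);               % object rows under the probe
    c = positions(k,2) + (0:n-1);               % object columns under the probe
    q    = object(r, c);                        % small piece of the object
    psiE = probe .* q;                          % exit wave, (17.2):      A
    Psi  = fft2(psiE);                          % forward propagation:    B
    Psi  = data(:,:,k) .* exp(1i*angle(Psi));   % detector constraint:    C
    psiC = ifft2(Psi);                          % corrected exit wave:    D
    % object update, (17.6): weight the change by the local probe modulus
    w = (abs(probe)/aMax) .* conj(probe) ./ (abs(probe).^2 + epsilon);
    object(r, c) = q + w .* (psiC - psiE);
end

Repeating such passes many times, and (in ePIE) adding an analogous update for the probe, gives the algorithms discussed further in Sect. 17.9.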

3.5 A Survey of Ptychographic Algorithms

In Sect. 17.10 we will describe the WDD method, which is a closed form direct solution to the ptychographic data inversion problem. It predated the iterative algorithms we describe here by more than ten years, but had the disadvantage of requiring fine sampling in the probe movement. This resulted in the need to collect and process very large quantities of data, which was not practical during the 1990s. Now that computers are so much larger, there is renewed interest in WDD, especially in the electron microscopy community. However, the widespread adoption of ptychography only occurred after the first iterative algorithms became available. Starting with PIE in 2004 [17.18], a growing list of alternative algorithms have been demonstrated, so that it is now hard to keep abreast of all the developments. To ease the burden somewhat, this section provides a brief historical survey.

Our survey will split ptychographic algorithms into two kinds. Class 1 are those that invert the standard ptychographic data set, where the illumination is coherent, no account is taken of noise, the specimen shifts are accurately known, and the multiplicative approximation is satisfied. (Apart from the original PIE, all of the algorithms in this category also solve for the probe.) Class 2 are those algorithms that loosen one or other of the standard assumptions—for example, by accommodating partial coherence or allowing for thick (nonmultiplicative) probe/specimen interactions.

The first algorithm to appear in class 1, after PIE, was the conjugate gradient approach suggested by Guizar-Sicairos and Fienup [17.30]. Although there is an iterative variation of the WDD method that can solve for both the object and the probe [17.31], this was the first algorithm using a large probe step size to solve for the probe and employ a global, rather than a step-by-step approach to ptychographic reconstruction. Figure 17.9a,b explains this important distinction; most class 1 algorithms adopt the global update strategy, since in this way, the many well-tested nonlinear optimization routines are readily adapted to the ptychographic problem.

Fig. 17.9a,b

There are two strategies that iterative algorithms take to recover an image from a ptychographic data set. (a) A whole collection of updated exit waves are calculated in parallel, one for each of the diffraction patterns in the data set. This collection is then used to perform one batch update of the probe and the object. Popular algorithms such as the difference map and conjugate gradient method take this approach. (b) Updated exit waves are calculated serially, one-by-one, with each update being fed into a corresponding update to the object and probe. This is the tack taken by the PIE family of algorithms

Next, a key paper by Thibault and colleagues in 2008 [17.12] conclusively demonstrated the power of simultaneously solving for the probe. Thibault's team repeated the original x-ray experiment by Rodenburg et al by imaging a zone plate using hard x-rays, but used the probe-solving ability of the difference map (DM ) algorithm [17.32] to realize a significant improvement in image quality and resolution over the earlier work.

The authors' extended PIE (ePIE) algorithm [17.33], published in 2009, extended the PIE scheme to solve for the probe. Optical bench experiments were used in the original paper, but shortly after, Schropp et al [17.34] used ePIE in the x-ray regime to characterize the x-ray beam's focus (perhaps the first important real-world application of ptychography), and in the same year (2010), ePIE was shown to work with electrons [17.35]. Along with DM, ePIE has become the most widely used reconstruction method, so Sect. 17.9 will look in detail at the mechanics of these algorithms and how they are coded.

The ptychographic inversion problem lends itself well to a variety of nonlinear optimization strategies, as Marchesini and colleagues showed in a wide-ranging survey in 2010 [17.36]. The survey covered conjugate gradient and Newton-type second-order optimization, as well as set projection approaches, in particular the relaxed averaged alternating reflections (RAAR) method popular amongst the cCDI community. The survey paper began a series of studies by Marchesini's group, which continued with papers on alternating direction minimization [17.37] and the idea of phase synchronization to accelerate algorithm convergence [17.38], as well as class 2 algorithms to combat diffraction pattern noise. RAAR itself has gone on to form the basis of the scalable heterogeneous adaptive real-time ptychography (SHARP) system at the Advanced Light Source, where ptychographic images can now be obtained in close to real time [17.39].

Almost all of the work on ptychography up until the start of 2014 concerned x-ray microscopy. However, around this time, the emergence of Fourier ptychography and further demonstrations of electron ptychography began to broaden the appeal of the technique and so spurred further interest in new algorithms. One example was the GPILRUFT (global ptychographic iterative linear retrieval using Fourier transforms) scheme used to reconstruct atomic scale images of cerium dioxide at Oxford [17.40]. GPILRUFT tackled the reconstruction by linearizing the inversion problem and so was the first to go some way toward provable convergence results, although the significant practical difficulties with electron ptychography that the authors faced seemed to outweigh any benefits from the new algorithm. Fourier ptychography (FP) used ePIE at the outset [17.41], but the very different nature of the data in FP—combined with the fresh eyes of newly-interested research groups—quickly resulted in alternatives; rather than give a full run down here, the reader is directed to a comprehensive review by Yeh et al, in particular for the comparison there between the step-by-step and global approaches [17.29].

The most recent work on class 1 algorithms, at least that the authors are aware of, comes from two papers. A 2015 paper by Hesse [17.42]—working with D. Russell Luke, inventor of RAAR—presented the PHeBIE proximal gradient algorithm, together with a welcome rigorous look at the convergence properties of ePIE and DM. This year (2017), a paper by one of the authors [17.43] re-examined and improved ePIE, with changes to the probe and object update steps (Sect. 17.9) and the introduction of momentum, an idea borrowed from the machine learning community.

Of the algorithms in class 2 (those allowing a relaxation of the assumptions in the standard ptychographic model), most have attempted to deal with noisy data, and most of these have assumed that noise arises from counting statistics and so is governed by the Poisson distribution (Sect. 17.4.7). Quite early on, Thibault and Guizar-Sicairos took this tack with their maximum likelihood algorithm [17.44]; since then, ePIE has been adapted to accommodate Poisson noise [17.45], and a variety of schemes have been used for FP to the same end [17.29, 17.46]. Another major source of noise, camera readout, was combatted by Marchesini by adapting the Fourier constraint in RAAR [17.47] and by the authors with an adaptation of ePIE in the electron [17.48] and optical [17.49] regimes.

Arguably, the most important class 2 advance came with the advent of mixed-state ptychography [17.50] (Sect. 17.8). The mixed-state forward model can quite readily be applied to any of the conventional algorithms. Apart from dealing with partial coherence in the x-ray [17.50], electron [17.51], or optical [17.52] regimes, one or other mixed-state algorithm has since been employed to deblur diffraction patterns in fly-scan ptychography, where the probe rapidly scans across the specimen without stopping [17.53]; in multiwavelength ptychography [17.54]; in ptychography with a vibrating specimen [17.55]; and in the previously mentioned probe relaxation algorithm to handle a probe that fluctuates during the experiment [17.56].

Another popular grouping of class 2 algorithms corrects errors in the measurement of specimen translations (Sect. 17.4.4). That this is possible was first shown by Guizar-Sicairos and Fienup in their early conjugate gradient paper [17.30]. Later, an annealing algorithm that randomly agitated the measured specimen positions during the reconstruction showed that position correction could be effective in optical and electron ptychography [17.57], and a cross-correlation-based add-on to ePIE gave excellent results in the x-ray regime [17.58]. A refined conjugate gradient search also solved the position error problem effectively [17.59].

Last in our survey are the class 2 algorithms that relax the thin (or multiplicative) specimen assumption. Two approaches have been reported: multislice ptychography (Sect. 17.6.2 and [17.60]) and diffraction tomographic ptychography  [17.61]. This is an exciting area for further research, although the hugely enlarged object space for volumetric imaging makes the reconstruction task immensely more demanding.

The following sections provide further details of the myriad of ways in which ptychography can be implemented, improved, and expanded; we will revisit ptychographic algorithms in more detail in Sect. 17.9.

4 Sampling and Removal of Artifacts in Images

We are going to use a very old result to illustrate some important concepts about information content in ptychography. Figure 17.10a-l was published by Bunk et al [17.62], almost immediately after the first experimental demonstrations of visible light and x-ray iterative phase-retrieval ptychography [17.11, 17.17]. It used the PIE reconstruction algorithm described in the last section.

Fig. 17.10a-l
figure 10

An early example of a visible-light ptychographic reconstruction obtained using iterative solution methods, collected in the simple aperture configuration (Sect. 17.5.10), illustrating the improvement in the reconstruction as the degree of overlap between adjacent illumination positions increases. (a-j) The degree of overlap increases in steps of \({\mathrm{10}}\%\), starting at zero overlap in (a), to \({\mathrm{90}}\%\) overlap in (j). (k) has \({\mathrm{100}}\%\) overlap, which means the experiment has effectively only one diffraction pattern to process, as in cCDI. (l) Low-magnification image of the specimen. Reprinted from [17.62], with permission from Elsevier. Artifacts can now be removed by various strategies (Sect. 17.4.8)

If you are completely new to ptychography, you might be disappointed that these reconstructions seem to be full of artifacts, especially in view of what has been said in the previous sections. The first thing to stress is that their lack of quality has absolutely nothing to do with the capabilities of the authors of the paper. When these results were published, they were cutting edge, and certainly no worse than the first proof-of-principle demonstrations. However, at that time, really the only thing that was known about ptychography was that it could solve the phase problem for an indefinite field of view, as discussed in the previous section. All of the many developments that have taken place since then mean that very high-resolution, artifact-free reconstructions can now be obtained with almost total reliability. For example, see Fig. 17.11a,b, where a modern visible-light ptychograph is compared with traditional contrast techniques. However, we start our experimental narrative here because (a) the work in Fig. 17.10a-l represents the first experimental exploration of the effect of extra information in ptychography (beyond the solution of the phase problem), and (b) we think it will be useful to illustrate to any newcomer to the field what sort of things can go wrong if you do not know the tricks of the trade. We will re-assess the artifacts in these pictures in Sect. 17.4.8.

Fig. 17.11a,b
figure 11

Example of the phase image of a modern visible-light ptychograph of cells (a); compare Fig. 17.10a-l. (b) The conventional fluorescence image. Ptychography does not need fluorescence signals, so the cellular structure can be imaged directly without affecting the cells in any way, e. g., for screening live embryos. Reprinted from [17.63], published under CC-BY license

In this section, we are going to consider the width and nature of the communication channels illustrated in Fig. 17.1a-dd in the context of ptychography. First, we consider sampling of our data. A great emphasis in the early days of x-ray cCDI was on the sampling condition in the diffraction plane, which was called oversampling [17.64]. In ptychography, the intensities measured at the pixels in the detector change as we scan the illumination or object. If we move the illumination in very small steps, the changes are small and incremental; changes are much larger for large step sizes. A more general view of sampling in ptychography is, therefore, to examine not only the sampling in the diffraction pattern in reciprocal space, but also the sampling in real space—the grid over which we scan the illumination. We need to consider the sampling over a four-dimensional (4-D) data cube made up of 2-D diffraction patterns collected from an array of 2-D probe positions. (Or, in the case of Fourier ptychography, the sampling of the illumination beams in angle space and the pixel sampling in the image plane.)

There are two ways of thinking about sampling in real space. One is simply to state the periodicity of the probe movements in real space. More commonly, workers in the field talk about the overlap parameter, which relates the step size through which the illumination is moved to the width of the illumination (a \({\mathrm{70}}\%\) overlap meaning that the step is \({\mathrm{30}}\%\) of the illumination width). Because the illumination is invariably roughly circular, both these definitions are imprecise; the overlap has to be about \({\mathrm{30}}\%\) before all the gaps between the circular illuminated areas have been covered just once.
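As a concrete (and entirely illustrative) sketch of how the overlap parameter is usually quoted, it can be computed directly from the step size and the illumination width; the function name and numbers below are our own and are not taken from any particular experiment.

```python
def overlap_parameter(step, probe_width):
    """Overlap parameter as commonly quoted: 1 - (step size / illumination width).

    step and probe_width must be given in the same units (e.g., micrometres).
    """
    return 1.0 - step / probe_width

# Illustrative numbers only: a 5 um-wide illumination scanned in 1.5 um steps
print(overlap_parameter(1.5, 5.0))  # 0.7, i.e., a 70% overlap
```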

Let us look at Fig. 17.10a-l in detail. In this visible-light optical experiment, an \(11\times 11\) grid of illumination positions is fed into the PIE serial iterative reconstruction method [17.18]. In the first frame, the probes do not overlap with one another, at least as defined by the aperture diameter. In subsequent frames, the probes are made larger so that the overlap between them increases in steps of \({\mathrm{10}}\%\), up to \({\mathrm{100}}\%\). In fact, some structure comes out of the \({\mathrm{0}}\%\) overlap data set because PIE can account for diffraction effects caused by the small propagation distance from the aperture to the sample, thus allowing some information to seep between probes. Clearly, \({\mathrm{100}}\%\) overlap contains no ptychographic (probe shift) information whatsoever. The result is worse than an error reduction support constraint algorithm because the PIE constraint in real space has soft edges arising from the broadening of the probe.

Clearly, as the overlap increases, the quality of the reconstruction becomes better and better, at least until the \({\mathrm{100}}\%\) overlap catastrophe. Is this what we expect? Because of the geometry of the gaps between circular apertures, as discussed above, the overlap must be at least \({\mathrm{30}}\%\) before every pixel of the object is illuminated even once. This accounts for the sudden jump in the quality of the image between Fig. 17.10a-lc and Fig. 17.10a-ld. As the overlap increases further, we are making more and more measurements for a smaller and smaller field of view of the specimen; the ratio of measured data points to unknowns is increasing. Remember, a single diffraction pattern contains enough numbers to solve for an isolated object (any one of these illuminated areas). Once we have enough overlap to suppress the few ambiguous solutions that can arise in cCDI, it is not obvious why having any further extra data—often called redundant data—should necessarily make the reconstruction better. We will find that a key application of this redundant ptychographic data is to suppress the artifacts present in these early results. (Of course, if redundant data are employed usefully, the word redundant becomes a misnomer.)

A key requirement for cCDI is that the sampling in the diffraction plane must become smaller (more dense) as the size of the object increases. This follows from a simple analysis of the scattering geometry—that beams scattered from the edges of the object will become out of phase more quickly as a function of scattering angle if the size of the object is large; i. e., the detector pixels lying in angle space must be smaller to pick up all the relevant interference information. As we have seen, when we measure intensity in the far field, the calculation box over which we solve for the object must have dimensions of roughly twice the size of the object itself. One might suppose that this same condition must hold in ptychography. Indeed, most ptychographic reconstructions are undertaken with the probe embedded in a similar calculation box.

Surprisingly, the minimum sampling condition for ptychography is not constrained by the probe size. Rather than think of overlap as a measure of redundancy, it is more informative to think of the probe movement as defining a grid of real-space sampling. The fundamental minimum sampling condition in ptychography must take into account both real- and reciprocal-space sampling. Strangely, the size of the probe is independent of the sampling requirement, quite unlike in conventional cCDI, provided that for a given real-space sampling the probe is big enough so that adjacent illumination areas overlap somewhat and span the entire field of view.

If we simplify the illumination shape as a square, so that we do not have to handle the awkward geometry of overlapping circles, it can be shown using simple physical arguments [17.65] that the minimum ptychographic sampling condition is

$$\Updelta R=\frac{1}{2\Updelta u}\;,$$
(17.7)

where \(\Updelta R\) is the sampling interval in real space and \(\Updelta u\) is the sampling interval in reciprocal space. The same conclusion can be reached by a more formal derivation [17.66].
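The condition is easy to check numerically. The short sketch below assumes the usual far-field geometry, in which a detector pixel of width \(p\) at a distance \(z\) from the object subtends \(\Updelta u=p/(\lambda z)\) in reciprocal space; all numbers are illustrative only and are not drawn from any particular experiment.

```python
wavelength = 1e-10     # m, a hard x-ray wavelength (illustrative)
z = 2.0                # m, object-to-detector distance
pixel = 75e-6          # m, detector pixel width

delta_u = pixel / (wavelength * z)   # reciprocal-space sampling interval (1/m)
max_step = 1.0 / (2.0 * delta_u)     # largest real-space step allowed by (17.7)

step = 1.0e-6          # m, proposed scan step
print(f"maximum step = {max_step * 1e6:.2f} um")           # about 1.33 um here
print("minimum sampling condition met:", step <= max_step)
```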

We see that we can exchange sampling between real and reciprocal space as we wish; it is as if we have a dial that can, in a continuous manner, reduce sampling in one plane and increase it in the other, while still preserving the necessary quantity of information to reconstruct the specimen. If the probe is large, but the sampling in real space is very fine, this formula implies that the pixel size in the detector can be large, even though the structure of the diffraction pattern is very fine (a large probe in real space implies small features in reciprocal space). This is quite contrary to anything that follows from cCDI. However, it transpires that we can recover unmeasured small pixels (that do satisfy the conventional diffraction sampling condition) from the large-pixel data—see Sect. 17.8.5. Very dense sampling in real space is normally associated with a very small probe (Sects. 17.5.1 and 17.10 below), so that features in the diffraction plane are anyway very large and can be captured by only a few large detector pixels. This type of data, although subject to the same sampling condition, is better processed by noniterative means (Sect. 17.10.4).

It should be emphasized that the fundamental sampling condition relates only to Fourier domain ptychography where the scan is over an infinite field of view, and where we know the probe function. It is also the minimum sampling required to solve the phase problem. In any practical ptychography experiment, the sampling in diffraction space is high, and there is considerable probe overlap in real space. So, we generally have much more information than we need. Now we discuss the things we can do with these extra data in order to improve image quality.

4.1 Probe Recovery

One of the most important breakthroughs in iterative phase retrieval ptychography was the discovery that it is possible to solve for both the object function and the illumination function [17.12, 17.67]. The two functions express themselves equivalently in the mathematics, so perhaps this is not quite so surprising. It was known some time ago that the WDD method (Sect. 17.10) can be used to solve for both object and illumination, but experimental tests on the optical bench were not particularly convincing [17.31]. In contrast, iterative methods to retrieve the probe work very well. The two most popular algorithms for this simultaneous recovery involve either projections over the whole data set at once (DM) or a serial update process (ePIE), which were briefly introduced in Sect. 17.3.4 and will be discussed in detail in Sect. 17.9.
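To make the idea concrete, the following is a minimal sketch (ours, not the original authors' code) of the serial, ePIE-style update at a single scan position, in which the object and the probe are refined simultaneously; alpha and beta are the usual update strengths, and the algorithms themselves are treated properly in Sect. 17.9.

```python
import numpy as np

def epie_update(obj_patch, probe, measured_intensity, alpha=1.0, beta=1.0):
    """One serial (ePIE-style) update at a single scan position.

    obj_patch : complex 2-D array, the object cropped around the current probe position
    probe     : complex 2-D array, same shape as obj_patch
    measured_intensity : the diffraction pattern recorded at this position
    """
    exit_wave = probe * obj_patch                       # multiplicative approximation
    psi = np.fft.fft2(exit_wave)                        # propagate to the detector
    # Fourier (modulus) constraint: keep the phases, impose the measured moduli
    psi = np.sqrt(measured_intensity) * np.exp(1j * np.angle(psi))
    revised_exit = np.fft.ifft2(psi)
    diff = revised_exit - exit_wave
    # Simultaneous object and probe updates, each weighted by the other function
    new_obj = obj_patch + alpha * np.conj(probe) / (np.abs(probe).max() ** 2) * diff
    new_probe = probe + beta * np.conj(obj_patch) / (np.abs(obj_patch).max() ** 2) * diff
    return new_obj, new_probe
```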

An immediate unintended consequence of this development was that workers in the x-ray field began to use ptychography not to make images of an object, but solely to characterize and reconstruct the illumination function. There are now many examples in the literature. Because the full complex field is recovered, it can be backpropagated to the lens aperture, thus elegantly displaying any phase aberrations in the optics. This is enormously more informative than a simple resolution test, say by scanning the focus of the beam across a knife edge. In the particular example shown in Fig. 17.12a-f [17.68], the probe calculated from a refractive aberrated optic was used to make a perturbing phase plate that corrects for the aberrations. This is an example of how ptychography can enhance the technology of its lens-based imaging cousins in order to improve the very fine probes used for analytical STX/EM. Similarly, Fig. 17.13 shows a cross-section through an electron probe recovered from ptychographic data in the scanning electron microscope (SEM) [17.69]. The explicit map of the complex wavefield of the probe in both of these examples is not available by any other means.

Fig. 17.12a-f
figure 12

An example of a longitudinal cross-section through a focused x-ray beam, calculated via ptychography, taken from [17.68]. Panels (b,e) show cross-sections from (a,d) taken at the dashed line. The beam is calculated at one level of defocus where the object is positioned, and then propagated computationally to produce the cross-sections. In the top panels, the optics are imperfect, generating a large crossover. In the lower panels, the optics have been corrected by inverting the aberrations in the lens measured from the top crossover. The interference fringes in the Ronchigrams on the right (caused by a diffraction grating in the beam) are like those in Fig. 17.2a-c: when they are straight, there is only defocus present and no higher-order aberrations

Fig. 17.13
figure 13

As Fig. 17.12a-f, a cross-section through a propagating electron probe in an SEM run in STEM mode using a transmission specimen and a detector at the bottom of the specimen chamber, reconstructed via ptychography. From [17.69]

No matter how well the optics within a ptychography experiment are calibrated, all reconstructions nowadays solve for both the object and the probe. Of course, for a set up that remains constant from one day to the next, it is logical to start the reconstruction with the last known probe solution.

An interesting and possibly very important development in probe recovery has recently been demonstrated by Odstrcil et al [17.56]. In the context of EUV ptychography, experimental constraints dictate that every single probe is different and unknown. We might suppose that absolutely no progress can be made in such a situation: the whole technique of ptychography depends on the probe and the object remaining constant. The premise of their reconstruction technique is that although all the probes are different, each probe can be described as a sum of a few (\(5{-}10\)) fundamental probes, all of which are orthogonal to one another. There are still innumerable possible probes, but each one is described by a few numbers, instead of the 1000s of pixels needed to describe a completely general probe.

A little thought will suggest that this is a very reasonable assumption. After all, the optical components remain the same. In this case, each shot from the EUV source has a different structure, but each probe will be perturbed by a set of possible variables that can change in the experiment, and these variables may be rather few. If each probe were completely different from every other, we would indeed have an impossible problem.

If all the real probes are already known, finding the underlying fundamental probes can be done by the standard techniques of principal component analysis. However, at the start of the reconstruction, they are not known. The reconstruction starts by assuming all the probes are the same and does a normal reconstruction. The resulting probe and object are very poor approximations of their real counterparts. However, the object function can be used in the iterative update to make a new estimate of each of the probes for all the positions. These are now used to find principal components. They are not the actual principal components because the first estimate of all the probes is bad. Furthermore, none of the wrong probes can be fully described by the small number of wrong principal components. The updated probes are projected onto the first estimate of the principal components, thus making a new set of probes that are now just described by the first estimate of the principal components. These new probes are used to update the object. Then they are updated themselves. The second iteration of probes creates a second iteration of principal components, and so on and so forth. The algorithm converges on the actual principal components and, hence, the actual probes. Of course, because each probe is only described by a handful of numbers, we only need a fraction of the diversity in the ptychographical data set to solve for them all. Some example results are shown in Fig. 17.14.
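The projection step at the heart of such a scheme can be sketched as follows (our own illustration, not the published implementation): the current per-position probe estimates are decomposed by a singular value decomposition, and each probe is then rebuilt from only the first few orthonormal components.

```python
import numpy as np

def project_onto_probe_modes(probes, n_modes=5):
    """probes : complex array (n_positions, ny, nx) of per-position probe estimates.

    Returns (projected_probes, modes): the probes rebuilt from the n_modes
    leading components, and those orthonormal components themselves.
    """
    n_pos, ny, nx = probes.shape
    flat = probes.reshape(n_pos, ny * nx)
    # Leading components of the probe set via an SVD (no mean subtraction here)
    _, _, vh = np.linalg.svd(flat, full_matrices=False)
    modes = vh[:n_modes]                      # orthonormal fundamental probes
    coeffs = flat @ modes.conj().T            # a handful of numbers per position
    projected = coeffs @ modes                # each probe rebuilt from the modes
    return projected.reshape(n_pos, ny, nx), modes.reshape(n_modes, ny, nx)
```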

Fig. 17.14
figure 14

A set of orthogonal probe functions that can be used to compose a probe function that varies from one position to the next. Reprinted from [17.56], published under CC-BY license

4.2 Some Pathological Instances Where Ptychography Struggles

We have said that ptychography suppresses all of the ambiguities that arise in cCDI. This is not quite true: it does, rarely, suffer from its own special ambiguities. Of course, now that we are solving for two complex functions, object and probe, we can expect that the sampling condition will become twice as demanding. That is true, but other factors must also be taken into account when solving for both functions. The specimen and probe functions can never be completely and unambiguously separated from one another. A simple example is that the probe can increase in amplitude while the specimen reduces in amplitude (appears more opaque), but the product of the two maintains the total measured flux on the detector. This is not serious as far as observing the structure of the object is concerned, but it needs to be handled carefully if quantitative absorption data are required, say by calibrating the total flux in the probe using an area of free space around the object. During the reconstruction the probe can be periodically propagated to the detector plane (without the influence of the object) where it can be constrained by the correct free-space intensity. Indeed, it is always advisable to scale the first estimate of the probe by the integrated intensity in the detector plane. If there is a large disparity between the intensity of the physical probe and the first guess of the estimated probe, many reconstruction algorithms find it very hard to recover. If the edge of the field of view of the reconstruction is very bright or very dark, you have probably made this mistake.
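A minimal sketch of that scaling precaution (our illustration, assuming the same unnormalized fast Fourier transform convention that the rest of a reconstruction code would use) is:

```python
import numpy as np

def scale_initial_probe(probe_guess, diffraction_patterns):
    """Rescale a starting probe so the total intensity of its modelled far field
    matches the mean total counts per recorded diffraction pattern."""
    measured_flux = np.mean([dp.sum() for dp in diffraction_patterns])
    model_flux = (np.abs(np.fft.fft2(probe_guess)) ** 2).sum()
    return probe_guess * np.sqrt(measured_flux / model_flux)
```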

More profound questions arise when we consider the information content of the object and the illumination. To get any diffracted information, the specimen and probe must have structure. If the object structure is sparse, consisting of a very few simple features separated by large areas of constant phase or modulus, then we might suppose that the probe is very poorly constrained. Consider a largely nontransparent object with only a few empty features. If the probe is scanned with a large step size, but over a small field of view, only a few of these object features will intersect with it; there may be a large subset of areas within the probe that are never transmitted through to the detector, let alone solved for by any algorithm!

Even when the object function has a lot of structure, there are certain types of probe that are difficult to solve for, one example being a defocused convergent beam, which we will discuss further in Sects. 17.5.5 and 17.10.6, Probe Complexity and Noise Suppression. Another example occurs in visible light ptychography, where it is common to use a fixed diffuser of some type to create a probe with complicated phase and modulus structure [17.70]. Counterintuitively, convergence is poor when both the object and the illumination are highly structured. However, once a good estimate of a complicated probe is known (and can be used to seed the reconstruction), then convergence onto the object function is much better than using a probe with little structure. The WDD formulation can be used to suggest probe structures that are more likely to improve the convergence of the reconstruction (Sect. 17.5.6).

There are various very unusual combinations of object function and illumination structure and/or shift positions where ptychography provides no extra phase information at all. A trivial example is if all the illumination positions are identical so that the overlap between them is perfect (Fig. 17.10a-l). Obviously all the diffraction patterns are identical and, therefore, lend no extra diversity. Similarly, if the object is periodic, and the scan of the illumination has the same periodicity (or any integer multiple of the object periodicity), then all the diffraction patterns will also be identical. Certain illumination functions can also cause the obliteration of diversity; for example, a convergent beam of finite angular extent incident on a high-frequency periodic structure can mean that there is no overlap in the diffraction orders in the far field (Fig. 17.2a-c), in which case no phase information can be expressed.

If the entire field of view is free space, then clearly we cannot find any sort of sensible solution. If the reconstruction starts with the assumption that the object is free space and with a known probe function (which is now the only information expressed in the diffraction pattern, but without its phase), then in theory the object function should not depart from free space. We find that, in general, if a significant area of the field of view has some sort of object structure, then areas that are free space will be reconstructed correctly, although residual errors in the probe reconstruction arising from the limited field of view occupied by the object may express themselves in free space at the probe position locations. The free space problem is clearly a condition for which the conventional microscope is vastly superior: it will show blank free space. Luckily, not many microscopists want to look at free space.

If the object is unknown, and especially if it is likely to be sparse or weakly scattering, it is always better to use a probe that has more structure within it. This can be shown using arguments based on the WDD method [17.71, 17.72], although whether these are directly applicable to iterative methods has yet to be proved. Figure 17.15a-i shows an x-ray example of how making the probe (in this case formed by a zone plate lens) much more complex by the introduction of a pinhole improves the reconstruction quality.

Fig. 17.15a-i
figure 15

Effect of having a more structured illumination. (a-c) The real-space probe reconstruction, the diffraction pattern from the probe when there is no specimen present, and an example reconstruction, respectively, for when a simple aperture is used to form the illumination. (d-f) As above, but for convergent illumination. (g-i) As above, but for a convergent probe clipped by an aperture, which clearly extends the probe function in reciprocal space. The quality of the reconstruction improves from top to bottom (\({\mathrm{ph/\upmu{}m^{2}}}=\text{photons}/{\mathrm{\upmu{}m^{2}}}\)). This is shown quantitatively in the original publication using Fourier ring correlation. Reprinted with permission from [17.71]. Copyright 2012 by the American Physical Society

4.3 Nonperiodic Scan

Although we have formulated the sampling condition in terms of a periodic scan over the object, it was realized quite early that periodic scans are not optimal [17.67]. As we noted in the previous section, ptychography offers no information if the probe is shifted across a periodic object at the periodicity of that object, because each diffraction pattern is identical and contains no phase information. We can reverse the argument. A probe scanned periodically over a specimen will not convey any ptychographical information for a Fourier component in the object that matches multiples of that scan periodicity. A periodic scan will always tend to produce image artefacts at that periodicity. Furthermore, when we solve for both the probe and the object, a periodic scan creates the so-called raster scan pathology, first pointed out by Thibault and colleagues [17.21]: either the probe or the object can develop structure at the scan periodicity, causing a further source of ambiguity.

The solution is to deliberately introduce nonperiodicity into the scan. One common way of doing this is via a spiral scan, starting from the center of the field of view [17.12, 17.20, 17.67], a technique that is now used very widely in the synchrotron x-ray world. Alternatively, a broadly periodic scan can have small random offsets added to each probe position. There are situations where neither of these strategies is easy to implement, for example, when a STX/EM configured for smooth rectilinear scans is modified to collect ptychographical data. In fact, if the early iterations in the reconstruction use computationally perturbed probe positions, then periodic artefacts from a regular scan can be suppressed at the cost of resolution. Final polishing of the solution can then use the real regular probe positions [17.73].
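For illustration only (published work uses a variety of spiral recipes), nonperiodic scan positions can be generated either along a Fermat-type spiral growing outwards from the center of the field of view, or as a raster grid with small random offsets added to each point:

```python
import numpy as np

def fermat_spiral(n_points, spacing):
    """Scan positions along a Fermat spiral; spacing sets the approximate
    distance between neighbouring points (same units as the output)."""
    k = np.arange(n_points)
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))
    r = spacing * np.sqrt(k)
    theta = k * golden_angle
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)

def jittered_raster(nx, ny, step, jitter=0.2, rng=None):
    """Raster grid with small random offsets (a fraction of the step) added to
    each position to break the scan periodicity."""
    rng = np.random.default_rng() if rng is None else rng
    jj, ii = np.meshgrid(np.arange(nx), np.arange(ny))
    positions = np.stack([ii.ravel(), jj.ravel()], axis=1).astype(float) * step
    return positions + rng.uniform(-jitter * step, jitter * step, positions.shape)
```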

4.4 Refining Probe Positions

We can suppose that our knowledge of the probe positions relative to the specimen is the key a priori constraint in ptychography, replacing the real space object support constraint in cCDI, especially when we are solving for both the specimen and the probe. However, a densely-sampled data set can allow refinement of the probe positions after the experiment has been completed. This has proved important for electron ptychography (at least when real-space step sizes are large). A STEM scan is designed to be periodic, but when random position offsets are added to these (see previous section) hysteresis in the scan coils does not always move the probe to the assumed positions.

If we know our scan positions but think there might be distortions from specimen drift, stretching, or rotation of the scan, then these can be parameterized using only a few variables, which become a few more variables in our search space. Guizar-Sicairos and Fienup were the first to investigate the search for unknown probe positions using conjugate gradient methods [17.30], but this is computationally intensive. The two most commonly used techniques have low computational overhead, increasing the cost of the whole reconstruction by only a factor of 3 or so. Both look for perturbations in the position of every scan point one at a time, which reduces the computational overhead, but they use quite different mechanisms: annealing and cross-correlation. In both cases, an initial reconstruction is obtained, making no account of position errors.

In the annealing algorithm [17.57], each probe position then has a number (say, five) of random offsets applied to it, but only up to a given maximum. Using the existing estimate of the object and probe, a diffraction pattern is calculated for each of these trial positions. One of these will most closely match the measured diffraction pattern from that point, i. e., it will have the lowest error metric. This position is now chosen as the correct position, and then the object estimate is updated using that probe position. (Note that we are describing this in terms of a serial update algorithm, like the aperture serial iterative update described in Sect. 17.9.1, but it can be incorporated into the parallel methods.) The same process is applied to all probe positions. For the next iteration the new altered probe positions are the starting point, but once again random offsets are added to these. Thus, it is possible for a probe position to wander quite far from its original putative position. However, as the calculation proceeds, the maximum random offset added to the current probe position estimates is slowly reduced. This forces an estimated probe position to settle on a single point, while gradually preventing it from jumping large distances away from a good-quality estimate. Figure 17.16a,b shows the improvement that can be obtained using the method, in this case for electron ptychography data. The effects of a drastic period of drift, starting halfway through the experiment, are entirely removed.
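The core of one annealing trial for a single scan position might look like the sketch below (our own simplified illustration of the idea in [17.57]; the trial positions are assumed to stay inside the object array, and the caller shrinks the search radius as the reconstruction proceeds).

```python
import numpy as np

def anneal_position(obj, probe, measured_intensity, pos, radius, n_trials=5, rng=None):
    """Try a few random offsets around the current position estimate and keep
    the one whose modelled diffraction pattern best matches the measurement.

    pos    : current (row, column) estimate of the probe's top-left corner, in pixels
    radius : maximum random offset (pixels) allowed at this iteration
    """
    rng = np.random.default_rng() if rng is None else rng
    ny, nx = probe.shape
    trials = [np.asarray(pos)]
    trials += [np.asarray(pos) + rng.integers(-radius, radius + 1, size=2)
               for _ in range(n_trials)]
    best, best_err = trials[0], np.inf
    for trial in trials:
        r, c = trial
        patch = obj[r:r + ny, c:c + nx]                    # assumed fully inside obj
        model = np.abs(np.fft.fft2(probe * patch)) ** 2
        err = np.sum((np.sqrt(model) - np.sqrt(measured_intensity)) ** 2)
        if err < best_err:
            best, best_err = trial, err
    return best
```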

Fig. 17.16a,b
figure 16

Example of the improvement in the object and probe reconstruction when probe position distortions are present in an electron ptychography experiment. (a) Reconstruction of gold particles on a carbon support film with serious drift present (i. e. a distorted probe scan). (b) Ptychographic image after probe-position refinement. Reprinted from [17.57], with permission from Elsevier

The correlation algorithm [17.58] starts by storing a copy of the current estimate of the object function. It then updates the original object function at just one probe position. The current (pre-updated) estimate of the object was previously reconstructed in this particular probe area using lots of diffraction patterns from all the overlaps occurring within it. In calculating the next update, this good image (averaged from lots of data) is fed into the reconstruction algorithm at A (Fig. 17.5), where the exit wave estimate is generated. It is the detector update, B to C, that impresses the wrong position information for this probe position, but most of the original image data (determined mostly by the phase of the diffraction pattern, which is not changed) will survive and still be present in the new estimate of the exit wave at D. In other words, the updated object function should look like the previous object estimate, but with the newly updated area being a copy of the image shifted to the wrong position. This is true at least to first approximation.

We have a copy of the pre-updated estimate of the object, and the updated estimate, with part of it shifted. Cross-correlating these two should give a peak that is displaced from the origin. The magnitude of this displacement is very small, because the cross-correlation is dominated by the areas of the two images that are mostly identical. However, the peak will lie in a certain direction from the origin in the two-dimensional plane of the cross-correlation. This can be used to steer the next estimate of the probe position. The length of the vector from the origin to the peak has to be multiplied by some factor to define an actual new position for that probe.
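In outline (our own sketch; in practice the peak displacement is tiny, so a subpixel peak fit and an empirical scaling factor are applied before the position is actually moved), the correlation step looks like this:

```python
import numpy as np

def update_direction(obj_before, obj_after):
    """Cross-correlate the object estimate before and after a single-position
    update and return the (wrap-corrected) displacement of the correlation peak."""
    xcorr = np.fft.ifft2(np.fft.fft2(obj_after) * np.conj(np.fft.fft2(obj_before)))
    peak = np.array(np.unravel_index(np.argmax(np.abs(xcorr)), xcorr.shape), dtype=float)
    shape = np.array(xcorr.shape, dtype=float)
    peak[peak > shape / 2] -= shape[peak > shape / 2]      # undo FFT wrap-around
    return peak   # direction (and rough magnitude) of the suggested position change
```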

New algorithms for probe position refinement continue to appear. For example, Tripathi et al [17.59] combined conjugate gradient methods with the conventional DM and ePIE core algorithms, giving excellent results (Fig. 17.17a-e). Needless to say, researchers tend to be quite conservative, using algorithms that they have confidence in. Most algorithms have free parameters that can be tweaked, and so a lot depends on experience; the optical set up being used also impacts their efficacy. Consequently, it is hard to compare them objectively. Groups choose one, develop the requisite knowledge to optimize it, and then tend to stick with it. Probe positions are just another set of dimensions in the solution space, so there are undoubtedly much more comprehensive and efficient ways for solving for them yet to be found.

Fig. 17.17a-e
figure 17

An example of probe-position correction in x-ray ptychography . (a,b) The uncorrected object and probe reconstructions. (c,d) After probe position refinement. (e) Comparison of putative and actual probe positions calculated from the probe-refinement procedure. Reprinted with permission from [17.59], The Optical Society

4.5 Field of View

In addition to sampling per se, another important variable is the field of view of the whole scan. When the sampling in real space is fine but the probe is large (in other words, when the overlap is very large), the center of the field of view will be illuminated many times, whereas the edge of the field of view is only ever illuminated once. With reference to Fig. 17.18, the ratio of the probe size, the step size, and the scan size (\(4\times 4\)) is such that only the very center of the object is illuminated 12 times (the corner probe positions do not overlap in this area). Extending the scan to \(6\times 6\) increases the area illuminated 12 times by a factor of 9. In other words, for small scan sizes, the constraints on the data (generated by sampling the same object area multiple times) increase very rapidly as the scan is enlarged. This accounts for the (perhaps surprising) fact that the bigger the field of view (the more numbers we have to solve for), the easier it is to solve for those numbers. Early attempts at iterative ptychography, especially electron ptychography [17.35, 17.74, 17.75], often used very small scan patterns, which may account for the fact that only the very central regions of such reconstructions were of reasonable quality. Using a small field of view also makes solution of the probe function much more difficult, because we need extra diversity to solve for it. In general, a \(10\times 10\) scan with a \({\mathrm{70}}\%\) overlap parameter is a safe minimum requirement; anything less than this should be avoided. Larger fields of view are always more desirable.

Fig. 17.18
figure 18

As the field of view is increased, defined in terms of the number of illumination areas, the region where the object has been illuminated most often increases in size quickly

4.6 Missing Data and Data Truncation

In practice, most ptychographic data sets fulfil the fundamental minimum sampling criterion in (17.7) by many factors. This means that astonishingly large quantities of data can be discarded, or simply not measured, without affecting the quality of the final reconstruction. One way of thinking of this is via Hoppe's ptycho convolution. If in a conventional real-space ptychography experiment the illumination is parallel (which, of course, means it has no localization at the specimen or convolution in the far field), then a particular scattering vector will arrive at just one pixel on the detector. If data from this pixel are lost—say that detector pixel is faulty—the Fourier component in the object relating to that scattering vector is also irredeemably lost. However, in ptychography we have a localized probe, which means the diffraction pattern is convolved with the scattered amplitude. (Note that in Fresnel full-field ptychography, where there is no localization in the illumination, there is still a convolution in the Fresnel integral.) This means that information relating to any one scattering vector is expressed in a number—often a very large number—of pixels surrounding the faulty pixel. We can, therefore, happily dispose of the signal from a detector pixel on the understanding that information expressed around it will fill the gap left by it. We do this in an iterative reconstruction algorithm by what is called floating the dead pixel. When the exit wave is propagated to the detector, the missing pixel assumes the modulus and phase of the forward calculation. This is well determined by the existing estimates of the object and probe (propagated to the missing pixel via the exit wave), which have been generated by all the good detector pixels in all the many diffraction patterns.
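A minimal sketch of floating a dead or missing pixel inside the detector-plane update (our illustration): the measured modulus is imposed only where the detector data are valid, and the forward calculation is left untouched elsewhere.

```python
import numpy as np

def masked_modulus_constraint(psi, measured_intensity, valid_mask):
    """psi : complex far-field estimate; valid_mask : boolean array, True where
    the detector pixel is good. Dead or missing pixels simply keep the modulus
    and phase of the forward calculation ('floating')."""
    constrained = np.sqrt(np.clip(measured_intensity, 0.0, None)) * np.exp(1j * np.angle(psi))
    return np.where(valid_mask, constrained, psi)
```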

A more radical manifestation of this phenomenon is accounting for intensity data scattered outside the detector, and then using redundancy in the ptychographic sampling to recover intensity that would have been measured had the detector been large enough [17.76]. This sounds improbable, but it does work. As an example, we refer to a visible light optical demonstration in Fig. 17.19a-c. The diffraction pattern (Fig. 17.19a-ca) has been recorded as usual. The well-developed speckle arises from the fact that the illumination in this particular experiment is highly structured, and there is a wide range of angles in the incident radiation. Clearly, at the edges of the detector the intensity is still strong, and we can reasonably infer that it extends beyond the edges of the recorded data.

Fig. 17.19a-c
figure 19

An example of reconstructing data that have not been measured. (a) The diffraction pattern, which is clearly larger than the detector size. (b) Reconstruction using the measured data. (c) Reconstruction with a much larger diffraction pattern, but with the same data. Pixels outside the region of measured data are left to float. See main text. From [17.76]

In Fig. 17.19a-cc we see two reconstructions. The low-resolution image has been reconstructed as usual, using only the recorded data. The high-resolution image has used a much larger computational array for the detector, with all pixels outside the measured region being floated during the reconstruction. This method is not giving us anything for free or breaking the laws of physics in any way. The lost data have to have a certain value in order to be consistent with the convolution of the object diffraction pattern with the angular distribution of the illumination function (i. e., the Fourier transform of the illumination function). In this case, the latter is very wide in diffraction space. We end up solving for a region of reciprocal space the width of the illumination (in reciprocal space) convolved with the width of the detector. This is exactly the same as the transfer function of an ordinary optical microscope, which is the convolution of the condenser lens aperture size with the objective lens aperture size.

Recovery of lost diffraction data does have some practical applications. Many high-performance x-ray detectors are arranged in tiles, with gaps between each tile. Rather than interpolating the data in these regions, or simply ignoring them, ptychography can recover accurately the measurement that should have been made there. In the context of scattering outside the whole detector, it must be emphasized that if the diffraction patterns do, indeed, spill over the extent of the detector (this should not happen if the experiment has been designed properly), there can never be a consistent solution for the inverse calculation. Truncated data should, therefore, ideally always be padded with floating pixels.

One may ask—how many pixels can I ignore, and how does that number relate to the necessary minimum sampling condition? We leave this as a computational exercise for the reader, who may be surprised at the vast quantity of pixels that can be ignored and floated. Two hints: choose the pixels randomly, not in any sort of systematic array; and remember that just because an image looks OK does not mean that you have actually recovered all the information that was in the object in the first place. Although sparse objects can be hard to reconstruct, because of what we discussed in Sect. 17.4.2, objects that are moderately sparse (for example, resolution test specimens) contain rather little information and so seem to reconstruct well, even if the sampling condition is not reached.

4.7 Shot Noise

All data have noise. In the case of electrons and x-rays, specimen damage is always a concern, meaning that minimal dosage is always desirable, and so our preferred data will always suffer from a degree of Poisson noise , even if the detector is perfect. The lower the dose, the lower the specimen damage; but the fewer the counts, the higher the noise. This is a leading issue in all imaging sciences, especially electron microscopy of soft matter like biological tissue or polymers.

It is widely, but wrongly, assumed that damage must be a serious weakness of ptychography, because each area of the object must be illuminated many times. However, this is only true if we worry about the noise in any one diffraction pattern. We must remember that each pixel of the object scatters photons or electrons into several diffraction patterns; these scattering events, and the information they contain, are not lost; they are simply distributed over a number of detector pixels that just happen to lie in different diffraction patterns. Provided our reconstruction algorithm can put together all these counts, the noise in the reconstruction is, as in conventional imaging, determined by the total number of counts that passed through each pixel element of the object.
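This is easy to verify numerically. The sketch below (illustrative numbers only) splits a fixed expected dose across many Poisson-noisy exposures and sums the counts; the statistics are the same as those of a single full-dose exposure, provided the detector itself adds no noise.

```python
import numpy as np

# Dose fractionation with a noiseless detector: the sum of many low-dose
# Poisson-noisy exposures has the same statistics as one full-dose exposure.
rng = np.random.default_rng(0)
mean_counts, n_exposures, n_samples = 100.0, 50, 100_000

single = rng.poisson(mean_counts, size=n_samples)
fractionated = rng.poisson(mean_counts / n_exposures,
                           size=(n_samples, n_exposures)).sum(axis=1)

print(single.mean(), single.std())              # ~100 counts, ~10 r.m.s.
print(fractionated.mean(), fractionated.std())  # ~100 counts, ~10 r.m.s. as well
```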

This phenomenon of dose fractionation occurs in many fields, for example tomographic reconstruction [17.77, 17.78]. Unfortunately, it only works well if our detector is perfect. Background or readout noise means that we need to minimize the number of times we read out the detector. If there is only one count per diffraction pattern scattered from the object, it would be drowned out by just a few false counts arising from the detector noise. Hard x-ray detectors and very modern electron detectors do nowadays achieve virtually perfect event counting, so dose fractionation in ptychography can now be fully exploited. When we come to discuss the Wigner distribution deconvolution method later in Sect. 17.10, we will see that extraordinarily low counting statistics can be tolerated in each diffraction pattern.

A low count in each diffraction pattern has consequences for sampling in real space. Suppose our illumination area is large, and we make exposures that are so short that on average only one photon or electron arrives in each diffraction pattern. Clearly, if we are going to get enough flux through any one pixel of the object to form a reasonable image of it, then we cannot move our illumination in large step sizes, because most image pixels would not scatter even a single photon or electron. The optimum step size will then depend on the characteristics of the detector: some single-event counting detectors can handle very few counts per pixel, so the step size must be very small. The smallest meaningful step size depends on the frequency spectrum of the probe. Moving it by less than the periodicity of its highest-frequency Fourier component will not alter the diffracted intensity (or rather the probability distribution of the intensity) so as to produce new, independent information, because we are then sampling in real space more finely than the Nyquist condition requires.

With low count rates we must be careful about how we reconstruct the data. In any inverse problem, noise masks the minimum in the error metric and can create many local minima and a false global minimum. Finding the minimum without getting stuck in local minima is much harder, and if we do find the global minimum, it will not be a perfect representation of the object function. After all, a perfect reconstruction implies that we know the diffraction pattern perfectly, which we clearly do not, because the low counts are distributed stochastically, albeit with a probability determined by the underlying wavefunction. For noisy data, it is preferable to use a conventional algorithm (DM, ePIE, etc.) initially, and then, when close to the solution, refine with maximum likelihood (ML) [17.44]. A formal study of the convergence properties of this approach has not as yet been undertaken, but the results are impressive (Fig. 17.20a-d).
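For reference, the error metric minimized in such a maximum-likelihood refinement is the Poisson negative log-likelihood, sketched below with constant terms in the measured counts dropped (the modelled intensities would come from the current object and probe estimates):

```python
import numpy as np

def poisson_nll(model_intensity, measured_counts, eps=1e-12):
    """Poisson negative log-likelihood (up to terms independent of the model)."""
    model = np.clip(model_intensity, eps, None)
    return float(np.sum(model - measured_counts * np.log(model)))
```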

Fig. 17.20a-d
figure 20

Demonstration of the ML method by Thibault and Guizar–Sicairos. (a) Original image. (b) DM reconstruction. (c) ML reconstruction assuming Gaussian statistics. (d) ML reconstruction assuming Poisson statistics. From [17.44]. ©Deutsche Physikalische Gesellschaft. Reproduced by permission of IOP Publishing. CC BY-NC-SA

4.8 Artefacts

As we emphasized at the beginning of this section, Fig. 17.10a-l was one of the best ptychographical reconstructions obtained by 2008 [17.62]. By now, we hope a reader new to the field will appreciate the developments that have occurred since then, so that if they were faced with a similar reconstruction they would know how to improve upon it. These are the issues:

  1. 1.

    The object and the probe are sparse. At the center of the field of view, the dominant feature in the object is just a line. Data transfer in ptychography is structure dependent. The probe is also a simple propagated aperture that does not have much diversity. Nevertheless, there is no reason why this should not reconstruct, and where the real-space sampling is finest, it does. However, at the outset, the combination of probe and specimen means that this experiment is demanding.

  2. 2.

    There are periodic structures, because of the regular scan, which we now know cannot easily transfer certain frequencies (even with a known probe). An irregular or circular scan would immediately solve this problem.

  3. 3.

    The algorithm employed, PIE, does not solve for the probe. The probe has been estimated from knowledge of the aperture and a computational propagation to the specimen, using a physically measured distance between the aperture and the object. Solving for the actual probe will certainly improve the solution.

  4. 4.

    The scanning stage may not be perfect. Nowadays, if there is any doubt about hysteresis or backlash, adopting one of the probe position refinement algorithms could well also improve the reconstruction.

  5. 5.

    Perhaps most important of all (although this has not been a subject covered in this section) is that we now know that this very simple aperture-only set up is one of the worst ways of doing ptychography; see Sect. 17.5.10.

  6. 6.

    The data may also benefit from a modal decomposition (covered in Sect. 17.8), not because the laser source is incoherent, although it could have more than one mode within it, but because a throw-away mode can take out any detector noise that may well be present.

Finally, we remark that the dose-fractionation properties of ptychography were not realized when these results were published. For this reason, the authors concluded that the optimum overlap condition is not the largest possible (very dense real-space sampling). They balanced the overlap parameter against a consideration of the total dose, assuming that each diffraction pattern must have the same number of counts, rather than that the counts can be fractionated between them. If the detector is not perfect, then, of course, their analysis still applies. If the detector is perfect, we now know that having as much overlap as possible is optimal, although this generates huge quantities of data, where each diffraction pattern may only contain rather few counts.

5 Experimental Configurations

Ptychography is very versatile. The ways in which it can be undertaken are diverse. Most of the optical set ups that have so far been explored are used with more than one type of radiation, although for good reasons rarely with all types. For example, the simple aperture configuration is easily implemented using visible light or x-rays, but would be fiendishly difficult to do with electrons. Making an aperture small enough at electron wavelengths, and opaque enough outside the aperture, would imply an extraordinarily large aspect ratio for the hole, which would be very hard to make and would contaminate almost instantly. Similarly, Fourier ptychography is perfect for visible light and possible for electrons (where it has historically been called tilt-series reconstruction). However, it is virtually impossible for synchrotron x-ray ptychography, where the beamline direction is fixed; all the optics and the detector would have to be scanned around the object, an impossibly demanding experiment with little to recommend it. However, these examples are exceptions. The benefits and limitations of most aspects of any particular ptychographical optical set up are usually the same, independent of radiation type. In what follows, we will, therefore, categorize ptychography by optical configuration.

We first make a few general comments. In all that we have said so far we have assumed that the detector lies in the Fourier domain of the object function. In fact, there is no requirement for this to be true, as long as we know the form of the propagator between the object and the detector, which, when the detector is some distance from the object but not far enough to satisfy the Fraunhofer condition, will in general be a Fresnel propagator. All the reconstruction algorithms can equally well apply the detector intensity constraint at any plane downstream of the object.

So far we have mostly discussed an illumination field (a complex-valued wave) being incident upon a scattering object (a complex-valued transmission function), i. e., real-space ptychography . Remember that these can be exchanged with one another. We can instead have an aperture or stop of some type, analogous to the illumination, which multiplies a wavefield. We can then move the aperture or wavefield relative to one another in order to solve for both functions. This is the principle of Fourier ptychography, although the same situation occurs in other configurations that we will discuss. The wavefield can be an image or a diffraction pattern and is usually formed by a lens.

In this section, we assume that the multiplicative approximation applies (that the exit wave is the illumination function times the transmission function) and that the source of radiation is perfectly coherent. We will explore how to circumvent these approximations in Sects. 17.6 and 17.8, respectively.

5.1 Focused-Probe Ptychography

With reference to Fig. 17.21, we have a coherent source and a lens that focuses a tight beam crossover through the plane of the object. In x-ray synchrotron ptychography the lens is very far from the source (many tens of metres), so the radiation incident on the lens is parallel, and the coherence width is roughly the size of the lens. The crossover is then at the focal length of the lens. In scanning transmission electron microscopy (STEM) there are a number of lenses between the source and the final focusing lens, but the effect of these is to demagnify the source so that it appears (when looking back from the focusing lens) to be distant, thus ensuring good spatial coherence. Note that the spatial coherence width across the lens in this configuration (for all types of substantially monochromatic radiation) is approximately the inverse of the angular size of the source, as seen when looking back to the source from the plane of the lens.

Fig. 17.21
figure 21

The focused probe geometry. A lens forms a beam-crossover in the plane of the object. In the far field, the diffraction pattern has a bright region (called the Ronchigram in electron microscopy), which is a shadow image of the lens pupil. Weak dark-field diffraction occurs outside this bright area

If there is a circular sharp aperture within the plane of the lens, then in the absence of the specimen there appears a round disc of illumination on the detector (Fig. 17.22a,b). This is always the case in electron microscopy, but x-ray microscopy often uses a Fresnel zone plate to focus the beam, which requires a central stop, and so the far-field pattern appears as a doughnut shape (also in Fig. 17.22a,b). If Kirkpatrick–Baez (KB) mirrors are used, and they often are because they do not absorb and waste any useable x-ray flux, then there is a rectangular box in the far field. For the present discussion, we will consider only the use of a circular aperture.

Fig. 17.22a,b
figure 22

Diffraction patterns in the focussed probe geometry. (a) For electrons in an SEM run in STEM mode using a transmission specimen and a detector at the bottom of the specimen chamber. From [17.69]. (b) For hard x-rays using a Fresnel zone lens. From [17.79]. ©IOP Publishing. Reproduced with permission. All rights reserved

Compared to the difficulties of the simple aperture configuration (Sect. 17.5.10), one benefit of using the lens is that most of the unscattered counts are spread over a relatively large area, which avoids saturation of the detector, although there is still a large dynamic range between the central disc and the high-angle diffracted dark-field intensity. In all imaging configurations, there is a direct relationship between counts per unit area and the obtainable resolution given a certain image contrast. Poisson statistics dictate that the detectable contrast depends on \(\sqrt{N}\), where \(N\) is the total number of counts passing through a pixel. If we halve the pixel size in \(x\) and \(y\), we need four times the flux per unit area to be sensitive to the same contrast. For this reason, high-resolution ptychography generally employs a focused beam wherever a small field of view can be tolerated [17.19, 17.80].

A focused beam implies that the probe is very small, and so the sampling in the diffraction plane can be very large. If we had only one pixel in the detector plane, positioned right in the middle of the far-field disc, we would have created a conventional STE/XM in the bright-field mode. The output of this pixel as a function of the probe position, which is scanned on a very tight grid across the specimen, would be the conventional bright-field image. Is this ptychography with a single diffraction pixel? It certainly represents the limit of low sampling in diffraction space and very dense sampling in real space, but it is not ptychography, if only because it has not solved the phase problem; like all bright-field images, the phase of the image has been lost.

When we double the information (in each \(x\)\(y\) coordinate) by splitting the detector into four pixels, or at least four quadrants of a circle, as shown in Fig. 17.23a,b, we are now on the first step towards ptychography, sampling in reciprocal space on an extraordinarily coarse grid and on a very fine grid in real space. However, these four pixels mean that now we have (in principle) enough information to solve the phase problem. We have two numbers in each \(x\)\(y\)-direction that can be used to calculate the real and imaginary components of each real space image pixel. In fact, for this to be true we have to make some strong assumptions:

  1. 1.

    The object is weakly scattering.

  2. 2.

    The illumination optics are perfect, which includes not having any defocus.

  3. 3.

    We must accept that the reconstruction can only process data lying within the central disc of the diffraction pattern, so we rely on all the resolution coming from the lens (not from the high-angle dark-field intensity).

The only (nonnegligible) gain is then the recovery of the image phase. To bring to bear the full power of ptychography to remove lens aberrations in the STE/XM configuration, to process the dark-field high-resolution scattering, and to be able to cope with strongly scattering specimens, we must still sample the reciprocal space on a fine grid (Sect. 17.10).

Fig. 17.23a,b
figure 23

Sector detectors. The simplest configuration (a) can have its transfer characteristics improved by further subdivisions (b)

The focused probe arrangement has one very important advantage: analytical signals, like x-ray fluorescence spectroscopy, can still be simultaneously collected at the resolution of the probe crossover. This is true for both x-ray and electron microscopy. Certainly, the main rationale for aberration-corrected STEM is that elemental composition and bonding information can be obtained at atomic resolution, whether by x-ray spectroscopy or electron energy loss spectroscopy (EELS). The incoherent annular dark field (ADF) image also has several benefits that STEM microscopists are loath to lose. With a focused probe geometry, x-ray and ADF data, as well as some less common signals like secondary electrons, Auger electrons, and cathodoluminescence, can be collected simultaneously with ptychographic data. The EELS detector must be on the optic axis, and so, short of drilling a hole in the diffraction pattern detector, EELS cannot be collected simultaneously with ptychographical data.

Figures 17.24a-d and 17.25 show examples of electron ptychographs collected simultaneously with the ADF signal. Ptychography produces an excellent phase signal, which is sensitive to both heavy and light atoms. The ADF signal is sensitive to the atomic mass of the atoms. The principal advantage of ADF imaging is that the contrast is incoherent and it increases monotonically with the projected mass of the atoms. It, therefore, has higher resolution than the bright-field image, is approximately quantitative, and it does not suffer from coherent artefacts. However, a consequence of the mass dependence is that it is difficult or impossible to image light atoms within a matrix of heavy atoms.

Fig. 17.24a-d
figure 24

Image of GaN recorded by conventional electron contrast methods. (a) ADF and (b) ABF images. (c) Modulus and (d) phase of the ptychographic reconstruction. Only the latter can clearly image the very light nitrogen atoms (a few are marked orange) between the heavy Ga atoms (blue). Reprinted from [17.9], published under CC-BY license

Fig. 17.25
figure 25

(a) ADF image of a carbon nanotube with \(\mathrm{C_{60}}\) balls fitting inside it. A few heavy atoms are also picked out in the image. (b,c) Phase of the ptychographic image, the latter with the positions of the \(\mathrm{C_{60}}\) balls and heavy atoms highlighted. (d-g) Contrast from various configurations of sector detectors, all of which are weaker or noisier than the ptychographic reconstruction. (h) Center of mass approach, which measures the average shift of intensity in the detector plane. From [17.10]

Figure 17.24a-d compares an ADF image with its ptychographical counterpart. The light nitrogen atoms, easily visible by ptychography, are entirely absent in the ADF image. Similarly, Fig. 17.25 shows a very light structure (\(\mathrm{C_{60}}\) inside carbon nanotubes), imaged with high phase contrast via ptychography, together with the ADF picking out a few heavy atoms. The ptychographic phase is also shown to have higher contrast than other, less comprehensive, sector-detector type phase imaging methods. The combination of ptychography with ADF imaging may well prove to be the most effective use of electron ptychography. Note that both Figs. 17.24a-d and 17.25 were reconstructed using the WDD inversion method (Sect. 17.10).

As a very final note, we remark that while this chapter was in press, Jiang et al. [17.81] demonstrated for the first time that electron ptychography can surpass the resolution limit of an aberration-corrected electron lens, thus obtaining the highest resolution transmission image ever recorded.

5.2 Fourier Ptychography

As we saw in the previous section, a scanning transmission microscope employs a lens to focus the image of a small bright source onto the object; the image is constructed by scanning this tightly focused spot across the object while recording the transmitted intensity, which falls on a detector downstream of the object. The optical set up in a conventional transmission microscope would at first appear to be very different. The object is illuminated by a plane wave, and the resulting exit wave is brought to a focus at an image plane by a lens lying downstream of the object. During the late 1960s, when the first STEM instruments were developed, there was some confusion in the community when it was realized experimentally that the bright-field STEM image has features identical to those of the TEM image, such as Fresnel fringes and limited contrast transfer, despite the fact that the two images are formed in completely different ways. It was Cowley who first suggested that the well-known principle of reciprocity could account for this equivalence [17.82]. This states that if we have a source of radiation at a point A, which has an intensity \(I_{\mathrm{A}}\), and we record an intensity \(I_{\mathrm{B}}\) due to this source at another point B somewhere else in the optical system, then the reverse of this experiment will give the same result: if B radiates with intensity \(I_{\mathrm{A}}\), the signal at A due to that source will be \(I_{\mathrm{B}}\). With reference to Fig. 17.26a,b, we can now see that our two types of microscope—scanning transmission and conventional transmission—encode identical information. All we have to do is reverse the directions of the rays in the ray diagrams of the two machines.

Fig. 17.26a,b
figure 26

Two very different configurations of transmission microscope. (a) STE/XM, (b) a conventional microscope with tilted illumination. Via the principle of reciprocity, both set ups can collect the same information

Consider a single pixel in the detector plane of a scanning transmission microscope. Keeping all the optical components the same, we now replace that pixel with a source of radiation, and we place a detector at the point originally occupied by the source. Remember, our scanning mode detector was positioned in the Fraunhofer diffraction plane a long way away from the specimen, so its coordinates are a function of angle. When replaced by a source, the incident radiation bathes the whole specimen with a tilted plane wave, as illustrated in the lower half of Fig. 17.26a,b. In the conventional transmission microscope we do not need to scan a probe, because the image arrives simultaneously over the whole image plane. Rather, each image pixel is, via reciprocity, like a different probe position, because the effect of moving the source in a scanning transmission machine is to move the probe. In short, we have a four-dimensional data set that can be collected in two ways: as a set of diffraction patterns recorded as a function of probe position, or as a set of images recorded as a function of plane-wave illumination angle. It stands to reason that we can, therefore, use this reciprocal configuration to do everything that conventional ptychography can do. The method is nowadays called Fourier ptychography. It was first proposed by Hoppe, shortly after his work on ptychography [17.83].
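
In terms of the recorded data, reciprocity is simply a reindexing of one and the same four-dimensional array. A minimal sketch (array names and sizes are arbitrary assumptions, and the sign and coordinate conventions that depend on the particular optics are ignored):

    import numpy as np

    # STEM-style acquisition: one far-field pattern (ky, kx) for each probe position (ry, rx)
    stem_data = np.zeros((32, 32, 64, 64))            # shape (ry, rx, ky, kx)

    # Fourier-ptychography-style acquisition: one image (ry, rx) for each illumination tilt (ky, kx)
    fp_data = np.transpose(stem_data, (2, 3, 0, 1))   # shape (ky, kx, ry, rx)

    # By reciprocity, stem_data[ry, rx, ky, kx] and fp_data[ky, kx, ry, rx] sample the
    # same information about the specimen, collected in two physically different ways.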

Consider a conventional microscope in which the illumination is a coherent plane wave traveling parallel to the optic axis. In the back focal plane of the lens, we see the conventional parallel beam diffraction pattern. If the specimen does not scatter too strongly, this will consist of a bright spot on the optic axis with weaker diffraction amplitude from the specimen lying around it, as shown in Fig. 17.27. Now, when we tilt the beam, the bright central spot will move laterally and, provided the specimen is not too thick, the diffracted amplitude will shift with it by the same amount. If we place an aperture also in the back focal plane, then we have constructed a sort of ptychographic experiment. The shifting diffraction pattern is like the object wave we want to solve for—except in this case, it happens to be a diffraction pattern. The aperture is like the conventional illumination function. Our data are recorded in the Fourier domain of these two functions, which in this case is the image plane. Now, the folding (convolution, or ptycho) of the wave intermixture is the convolution of the impulse response function of the lens/aperture with the exit wave of the object, and it is this convolved image that is recorded in intensity. All the general principles of ptychography apply. If we are going to call this technique Fourier ptychography, we have to rename conventional ptychography as real-space ptychography.
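
This paragraph translates directly into a simple forward model: shift the object spectrum according to the illumination tilt, mask it with the pupil (the illumination function in this reciprocal picture), and record the intensity of the resulting low-resolution image. A minimal NumPy sketch, in which the object, the pupil radius, and the list of tilts are all assumed purely for illustration:

    import numpy as np

    n = 256
    rng = np.random.default_rng(1)
    obj = np.exp(1j * 0.5 * rng.random((n, n)))        # assumed complex specimen exit wave

    # Small circular pupil in the back focal plane (low numerical aperture)
    fy, fx = np.meshgrid(np.fft.fftshift(np.fft.fftfreq(n)),
                         np.fft.fftshift(np.fft.fftfreq(n)), indexing="ij")
    pupil = (np.hypot(fy, fx) < 0.05).astype(float)

    spectrum = np.fft.fftshift(np.fft.fft2(obj))        # centered object spectrum

    images = []
    for sy, sx in [(0, 0), (0, 20), (20, 0), (20, 20)]:       # illumination tilts, in spectrum pixels
        shifted = np.roll(spectrum, (-sy, -sx), axis=(0, 1))  # tilting the beam shifts the spectrum
        low_res = np.fft.ifft2(np.fft.ifftshift(shifted * pupil))
        images.append(np.abs(low_res) ** 2)                   # intensity recorded in the image plane

A reconstruction engine then seeks the single spectrum (and, if necessary, the pupil) consistent with all of these intensity images, just as a real-space engine seeks the object and probe consistent with all of the diffraction patterns.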

Fig. 17.27
figure 27

Diffraction amplitude at the back focal plane of a conventional microscope. As the illumination is tilted, different parts of the diffraction pattern are steered through the lens. In Fourier ptychography, the aperture is treated as an illumination function, the diffraction pattern as the object. By shifting the angle of illumination a wide area of the diffraction plane can be reconstructed

In visible light Fourier ptychography, the imaging system often has a low numerical aperture, meaning that every image recorded in the image plane has very poor resolution. Once a large diffraction pattern has been calculated using the ptychographical methods, we can transform back and obtain a very high resolution picture. Why would anyone want to do this? After all, nowadays optical lenses are very good indeed. One obvious reason is that we end up with both the modulus and phase of the image, which is very important for imaging transparent objects such as biological cells. However, another key advantage is very high resolution combined with very large field of view. Suppose we have a CCD camera with \({\mathrm{1000}}\times{\mathrm{1000}}\) pixels. If we deliberately stop down the imaging lens so it has very poor resolution for each image we record (because the aperture stop in the back focal plane is so small), we can demagnify the image on the CCD and capture a very large field of view. If we now step the diffraction pattern in the back focal plane through enough incident beam tilt angles to extend the field of view of the diffraction pattern (not to be confused with the field of view in the image plane, which determines the pixel pitch of the diffraction pattern), say by a factor of 10, and then back transform to the image plane, we have a high-resolution image of a wide field of view—\({\mathrm{10000}}\times{\mathrm{10000}}\) pixels. This wide field of view is vital for things like counting abnormal cells in a cell culture, where statistics from huge numbers of cells are key.

The physical set up of visible light Fourier ptychography also has some significant advantages over its real space counterpart. The different angles of illumination can be generated by an array of light emitting diodes (LEDs), so neither the illumination function nor the object function has to be moved. Moving the illumination in any optical set up, say by using deflection coils in an electron microscope or by using mirrors or prisms in the case of visible light, invariably changes the shape of the probe via the introduction of aberrations or phase gradients, so that one of the principal constraints of ptychography is lost. There are ways around position-dependent probe variations (Sect. 17.4.4), but it is preferable to avoid this complication. The disadvantage of moving the specimen is that it takes time, and there is invariably hysteresis in mechanical stages, although we have already described how computational refinement of probe positions is possible [17.57]. In contrast, a Fourier ptychographic microscope can be made as a fixed structure with no moving parts, with a fast readout camera easily synchronized with the switching of each illumination source.

Figure 17.28 shows an image of an optical microscope, modified with an LED array mounted on a Lego structure, which was used to generate the first published visible light Fourier ptychographic image [17.41], together with a demonstration of the resolution improvement over the raw data. Figure 17.29 shows the reconstruction process, in both real space and reciprocal space. Figure 17.30 shows an example of a biological structure imaged using the approach. The final reconstruction here is composed of 0.9 gigapixels.

Fig. 17.28
figure 28

(a) A Fourier ptychography microscope. (b) A conventional microscope has been modified using a Lego framework so that an array of LEDs can illuminate the object at different angles (see bottom of Fig. 17.26a,b). (c) A typical image collected at low resolution (a single illumination tilt) with magnified section below. (d) Ptychographically reconstructed image at the same magnification

Fig. 17.29
figure 29

Iterative reconstruction for Fourier ptychography. Top three images show different illumination angles. Below these, the raw data for a bright-field image and two typical dark-field images. Bottom three central frames show the back-focal plane reconstruction developing. Far right is the recovered image intensity (top) and phase (bottom). From [17.41]

Fig. 17.30
figure 30

(a) Example of a very wide field of view, high-resolution, 0.9 gigapixel image reconstructed by Fourier ptychography [17.41]. (c2,c3) are conventional images taken with \({\times}20\) and \({\times}2\) objective lenses, respectively. (b,c1,d,e) are the corresponding high resolution Fourier ptychography reconstructions enlarged from the four areas in (a) identified by the respective red dotted lines. Inset in (a) shows the scale of the total field of view relative to a US quarter coin (diameter \({\mathrm{25}}\,{\mathrm{mm}}\)) for the \({\times}20\) and \({\times}2\) objectives

Fourier ptychography has also been undertaken in the electron microscope, the original concept predating the recent interest in the visible light version by 40 years [17.84]. It is generally called tilt-series reconstruction in electron microscopy, and has shown the ability to improve resolution over and above that of a good electron lens [17.85]. It is impractical to have an array of electron sources, as is used with visible light, so the single illuminating beam must be scanned through a range of discrete incident angles by double-deflection coils, which also have a habit of suffering from hysteresis. Figure 17.31a-c shows an example of the raw data acquired in an electron microscope as a function of illumination angle, and the corresponding reconstruction in the back focal plane. Figure 17.32 illustrates the gain in resolution in real space over and above a conventional through-focal series reconstruction, which uses only one normally incident beam.

Fig. 17.31a-c
figure 31

Fourier ptychography in the electron microscope, where it is usually called tilt-series reconstruction. (a) Raw data from various tilt angles. The specimen is silicon orientated on the \(\langle 112\rangle\) zone axis. (b) The region of reciprocal space passing through to the conventional image. (c) The region of reciprocal space reconstructed. Reprinted with permission from [17.85]. Copyright 2009 by the American Physical Society

Fig. 17.32
figure 32

(a) Diffractogram (modulus of the Fourier transform of the image) for the conventional image, shown in (b). (c) Diffractogram of the reconstructed image, clearly extending much further into reciprocal space. (d) The final high-resolution reconstruction, including the expected calculated exit wave in color. Reprinted with permission from [17.85]. Copyright 2009 by the American Physical Society

Since 2013, when visible light Fourier ptychography was first demonstrated, there has been a great deal of research undertaken on it, and the field is expanding very quickly. All the inversion algorithms developed for real-space ptychography apply equally well, with one or two minor alterations, to Fourier ptychography. Indeed, all the key developments in real-space ptychography have been reproduced in Fourier ptychography, and many have been superseded; see, for example, [17.29, 17.61, 17.86, 17.87, 17.88, 17.89, 17.90, 17.91, 17.92, 17.93]. If you understand reciprocity, everything we have discussed in Sect. 17.4 with respect to sampling, diversity, and reconstruction refinement still applies, as do most of the methods we discuss in Sects. 17.6 and 17.8. The field is also making significant contributions to the theory of the inverse problem in ptychography, which we discuss in Sect. 17.9.

In the visible light domain, the Fourier configuration of ptychography is extremely promising. For more information, the interested reader is directed towards the book by Zheng, which was recently published on the subject [17.94].

5.3 Selected Area Ptychography (SAP)

Another configuration where the reconstruction is of a wavefield instead of a physical object, in this case an image formed by a lens, is called selected area ptychography, or SAP. With reference to Fig. 17.33, a conventional microscope with the specimen illuminated by coherent radiation is used to form a conventional image. An aperture is placed in the plane of the image, and the resulting diffraction pattern is recorded some distance downstream of the aperture. The specimen is physically moved laterally, so that the image wavefield moves across the aperture. We treat the image as our object function and the aperture as our illumination. Once again, everything we have said about real-space ptychography as far as reconstruction algorithms goes, applies. All electron microscopes have a selected area (SA) aperture in the first image plane in order to select one area of an object from which to obtain a diffraction pattern. This is used to characterize small areas of a specimen that may be composed of very small crystal grains or small isolated objects. Hence, SA ptychography: SAP. The configuration has been shown to work in the electron microscope in a proof-of-principle experiment [17.48] (Fig. 17.34a-e). Unlike real-space electron ptychography, it can image a very large field of view and may well compete with conventional electron holography, say for mapping electric or magnetic fields.

Fig. 17.33
figure 33

The SAP configuration. The object is moved, causing its image to move relative to a selected area aperture in the first image plane of the objective lens. The detector lies in the Fraunhofer plane (or more normally the Fresnel diffraction plane) of the aperture

Fig. 17.34a-e
figure 34

Example of electron SAP. (a) The conventional bright-field image. The spherical latex balls appear with flat contrast because the contrast mechanism relies on weak phase. In the unwrapped ptychographic phase (b), the strong phase is rendered perfectly. (c) The phase wraps, clearly confirming that these objects are very strong phase objects. The fractured phase wrap in the balls is because of the gold structure underlying the balls. (d) Phase (colored) and modulus of the low magnification ptychographic reconstruction. (e) is the bright-field image of the aperture, showing that the reconstruction modulus is accurate. From [17.48]

To date, the most extensive use of SAP has been at visible light wavelengths, where it is commercially available as a means of characterizing biological cell life cycles [17.63, 17.95]. The main advantage of the technique is that the full coherent resolution capability of the optical objective lens can be exploited, giving high-resolution and extremely high quality images, together with a clean phase image. The latter is crucial for imaging live, unstained, or unlabeled cells. Resolution can be increased further by arranging for the illumination to include a range of incident angles within it. We showed an example image using this technique in Fig. 17.11a,b. Because the free space background phase signal is so flat, and there are no ringing effects in the image caused by distortion of low frequencies (as happens, say, with Zernike phase contrast ), even very weakly scattering transparent objects appear with high contrast. This means that segmentation of the image is easy and accurate, allowing for reliable cell counting statistics and the measurement of other biologically important parameters such as reproduction rates, motility, cell volume, etc. (Fig. 17.35a,b). Very long experiments (over several days) can be performed (in a suitable cell incubator) without the need to refocus the image, which can be achieved computationally post data acquisition.

Fig. 17.35a,b
figure 35

Visible light SAP (see also Fig. 17.11a,b). One of the great advantages of the technique is that cells do not need to be stained or labeled and so can be observed over days reproducing and moving. (a) This image shows some cells that are at various stages in the process of division. (b) The phase image is particularly amenable to precise segmentation. From [17.95]

5.4 Fresnel Full-Field Ptychography

An early definition of ptychography suggested that a prerequisite for the method is that the illumination function is localized, so that the convolution in the Fourier domain allows diffraction components to interfere with one another. This has nowadays proved to be overprescriptive. Consider the two experiments shown in Fig. 17.36a,b. Figure 17.36a,ba consists of a corrugated wave front (i. e., the surfaces of constant phase depart significantly from plane surfaces) incident upon the object. Behind the object, but relatively close, is the detector. To work out the intensity of the radiation at the detector plane, we add up a sum of Huygens' elementary spherical waves, each centered on one point of the exit wave function and having the modulus and phase of the exit wave function at that point. The real part of the impulse response of any one of these waves looks something like the graph on the right-hand side of Fig. 17.36a,ba.

Fig. 17.36a,b
figure 36

Full-field Fresnel ptychography. (a) The incident wave must have structure in order to provide ptychographical diversity. Wavelets scattered from the object have most influence on the detector pixels directly downstream of them. This is because of the stationary phase effect of the Fresnel integral (right). (b) Shadow imaging to increase the magnification of the technique

This intermixture of the waves has a similar effect to the convolution in Fourier domain ptychography, although how the wave components add together is rather different. The Fourier integral involves the whole object at once, adding all rays that head off to the detector at a particular angle. In the near field, the propagation integral also adds rays arriving at any one detector pixel, but their path lengths and angles vary considerably from one position on the object to the next. Like the Fourier integral, it seems as if all elements of the object contribute wave amplitude to each detector pixel. However, the pixel size of the detector means that the intensity of any one pixel is only affected by a rather localized region of the object exit wave. Over the surface of the detector, the elementary spherical wave has, beyond a certain width, very rapidly varying oscillations. This is a stationary phase effect. An elementary spherical wave from one element of the object gives a large contribution to the scattering integral over an area of the detector where its phase is substantially flat, i. e., of roughly constant value. Detector pixels laterally displaced from the source of the elementary wavelet experience a quickly changing phase. Beyond a lateral displacement of more than the radius of the first Fresnel zone (the area in which the phase changes by less than \(\uppi\)), the phase begins to change very rapidly, roughly as the square of the lateral displacement. There will very quickly come a point where the size of the pixel is such that it spans many phase cycles, so that the contribution from the wavelet integrates to zero. Partial coherence in the illuminating beam (i. e., a finite source size) exacerbates this effect. So, in Fresnel ptychography we do not need a localized source; the Fresnel integral itself defines a localized area that contributes to any one detector pixel, although the local area so defined is different for each detector pixel.

Fresnel ptychography requires us to move the object laterally with respect to the illumination. Of course, if the illumination is a simple plane wave, the out of focus image on the detector will just move laterally without changing at all, thus not giving us any information. That is why the illumination must have diversity—the wavefronts must be distorted or uneven. From there, the iterative solution of the ptychographic phase problem can proceed as before, using any of the standard algorithms, the only difference being a change from the Fourier propagator to one that models the physical propagation from the object to the detector. Note also the comments in Sect. 17.5.11.
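
In an iterative engine, the change described above amounts to swapping the far-field Fourier transform for a near-field propagator. A minimal angular-spectrum propagator is sketched below; the wavelength, pixel size, and propagation distance are placeholder values, and the uniform field stands in for what, in a real model, would be the corrugated illumination multiplied by the object.

    import numpy as np

    def angular_spectrum(wave, wavelength, pixel, distance):
        """Propagate a square 2-D complex wavefield by `distance` (angular spectrum method)."""
        n = wave.shape[0]
        fy, fx = np.meshgrid(np.fft.fftfreq(n, d=pixel),
                             np.fft.fftfreq(n, d=pixel), indexing="ij")
        kz = 2 * np.pi * np.sqrt(np.maximum(0.0, 1.0 / wavelength**2 - fx**2 - fy**2))
        return np.fft.ifft2(np.fft.fft2(wave) * np.exp(1j * kz * distance))

    # Illustrative numbers only: visible light, 2 um pixels, 5 mm object-to-detector distance
    exit_wave = np.ones((512, 512), dtype=complex)
    detector_wave = angular_spectrum(exit_wave, 500e-9, 2e-6, 5e-3)
    model_intensity = np.abs(detector_wave) ** 2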

The experiment shown in Fig. 17.36a,b will not give us any magnification of the object; the reconstruction has the same resolution as the detector pixel pitch (although the phase is recovered). X-ray near-field ptychography was, therefore, undertaken in the x-ray shadow image microscopy mode that was pioneered by Cosslett and Nixon [17.97] in the 1950s, see Fig. 17.36a,bb. The ratio of the distances from the source to the detector and from the source to the object determines the magnification. Figure 17.37a-n shows a conventional shadow image taken in this configuration using \({\mathrm{16.9}}\,{\mathrm{keV}}\) hard x-rays [17.96]. The source was generated by the focused beam crossover created by two Kirkpatrick–Baez (KB) mirrors. Figures 17.37a-na,f show raw data taken in this configuration, without and with a fixed diffuser in the beam path, respectively. Figures 17.37a-nb–e and g–j show magnified versions of Fig. 17.37a-na,f when moving the illumination. Figures 17.37a-nk,m are modulus and phase reconstructions without a diffuser. Figures 17.37a-nl,n are improved images using the diffuser. Interestingly, even without the diffuser, the beam line optics, which, of course, always introduce some minor imperfections in the wavefield, have introduced enough diversity in the incident wavefield for the reconstruction to work. However, with the diffuser in place, the reconstruction is much better. This is an example of diversity improving ptychographic data.

Fig. 17.37a-n
figure 37

Example of hard x-ray near-field ptychography. Raw data (a) without and (f) with a diffuser. (e,g) Raw data as they are scanned. The data using the diffuser vary more rapidly, implying that the diversity of the raw data is greater. (k) Modulus and (m) phase for no diffuser. (l) Modulus and (n) phase with the diffuser. From [17.96]

Near-field ptychography has several advantages. The field of view is large, even when only a few specimen positions are used. Strictly speaking, only four scan positions are needed to recover the complex components of each pixel in the \(x\)- and \(y\)-directions. (This is similar to the need for four sector detectors discussed in Sect. 17.5.1.) Of course, more specimen positions are beneficial because they further constrain the solution; the results shown in Fig. 17.37a-n nevertheless used only 16 positions. In real-space ptychography, the diffraction pattern always has a high dynamic range, especially between the bright unscattered beam and dark-field features lying at high scattering angles. This can make it very difficult to choose an appropriate exposure time; what is correct for the unscattered beam is far too short for the scattered beams. Conversely, in near-field ptychography, the whole detector is evenly illuminated, which makes setting the optimal exposure time easier.

5.5 Defocused Probe Ptychography

With reference to Fig. 17.38, we can use a lens to form a convergent beam, but rather than place the object at the exact focus of the probe, we can defocus it somewhat, or equivalently move the specimen upstream or downstream of the beam focus. This configuration combines near-field ptychography (Fig. 17.36a,bb) with focused probe scanning transmission microscopy (STE/XM). It differs from near-field ptychography in that the degree of defocus is relatively small, so that the field of view within the central disc is small, and in that diffracted data lying at high angles outside the bright disc are also processed. It is, therefore, a complicated mixture of Fresnel-type interference and Fourier domain diffraction. Of course, the reconstruction process remains the same, the only difference being that the probe structure is dominated by curved wavefronts. If that curvature is included correctly, the far-field pattern is just the Fourier transform of the exit wave. We do not have to use the Fresnel integral for the central disc, because premultiplication by the curved phase distribution, followed by a Fourier transform, is itself a way of constructing the Fresnel integral.
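
The final point, that a curved-phase premultiplier followed by a single Fourier transform reproduces the Fresnel integral, can be written down in a few lines. In the sketch below the sampling, wavelength, and defocus are illustrative assumptions, and the constant and output-plane phase prefactors of the Fresnel integral are dropped because only the intensity is measured.

    import numpy as np

    def fresnel_one_step(exit_wave, wavelength, pixel, z):
        """Fresnel propagation as: quadratic-phase premultiplier, then one Fourier transform."""
        n = exit_wave.shape[0]
        y, x = np.meshgrid((np.arange(n) - n / 2) * pixel,
                           (np.arange(n) - n / 2) * pixel, indexing="ij")
        chirp = np.exp(1j * np.pi * (x**2 + y**2) / (wavelength * z))   # curved-wavefront factor
        return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(exit_wave * chirp)))

With the known curvature folded into the probe estimate in this way, the detector-plane model reverts to a plain Fourier transform, exactly as in the focused-probe case.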

Fig. 17.38
figure 38

Defocused probe ptychography. The far field is a magnified Fresnel shadow image (Fig. 17.36a,b) that also contains high-angle dark-field intensity

This type of defocused probe is very commonly used in synchrotron x-ray ptychography , because by moving the specimen forwards and backwards away from the beam crossover, the diameter of the probe can be changed at will. Thus, the probe size and the step size can be matched to the field of view [17.98]. Another benefit is that it helps keep the number of exposures small, which is important when the duty cycle of the camera readout and/or the settling time of the stage make up a significant proportion of the total elapsed time of the whole experiment. This is especially true of ptycho-tomographic scans that can take many hours or even days.

Yet another benefit of having a large probe is to limit dose-rate specimen damage effects, at least for electron ptychography , where damage can be severe. As we have emphasized, ptychography is a dose-fractionation method; moving the illumination by large or small step sizes does not affect the total dose that needs to go through the sample in order to produce an image with adequate signal to noise in each reconstruction image pixel. However, there is some evidence that the dose rate can be as important as the total dose, for example, in the time dependence of damage observed by electron energy loss spectroscopy [17.99]. It is possible that ions displaced by knock-on damage can relax back into their original location if there is sufficient time to do so before the next knock-on event occurs. That means that a low dose rate per unit area may induce less damage for the same amount of total dose. Using a large probe achieves exactly this. It is also possible that a large area of illumination will ameliorate other annoying problems that arise in electron microscopy, such as the build up of contamination (sometimes exacerbated by a focused probe) or local charging of the specimen, which can lead to uncontrolled and sudden specimen movement.

Figure 17.39a,b shows an example of a defocused probe electron ptychograph obtained from an SEM. This was not a STEM operating at high accelerating voltage, but a conventional SEM (an FEI Quanta 600) operating at \({\mathrm{30}}\,{\mathrm{keV}}\), with a two-dimensional detector mounted below the specimen stage. The stage had also been modified to accommodate a transmission specimen, which in this case was a standard TEM resolution test specimen consisting of small gold particles sitting on a thin amorphous carbon support film. Figure 17.22a,ba, discussed previously, shows an example of the raw data. Although it is hard to see, the central disc, which in electron microscopy is called the Ronchigram, has some structure within it, essentially the same as the Fresnel near-field image in the equivalent x-ray experiment shown in Fig. 17.37a-n. The difference here is that the range of illumination angles in the beam is small, and the experiment is also going to process the dark-field diffraction peaks lying well outside the central discs—indeed, this is where all the high-resolution information in the experiment comes from.

Fig. 17.39a,b
figure 39

Electron ptychography with a defocused probe in an SEM run in transmission mode (as a STEM), with a detector mounted at the bottom of the specimen chamber. (a) The specimen is a standard test specimen consisting of gold particles on amorphous carbon. Phase is represented by color, modulus by brightness. From [17.69]. The enlarged image (b) has used the same raw data as (a), but the contrast has been improved over that in the original paper [17.69] by employing modal decomposition to remove partial coherence effects. See Sect. 17.8

Atomic fringes are visible in some gold particles—those that are orientated on a zone axis. The smallest fringes visible are separated by \({\mathrm{0.23}}\,{\mathrm{nm}}\), corresponding to an increase in resolution over the lens capability by a factor of about 5. These results imply that we could dispense with conventional TEMs, at least for imaging (as opposed to focused-probe analysis), and use instead a rather less costly SEM fitted with a transmission detector. In fact, careful inspection of Fig. 17.39a,b shows that some of the atomic fringes are delocalized from the gold particles—a problem similar to one that arises in defocused TEM images, and one that is fatal for accurately determining the exact position of atomic columns. Delocalization is particularly sensitive to any intensity pedestal or read-out noise in the detector. The comprehensive solution would be to have a single-electron counting detector.

From the point of view of the ease of reconstruction, using a defocused probe has both strengths and weaknesses. The central disc is essentially a Gabor hologram. In any iterative reconstruction, this means that the first low-resolution image of the object should be quite a good holographic estimate. Indeed, it is known that cCDI reconstructions are improved in this Fresnel mode [17.100]. If there are errors in the probe positions, this is very obvious in the Ronchigram and can be used to coarsely correct those errors, provided only defocus (and not higher-order aberrations) is present [17.101]; in this case, the Ronchigram is an undistorted near-field image of the object.

Unfortunately, reconstructing data from curved wave illumination also has several hazards, which can quickly lead to stagnation of the reconstruction process, or simply give a completely wrong solution. Defocus corresponds to adding an extra curved phase to the transfer function of the lens. This curvature will also appear across the Ronchigram disc in the far field. In real space in the sample plane, there is also a corresponding curved phase over the illumination. In an iterative reconstruction, a phase of the correct curvature must be seeded into the first estimate of the probe in real space.

Suppose that the probe is physically defocused so it has a diameter \(D\). There is only one phase curvature over this probe that will give rise to a disc in the far field of the correct, recorded diameter. As an extreme example, suppose our first guess of the probe has no phase across it at all. Propagating this to the far field will give us an Airy disc —an intensity distribution only a fraction of the width of the measured far-field disc. When we apply the Fourier constraint, creating a bright disc of modulus, and back Fourier transform, we have a function that is nothing like our real probe function. In fact, it will probably be so small that it will not even overlap with the adjacent probes that were used to create the data. Recovering from such a remote position in the solution space is virtually impossible.
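
In practice, this means seeding the algorithm with a probe whose curvature matches the known (or estimated) defocus. One common way to build such a seed is sketched below; the wavelength, aperture semi-angle, defocus, and sampling are assumed values, and the sign convention for the defocus phase varies between reconstruction packages.

    import numpy as np

    n = 256
    wavelength = 1.97e-12        # e.g. ~300 keV electrons (illustrative)
    defocus = 500e-9             # known physical defocus (illustrative)
    alpha_max = 20e-3            # probe-forming aperture semi-angle in rad (illustrative)

    # Angular grid in the aperture plane, extending beyond the aperture edge
    alpha = np.linspace(-2 * alpha_max, 2 * alpha_max, n)
    ay, ax = np.meshgrid(alpha, alpha, indexing="ij")
    a2 = ax**2 + ay**2

    aperture = (a2 <= alpha_max**2).astype(complex)
    chi = np.pi * defocus * a2 / wavelength       # defocus term of the aberration function
    probe_seed = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(aperture * np.exp(-1j * chi))))
    # probe_seed carries the curved wavefront and a diameter of roughly 2*defocus*alpha_max,
    # so its modelled far field matches the recorded disc diameter from the first iteration.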

A further problem is that the magnification of the Ronchigram image is also a function of defocus (as is obvious from Fig. 17.38). This means that if the step size in real space is poorly calibrated, the only way the reconstruction algorithm can reconcile the conflicting data is to both increase (or decrease) the magnification of the image and increase (or decrease) the size of the illumination, which is achieved by changing the estimated defocus of the illumination. The result is a reconstruction that looks out of focus, but it cannot be put back into focus simply by repropagating to the correct plane, because the reconstruction does not relate to any actual physical plane within the wave disturbance; it is just the best estimate of the object given by the conflicting data. The same effect occurs if the object to detector distance is not measured accurately, something which is always poorly calibrated in the conventional transmission microscope, where the intermediate lenses are used to form the diffraction pattern.

In x-ray ptychography, the object to detector distance is fixed and can be measured very accurately. The stepper motors used to scan the object are also usually well calibrated, as is the focal length of a zone plate lens. It is also easy to measure the distance the object has been moved out of the beam crossover, again by using a stepper motor in the \(z\)-direction. Provided the correctly defocused probe function, which can be calculated immediately from these experimental parameters, is input at the start of the reconstruction algorithm, the problems described above rarely apply.

5.6 Diffusers

As we remarked in Sect. 17.4.2, the bandwidth of ptychography in the sense of the transmission line in Fig. 17.1a-dd is a function of both the probe structure and the object structure. An object that has broad flat features, i. e., one that has low entropy, is, in general, more difficult to reconstruct. The probe and the object appear equivalently in the mathematics of ptychography, except that the probe contributes to every diffraction pattern, whereas each region of the object is only expressed in a few diffraction patterns (an exception to this is near-field ptychography). It is incontrovertible that having no structure in the illumination—a flat plane wave covering the whole of the object plane—cannot possibly give us any ptychographical information at all. It would seem logical, therefore, that having a probe function with lots of structure can greatly reduce the likelihood of encountering a probe-specimen combination that cannot easily be reconstructed.

Think of a probe composed of random phase and modulus. The randomness appears in both real and reciprocal space, in the latter appearing as a well-developed speckle pattern. As we discussed in Sect. 17.4.6, this sort of pattern is excellent for reconstructing gaps in the data that have not been recorded in the diffraction plane, either because of missing pixels or because the detector is too small, so that intensity has fallen outside it. At the other extreme, a simple large aperture with flat phase has a tiny Airy disc response in the far-field, so the ptychographical convolution at any one pixel is only substantially affected by a few pixels around it. The diffuse probe would seem to constrain the data set much more effectively than a simple probe.
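
The difference is easy to see numerically by comparing the far-field response of a flat-phase aperture with that of the same aperture carrying a random phase screen; the latter gives the broad, well-developed speckle pattern described above. A small sketch (the sizes and the crude diffuser model are arbitrary assumptions):

    import numpy as np

    n = 256
    y, x = np.meshgrid(np.arange(n) - n / 2, np.arange(n) - n / 2, indexing="ij")
    aperture = (np.hypot(y, x) < 40).astype(float)

    phase_screen = np.random.default_rng(2).uniform(0, 2 * np.pi, (n, n))   # crude diffuser model

    flat_far = np.abs(np.fft.fftshift(np.fft.fft2(aperture))) ** 2          # tight Airy-like response
    diffuse_far = np.abs(np.fft.fftshift(np.fft.fft2(aperture * np.exp(1j * phase_screen)))) ** 2
    # The diffuse probe spreads counts over many detector pixels, making fuller use of the
    # detector dynamic range and (as argued above) constraining the data set more strongly.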

The choice of probe positions also matters. If we integrate all the flux that has passed through the object, summing up the intensity that arrived at it from all the probe positions, it would be unfortunate to find that some areas had not been illuminated. We could never possibly reconstruct the object at those points. A probe with strongly varying random modulus would be likely to span the object with a relatively even total flux. Sharp features in the probe also make the intensity at each detector pixel change more rapidly as a function of probe position, which would seem to put more information into the recorded data. Another issue is the bit depth of the detector. An even speckle pattern is more likely to optimize the total information content read out from the entire diffraction pattern, especially if the detector is imperfect in any way, because each pixel has made the most of its available dynamic range.

The question of the optimal probe has not been fully resolved. Some insight can be offered by the WDD method, which requires a division by a function that depends on the probe. If the probe is made in such a way that this division is stable (i. e., the Wigner distribution relating to it—see (17.32)—has few minima), then there is some evidence that the reconstruction is more stable, noise robust, and accurate [17.71, 17.72]. Suffice it to say that diffusers, placed at one position or another in the optical path, generally improve the reconstruction.

5.7 Bragg Ptychography

One of the principal applications of cCDI uses a Bragg diffracted beam from a small crystalline particle. The configuration has two principal advantages. By tilting the object through a small angle, many diffraction patterns can be recorded as the 3-D Bragg reflection is scanned through the Ewald sphere, thus plotting out the intensity of the 3-D Fourier transform of the object. Using the knowledge that the particle is finite, one can then use single-shot Fienup-type iterative methods to recover the phase of the volumetric plot of the reflection and thus reconstruct the shape of the crystal. More interestingly, any departure from perfect crystallinity will alter the intensity and phase of parts of the reflection. The Bragg condition by definition assumes a fixed phase relationship between all the scattering points (atoms) in the object. If atoms become displaced, say by a strain field, then these relative phases change. Correspondingly, the real space reconstruction of the object will have internal phase shifts mapping the strain field [17.102].

Hruszkewycz et al. were the first to demonstrate experimentally that the same principle can be applied to ptychography [17.103]. They investigated the strain of an epitaxial SiGe layer grown on a silicon-on-insulator (SOI) device. The geometry of the experiment is shown in Fig. 17.40. A cross-sectional TEM image of the object and the measured strain maps are shown in Fig. 17.41a-d. The phase of the ptychographic reconstruction gives a direct measure of the displacement of the atom planes in the SiGe; the derivative of this gives the slope of the planes, which can be mapped out and compared with calculations. More recent work has demonstrated that the method can also be extended to mapping 3-D strain in semiconductors [17.104, 17.105].
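
In the usual formulation of Bragg CDI and Bragg ptychography, a displacement field \(\mathbf{u}(\mathbf{r})\) of the atoms appears in the reconstructed complex density as a phase \(\phi(\mathbf{r})=\mathbf{Q}\cdot\mathbf{u}(\mathbf{r})\), where \(\mathbf{Q}\) is the scattering vector of the chosen reflection; this is why the reconstructed phase in Fig. 17.41a-d can be read directly as a map of the displacement component along \(\mathbf{Q}\).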

Fig. 17.40
figure 40

Geometric set up for Bragg ptychography . Reprinted with permission from [17.103]. Copyright 2012 by the American Chemical Society

Fig. 17.41a-d
figure 41

Example results from a Bragg ptychography experiment. (a) Cross-sectional TEM view of the sample. (b) Ptychographical reconstruction in modulus and phase (color coded). The phase is proportional to displacement from the unstrained condition. (c) Displacement and (d) its derivative, the latter being proportional to the curvature of the atomic planes. Reprinted with permission from [17.103]. Copyright 2012 by the American Chemical Society

Bragg ptychography has potentially very important applications in the semiconductor industry, where strain induced by epitaxy of materials with dissimilar unit cell size can be used to control the nature of the bandgap. Although local strain can be measured by electron microscopy, the need to prepare a thin sample leads to relaxation of the strain; inference of the original bulk strain is then difficult. X-ray Bragg ptychography can deal with bulk materials in their original state of strain, although there are limitations on the depth of penetration into the surface of the material and, relative to electron microscopy, the resolution of the technique.

5.8 Visible Light Reflective Ptychography

Visible light has been used to demonstrate ptychography in the reflective configuration, with both the illumination and detector normal to the surface of interest, as shown in Fig. 17.42. Clearly, the phase of the reflected beam is sensitive to surface topology, and vertical sensitivity has been shown to be comparable with white light metrology [17.106]. The comparison is shown in Fig. 17.43. There is a very wide array of competing surface topology measurement techniques, and so it is unlikely that visible light reflective ptychography will have wide application, even though these early results could be significantly improved upon. A complication is that when measuring a structure with vertical features larger than half the wavelength, multiple phase wraps abound in the image; this is a very serious problem when a large step change in height is encountered. A solution is to employ a second color of light, in a second experiment, very close in wavelength to the first, thus generating a large artificial wavelength by forming the difference between the two phase images, as demonstrated in [17.106].
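
For two wavelengths \(\lambda_{1}\) and \(\lambda_{2}\), the standard two-wavelength relation gives a synthetic wavelength \(\Lambda=\lambda_{1}\lambda_{2}/|\lambda_{1}-\lambda_{2}|\); two visible sources separated by only a few nanometers therefore extend the unambiguous height range from a few hundred nanometers to tens of micrometers.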

Fig. 17.42
figure 42

Set up for visible light ptychography in the normal incidence reflective mode. In this case, two sources, very close in wavelength, enter on the right. By switching between them, a long synthetic wavelength can be generated by combining two reconstructions. From [17.106]. ©IOP Publishing. Reproduced with permission. All rights reserved

Fig. 17.43
figure 43

(a) White light interferometry of the test structure in (b). (c) The reflective ptychographic reconstruction from the same object. From [17.106]. ©IOP Publishing. Reproduced with permission. All rights reserved

5.9 Transmission and Reflection EUV Ptychography

A considerable limitation of x-ray ptychography is the need for a synchrotron to obtain high flux and high coherence. Beam time is scarce, so experiments cannot be easily refined during a single scheduled run. A promising alternative is to use a high harmonic generation source or a laser-produced plasma EUV source in the ordinary laboratory environment. A coherent laser source can generate pulses that are very well controlled, both spatially and in time, using all of the many optical techniques nowadays used for femtosecond studies. Such pulses can be passed through a nonlinear medium, such as a gas. In the intense electric field of the pulse, electrons are almost dissociated from their respective nuclei, but accelerate and decelerate, passing through the atomic potential, thus adding harmonics to the transmitted EM wave. In this way, EUV radiation is produced.

As far as ptychography is concerned, the huge benefit of this method is that the source of radiation is essentially fully coherent, unlike a synchrotron that relies on a large distance between source and optics (and, thus, the consequent loss of useful flux) to achieve spatial coherence.

Transmission EUV ptychography can only image thin and weakly scattering transmission specimens. Figure 17.44 shows a ptychographic reconstruction of cells from a rat's brain. The resolution is about that of a visible light optical microscope. A disadvantage of EUV is that the specimen must be held in vacuum, which means that it is not possible for biological structures to be imaged wet, nondesiccated, or live. However, there may be many other potential applications to very thin objects that otherwise do not scatter light strongly. Note that this reconstruction used the varying probe algorithm described in Sect. 17.4.1. Another important application is surface topography measurement using glancing angle reflection, an example of which is shown in Fig. 17.45 [17.108]. In this geometry, care must be taken to map the detector coordinates onto the tilted scattering geometry and to account for the elongated probe shape and phase. The method has very promising applications in high-resolution semiconductor metrology.

Fig. 17.44
figure 44

Transmission EUV ptychograph of rat neurons. Color and modulus coded as in the color wheel. Courtesy of Jo Bailey and John Chad, from the Centre for Biological Sciences, University of Sheffield, and Magdalena Miszczak, Michal Odstrčil, Peter Baksh, and Bill Brocklesby, from the Optoelectronics Research Centre, University of Southampton

Fig. 17.45
figure 45

(a) Modulus and (b) phase of a reflective EUV ptychograph of a test object. The phase is essentially a topographical plot of the object. (c) The distribution of surface heights measured over the field of view by ptychography (CDI) and AFM. The fact that the peaks around \({\mathrm{32.7}}\,{\mathrm{nm}}\) are almost coincident implies the two methods give equivalent results. The reconstruction compares favorably with the AFM and SEM images. Reprinted from [17.107], published under CC-BY license

5.10 The Simple Aperture

The simplest ptychographical set up imaginable comprises a source, an aperture, a moveable object, and a detector, with no lenses or any other optical components. This was the original goal of totally lensless imaging, and at first, ptychography seemed to liberate imaging from the need for any sort of interferometer or lens at all. However, despite its simplicity, the aperture-only set up should be avoided if at all possible.

The biggest problem is the very bright central spot in the diffraction plane. All types of detector find this hard to handle. Single-photon/electron devices are count-rate limited, so that the full flux of the source cannot be employed. Over-exposed CCD pixels bleed charge into adjacent pixels. A central stop can mitigate the problem, but the power of ptychography to fill in missing pixels is at its weakest when the probe function in reciprocal space (in this case, an Airy disc) is so narrow, as we discussed in Sect. 17.5.6. Losing this low-frequency information leads to unwelcome large-scale distortions in the image.

5.11 Probe Reconstruction in Fresnel Configurations

Many of the configurations discussed have been described in terms of the detector lying in the Fourier domain. In fact, it is often convenient to place the detector, or its conjugate equivalent, nearer to the object. So, for example, in the case of SAP, the diffraction lens can be defocused to avoid the high-intensity zero-order diffraction peak.

One may suppose that the Fresnel integral must be used in the reconstruction process and that, consequently, the exact distance from the object to the detector must be known. In fact, if the reconstruction simply assumes the detector is in the Fraunhofer plane, the object function appears as usual. However, the probe function will have a phase curvature over it, with a radius equal to the object to detector distance. Without deriving the reason for this formally, we observe that the phase has the effect of a computational lens, steering parallel beams (which correspond to the Fourier integral) to a focus on the detector. The approximation is only true for small scattering angles, but is another example of how ptychography can self-calibrate.

6 Volumetric Imaging

A two-dimensional picture of an object is good, but imaging it in three dimensions is far more informative. In the field of biological imaging, cell colonies grown on a flat piece of glass cannot satisfactorily model their development in a natural three-dimensional tissue structure. There has, therefore, been a huge investment in developing reliable volumetric imaging methods, most notably in the visible light domain with confocal scanning microscopy. This is now the workhorse of many biological studies. The ability to label and map the distribution of individual proteins is a powerful component of the technique, allowing detailed studies of how genetic information is expressed within different parts of a cell.

Materials science also has a pressing need for three-dimensional information. One of the biggest weaknesses of electron microscopy has historically been the projection effect. All the three-dimensional information in the object is concertinaed into a two-dimensional image, rather like a shadow image. Electron tomography add-ons are now supplied by most electron microscope manufacturers for moderately low resolution reconstructions, and the most recent research now demonstrates atomic resolution in 3-D, which is as much as can ever be hoped for.

There are two very different ways of undertaking 3-D imaging via ptychography, which we discuss in the next two sections. The first is an extension of conventional tomography, which puts together many ptychographical images recorded at different object rotations: we call this ptycho-tomography. Alternatively, a single data set is used: the probe is scanned as usual but without rotating the sample. The reconstruction procedure is then via a multislice update, wherein the propagated wavefront through layers of the object is reconstructed for every probe position and every layer in the object. Unlike ptycho-tomography, this multislice method can account for multiple scattering in the object.

6.1 X-Ray Ptycho-Tomography

To date, ptychography has had its biggest scientific impact in volumetric imaging at high resolution: x-ray ptycho-tomography. Its first application [17.20], see Fig. 17.46a,b, was at hard x-ray wavelengths, for which it is ideally suited. Hard x-rays can penetrate thick objects, which is clearly good for tomography. They can also pass through air without creating too much unwanted scattering, unlike soft x-rays, where the object must be in vacuum or close to very thin transparent windows upstream and downstream of the object, which themselves create unwanted scattering. However, at high energies x-rays often pass through an object with very little absorption. It so happens that, at these energies, the deviation of the real part of the refractive index from unity (which induces a phase change in the x-ray beam) is for many materials much larger than the imaginary part (which determines absorption). The image phase of a ptychograph is, therefore, of a much higher contrast than the conventional absorption signal. Even better is the fact that phase accumulates linearly as a photon passes through an object, and the rate of that accumulation is related to the refractive index, which is material dependent. This means that the phase image really is the linear projection of the matter within the specimen. All this, combined with the much-enhanced resolution of ptychography over other x-ray methods, means that there is a huge application niche for the technique in both materials science and biological science.
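
For a refractive index written in the usual form \(n=1-\delta+\mathrm{i}\beta\), the accumulated phase shift along the beam is, to a good approximation, \((2\uppi/\lambda)\int\delta(x,y,z)\,\mathrm{d}z\), so each ptychographic phase image is (up to a constant) a line integral of \(\delta\) through the specimen, which is precisely the projection that tomographic back-projection assumes.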

Fig. 17.46a,b
figure 46

The first reported example of a 3-D x-ray ptycho-tomographic reconstruction. The sample is bone. It is the phase signal that allows for the high-contrast ptychographical imaging of biological structures, which otherwise do not absorb strongly at hard x-ray wavelengths. (a) Volume rendering with the bone matrix in translucent colors: L indicates osteocyte lacunae, and C indicates the connecting canaliculi. (b) Isosurface rendering of the lacuna-canalicular network. From [17.20]

The pioneering work at the Swiss Light Source at the Paul Scherrer Institut has refined the technique, so that nowadays it is used as a routine method, which can analyze and image all sorts of materials; the publications dedicated to specific science problems are far too numerous to list here. One example is the rather nice series of tomographs showing the in-situ fracture of a microcomposite in Fig. 17.47a-f. Figure 17.48a-g shows a tomographic reconstruction of a significant volume of an Intel device and was a recent example at the time of writing. The extraordinary size, detail, and resolution of the reconstruction are stunning. Figure 17.49a-d shows a detector device (of the same type used to collect the data). The experimental reconstruction is so good that it looks almost like a computer aided design (CAD) drawing.

Fig. 17.47a-f
figure 47

In-situ ptycho-tomography time series of the destruction under compression of a microcomposite. Reprinted with permission from [17.109], John Wiley & Sons

Fig. 17.48a-g
figure 48

Ptycho-tomography of a volume of an Intel microprocessor. (a) 3-D rendering, with detail shown in (b). (c-g) Various cross-sections through the ptycho-tomograph showing structures within the processor, such as transistor gates and connectors. From [17.98]

Fig. 17.49a-d
figure 49

Ptycho-tomography of a detail of a solid-state hard x-ray detector (d). The same type of detector was used to collect the data. (a) Schematic of the circuitry. (b) Blueprint of the circuitry—compare with (d). (c) Detail of the gate structure. From [17.98]

Ptycho-tomography encounters all the usual problems of tomography, such as registration of successive projections. Thermal instabilities inducing specimen drift during a long scan can mean, for example, that a poorly mounted specimen moves out of the field of view. When very high resolution is required, these issues are best addressed by investing in very high quality stages with laser interferometric feedback. A computational complication is that a thick object will induce phase wraps in the image. This is a dramatic nonlinearity within an otherwise excellent linear signal. Luckily, there are numerous ways of handling phase wraps; it is a very large field in its own right, but care must be taken.
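
Most numerical environments offer at least a basic two-dimensional unwrapping routine. The sketch below uses one widely available implementation (scikit-image); the synthetic wrapped ramp is a stand-in for a real projection, and real data with noise or genuine discontinuities need considerably more care.

    import numpy as np
    from skimage.restoration import unwrap_phase

    # Synthetic stand-in for a projection phase image containing several phase wraps
    y, x = np.meshgrid(np.linspace(0, 1, 256), np.linspace(0, 1, 256), indexing="ij")
    true_phase = 12 * np.pi * (x + 0.5 * y)           # smooth phase spanning several cycles
    wrapped = np.angle(np.exp(1j * true_phase))        # what the reconstruction actually delivers

    unwrapped = unwrap_phase(wrapped)                  # 2-D unwrapping (Herraez et al. algorithm)
    # `unwrapped` recovers true_phase up to an additive multiple of 2*pi for well-sampled,
    # low-noise data; it is this unwrapped projection that enters the tomographic reconstruction.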

6.2 Multislice Reconstruction

In everything we have discussed so far, the object function has been modeled as a two-dimensional transmission function. So, for example, in hard x-ray ptycho-tomography, any one ptychograph is treated as a projection of the electron density through the whole (thick) object onto a two-dimensional surface normal to the incident beam; an assumption that is implicit in the back-projection methods used in the tomographic reconstruction. Similarly, a thin, weakly scattering object in transmission electron microscopy is accurately approximated as a 2-D projection, constituting an integral of the 3-D atomic potential of the object along the direction of the optic axis. Indeed, for a long time, the projection effect in TEM was one of the technique's key weaknesses, in that a detailed understanding of an atomic arrangement, say occurring at the interface of two crystallites , could only be easily gained if the pertinent feature repeated itself along the beam direction. Atomic scale tomography (Chap. 15) is nowadays making significant progress in tackling this problem.

The 2-D approximation breaks down for two reasons. The first arises from the geometry of the rays scattered by features in the object that lie at the same \(x,y\) point in the 2-D plane of the projection but are separated in the \(z\)-direction, parallel to the optic axis. With reference to Fig. 17.50, the 2-D approximation assumes the diffraction pattern at a particular angle arises from the path difference (and, hence, phase difference) between any two points in the object plane (like points B and C). When these are separated along the optic axis (A and B), an extra path difference is introduced, shown as \(\Updelta p\), meaning that the 2-D Fourier transform can no longer be used to calculate the diffraction pattern. The effect can also be thought of in terms of the curvature of the Ewald sphere in reciprocal space [17.110].

Fig. 17.50
figure 50

In forming the Fourier integral, parallel rays from a single surface of the object are summed (e. g., points B and C). When the object is thick, rays from points in the same (\(x,y\)) position (A and B) have an extra path length introduced, indicated by \(\Updelta p\). The geometry is best handled by computing the scattered amplitude where the Ewald sphere cuts the 3-D Fourier transform of the whole object, at least in the first Born approximation

A second effect is multiple scattering (or, in the parlance of electron microscopy, dynamical scattering). The mathematics of ptychography, which has no constraints on the form of the specimen function or the illumination function, can deal with an arbitrarily strong 2-D object (i. e., one with very strong phase and modulus changes within it). Strong phase can be represented by a Taylor series expansion of \(\mathrm{e}^{\mathrm{i}\phi}\), which leads to a diffraction pattern that can be formulated as multiple convolutions, equivalent to multiple scattering [17.111]. However, in practice, strong phase requires a substantially thick object. The geometric and multiple scattering 3-D effects then become intermixed so that the exit wave bears little or no relation to the projection of the object. This is particularly problematic for electrons, which for many materials of interest scatter very strongly.

There are various ways of calculating the effects of thickness-induced phase changes and multiple scattering. One of the most common and flexible approaches used in electron microscopy is the multislice method originally proposed by Cowley and Moodie [17.112]. In this, the 3-D object is represented by a series of 2-D slices lying normal to the optic axis. The layers are assumed to be transmission functions (like those in everything we have discussed so far) that are thin enough to satisfy the two-dimensional approximation. The incident wave forms a product with the first layer in order to calculate an exit wave from that layer. The exit wave is then propagated, via the Fresnel integral, angular spectrum method, or similar, to the second layer, where it forms a new incident wave. The exit wave from the second layer is the product of its transmission function with this new incident wave. The process—product of incident wave times transmission function, propagation, product of new incident wave on the next layer, etc.—is repeated through the whole specimen. The Fresnel propagations account for the geometric breakdown of the two-dimensional approximation, and the serial scattering from each layer accounts for multiple scattering.
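In outline, this forward calculation might be sketched as follows, assuming a paraxial transfer-function propagator between the slices; the function and variable names are our own illustrative choices rather than any published code:

```matlab
% Minimal sketch of a forward multislice calculation (illustrative names).
function [exit_wave, incident] = multislice_forward(probe, slices, dz, lambda, dx)
% probe  : N x N complex wave incident on the first slice
% slices : N x N x M complex transmission functions, one per thin layer
% dz     : slice separation (m); lambda : wavelength (m); dx : pixel pitch (m)
[N, ~, M] = size(slices);
fx = (-N/2:N/2-1) / (N * dx);                        % spatial frequency axis
[FX, FY] = meshgrid(fx, fx);
H = ifftshift(exp(-1i * pi * lambda * dz * (FX.^2 + FY.^2)));  % slice-to-slice kernel
incident = zeros(N, N, M);                           % incident wave at every layer
wave = probe;
for m = 1:M
    incident(:, :, m) = wave;                        % stored for the inverse calculation
    wave = wave .* slices(:, :, m);                  % exit wave of layer m
    if m < M
        wave = ifft2(fft2(wave) .* H);               % propagate to the next layer
    end
end
exit_wave = wave;                                    % exit wave of the whole object
end
```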

However, this technique is only appropriate for a forward calculation; given a model object, we can use it to calculate what the exit wave will look like. In electron microscopy, much work has been spent altering specimen models in order to find a good match with the measured bright-field high-resolution image, which itself is an interference pattern altered by the additional effects of the transfer function of the lens. Even if the exit wave can be measured in modulus and phase, say via a through-focal series, no one image can be inverted directly to give the 3-D object; a 2-D image does not have enough measurements in it to solve for all the many layers of the 3-D object.

The same does not apply to ptychography, where it is now well-established that the data collected in a single ptychographic scan can, surprisingly, solve for many 2-D layers within the object, at the same time removing multiple scattering effects and calculating the evolution of the incident radiation as it propagates through the object [17.113, 17.114, 17.115, 17.60]. Once again, this is possible because of the enormous diversity in ptychographic data.

We note that the probe is localized and so is necessarily composed of a sum of incident plane waves, which have a significant range of incident angles (\(k\)-vectors). A simple ray diagram, illustrated in Fig. 17.51, suggests that as a defocused STEM/STXM probe is scanned laterally, features in the object at different depths will appear to move over the shadow image at different rates. In reality, for finite wavelength, interference effects dominate the diffraction plane, and in real space the probe can have a very complicated wave structure. However, this model illustrates that 3-D information affects the recorded data, and so, in principle, can be extracted from it. Ptychographical translation diversity also means that we get a different exit wave for each probe position, unlike the single exit wave in conventional imaging. If the step size (sampling) in real space is small, there exist hundreds or thousands of exit waves to process; there is plenty of data to provide multiple slices in the object.

Fig. 17.51
figure 51

Simple ray diagram illustrating why ptychography (probe movement) encodes 3-D information. For a convergent probe (or any localized probe) features in the object cross the shadow image at different rates (for a constant probe shift speed), according to their depth in the object. In reality, wave interference effects greatly complicate the diffracted information, but the latter is still encoded with similar information

The first algorithm to demonstrate multiple-layer reconstruction computationally reversed the forward multislice calculation [17.60], as shown in Fig. 17.52. To start, the forward calculation is carried out, but each incident and exit wave from each layer is stored for later use. There is a running estimate of each layer of the object and also of the probe incident upon the first layer. After undertaking the forward calculation to give an estimate of the diffraction pattern, the detector modulus constraint is applied as usual. Backpropagation gives us a new estimate of the exit wave from the last layer. The last layer of the object is then updated as usual for the two-dimensional case (17.6) or (17.9), except the role of the probe is replaced by the incident wave at the last layer calculated from the forward calculation. This incident wavefunction is then also updated as if it were the probe, and backpropagated to the second from last layer, where the procedure is repeated using the stored incident and exit waves at that layer from the forward calculation, and so on and so forth. Finally, the actual probe function incident on the first layer is updated and used for the incident wave at the next probe position to be processed.

Fig. 17.52
figure 52

The inverse multislice method. At each layer of the specimen, the incident wave from the previous layer is treated in the same way as the probe in a 2-D ptychographical reconstruction. The forward calculation (green pointers) proceeds as usual. The inverse calculation uses the normal update of object layer and incident wave, at each layer. The updated incident wave is backpropagated to be used in the update for the previous layer, etc.
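In outline, one such inverse pass for a single probe position might be sketched as follows, assuming a far-field detector, a paraxial slice-to-slice propagator, and simple ePIE-style updates at each layer; all names and update weights are illustrative assumptions rather than the published implementation:

```matlab
% Minimal sketch of one inverse multislice update for one probe position.
function [slices, probe] = inverse_multislice_step(probe, slices, dz, lambda, dx, sqrtI)
% sqrtI : N x N square root of the measured diffraction intensity
[N, ~, M] = size(slices);
fx = (-N/2:N/2-1) / (N * dx);
[FX, FY] = meshgrid(fx, fx);
H = ifftshift(exp(-1i * pi * lambda * dz * (FX.^2 + FY.^2)));   % slice-to-slice kernel
% Forward pass, storing the wave incident on every layer
incident = zeros(N, N, M);
wave = probe;
for m = 1:M
    incident(:, :, m) = wave;
    wave = wave .* slices(:, :, m);
    if m < M, wave = ifft2(fft2(wave) .* H); end
end
% Detector modulus constraint applied to the final exit wave (far-field detector)
D = fftshift(fft2(ifftshift(wave)));
D = sqrtI .* D ./ (abs(D) + eps);
wave = fftshift(ifft2(ifftshift(D)));                % corrected exit wave, last layer
% Backward pass: update each layer and the wave incident upon it
for m = M:-1:1
    inc  = incident(:, :, m);
    obj  = slices(:, :, m);
    dpsi = wave - inc .* obj;                        % change demanded of this exit wave
    obj  = obj + conj(inc) / max(abs(inc(:)))^2 .* dpsi;   % object-layer update, cf. (17.9)
    inc  = inc + conj(obj) / max(abs(obj(:)))^2 .* dpsi;   % incident-wave (probe-like) update
    slices(:, :, m) = obj;
    if m > 1
        wave = ifft2(fft2(inc) .* conj(H));          % backpropagate to the previous layer
    end
end
probe = inc;                                         % updated probe at the first layer
end
```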

Figure 17.53 shows a visible light example of a 3-D reconstruction through slices of a root. It compares favorably with the confocal microscopy image of the same object. The data reconstructed 34 layers of the object, each separated by \({\mathrm{2}}\,{\mathrm{\upmu{}m}}\). Only five of the reconstructed slices are shown. In generating such an image, the algorithm had to calculate two images (modulus and phase) for each layer, two images for the probe, and two images for the exit waves from each layer; i. e., 138 two-dimensional images from one ptychography experiment. (We note that the incident waves are uniquely defined by propagation from the previous exit wave, so they do not constitute independent variables.) Ptychography is, indeed, a very information-intensive technique. On the other hand, we know that two lenses in the confocal configuration can obtain all this information; ptychography just happens to do it in a computer. Similar results have been obtained in x-ray ptychography  [17.116].

Fig. 17.53
figure 53

(ad) Selected slices from a ptychographical multislice reconstruction of an embryonic root tip. (eh) Comparison with conventional confocal images of the same slices. Ptychography does not require the specimen to be labeled or stained. From [17.115]

Fourier ptychography (Sect. 17.5.2), which of course contains identical 3-D information, is usually thought of as solving for the diffraction pattern lying in the back focal plane. The multiple layers cannot be solved for there because, as the illumination is tilted, the Ewald sphere rolls through reciprocal space and the diffraction pattern changes. However, the lens and aperture transfer function can be regarded as a propagator between the exit surface of the object and the detector plane (the image). Diversity arises from the different incident wave angles, so that inverse propagation gives an equivalent result.

In what we have described, both the forward and back propagation depend on knowing, or estimating, the separation of the layers and the refractive index of the free space between them. If either of these is wrong, the propagation integrals give the wrong wavefunctions, and so the reconstruction algorithm does not converge. However, these can just be put into the algorithm as another set of free variables, as shown by [17.117] and illustrated in Fig. 17.54a-f in the case of multislice x-ray imaging.

Fig. 17.54a-f
figure 54

X-ray 3-D multislice reconstruction. All are phase images. (a,c) Reconstruction of two layers, their separation is assumed known and fixed. (e) Plot of the separation as a function of iteration (fixed). (b,d,f) These are similar, except here the separation is also recovered as a free variable, greatly improving the reconstruction. Reprinted with permission from [17.117], The Optical Society

This particular multislice formulation also does not account for backward-propagating waves that have been reflected off the layers; forward-only scattering is a good approximation for the behavior of high-energy electrons and x-rays but not for visible light. Ever more comprehensive search algorithms within larger solution spaces may accommodate these issues.

The depth resolution of the technique clearly depends on the angles subtended at the specimen by the illumination pupil and the angular size of the detector, but it is also affected by the strength of the scattering from one layer to the next. A strongly scattering layer increases the range of incident angles upon the next layer, and hence the potential lateral and depth resolution. A weakness of the approach is that because ptychography relies on coherent wave interference, the 3-D transfer function in reciprocal space is doughnut shaped; at high or low resolution, the depth resolution is very small [17.114]. This performance compares poorly with the transfer characteristics of confocal microscopy, where the contrast mechanism arises from incoherent fluorescence. Intensity in real space means that the transfer function in reciprocal space is the autocorrelation of the coherent transfer function, which has the effect of filling in the missing low frequencies, enhancing both lateral and depth resolution. There has been work on incoherent optical Fourier ptychography using structured illumination [17.92], which could be a truly revolutionary development.

A potentially important application of multilayer reconstruction using visible light is to image large biological cells, or clusters of cells, without having to kill or stain them; in ptychography strong contrast arises from the real part of the refractive index, which is expressed in the phase of the transmission function. This could be useful for, say, checking the viability of human embryos before implantation. X-ray imaging is less dependent on the breakdown of the projection approximation because the scattering angles involved are very small, and so the depth of field is generally much larger than the thickness of the object. Reversing and removing multiple scattering effects in electron microscopy via ptychography could represent a major breakthrough, overcoming one of the biggest limitations of imaging with electrons, although whether this will be possible remains to be seen. We note that the WDD method can also extract depth information, but this has only been demonstrated for weakly scattering objects [17.9]; see Sect. 17.10.6, 3-D Imaging.

7 Spectroscopic Imaging

One of the most common and useful ways of mapping elemental distributions in specimens is to collect the fluorescent x-ray spectrum from the object while it is being irradiated by a scanned focused probe of high-energy electrons or x-rays. As long as the incoming beam has sufficient energy, it can eject inner electrons from the specimen atoms. Electrons that then fall into the resulting empty core state can emit x-rays that have characteristic energies specific to the particular element. This fluorescent signal is incoherent, and so it cannot be used in conventional ptychography, although see [17.92].

However, we can plot the distribution of a given element using coherent ptychography if we take two images, one above and one below the absorption energy of the core state. Figure 17.55a,b shows an example of a fibroblast cell containing cobalt ferrite nanoparticles [17.73]. These are not visible in the image taken at \({\mathrm{703}}\,{\mathrm{eV}}\), below the absorption edge, which is at \({\mathrm{710}}\,{\mathrm{eV}}\), but are visible in the image taken above the absorption edge, at \({\mathrm{712}}\,{\mathrm{eV}}\). Interestingly, the phase of the absorption can also be measured. Figure 17.56 shows an example of a very high-resolution map of two separate iron compounds within a particle, scanned as a function of energy [17.80]. Each point in the image has a different spectral response. Because the shape of the absorption lines depends on the local bonding environment of the iron, the authors were able to map the relevant compounds using principal component analysis. The authors compare this with the resolution of the same type of analysis undertaken with a focused-probe STXM using a \({\mathrm{25}}\,{\mathrm{nm}}\) optic. The resolution of the ptychographic chemical map is estimated to be \({\mathrm{18}}\,{\mathrm{nm}}\), compared with \({\mathrm{70}}\,{\mathrm{nm}}\) for the STXM data.

Fig. 17.55a,b
figure 55

Soft x-ray phase ptychographs of a Balb/3T3 mouse fibroblast, marked by \(\mathrm{CoFe_{2}O_{4}}\) particles, taken (a) below (at \({\mathrm{702.8}}\,{\mathrm{eV}}\)) and (b) above (at \({\mathrm{711.8}}\,{\mathrm{eV}}\)) the absorption edge of Fe. The distribution of iron is clearly visible in the latter. From [17.73]

Fig. 17.56
figure 56

High-resolution x-ray ptychographical chemical mapping of \(\mathrm{FePO_{4}}\) and \(\mathrm{LiFePO_{4}}\) in a small particle. By taking images at different energies, the loss peaks (a) can be used in a principal component analysis to map the two compounds (b). From [17.80]

8 Mixed-State Decomposition and Handling Partial Coherence

We saw in Sect. 17.4 that a typical ptychographical data set is extremely rich in diverse information. This can be used to correct many imaging parameters automatically. In Sect. 17.6.2, it was found that we could extract even more information. Provided the sampling in both real and reciprocal space is dense, so that the minimum sampling condition defined by (17.7) is well surpassed, we have seen that we can solve for dozens of 2-D layers through the object thickness.

Thibault and Menzel [17.50] proposed one of the most important extensions for the use of information diversity in ptychography. An assumption of the phase problem is that when we measure the intensity of a pixel, it has associated with it one modulus and one lost phase. The pixel has to be small enough so the wave does not vary substantially across its width, i. e., the sampling condition is fulfilled. Yet what happens if two separate noninterfering waves (i. e., ones that are incoherent with respect to one another) are incident on the detector? We only measure one intensity, but now we have lost the two phases, and, even worse, the two moduli as well. We seem to have four unknowns where before we had only one unknown. In fact, in this case we have only three unknowns, because we know the intensities of the two moduli must add up to the measured intensity, a piece of information that will be key.

There are many situations where this occurs in practice. X-ray and electron sources are mostly incoherent across their physical width in the plane of their emission. However, a long way from a small incoherent source, the wave becomes substantially spatially coherent. A star is a huge incoherent source, but seen from Earth it twinkles coherently, a result of the van Cittert–Zernike theorem. Good coherence requires the source to be a very long way from the experiment, but then flux per unit area is low, so we must balance our desire for as much spatial coherence as possible with the competing need for as much flux as possible. Inevitably, there will always be a small degree of partial coherence in our wave experiments.

It is not only a diffracted wave that can be a source of incoherence. Vibrations in the specimen or any part of the instrumentation can be equally harmful. These are more generally called state mixtures. Our detector is sampling many different configurations of the experiment during the time it takes to make an exposure. This is equivalent to adding together (incoherently) the coherent waves that would have been scattered from all the different states in the system during the measurement time.

Coherence theory is a large subject area in its own right. One can consider any two points in a wavefield. Each oscillates in time. If they oscillate in perfect synchrony (though usually with different phase), then they are coherent. If there is no correlation between their disturbances, they are wholly incoherent with respect to one another. The general situation lies between these extremes; there is some statistical correlation, but it is not perfect. The coherence function describes the degree of correlation between pairs of points in the wavefield, but this can be an awkward way of analyzing the effects of partial coherence. Wolf [17.118] suggested a different approach, widely adopted in practical situations. The wavefield is decomposed into a set of modes, each of which is entirely incoherent with respect to any other mode. The modes do not interfere with one another, but can be treated separately, each propagating through the optical system independently. State mixtures in the object and the detector, or any part of the optical system, can also be treated as modes.

An example would be modeling the effects of partial coherence caused by having a finite source. The source can be divided up into points, each of which is perfectly coherent. Each source wave (mode) is propagated through the whole optical system to the detector where its intensity is added to the intensity of the other waves that arrive at the detector from all the other point sources. This process might blur the intensity at the detector because the extended source has induced significant incoherence into the experiment. However, if we choose our points on the source to be very close to one another, and the whole source is small, the intensity at the detector from the different points might be, for all intents and purposes, identical. This means that all these different modes are so similar to one another they may as well be treated as one mode. In general, we can decompose a partially coherent wavefront into as many modes as we like, but this is not an optimal representation of its coherence properties. The modes we will talk about here have been orthogonalized with respect to one another. This can be thought of as a sort of principal component analysis, minimizing the number of modes we need to describe the system completely.

In quantum mechanics, the density matrix is used to handle mixed states. In any particular representation, the usual single-state operators (for energy, position, momentum, etc.) can operate on it. To find the expectation of a particular measurement, the trace of the resulting matrix is formed, which is simply a way of calculating the total probability (expectation value) of making a measurement when two or more states that are incoherent to one another are present in the same experiment. Diagonalizing the density matrix is equivalent to finding the set of incoherent states that are orthogonal to one another. A pure state then has only one entry of unity in the density matrix. This type of analysis is now very common in the field of quantum computing, where the decoherence of a wavefunction limits the capability of a real-world quantum computer. Thibault and Menzel wrote their paper casting the ptychographic incoherence problem in these terms. In fact, actually undertaking a multimodal decomposition in ptychography is computationally very easy, and the process is quite intuitive, as we hope to show below, so a reader not familiar with quantum mechanics need not worry about understanding the process from this perspective.

8.1 Visible Light Model Example

We start with a simple experiment using visible light. With reference to Fig. 17.57, we undertake a ptychography experiment where three completely different wavelengths of light (green, blue, and red) illuminate the object simultaneously. These three wavelengths are incontrovertibly incoherent with respect to one another. We have one specimen object, but the different colors of light will be absorbed differently in different regions of the specimen if it has any color differences within it. To ensure this is the case, the specimen is composed of an artificially manufactured projector slide that has been specially prepared; it consists of three superposed images, each of a different color (Fig. 17.58). (Ideally, the pigments used for the three colors would each absorb one, and only one, of the three incident light wavelengths, but this has not been achieved perfectly in this experiment.)

Fig. 17.57
figure 57

Example of ptychographic multiplexing. Three distinct wavelengths of light are incident simultaneously. The detector is only sensitive to the total summed intensity. BS labels beam splitters; GL, BL, and RL label the green, blue, and red lasers, respectively. Reprinted from [17.54], with permission from Elsevier

Fig. 17.58
figure 58

The test specimen used in Fig. 17.57, an old-fashioned projector slide consisting of three superposed images, each of a different color. The dyes do not absorb perfectly at the laser frequencies in Fig. 17.57, so there is cross-talk in the reconstructions in Fig. 17.60. Reprinted from [17.54], with permission from Elsevier

So, we have three ptychographical experiments going on simultaneously. Each color of light sees a different sample. The different colors of light also interact with the illumination-forming optics in different ways (diffracting by different amounts before they reach the sample), so that we also have a different probe function for each color of light. However, we can solve for all of these functions, three objects and three probes, using the diversity in the ptychographic data, despite the fact that the intensity from each experiment is collected on the same detector all at the same time. (The detector is color insensitive; it simply measures the total power of light incident upon it.)

At first, solving for all six functions from this scrambled-up data set sounds impossible. Surprisingly, we just have to make one minor change to any one of the common iterative reconstruction algorithms. First, we set up and run three reconstruction iterations simultaneously, each one solving for its respective object and probe functions. The only difference comes when we apply the detector intensity constraint. We do not know the intensity (and hence modulus) of any one of the color signals at a particular detector pixel, but we do know the total intensity that they all add up to.

In Fig. 17.59, the height of the two columns represents intensity. The first column is the estimated intensity that has come out of our forward calculations (at B from A in Fig. 17.5). The three simultaneous forward calculations have given us three estimated moduli, which have been squared and added together. Each forward calculation also gave us an estimated phase. The height of the column on the right-hand side is the measured total intensity. To apply the constraint, we maintain the ratio of intensities of each color in the estimated intensity, but scale them uniformly to fit the measured data. We now have three new moduli estimates, each the square root of their scaled intensity estimates, plus the three phases that came out of the separate color iteration loops. These are fed back into their respective iterations at C in Fig. 17.5. Amazingly, after running the iterations as usual, the three reconstructions appear from their respective iteration loops. It helps if the starting estimates of the probe or objects are slightly different so that they can diverge into the separate solutions, but we do not need to know whether those estimates have anything to do with the real functions; it is just an effective way to seed the three separate reconstructions. The form of the constraint being applied—that the sum of the calculated intensities must equal the measured intensity—just has to be true when the solution is correct. Diversity in the data (assuming there is enough) drives the algorithm to that solution.

Fig. 17.59
figure 59

Graphical illustration of the detector intensity constraint when more than one mode is present in a ptychography experiment
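In outline, this scaling step might be coded as follows for \(K\) incoherent modes (the three colors above being one example); the function and variable names are illustrative assumptions:

```matlab
% Minimal sketch of the mixed-state detector constraint of Fig. 17.59.
function modes = mixed_state_constraint(modes, I_meas)
% modes  : N x N x K complex far-field estimates, one per incoherent mode
% I_meas : N x N measured total intensity
I_est = sum(abs(modes).^2, 3);                % summed estimated intensity
scale = sqrt(I_meas ./ (I_est + eps));        % common per-pixel scaling factor
modes = modes .* scale;                       % moduli rescaled, each mode keeps its phase
end
```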

Figure 17.60 shows the three object reconstructions. Note that there is some cross-talk between the images, but that is because the dyes in the object slide do not absorb at one wavelength exclusively, so some of the structure from one wavelength is expressed slightly in one of the other images. Even so, they are convincingly separated. Interestingly, each image and each probe reconstruction comes out a different size in its respective object array. This is because of the wavelength dependence in (17.1). The detector pixels are the same physical size for all the reconstructions, so the wavelength changes the magnification in the reconstruction array. This has been adjusted for in Fig. 17.60.

Fig. 17.60
figure 60

Reconstructions relating to the object in Fig. 17.58. Reprinted from [17.54], with permission from Elsevier

8.2 X-Ray Illumination Modes

Most x-ray synchrotron beamlines have some partial coherence within them, no matter how carefully the optics is arranged. Even if the beam at the final slits lying upstream of the experimental setup is estimated to be entirely coherent according to the van Cittert–Zernike theorem, vibration in any intermediate optical element, for example the monochromator, can substantially reduce the effective coherence.

Unlike the light example given in the previous section, under most normal circumstances the object function is fixed; however, the partial coherence is equivalent to having multiple modes in the illumination. In Fig. 17.61 we show a multimode decomposition of an x-ray probe in the defocused condition (Sect. 17.5.5). The reconstruction proceeds in exactly the same way as before. In this case, eight parallel iterations have been undertaken. There are an infinite number of ways that these modes can express themselves, each being a different representation made up of some linear combination of the intensities of wavefunctions that might, or, more likely, might not be orthogonal to one another. The underlying wavefunctions can be orthogonalized using the standard Gram–Schmidt approach, but this is not fundamental insofar as we can choose any arbitrary vector with which to start the Gram–Schmidt process. Diagonalization of the density matrix, which can be computationally undertaken with principal component decomposition, does give a unique and maximally compact representation of the modes.

Fig. 17.61
figure 61

Orthogonal modal decomposition of partially coherent hard x-ray illumination. From [17.79]. ©IOP Publishing. Reproduced with permission. All rights reserved

As a consequence of the circular path of the high-energy electrons, the source in a synchrotron appears wider in the horizontal plane than in the vertical plane. As expected, we, therefore, see more lateral modes than vertical modes. Lateral incoherence appears as vertical fringes in the modal structure, because of the Fourier relationship between coherence and source width. In fact, the defocused probe is not exactly in a Fourier relationship to the source, but the effect is the same. These results were obtained from a beamline that we had every reason to believe was fully coherent; the number of modes, therefore, came as quite a shock. It turned out that unbeknownst to anyone, the monochromator had a vibrational instability. The moral is: always perform a modal decomposition on all data that have any possibility of including partial coherence.

How many modes should you include in an illumination modal decomposition? In the computer, you can declare as many modes as you like, but once they are orthogonalized only a few should have any significant power. Of course, you cannot solve for more modes than you have numbers in your data, so at some point the higher-order mode structures will disintegrate. Note that when you add up the intensities of all the modes, the total must be the same as the total intensity of the probe. If the underlying complex modes are normalized, then the diagonal terms of the density matrix represent the probabilities, or weights, of how the state mixture has been prepared. The sum (trace) of these is always unity because they are probabilities. The sum of the squares of the probabilities is a tidy measure of coherence: unity corresponds to total coherence (only one coherent state in the system); anything less is a measure of the degree of partial coherence.
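As a small illustrative sketch (the names here are our own assumptions), the mode probabilities and this coherence measure could be computed as:

```matlab
% Minimal sketch of the coherence (purity) measure described above.
function [p, purity] = mode_purity(modes)
% modes : N x N x K orthogonalized complex modes
power  = squeeze(sum(sum(abs(modes).^2, 1), 2));   % power carried by each mode
p      = power / sum(power);                       % occupation probabilities; they sum to 1
purity = sum(p.^2);                                % 1 = fully coherent; less = partially coherent
end
```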

Finally, note that the patterns in Fig. 17.61 have no actual physical meaning; they are simply the lowest-rank representation of the coherence of the experiment. However, to conserve computing power, one would not want to run more parallel probe estimates than are necessary in the reconstruction process, so in this sense, the orthogonal representation has practical value.

8.3 Electron Modes

Matter waves can also be decomposed into modes in exactly the same way, as shown in Fig. 17.62. These data were collected on a TEM operating in the SAP mode (Sect. 17.5.3). The source profile, as seen looking up the column from the detector plane, can be calculated by backpropagating each complex mode, which here lies in the image plane confined by a selected area aperture, to the source plane via an inverse Fourier transform. The intensities of the modes are then added together to produce an estimate of the source, also shown in Fig. 17.62. Adjusting the condenser alters the source size, an effect that can be seen in both the mode reconstructions and source shape reconstructions. Even though the column was aligned, we note that the cold field-emission source is not perfectly round. It should be emphasized that these source intensity plots do not show the physical shape of the source, but rather its shape as seen through the selected area aperture. The image is diffraction limited because the total wavefunction relating to the source is truncated outside the aperture diameter.

Fig. 17.62
figure 62

Modal decomposition of a propagating partially coherent electron wave for two different spot sizes (apparent source size). The diffraction limited source, as seen backwards through the microscope, is shown on the right. Reprinted with permission from [17.51]. Copyright 2016 by the American Physical Society

8.4 Mixed-Object State

Figure 17.63a-d is taken from the original Thibault and Menzel paper [17.50] and demonstrates that this mixed-state concept also applies to the object function. In this model calculation, each gray square represents a spin that can interact ferromagnetically or antiferromagnetically with its immediate neighbors. The system is in a temperature bath with enough thermal energy to overcome the average bond energy, so that the spins flip up and down randomly. The modeled ptychography experiment integrates data from all the oscillating spins over a longer time interval than it takes for them to flip. A phase change is expressed in the transmitted wave according to whether the individual spin is up or down.

Fig. 17.63a-d
figure 63

A model calculation showing how an object that has mixed states can reveal correlations in those states in a ptychographic reconstruction, despite them oscillating at a much higher frequency than the exposure time of each diffraction pattern. Energy couplings for the red and blue bonds in (a) are in the opposite sense. The recovered images have probability distributions for the red and blue boxes in (b), as expected (c,d). From [17.50]

After the mixed-state decomposition, the principal modes can be extracted from the data showing that, at least on the scale of the probe diameter, the relative probabilities of the adjacent spins match what would be expected at this temperature. This type of analysis is not dependent on the speed of the state changes relative to the integration time of the experiment, which means that, in theory, it could be applied to very high-frequency phenomena, such as, in the case of electron ptychography, coupled bonding effects in an array of atoms. This may become a truly powerful experimental technique.

8.5 Upsampling

One odd implication of (17.7) is that the sampling condition in ptychography is not dependent on the probe size. What does that mean if we have a large probe but only a few pixels in the diffraction plane? In Sect. 17.10.4, we will discuss direct ways of using very large pixels (sector detectors ) to solve for the object, but these techniques rely on a highly focused (very small) probe made by a perfect lens. The specimen must also be very weakly scattering. If the probe is large, large detector pixels cannot sample the rapid intensity variations that arise in the diffraction plane.

In order to exploit very dense sampling in real space, even though the detector pixels are larger than the features caused by a large probe, we need to up-sample the big pixels [17.119]. During the reconstruction, this involves declaring an array size in the detector plane that would, indeed, satisfy the sampling condition given the size of the probe. Suppose we now have \(3\times 3\) pixels fitting into each big detector-sized pixel. We treat each computational pixel as a separate mode, running nine concurrent reconstructions. The detector constraint is applied as before: after each forward calculation the modulus is changed according to the scaling of intensity illustrated in Fig. 17.59. In this way, we reconstruct an artificial detector sampling that does fulfil the probe-size constraint. Doing this for data that are believed to be properly sampled can also be beneficial if there is a type of partial coherence in the beam that expresses itself as a convolution of the diffraction pattern. The method can also remove the effect of the MTF of the detector.
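In terms of the detector constraint alone, the bookkeeping can be sketched as follows: the estimated intensity on the fine grid is binned back into detector-sized pixels, and every sub-pixel within a given big pixel is rescaled by the same factor so that the binned intensity matches the measurement. This is a simplified sketch of that single step, with illustrative names:

```matlab
% Minimal sketch of the up-sampled detector constraint (illustrative names).
function PSI = apply_upsampled_constraint(PSI, I_big, s)
% PSI   : (s*N) x (s*N) complex diffraction estimate on the fine (up-sampled) grid
% I_big : N x N measured intensity on the physical (large) detector pixels
N = size(I_big, 1);
I_fine = abs(PSI).^2;
I_est  = squeeze(sum(sum(reshape(I_fine, s, N, s, N), 1), 3));  % bin to big pixels
scale  = sqrt(I_big ./ (I_est + eps));               % one scaling factor per big pixel
PSI    = PSI .* kron(scale, ones(s));                % rescale all of its sub-pixels
end
```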

Fig. 17.64
figure 64

Example of up-sampling. The diffraction patterns in the top row have had the lower left quadrant expanded, so as to show the process more clearly. Lower images correspond to reconstructions from the upper diffraction patterns. From left to right we have: the original data and their reconstruction; the original data up-sampled by \(2\times 2\); the original data binned by \(3\times 3\); the binned data to the left upsampled by \(12\times 12\). The best reconstruction comes from upsampling the raw data, possibly because there is some incoherence in the data. Reprinted with permission from [17.119]. Copyright 2014 by the American Physical Society

Although up-sampling works, it should be avoided if possible, because there is bound to be a degradation in the results. The reconstruction of all the upsampled pixels relies on the tiny changes of intensity in the big detector pixels that occur as the large probe is scanned in small steps across the object; data that are easily lost in noise or the finite bit depth of the detector. Figure 17.64 shows an example of up-sampling x-ray ptychography data.

8.6 Other Uses: Detector Noise, Diffuse Scattering, Continuous Scans

Modal decomposition can be used for other things. A free mode (one that is not constrained by orthogonalization) can be used to dump any intensity from the detector that is inconsistent over the whole data set. For example, if the detector has a pedestal—a constant offset or background noise—the inversion will try to put a delta function (the Fourier transform of a constant function) somewhere into the field of view of the object reconstruction. An incoherent mode can accommodate this. All the intensity that is inconsistent in any way with the forward calculation will be dumped into it. Scattering by air in a hard x-ray ptychography experiment, or inelastic scattering in an electron experiment, will also be expressed in the mode, but in this case, unevenly distributed over the detector. Dealing with this class of problem in a more controlled way, say by calibration or self-calibration of the detector, is probably a better approach.

A continuous line scan, in which data collection is speeded up by constantly taking exposures as the object is moved continuously across the probe, or vice versa, can also be handled by modal decomposition. Each exposure occurs over a blurred track of probe positions—i. e., a combination of several probe positions—each of which can be treated as an illumination mode [17.53].

9 Theory of Iterative Methods for the Ptychographic Inverse Problem

In Sect. 17.3, we surveyed the wide and growing range of ptychographic algorithms reported in the literature, and in Sect. 17.4.8, we examined the remarkable scope for exploiting the redundancy in ptychographic data through clever expansions of the original ptychographic phase problem. In this section, we will look in greater detail at how the most popular iterative ptychography algorithms work, and how they can be implemented on a computer.

9.1 The PIE Family of Algorithms

Section 17.3.4 introduced the PIE algorithm and explained its operation. It turns out that the reasoning behind that original formulation can be readily expanded to arrive at a whole class of algorithms that work in a similar manner. Returning to (17.6), reprinted below as (17.8), we saw that the core of the PIE algorithm was the object update function

$$\begin{aligned}\displaystyle q_{\text{NEW}}&\displaystyle=q+\frac{|a|}{|a_{\max}|}\frac{a^{*}}{(|a|^{2}+\varepsilon)}(\psi_{\text{NEW}}-\psi_{\mathrm{e}})\\ \displaystyle&\displaystyle=q+w\Updelta q\;.\end{aligned}$$
(17.8)

Here, a new object estimate, \(q_{\text{NEW}}\), is generated from the previous object estimate, \(q\), by adding a specially weighted proportion of the difference between the new and old exit waves, \(\psi_{\text{NEW}}\) and \(\psi_{\mathrm{e}}\), divided by the probe (with a fudge factor, \(\varepsilon\), to avoid zero divisions). To make our discussion here clearer, we have rewritten this update in terms of a \(\Updelta q\)—the exit wave difference divided by the probe—and a weight function, \(w\). It turns out that this weighting in the update function is only one of a whole host of possibilities that can be employed to reconstruct ptychographic data [17.43].

For PIE, the weighting corresponds to the normalized probe modulus,

$$w=\frac{|a|}{|a_{\max}|}\;.$$

This works well in practice and is often used by the Fourier ptychography community, where it has been re-derived as a second-order gradient descent [17.29]. The ePIE algorithm makes a very basic change, replacing the normalized probe modulus with the normalized probe intensity, \(w=|a|^{2}/|a_{\max}|^{2}\), which has the benefit of removing the need for the zero-division fudge factor, since the \(|a|^{2}\) term in \(w\) cancels the probe division in \(\Updelta q\). The result is an alternative update function

$$q_{\text{NEW}}=q+\frac{a^{*}}{|a_{\max}|^{2}}(\psi_{\text{NEW}}-\psi_{\mathrm{e}})=q+w\Updelta q\;.$$
(17.9)

We can plot the two weightings as a function of the probe modulus as shown in Fig. 17.65. As is rather obvious, ePIE's plot is a quadratic, meaning that where the probe is intense, there will be a large weight given to the \(\Updelta q\) term in (17.9), whilst where the probe is dim, the weighting of \(\Updelta q\) will be small, and the object will change little in these regions. Equally obvious is the linear weighting for the PIE plot—again, where the probe is intense, \(\Updelta q\) is strongly weighted; where it is dim, the weighting of \(\Updelta q\) is smaller. What is not obvious from these plots is whether either of these weighting functions is in any sense optimum, and this is an open question at the time of writing. What we can say is that there are further alternatives that also reconstruct ptychographic data very well, and which offer greater scope for tuning the reconstruction to accommodate a specific experiment—for example, using different tuning parameters when the object is very weak or where the initial probe estimate is very poor. One weighting that we have demonstrated to work extremely well takes the form

$$w=\frac{|a|^{2}}{\alpha|a_{\max}|^{2}+(1-\alpha)|a|^{2}}\;,$$

where \(\alpha\) is a tuning parameter. The plots in Fig. 17.65 give a couple of examples of how this function behaves for different \(\alpha\) values—notice how the curve can be adjusted to give more or less weighting to dim parts of the probe, so the experimenter can adjust the algorithm to a lower weighting if data is very noisy, or to a higher weighting if the data are very clean. We have found in practice that an \(\alpha\) value around \(\mathrm{0.2}\) gives a considerable improvement in convergence rate over both PIE and ePIE. The object update function for this new form of weighting is

$$q_{\mathrm{NEW}}=q+\frac{a^{\ast}}{(1-\alpha)\lvert a\rvert^{2}+\alpha\lvert a\rvert^{2}_{\max}}(\psi_{\mathrm{NEW}}-\psi_{\mathrm{e}})\;.$$
Fig. 17.65
figure 65

The way PIE-type ptychographic algorithms update the object estimate depends on the probe amplitude and on \(\alpha\): for a given probe position, they update the object strongly where the probe is bright and only weakly where it is dim. The exact relationship between probe modulus and update strength (\(w\)) is shown for three different algorithms. The new rPIE update can be tuned to occupy different parts of the graph by varying its tuning parameter \(\alpha\)

We call this rPIE, because it can be expressed as a regularized version of the ePIE update.
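The three weightings are easy to compare numerically; the following small sketch reproduces the kind of curves plotted in Fig. 17.65, with \(\alpha={\mathrm{0.2}}\) taken as an illustrative value:

```matlab
% Sketch comparing the PIE, ePIE, and rPIE update weights as a function of
% normalized probe modulus (cf. Fig. 17.65); alpha = 0.2 is illustrative.
a     = linspace(0, 1, 200);          % probe modulus, normalized so that |a_max| = 1
alpha = 0.2;
w_pie  = a;                                      % PIE  : |a| / |a_max|
w_epie = a.^2;                                   % ePIE : |a|^2 / |a_max|^2
w_rpie = a.^2 ./ (alpha + (1 - alpha) * a.^2);   % rPIE weighting
plot(a, w_pie, a, w_epie, a, w_rpie)
xlabel('normalized probe modulus |a|/|a_{max}|'), ylabel('update weight w')
legend('PIE', 'ePIE', 'rPIE (\alpha = 0.2)', 'Location', 'northwest')
```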

9.1.1 The Probe Update

Although ePIE used a slightly different update function to PIE, the main advance it offered when it was first suggested was that it solves for the probe as well as the object. The implementation of this probe update is straightforward—simply interchange the appearance of \(a\) and \(q\) in any of the object update functions above to produce a probe update function, then apply this function after the object has been updated; for rPIE it helps to have a separate tuning parameter, \(\beta\), replacing \(\alpha\) as well.

9.2 Projection Between Sets Methods

Another way to think about the cCDI problem is illustrated in Fig. 17.66. The plane of the figure represents all the possible solution images that can exist. In reality, the dimensionality of the space is enormous, with each axis corresponding to the complex values of each image pixel. There are two sets of images lying within the colored shapes. One set is consistent with the diffraction pattern intensity; the other is consistent with real-space priors that we may know about—for our discussion we can use the aperture constraint (where the object is known to be zero outside of a known support). The loop in Fig. 17.5 alternately projects a current estimate of the solution onto the nearest point of the aperture constraint and then the detector (Fourier) constraint. It is the nearest point because the change we make in either domain is the minimum alteration we have to make to any pixel to get it to be consistent with its set.

Fig. 17.66
figure 66

Graphical illustration of projection onto sets. One set, labelled R, contains all possible images that satisfy the aperture constraint in real space; the other, labelled F, contains images that have a Fourier transform whose modulus satisfies the detector constraint

Even if there is only one unique solution image (the sets touch at one point—marked S\({}_{1}\)), there is no guarantee that our strategy will not get stuck jumping between the sets at the point S\({}_{2}\) in the diagram. Convergence to the correct solution is only guaranteed (and then only for perfect, noiseless data) if the two sets are convex, i. e., a line drawn between any two points within the set lies entirely within that set. Unfortunately, the phase problem is nonconvex; the steps B–C , revising the modulus of the detector wave estimates in Fig. 17.5, can be thought of as circles in the complex plane, and clearly a line between two points on the perimeter of a circle does not lie entirely on that circle.

Nevertheless, this projection between sets concept is incredibly general and can be applied to many optimization problems—it was even used by Elser to solve Sudoku puzzles [17.32]. As a result, there is a large volume of literature devoted to set projection algorithms and the analysis of their properties.

Consider next Fig. 17.67a,b. Here, we will restrict attention to two convex sets, set 1 and set 2, represented by the two black lines in the figure. We have already discussed one strategy to find the intersection between these two sets—our required solution—which is to alternately project between the two sets. The green trace in Fig. 17.67a,ba shows how this strategy bounces between the two constraint sets and staggers its way toward the intersection. Because the two constraints shown here are convex, this strategy is guaranteed to converge to the right answer, but it takes quite a large number of steps to do it, and as Fig. 17.67a,ba shows, when the sets are nonconvex, this strategy gets stuck. The difference map (DM) [17.32] is one alternative to alternating projection, and the way it spirals towards the intersection, like water down a plug hole, is illustrated by the blue trace in Fig. 17.67a,ba. (Note that DM in its most general form has a tuning parameter, but this is usually held at 1 for ptychography, under which condition DM is equivalent to several other algorithms, e. g., the Douglas–Rachford algorithm and an algorithm called averaged successive reflections (ASR).) Yet another method—relaxed averaged alternating reflections (RAAR) [17.120]—is illustrated by the red trace. RAAR can tighten the spiral behavior of DM with a parameter \(\beta\). The spiralling action of these two algorithms accomplishes two things: it speeds convergence by eliminating the zig-zagging of the alternating projections routine, and it widens the accessible search space, which for nonconvex constraint sets means that they can escape the local minima illustrated in Fig. 17.67a,ba.

Fig. 17.67a,b
figure 67

Parallel update algorithms, such as DM and RAAR, can be thought of in terms of projections between sets. (a) The most simple set projection approach—alternate projections (green trace)—bounces between two constraint sets; more advanced methods spiral in to the intersection of the two sets. (b) These advanced methods consist of a series of projections and reflections between the constraints. A single iteration of DM starts at \(p_{0}\) and steps through \(p_{0}\)-\(b\)-\(c\)-\(d\); a single iteration of RAAR goes \(p_{0}\)-\(b\)-\(c\)-\(d\)-\(p_{1}\)

DM and RAAR both employ reflections as well as projections between sets. Referring to Fig. 17.67a,bb, consider the point \(p_{0}\). The projection of this point onto set 1, \(P_{1}[p_{0}]\), lies at \(a\), the nearest point on the line to \(p_{0}\). The reflection of \(p_{0}\) about set 1 is at \(b\)—it lies in the same direction as \(a\), but is twice as far from \(p_{0}\); we can express this reflection as

$$R_{1}=p_{0}+2(P_{1}[p_{0}]-p_{0})=2P_{1}[p_{0}]-p_{0}\;.$$

In terms of these projections and reflections, alternating projections can be easily summed up as \(P_{2}[P_{1}[P_{2}[P_{1}[p_{0}]]]]\) etc.

DM follows a different pattern: from the point \(p_{0}\), reflect about set 1, then reflect about set 2, then go halfway between \(p_{0}\) and the result of these two reflections. In Fig. 17.67a,bb, this is the path \(b\) to \(c\) to \(d\). RAAR adds a final step: draw a line between the points \(a\) and \(d\) and travel a certain proportion, \(\beta\), of the way along this line to find \(p_{1}\)—this is how RAAR tightens the spiral in Fig. 17.67a,ba. (Clearly, for \(\beta=1\) RAAR and DM are equivalent.) We can only really skim the surface of this fascinating topic, so we refer the reader to the extensive literature for more details.

9.3 Implementing Ptychographic Algorithms on the Computer

It would require a book in itself to describe implementation details for all of the many algorithms for ptychography; instead, the following gives a framework that the coder can extend by reference to the literature. We will first set out processes that are common to all of the algorithms, namely initializing the object and probe, forming the exit wave and propagating it, and updating the exit wave at the detector to match the measured data. From these preliminaries, focus narrows to pseudocode examples of the three algorithms discussed above—ePIE, DM, and RAAR.

9.3.1 Initialization of the Reconstruction

Some additional nomenclature, summarized in Fig. 17.68, is needed to deal with the discrete nature of the algorithms (i. e., the unavoidable fact that the diffraction data, specimen, and probe must all be represented by finite-sized matrices in the computer):

  • The diffraction patterns are assumed square and of pixel dimensions \(N\times N\).

  • The pixel pitch of the diffraction patterns (in meters), i. e., the detector pixel pitch is \(d_{\mathrm{c}}\).

  • The object reconstruction is of pixel dimensions \(K\times L\).

  • The pixel pitch of the object reconstruction is \(d_{\mathrm{o}}\).

  • The pixel pitch of the probe is the same as the object, \(d_{\mathrm{o}}\).

  • The pixel dimensions of the probe are as for the diffraction patterns, \(N\times N\).

  • Remember, there are \(J\) diffraction patterns in total, and the specimen shift when the \(j\)-th diffraction pattern was recorded is \(\boldsymbol{R}_{j}=(x_{j},y_{j})\)

Fig. 17.68
figure 68

To explain how to implement ptychographic algorithms in code, we need to define some variables, as shown here. To digitally estimate the exit wave from a given specimen position \(x\), a calculation box with the same number of pixels as the detector must be extracted from the larger object matrix

The first step in any of the algorithms is to decide on the propagator, from which the pixel pitch of the object (\(d_{\mathrm{o}}\)) follows. For the Fourier propagator the pixel pitch is

$$d_{\mathrm{o}}=\frac{\lambda z}{Nd_{\mathrm{c}}}\;.$$
(17.10)

Here, \(z\) is the distance between the specimen and the detector, and \(\lambda\) is the illumination wavelength.

For angular spectrum and Fresnel propagation

$$d_{\mathrm{o}}=d_{\mathrm{c}}\;.$$
(17.11)

Having established the pixel pitch, the object matrix can be initialized. Usually, this is chosen as a matrix of 1s, representing free space , whose size is governed by the extent of the specimen shifts. More exactly, \(K\) and \(L\) are chosen as

$$(K,L)=\frac{\max(\boldsymbol{R}_{j})-\min(\boldsymbol{R}_{j})}{d_{\mathrm{o}}}+(N,N)\;.$$
(17.12)

Knowing the pixel pitch in the object matrix also allows conversion of the specimen shifts from the experiment into equivalent pixel shifts in the computer. To do this, we anchor the top left corner, pixel [1,1], as the origin and map the specimen shifts according to

$$\boldsymbol{R}_{j}^{\text{pix}}=\text{ROUND}\left(\frac{\boldsymbol{R}_{j}-\min(\boldsymbol{R}_{j})}{d_{\mathrm{o}}}\right)+1\;,$$
(17.13)

where \(\text{ROUND}(x)\) means round \(x\) to the nearest integer value. Initialization of the probe is highly dependent on the experimental geometry used to collect diffraction data. The simplest case arises when the experiment uses an aperture to form the probe—here, a disc of 1s embedded in a matrix of \(N\times N\) 0s suffices, with the disc's diameter roughly matching that of the physical aperture (once scaled from pixels to meters via (17.10)). In contrast, a convergent beam probe is easy to model, but hard to model accurately because of the difficulty in measuring the exact amount of defocus. (A defocus error in the initial probe is one situation where many reconstruction algorithms struggle for convergence [17.121]; see also Sect. 17.5.5.) Assuming the defocus is known, the convergent beam probe is modeled by Fourier transforming an aperture multiplied by a quadratic phase profile. The size of the aperture should reflect the numerical aperture of the probe-forming optics, which itself can be determined from the bright-field disc in the diffraction pattern. If the diameter of the bright-field disc is \(D\) pixels, and the defocus is \(d_{\mathrm{f}}\) meters, the initial probe can be calculated as

$$F^{-1}\left[\text{circ}(D)\,\exp\left(-\mathrm{i}{\uppi}d_{\mathrm{f}}d_{\mathrm{c}}^{2}\frac{n_{1}^{2}+n_{2}^{2}}{\lambda z^{2}}\right)\right].$$
(17.14)

In (17.14), \(\text{circ}(D)\) is a 2-D function that takes the value 1 inside a disc of diameter \(D\) and 0 elsewhere, while \(n_{1}\) and \(n_{2}\) index pixels in the diffraction pattern space (with the origin at the center of the detector). For a probe with a diffuser in the beam path, the simplest strategy is to model an initial probe as above, disregarding the diffuser completely, relying on the reconstruction algorithm to untangle the diffuser's effect. Alternatively, if anything about the phase can be inferred (for example, a good approximation of the phase curvature at the detector plane), this approximate phase can be applied to a diffraction pattern, or an average of all of the diffraction patterns, and the result backpropagated to (hopefully) obtain a better initial probe estimate.

Regardless of how it is modeled, a useful final step in the probe initialization, as has been discussed, is to normalize the probe power to the diffraction data, by ensuring that the sum of the initial probe intensity over every pixel is equal to the pixel sum of the measured intensities in the brightest diffraction pattern.
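Gathering these steps together, a minimal initialization sketch for the Fourier-propagator, aperture-formed-probe case might read as follows; all numerical values and variable names are illustrative assumptions:

```matlab
% Minimal initialization sketch (illustrative values and names throughout).
N      = 128;                         % detector pixels (N x N)
dc     = 55e-6;                       % detector pixel pitch (m)
lambda = 1.5e-10;                     % wavelength (m)
z      = 2.0;                         % specimen-to-detector distance (m)
J      = 50;                          % number of diffraction patterns
positions = 1e-6 * rand(J, 2);        % specimen shifts R_j (m); placeholder values
I_meas    = rand(N, N, J);            % measured intensities; placeholder values

d_o = lambda * z / (N * dc);          % object pixel pitch, (17.10)

% Object matrix size, (17.12) (rounded up to whole pixels), initialized to free space
KL  = ceil((max(positions) - min(positions)) / d_o) + [N, N];
obj = ones(KL(1), KL(2));

% Specimen shifts converted to pixel coordinates, (17.13)
Rpix = round((positions - min(positions)) / d_o) + 1;

% Aperture-formed probe: a disc of 1s whose diameter matches the physical aperture
aperture_diam = 2e-6;                 % physical aperture diameter (m)
D_pix  = aperture_diam / d_o;         % aperture diameter in object pixels
[x, y] = meshgrid(-N/2:N/2-1);
probe  = double(x.^2 + y.^2 <= (D_pix / 2)^2);

% Normalize the probe power to the brightest diffraction pattern
brightest = max(squeeze(sum(sum(I_meas, 1), 2)));
probe     = probe * sqrt(brightest / sum(abs(probe(:)).^2));
```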

In what has become an indispensable final initialization step, the diffraction data are transferred from computer memory onto a graphics processing unit (GPU ). To give an idea of the speed increase offered by GPU computing, a typical ptychographic reconstruction carried out with the authors' MATLAB version of ePIE takes \({\mathrm{90}}\,{\mathrm{s}}\) to complete 100 iterations using an NVIDIA Titan GPU; the same reconstruction using an i7-4770 \({\mathrm{3.4}}\,{\mathrm{GHz}}\) CPU takes \({\mathrm{868}}\,{\mathrm{s}}\). The data set in this case contained 400 diffraction patterns, each of \(512\times 512\) pixels. Optimization of the code and implementation in \(C\) gives even greater speed up.

9.3.2 Modeling the Exit Wave at the Detector

To model the exit wave leaving the specimen, for a particular specimen shift (say shift \(x\)), the calculation box (Sect. 17.3.3) must be extracted from the larger object matrix, as illustrated in Fig. 17.68; this equates to cutting out a region of pixels starting from \(R_{\mathrm{x}}^{\text{pix}}\) and extending to \(R_{\mathrm{x}}^{\text{pix}}+[N,N]\). The final step in computing the exit wave is to multiply the extracted reconstruction box, pixel for pixel, by the probe matrix.

9.3.3 Computer Implementation of the Propagators

Propagation of the exit wave to the detector plane can be via Fourier transform, the angular spectrum, or Fresnel transform. The MATLAB code that the reader may use to implement these propagators digitally is given in Fig. 17.69. This code ignores multiplicative amplitude and phase constants, which do not have an effect on the ptychographic reconstruction; a more complete discussion of modeling wave propagation in MATLAB can be found in the book by Voelz [17.122].

Fig. 17.69
figure 69

MATLAB code for propagators
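
The code in Fig. 17.69 is not reproduced here, but minimal sketches along the same lines—ignoring the multiplicative constants, with our own function names, and with each function saved in its own file—might read:

  % Far-field (Fraunhofer) propagation: a centred 2-D Fourier transform.
  function psiDet = propFarField(psi)
  psiDet = fftshift(fft2(ifftshift(psi)));
  end

  % Angular-spectrum propagation over a distance z, with pixel size dx and wavelength lambda.
  function psiZ = propAngularSpectrum(psi, z, dx, lambda)
  N = size(psi, 1);
  fx = (-N/2 : N/2-1) / (N*dx);                 % spatial frequencies
  [FX, FY] = meshgrid(fx);
  kernel = exp(2i*pi*(z/lambda) ...
           * sqrt(complex(1 - (lambda*FX).^2 - (lambda*FY).^2)));
  spec = fftshift(fft2(ifftshift(psi)));        % centred spectrum
  psiZ = fftshift(ifft2(ifftshift(spec .* kernel)));
  end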

9.3.4 Revision of the Exit-Wave

Replacing the modulus of the wavefront at the detector with the measured data is most efficiently achieved by dividing the propagated wavefront by its own modulus and multiplying by the square root of the measured intensity. Care should be taken to avoid division by zero if this approach is adopted—for example, by adding a small number to the modulus before the division as in Fig. 17.70 (eps is the smallest number MATLAB can represent).

Fig. 17.70
figure 70

MATLAB code for the modulus replacement step
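
A minimal MATLAB sketch of this step (our own function name, following the eps trick described above) might be:

  function psiCorr = applyModulus(psiDet, measuredI)
  % Keep the phase of the propagated wave but impose the measured modulus;
  % adding eps to the modulus guards against division by zero (cf. Fig. 17.70).
  psiCorr = psiDet ./ (abs(psiDet) + eps) .* sqrt(measuredI);
  end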

9.3.5 PIE-Type Algorithms

After the preliminaries given above, the PIE-type algorithms can be written in just a few lines of code. As an example, in Fig. 17.71 we give the pseudocode for implementation of rPIE; changing the update function to realize ePIE or PIE is straightforward. One caveat: the code in Fig. 17.71 assumes a sequential order in which to address the diffraction patterns; in practice, it is better to randomize this order and re-randomize it after every iteration.

Fig. 17.71
figure 71

MATLAB code for rPIE
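
For orientation only, a minimal MATLAB sketch of a single rPIE step at one probe position might read as follows. The variable names, the illustrative values of \(\alpha\) and \(\beta\), and the choice of an ePIE-style probe update are ours, and the measured pattern is assumed to have been pre-fftshifted to match fft2, as suggested in the implementation tips later in this section.

  % One rPIE step at a single probe position (cf. Fig. 17.71). Assumed inputs:
  % obj (object estimate), probe, Rpix (pixel offset for this position) and
  % measuredI (the corresponding recorded pattern, pre-fftshifted).
  alpha = 0.1;   % rPIE object regularization constant (illustrative value)
  beta  = 1;     % probe-update step size (illustrative value)
  N     = size(probe, 1);
  rows  = Rpix(1)+1 : Rpix(1)+N;
  cols  = Rpix(2)+1 : Rpix(2)+N;

  box     = obj(rows, cols);
  psi     = probe .* box;                               % exit wave
  PSI     = fft2(psi);                                  % far-field propagation
  PSI     = PSI ./ (abs(PSI) + eps) .* sqrt(measuredI); % detector constraint
  psiCorr = ifft2(PSI);                                 % propagate back
  dPsi    = psiCorr - psi;

  % rPIE object update; an ePIE-style probe update is used here for brevity.
  obj(rows, cols) = box + conj(probe) .* dPsi ...
      ./ ((1-alpha)*abs(probe).^2 + alpha*max(abs(probe(:)).^2));
  probe = probe + beta * conj(box) .* dPsi / max(abs(box(:)).^2);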

9.3.6 Set Projection Algorithms

Looking back to Fig. 17.67a,b, in order to implement the set projection methods, we need to define the two sets that represent the ptychographic problem, as well as the two projections onto these sets. The first set, set 1, represents the detector constraint we have already discussed; it is the set of all exit waves that have the correct (measured) modulus at the plane of the detector. We project onto this set by taking the current estimates of the object and probe, and for each specimen shift forming the exit wave (\(a\,q\)), propagating, correcting the modulus and propagating back. This is accomplished as shown in the pseudocode in Fig. 17.72.

Fig. 17.72
figure 72

Pseudocode: parallel Fourier constraint

The second set, set 2 in Fig. 17.67a,b, represents the set of exit waves for which the probe and object are consistent. We illustrated this concept in Fig. 17.8a-d; the overlap between the regions of the object illuminated by the probe during the experiment links together the exit waves, because we know they must have been formed in the experiment by an unchanging object and a static probe. (We have seen that these assumptions can be somewhat relaxed in practice.) Projection onto this consistency set is via the probe and object update functions, which for the set projection methods take on a slightly different form to those of the PIE-type versions. Figures 17.73 and 17.74 present pseudocode outlines of these updates, which are applied one after the other to implement the projection, with the updated object feeding into the probe update.

Fig. 17.73
figure 73

Pseudocode description of the object update used by DM, RAAR, and other batch update algorithms

Fig. 17.74
figure 74

Pseudocode description of the probe update used by DM, RAAR, and other batch update algorithms
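
As a minimal sketch of these two updates (not reproduced from Figs. 17.73 and 17.74; our own function and variable names, with the current exit waves for all \(J\) probe positions assumed to be held in a stack), the projection onto the consistency set might be implemented as:

  function [obj, probe] = batchUpdate(obj, probe, psiStack, Rpix)
  % Overlap-consistency updates: psiStack(:,:,j) holds the exit-wave estimate for
  % probe position j, and Rpix(j,:) is its pixel offset in the object array.
  N = size(probe, 1);
  J = size(psiStack, 3);

  % Object update: a probe-weighted average over all overlapping boxes.
  num = zeros(size(obj));
  den = zeros(size(obj));
  for j = 1:J
      r = Rpix(j,1)+1 : Rpix(j,1)+N;
      c = Rpix(j,2)+1 : Rpix(j,2)+N;
      num(r,c) = num(r,c) + conj(probe) .* psiStack(:,:,j);
      den(r,c) = den(r,c) + abs(probe).^2;
  end
  obj = num ./ (den + eps);

  % Probe update, fed by the freshly updated object.
  numP = zeros(N);
  denP = zeros(N);
  for j = 1:J
      r = Rpix(j,1)+1 : Rpix(j,1)+N;
      c = Rpix(j,2)+1 : Rpix(j,2)+N;
      box  = obj(r,c);
      numP = numP + conj(box) .* psiStack(:,:,j);
      denP = denP + abs(box).^2;
  end
  probe = numP ./ (denP + eps);
  end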

9.3.7 Alternating Projections

Having defined the two projections, the most basic algorithm applies them alternately: project onto set 1, project onto set 2, project onto set 1, \(\ldots\) This is achieved in the fashion shown in Fig. 17.75.

Fig. 17.75
figure 75

Pseudocode description of the simplest batch update method—alternating projections

9.3.8 DM and RAAR

Implementation of DM and RAAR proceeds in a similar manner, the only complication being the slightly more involved way that the exit waves are updated when the detector constraint is applied. Both methods can be coded along the lines of Fig. 17.76; setting \(\beta\) to 1 in this code gives the standard version of DM.

Fig. 17.76
figure 76

Pseudocode description of the RAAR algorithm—the standard implementation of DM is obtained by setting \(\beta=1\)

9.3.9 Implementation Tips and Tricks

Some general points that can be helpful:

  • Algorithms can be accelerated by pre-computing the exponential phase terms in the propagators and by pre-square-rooting the diffraction patterns.

  • Implementations are commonly written in MATLAB. To avoid repeated use of fftshift in the reconstruction loop, the diffraction data can be fftshifted once in advance, as can the pre-computed exponents in the propagators—this can give a significant speed boost (see the sketch after this list).

  • Generally, single precision numbers are sufficient for excellent reconstruction accuracy and offer another significant speed boost.
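
For example, the square-rooting, fftshifting, and conversion to single precision can be combined into a single preprocessing line (diffPats is our name for the \(N\times N\times J\) stack of measured intensities):

  % One-off preprocessing: square-root the data, convert to single precision, and
  % pre-apply fftshift along the detector dimensions, so the reconstruction loop
  % can use plain fft2/ifft2 without further shifts.
  diffAmps = fftshift(fftshift(single(sqrt(diffPats)), 1), 2);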

9.4 A Basic Comparison of Algorithms

There has yet to be a comprehensive comparison of the different ptychographic algorithms with either real-world or simulated data, although Yang et al. have evaluated many of the batch-type algorithms (DM, RAAR, conjugate gradient) [17.36], and work by Waller and colleagues assessed a range of batch and serial approaches for Fourier ptychography [17.37]. The difficulty in performing such an evaluation comes from the huge range of real-world scenarios to which ptychography may be applied—even tests restricted to the x-ray regime would need to cover situations ranging from very weak phase objects illuminated with a high-energy convergent probe to strong, optically thick samples illuminated by a diffused soft x-ray probe. Here, a brief comparison between the algorithms detailed in the previous section is provided—more as an example of their various traits than as any sort of assessment of their performance. It should also be said that the authors have a great deal of experience with the ePIE-type algorithms, and much less knowledge of the tricks and short-cuts that might improve operation of the batch-type alternatives.

Our first comparison is by simulation. We used as a specimen a photograph of one of the authors' daughters, converted into a phase-only object with a phase range of 0–\(2\uppi\) (so that the darkest parts of the photograph mapped to zero phase, and the brightest to a phase of \(2\uppi\)). As a probe, we simulated a convergent beam with a small defocus. After every iteration of each algorithm, we calculated the sum of the differences between the evolving object reconstructions and the true photograph—these error values are plotted in Fig. 17.77. We also paused the algorithms at various points to take a snapshot of the phase reconstruction—these snapshots are shown in Fig. 17.78.

Fig. 17.77
figure 77

Progress of reconstructions using different algorithms in a simulated experiment. The graph plots an error metric that is the sum of the differences between the intensities of the exit waves that the algorithms estimate and the measured intensities captured by the detector; note the link between the spiralling action of DM and RAAR in Fig. 17.67a,b and this figure

Fig. 17.78
figure 78

Progress of reconstructions using different algorithms in a simulated experiment. Here, we have taken snapshots of the object estimate at various points during the reconstructions. Notice how the batch/parallel update algorithms, DM and RAAR, handle the center and edges of the object quite equally, whilst the serial update algorithms, ePIE and rPIE, obtain the center parts much more quickly but take time to reconstruct the edges

We can note a couple of salient points here. First that the center of the ePIE reconstruction evolves quite quickly, in the early iterations, whilst the outside part takes much longer to appear; second, that rPIE converges very quickly—at the center and the edges—given this ideal, noise-free data set. The batch algorithms are much more balanced in the way they update the object, with the edges and the center of the reconstructions evolving at an equal pace. DM converges quite quickly but tends to oscillate around the solution (like the spiralling action seen in Fig. 17.67a,b), whilst RAAR, although slightly slower initially, gives a very good convergence rate once it arrives near the solution.

For a second comparison, we collected data from an optical bench experiment. Our experiment used the simplest geometry of a probe formed by an aperture and a CCD camera placed in the far field. As a specimen, this time we used a prepared microscope slide holding a section taken from a clam's gill (chosen only because it looks quite beautiful at high magnification).

After 100 iterations of each algorithm, the images in Fig. 17.79 emerged (the amplitude part is shown). In this instance, DM does not fully converge, and the result is a slightly speckled image. RAAR performs very well, with the resulting image displaying a good level of detail and good noise suppression. ePIE and rPIE also both produced good results (although perhaps RAAR just wins out).

Fig. 17.79
figure 79

These amplitude images of a clam's gill were reconstructed using data gathered in an optical bench experiment. All of the algorithms work quite well for this real-world data; the ptychographer must choose their own poison!

These simple comparisons mirror what is quite clear from the literature: that, given a carefully conducted experiment and reasonably clean data, the inversion problem at the heart of ptychography is well conditioned and amenable to solution by any optimization technique—from simple gradient descent right through to the cutting-edge nonlinear heavyweights.

10 Wigner Distribution Deconvolution (WDD) and Its Approximations

We are now going to discuss a class of direct, noniterative solution methods for the ptychographic phase problem, which can be used when the sampling in real space (i. e., the distance moved by the probe) also occurs at the Nyquist sampling frequency; this is determined by the rate at which intensity in the diffraction pattern changes as a function of probe position. The methods we will describe have their most practical implementation in the focused probe configuration (Sect. 17.5.1 and Fig. 17.21). In this case, there is an aperture in the probe-forming lens so that, in the absence of aberrations, the probe is of the form of a bandwidth-limited Airy disc function. The highest frequency in the illumination is determined by the diameter of the lens aperture, which means that as this probe is moved laterally, there is also a maximum spatial frequency at which the far-field intensity can alter. We can think of this via reciprocity. In the case of Fourier ptychography, the maximum spatial frequency that can arrive in a conventional image is also determined by the size of the aperture in the back focal plane of the lens employed, according to Abbe's theory. Clearly, there is no point in measuring the image (or moving the probe in real-space ptychography ) at a higher spatial frequency (step size) than the maximum Fourier component of intensity in the image.

There are a couple of qualifications to this last statement. First, if we record the conventional image on a pixelated detector, say when undertaking Fourier ptychography, it is often advisable to sample at a higher frequency than Nyquist's to avoid effects from the roll off of the spatial transfer function of the detector itself. Second, it must be remembered that the Nyquist frequency of the image is determined by the interference of beams passing by opposite sides of the lens aperture, i. e., separated by its diameter. This is twice the frequency of Fourier components in the conventional coherent bright-field image obtained from a weakly scattering object, where the interference phenomenon is between a central unscattered beam on the optic axis and beams scattered up to the radius of the aperture. Later, we will see how this double-frequency/resolution issue manifests itself in ptychography.

Given (17.7), we might suppose that having maximal sampling in real space means that we can tolerate very low sampling in reciprocal space. We will find out below (Sect. 17.10.4) that this is true, but only if:

  1. 1.

    The object is weakly scattering.

  2. 2.

    We are prepared to sacrifice our ability to remove even the most minor aberrations in the lens we are using to form the probe (including defocus).

  3. 3.

    We accept that we cannot improve upon the resolution of the lens, as defined by its aperture, i. e., we cannot make use of dark-field scattering.

Under these conditions, we can, nevertheless, find the phase of the image more accurately than via the bright-field image.

We are going to start by thinking about the absolutely maximally sampled data set in both real and reciprocal space. We will define maximal sampling in reciprocal space as the detector having a pixel size that is smaller than the reciprocal size of the whole field of view of the reconstructed image. This is much more demanding than simply being the inverse of the size of the probe (as is usual in ptychography), since the probe is invariably much smaller than the field of view. Matching reciprocal coordinates deriving from the field of view of the scan in real space and the total field of view as seen by the detector is only necessary if we want to use WDD to image strongly scattering specimens. Of course, these stringent conditions do not apply to iterative methods.

The experimental demands made by such a vast quantity of data are phenomenal—for a modest \(512\times 512\) pixel field of view, with a \(512\times 512\) pixel diffraction pattern collected at every image pixel, we have nearly 69 billion measurements. If the bit depth of the detector is 16, you could only fit eight of these data sets onto a terabyte drive—and all this to solve for eight \(512\times 512\) pixel images! What is the advantage of all of this? One answer is that these extreme, very densely sampled data are the most we could ever hope to measure from a ptychography experiment, and so it must axiomatically be a good thing. Another answer is that very densely sampled data can be used to solve the ptychographic phase problem using a linear, closed form of inversion called Wigner distribution deconvolution (WDD). This was developed in the early 1990s [17.110, 17.123, 17.124, 17.125, 17.126, 17.31], more than 10 years before the modern iterative solutions for ptychography (for a review, see [17.5]). Given the agonizing history of the phase problem during the twentieth century, it is quite extraordinary that WDD solves an apparently nonlinear and intractable inverse problem with a handful of Fourier transforms. It can also do almost everything else modern iterative methods can achieve: solve for the illumination [17.31], remove partial coherence effects [17.110], and suppress 3-D scattering effects [17.127, 17.128]. Balanced against the absurd quantities of data it requires is the fact that it is computationally very fast. And, anyway, in an age of big data, is this a problem? A domestic consumer can buy a terabyte disc for less than $100; when the original work on WDD was done in the 1990s, the same amount could buy \({\mathrm{100}}\,{\mathrm{MB}}\).

10.1 Notes on Nomenclature

In this section, we will be talking about four-dimensional data sets, each a function of two two-dimensional variables, illustrated in Fig. 17.80. Those in mixed real and reciprocal dimensions are called \(I(u,R)\) and \(H(r,U)\), where \(U\) is the reciprocal coordinate of \(R\), and \(u\) is the reciprocal coordinate of \(r\). We also have a function \(G(u,U)\), which is a function of two reciprocal coordinates, and \(L(r,R)\), which is a function of two real coordinates. Other conventions could be chosen (say that capital coordinates are the reciprocal of real coordinates). An advantage of the present scheme is that it emphasizes that \(G(u,U)\), our most important function, lies entirely in reciprocal space.

Fig. 17.80
figure 80

Fourier relationships between the recorded intensity, \(I\), which is measured as a function of probe position, \(R\), and detector coordinate, \(u\), and the \(G\) and \(H\)-sets. We do not discuss \(L\), which is not relevant in the context of the main text

Now, \(R\) is the probe position, \(r\) lies in the object space, and \(I(u,R)\) is our detector intensity such that

$$I(u,R)=\left|\mathfrak{F}\left\{a(r-R)q(r)\right\}\right|^{2}\;,$$
(17.15)

where, as before, \(a(r)\) is the probe and \(q(r)\) is the specimen transmission function; it is important here that we keep track of coordinates, so we have now included the dependency on \(r\) explicitly for these functions. We will use \(A(u)\) to denote the Fourier transform of the probe and \(Q(u)\) to represent the Fourier transform of the specimen.

With reference to Fig. 17.80, where each two-dimensional vector is represented by just one coordinate (so that the 4-D data set is shown as a 2-D function), the Fourier relationships between the data sets are as follows. Horizontal pointers represent a Fourier transform over just one variable, from \(u\) transformed to \(r\), or vice versa. Vertical pointers are transforms also over one variable, \(R\) to \(U\) or vice versa. Diagonals represent Fourier transforms over all coordinates, \((u,R)\) to \((r,U)\) or vice versa, and \((u,U)\) to \((r,R)\), and vice versa.

Table 17.1 shows the relationship between these coordinates and those of the original work done on WDD in the 1990s.

Table 17.1 Relationship of names of coordinates between this and an earlier work

It has been realised that the review published in 2008 [17.5], which adopted the same nomenclature convention as here, included a confusing mistake. The reader is advised to ignore the paragraph at the bottom of p. 122 (which confuses \(r\) and \(U\)) and instead refer to Table 17.1. This error propagated through to equations (94)–(102) of that review. In these, \(L(U,R)\) should have read \(L(r,R)\), and \(D(u,r)\) should have read \(D(u,U)\), with the coordinates substituted equivalently in the RHS of the equations.

10.2 Phase Recovery from Data Sampled Densely in Real Space

Rather than launching straight into the mathematics of WDD, we think it is important to give the reader a physical insight into the most important and mysterious aspect of the technique: how is phase information extracted from raw data which have only been recorded in intensity? A Fourier transform of a single diffraction pattern's intensity gives the autocorrelation function. Although this can be useful, especially if there is a strong reference signal in the original wavefield, the object function is far from self-evident. Conversely, WDD relies on the principal strength of ptychography: probe movement. Let us see how this works.

Consider a focused probe that reaches its crossover some distance in front of a periodic grating, as shown in Fig. 17.2a-c (we discussed this interference in relation to Hoppe's definition of ptychography). In the far field we would expect to see a shadow image of the object. (If you have an optical bench at your disposal and you want to understand ptychography, this is an exceedingly informative experiment.) Figure 17.2a-cb shows an example result. In this case, the object is a TEM grid illuminated by a laser beam focused by a single lens that has a variable aperture. In Fig. 17.2a-cc, the aperture has been closed down. We now see discrete diffracted orders that are interfering with one another, giving fringes perpendicular to the scattering vector of the diffracted reflection, but with the same periodicity as the features that were cast in the shadow image of the object function. If the aperture is shut right down, the illumination is effectively parallel, so the discs become the usual diffraction spots and cannot interfere with one another. We see rather directly how interfering diffraction orders evolve into a shadow image. Incidentally, if there are isolated features on the grating, like pieces of dust, they are not at all easily visible in the coherent shadow image because the interference of the diffraction orders dominates. If the source is partially coherent, resolution is lost, but so are these very strong diffraction effects, and so the isolated features become visible.

If we move the object (or probe) at a continuous speed, the shadow image and/or the interference fringes move across the diffraction plane at a rate that depends on the defocus of the probe. Greater defocus leads to less magnification in the shadow image, but this image appears to move laterally more slowly. The two effects cancel each other, so that at any one point in the shadow image, the variation of bright-dark-bright is determined only by the pitch of the grating, irrespective of defocus. This is exactly what would be expected given that the only change in the experiment is the object shift, so any change in intensity anywhere in the optical system must oscillate in synchrony with the periodic structure in the object. The degree of overlap between the diffracted discs is also directly related to the periodicity of the object, according to the usual diffraction grating equation.

Remember:

  1. (i)

    The position of the diffracted beam, and its overlap with other diffracted beams, is determined precisely by a specific periodicity in the sample.

  2. (ii)

    As the probe is moved, the intensity within the overlap areas changes periodically, at exactly the same periodicity within the specimen that defines the position of the diffracted beam. This intimate relation defines the structure of what we will call the Fat-H, as well as other characteristics of WDD. This principle is not confined to crystalline or periodic objects.

It is impossible to picture the full densely sampled 4-D data set. We are, therefore, going to use one-dimensional (1-D) lines, where a line represents a 2-D image or a 2-D diffraction pattern. The raw data set can then be represented as a 2-D function, plotted as a function of probe position and detector (diffraction pattern) coordinate, as shown in Fig. 17.81. Horizontal lines correspond to diffraction patterns. Vertical lines are images (the signal detected at a diffraction pixel as a function of probe position). The data are the same as in a Fourier ptychography experiment, where vertical lines are images collected at a particular angle of illumination. We will also sometimes plot 2-D functions that represent other slices through the 4-D function. All these are not plots of one variable against another, but of a function of two variables, where each point in the 2-D plane will have a value that is not shown, although we could, in principle, show this using variable shading. Except for the raw data, all the functions are complex-valued.

Fig. 17.81
figure 81

The maximally sampled 4-D data set. Along horizontal lines we have diffraction patterns, each from one probe position. Along vertical lines we have images, each recorded as a function of probe position using the signal collected from one diffraction pattern pixel

We are going to start by considering ptychographic phase retrieval for a periodic object, as in Fig. 17.2a-c. First we look at the raw intensity data, plotted as a function of \(R\) and \(u\), namely \(I(u,R)\) (shown on the top left-hand side in Fig. 17.80), where \(R\) is the probe position coordinate and \(u\) is the diffraction pixel coordinate. If we have a periodic object, we have multiple strips, as shown in Fig. 17.82. In this example, the strips (1-D representations of the 2-D diffracted discs in the full 4-D data set) are not overlapping. When they do overlap (Fig. 17.83), we see interference, which periodically changes as a function of probe position. For simplicity, the interference is shown as if the probe were perfectly focused. If it were defocused, the interference fringes in this plane would be diagonal. As the probe is moved, the interference then shifts laterally across the overlap in the diffraction plane. For the focused probe, there is then no structure in the overlap of the discs, but the interference signal still changes as a function of the probe position. The position of these fringes relative to one another will deliver the solution of the phase problem.

Fig. 17.82
figure 82

Shaded regions show where there is intensity in the recorded data when there are nonoverlapping diffraction discs arising from a crystalline specimen. The discs at the bottom show a perspective view of the two-dimensional diffraction pattern. In the main diagram, each diffraction pattern is a horizontal line, as in Fig. 17.80

Fig. 17.83
figure 83

Raw data when discs from a crystalline specimen (as in Fig. 17.2a-c) overlap. The interference within the overlap changes periodically. If there was defocus in the probe (Fig. 17.2a-c), these interference effects would be diagonal; think of a horizontal line moving down the figure. Each pattern has fringes in the overlap region, which move laterally as the probe is moved. The position of these fringes solves the phase problem

We now take a Fourier transform with respect to the probe shift coordinate, but not across the horizontal 1-D diffraction pattern coordinate. This means that we take out a vertical strip of pixels in our 2-D data set, do a 1-D Fourier transform on it, and then replace it in the same strip where we took it from, and so on for all such vertical lines. The result is shown in Fig. 17.84, where the vertical coordinate is now a reciprocal coordinate of the probe position \(R\) labeled by \(U\). We call this function the G-set [17.124, 17.125]. The \(u\) coordinate remains unchanged, spanning the detector. Except at \(U=0\) (the zeroth component of the Fourier transform), there is no amplitude in the \(G\)-set in the vertical direction wherever the diffracted discs do not overlap—because these regions did not change as a function of \(R\). However, we have lines of amplitude, each with the width of the aperture overlap and positioned at the frequency of the structural element in the object that gave rise to the interference. When we do the mathematics, we find that the phase of these features corresponds directly to the phase difference between each pair of the diffracted discs, although we may have to deconvolve the influences of an aberrated or defocused probe. Once we have all such phase differences, we can construct the whole Fourier transform of the object; the phase problem is solved, once again by exploiting ptychographical probe-movement translational diversity.

Fig. 17.84
figure 84

Once the Fourier transform is taken with respect to the probe position, the periodic features in Fig. 17.83 appear at specific frequencies
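
In MATLAB, and assuming the raw data are held in a 4-D array with the detector coordinates first and the probe-position coordinates last (our own convention and variable name), this one-coordinate Fourier transform is a single line:

  % Forming the G-set from a 4-D data set I4(u1, u2, R1, R2): Fourier transform over
  % the two probe-position dimensions only, leaving the detector coordinates untouched.
  % (fftshift centres the U = 0 row, as plotted in Fig. 17.84.)
  G = fftshift(fftshift(fft(fft(I4, [], 3), [], 4), 3), 4);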

This particular focused probe experimental geometry was how Hoppe first formulated the concept of ptychography, at least as a gedanken experiment [17.2]. Instead of using all the probe positions, he proposed using just two positions, which just about provides adequate information to unlock the phase problem if the object is, indeed, a perfect crystal, and the diameter of the interfering discs is such that there is only one overlap occurring at any one point in the diffraction pattern [17.5]. Moving over towards a general noncrystalline object, there is a continuous spectrum of diffracted intensity, so the many diffraction discs, and their interferences, all overlap with one another inseparably. The advantage of collecting a whole field of view of probe positions, and Fourier transforming with respect to the probe position coordinate, is that the interferences in the regions of overlap are teased apart.

However, this is not the end of the story. It turns out (see below) that we can quite easily form an image from a weakly scattering object using the \(G\)-set directly. More generally, if the probe is aberrated, the lines (i. e., the aperture overlaps) in Fig. 17.84 are complex variables with fine structure. In the case of defocus, the fringes cause any particular point in the diffraction pattern to become bright and dark at different times (the variation of intensity has different phases) as the probe position is scanned. Worse, if the object is strong, diffracted discs do not only interfere with the central disc, but with other, possibly strong, diffracted discs. This means that there can be multiple overlap areas at any one value of \(U\), which can themselves overlap with one another! Our single Fourier transform has not perfectly separated all the ptychographic interferences. We will see that these more complicated effects can be deconvolved via the WDD method; in the meantime, we will explore more fully the weak phase object approximation in the case when the probe-forming optics are perfect.

10.3 Weak Phase Object Approximation: The Fat-H

Before we can go further, we have to derive a mathematical definition of the \(G\)-set. We recall that the exit wave from our object function \(q(\boldsymbol{r})\), with incident probe \(a(\boldsymbol{r})\), can usually be approximated as a simple point-by-point product

$$\psi(\boldsymbol{r})=a(\boldsymbol{r}-\boldsymbol{R})\cdot q(\boldsymbol{r})\;.$$
(17.16)

The complex amplitude arriving at the detector is then

$$M(\boldsymbol{u})=\int\psi(\boldsymbol{r})\mathrm{e}^{\mathrm{i}2\uppi\boldsymbol{u}\cdot\boldsymbol{r}}\mathrm{d}\boldsymbol{r}=\mathfrak{F}\{\psi(\boldsymbol{r})\}\;.$$
(17.17)

The intensity at the detector is now

$$I(\boldsymbol{u},\boldsymbol{R})=|M(\boldsymbol{u},\boldsymbol{R})|^{2}=|\mathfrak{F}\{a(\boldsymbol{r}-\boldsymbol{R})\cdot q(\boldsymbol{r})\}|^{2}\;,$$
(17.18)

which can alternatively be written as a convolution of the Fourier transforms of \(a\) and \(q\), namely \(A\) and \(Q\)

$$I(\boldsymbol{u},\boldsymbol{R})=\left|\left[A(\boldsymbol{u})\mathrm{e}^{\mathrm{i}2\uppi\boldsymbol{R}\cdot\boldsymbol{u}}\right]\otimes Q(\boldsymbol{u})\right|^{2}\;.$$
(17.19)

The exponential derives from the Fourier shift theorem and can be thought of as a phase ramp added to the aperture transfer function, which, like a thin prism, has the effect of shifting the probe in real space [17.5]. What we aim to do is to form \(G\), namely

$$G(\boldsymbol{u},\boldsymbol{U})=\int I(\boldsymbol{u},\boldsymbol{R})\mathrm{e}^{\mathrm{i}2\uppi\boldsymbol{R}\cdot\boldsymbol{U}}\mathrm{d}\boldsymbol{R}\;,$$
(17.20)

the Fourier transform of our data with respect to just the \(\boldsymbol{R}\) (probe position coordinate)—but not with respect to the detector coordinate, \(\boldsymbol{u}\). Equation (17.19) is not in a convenient form for evaluating this transform, so we rewrite the intensity with the convolution made explicit, so that

$$\begin{aligned}\displaystyle I(\boldsymbol{u},\boldsymbol{R})&\displaystyle=\iint A(\boldsymbol{u}_{\mathrm{a}})Q(\boldsymbol{u}-\boldsymbol{u}_{\mathrm{a}})A^{*}(\boldsymbol{u}_{\mathrm{b}})\\ \displaystyle&\displaystyle\quad\,\times Q^{*}(\boldsymbol{u}-\boldsymbol{u}_{\mathrm{b}})\mathrm{e}^{\mathrm{i}2\uppi\boldsymbol{R}\cdot(\boldsymbol{u}_{\mathrm{a}}-\boldsymbol{u}_{\mathrm{b}})}\mathrm{d}\boldsymbol{u}_{\mathrm{a}}\mathrm{d}\boldsymbol{u}_{\mathrm{b}}\;,\end{aligned}$$
(17.21)

where \(\boldsymbol{u}_{\mathrm{a}}\) and \(\boldsymbol{u}_{\mathrm{b}}\) are dummy variables; the result of the integral does not depend upon them, although they are needed to compute the integral. Now we substitute into (17.20), to get

$$\begin{aligned}\displaystyle G(\boldsymbol{u},\boldsymbol{U})&\displaystyle=\iiint A(\boldsymbol{u}_{\mathrm{a}})Q(\boldsymbol{u}-\boldsymbol{u}_{\mathrm{a}})A^{*}(\boldsymbol{u}_{\mathrm{b}})\\ \displaystyle&\displaystyle\quad\times Q^{*}(\boldsymbol{u}-\boldsymbol{u}_{\mathrm{b}})\mathrm{e}^{\mathrm{i}2\uppi\boldsymbol{R}\cdot(\boldsymbol{u}_{\mathrm{a}}-\boldsymbol{u}_{\mathrm{b}}+\boldsymbol{U})}\mathrm{d}\boldsymbol{u}_{\mathrm{a}}\,\mathrm{d}\boldsymbol{u}_{\mathrm{b}}\,\mathrm{d}\boldsymbol{R}\;.\end{aligned}$$
(17.22)

Note that \(A\) and \(Q\) have no dependence on \(\boldsymbol{R}\), in the above. After all, in ptychography the illumination and the specimen stay the same, wherever the probe is moved. We can, therefore, integrate over \(\boldsymbol{R}\) to give

$$\begin{aligned}\displaystyle G(\boldsymbol{u},\boldsymbol{U})&\displaystyle=\iint A(\boldsymbol{u}_{\mathrm{a}})Q(\boldsymbol{u}-\boldsymbol{u}_{\mathrm{a}})A^{*}(\boldsymbol{u}_{\mathrm{b}})\\ \displaystyle&\displaystyle\quad\,\times Q^{*}(\boldsymbol{u}-\boldsymbol{u}_{\mathrm{b}})\delta(\boldsymbol{u}_{\mathrm{a}}-\boldsymbol{u}_{\mathrm{b}}+\boldsymbol{U})\mathrm{d}\boldsymbol{u}_{\mathrm{a}}\mathrm{d}\boldsymbol{u}_{\mathrm{b}}\;,\end{aligned}$$
(17.23)

where we have used the fact that the integral of the complex exponential over \(\boldsymbol{R}\) is a delta function, which is zero everywhere except where its argument, \(\boldsymbol{u}_{\mathrm{a}}-\boldsymbol{u}_{\mathrm{b}}+\boldsymbol{U}\), is zero. This is only strictly true if we integrate over infinite limits—a fact that does have consequences when our field of view is finite, as will be discussed in Sect. 17.10.6, More About Sampling and Probe Size. We use the delta function to perform the integral over \(\boldsymbol{u}_{\mathrm{b}}\) (the choice of \(\boldsymbol{u}_{\mathrm{a}}\) or \(\boldsymbol{u}_{\mathrm{b}}\) is not essential); the delta function in (17.23) only has value at \(\boldsymbol{u}_{\mathrm{b}}=\boldsymbol{u}_{\mathrm{a}}+\boldsymbol{U}\), so

$$\begin{aligned}\displaystyle G(\boldsymbol{u},\boldsymbol{U})&\displaystyle=\int A(\boldsymbol{u}_{\mathrm{a}})Q(\boldsymbol{u}-\boldsymbol{u}_{\mathrm{a}})A^{*}(\boldsymbol{u}_{\mathrm{a}}+\boldsymbol{U})\\ \displaystyle&\displaystyle\quad\,\times Q^{*}(\boldsymbol{u}-\boldsymbol{u}_{\mathrm{a}}-\boldsymbol{U})\mathrm{d}\boldsymbol{u}_{\mathrm{a}}\;,\end{aligned}$$
(17.24)

or more conveniently for our discussion, we substitute \(\boldsymbol{u}_{\mathrm{c}}=\boldsymbol{u}-\boldsymbol{u}_{\mathrm{a}}\) to give

$$\begin{aligned}\displaystyle G(\boldsymbol{u},\boldsymbol{U})&\displaystyle=\int Q(\boldsymbol{u}_{\mathrm{c}})Q^{*}(\boldsymbol{u}_{\mathrm{c}}-\boldsymbol{U})\\ \displaystyle&\displaystyle\quad\,\times A^{*}(\boldsymbol{u}-\boldsymbol{u}_{\mathrm{c}}+\boldsymbol{U})A(\boldsymbol{u}-\boldsymbol{u}_{\mathrm{c}})\mathrm{d}\boldsymbol{u}_{\mathrm{c}}\;,\end{aligned}$$
(17.25)

i. e., the convolution

$$\begin{aligned}\displaystyle G(\boldsymbol{u},\boldsymbol{U})=&\displaystyle Q(\boldsymbol{u})Q^{*}(\boldsymbol{u}-\boldsymbol{U})\otimes_{\mathrm{u}}A(\boldsymbol{u})\\ \displaystyle&\displaystyle\times A^{*}(\boldsymbol{u}+\boldsymbol{U})\;.\end{aligned}$$
(17.26)

The subscript \(u\) on the convolution is critically important: it denotes that we are only convolving the two functions in the \(u\)-direction, which we will try to illustrate diagrammatically later.

Now let us try to pick this apart. The first thing to note is that when \(\boldsymbol{U}=\boldsymbol{0}\), i. e., along the horizontal axis in Fig. 17.84, we just have \(|Q(\boldsymbol{u})|^{2}\) convolved with \(|A(\boldsymbol{u})|^{2}\). This happens at the zero component of the Fourier transform over \(\boldsymbol{R}\), so it is equivalent to the sum of the intensities of all our diffraction patterns. Physically, this is equivalent to an incoherent convergent beam electron diffraction (CBED) pattern collected using a wholly incoherent tungsten electron source (or an incoherent x-ray source). Each point in the (large) source gives rise to a displaced probe, and all the resulting diffraction patterns add together in the diffraction plane.

The next most important feature arises when we consider a weakly scattering object function. The Fourier transform of a weak specimen has a large spike at \(\boldsymbol{u}=\boldsymbol{0}\), corresponding to the largely unscattered transmitted beam. At all other values of \(\boldsymbol{u}\), \(Q(\boldsymbol{u})\) has very small amplitude. In Fig. 17.85a we plot \(Q(\boldsymbol{u})\) and \(Q^{*}(\boldsymbol{u}-\boldsymbol{U})\) on top of one another. The reader is asked to imagine what the product of these two functions will look like. Clearly, there is a massive spike at \(\boldsymbol{u}=\boldsymbol{U}=\boldsymbol{0}\), because this is where the two bright central features of \(Q(\boldsymbol{0})\) meet up: the intensity of the transmitted beam.

Fig. 17.85
figure 85

(a) \(Q(\boldsymbol{u})Q^{*}(\boldsymbol{u}-\boldsymbol{U})\) in the \(G\)-set for a weakly scattering object. (The same function for a strong object is shown in Fig. 17.94.) (b) \(A(\boldsymbol{u})A^{*}(\boldsymbol{u}+\boldsymbol{U})\) for a simple top-hat aperture function. The lines and shaded regions denote areas where amplitude can exist; each point within them has a complex value associated with it

Now suppose \(Q(\boldsymbol{0})\), the center of the diffraction pattern, has an amplitude of unity. Along the vertical axis, \(\boldsymbol{u}=\boldsymbol{0}\), we have

$$Q(\boldsymbol{0})Q^{*}(-\boldsymbol{U})=Q^{*}(-\boldsymbol{U})\;.$$
(17.27)

It is easy: we can just take the data along this line, reverse them, take the complex conjugate, and Fourier transform to give the complex image! There is another line of high amplitude lying along the locus \(\boldsymbol{u}-\boldsymbol{U}=\boldsymbol{0}\), where we have

$$Q(\boldsymbol{u})Q^{*}(\boldsymbol{0})=Q(\boldsymbol{u})\;,$$
(17.28)

giving us another, even more direct estimate of \(Q\). Everywhere else in this 2-D \(G\)-set, at points not on these two lines, only very weakly scattered values of \(Q\) and \(Q^{*}\) meet up to form a product. In the weak object approximation, we ignore this second-order amplitude. We cannot ignore it when the object is strong.

There is only one problem. To make this function manifest, we must have taken a Fourier transform with respect to \(R\)—but this weak phase object, with its central spike in reciprocal space, is, by inference, illuminated by a plane wave, so there has been no probe, no effects of probe movement, and, thus, no ptychographical interference. The phase retrieval only works when we consider (17.26). It is the effect of the convolution of the aperture, leading to the sort of fringes we saw in Fig. 17.2a-c, that gives us the phase. Ironically, once we have done the experiment, we will deconvolve (via the WDD method) the aperture function, and hence obtain the function in (17.36) and Fig. 17.85a in splendid isolation, as we show below.

Our \(G\)-set is, in fact, given by (17.26). We first explore where data can arrive in the \(G\)-set for a weak object, by considering the aperture term in (17.26)

$$A(\boldsymbol{u})A^{*}(\boldsymbol{u}+\boldsymbol{U})\;.$$
(17.29)

In one dimension, the simplest aperture is just a top-hat function of unity modulus, with no phase components. A little thought will show that in \(\boldsymbol{u},\boldsymbol{U}\) space, (17.29) then describes a skewed parallelogram, as shown in Fig. 17.85b.

Now consider the consequence of the convolution in (17.26) for a weak specimen. Remember that we are not convolving Fig. 17.85a with Fig. 17.85b in the two dimensions like the blurring of a two-dimensional image; we are convolving only along the \(\boldsymbol{u}\)-direction. At some value of \(\boldsymbol{U}\), we have to take out two rows of pixels along horizontal lines from Fig. 17.85a,b, convolve these two one-dimensional functions, and then put the resulting one-dimensional row of pixels back into the \(G\)-set at the same value of \(\boldsymbol{U}\). We then do this for all such 1-D functions at all values of \(\boldsymbol{U}\).

One way to picture this is as follows. A one-dimensional convolution, say \(g(x)\otimes h(x)\), can be achieved by flipping one of the functions, say \(g(x)\) becomes \(g(-x)\), and then forming the correlation of the two by moving one past the other and observing the integral of the product of the two functions as a function of displacement. For our functions, we can flip the aperture parallelogram (as a function of \(\boldsymbol{u}\)), and shift it laterally across Fig. 17.85a, as shown in Fig. 17.86. With some further thought, it can be seen that we end up with a function that looks like Fig. 17.87, which we will call the Fat-H. Note that under all circumstances,

$$G(\boldsymbol{u},\boldsymbol{U})=G^{*}(\boldsymbol{u},-\boldsymbol{U}).$$

This is a property of all Fourier transforms of real functions: here we have the Fourier transform of the raw (real, intensity) data along the original \(R\) coordinate. In all that follows we can ignore either the top or bottom half of the \(G\)-set. When we are dealing with real data sets (which for this technique are enormous), this is an important thing to remember—you can throw away half of it.

Fig. 17.86
figure 86

A way of picturing the convolution in (17.26). For each separate value of \(U\), we must form the integral of the two functions multiplied by one another as the parallelogram (a horizontally flipped version of \(A(\boldsymbol{u})A^{*}(\boldsymbol{u}+\boldsymbol{U})\)) is scanned across \(Q\left(\boldsymbol{u}\right)Q^{*}(\boldsymbol{u}-\boldsymbol{U})\)

Fig. 17.87
figure 87

The result of the correlation in Fig. 17.86 for a weak object function. We call this the Fat-H. Lines drawn between the extreme tips of the structure represent symmetric scattering conditions (Sect. 17.10.6, The Bragg–Brentano Plane); in reality these are 2-D planes extracted from the 4-D data set

So far in this analysis, all our 2-D diffraction patterns have been represented as 1-D lines, the other axis on our diagrams being reserved for the probe position or its Fourier transform. Next, we consider what is happening in the 2-D diffraction plane (which we label by coordinates \(u_{x}\) and \(u_{y}\)), but picked out at particular values of \(\boldsymbol{U}\), as shown in Fig. 17.88. The five horizontal lines correspond to the five shapes shown in the 2-D diffraction pattern plane shown on the right-hand side. As we go to higher and higher frequencies in \(\boldsymbol{U}\), which is the rate of change of intensity in the diffraction pattern as a function of probe position, the discs separate further and further—remember that the position of the diffracted disc in \(\boldsymbol{u}\) is itself determined by the periodicity in the object that gave rise to it—recall our discussion concerning Fig. 17.2a-c and the movement of the shadow image fringes across it.

Fig. 17.88
figure 88

The Fat-H is drawn as a function of \(\boldsymbol{U}\) and \(\boldsymbol{u}\), assuming both object and aperture are 1-D functions. In fact, every horizontal line in the Fat-H is a 2-D plane plotted as \(u_{x}\) and \(u_{y}\) (right). At higher \(\boldsymbol{U}\) (higher Fourier frequencies in the probe position coordinate), we see occluded aperture functions called the trotters. See Fig. 17.92a,b for an experimental example

It is exceedingly important to understand that the presence of the occluded aperture shapes in Fig. 17.88 (which we will see below are generally called the trotters ) does not depend on the object being crystalline or periodic. Our experiment in Fig. 17.2a-c used a periodic object as a simple demonstration. Any noncrystalline object is still made up of a set of Fourier components. Each one of these components lies at a particular value of \(\boldsymbol{U}\), and, therefore, can still only be expressed in the overlap areas defined by the aperture functions in the Fat-H. It is also important to appreciate that when we are dealing with the full 4-D data set, we must take the Fourier transform with respect to the probe position over both its 2-D coordinates in order to reveal the occluded aperture shapes (trotters) in Fig. 17.88. Note that the areas which are doubly shaded are where there can be amplitude for a weakly scattering object, but that does not mean there is amplitude at every such position. The presence of amplitude depends on structure in the specimen and the effects of phase aberrations in the aperture function.

As an aside, we will explain why those working in the field often call the occluded aperture functions trotters. During the 1990s, when this data set was first explored experimentally [17.124], Rodenburg built a cardboard 3-D model of the 2-D overlap regions as a function of just one of the coordinates in \(\boldsymbol{U}\). It looked somewhat like Fig. 17.89. Cutting this object horizontally (i. e., at one value of \(\boldsymbol{U}\)) gives the shape of the overlaps (Fig. 17.88) in 2-D, plotted as a function of \(u_{x}\) and \(u_{y}\). Slicing vertically down the middle between the two pointed features (the points of smallest disc overlap), gives the Fat-H (Fig. 17.87), a function of \(\boldsymbol{u}\) and \(\boldsymbol{U}\). This 3-D object had an uncanny resemblance to an upside down pig's trotter. Alternative names for the aperture overlap areas (e. g., the aperture offset functions) do not have the friendly and compact resonance of trotters. The name, always used in the plural even though the two occluded apertures are part of one 3-D trotter, has stuck amongst the cognoscenti; in what follows we will use it in parentheses. In fact, the pig's trotter is a genuinely useful insight into the nature of the ptychographic data set, even if its name is flippant.

Fig. 17.89
figure 89

The pig's trotter in 3-D. One coordinate in \(\boldsymbol{U}\) is plotted in the vertical direction. The other two coordinates are \(u_{x}\) and \(u_{y}\). This is an alternative representation of Fig. 17.88

10.4 Sector Detectors

In Sect. 17.5.1, we alluded to the fact that when the illumination is a perfectly focused probe, a ptychographic data set arising from a dense (Nyquist) sampling in real space can give us a phase-sensitive reconstruction even if we only have four pixels in the detector plane. In fact, there are some very straightforward and direct ways of doing this. Indeed, so direct that the reader may become irritated that we have gone through all the shenanigans of constructing the \(G\)-set in order to describe these techniques, although the \(G\)-set will become very important in later sections, when the probe is aberrated and/or of small numerical aperture and/or the specimen is strong.

Equations (17.27) and (17.28) tell us that when the object is weak, its Fourier transform is expressed directly in the \(G\)-set as a function of \(\boldsymbol{U}\). To get here, i. e., to effect the ptychographic interference, we need a convergent probe, which consequently gives us the Fat-H. If there are no phase components in the aperture, then because the convolution in (17.26) is taken only in the \(\boldsymbol{u}\) coordinate, \(Q\) is unaffected in all the unshaded areas in Fig. 17.90; \(Q(\boldsymbol{U})\) is expressed in every vertical line in the Fat-H, lying at any \(\boldsymbol{u}\) position within the central undiffracted disc. There is a problem in that the double overlap area, shown shaded in Fig. 17.90, will have little or no amplitude if the object is weak phase. We will not labor through the theory here, but it derives from the fact that the image of a weak phase object has no contrast, and so its Fourier transform is zero. Where there is only one sideband present (unshaded regions in Fig. 17.90), there is contrast in the image.

Fig. 17.90
figure 90

Amplitude in the shaded area of the Fat-H depends on the contrast transfer function of the lens. Unshaded areas are single sidebands, which always express contrast from the specimen, but are still affected by the complex transfer function of the lens. Sector detectors integrate vertical lines of these Fourier components

So, thinking of the Fat-H, all we have to do is put two 1-D detectors at \(\boldsymbol{u}> \boldsymbol{0}\) and \(\boldsymbol{u}<\boldsymbol{0}\). We can take the Fourier transform of any vertical line in the Fat-H in either the upper or lower half of it (i. e. one sideband) and thus obtain an image in modulus and phase. Lower frequencies are lost, or at least corrupted, in the shaded regions of Fig. 17.90. In the two-dimensional detector plane, we have something that looks like the sector detector shown in Fig. 17.23a,b.

However, there is a problem with the transfer characteristics of the images that come out of these detectors. At the center of the detector we get no transfer at all. This is equivalent to a central bright-field detector in STE/XM; we see nothing if the object is weak phase, except uniform brightness over the field of view. Both high frequencies and low frequencies pass through the very edge of the detector—i. e., on the outer extremes of the Fat-H. In between we have partial transfer of different frequencies. However, this can be filtered out, at least approximately. Each sector gives an image. The Fourier transforms of these images give diffractograms (the Fourier transform of the image intensity)—an integral over the area in \(\boldsymbol{u}\) spanned by the detector, plotted as a function of \(\boldsymbol{U}\). It is possible to weight each point in each of the four diffractograms by dividing by the line length in the \(\boldsymbol{u}\)-direction that intersects the shaded area in Fig. 17.90. For the 4-D data set, the division is by the area of the occluded apertures (trotters) at that particular value of \(\boldsymbol{U}\), see [17.129, 17.130]. In discussing sector detector transfer properties, the trotters are sometimes superimposed on the sector detector, as shown in Fig. 17.91a-c. It then becomes clear that there can be benefits in using different diameters and annular divisions of the detector to improve the transfer characteristics of sector detectors.

Fig. 17.91a-c
figure 91

Sections through the trotters (Fig. 17.88) superimposed on sector detectors. This is a way of understanding the frequency transfer properties of sector detectors. (a) Shows the effect of summing over an annular bright-field (ABF) disc. For a weak phase object, the trotters are out of phase, and so largely cancel in this configuration. For differential phase contrast (DPC) imaging, we might choose to subtract the top-left quadrant in (b) from the bottom-right quadrant. This gives strong contrast because the red and blue sectors are of opposite phase. In (c), spatial frequencies at right angles to those in (b) cancel. Reprinted from [17.131], with permission from Elsevier

Sector detectors are nowadays commercially available in electron microscopes, although the processing done on the data is usually more approximate than what we have described above. For example, we can get an approximate estimate of the phase gradient in the object simply by taking the difference in the intensities measured at opposite sectors. This signal must then be integrated to give the absolute phase change induced by the specimen.

Note that at the extreme edge of the Fat-H, we get twice the resolution of the bright-field image, whose diffractogram lies along \(\boldsymbol{u}=\boldsymbol{0}\), hence the title of the paper where the trotters were first observed [17.124]. This is nothing mysterious. The coherent bright-field image uses an incident plane wave that has a single incident \(k\)-vector. The maximum angle to which this can scatter is half the diameter of the aperture, which occurs at \(\boldsymbol{u}=\boldsymbol{0}\) in the Fat-H. When we have a convergent probe, scattering can occur from one side of the aperture to the other, i. e., across its whole diameter. (Note that we are not talking about dark-field intensity, which is scattered outside the aperture disc in the focused probe configuration.) As we have said before, it is well known that conventional microscope resolution is defined by the inverse of the addition of the numerical apertures of both the condenser lens—the range of incident vectors illuminating the specimen—and the objective lens. The same applies here, except our objective is the bright-field disc in our diffraction pattern, which we process computationally, not via another lens.

10.5 Dense Sampling in Real Space and Reciprocal Space

Why bother to have an expensive 2-D detector in the diffraction plane when a data cube that is densely sampled in real space can give an adequate phase image from a few sector detectors? After all, collecting a 2-D diffraction pattern at every densely-sampled probe position massively increases the data we have to handle. The answer is that we can cope with aberrations in the optics, we can handle strong specimens, we can exploit dark-field intensity lying outside the central unscattered beam that contains higher-resolution information than the probe-forming optics, we can remove partial coherence effects, and we can choose to image specific layers in a three-dimensional specimen. The contrast in the final reconstruction is also much better [17.10]. Of course, iterative algorithms can do all of these things, and without having to have dense sampling in real space. However, this section is about maximally sampled data—let us call it the complete data set—and why we can invert it with a linear set of transforms. It would seem logical that if we have the data-handling capabilities necessary, the complete data set must be the most informative. Once we have the data, WDD is bound to give a faster reconstruction, but whether it is better than iterative methods remains to be seen.

Several researchers have recently obtained this complete data set from the electron microscope using an ultra-fast (\({\mathrm{4000}}\,{\mathrm{fps}}\)), single-electron counting diffraction camera. Watching the data come out of this in real time as the probe is scanned is extraordinary. The central disc in the diffraction plane fills the entire camera. All that can be seen appears to be pure noise. However, when a plane taken out of the \(G\)-set is displayed, the occluded apertures (trotters) are astonishingly clear, as shown in the example in Fig. 17.92a,b. This very powerfully demonstrates the dose fractionation property of ptychography. The noise statistics from all the many diffraction patterns have been reassembled exactly as we expect, in this case by Fourier transform integration over all probe positions.

Fig. 17.92a,b
figure 92

Experimental trotters in phase (a) and modulus (b). Any aberrations in the lens are very sensitively expressed in the phase. These data were collected on a high-performance aberration-corrected machine, so the presence of phase distortions is surprising. Reprinted from [17.131], with permission from Elsevier

As we have hinted, we can do several things using this data to improve the fidelity of the reconstruction over and above what is possible with sector detectors. Most simply, we can avoid those parts of the Fat-H (shaded in Fig. 17.90) where the sidebands of \(Q\) overlap with one another, say by integrating each slice in \(\boldsymbol{U}\) only over the relevant trotter shapes. When there is any defocus or aberration in the probe, the shaded areas can have unwanted amplitude. After all, that is how we get contrast into a conventional bright-field image . By defocusing we pump amplitude into the diffractogram, which in the \(G\)-set lies along the vertical line, \(\boldsymbol{u}=\boldsymbol{0}\). A sector detector unavoidably captures this unwanted region of data.

However, it is much more effective to deconvolve the trotters from the data. We do this as usual by taking the Fourier transform across the plane of the trotters (i. e., over \(\boldsymbol{u}\) but not over \(\boldsymbol{U}\)). In other words, we Fourier transform (17.26) with respect to \(\boldsymbol{u}\). By the convolution theorem this will give us the product of the Fourier transforms of the functions depending on \(A\) and \(Q\). The coordinate \(\boldsymbol{u}\) was the reciprocal of the exit wave variable, \(\boldsymbol{r}\) (17.18). So, we say this function depends on \(\boldsymbol{r}\) and \(\boldsymbol{U}\), and we call it \(H(\boldsymbol{r},\boldsymbol{U})\), where

$$\begin{aligned}\displaystyle H(\boldsymbol{r},\boldsymbol{U})&\displaystyle=\left[\int Q(\boldsymbol{u})Q^{*}(\boldsymbol{u}-\boldsymbol{U})\mathrm{e}^{\mathrm{i}2\uppi\boldsymbol{r}\cdot\boldsymbol{u}}\mathrm{d}\boldsymbol{u}\right]\\ \displaystyle&\displaystyle\quad\,\times\left[\int A(\boldsymbol{u})A^{*}(\boldsymbol{u}+\boldsymbol{U})\mathrm{e}^{\mathrm{i}2\uppi\boldsymbol{r}\cdot\boldsymbol{u}}\mathrm{d}\boldsymbol{u}\right]\end{aligned}$$
(17.30)

or

$$H(\boldsymbol{r},\boldsymbol{U})=\chi_{Q}(\boldsymbol{r},\boldsymbol{U})\,\chi_{A}(\boldsymbol{r},-\boldsymbol{U})\;,$$
(17.31)

where for some general reciprocal function, \(F\), we have

$$\chi_{F}(\boldsymbol{r},\boldsymbol{U})=\int F(\boldsymbol{u})F^{*}(\boldsymbol{u}-\boldsymbol{U})\mathrm{e}^{\mathrm{i}2\uppi\boldsymbol{r}\cdot\boldsymbol{u}}\mathrm{d}\boldsymbol{u}\;,$$
(17.32)

which is our definition of a Wigner distribution, although in signal processing theory it is usually called an ambiguity function.

With reference to Fig. 17.80, let us try to clarify all the steps we have taken, and also to describe the final steps we have to take in order to produce an image using the WDD method. At the top left-hand side we have our recorded data, \(I\). This is a real function (intensity) recorded as a function of the diffraction plane coordinate \(\boldsymbol{u}\) and the probe shift coordinate, \(\boldsymbol{R}\). Below it is the \(G\)-set: a Fourier transform has been taken vertically over the probe position coordinate, \(\boldsymbol{R}\), transforming it to \(\boldsymbol{U}\); the \(\boldsymbol{u}\) coordinate remains untouched. When the specimen is weak, this is where we see the trotters and the Fat-H. However, the information relating to the specimen is still bound up in the \(G\)-set via the convolution in (17.26). To remove the effects of the aperture, we now transform horizontally along the coordinate of the convolution, \(\boldsymbol{u}\), to the coordinate \(\boldsymbol{r}\), this time leaving the position of the rows of pixels in the \(G\)-set unchanged. Alternatively, we can take the obvious short-cut, which was how this theory was originally formulated [17.110, 17.123], by taking Fourier transforms over all the coordinates at once, jumping straight from \(I\) to \(H\), as illustrated by the diagonal line. However, we then lose the ability to employ or understand the weak object approximations.

As we have described it, the model depends on reciprocal functions \(Q\) and \(A\). The reader is advised that most of the theoretical development of the WDD method in the literature used the real-space functions \(q\) and \(a\) in the definition of \(H\) and \(\chi_{F}\). This has no impact on the key ideas, but we now think that understanding the convolution of \(A\) and \(Q\) in the \(G\)-set—and their deconvolution—might be an easier way to understand what are otherwise rather impenetrable equations. However, for the record, the equivalent definition of \(\chi_{F}\) for a real space function, \(f\), is

$$\chi_{f}(\boldsymbol{r},\boldsymbol{U})=\int f^{*}(\boldsymbol{R})f(\boldsymbol{R}+\boldsymbol{r})\mathrm{e}^{-\mathrm{i}2\uppi\boldsymbol{R}\cdot\boldsymbol{U}}\mathrm{d}\boldsymbol{R}\;.$$
(17.33)

To proceed with the deconvolution we remove the aperture function (which for the time being we presume we know), so that we get

$$\chi_{Q}(\boldsymbol{r},\boldsymbol{U})=\frac{H(\boldsymbol{r},\boldsymbol{U})}{\chi_{A}(\boldsymbol{r},\boldsymbol{U})}\;.$$
(17.34)

Like all deconvolutions, this division is highly unstable wherever \(\chi_{A}\) is small or zero. As with the iterative update (Sect. 17.3.4), we use a Wiener filter, so that

$$\chi_{Q}(\boldsymbol{r},\boldsymbol{U})=\frac{\chi_{A}^{*}(\boldsymbol{r},\boldsymbol{U})\,H(\boldsymbol{r},\boldsymbol{U})}{|\chi_{A}(\boldsymbol{r},\boldsymbol{U})|^{2}+\varepsilon}\;.$$
(17.35)

Then, all we have to do is back Fourier transform \(\chi_{Q}\) with respect to \(\boldsymbol{r}\), to a function dependent only on \(Q\). This is usually called the \(D\)-set. It exists in the same coordinate system as the \(G\)-set, but now the aperture function has been removed. As we pointed out before, we need the aperture function to get the interference data in the first place, but it also places an important restriction on the \(D\)-set: there is no information beyond the extreme ends of the Fat-H in the vertical direction.
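To make the sequence of transforms concrete, the following is a minimal numerical sketch of the pipeline just described, written in the 1-D analogue (one detector coordinate \(u\), one probe-position coordinate \(R\)) so that the data set is a 2-D array rather than a 4-D one. The array names, the helper function ambiguity(), and the assumption that the \(\boldsymbol{U}\) grid coincides with the \(\boldsymbol{u}\) grid are our own illustrative choices, not a published implementation; in a real experiment the FFT sign conventions, and the sampling of \(\chi_{A}\) onto the measured \(\boldsymbol{U}\) values, must be handled with care.

```python
import numpy as np

def ambiguity(F):
    """chi_F[r, U] = sum_u F(u) conj(F(u - U)) exp(+i 2 pi r u / N), cf. (17.32).
    1-D sketch; the cyclic shift of np.roll stands in for the true lateral shift."""
    N = F.size
    chi = np.empty((N, N), dtype=complex)
    for iU in range(N):
        prod = F * np.conj(np.roll(F, iU))   # F(u) F*(u - U) with U = iU
        chi[:, iU] = np.fft.ifft(prod) * N   # e^{+i 2 pi r u} kernel
    return chi

def wdd_dset(I, A, eps=1e-3):
    """Minimal Wigner-distribution-deconvolution sketch (1-D analogue).
    I : measured intensities indexed [u, R] (detector pixel, probe position)
    A : known complex aperture/lens function sampled over u
    Returns the D-set, D[u, U] ~ Q(u) Q*(u - U), cf. (17.36)."""
    assert I.shape[0] == I.shape[1], "sketch assumes the u and U grids coincide"
    N = I.shape[0]
    # Step 1: Fourier transform over probe position R -> U : the G-set
    G = np.fft.fft(I, axis=1)                # G[u, U]
    # Step 2: Fourier transform over detector coordinate u -> r : the H-set
    H = np.fft.ifft(G, axis=0) * N           # H[r, U], e^{+i 2 pi r u} kernel as in (17.30)
    # Step 3: Wiener-filtered removal of the aperture term, following (17.35)
    #         (the sign convention of the A-shift in (17.30) vs (17.32) must be
    #         matched to the data in a real implementation)
    chi_A = ambiguity(A)
    chi_Q = np.conj(chi_A) * H / (np.abs(chi_A) ** 2 + eps)
    # Step 4: back-transform r -> u to obtain the D-set
    D = np.fft.fft(chi_Q, axis=0) / N
    return D
```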

The final step is to decide how we are going to handle the \(D\)-set, given by

$$D(\boldsymbol{u},\boldsymbol{U})=Q(\boldsymbol{u})Q^{*}(\boldsymbol{u}-\boldsymbol{U})\;.$$
(17.36)

It is bad enough thinking about what this represents in a 2-D plot, and it is even worse thinking about it in 4-D! In Fig. 17.93, we show our original interfering disc experiment next to the intensity of a diffraction pattern from a nonperiodic object. For a simple periodic object, the discs give us the phase between the unscattered beam and the scattered diffraction orders, i. e., between two points in the diffraction pattern indicated by the white arrow. However, in general, when the object is nonperiodic, the \(D\)-set gives us the phase difference between every single pixel in the diffraction pattern and every other single pixel. So, for our \(512\times 512\) scan with \(512\times 512\) detector pixels, we have 69 billion pairs of relative phases; six are illustrated in Fig. 17.93.
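As a quick check of that figure: the \(D\)-set has \(512^{2}\) values of \(\boldsymbol{u}\) and \(512^{2}\) values of \(\boldsymbol{U}\), so, counting ordered pairs of detector pixels,

$$N_{\mathrm{pix}}=512\times 512=262144\;,\qquad N_{\mathrm{pairs}}\approx N_{\mathrm{pix}}^{2}=262144^{2}\approx 6.9\times 10^{10}\;,$$

i. e., roughly 69 billion relative phases per scan.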

Fig. 17.93
figure 93

(a) With crystalline ptychography, it is easy to separate distinct diffraction orders, and so there is a very limited number of pertinent phase differences: one such diffraction order is labelled with a pointer, extending between the undiffracted disc and a first order diffraction disc. (b) For a continuous object, the \(D\)-set in (17.36) defines phase differences between every pixel of the diffraction pattern and every other pixel of the diffraction pattern: six such separations are shown here. In practice, there are many billions of such pairs of pixels

We should remember that there is a cut-off in the \(\boldsymbol{U}\)-direction of \(D\) because of the finite width of the aperture, hence the finite height of the Fat-H, so only the relative phases between points separated by less than this distance in reciprocal space can be measured. Nevertheless, all pixels, over the whole diffraction plane (including all the dark-field data lying outside the central disc) can be reached by taking multiple steps from one pixel to the next, where each step is smaller than the cut-off. In our optical crystalline example (Fig. 17.93), this is like hopping from one disc to the next. An experiment doing exactly this has been shown to work using electrons scattered from a silicon sample, thus obtaining an image (albeit only of a periodic crystal) at several times the intrinsic resolution of the lens used to form the focused probe [17.132, 17.133]. An optical crystalline experiment, stepping much further out into reciprocal space, has also been demonstrated [17.134].
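The stepping-out idea can be stated very compactly in code. The sketch below is a 1-D analogue operating on the \(D\)-set produced by the earlier sketch; the fixed integer step, the arbitrary starting phase of zero, and the absence of any bounds checking are our own illustrative simplifications.

```python
import numpy as np

def step_out_phases(D, step, n_steps, u0=0):
    """Chain relative phases out across reciprocal space (1-D sketch).
    D[u, U] ~ Q(u) Q*(u - U); 'step' must be smaller than the Fat-H cut-off,
    and u0 + n_steps*step must stay inside the array.
    Returns the accumulated phase of Q at u0, u0+step, u0+2*step, ..."""
    phases = [0.0]                        # phase at u0 set arbitrarily to zero
    u = u0
    for _ in range(n_steps):
        # D[u + step, step] = Q(u + step) Q*(u), so its argument is the phase increment
        dphi = np.angle(D[u + step, step])
        phases.append(phases[-1] + dphi)
        u += step
    return np.array(phases)
```

Each hop relies on the modulus of \(D\) at that pixel being large enough for its phase to be reliable, which is why errors accumulate when the chain is forced through weak regions of a nonperiodic diffraction pattern, as discussed next.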

This process of stepping out does not work well with nonperiodic objects. The steps must be taken via features of high modulus to reduce the accumulation of phase errors, and thus the method can only use a fraction of the available data. A much more effective solution is to use a projection method [17.72], which repeatedly sums together phase differences in the 4-D cube lying in planes of \(\boldsymbol{U}\), at the same time working out along the \(\boldsymbol{u}\)-direction. This makes full use of the data, but is beyond the scope of this chapter. For more details, see [17.72].

Finally we remark—perhaps the most important observation of all—that when we fully deconvolve the data, there is no restriction on \(Q\), and hence the object \(q\), being weak. From the point of view of the mathematics it can be as radically strong as we like, incorporating massive and abrupt phase changes and wild variations in modulus, and \(Q\) can extend as far out into reciprocal space as we like. Indeed, that was the original motive of WDD: to overcome the resolution limitation of the electron microscope lens. Of course, real specimens that are very strong tend also to have finite thickness. The subsections 3-D Imaging and The Bragg–Brentano Plane of Sect. 17.10.6 describe 3-D imaging from the bright-field data, but there has been no work done on the influence of 3-D scattering processes on dark-field WDD data, or on whether 3-D structure can be recovered using it. Figure 17.94 shows schematically how a strong object spans the \(D\)-set and the associated cut-off due to the height of the Fat-H.

Fig. 17.94
figure 94

The \(D\)-set for a strong object. The reader is encouraged to imagine the product of the two functions \(Q(\boldsymbol{u})Q^{*}(\boldsymbol{u}-\boldsymbol{U})\)

The WDD method was demonstrated with visible light in the 1990s [17.125, 17.126, 17.31]. There has also been one proof-of-principle demonstration at soft x-ray wavelengths [17.8]. There is now renewed interest at electron wavelengths [17.10, 17.9]; see also Figs. 17.24a-d and 17.25.

10.6 Other Aspects of WDD

10.6.1 Partial Coherence

The Wigner distribution is built around a correlation-type product, and so it is a powerful tool for quantifying and understanding partial coherence, which is itself a matter of statistical correlation. The same applies to WDD. Perhaps one of its most important characteristics is that, like modal decomposition in the iterative reconstruction methods (Sect. 17.8.2), it can remove the effects of partial coherence. This is not surprising—the data are the same, so the same information should exist within them. Many solutions of the phase problem start with the premise that the source and the interference processes are perfectly coherent. This is never quite true for short-wavelength sources (x-rays or electrons), and so we must pay close attention to any retrieval strategy that can remove the effects of partial coherence.

When the source is of finite size, and every point of emission on it is incoherent with respect to every other point on it, then the mutual complex degree of coherence lying over the lens aperture (which lies in the Fourier plane relative to the source) can be derived via the Van Cittert–Zernike theorem, and is given by

$$\Gamma(\boldsymbol{u})=\mathfrak{F}\{s(\boldsymbol{r})\}\;,$$
(17.37)

where \(s(\boldsymbol{r})\) is the intensity distribution of the source. Incorporating this into our WDD schema is mathematically tedious [17.110], so we simply state the result. Now, our final \(D\)-set is given by

$$D(\boldsymbol{u},\boldsymbol{U})=\Gamma(\boldsymbol{U})Q(\boldsymbol{u})Q^{*}(\boldsymbol{u}-\boldsymbol{U})\;,$$
(17.38)

a surprisingly simple equation. If we think of the image obtained from any one detector pixel at position \(\boldsymbol{u}\) as the probe is scanned, then a finite source will blur the coherent image, thus attenuating its high-frequency components. The amplitude of the \(D\)-set is then attenuated by \(\Gamma(\boldsymbol{U})\) in the \(\boldsymbol{U}\) (and only the \(\boldsymbol{U}\)) direction, because \(\boldsymbol{U}\) is the Fourier transform coordinate of the probe position. In principle, we can therefore divide \(D(\boldsymbol{u},\boldsymbol{U})\) by \(\Gamma(\boldsymbol{U})\) and restore a coherent data set. Other sources of incoherence, like chromatic spread or detector pixel size and, in the case of electron microscopy, instability in the lens power supplies, can also be mapped into the \(G\)-set [17.135].
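In code, this correction is a one-line division of the \(D\)-set along its \(\boldsymbol{U}\) axis. The sketch below uses the 1-D \(D\)-set indexing of the earlier sketches; the regularized (Wiener-style) form of the division is our own choice, made simply to avoid dividing by near-zero values of \(\Gamma\).

```python
import numpy as np

def coherence_correct(D, s, eps=1e-3):
    """Remove the source-coherence envelope from the D-set, cf. (17.37)-(17.38).
    D : D-set indexed [u, U] (1-D sketch)
    s : source intensity distribution s(r), sampled so that its FFT lives on the U grid
    Only the U direction is attenuated by Gamma, so only that axis is corrected."""
    gamma = np.fft.fft(s)            # Van Cittert-Zernike: Gamma(U) = F{ s(r) }
    gamma = gamma / gamma[0]         # normalise so that Gamma(0) = 1
    g = gamma[None, :]               # broadcast along the u axis
    return D * np.conj(g) / (np.abs(g) ** 2 + eps)
```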

10.6.2 More About Sampling and Probe Size

If the specimen is weak phase, we are by definition not interested in any scattered data lying outside the central undiffracted disc. The Fat-H derives from the assumption that the second-order cross-terms between the scattered amplitude of \(Q\) and \(Q^{*}\) are negligible; only \(Q(\boldsymbol{0})\) times \(Q(\boldsymbol{u})\) has significant value. Equivalently, the \(D\)-set only has amplitude along the two lines \(\boldsymbol{u}=\boldsymbol{0}\) and \(\boldsymbol{u}=\boldsymbol{U}\). This means that there is no opportunity for stepping out or the projection strategy mentioned in Sect. 17.10.5. Under these circumstances, the sampling in \(\boldsymbol{u}\) only has to be sufficient to adequately deconvolve the occluded aperture function ((17.29), the trotters). What is that sampling? It clearly must sample the trotters at a higher frequency than any modulus or phase structure within them. That is roughly the inverse of the probe size—i. e., the same sampling condition that applies to all other forms of ptychography. Actually, near the top of the Fat-H, where the trotters are tending towards delta functions, their Fourier transform is somewhat wider. However, the deconvolution only takes out aberrations and has the effect of performing an integration over the trotters, and so it does not need to be perfect.

Contrariwise, when we have a strong specimen, the whole plane of the \(D\)-set has significant amplitude. To cleanly undertake the deconvolution and then make use of all the phase differences in the \(D\)-set (at least when the object is nonperiodic), the sampling in \(\boldsymbol{u}\) must be the same as the sampling in \(\boldsymbol{U}\). The final result of the whole process, e. g., obtained via the projection method [17.72], is a single complex-valued diffraction pattern, plotted over \(\boldsymbol{u}\). Of course, the pitch of pixels in \(\boldsymbol{u}\) must, therefore, be the inverse of the whole field of view (not just the size of the probe). Meanwhile, the weak-phase object methods take all the reciprocal information from the \(\boldsymbol{U}\)-direction. This also has a pixel size that is the inverse of the field of view (as spanned by the probe positions), but the flexibility of having so much lower sampling in \(\boldsymbol{u}\) vastly reduces the demands on the size of the data set. There are possible solutions to this problem, say by tiling small fields of view, but at the time of writing, we are not aware that such alternatives have been explored.
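The difference in data volume is easy to appreciate with some made-up but representative numbers (the 64-pixel detector assumed for the weak-object case is purely illustrative):

```python
scan = 512 * 512        # probe positions; U sampling is the inverse of the field of view

# Weak-object WDD: the detector only has to resolve the trotters,
# i.e. sampling in u of roughly the inverse of the probe size, say 64 x 64 pixels.
weak = scan * 64 * 64

# Strong-object WDD: sampling in u must match the sampling in U,
# i.e. the inverse of the whole field of view, so a 512 x 512 detector.
strong = scan * 512 * 512

print(f"weak-object data set  : {weak:.1e} values")    # ~1.1e9
print(f"strong-object data set: {strong:.1e} values")  # ~6.9e10, 64 times larger
```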

Finally, we mention that the theory of WDD, at least for strong objects, depends on undertaking Fourier transforms over infinite limits or periodically repeating objects. For a continuous image, the data must be attenuated at the edge of the field of view by a soft window function, and even more space must be left within the unit cell to accurately account for the probe function as it scans up to the edge of the field of view. All this is tractable, but a reader wanting to try WDD must be aware of it. If the probe is a focused crossover, it is very small, so this is not a significant problem.

10.6.3 Probe Solution

The redundancy in the densely-sampled data set is extreme, and so it would be surprising if it were not possible to solve for the probe as well as the object function, as is routine when using iterative methods. Indeed, there is such a solution [17.31] (there must be many others awaiting discovery). It combines elements of blind deconvolution techniques with WDD. In short, whenever we have an estimate of \(A\), we can form the corresponding Wigner distribution (17.30) in the \(H\)-set. We divide as usual to solve for \(Q\), and then transform back to the \(\boldsymbol{u}\) coordinate, i. e., to the \(G\)-set. We then estimate \(Q\) from data lying along the \(\boldsymbol{U}\) coordinate. This is then used to form its Wigner distribution. Now the data in the \(H\)-set are divided by this estimate, to give an estimate of \(A\)'s Wigner distribution, and hence, after transforming back to the \(G\)-set, a new estimate of \(A\); and so on and so forth. The principle is that the convolution in the \(\boldsymbol{u}\)-direction must be consistent with the function estimates taken along the \(\boldsymbol{U}\) coordinate.
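The following sketch shows one way the alternation might be organized, reusing the ambiguity() helper defined in the earlier sketch and the same 1-D, matched-grid assumptions. Extracting \(Q\) from the line \(\boldsymbol{U}=\boldsymbol{u}\) and \(A\) from the line \(\boldsymbol{U}=-\boldsymbol{u}\) of the respective deconvolved sets is our own illustrative choice, not the published algorithm [17.31].

```python
import numpy as np

def blind_wdd(I, A0, n_iter=20, eps=1e-3):
    """Alternate between solving for the object Q and the aperture A (1-D sketch).
    I  : measured intensities indexed [u, R]; assumes the u and U grids coincide
    A0 : starting guess for the aperture/lens function A(u)
    Requires the ambiguity() helper defined above."""
    assert I.shape[0] == I.shape[1], "sketch assumes a square data set"
    N = I.shape[0]
    u = np.arange(N)
    H = np.fft.ifft(np.fft.fft(I, axis=1), axis=0) * N   # straight to the H-set [r, U]
    A = A0.copy()
    for _ in range(n_iter):
        # --- solve for Q given the current A, as in (17.35) ---
        chi_A = ambiguity(A)
        D_Q = np.fft.fft(np.conj(chi_A) * H / (np.abs(chi_A) ** 2 + eps), axis=0) / N
        Q = D_Q[u, u]                               # line U = u gives Q(u) Q*(0)
        Q = Q / (np.sqrt(np.abs(Q[0])) + 1e-12)     # fix modulus; global phase stays arbitrary
        # --- solve for A given the new Q ---
        chi_Q = ambiguity(Q)
        D_A = np.fft.fft(np.conj(chi_Q) * H / (np.abs(chi_Q) ** 2 + eps), axis=0) / N
        A = D_A[u, (-u) % N]                        # line U = -u gives A(u) A*(0)
        A = A / (np.sqrt(np.abs(A[0])) + 1e-12)
    return Q, A
```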

The method was demonstrated with an optical bench experiment, but given the dismal size of the data that could be gathered in 1993, the results were unimpressive.

10.6.4 3-D Imaging

Nellist and co-workers recently showed that, by applying WDD with probe functions constructed at different levels of defocus, slices can be selectively imaged from multiple layers of a thick, weak object [17.10]. This is not the same as solving for the image and then propagating to different defoci, in which case there would be Fresnel effects from out-of-focus layers. The method seems to pick out an actual plane within the object function. At the time of writing, the work is at a very early stage.

10.6.5 The Bragg–Brentano Plane

It was recognized in the work that first described the weak object approximation of the \(G\)-set [17.124] that there exist two lines in it (two planes in the 4-D data set) that have unique properties. They lie along \(\boldsymbol{U}=2\boldsymbol{u}\) and \(\boldsymbol{U}=-2\boldsymbol{u}\). They contain identical information, because one is just the complex conjugate of the other. No matter what the aberrations in the aperture may be, if they are symmetric (which they often are), then the central value of the trotters, which lie along these lines as illustrated in Fig. 17.87, is always real and unity, because the complex conjugate components of the symmetric aperture functions cancel each other. This is also true for defocus, which implies that an image formed from these data alone will have, in theory, infinite depth of field. It will be a projection of the object.

Another way of understanding this is that the center of the trotters arises from interference between an incident beam at, say, \(\boldsymbol{g}\) and a scattered beam at \(-\boldsymbol{g}\). In conventional x-ray diffraction, the specimen is often rotated at half the angular speed of the detector, so that the normals of the Bragg planes remain parallel to a fixed direction within the specimen. In this way, a flat slice is taken out of 3-D reciprocal space, instead of scattering over the curved Ewald sphere, which makes the analysis of the results much easier. A plane in 3-D reciprocal space corresponds to a 2-D projection in real space. The information along these special planes in the Fat-H is similarly symmetric and so can also pick out a projection of the object. This projection phenomenon has been experimentally demonstrated on the optical bench [17.127]. Calculations using Bloch waves for crystalline specimens also indicated that this plane of data is relatively immune to dynamical (multiple) scattering effects, at least compared with the bright-field image [17.128].
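In the 1-D analogue of the \(D\)-set used in the sketches above, picking out this special plane is just a matter of indexing; the cyclic indexing and the function name are our own shorthand.

```python
import numpy as np

def bragg_brentano_line(D):
    """Extract the U = 2u line of the D-set (1-D sketch).
    Since D[u, U] ~ Q(u) Q*(u - U), along U = 2u we obtain Q(u) Q*(-u):
    interference between beams at +u and -u, the condition that is insensitive
    to symmetric aberrations and defocus."""
    N = D.shape[0]
    u = np.arange(N)
    return D[u, (2 * u) % N]
```

In the weak-object limit, an image formed from this line alone is, in principle, a projection of the object with infinite depth of field, as described above.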

10.6.6 Probe Complexity and Noise Suppression

As mentioned in Sect. 17.5.6, the Wigner deconvolution can be used to explore optimal probes in ptychography. It would seem logical that if the \(\chi_{A}\) function has few low modulus areas, then the deconvolution should be more stable. This would appear to be the case. Other noise suppression strategies can be employed to avoid low values of \(\chi_{A}\) by using redundancy in the data. For more information on these issues, see [17.72].

11 Conclusions

This chapter was intended as an elementary introduction to the subject of ptychography. We have also tried to give a flavor of recent developments in each of the many diverse areas of the subject. It is not complete; since the subject took off in 2007, there have been more than 600 papers published on the technique. We have necessarily been selective, reporting on what we think are the most significant aspects of the technique. Other authors would certainly take a different perspective. A previous review chapter was written only a few months after the first iterative phase retrieval ptychography images were published [17.5]. By the time it was in print it was already out of date. Now, 10 years later, the developments in ptychography, some astonishing, continue to pour out of research groups around the world. The literature is expanding exponentially.

Fourier ptychography is undoubtedly under-represented here. Since its appearance in its modern form in 2013, it has quickly covered all the ground previously addressed in real-space ptychography and is now pushing ahead, creating an independent field. Several groups are very active as we write, publishing new algorithms and new variants of the technique. We also did not have space to cover optical encryption with ptychography [17.136], nonlinear ptychographical imaging [17.137], important developments in incoherent ptychography [17.92], or the many other refinements of experimental configuration and associated inverse algorithms.

Enabling technologies like microscopy usually follow a common development pattern. First, the technology is invented and shown to work for simple test specimens; ptychography is well past this stage in visible light, EUV, x-ray and electron imaging. Next, the method is applied to solve a scientific problem that is ideally suited to the technique; this has been achieved in x-ray and electron ptychography . The method is then applied to answer scientific problems that can only be solved by the particular method; this is probably true in the cases of high-resolution x-ray ptycho-tomography, Bragg ptychography, and spectro-ptychography. Finally, the method becomes widely adopted as a standard part of wider scientific investigations, to the extent that its use is regarded a normal component of scientific investigation, fully exploiting its niche capabilities.

As yet, ptychography is not quite at that final stage of maturity. It is most advanced in x-ray imaging. However, so long as it remains confined to the synchrotrons, it can never be very widely used; there just is not enough beamtime in the world, even though fourth-generation synchrotrons will greatly speed up ptycho-tomography. The rapid advance of table-top sources, some of which are very coherent, may bring about a step change in its usage at EUV or x-ray wavelengths in the ordinary laboratory. This may allow it to make a very big impact in all sorts of material and biological studies.

We can make one very reliable prediction. No one is going to throw away their aberration-corrected electron lenses, x-ray Fresnel lenses, KB mirrors, or high-resolution optical lenses. There are many indispensable sources of image contrast that will never be delivered by even the cleverest computational optics. The most compelling use of a STEM aberration corrector is the ability to capture material-specific signals, like x-ray spectra and electron energy loss spectra. Modern machines can detect the elemental type of every single atom, at least in a two-dimensional, atomically thin structure [17.138]. The same applies in x-ray optics, where scanning focused probes can also resolve material-specific x-ray fluorescence, e. g., [17.139]. Materials scientists crave elemental and bonding information. They regard a scanning electron microscope (SEM) as virtually useless if it does not have an x-ray detector installed on it, despite the fact that modern SEMs can achieve subnanometer resolution with ease. Who wants just an image of a specimen, when it is possible to know what element every bit of it is made from? Similarly, confocal visible-light microscopy is nowadays indispensable to vast areas of biological research, again relying on excellent lenses to focus a beam onto fluorescent dyes that can spatially resolve the active sites of specific proteins and other molecules. Lenses are here to stay.

However, ptychography will find its niche, probably at all wavelengths, and it has many new things to look forward to. The ability to image state mixtures must have huge potential applications, although where this will emerge most effectively is hard to predict. 3-D imaging of the refractive index of unstained biological objects must also be ripe for exploitation. We know also that many electron microscopists dream of a very simple electron ptychography microscope. This would comprise a source, one lens, a detector, and a computer. However, there are difficulties. Electron ptychography is hard to do quantitatively without a good detector—preferably with single-event counting. (The existence of such detectors at hard x-ray energies partly accounts for its success in that field.) However, such electron detectors are expensive (\(\$\mathrm{700000}\) or so), which rather negates the idea of a low-cost, high-resolution table-top TEM. Yet who knows? There may well be a market for such a machine as detector technology becomes less expensive, which it inevitably will. Some x-ray ptychographers assert that it will eventually enable atomic resolution, at the same time overcoming the penetration limits of electron microscopy. We are sceptical; the information per damage event for x-rays is much lower than for electrons [17.140], but we would not discourage anyone from trying!

Finally, we remark, again, that there remains one very fat and large elephant in the room. It has so far been impossible to prove mathematically that ptychography works. Despite its ability to skip over the phase problem with such nonchalant ease, it still relies on inverting a highly nonlinear set of measurements. So yes, even the simplest heuristic algorithms give good pictures quickly and easily, but proving definitively why they do so is difficult. Even the most advanced algorithms have to make some assumptions. Luckily, applied mathematicians are slowly having their attention drawn to this rich and interesting field; ptychography needs them!

What next? Ptychography with neutrons? Surely the source size is far too incoherent and the interaction cross-section is far too small? However, given the advances in the last 10 years, we have learnt not to discount anything.