
5.1 Introduction

The interaction of short wavelength and strong field lasers with atoms and small molecules can be described in the language of a control problem: That is, the electromagnetic field produced by the laser exerts control over the quantum evolution of the system under study. By exerting this control in a predetermined way, the system will evolve to some desired final state, such as dissociative products or photoelectrons in a particular direction. This chapter uses this point of view to introduce the subject of ultrafast short wavelength strong field interactions. This subject is quite broad and deep, and this chapter can only discuss a few interesting aspects of it. Readers who wish to learn more should consult some of the review papers and textbooks on this subject [1].

To place this in its simplest context, consider the Schrödinger equation for a molecule in an open environment, and in the presence of a laser field. The Hamiltonian can be considered as the sum of three parts:

$$ ({H_{\textit{mol}}}+{H_{\textit{env}}}+{H_{\textit{laser}}})\psi (\vec{x},t)=i\dot{\psi}(\vec{x},t) $$
(5.1)

(Note that this is written using atomic units, in which \( \hbar ={m_e}=\left| e \right|=1 \)). The first term \( {H_{\textit{mol}}} \) is the single isolated molecule Hamiltonian in all its glory, including all kinetic energy terms, Coulomb interactions among electrons and ions, spin-orbit interactions, and hyperfine interactions. This Hamiltonian has a ground state as well as bound and continuum excited states with eigenenergies \( {E_n} \) which satisfy Schrödinger’s time-independent equation \( {H_{\textit{mol}}}{\psi_n}(\vec{x})={E_n}{\psi_n}(\vec{x}) \). The second and third terms describe “everything else,” and here we must adopt a particular point of view in order to separate this into two parts: one due to the laser, and one due to the “environment.” For the purposes of this discussion the laser term \( {H_{\textit{laser}}} \) will be taken to include both the laser energy itself as well as its interaction with the molecule. Control is exerted on the quantum system by means of this term.

In order to utilize \( {H_{\it{laser}}} \) to control the system, we therefore require that it satisfy two criteria:

  • \( {H_{\textit{laser}}}>{H_{\textit{env }}} \), i.e. the forces on the system due to the laser must exceed the forces of the environment, or other forces within the molecule that lead to dephasing of the wave packet away from the optimal path.

  • \( {\tau_{\textit{laser}}}<\hbar /\Delta E \), i.e. the time of the interactions must be less than the natural dynamical timescale for the system, given by the energy splitting. This timescale depends on the context. For example, in many molecular problems the relevant timescale is set by the relative motion of the atoms in the molecule, which is typically on the order of hundreds of femtoseconds. In other problems, the relative timescale may be due to the motion of molecular electrons, which can have dynamics on the few femtosecond or sub-femtosecond scale.

Thus control requires both ultrafast pulses and strong fields, to beat the tendency towards decoherence and to overcome natural dephasing. Some general protocol is required to generate a laser pulse corresponding to \( {H_{\textit{laser}}} \) that is optimal for a given problem.

5.2 Control Schemes

There are two different types of proposals for quantum control, known generally by the names of their inventors: the “Tannor-Rice” scheme and the “Brumer-Shapiro” scheme. Tannor, Rice, and the team of Brumer and Shapiro have each written textbooks or monographs including descriptions of control from their points of view, and these are excellent introductions [2–4]. Here we will merely summarize the essential features of these schemes.

5.2.1 Born-Oppenheimer Surfaces

Many readers of this chapter may already be completely familiar with the Born-Oppenheimer approximation in molecules, but since this chapter is intended for a broader group of laser scientists, we include a “one-page course” on the essential physics of this powerful idea. A very readable and more complete treatment is given in a review paper by Worth and Cederbaum [5]. Consider a molecule with N atoms and many electrons. The quantum Hamiltonian of this molecule can be separated into a kinetic energy part \( T_n \) for the atoms, whose relative positions are described by a 3N-6 dimensional vector R (3N-5 for a linear molecule such as a diatomic); plus all of the terms describing the kinetic energies of the electrons at their different positions r and the electrostatic potential energies between all of the pairs of charged particles. It can be written in a schematic way like this:

$$ H(\mathbf{R},\mathbf{r})={T_n}(\mathbf{R})+{H_{el }}(\mathbf{R},\mathbf{r}) $$
(5.2)

The Born-Oppenheimer approximation is motivated by the insight that the atoms usually move much more slowly than the lighter electrons, so that the electronic structure can be calculated assuming that the atomic positions are frozen. Thus the \( T_n \) term may be neglected in a first approximation, and R becomes a set of parameters describing the positions of the atomic nuclei in a simpler eigenvalue equation involving only the electrons:

$$ {H_{el }}{\phi_i}(\mathbf{r};\mathbf{R})={V_i}(\mathbf{R}){\phi_i}(\mathbf{r};\mathbf{R}) $$
(5.3)

The electronic energy eigenvalues \( {V_i}(\mathbf{R}) \) are the potential energy surfaces of the Born-Oppenheimer approximation. Although they are often depicted as one-dimensional potential wells or two-dimensional “elastic sheets,” they are really high dimensional manifolds spanning all of the different relative positions of all of the atoms in the molecule.

The \( {\phi_i}(\mathbf{r};\mathbf{R}) \) form a real orthonormal basis on which the full solution to the molecular wave function may be expanded:

$$ \psi (\mathbf{R},\mathbf{r})=\sum\limits_i {{\chi_i}(\mathbf{R}){\phi_i}(\mathbf{r};\mathbf{R})} $$
(5.4)

Combining this expression for \( \psi \) with the molecular Hamiltonian, we can express the full Schrödinger’s equation \( H\psi =i\hbar \dot{\psi} \). Then we employ the familiar algebraic procedure to find the coefficients \( \chi (\mathbf{R}) \), i.e. multiply on the left by each of the \( {\phi_i}^* \) and integrate. This yields a set of coupled equations that look almost like a series of Schrödinger’s equations for the atoms moving on the potential energy surfaces \( {V_i}(\mathbf{R}) \), with wave functions that are the expansion functions \( {\chi_i} \):

$$ [{T_n}+{V_i}]{\chi_i}+\sum\limits_j {{\varLambda_{ij }}{\chi_j}=i\hbar {{\dot{\chi}}_i}} $$
(5.5)

The different \( {\chi_i} \)’s are coupled together by the non-adiabatic coupling operators \( {\Lambda_{ij }} \), but in regions of nuclear coordinate space where this can be neglected, the nuclear wave functions may be considered as standing or traveling waves on separate potential energy surfaces V. This is the picture of intramolecular structure and dynamics that guides much of the thinking about control in molecules.
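
To make Eq. (5.5) concrete, here is a minimal numerical sketch, assuming a single Born-Oppenheimer surface with the non-adiabatic couplings \( \Lambda_{ij} \) neglected: a nuclear wave packet \( \chi(R,t) \) is propagated on a one-dimensional Morse potential using the standard split-operator FFT method. The grid, mass, and Morse parameters are illustrative choices, not taken from any particular molecule.

```python
import numpy as np

# Nuclear wave packet chi(R, t) on one Born-Oppenheimer surface:
# Eq. (5.5) with the couplings Lambda_ij neglected. Atomic units (hbar = 1).
N = 1024
R = np.linspace(0.5, 10.0, N)
dR = R[1] - R[0]
k = 2 * np.pi * np.fft.fftfreq(N, d=dR)        # momentum grid for the FFT
m = 1836.0                                     # nuclear mass (~1 amu in a.u.)
D, a, R0 = 0.1, 1.0, 2.0                       # illustrative Morse parameters
V = D * (1 - np.exp(-a * (R - R0))) ** 2       # potential energy surface V(R)

# Initial wave packet: a Gaussian displaced from the minimum, roughly what a
# sudden (Franck-Condon) excitation would produce.
chi = np.exp(-10 * (R - 2.5) ** 2).astype(complex)
chi /= np.sqrt(np.sum(np.abs(chi) ** 2) * dR)

dt = 10.0                                      # time step (a.u.)
expV = np.exp(-0.5j * V * dt)                  # half-step potential propagator
expT = np.exp(-0.5j * k ** 2 / m * dt)         # full-step kinetic propagator

for _ in range(500):                           # Strang splitting: V/2, T, V/2
    chi = expV * np.fft.ifft(expT * np.fft.fft(expV * chi))

print("norm =", np.sum(np.abs(chi) ** 2) * dR)       # stays ~1 (unitary)
print("<R>  =", np.sum(R * np.abs(chi) ** 2) * dR)   # mean bond length
```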

5.2.2 Limits to Adiabaticity, and Conical Intersections

The Born-Oppenheimer approximation breaks down when the non-adiabatic coupling operators \( {\Lambda_{ij }} \) cannot be ignored or treated as a small perturbation. These couplings come about because the wave function \( \psi \) is not a simple product state \( {\chi_i}(\mathbf{R}){\phi_i}(\mathbf{r}) \) for a single potential energy surface, and they become large when two different potential energy surfaces are nearly degenerate.

The usual procedure for treating two degenerate eigenvalues in stationary state perturbation theory can be employed in this case: We consider a subspace that contains only the two degenerate \( {\phi_i} \) states, and diagonalise the Hamiltonian including the off-diagonal couplings, but restricted to that subspace. For degeneracy between two potential energy surfaces, which is the usual case, this is a 2-state problem. The Hamiltonian takes on the following form in the Born-Oppenheimer basis: For R such that \( \left| {{V_1}(\mathbf{R})-{V_2}(\mathbf{R})} \right|\leq \left| {{\varLambda_{12 }}} \right| \),

$$ {H_{12 }}(\mathbf{R})=\left[ {\begin{array}{*{20}{c}} {{V_1}} & {{\Lambda_{12 }}} \\ {{\Lambda_{21 }}} & {{V_2}} \\ \end{array}} \right]\approx \left[ {\begin{array}{*{20}{c}} {{V_1}} & {\left\langle {{\phi_2}} | {\nabla {\phi_1}} \right\rangle } \\ {\left\langle {{\phi_1}} | {\nabla {\phi_2}} \right\rangle } & {{V_2}} \\ \end{array}} \right] $$
(5.6)

In simple one-dimensional problems the eigenvalues of this matrix break the degeneracy so the energies display an “avoided crossing.” True degeneracy requires two independent conditions for H: \( {V_1}(\mathbf{R})={V_2}(\mathbf{R}) \); and \( {\varLambda_{12 }}(\mathbf{R})=0 \). So if R is just a single degree of freedom, there is usually no value for R that will satisfy both conditions.

In molecules, however, R spans 3N-6 degrees of freedom corresponding to all of the relative positions of all the atoms. The energy crossing condition \( {V_1}(\mathbf{R})={V_2}(\mathbf{R}) \) by itself provides only one constraint. Specifically, the locus of degenerate points between these two surfaces in the 3N-6-dimensional nuclear coordinate space fills a subspace with one dimension less, i.e. 3N-7 dimensions. At each position R in this subspace, \( H_{12} \) takes on the simple form:

$$ {H_{12 }}(\mathbf{R})=\left[ {\begin{array}{*{20}{c}} E & {\left\langle {{\phi_2}} | {\nabla {\phi_1}} \right\rangle } \\ {\left\langle {{\phi_1}} | {\nabla {\phi_2}} \right\rangle } & E \\ \end{array}} \right] $$
(5.7)

where E is the degenerate energy.

The off-diagonal terms can now break the degeneracy, but not everywhere in the remaining space, rather only where \( \left\langle {{\phi_1}} | {\nabla {\phi_2}} \right\rangle \) is non-zero. The gradient coupling defines a direction where the degeneracy is lifted. In all 3N-8 orthogonal directions that are still in the \( {V_1}(\mathbf{R})={V_2}(\mathbf{R}) \) subspace the degeneracy is maintained. Thus crossings of potential energy surfaces are not avoided in molecules with three or more atoms; in fact, degeneracies are common. This dimensional argument was first made by Wigner, von Neumann and Teller [6, 7]. The degeneracies are called Conical Intersections. In the two-dimensional space where the degeneracies are lifted, the potentials appear as two cones with a common vertex as shown in Fig. 5.1, which gives this feature its name.

Fig. 5.1

Conical Intersection between two potential energy surfaces
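
The geometry near such a degeneracy can be illustrated with a minimal two-state sketch, assuming the standard linear vibronic coupling form in the two branching coordinates (x, y): the diagonal energies split linearly along x, and the off-diagonal coupling grows linearly along y. The coupling constants are arbitrary illustrative values.

```python
import numpy as np

# Linear vibronic coupling model of a conical intersection in the two
# branching coordinates (x, y); parameters are illustrative only.
kappa1, kappa2, lam = 1.0, -0.5, 0.8

def adiabatic_energies(x, y):
    """Eigenvalues of the 2x2 diabatic Hamiltonian at nuclear position (x, y)."""
    H = np.array([[kappa1 * x, lam * y],
                  [lam * y,    kappa2 * x]])
    return np.linalg.eigvalsh(H)            # (lower, upper) adiabatic energies

# Along any ray through the origin both surfaces are linear in the distance
# from the degeneracy, so together they trace out the double cone of Fig. 5.1.
for r in (0.0, 0.1, 0.2):
    lower, upper = adiabatic_energies(r, r)
    print(f"r = {r:.1f}:  E- = {lower:+.3f},  E+ = {upper:+.3f}")
```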

5.2.3 Tannor-Rice

David Tannor and Stuart Rice have introduced an intuitive concept for control based on the motion of coherent wave packet states in laser-excited molecules. This is often referred to as “pump-dump” control [8]. The essence of this scheme is demonstrated by considering two Born-Oppenheimer potential energy surfaces in a molecule. We consider a model molecule consisting of three atoms, A, B, and C, which have a ground state in which the three are bound together in the order ABC. There are 3N-6 = 3 dimensions in the Born-Oppenheimer potential energy surfaces, which could, for example, be designated as the AB bond length (\( R_1 \)); the BC bond length (\( R_2 \)); and the ABC bend angle (\( R_3 \)). The coherent control problem is to photoexcite the molecule in such a way as to control whether A or C dissociates from the other two atoms. The Born-Oppenheimer surface for the ground state in the \( R_1 \)-\( R_2 \) plane might look like Fig. 5.2.

Fig. 5.2

Model ground state potential energy surface for a triatomic molecule ABC with two bond lengths: \( R_1 \) is the AB bond, and \( R_2 \) is the BC bond

Note that there are two low places in the energy landscape, where A may dissociate from BC (along \( R_1 \)), or C may dissociate from AB (along \( R_2 \)). The initial state of the molecule is presumed to be the ground vibrational state, which is well-localized in the region of the potential minimum at \( (R_1, R_2) = (2,2) \). The control task is to guide the system toward one dissociation channel using coherent light. The Tannor-Rice scheme employs an excited electronic state to facilitate control. This has a different potential energy surface, such as the one shown in Fig. 5.3.

Fig. 5.3

Model excited state potential for the molecule ABC whose ground state is shown in Fig. 5.2

Tannor-Rice starts with an initial ultrafast photoexcitation pulse, which transfers some of the probability amplitude residing in the ground state to the excited state, with the atoms in the same position, i.e. near (2,2). Note that this does not correspond to the bottom of the excited potential well. Therefore the probability distribution that had been a stationary state in the ground potential is still localized but no longer stationary in the excited state potential. The wave packet then evolves under the forces that are indicated by the potential gradient, and is de-excited back to the ground state at a later time T. Control of the exit channel depends on T:

As shown in Fig. 5.4, dissociation is accomplished by excitation by a short pulse at t = 0 (1 → 2), followed by evolution on the excited state surface (2 → 3), followed by de-excitation at a later time T (3 → 4). Control is through the choice of dump time T. The simple premise of pump-dump control is that the initial molecular state can be selectively transferred to a target state simply by choosing the coherence properties (in this example, the time of arrival) of an interacting laser field.

Fig. 5.4

The path of the pump-dump control scheme is shown above. The molecule is initially at location 1. It is pumped to 2, and then evolves on the excited surface to 3, where it is dumped to 4 and dissociates. The output channel is selected by controlling the time between the pump and dump pulses

5.2.4 Brumer-Shapiro

In the alternative Brumer-Shapiro route to quantum control, the final state specificity comes from the interference between multiple pathways in multiphoton transitions from the initial state to the final state. This method emphasizes the role of phase coherence in the control mechanism. Brumer-Shapiro control is multimode interference between stationary states, whereas Tannor-Rice control is wave packet evolution of nonstationary states. At first glance it might appear that the control mechanisms are totally different, since Tannor and Rice emphasize ultrafast pump and dump pulses to create wave packets that consist of many different eigenstates, whereas Brumer and Shapiro seem to deal with narrowband resonant processes with good phase control and excitation in two competing pathways that interfere. In fact, in many respects these are just two different views, time and spectral, of the same process. A simple example of how the Brumer and Shapiro idea can be connected to ultrafast excitation is the two-photon process depicted below in Fig. 5.5.

Fig. 5.5

In this model illustrating Brumer-Shapiro control, an initial state A may evolve to either B or C depending on the properties of the applied laser fields. The final state is selected by controlling interfering pathways in the quantum evolution of the system

We assume that the bandwidth of the pulses is sufficiently broad that A → B and A → C can both occur, and the control problem is whether we can use multipath interference to select only one of these alternatives. The answer is that we can control the outcome without changing the power spectrum of the laser pulse, and we do this by controlling the spectral phase of the light. (Spectral phase is an important control feature of ultrafast light pulses, and is covered in some detail in other chapters of this volume, such as Chap. 1 by Wyatt and Professor Walmsley, so I will assume the reader has some familiarity with it.) The use of phase is seen clearly by considering the two-photon process within the formalism of time-dependent perturbation theory, where a non-resonant two-photon excitation is driven by an operator proportional to \( {E^2}(t) \). The Fermi’s Golden Rule transition probability between the ground state and any excited state is then proportional to the spectral power of \( {E^2}(t) \), i.e. to the square of its Fourier transform. Let’s take a simple pictorial example: Consider a short pulse \( E(t) \), with a real Fourier transform and a power spectrum as shown in Fig. 5.6.

Fig. 5.6

The four panels show the electric field of a transform-limited ultrafast pulse; the spectrum of the pulse; the squared field; and the spectrum of the squared field

The spectrum of a transform-limited (TL) pulse is a smooth bell-shaped function centred on the central wavelength, and so is the power spectrum. The power spectrum of \( {E^2} \) is also shown in the last panel on the right of Fig. 5.6, overlapped with the level diagram to show how both transitions lie within the two-photon bandwidth.

Spectral phase shaping changes the situation dramatically, as shown in Fig. 5.7. Here we display the same spectral content, but the frequencies above the median are reversed in sign, i.e. advanced in phase by π. The power spectrum, which only depends on the magnitude of the spectral components, is unchanged. However, the nonlinear spectrum of \( {E^2} \) is substantially altered, so that the spectral power at one of the two two-photon transition frequencies is enhanced while the power at the other nearly vanishes. The multipath interference responsible for this consists of not just two paths, but a continuum of different paths, each taking two photons from the pulse whose frequencies sum to the required transition frequency.

Fig. 5.7

These panels show the same progression as in Fig. 5.6, but for a phase-shaped pulse. The spectrum of the squared field displays phase-dependent features that affect two-photon absorption

Pulse shaping control of this sort has been seen in laboratory experiments, and as the simple example above suggests, the effects can be quite large.
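
The following sketch reproduces the essence of Figs. 5.6 and 5.7 numerically, under illustrative assumptions (a Gaussian pulse in arbitrary units, with the π phase step placed at the carrier frequency): the linear power spectrum is untouched by the phase mask, while the spectrum of \( E^2(t) \), which drives the non-resonant two-photon transitions, is redistributed.

```python
import numpy as np

# A pi phase step leaves |E(w)|^2 unchanged but reshapes the spectrum of
# E^2(t). All frequencies and widths are illustrative (arbitrary units).
N = 4096
t = np.linspace(-200.0, 200.0, N, endpoint=False)
dt = t[1] - t[0]
w = 2 * np.pi * np.fft.fftfreq(N, d=dt)

w0, sig = 2.0, 0.05                                 # carrier, spectral width
field = np.exp(-(sig * t) ** 2 / 2) * np.exp(1j * w0 * t)   # TL pulse

spec = np.fft.fft(field)
shaped_spec = np.where(w > w0, -1, 1) * spec        # pi step above the carrier
shaped = np.fft.ifft(shaped_spec)

# The linear power spectra are identical ...
assert np.allclose(np.abs(spec) ** 2, np.abs(shaped_spec) ** 2)

# ... but the two-photon spectra |FT[E^2]|^2 are not.
for name, f in (("TL    ", field), ("shaped", shaped)):
    tp = np.abs(np.fft.fft(f ** 2)) ** 2            # spectrum of the squared field
    on = np.argmin(np.abs(w - 2 * w0))              # at twice the step frequency
    off = np.argmin(np.abs(w - (2 * w0 + 2 * sig))) # a detuned two-photon line
    print(name, " on-step:", tp[on].round(1), " detuned:", tp[off].round(1))
```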

To conclude, these examples show that ultrafast pulse shaping can be an effective means to control quantum processes. Both Tannor-Rice and Brumer-Shapiro provide ways of thinking about quantum control, but neither comes to terms with the problem of how to find the optimal pulse sequence or phase shape to effect control. For that we seek some search protocol or algorithm that can be used to model general systems, to examine their controllability and to find the best control parameters.

5.3 Optimal Control Theory

One such protocol is Optimal Control Theory. We begin with a general statement of the problem: Given \( {\psi_i}(t) \) and \( H={H_0}+H^{\prime}\left[ {\varepsilon (t)} \right] \), we want to adjust \( \varepsilon (t) \) to create a desired target state \( {\phi_f}(T) \). Furthermore, we want to find the “best” path, i.e. the path that minimizes some cost, such as the total energy consumed. How do we do it?

This is a deep subject, and there are many deep thinkers who have influenced it. An excellent introduction is the textbook by Tannor [9]. The best we can do here is to provide some basic information. We will start with some simplifying assumptions: Two potential energy surfaces in the Born-Oppenheimer approximation; and pure states (i.e. no decoherence or thermal averaging) so that we can use the Schrödinger equation without modifications.

The “pump-dump” picture and the “two-path interference” picture both suggest that a perturbative approach taken to second order (for two interactions with the laser field) should suffice to describe the process of control. However, following Tannor, we will take a more general approach that allows for strong fields and any number of photons.

Our objective can be quantified as the overlap integral, or projection, of the system \( \psi (T) \) on the target state \( {\phi_f} \), i.e.

$$ J=\mathop{\lim}\limits_{{T\to \infty }}{{\left| {\left\langle {{\phi_f}} | {\psi (T)} \right\rangle } \right|}^2}=\mathop{\lim}\limits_{{T\to \infty }}\left\langle {\psi (T)} \right|{P_f}\left| {\psi (T)} \right\rangle $$
(5.8)

where \( {P_f}=\left| {{\phi_f}} \right\rangle \left\langle {{\phi_f}} \right| \) is the projection operator for channel f, e.g. a particular final state channel. J is called the “objective functional” for the optimal control problem.

The system state function \( \psi (t) \) is evolving according to some \( H(t) \) that depends explicitly on the history of the applied laser field \( \varepsilon (t) \) at all times t < T:

$$ \left\langle {\psi \left[ {\varepsilon (t)} \right](T)} \right|{P_f}\left| {\psi \left[ {\varepsilon (t)} \right](T)} \right\rangle $$
(5.9)

There may also be some constraints. For example, perhaps we need to do this job with finite total laser pulse energy \( \int\limits_0^T {dt{{{\left| {\varepsilon (t)} \right|}}^2}} \). To implement this constraint, we modify the objective functional by decreasing its value in proportion to the pulse energy, as a penalty:

$$ \overline{J}\equiv \mathop{\lim}\limits_{{T\to \infty }}\left\langle {\psi (T)} \right|{P_f}\left| {\psi (T)} \right\rangle -\lambda \int\limits_0^T {dt{{{\left| {\varepsilon (t)} \right|}}^2}} $$
(5.10)

where \( \lambda \) is a Lagrange multiplier.

The wave function must obey Schrödinger’s equation, but this may also be enforced as a constraint, following Kosloff [3], so that we have:

$$ \overline{J}\equiv \mathop{\lim}\limits_{{T\to \infty }}\left\langle {\psi (T)} \right|{P_f}\left| {\psi (T)} \right\rangle +2\operatorname{Re}\left[ {\int\limits_0^T {dt\left\langle {\chi (t)} \right|\left( -\frac{\partial }{{\partial t}}+\frac{{H(\varepsilon (t))}}{{i\hbar }} \right)\left| {\psi (t)} \right\rangle } } \right]-\lambda \int\limits_0^T {dt\,{{{\left| {\varepsilon (t)} \right|}}^2}} $$
(5.11)

The “dual” wave function \( \chi (t) \) plays the role of the Lagrange multiplier. Clearly the second term vanishes for any \( \psi (t) \) which satisfies Schrödinger’s equation, independent of \( \chi (t) \). Notice, by the way, that the opposite is also true: if \( \chi (t) \) obeys Schrödinger’s equation, then the second term vanishes. A convenient physical interpretation for \( \chi (t) \) is the projection of \( \psi (T) \) on the target state, i.e.

$$ \chi (T)={P_f}\psi (T) $$
(5.12)

We can use transitions between two states of the system, the ground state (“g”) and an excited state (“e”), just as in the Tannor-Rice example. The Hamiltonian takes on a simple 2 × 2 form:

$$ H\equiv \left[ {\begin{array}{*{20}{c}} {{H_g}} & {-\mu \varepsilon^*(t)} \\ {-\mu \varepsilon (t)} & {{H_e}} \\ \end{array}} \right],\;\mathrm{with}\;\psi =\left[ {\begin{array}{*{20}{c}} {{\psi_g}} \\ {{\psi_e}} \\ \end{array}} \right],\;\chi =\left[ {\begin{array}{*{20}{c}} {{\chi_g}} \\ {{\chi_e}} \\ \end{array}} \right] $$
(5.13)

The job now is to find the extremum value of J by varying \( \psi (t) \), \( \psi (T) \), \( \varepsilon (t) \), and \( \varepsilon^*(t) \). The first two variational equations simply direct us to require that \( \chi (t) \) satisfies Schrödinger’s equation and \( \chi (T) \) is the projection of \( \psi (T) \) on the target state, as we supposed from inspection of the form of J. The variations with respect to the field are more interesting. They lead to the principal equation of optimal control:

$$ \varepsilon (t)=\frac{i}{{\hbar \lambda }}\left[ {\left\langle {{\chi_g}(t)} \right|\mu \left| {{\psi_e}(t)} \right\rangle -\left\langle {{\psi_g}(t)} \right|\mu \left| {{\chi_e}(t)} \right\rangle } \right] $$
(5.14)

This equation states a relationship between the field, the wave function, and the “dual” function at every time during the quantum evolution of the state. The target state appears implicitly because of its connection to the dual function \( \chi (T) \), and this provides a clue about how to solve this equation iteratively:

  1. Start with the initial state \( \psi (t=0) \) and a simple guess for the field \( \varepsilon (t) \).

  2. Propagate the wave function using \( H[\varepsilon (t)] \) from t = 0 to t = T.

  3. Invoke \( \chi (T)={P_f}\psi (T) \).

  4. Now propagate \( \chi (t) \) backwards in time from T to 0 using \( H[\varepsilon^*(t)] \).

  5. Now armed with both \( \chi (t) \) and \( \psi (t) \), we may calculate a new \( \varepsilon (t) \) from (5.14). This will generally be different from the first field that was chosen. But this new field can now be employed in steps 2–5, repeating the iterations until convergence. A minimal numerical sketch of this loop is given below.
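
Here is that sketch, for the two-level Hamiltonian of Eq. (5.13), with illustrative scalar “surfaces” \( H_g \) and \( H_e \), a dipole \( \mu = 1 \), and a crude Euler propagator. A production code would use a proper integrator, and this bare iteration is not guaranteed to converge monotonically; Krotov-type update rules are used in practice.

```python
import numpy as np

# Iterative optimal control (steps 1-5) for the two-level system of
# Eq. (5.13). Units: hbar = 1; all parameter values are illustrative.
Eg, Ee, mu, lam = 0.0, 1.0, 1.0, 0.5
T, nt = 20.0, 2000
dt = T / nt

def propagate(state, eps, direction=+1):
    """Crude Euler propagation of (psi_g, psi_e) through the field eps[t];
    direction = +1 runs forward in time, -1 backward."""
    traj = np.zeros((nt, 2), dtype=complex)
    for n in range(nt):
        i = n if direction > 0 else nt - 1 - n
        H = np.array([[Eg, -mu * np.conj(eps[i])],
                      [-mu * eps[i], Ee]])
        state = state + direction * dt * (-1j) * (H @ state)
        traj[i] = state
    return traj

eps = 0.05 * np.ones(nt, dtype=complex)         # step 1: guess field
psi0 = np.array([1.0, 0.0], dtype=complex)      # start in the ground state

for it in range(20):
    psi = propagate(psi0, eps, +1)              # step 2: forward in time
    chi_T = np.array([0.0, psi[-1, 1]])         # step 3: chi(T) = P_f psi(T)
    chi = propagate(chi_T, eps, -1)             # step 4: backward in time
    # step 5: the principal equation of optimal control, Eq. (5.14)
    eps = (1j / lam) * (np.conj(chi[:, 0]) * mu * psi[:, 1]
                        - np.conj(psi[:, 0]) * mu * chi[:, 1])
    print(f"iteration {it:2d}: target population = {abs(psi[-1, 1]) ** 2:.4f}")
```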

Does this work? Often, but not always. The issue of “controllability” is beyond the scope of this introduction, except to state that not all simple quantum systems are obviously controllable. A classic example of a “problem” potential is the harmonic oscillator. Since all energy levels are equally spaced, any transition between two is potentially a transition among all, which makes control more challenging, if not impossible.

The ABC dissociation problem that we discussed earlier as an example was considered by Kosloff et al. for the specific case of mass ratios of 1:1:2 (such as HHD), as a test of the optimal control algorithm discussed above. The results are quite instructive regarding the connection to the Tannor-Rice ansatz. They do indeed find that the best laser field transfers population from g to e in a “pump-dump” sequence, but the optimal field sometimes requires several pump-dump cycles.

The conclusions of this section are quite powerful. We find that it is possible to define a protocol for quantum control in many general situations! That is, given an initial molecule and more than one possible target product, we can use the coherence properties of a laser field to selectively direct a transition. The next question is, so what?! We take up this issue next.

5.4 Deriving Insight from Control Calculations

This section has a grand title that it probably doesn’t deserve, because there is no general prescription for deriving physical insight from the results of optimal control theory. Nonetheless, that is a primary motivation for the field. We desire to know the optimal field, but immediately after that to also know why that field works. What is the underlying physical mechanism? To illustrate the problem, we consider a recent theoretical study by Artamonov on the isomerization of ozone [1]. The normally bent O3 molecule has a metastable triangular form. Optimal control theory was used to find the optimal field for the transition, as shown in Fig. 5.8. The optimal field, taken by itself, is rather uninformative.

Fig. 5.8

Left: Optimal field for the ozone isomerization transition [1]. Right: Spectrum of the field shown on left of the figure (Reproduced and adapted with permission from Ref. [1])

Perhaps one can glean that the field starts out at a lower frequency, then has a period of near darkness, before resuming a higher frequency. The spectrum shows that these two features are quite separate:

The correlation between time and frequency is an important feature of any control pulse, but that correlation is sometimes difficult to pick out of the temporal representation \( \varepsilon (t) \), and the information is totally lacking from the power spectrum. There are representations that contain both time and frequency information in an easy to read form. One example from audio waves is a musical score. Sheet music is a form of audio spectrogram, where each note on the page displays spectral information through its vertical placement on the musical staff, and temporal information through its shape and its placement order along the horizontal. This suggests that a two-dimensional representation could also be employed for light, and there are several different examples in common use.

A simple example is the windowed Fourier transform (WFT). A WFT uses a moving time gate function to restrict the electric waveform to a narrow window in time. The Fourier transform of the field within this gate as a function of gate position is then displayed on a two-dimensional plot. The WFT for the optical field found for the ozone isomerization is shown below in Fig. 5.9, next to the potential energy curve showing the ground state well (bent form around 112°) and metastable well (triangle form at 60°).
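
Before turning to that figure, here is a minimal sketch of how a WFT spectrogram can be computed: a Gaussian gate slides along the field, and the power spectrum of each gated slice forms one column of the time-frequency map. The gate width and the two-pulse test field are illustrative choices.

```python
import numpy as np

def wft_spectrogram(E, t, gate_width):
    """Windowed Fourier transform: spectrum of the gated field vs. gate position."""
    dt = t[1] - t[0]
    freqs = np.fft.fftfreq(len(t), d=dt)
    S = np.empty((len(t), len(t)))
    for j, t0 in enumerate(t):                         # gate centred at each t0
        gate = np.exp(-((t - t0) / gate_width) ** 2)
        S[:, j] = np.abs(np.fft.fft(E * gate)) ** 2
    return freqs, S                                    # S[frequency, gate position]

# Two sub-pulses with different carrier frequencies show up as two separated
# blobs in the map, just as the two pulses do in Fig. 5.9.
t = np.linspace(-100, 100, 1024)
E = (np.exp(-((t + 40) / 10) ** 2) * np.cos(0.5 * t)
     + np.exp(-((t - 40) / 10) ** 2) * np.cos(1.0 * t))
freqs, S = wft_spectrogram(E, t, gate_width=10.0)
print(S.shape)                                         # frequency x time map
```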

Fig. 5.9

Left: Potential energy vs. opening angle for ozone. Right: WFT spectrogram for the optimal photoisomerisation pulse (Reproduced and adapted with permission from Ref. [1])

The spectrogram shows that two pulses are needed. The first is centred on the energy differences of levels residing in the stable initial well, and suggests a series of transitions upwards in the potential well at greater amplitudes, until the molecule can pass across to the metastable well. Then a positively chirped higher frequency pulse induces de-excitation to the final state. This pulse has an impressive transfer efficiency of 94 %. This simple explanation may not be correct, though, because if the optimal pulse is replaced by a simple combination of two chirped pulses with the same shape, but without the low amplitude parts that appear to be a noisy pedestal, then the transfer efficiency goes down to less than 10 %. Evidently, even the small bumps and wiggles are very important to the success of the experiment, and the simple view of chirped pulse ladder climbing is not the whole story, nor even the biggest part of the story! [1].

One missing element in the simple one-dimensional wave packet interpretation of this calculation is the three-dimensional structure of the potential energy surfaces for ozone. The other two dimensions are associated with bond stretching. Although they are hidden from view in the one-dimensional picture, their presence survives in the energy level splitting, and the optimal field is affected by this as it tries to steer the molecule away from unnecessary vibrations excited by sub-optimal control fields.

Two-dimensional representation of control fields: This example demonstrates the value of good visualization of the control field using two-dimensional spectrograms. Optical spectrograms can be constructed in different ways, and the analogy to music only describes them partially. We would like a spectrogram that retains all of the same information as the temporal representation, \( E(t) \), which is a complete description of the field. The WFT does not do this, since it convolves the field with a gate function of finite width and shape. The gate can smear out sudden field changes, or attenuate rapid fluctuations. One transformation that retains complete information is the Wigner distribution, which is the inverse Fourier transform of the autocorrelation of the complex spectrum (Fig. 5.10):

$$ S(\nu, t)=\int {E(\nu +{\nu}^{\prime})\,E^*(\nu -{\nu}^{\prime})\,{e^{{-4\pi i{\nu}^{\prime}t}}}\,d{\nu}^{\prime}} $$
(5.15)
Fig. 5.10

Displays of a two-pulse optical field. Top: the intensity vs. time and the power spectrum. Left: the Wigner time-frequency distribution, showing coherence features between the two separated spectral peaks. Right: a Husimi distribution eliminates the spectral interference features but still shows the spectral separation of the two peaks

Like the field itself, the Wigner distribution is a real function. It resembles a spectral-temporal probability distribution, but differs in two important ways from this simple interpretation. First, the Wigner distribution can take on both positive and negative values, so it can’t simply represent a probability. Second, it has features that appear in locations in space and time where there is no field at all! These are regions between optical field pulses, and they represent the relative coherence properties between sub-pulses that are not temporally connected. Still, like the FROG traces discussed in Chap. 1 of this book, the Wigner distribution does provide a fairly simple connection to the real pulse after a little practice. The Wigner distribution is “overcomplete.” This simply means that if I divide an electric field waveform into N pieces, for example in order to plot it in Matlab, then the Wigner distribution I calculate from this has \( N\times N={N^2} \) pixels, even though only N pieces of information went into its construction. A different transformation was developed that contains the same features but with only \( \sqrt{N}\times \sqrt{N}=N \) pixels, called the von Neumann distribution [10].

Finally, for those who wonder if all two-dimensional optical distributions are named after Princeton faculty, here is a counterexample: A very useful spectrogram called a Husimi distribution is a two-dimensional convolution of the Wigner distribution with a minimum uncertainty Gaussian. This procedure removes rapid oscillating coherence features from the Wigner distribution, leaving something that much more closely resembles the temporal-spectral probability. Although this destroys coherence information contained in the complete description of the field, it shows clearly how the optical energy is distributed in both time and frequency.
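
Here is a sketch of both distributions for an illustrative two-pulse field. The Wigner map is computed in the equivalent time-domain form \( W(t,\omega )=\int {E(t+\tau /2)\,E^*(t-\tau /2)\,{e^{-i\omega \tau }}\,d\tau } \), using a coarse integer-lag discretization, and a Husimi-like map is then obtained by smoothing with a Gaussian kernel.

```python
import numpy as np

def wigner(E):
    """Coarse discrete Wigner distribution (integer lags; a sketch, not a
    careful treatment of the half-sample lags in the exact definition)."""
    N = len(E)
    W = np.zeros((N, N))
    for i in range(N):
        m = min(i, N - 1 - i)                    # largest symmetric lag at time i
        taus = np.arange(-m, m + 1)
        corr = np.zeros(N, dtype=complex)
        corr[taus % N] = E[i + taus] * np.conj(E[i - taus])
        W[:, i] = np.real(np.fft.fft(corr))      # one column per time point
    return W

t = np.linspace(-100, 100, 512)
E = (np.exp(-((t + 40) / 10) ** 2) * np.exp(1j * 0.5 * t)
     + np.exp(-((t - 40) / 10) ** 2) * np.exp(1j * 1.0 * t))   # two sub-pulses
W = wigner(E)

# Husimi-like map: smooth along both axes; the oscillating interference
# features between the two pulses average away, as described above.
kern = np.exp(-np.linspace(-3, 3, 21) ** 2)
kern /= kern.sum()
H = np.apply_along_axis(lambda r: np.convolve(r, kern, "same"), 0, W)
H = np.apply_along_axis(lambda r: np.convolve(r, kern, "same"), 1, H)
print("Wigner has negative values:", W.min() < 0)
```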

5.5 Producing Shaped Ultrafast Optical Fields

Pulse shaping for ultrafast optical fields can be performed in various ways. There is an excellent pedagogical review by one of the pioneers in the field, Andrew Weiner [11]. Here I summarize some of the basic ideas and methods.

At its root, pulse shaping is wave filtering. A pulse shaper receives an input wave form with a complex Fourier transform \( {E_{\textit{in} }}(\omega ) \), transforms it by combining it with some mask or mixing wave \( M(\omega ) \), and outputs a final waveform \( {E_{\textit{out} }}(\omega ) \), as in the example in Sect. 5.2 of this chapter. If the mixer is a frequency modulator then we describe this using the language of nonlinear optics:

$$ {E_{out }}(\omega )=\chi^{(2) }{E_{\textit{in} }}({\omega_{\textit{in} }})M({\omega_M}) $$
(5.16)

where \( {E_{out }}(\omega ) \) is nonzero for frequencies \( \omega ={\omega_{in }}\pm {\omega_M} \).

In this case the main function of the pulse shaping is to add spectral content to the light field in the form of spectral sidebands spaced apart from the input frequency by \( \pm {\omega_M} \). Other examples of pulse shaping that add to the spectrum of the input field include nonlinear self-phase modulation, including white light generation (see Chaps. 8 and 9 by Dudley et al. and Wadsworth respectively); and frequency multiplication, including high harmonic generation (see Chaps. 3 and 7 by Marangos et al. and Mathias et al. respectively).

At the other extreme is a simple grating monochromator with a very narrow slit \( M(\omega ) \) for \( \omega ={\omega_M} \). We use a different language to describe what the slit does:

$$ {E_{out }}(\omega )={E_{\textit{in} }}(\omega )M(\omega )\approx \left\{ {\begin{array}{*{20}{c}} {0} & {\mathrm{for}\ \omega \ne {\omega_M}} \\ {{E_{\textit{in}}}(\omega)} & {\mathrm{for}\ \omega ={\omega_M}} \\ \end{array}} \right. $$
(5.17)

This kind of mask is purely subtractive, meaning that it cannot add new frequencies to \( {E_{out }}(\omega ) \), but only filter frequencies that are already present in the input field. This is an example of a linear filter, whereas the wave mixing modulator is a nonlinear filter. Most programmable pulse shapers used in quantum control experiments use a linear frequency filter to shape pulses. Filters used include grating-mask combinations employing patterned stationary masks, liquid crystal programmable masks, acousto-optic Bragg cells, and programmable dispersive filters, such as the “Dazzler.”

The basic configuration of a linear frequency filter, as shown in Fig. 5.11, is similar to a spectrometer combined with its mirror image:

Fig. 5.11

A Fourier filter pulse shaper

This is sometimes called a “4f” configuration, because the distance between the mirrors is chosen to be twice their focal length f. The distance from the gratings to the mirrors is f, so that the first mirror collimates the expanding fan of wavelengths at the same time as it focuses each separate wavelength. Such an arrangement does not change the size of the laser beam from the input to the output. Also, to lowest order the optical path for each wavelength is the same, so that a short pulse is not dispersed at the output. The mask is at the image plane of the first spectrometer, which is the point where the different wavelengths are most cleanly separated. The mask can then transmit some wavelengths and block others, or apply pre-programmed phase shifts to different regions of the spectrum. The second half of the device performs an inverse Fourier transform on the amplitude- and phase-altered spectrum that emerges from the mask. If the mask is programmable, then this device functions as an arbitrary waveform generator for light.

There are a few important constraints on the waveform that can be produced by a Fourier filter pulse shaper. First, if the mask is a static, linear element as I have described it, then this device can only attenuate and phase-shift frequencies that are already present in the input beam. Second, the global phase cannot be controlled by this device. This is sometimes called the “carrier envelope” phase. For any optical field pulse longer than a few tens of femtoseconds, this phase has no significance, because it only represents a trivial overall time shift of less than one optical cycle. But for very short pulses that are only a few cycles long, the carrier phase can determine the details of the interactions of the light with matter.

Fourier filter pulse shapers can be constructed using several different kinds of spectral selection, some of which have been commercialized. The simple 4f system shown in Fig. 5.11 requires a mask-type filter, which can be implemented with programmable liquid crystals, acousto-optic Bragg-type reflectors, or static masks. All of these methods are subtractive, and in addition they have various efficiencies. I have already described how nonlinear processes can be controlled by adjusting the phase alone, and indeed phase shaping can accomplish many tasks, including the production of series of short pulses with variable delays. For this purpose, phase-only shapers are sufficient, such as deformable mirrors that adjust the phase of different spectral components by introducing sub-micron delays. The “Dazzler” is a commercial acousto-optic pulse shaper that relies on the spectral selectivity of a programmable periodic Bragg reflector, and therefore does not require dispersing gratings [12, 13].
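
A Fourier-filter shaper is, mathematically, just a multiplication in the frequency domain, and as noted above a phase-only mask can generate trains of pulses. The sketch below assumes an illustrative Gaussian input pulse and a sinusoidal spectral phase \( \phi (\omega )=A\sin (\omega \tau ) \), a textbook mask that splits the pulse into sub-pulses spaced by \( \tau \) (with Bessel-function amplitudes) while leaving the power spectrum untouched.

```python
import numpy as np

# Phase-only 4f shaper as a multiplicative spectral filter:
# E_out(w) = M(w) E_in(w), with |M(w)| = 1. Parameters are illustrative.
N = 4096
t = np.linspace(-500.0, 500.0, N, endpoint=False)
dt = t[1] - t[0]
w = 2 * np.pi * np.fft.fftfreq(N, d=dt)

E_in = np.exp(-(t / 10.0) ** 2) * np.exp(1j * 1.0 * t)   # input pulse
A, tau = 1.2, 100.0
M = np.exp(1j * A * np.sin(w * tau))                     # sinusoidal phase mask
E_out = np.fft.ifft(M * np.fft.fft(E_in))

# The power spectrum is unchanged (phase-only mask), but the temporal
# intensity now shows sub-pulses at multiples of tau.
I = np.abs(E_out) ** 2
for k in (-1, 0, 1):
    idx = np.argmin(np.abs(t - k * tau))
    print(f"t = {k * tau:+6.0f}: intensity = {I[idx]:.3f}")
```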

Another property that influences the selection of an appropriate pulse shaper is the refresh rate. Acousto-optic gratings are transient and can only be refreshed at the speed of sound, and are therefore inappropriate for continuous wave or high repetition rate applications. On the other hand, they are ideal for applications where a different pulse shape must be selected on each pulse, up to repetition rates of several kilohertz. Liquid crystals and deformable mirrors refresh much more slowly, but they make good static programmable masks, which can therefore be used for continuous wave applications.

5.6 Learning Feedback Algorithms

An experimental protocol analogous to Optimal Control Theory is called “learning feedback.” The explicit suggestion for this was, not surprisingly, made in a theory paper, by Judson and Rabitz [14]. In learning feedback, a target observable is selected for optimization. For example, one might desire that the pulse shaper be adjusted to take an unknown input pulse and readjust the phases to produce the shortest possible output pulse, subject only to the bandwidth limitation imposed by Fourier’s theorem. Alternatively, one might want to discover the optimal pulse shape to optimize a complex task in molecular or chemical physics such as a specific dissociation or isomerization. We may or may not have some idea from theory or previous experiments about the optimal pulse, but the point of learning feedback is that such prior knowledge is not needed. The quantum system under investigation can solve Schrödinger’s equation for us, if we give it the right pulse for its interaction Hamiltonian. We therefore need an efficient search protocol. Several have been proposed, such as searches using parameterisations of the pulse shape in the time and frequency domain. Parameterisation by one or two parameters, such as an expansion in orders of dispersion, or simple use of overall amplitude, central frequency and chirp, has limited ability for the discovery of new insights. On the other hand, a large parameter space is almost impossible to search systematically. To see just how seemingly hopeless the task is, consider the number of degrees of freedom in a typical pulse shaper, with 256 different spectral amplitudes and phases. If each phase can be adjusted with eight bits of precision, then there are \( 256^{256} \) different pulse shapes. That’s more than the number of protons in the universe. The search could be streamlined by selecting an initial pulse and following the fitness gradient. But with so many dimensions, even determining the gradient is a challenge. Technical noise and other experimental limitations come into play. Furthermore, there is no particular reason to think the problem is necessarily convex, i.e. one where the optimal pulse can be found by following the gradient. This is where evolutionary algorithms become useful.

A number of successful learning feedback protocols employ variations on evolutionary algorithms. Such schemes are based on a map that contains the information needed to construct the pulse. This “pulse DNA” is called the “genome” for the pulse, and its individual pieces, the “genes” can be anything that is convenient for the experimenter. For example, the pulse spectrum could be divided into N pieces, and each piece assigned an amplitude and phase. These 2N pieces of information compose the genome for that pulse. The specific encoding is relatively unimportant, but it is necessary that any arrangement of genes within the 2N dimensional pulse shape space corresponds to a pulse that can be produced in the lab and used to perform a test on the physical system. Shown below, and in Fig. 5.12, are the steps of the learning search:

  1. Select a population of starting genomes. This is usually done at random, and this step is only performed once. This population of genomes creates a population of pulse shapes, which are the individual members of an initial pulse shape population, Generation 0.

  2. Perform the experiment with each member of the generation, and record the relative amount of the desired target state following each experiment. This successful fraction is called the “fitness” of the corresponding pulse, since it indicates its relative suitability.

  3. Rank the genomes according to their fitness.

  4. Now form the next generation in the learning feedback loop, by combining the best traits of the previous generation. This step is the one where all the improvement in fitness occurs, and I will have more to say about it below.

  5. Go back to step 2, and repeat steps 2–5 until the fitness is optimized. (A code sketch of this loop is given after Fig. 5.12.)

Fig. 5.12

Flow chart showing a learning feedback loop
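
Here is a minimal sketch of steps 1–5 in code. The laboratory experiment is replaced by a toy fitness function, and the population size, mutation rate, and fitness model are all illustrative assumptions; in a real experiment the fitness would come from a measurement on the molecular system.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_pop = 32, 40
target = rng.uniform(0, 2 * np.pi, n_genes)    # hidden optimum (toy model only)

def fitness(genome):
    """Toy stand-in for the measured signal: largest when phases match target."""
    return np.mean(np.cos(genome - target))

population = rng.uniform(0, 2 * np.pi, (n_pop, n_genes))   # step 1: Generation 0

for generation in range(60):
    scores = np.array([fitness(g) for g in population])    # step 2: "experiment"
    order = np.argsort(scores)[::-1]                       # step 3: rank by fitness
    parents = population[order[: n_pop // 4]]              # keep the fittest
    children = []
    while len(children) < n_pop:                           # step 4: new generation
        p1, p2 = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, n_genes)                     # 1-point crossover
        child = np.concatenate([p1[:cut], p2[cut:]])
        mask = rng.random(n_genes) < 0.05                  # mutation operator
        child[mask] = rng.uniform(0, 2 * np.pi, mask.sum())
        children.append(child)
    population = np.array(children)                        # step 5: repeat

print("best final fitness:", max(fitness(g) for g in population))
```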

The fact is that such a fitness-directed search works extremely well. Much of the success, however, comes from clever innovations in step 4 above, i.e. the selection of a new generation based on the fitness ranking of the previous generation. The selection method is called an “operator.”

5.6.1 Operators in Genetic Algorithms

The operator in a genetic algorithm converts a parent population into offspring based on some kind of mathematical manipulation of the individual genomes. The most common operators employed are crossovers and mutation, which generate new pulse shapes using rules based on random statistics. Crossover operators create offspring by sharing the genetic material of two (or more) parents. In a typical implementation, one parent’s genome is cut in n places (n-point crossover), creating n + 1 segments, often at randomly selected points in the genome string. Then a second parent’s genes are substituted for one or more of the segments. A similar type of operator is the “average crossover,” which creates offspring whose genes are the arithmetic average of the two parent genes in the same position along the genome. One important difference between normal crossover and average crossover is that the latter introduces new gene values to the population.

Mutation is also generally considered to be very beneficial in genetic learning algorithms, because it introduces new genetic material, i.e. different values of spectral amplitude and phase. This may either be insertion of a completely alien individual into the population, or the replacement of some of the genes in parent genomes.
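
As illustrations, here are minimal sketches of the operators just described, acting on genomes stored as NumPy vectors; the cut positions, mutation rate, and gene range are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def n_point_crossover(p1, p2, n=2):
    """Cut the genome at n random places and take alternate segments from p2."""
    cuts = np.sort(rng.choice(np.arange(1, len(p1)), size=n, replace=False))
    child = p1.copy()
    bounds = np.r_[0, cuts, len(p1)]
    for i in range(1, len(bounds) - 1, 2):       # substitute every other segment
        child[bounds[i]:bounds[i + 1]] = p2[bounds[i]:bounds[i + 1]]
    return child

def average_crossover(p1, p2):
    """Offspring genes are the mean of the parent genes: introduces new values."""
    return 0.5 * (p1 + p2)

def mutation(p, rate=0.05, low=0.0, high=2 * np.pi):
    """Replace a random subset of genes with fresh random values."""
    child = p.copy()
    mask = rng.random(len(p)) < rate
    child[mask] = rng.uniform(low, high, mask.sum())
    return child
```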

Operators could also be based on some physical knowledge of the system. Examples include time-domain crossover, where the genome is a discretization of \( E(t) \) rather than \( E(\omega ) \). This can be more efficient in cases where the optimal solution consists of discrete pulses, which are easy to describe in the time domain but depend on more complicated phase functions in the frequency domain.

There are as many different operators as imagination can provide. Another example is polynomial phase mutation, where the spectral phase is decomposed into a polynomial expansion, and one order (cubic, for example) is replaced with a mutated value.

The large number and variety of operators suggests that learning feedback is as much an art as a science, but note that all of these variations are simply means to arrive at an optimization more rapidly. Operators are “inside the feedback loop” in the sense that a poor choice of operator may impede the progress to a solution, but will generally not prevent that optimization. Nonetheless, an efficient algorithm should find a way to utilize this variety.

5.6.1.1 Adaptive Operators

The learning feedback used to select individual pulses with high fitness can also be used during the algorithm to select the most efficient operators. This technique is called “adaptive operators.” In a typical implementation, several different operators are used to create offspring. The particular operator for each individual is chosen at random, but recorded by associating a genealogy with each individual. After three generations, the genealogy of the fittest individuals is a record of the most efficient operators. The ensuing generations use this information to weight the probability for using these more effective operators. The genealogy is examined in every subsequent generation, and the probability for selecting operators is adjusted to ensure that the most effective operators are used a higher fraction of the time (Fig. 5.13).

Fig. 5.13

The fitness for multiple operators, plotted as a function of generation for a single run of the learning feedback algorithm. This was a search to optimize the peak intensity of a pulse by adjusting its spectral phase (Reproduced and adapted with permission from Ref. [15])

5.6.1.2 Principal Control Analysis

One of the great strengths of learning feedback control is that it achieves a solution without any prior knowledge of the Hamiltonian. Its efficiency does not depend on this, nor to any great extent on the size of the control space. It is hardly more effort to search a large space than a small one, since the search is fitness-directed, and therefore tends away from unproductive regions or degrees of freedom. The number of possible solutions in the control space may be huge, but the number of solutions included in the search is not huge [16].

The price paid for this luxuriously large search space is that the solution may be difficult to intuit, and it may be nearly impossible to discern the underlying physics by examining the features of the optimal pulse. The fitness function maximum may only depend on a few “essential” features of the control field, even though these are not obvious in the spectral basis used for most control experiments. Other unimportant features are just ornaments, so long as they do no harm.

This problem has been addressed by adapting a control analysis protocol borrowed from the analysis of large statistical data sets, known as Principal Component Analysis (PCA). The version adapted to control problems has been named Principal Control Analysis [17]. PCA is based on the notion that all solutions with high fitness should share common traits which can be found by examining details of the search process itself.

The basic idea is that the control Hamiltonian for any particular pulse shape \( H\left[{E\left({{x_i}} \right)} \right] \) depends on a vector of control parameter settings \( {x_i} \) that produce that pulse shape. It should be possible to rewrite the Hamiltonian using a new basis where the most important linear combinations of parameters are basis vectors. In the old basis, the set of most important control directions forms a lower-dimensional subspace, like a surface embedded in a cube (see Fig. 5.14). Each essential control direction is a linear combination of the much larger basis of the control experiment. Since the trial pulse shapes in the experiment were chosen specifically because they had the highest fitness, we can get a good sense of these control directions by examining the control values in the learning search [18].

Fig. 5.14

Geometric model of the control Hamiltonian. The space of all pulse shapes is depicted as a cube on the left. The optimal pulses lie on a lower dimensional space within the cube, here shown as a surface. All pulse shapes on this surface are equally fit. The same Hamiltonian control space is shown on the right, but now transformed to the essential control basis. Now the unimportant degrees of freedom are no longer present, so that the optimal pulses all lie at a point. PCA is a method to find these essential degrees of freedom (From Ref. [17])

Principal Control Analysis investigates the covariance of the values of pairs of genes in the set of pulse shapes used in the search. In other words, how much do pairs of values vary together in a fitness-directed search. With \( \left\langle \cdot \right\rangle \) denoting an average over the N trial pulse shapes n in the search, the variance of the ith gene \( x_i \) of the control parameter genome is:

$$ \sigma_{x_i}^2=\left\langle {x_i^2} \right\rangle -{{\left\langle {{x_i}} \right\rangle }^2},\quad \left\langle {{x_i}} \right\rangle =\frac{1}{N}\sum\limits_n {{{\left( {{x_i}} \right)}_n}} $$
(5.18)

and the covariance is a square matrix:

$$ \left[ {{c_{ij }}} \right]=\left\langle {{x_i}{x_j}} \right\rangle -\left\langle {{x_i}} \right\rangle \left\langle {{x_j}} \right\rangle =\left[ {\begin{array}{*{20}{c}} {{c_{11 }}} & \cdots & {{c_{1m }}} \\ \vdots & \ddots & \vdots \\ {{c_{m1 }}} & \cdots & {{c_{mm }}} \\ \end{array}} \right]. $$
(5.19)

We want to find a new basis in which the variance of each control direction is independent of the others. Clearly this is just the basis in which \( c_{ij} \) is diagonal:

$$ \left[ {\begin{array}{*{20}{c}} {{c_{11 }}} & \cdots & {{c_{1m }}} \\ \vdots & \ddots & \vdots \\ {{c_{m1 }}} & \cdots & {{c_{mm }}} \\ \end{array}} \right]\to \left[ {\begin{array}{*{20}{c}} {{\lambda_1}} & 0 & 0 \\ 0 & {{\lambda_2}} & 0 \\ 0 & 0 & \ddots \\ \end{array}} \right]. $$
(5.20)

The largest eigenvalues correspond to directions in the control space with the largest variance. Since the search is directed by fitness, these are the directions of greatest fitness improvement, and therefore correspond to the most important features of the pulse for control. The eigenvectors of these eigenvalues then form the essential control basis. Often there are only two or three large eigenvalues, indicating that control can be established with only a few independent controls.
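
A compact sketch of this analysis, assuming the search data are available as an array of trial genomes: the covariance matrix of Eq. (5.19) is built and diagonalised as in Eq. (5.20). The synthetic data below, with two “essential” directions deliberately built in, stand in for a real fitness-directed search.

```python
import numpy as np

rng = np.random.default_rng(2)
n_trials, m = 500, 32
essential = rng.normal(size=(2, m))                # two hidden control directions
genomes = rng.normal(size=(n_trials, 2)) @ essential \
          + 0.05 * rng.normal(size=(n_trials, m))  # little variation elsewhere

# Covariance matrix c_ij = <x_i x_j> - <x_i><x_j>, Eq. (5.19)
centered = genomes - genomes.mean(axis=0)
C = centered.T @ centered / n_trials

# Diagonalise, Eq. (5.20); sort the eigenvalues in descending order
eigvals, eigvecs = np.linalg.eigh(C)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
print("largest eigenvalues:", np.round(eigvals[:4], 3))
# Only a few large eigenvalues survive -> only a few essential control
# directions; their eigenvectors span the essential control basis.
```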

5.6.1.3 Most Correlated Feature Analysis

A closely related technique to PCA is Most Correlated Feature Analysis (MCFA). In MCFA, we directly determine the features (directions) in the search space that correlate best with fitness. Here we define a feature as any vector in the control space that can have some fitness associated with it. Expanding an arbitrary feature in the PCA eigenvector basis, \( a=\sum\nolimits_{k=1}^N {{\alpha_k}{\eta_k}} \), the correlation of fitness f with the projection of the searched pulse shapes onto a is:

$$ R=\frac{{\left\langle {af} \right\rangle -\left\langle a \right\rangle \left\langle f \right\rangle }}{{{\sigma_a}{\sigma_f}}}=\frac{{\left\langle {\sum\limits_{k=1}^N {{\alpha_k}{\eta_k}f} } \right\rangle -\left\langle {\sum\limits_{k=1}^N {{\alpha_k}{\eta_k}} } \right\rangle \left\langle f \right\rangle }}{{{\sigma_a}{\sigma_f}}} $$
(5.21)

The feature with the highest fitness correlation has a particularly simple form in the PCA eigenvector basis: Its components are given by:

$$ {\alpha_k}=\frac{{\left\langle {{\eta_k}f} \right\rangle -\left\langle {{\eta_k}} \right\rangle \left\langle f \right\rangle }}{{{\sigma_{{{\eta_k}}}}}} $$
(5.22)

This is powerful, because it allows us to compute the pulse shape with the highest correlation to fitness even though this specific pulse shape may never have been produced in the search. Furthermore, since only a few directions in the PCA basis have high eigenvalues and therefore high sensitivity to the search objective, only the first few \( \alpha_k \) are important.
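
A sketch of this computation, assuming `genomes` and the PCA eigenvectors `eigvecs` come from a step like the one above, and `fitness_vals` holds the measured fitness of each trial (all names are placeholders):

```python
import numpy as np

def most_correlated_feature(genomes, fitness_vals, eigvecs):
    """Components alpha_k of the most correlated feature, Eq. (5.22), computed
    in the PCA eigenvector basis and returned in the original gene basis."""
    eta = (genomes - genomes.mean(axis=0)) @ eigvecs   # projections eta_k
    f = np.asarray(fitness_vals)
    cov = (eta * f[:, None]).mean(axis=0) - eta.mean(axis=0) * f.mean()
    alpha = cov / eta.std(axis=0)                      # Eq. (5.22)
    return eigvecs @ alpha                             # feature in the gene basis
```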

A basis of most correlated features may be calculated iteratively by Schwartz decomposition, i.e. subtracting the projection of the data onto the most correlated feature from the data and repeating the MCF analysis. This basis can be used to produce a simplified “essential” cut in the fitness landscape.

5.6.1.4 Multi-dimensional Data for Control

The fitness measurement in a learning control experiment is at least as important as the pulse shaper. There is considerable flexibility in choice of a fitness criterion, since the feedback loop process will move in the right direction for any measurement that has a monotonic relationship to the target functional. However, the connection between the raw signal and the fitness can be exploited to learn more about the dynamics initiated by the control fields, and recent work in quantum control has begun to focus on this.

5.7 Conclusion and Outlook

This chapter has introduced some of the basic concepts in the subject of quantum control of small molecules. The original impetus for quantum control was to learn how to use the coherent properties of lasers to accomplish arbitrary and selective bond rearrangements in molecules, despite the fact that the molecular Hamiltonian is far too complicated to solve directly, and the number of coupled degrees of freedom means that any solution will be nearly impossible to implement anyway.

The theory problem has been approached through optimal control methods, which are essentially computational feedback protocols to iteratively solve complex problems. On the experimental side, the learning feedback method overcomes this complexity by using the molecule itself as an analogue computer, where the experimental apparatus supplies the input and output registers, and the experimenter is merely the programmer [14]. The ansatz here is that the molecule will solve the problem for us, if we supply the right experimental protocol, and if we are persistent.

The current state of the field is far from the original goal of bond-selective manipulation, but quantum control has had an important impact in several areas, for example in ultrafast and strong-field physics, and ultracold atom physics. This brief introduction has not gone into these applications, but now we mention a few of them to provide an outlook for future work.

The theory of quantum control is extremely active. Fundamental issues such as the dependence of controllability on the nature of the state space, or the topology of the control landscape, are quite current [19]. Control in open systems, where there is dissipation, is important to many applications. A particularly active area of theoretical interest in this regard is quantum information.

Many of the concepts of quantum coherent control are closely related or even derived from the far more established field of nuclear magnetic resonance. Very sophisticated pulse shapes and feedback protocols are employed in magnetic resonance experiments to obtain structure information on the atomic scale [20].

The new field of attosecond science relies on strong (V/Å) laser fields, which can field ionise and subsequently re-scatter a molecular electron, producing high harmonic generation (HHG). Control of these quantum processes could be used for detecting atomic-scale structure and transient processes in small molecules [21].

Finally, quantum control concepts are also beginning to have an impact in the new x-ray free electron lasers. Control schemes are needed there to prepare molecules for interrogation by intense x-rays [22].