9.1 Introduction

Particle accelerators are large, complex systems composed of many thousands of coupled components, including radio frequency (RF) electromagnetic accelerating cavities, magnets, cooling systems, and detectors. For many decades accelerators have been designed with specific, static operating conditions in mind, such as specific beam energies, currents, repetition rates, and bunch separations. For example, the Los Alamos Neutron Science Center accelerator is a \(\sim \)1 km long linear accelerator that has two fixed design energies of 100 and 800 MeV, and 8 fixed beam types which vary in terms of bunch length, charge per bunch, and repetition rate. Once the accelerator is tuned up following a maintenance outage, it is run mostly continuously, with the various beam types accommodated by a fixed magnet/RF system setup and with intermediate tuning by operators to make up for small disturbances and fluctuations. The Advanced Photon Source (APS) is a \(\sim \)1 km circumference synchrotron with magnet and RF systems tuned for a fixed 7 GeV electron beam, which can be sent to various user stations with unique magnet and optics systems, including monochromators, for the production of specific light energy ranges from 3.5 to 100 keV. The Large Hadron Collider at CERN is the world’s most powerful accelerator, with a circumference of 27 km and a beam energy of 6.5 TeV per beam for two counter-circulating proton beams. The machine is run at a fixed energy for years at a time while massive detectors at four collision points collect data for fundamental particle physics research. The three machines described above are representative of the majority of existing accelerators, which are designed for and operated at fixed settings, providing very specific beam types and energies.

Unlike the static machines described above, the next generation of X-ray Free Electron Laser (FEL) advanced light sources are being designed and operated with the fundamentally different approach of allowing users to drastically change beam properties for various experiments. The main advantage of FELs over synchrotron light sources such as the APS is their ability to provide flashes of light that are brighter and more coherent by many orders of magnitude, with custom bunch lengths down to tens of femtoseconds. The wavelength of the brighter, more coherent light produced by an FEL depends strongly on the electron beam energy, which must be adjusted between different experiments. A large change in beam energy and bunch length requires the re-tuning of almost the entire accelerator. For example, the shortest, few-femtosecond electron bunches require lowering the total electron bunch charge at the source so that the space charge forces of such short pulses are manageable. The bunch compressor system and the RF energy settings and offsets must then also be adjusted to provide the new, shorter bunch length. Finally, depending on the required light and therefore electron beam energy of the given experiment, the magnet focusing systems throughout the accelerator and the undulator must be retuned. Therefore, unlike traditional machines which can operate for months or years at fixed energy, RF, and magnet settings, FELs must have the flexibility to be completely re-tuned. For example, the Linac Coherent Light Source (LCLS) FEL can provide electrons over an energy range of 4–14 GeV, from 1 nC pulses with 300 fs pulse width down to 20 pC pulses with 2 fs pulse width.

The next generation of X-ray FELs will provide even brighter, shorter-wavelength (0.05 nm at EuXFEL, 0.01 nm at MaRIE), more coherent light, at higher repetition rates (2 MHz at LCLS-II, 30000 lasing bunches per second at EuXFEL, 2.3 ns bunch separation at MaRIE) than currently possible, requiring smaller electron bunch emittances than achievable today. Existing light sources are also exploring new and exotic schemes such as two-color operation (LCLS, FLASH, SwissFEL). To achieve their performance goals, these machines face extreme constraints on their electron beams. The LCLS-II requires \({<}0.01\)% rms energy stability, a factor of \({>}10{\times }\) tighter than the existing LCLS [1], while the EuXFEL requires \({<}0.001\) deg rms RF amplitude and phase errors (the current state of the art is \({\sim }0.01\)) [2].

Therefore, the next generation of light sources faces two problems in terms of tuning and control. In parallel with the difficulties of improving performance to match tighter constraints on energy spread and beam quality, existing and especially future accelerators face challenges in maintaining beam quality and in quickly tuning between various experiments. It can take up to 10 h to retune the low energy beam sections (\({<}500\) MeV), and even then the results are often sub-optimal, wasting valuable beam time. Future accelerators require the ability to quickly tune between experiments and to compensate for extremely closely spaced electron bunches, such as those required for MaRIE, which calls for advanced controls and approaches such as droop correctors [3, 4].

While existing and planned FELs have automatic digital control systems, they are not controlled precisely enough to quickly switch between different operating conditions [5]. Existing controls maintain components at fixed set points, which are chosen based on desired beam and light properties, such as the current settings of a bunch compressor’s magnets. Analytic studies and simulations initially provide these set points. However, models are not perfect and component characteristics drift in noisy and time-varying environments; setting a magnet power supply to a certain current today does not necessarily result in the same magnetic field as it would have 3 weeks ago. Also, the sensors are themselves noisy, limited in resolution, and introduce delays. Therefore, even when local controllers maintain desired set points exactly, performance drifts. The result is that operators continuously tweak parameters to maintain steady state operation and spend hours tuning when large changes are required, such as switching between experiments with significantly different current, beam profile (two-color, double bunch setups), or wavelength requirements. Similarly, traditional feed-forward RF beam loading compensation control systems are limited by model-based descriptions of beam-RF interactions, which work extremely well for perfectly known RF and beam properties, but in practice are limited by effects which include un-modeled drifts and fluctuations and higher order modes excited by extremely short pulses. These limitations have created an interest in model-independent beam-based feedback techniques that can handle time-varying uncertain nonlinear systems [6,7,8,9,10,11,12,13], as well as machine learning and other optimization techniques [14,15,16,17,18].

We begin this chapter with a list of control problems important to particle accelerators and a brief overview of simple beam dynamics, including longitudinal and transverse effects and the coupling between them, as well as an overview of RF systems. The second half of this chapter introduces some recently developed techniques for the control and tuning of accelerators, with a focus on a feedback-based extremum seeking method for automatic tuning and optimization.

9.1.1 Beam Dynamics

The typical coordinate system for discussing particle accelerator beam dynamics is shown in Fig. 9.1. The Lorentz force equation:

$$\begin{aligned} \frac{d\mathbf {P}}{dt} = e\left( \mathbf {E}+ \frac{\mathbf {v}}{c}\times \mathbf {B}\right) , \end{aligned}$$
(9.1)

describes charged particle dynamics. In (9.1), e is the electron charge, \(\mathbf {v}\) is the velocity, \(v=|\mathbf {v}|\), \(\mathbf {P}= \gamma m \mathbf {v}\) is the relativistic momentum, \(\gamma = 1/\sqrt{1-\frac{v^2}{c^2}}\) is the Lorentz factor, c is the speed of light, \(\mathbf {E}\) is the electric field, and \(\mathbf {B}\) is the magnetic field. In a particle accelerator, the sources of \(\mathbf {E}\) and \(\mathbf {B}\) include electromagnetic accelerating fields, other charged particles, and magnets used for steering and focusing of the beams. While electric fields are used to accelerate particles, magnetic fields guide the particles along a design trajectory and keep them from diverging transversely. We start by reviewing betatron oscillations, a form of oscillatory motion which is common to all particle accelerators [19,20,21,22,23,24].

Fig. 9.1
figure 1

A coordinate system centered on the ideal particle orbit. Distance along the orbit is parametrized by s. Transverse offset from the axis of the orbit is given by x and y

Betatron oscillations are a general phenomenon occurring in all particle accelerators and are of particular importance in circular machines. For a particle traveling at the design beam energy, \(p=p_0\), the transverse equations of motion are given by Hill’s equation

$$\begin{aligned} x'' = -K_x(s)x, \qquad y'' = -K_y(s)y, \end{aligned}$$
(9.2)

with (x, y) being the transverse particle locations relative to the accelerator axis (see Fig. 9.1), s (or z) a parametrization of the particle location along the axis of the accelerator, and \(x'(s)=dx(s)/ds\). In a ring, the function \(K_{x,y}(s+L)=K_{x,y}(s)\) is L-periodic, where L is the circumference of the accelerator, and depends on the magnetic field strengths. Equation (9.2) resembles a simple harmonic oscillator with a position-dependent spring constant \(K_{x,y}(s)\). The solution of (9.2) is of the form

$$\begin{aligned} p_{x,y}(s) = A\sqrt{\beta _{x,y}(s)}\cos \left( \psi _{x,y}(s) + \delta \right) , \quad \psi _{x,y}(s) = \int \limits _{0}^{s}\frac{d\sigma }{\beta _{x,y}(\sigma )}, \end{aligned}$$
(9.3)

where \(\beta _{x,y}(s)\) are the periodic solutions of the system of equations

$$\begin{aligned} \beta '''_{x,y}(s) + 4K_{x,y}(s)\beta '_{x,y}(s)+2K'_{x,y}(s)\beta _{x,y}(s)= & {} 0, \end{aligned}$$
(9.4)
$$\begin{aligned} \frac{1}{2}\beta _{x,y}(s)\beta ''_{x,y}(s) - \frac{1}{4}\left( \beta '_{x,y}(s) \right) ^2 + K_{x,y}(s)\beta ^2_{x,y}(s)= & {} 1. \end{aligned}$$
(9.5)

The solutions (9.3) are known as betatron oscillations and are oscillatory functions of s with varying amplitude and frequency [20].
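
To make the betatron motion concrete, the following Python sketch (not part of the original text) tracks a particle through a simple FODO-style lattice using the standard piecewise-constant transfer-matrix solution of Hill's equation (9.2); the focusing strength, segment lengths, and initial offset are arbitrary illustrative values.

```python
import numpy as np

def segment_matrix(K, L):
    """Transfer matrix of a segment of length L with constant focusing K,
    for the convention x'' = -K x (K > 0 focusing, K < 0 defocusing)."""
    if K > 0:
        w = np.sqrt(K)
        return np.array([[np.cos(w * L), np.sin(w * L) / w],
                         [-w * np.sin(w * L), np.cos(w * L)]])
    if K < 0:
        w = np.sqrt(-K)
        return np.array([[np.cosh(w * L), np.sinh(w * L) / w],
                         [w * np.sinh(w * L), np.cosh(w * L)]])
    return np.array([[1.0, L], [0.0, 1.0]])        # field-free drift

# Illustrative FODO cell: focusing quad, drift, defocusing quad, drift
K1, Lq, Ld = 0.5, 0.5, 2.0                          # example values [1/m^2, m, m]
cell = [(K1, Lq), (0.0, Ld), (-K1, Lq), (0.0, Ld)]

x = np.array([1e-3, 0.0])                           # 1 mm offset, zero slope
positions = [x[0]]
for _ in range(100):                                # track through 100 cells
    for K, L in cell:
        x = segment_matrix(K, L) @ x
    positions.append(x[0])                          # betatron oscillation sample

print("max |x| over 100 cells: %.2e m" % np.max(np.abs(positions)))
```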

In general, betatron motion is governed by equations of the form:

$$\begin{aligned} x''(s) = -K_x(x,y,s,\mathbf {P},t)x(s) + F_x(x,x',y,y',s,\mathbf {P},t), \end{aligned}$$
(9.6)
$$\begin{aligned} y''(s) =-K_y(x,y,s,\mathbf {P},t)y(s) + F_y(x,x',y,y',s,\mathbf {P},t). \end{aligned}$$
(9.7)

The nonlinear coupling between x and y depends not only on particle position, but also on trajectory, energy deviation, and time.

Typically, quadrupole magnets focus the beam transversely, maintaining a tight bunch along the accelerator axis, while dipole magnets, which have only a non-zero y component of magnetic field, direct the particles in a circular orbit in the (x, s) plane. The linear quadrupole and dipole magnetic field components reduce (9.6), (9.7) to the form

$$\begin{aligned} x''= & {} -\frac{p_0}{p}\left( \frac{1}{\rho ^2}-K_1(s) \right) x + \frac{p-p_0}{p}\frac{1}{\rho }, \end{aligned}$$
(9.8)
$$\begin{aligned} y''= & {} - \frac{p_0 K_1(s)}{p}y. \end{aligned}$$
(9.9)

\(K_1(s)\) is periodic and proportional to the quadrupole field strength. The value \(p=\sqrt{E^2/c^2-m^2c^2}\) is the total kinetic momentum, \(p_0\) is the design kinetic momentum, and \(\rho \) is the local radius of curvature [20].

Sources of nonlinearity and coupling in the functions \(F_x\) and \(F_y\) in (9.6), (9.7) include nonlinear magnetic field components, misaligned magnets, solenoid fields, magnetic field errors, and skew components of magnetic field gradients. Furthermore, all manufactured magnets are non-ideal and introduce nonlinear field components and higher order coupling terms, given by [23]:

$$\begin{aligned} \varDelta B_y + j \varDelta B_x = B_0 \sum _{n=0}^{\infty }(b_n + ja_n)(x+jy)^n. \end{aligned}$$
(9.10)

Sometimes nonlinear magnets are purposely introduced into the accelerator lattice. For example, sextupole magnets are placed in regions of high dispersion to mitigate the fact that particles with different momenta experience unequal forces from the same magnetic fields, so that their trajectories diverge (chromatic effects). Such magnets introduce nonlinear coupling terms such as \((x^2-y^2)\) and \((1-\varDelta )xy\), where \(\varDelta =(p-p_0)/p\) [20].

Betatron motion occurs in all accelerators, and magnetic lattices are designed to minimize betatron oscillations. However, some regions of accelerators require large amplitude transverse particle motion. If this motion is not carefully and precisely controlled, excessive betatron oscillations are generated. One such section is a group of pulsed kicker magnets used to horizontally kick the beam out of and then back into a machine. During injection kicks, an imperfect match of the magnet parameters results in extremely large betatron oscillations, as shown in Fig. 9.2.

Fig. 9.2
figure 2

BPM readings of x and y beam displacement over 500 turns, before and during tuning

9.1.2 RF Acceleration

We now consider particle acceleration in an RF field. For a particle passing through an RF cavity gap of length L, the energy gain due to an electromagnetic standing wave along the axis is given by

$$\begin{aligned} \varDelta W = q \int \limits _{-L/2}^{L/2}E(z)\cos (\omega t(z) + \phi ) dz, \quad t(z) = \int \limits _{0}^{z}\frac{dz}{v(z)}, \end{aligned}$$
(9.11)

where t(z) has been chosen such that the particle is at the center of the accelerating gap at \(t=0\), \(\phi =0\) if the particle arrives at the origin when the field is at a crest, and v(z) is the velocity of the particle. This energy gain can be expanded as

$$\begin{aligned} \varDelta W = q \int \limits _{-L/2}^{L/2} E(z) \left[ \cos (\omega t(z))\cos (\phi ) - \sin (\omega t(z))\sin (\phi ) \right] dz \end{aligned}$$
(9.12)

and rewritten in the form

$$\begin{aligned} \varDelta W = qV_0 T \cos (\phi ) = qE_0TL\cos (\phi ), \quad E_0 = \frac{V_0}{L}, \end{aligned}$$
(9.13)

where

$$\begin{aligned} V_0 = \int \limits _{-L/2}^{L/2}E(z)dz, \quad T =\frac{\int _{-L/2}^{L/2}E(z)\cos (\omega t(z))dz}{V_0}- \tan (\phi ) \frac{\int _{-L/2}^{L/2}E(z)\sin (\omega t(z))dz}{V_0}, \end{aligned}$$
(9.14)

and T is known as the transit-time factor. For typical RF accelerating cavities, the electric field is symmetric relative to the center of the gap and the velocity change within an accelerating gap for a relativistic particle is negligible, so \(\omega t(z) \approx \omega z / v = 2\pi z / \beta \lambda \), where \(\beta = v/c\) and \(\beta \lambda \) is the distance a particle travels in one RF period. We can then rewrite the transit-time factor as

$$\begin{aligned} T = \frac{\int _{-L/2}^{L/2}E(z)\cos \left( 2\pi z / \beta \lambda \right) dz}{V_0}. \end{aligned}$$
(9.15)

Assuming that the electric field is constant \(E(z)\equiv E_0\) within the gap, we get

$$\begin{aligned} T = \frac{\sin (\pi L / \beta \lambda )}{\pi L / \beta \lambda }, \end{aligned}$$
(9.16)

and plugging back into (9.13) we get

$$\begin{aligned} \varDelta W = \frac{qE_0\beta \lambda }{\pi }\cos (\phi )\sin \left( \frac{\pi L}{\beta \lambda } \right) , \end{aligned}$$
(9.17)

which is, as expected, maximized for \(\phi = 0\) and \(L = \beta \lambda / 2\), that is, for a particle that spends the maximal half of an RF period being accelerated through the cavity. This, however, would not be an efficient form of acceleration, as most of the time the particle would see a much smaller than maximal RF field. For a given voltage gain \(V_0\), the maximum \(T=1\) is obtained with \(L=0\), which is not realizable. Actual design values of T depend on individual cavity geometries and desired efficiency.
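
As a hedged numerical example of (9.13)–(9.17), the short Python sketch below evaluates the transit-time factor and the energy gain per gap; the frequency, field, gap length, velocity, and phase are illustrative values, not taken from the text.

```python
import numpy as np

q = 1.602176634e-19        # elementary charge [C]
c = 299792458.0            # speed of light [m/s]

# Illustrative values (assumptions, not from the text)
f_rf = 201.25e6            # RF frequency [Hz]
beta = 0.5                 # particle velocity v/c
E0   = 2.0e6               # average axial field V0/L [V/m]
L    = 0.25                # accelerating gap length [m]
phi  = np.deg2rad(-30.0)   # phase relative to the crest [rad]

lam = c / f_rf                          # RF wavelength
T   = np.sinc(L / (beta * lam))         # np.sinc(x) = sin(pi x)/(pi x), i.e. (9.16)
dW  = q * E0 * T * L * np.cos(phi)      # energy gain per gap, (9.13)

print("transit-time factor T = %.3f" % T)
print("energy gain per gap   = %.3f MeV" % (dW / q / 1e6))
```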

9.1.3 Bunch Compression

For maximal acceleration, we typically choose \(\phi =0\), especially for highly relativistic electrons. However, sometimes a nonzero \(\phi \) is chosen, either for longitudinal bunching or to purposely introduce an energy gradient along the electron bunch which can then be utilized for bunch compression. We define \(\phi \) as the relative phase between a particle and the zero crossing of the RF field, such that earlier particles, with \(\phi <0\), receive a higher energy gain than later particles, with \(\phi >0\). The energy offset of a particle at phase \(\phi \) at the exit of the RF compressor cavity, relative to the reference particle, is given by

$$\begin{aligned} \varDelta E_1 = \varDelta E_0 - \frac{qV_{\mathrm {rf}}}{E}\sin (\phi ), \end{aligned}$$
(9.18)

where \(V_{\mathrm {rf}}\) is the compressor voltage, E is the beam energy, and \(\varDelta E_0\) is the initial energy offset. Next, the beam is transported through a dispersive section with non-zero \(R_{56}\), where

$$\begin{aligned} R_{56}(s) = \int \limits _{s_0}^{s} \frac{R_{16}(s')}{\rho (s')}ds', \end{aligned}$$
(9.19)

where \(R_{16}\) is the transverse displacement resulting from an energy error in a dispersive region of the accelerator. The energy offset is then translated to a longitudinal position offset according to

$$\begin{aligned} \varDelta z_1 = \varDelta z_0 + R_{56}\varDelta E_1 = \varDelta z_0 + R_{56} \left( \varDelta E_0 - \frac{qV_{\mathrm {rf}}}{E}\sin (\phi ) \right) . \end{aligned}$$
(9.20)

For an RF field of frequency \(\omega _{\mathrm {rf}}\), the phase \(\phi \) relative to the RF at position offset \(\varDelta z_0\) is given by \(\phi = -\omega _{\mathrm {rf}} \varDelta z_0 /c\). If this phase is small, we can expand sine and rewrite both the energy and position change as

$$\begin{aligned} \varDelta z_1\approx & {} \left( 1 + R_{56} \frac{\omega _{\mathrm {rf}}V_{\mathrm {rf}}}{cE} \right) \varDelta z_0 + R_{56}\varDelta E_0, \end{aligned}$$
(9.21)
$$\begin{aligned} \varDelta E_1= & {} \varDelta E_0 - \frac{eV_{\mathrm {rf}}\omega _{\mathrm {rf}}}{cE} \varDelta z_0. \end{aligned}$$
(9.22)

Therefore the final bunch length can be approximated as

$$\begin{aligned} \sigma _{zf} = \sqrt{ \left( 1 + R_{56} \frac{eV_{\mathrm {rf}}\omega _{\mathrm {rf}}}{Ec} \right) ^2 \sigma ^2_{z0} + R^2_{56}\sigma ^2_{\varDelta E_0}}, \end{aligned}$$
(9.23)

where \(\sigma _{z0}\) is the initial bunch length and \(\sigma _{\varDelta E_0}\) is the initial beam energy spread [26], with maximal compression for an RF system adjusted such that \(R_{56} \frac{eV_{\mathrm {rf}}\omega _{\mathrm {rf}}}{Ec} \approx -1\).
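
The compression formula (9.23) is easy to evaluate numerically. The sketch below uses invented but plausible values for the beam energy, RF voltage and frequency, \(R_{56}\), and initial bunch parameters, chosen so that \(R_{56}\,eV_{\mathrm{rf}}\omega_{\mathrm{rf}}/(Ec)\) is close to \(-1\).

```python
import numpy as np

c = 299792458.0            # speed of light [m/s]

# Illustrative compressor settings (assumptions, not from the text)
E        = 250e6           # beam energy [eV], so V_rf/E sets the relative chirp
V_rf     = 100e6           # compressor RF voltage [V]
f_rf     = 2.856e9         # RF frequency [Hz]
R56      = -0.04           # dispersive-section R56 [m]
sigma_z0 = 1.0e-3          # initial rms bunch length [m]
sigma_d0 = 1.0e-4          # initial rms relative energy spread

w_rf = 2 * np.pi * f_rf
h = (V_rf / E) * w_rf / c                     # chirp term e*V_rf*w_rf/(E*c) in (9.23)
sigma_zf = np.sqrt((1 + R56 * h)**2 * sigma_z0**2 + R56**2 * sigma_d0**2)

print("R56 * chirp = %.3f (maximal compression near -1)" % (R56 * h))
print("rms bunch length: %.3f mm -> %.3f mm" % (sigma_z0 * 1e3, sigma_zf * 1e3))
```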

9.1.4 RF Systems

For a right-cylindrical conducting cavity of radius \(R_c\), as shown in Fig. 9.3, the 010 transverse-magnetic resonant mode, referred to as \(\mathbf {TM}_{\mathbf {010}}\), is used for acceleration because along the axis this mode has a large oscillating electric field and no magnetic field. The electromagnetic fields of the \(\mathbf {TM}_{\mathbf {010}}\) mode are:

$$\begin{aligned} \mathbf {E}(r,t)= & {} E_0J_0 \left( \frac{2.405 r}{R_c} \right) e^{i\omega _0 t}\hat{\mathbf {z}} = \mathbf {E}_z(r)e^{i\omega _0 t}\hat{\mathbf {z}}, \end{aligned}$$
(9.24)
$$\begin{aligned} \mathbf {B}(r,t)= & {} -iE_0\sqrt{\frac{\epsilon }{\mu }}J_1 \left( \frac{2.405 r}{R_c} \right) e^{i\omega _0 t}\hat{\varphi } = \mathbf {B}_{\varphi }(r)e^{i\omega _0 t}\hat{\varphi }, \end{aligned}$$
(9.25)

where \(J_0\) and \(J_1\) are Bessel functions of the first kind of order zero and one, respectively, and the resonant frequency is given by

$$\begin{aligned} \omega _0 = \frac{2.405 c}{R_c}, \qquad c=\mathrm {speed \ of \ light }. \end{aligned}$$
(9.26)
Fig. 9.3
figure 3

Left: Electromagnetic field orientations for TM\(_{\mathbf {010}}\) accelerating mode of a right cylindrical RF cavity. Center: RLC circuit approximation of the dynamics of a single RF mode. Right: The axial electric field is maximal on axis and zero at the walls of the cavity and the opposite is true of the azimuthal magnetic field

The dynamics of such a single mode of an RF cavity with resonant frequency \(f_0\) can be approximated as

$$\begin{aligned} \ddot{V}_{\mathrm {cav}} + \frac{\omega _0}{Q_L}\dot{V}_{\mathrm {cav}}+ \omega ^2_0 V_{\mathrm {cav}}= \frac{1}{C}\dot{I}, \end{aligned}$$
(9.27)

where \(\dot{V} = \frac{dV}{dt}\), \(\ddot{V} = \frac{d^2V}{dt^2}\), \(\omega _0 = 2\pi f_0\), \(Q_L\) is the loaded quality factor of the resonant cavity, L and C are the inductance and capacitance of the cavity structure, respectively, such that \(\sqrt{LC}=\frac{1}{\omega _0}\), and \(I=I_c + I_b\) is the input current driving the RF fields, the sources of which are both the RF generator, \(I_c\), and the beam itself \(I_b\) [19, 27, 28].

For a driving current of the form

$$\begin{aligned} I_u(t) = I_0\cos (\omega _0 t), \end{aligned}$$
(9.28)

after the fast decay of some transient terms, the cavity response is of the form

$$\begin{aligned} V_{\mathrm {cav}}(t) = R I_0 \left( 1 - e^{-t/\tau } \right) \cos (\omega _0 t), \quad \tau = \frac{2Q_L}{\omega _0}. \end{aligned}$$
(9.29)
Fig. 9.4
figure 4

Amplitude of the cavity field and its phase relative to a reference signal
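
A minimal simulation of the single-mode cavity model (9.27), using toy (non-physical) parameters so that it runs quickly, reproduces the exponential field build-up predicted by (9.29):

```python
import numpy as np

# Toy parameters chosen so the run is fast; real cavities have f0 of hundreds
# of MHz and much larger loaded Q.
f0, Q_L, I0, C = 1.0e4, 50.0, 1.0, 1.0e-6
w0  = 2 * np.pi * f0
tau = 2 * Q_L / w0                   # fill time constant, as in (9.29)
R   = Q_L / (w0 * C)                 # effective shunt impedance of the RLC model

def deriv(t, y):
    """y = [V, dV/dt]; single-mode model (9.27) driven on resonance."""
    V, dV = y
    dI = -I0 * w0 * np.sin(w0 * t)   # d/dt of I(t) = I0 cos(w0 t)
    return np.array([dV, dI / C - (w0 / Q_L) * dV - w0**2 * V])

dt, t_end = 1.0 / (200 * f0), 5 * tau          # 200 steps per RF period
y, t, volts, times = np.zeros(2), 0.0, [], []
while t < t_end:                                # classical RK4 integration
    k1 = deriv(t, y); k2 = deriv(t + dt/2, y + dt/2 * k1)
    k3 = deriv(t + dt/2, y + dt/2 * k2); k4 = deriv(t + dt, y + dt * k3)
    y, t = y + dt/6 * (k1 + 2*k2 + 2*k3 + k4), t + dt
    volts.append(y[0]); times.append(t)

volts, times = np.array(volts), np.array(times)
print("numerical amplitude near t = 5*tau: %.1f" % np.max(np.abs(volts[times > 4*tau])))
print("envelope R*I0*(1 - exp(-5)):        %.1f" % (R * I0 * (1 - np.exp(-t_end / tau))))
```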

9.1.5 Need for Feedback Control

Equation (9.29) implies that for a desired accelerating gradient one must simply choose the correct input power level and drive the cavity, as shown in Fig. 9.4. In the real world, however, simply choosing set points for an RF drive signal does not work, because of un-modeled, time-varying disturbances which perturb the cavity fields from their desired set points. These disturbances include:

  1. Temperature variation-induced resonance frequency drifts on the time scale of minutes to hours.

  2. Mechanical vibrations which alter the cavity resonance frequency on the time scale of milliseconds.

  3. RF source voltage and current fluctuations on the time scale of microseconds.

  4. RF source voltage droop on the time scale of microseconds.

Furthermore, even if a desired accelerating voltage could be reached within a desired rise time, the beam that is to be accelerated itself perturbs the fields when it arrives, both by interacting with the oscillating electrons in the cavity walls and by drawing energy out of the cavity via the electric field which accelerates it. This causes amplitude and phase changes on the time scale of nanoseconds, which must be compensated for in order to maintain proper acceleration of subsequent beam bunches.

Therefore, real time active feedback control is always necessary, both to bring cavity voltage amplitudes and phases to their required set points before the beam can be properly accelerated, and during beam acceleration in order to maintain tight bounds on beam-induced cavity field errors, known as beam loading.

From the above discussion it is clear that all of the disturbances experienced by the RF systems immediately couple into the transverse and longitudinal beam dynamics. Similarly, many aspects of the beam dynamics, including the effects of space charge forces, magnet misalignments, and energy deviations, alter a particle’s position within a bunch and therefore the phase of the RF field relative to the particle’s arrival time. The entire accelerator is therefore a completely coupled system: the final beam phase space distribution depends on the RF systems, the magnet systems, and the forces due to the particles in the beam itself.

9.1.6 Standard Proportional Integral (PI) Control for an RF Cavity

The vast majority of accelerator systems, such as RF feedback loops and power converters, are typically controlled at fixed set points with simple, classical proportional integral (PI) controllers. Therefore we start with a detailed overview of RF cavity phase and amplitude PI control. To develop feedback controllers we must consider the coupled beam-cavity-RF source system. We consider only the \(\omega _0\) frequency component of the beam, \(A_b(t)\cos (\omega _0 t + \theta _b(t))\), an RF driving current of the form \(I_c(t)=A_c(t)\cos (\omega t + \theta _c(t))\), and a cavity field of the form \(V_{\mathrm {cav}}(t) = A_{\mathrm {cav}}(t)\cos (\omega t + \theta _{\mathrm {cav}}(t) )\). The single second order differential equation describing the cavity dynamics, (9.27), can then be simplified to two coupled, linear, first order differential equations:

$$\begin{aligned} \dot{I}= & {} -\omega _{\frac{1}{2}}I - \varDelta \omega Q + \beta _{I,c} I_c + \beta _{I,b} I_b, \end{aligned}$$
(9.30)
$$\begin{aligned} \dot{Q}= & {} \varDelta \omega I -\omega _{\frac{1}{2}}Q + \beta _{Q,c} Q_c + \beta _{Q,b} Q_b, \end{aligned}$$
(9.31)

where \(\varDelta \omega = \omega - \omega _0\) is the difference between RF generator and cavity resonance frequencies, \(\omega _{\frac{1}{2}} = \omega _0/2Q_L\), and the I and Q quantities represent

$$\begin{aligned} I(t)= & {} A_{\mathrm {cav}}(t)\cos (\theta _{\mathrm {cav}}(t)), \;\; I_c(t) = A_c(t)\cos (\theta _c(t)), \;\; I_b(t) = A_b(t)\cos (\theta _b(t)), \end{aligned}$$
(9.32)
$$\begin{aligned} Q(t)= & {} A_{\mathrm {cav}}(t)\sin (\theta _{\mathrm {cav}}(t)), \;\; Q_c(t) = A_c(t)\sin (\theta _c(t)), \;\; Q_b(t) = A_b(t)\sin (\theta _b(t)), \end{aligned}$$
(9.33)

from which amplitudes and phases can be calculated according to

$$\begin{aligned} A_{\bullet }(t) = \sqrt{I^2_\bullet (t) + Q^2_\bullet (t)}, \qquad \theta _\bullet (t) = \arctan \left( \frac{Q_\bullet (t) }{ I_\bullet (t) } \right) . \end{aligned}$$
(9.34)

Equations (9.30), (9.31) can be written in the compact linear form

$$\begin{aligned} \dot{\mathbf {x}} = A\mathbf {x}+ B_c\mathbf {u}+ B_b\mathbf {d}, \;\; \mathbf {x}=\left[ \begin{array}{c} I \\ Q \end{array}\right] , \;\; A=\left[ \begin{array}{cc} -\omega _{\frac{1}{2}} &{} -\varDelta \omega \\ \varDelta \omega &{} -\omega _{\frac{1}{2}} \end{array}\right] , \;\; \mathbf {u}=\left[ \begin{array}{c} I_c \\ Q_c \end{array}\right] , \;\; \mathbf {d}=\left[ \begin{array}{c} I_b \\ Q_b \end{array}\right] , \end{aligned}$$
(9.35)

where \(\mathbf {u}\) is the control vector and the beam itself, \(\mathbf {d}\), is treated as a disturbance. The goal of RF feedback control is typically to maintain the cavity field, as given by \(\mathbf {x}\), at a desired set point, thereby ensuring proper acceleration of the beam. In addition to providing a simple, linear approximation of the dynamics of the beam, cavity, and RF generator system, (9.35) is very useful because a typical digital RF system does not have access to the raw cavity voltage signal \(V_{\mathrm {cav}}(t)\), but rather to \(I_{\mathrm {cav}}(t)\) and \(Q_{\mathrm {cav}}(t)\), which are obtained by down-converting and sampling the cavity field signal. For example, at the Los Alamos Neutron Science Center (LANSCE) linear accelerator, \(f_{RF}= 201.25\) MHz RF signals of the form \(V_{\mathrm {cav}}(t) = A_{\mathrm {cav}}(t)\cos (2\pi f_{RF} t + \theta _{\mathrm {cav}}(t))\) are first mixed down via local oscillators to signals at an intermediate frequency \(f_{IF}= 25\) MHz, of the form \(A_{\mathrm {cav}}(t)\cos (2\pi f_{IF} t + \theta _{\mathrm {cav}}(t))\), which can be expanded in the I, Q formalism as:

$$\begin{aligned}&A_{\mathrm {cav}}(t)\cos (2\pi f_{IF} t + \theta _{\mathrm {cav}}(t))\nonumber \\&\quad =\, \underbrace{A_{\mathrm {cav}}(t)\cos (\theta _{\mathrm {cav}}(t))}_{I_{\mathrm {cav}}(t)}\cos (2\pi f_{IF} t) - \underbrace{A_{\mathrm {cav}}(t)\sin (\theta _{\mathrm {cav}}(t))}_{Q_{\mathrm {cav}}(t)}\sin (2\pi f_{IF} t) \nonumber \\&\quad =\, I_{\mathrm {cav}}(t)\cos (2\pi f_{IF} t) - Q_{\mathrm {cav}}(t) \sin (2\pi f_{IF} t). \end{aligned}$$
(9.36)

Then, by sampling the signal (9.36) at the rate \(f_s = 4\times f_{IF}\), the analog to digital converter (ADC) collects samples at time steps \(t_n = \frac{n}{f_s}\):

$$\begin{aligned} V_{\mathrm {cav}}\left( \frac{n}{4 f_{IF}} \right) = I_{\mathrm {cav}}\left( \frac{n}{4 f_{IF}} \right) \cos \left( \frac{n\pi }{2} \right) - Q_{\mathrm {cav}}\left( \frac{n}{4 f_{IF}} \right) \sin \left( \frac{n\pi }{2} \right) , \end{aligned}$$
(9.37)

directly receiving the samples:

$$\begin{aligned} \left\{ I_{\mathrm {cav}}(0), -Q_{\mathrm {cav}}(t_s), -I_{\mathrm {cav}}(2t_s), Q_{\mathrm {cav}}(3t_s), \dots \right\} . \end{aligned}$$
(9.38)
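
The non-IQ sampling idea of (9.36)–(9.38) can be sketched in a few lines of Python; the amplitude and phase used below are arbitrary test values.

```python
import numpy as np

f_if = 25e6                          # intermediate frequency [Hz]
f_s  = 4 * f_if                      # sampling rate, as in (9.37)
n    = np.arange(16)                 # a few ADC sample indices
t    = n / f_s

# Slowly varying (here constant) amplitude and phase, arbitrary test values
A, theta = 1.0, np.deg2rad(30.0)
adc = A * np.cos(2 * np.pi * f_if * t + theta)

# Undo the {+I, -Q, -I, +Q, ...} pattern of (9.38)
I_samples = adc[0::2] * np.array([+1, -1] * (len(n) // 4))
Q_samples = adc[1::2] * np.array([-1, +1] * (len(n) // 4))

print("I samples:", np.round(I_samples, 3), "expected", round(A * np.cos(theta), 3))
print("Q samples:", np.round(Q_samples, 3), "expected", round(A * np.sin(theta), 3))
```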

The job of the RF control system is to maintain the cavity fields at amplitude and phase set points, \(A_s(t)\) and \(\theta _s(t)\), respectively, which translate into I and Q set points: \(I_s(t) = A_s(t)\cos (\theta _s(t))\), \(Q_s(t)=A_s(t)\sin (\theta _s(t))\). The simplest typical RF feedback control system first compares the cavity I and Q signals to their set points, calculates error signals \(I_e(t) = I_{\mathrm {cav}}(t) - I_s(t)\), \(Q_e(t) = Q_{\mathrm {cav}}(t) - Q_s(t)\), and then performs proportional-integral feedback control of the form

$$\begin{aligned} I_{c}(t) = -k_p I_e(t) - k_i \int \limits _{0}^{t}I_e(\tau )d\tau , \qquad Q_{c}(t) = -k_p Q_e(t) - k_i \int \limits _{0}^{t}Q_e(\tau )d\tau . \end{aligned}$$
(9.39)
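
The following sketch applies the PI law (9.39) to an Euler-discretized version of the I/Q cavity model (9.35) with a step beam disturbance; the bandwidth, detuning, input coupling, gains, and beam current are all invented illustrative numbers, not machine values.

```python
import numpy as np

# Illustrative model numbers (not machine values)
w_half, dw = 2*np.pi*5e3, 2*np.pi*1e3           # half-bandwidth and detuning [rad/s]
A  = np.array([[-w_half, -dw], [dw, -w_half]])  # system matrix of (9.35)
Bc = Bb = np.eye(2) * w_half                    # simple choice of input coupling
kp, ki = 5.0, 2.0e5                             # hand-tuned PI gains for this toy

x_set = np.array([1.0, 0.0])                    # desired (I, Q) set point
x, integ, dt, peak = np.zeros(2), np.zeros(2), 1e-6, 0.0

for step in range(2000):                        # 2 ms of simulated time
    t = step * dt
    d = np.array([-0.5, 0.0]) if t > 1e-3 else np.zeros(2)   # beam arrives at 1 ms
    e = x - x_set                               # error signals I_e, Q_e
    integ += e * dt
    u = -kp * e - ki * integ                    # PI feedback, (9.39)
    x = x + dt * (A @ x + Bc @ u + Bb @ d)      # Euler step of (9.35)
    if t > 1e-3:
        peak = max(peak, abs(x[0] - x_set[0]))  # beam-loading transient size

print("I, Q at end of pulse: %.3f, %.3f" % (x[0], x[1]))
print("peak |I error| after beam arrival: %.3f" % peak)
```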

Typically particle accelerators are pulsed at rates of tens to hundreds of Hz. For example, in the LANSCE accelerator, the RF drive power is turned on for 1 ms at a rate of 120 Hz. Once RF is turned on, cavity fields build up and reach steady state within a few hundred microseconds, after which the cavities are ready to accelerate the beam, whose sudden arrival perturbs the cavity fields, as shown in Fig. 9.5.

Fig. 9.5
figure 5

The RF source, \(I_c(t)\), is turned on at a rate of 120 Hz, for \({\sim }\)1 ms per pulse. The beam, \(I_b(t)\), arrives around \({\sim }350~\upmu \)s into the pulse after the cavity field, \(V_{\mathrm {cav}}(t)\), has had time to settle. The beam’s arrival disrupts the cavity field’s steady state

Although the initial I and Q set points are in the form of smooth ramps, as seen from the shape of the cavity field amplitude in Fig. 9.5, once the field has reached steady state, and before the beam has arrived, the set points are fixed in order to maintain a precise field amplitude and a precise phase offset of the bunches relative to the RF zero crossing. Therefore, in what follows we consider the cavity set points only after steady state has been reached, when they are constants of the form:

$$\begin{aligned} I_{s}(t \ge T_{\mathrm {rise}}) \equiv I_s(T_{\mathrm {rise}}) = I_r, \quad Q_{s}(t \ge T_{\mathrm {rise}}) \equiv Q_s(T_{\mathrm {rise}}) = Q_r. \end{aligned}$$
(9.40)

Plugging the feedback (9.39) into the cavity dynamics (9.35) and rewriting the dynamics in terms of the error variables, we are then left with the closed loop system

$$\begin{aligned} \dot{\mathbf {x}}_e= & {} A\mathbf {x}_e + A\mathbf {x}_r - k_p B_c\mathbf {x}_e -k_i B_c \int \limits _{0}^{t}\mathbf {x}_e(\tau )d\tau + B_b\mathbf {d}, \quad \mathbf {x}_e =\left[ \begin{array}{c} I_e \\ Q_e \end{array}\right] , \quad \mathbf {x}_r =\left[ \begin{array}{c} I_r \\ Q_r \end{array}\right] . \end{aligned}$$
(9.41)

Taking the Laplace transform of both sides of (9.41), assuming that we are at steady state so that \(\mathbf {x}_e(T_{\mathrm {rise}})=0\), we get

$$\begin{aligned} s\mathbf {X}_e(s)= & {} A \mathbf {X}_e(s) + \frac{1}{s}A \mathbf {x}_r -k_p B_c \mathbf {X}_e(s) - \frac{1}{s} k_i B_c \mathbf {X}_e(s) + B_d \mathbf {D}(s) \nonumber \\&\Longrightarrow&\nonumber \\ \mathbf {X}_e(s)= & {} \left( s^2I + s\left( k_pB_c - A \right) + k_i B_c \right) ^{-1}\left( A \mathbf {x}_r + sB_d \mathbf {D}(s) \right) . \end{aligned}$$
(9.42)

The gains \(k_i\) and \(k_p\) of the simple PI feedback control loop are then tuned in order to maintain minimal error despite the disturbances \(A \mathbf {x}_r\) and \(sB_d \mathbf {D}(s)\). The constant term \(A\mathbf {x}_r\) is due to the natural damping of the RF cavity and is easily compensated for. The more important and more difficult term is \(sB_d\mathbf {D}(s)\), which, in the time domain, is proportional to the derivative of the beam current, \(B_d \dot{\mathbf {d}}(t)\). Because the beam is typically ramped up to an intense current very quickly (tens of microseconds) or consists of an extremely short pulse, this derivative term is extremely disruptive to the cavity field phase and amplitude. Some typical beam current and bunch timing profiles are shown in Fig. 9.6. Currently, LCLS is able to accelerate 1 nC bunches during extremely powerful \({\sim }3~\upmu \)s RF pulses, with a separation of 8.3 ms between bunches. The European XFEL is pushing orders of magnitude beyond the LCLS bunch timing with 1 nC pulses separated by only 220 ns. This is extremely challenging for an RF system which must maintain field amplitude and phase set points and recover between bunches. The proposed MaRIE accelerator will push this problem another order of magnitude in attempting to accelerate high charge pulses with only \(\sim \)2.5 ns of separation.

Fig. 9.6
figure 6

Beam current time profiles of several accelerators are shown

Although the PI controller used in (9.41) can theoretically hold the error \(\mathbf {x}_e\) arbitrarily close to zero arbitrarily fast by choosing large enough gains \(k_i\) and \(k_p\) relative to the magnitude of the beam disturbance \(\left\| B_d\dot{\mathbf {d}}(t) \right\| \), in practice all control gains are limited by actuator saturation, response time, and, most importantly, delay in the feedback loop. A typical RF feedback loop is shown in Fig. 9.7 and may experience as much as 5 \(\upmu \)s of round trip delay, which is a large delay relative to beam transient times.

Fig. 9.7
figure 7

Typical digital RF control setup with signals coming from the cavity into the digital FPGA-based controller and then back out through a chain of amplifiers

Consider, for example, the following scalar delay system, where the goal is to quickly drive x(t) to zero from an arbitrary initial condition using a controller which only has access to a delayed measurement \(x(t-D)\). Considering a simple proportional feedback control, \(u=-kx\), for the system

$$\begin{aligned} \dot{x}(t) = u(x(t-D)) \qquad \Longrightarrow \qquad \dot{x}(t) = - kx(t-D), \end{aligned}$$
(9.43)

taking Laplace transforms we get

$$\begin{aligned} sX(s) - x(0) = - ke^{-Ds}X(s) \qquad \Longrightarrow \qquad X(s) = \frac{x(0)}{s + ke^{-Ds} }. \end{aligned}$$
(9.44)

If we assume the delay is small, \(D \ll 1\), we can approximate \(e^{-Ds} \approx 1 - Ds\), invert the Laplace transform and get the solution

$$\begin{aligned} x(t) = x(0)e^{\gamma t}, \quad \gamma = \frac{-k}{1-kD}, \end{aligned}$$
(9.45)

which exponentially converges to 0 for \(\gamma < 0\), requiring that k satisfy \(\frac{1}{D}> k > 0\), a limit on the possible stabilizing values of the feedback control gain. If our system (9.43) had an external disturbance d(t), this gain limit would be a major limitation in terms of compensating for a large or fast d(t).
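
This gain limit is easy to see numerically. The sketch below integrates the delayed feedback system (9.43) for an example delay of \(D = 0.1\) s and several gains; gains well below 1/D give a decaying response, while a gain well above it destabilizes the loop.

```python
import numpy as np

def simulate(k, D=0.1, dt=1e-3, t_end=20.0, x0=1.0):
    """Euler integration of dx/dt = -k x(t - D), with x held at x0 for t < 0."""
    n_delay = int(round(D / dt))
    hist = [x0] * (n_delay + 1)
    x = x0
    for _ in range(int(t_end / dt)):
        x_delayed = hist[-(n_delay + 1)]        # x(t - D)
        x = x + dt * (-k * x_delayed)
        hist.append(x)
    return abs(x)

# Here 1/D = 10: modest gains decay, a gain far beyond 1/D destabilizes the loop
for k in (5.0, 9.0, 20.0):
    print("k = %4.1f  ->  |x(t_end)| = %.2e" % (k, simulate(k)))
```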

Because of such limitations, a feedback-only low-level RF (LLRF) system’s response to beam loading would typically look like the results shown in Fig. 9.8, where each intense beam pulse causes a large deviation of the accelerating field’s voltage from the design phase and amplitude, which must be restored before the next bunch can be properly accelerated.

Fig. 9.8
figure 8

Cavity field errors with frequency shift, RF power droop, beam loading, and simple proportional-integral feedback control

9.2 Advanced Control and Tuning Topics

For problems which can be accurately modeled, such as systems that do not vary with time and for which extensive, detailed diagnostics exist, there are many powerful optimization methods, such as genetic algorithms (GA), which can be used during the design of an accelerator by performing extremely large searches over parameter space [29]. Such multi-objective genetic algorithms (MOGA) have been applied to the design of radio frequency cavities [30], photoinjectors [31], damping rings [32], storage ring dynamics [33], lattice design [34], neutrino factory design [35], simultaneous optimization of beam emittance and dynamic aperture [36], free electron laser linac drivers [37], and various other accelerator physics applications [38]. One extension of MOGA, multi-objective particle swarm optimization, has been used for emittance reduction [39]. Brute force approaches such as GA and MOGA search over the entire parameter space of interest and therefore result in global optimization. However, such model-based approaches are only optimal relative to the specific model which they use, which in practice rarely exactly matches the actual machine once it is built. Differences are due to imperfect models, uncertainty, and the finite precision of construction. Therefore, actual machine settings undergo extensive tuning and tweaking in order to reach optimal performance. Recently, efforts have been made to implement a GA method on-line for the minimization of beam size at SPEAR3 [40]. Robust conjugate direction search (RCDS) is another optimization method. RCDS is model independent, but at the start of optimization it must learn the conjugate directions of the given system, and it is therefore not applicable to quickly time-varying systems [41, 42]. Optimization of nonlinear storage ring dynamics via RCDS and particle swarm has been performed online [43].

Although many modern, well behaved machines can be optimized with any of the methods mentioned above, and once at steady state their operation may not require fast re-tuning, future light sources will require algorithms with an ability to quickly switch between various operating conditions and to handle quickly time-varying systems, based only on scalar measurements rather than a detailed knowledge of the system dynamics, when compensating for complex collective effects. If any of the methods above were used, they would have to be repeated every time component settings were significantly changed, and it is highly unlikely that they would converge or be well behaved during un-modeled, fast time-variation of components. Therefore, a model-independent, feedback-based control and tuning procedure is required which can function on nonlinear and time-varying systems with many coupled components.

The type of tuning problems that we are interested in have recently been approached with powerful machine learning methods [15, 44], which are showing very promising results. However, these methods require large training sets in order to learn how to reach specific machine set points and interpolate in between. For example, if a user requests a combination of beam energy, pulse charge, and bunch length which was not a member of a neural network-based controller’s training set, the achieved machine performance is not predictable. Furthermore, machine components slowly drift with time and un-modeled disturbances are present, which limit any learning-based algorithm’s abilities. Extremum seeking (ES) is a simple, local, model-independent algorithm for accelerator tuning, whose speed of convergence allows for the optimization and real-time tracking of many coupled parameters of time-varying nonlinear systems. Because ES is model independent, robust to noise, and has analytically guaranteed parameter bounds and update rates, it is useful for real time feedback in actual machines. One of the limitations of ES is that it is a local optimizer which can possibly be trapped in local minima.

It is our belief that the combination of ES and machine learning methods will be a powerful approach for quickly tuning FELs between drastically different user-desired beam and light properties. For example, once a deep neural network (NN) has learned a mapping from machine settings to light properties for a given accelerator, based on collected machine data, it can be used to quickly bring the machine within a local proximity of the required settings for a given user experiment. However, the performance will be limited by the facts that the machine changes with time, that the desired experiment settings were not in the training data, and that un-modeled disturbances are present. Therefore, once brought within a small neighborhood of the required settings via the NN, ES can be used to achieve locally optimal tuning, and can also continuously re-tune to compensate for un-modeled disturbances and time variation of components. In the remainder of this chapter we will focus on the ES method, giving a general overview of the procedure and several simulation and in-hardware demonstrations of applications of the method. Further details on machine learning approaches can be found in [15, 44] and the references within.

9.3 Introduction to Extremum Seeking Control

The extremum seeking method described in this chapter is a recently developed, general approach for the stabilization of noisy, uncertain, open-loop unstable, time-varying systems [6, 7]. The main benefits of this approach are:

  1. The method can tune many parameters of unknown, nonlinear, open-loop unstable systems simultaneously.

  2. The method is robust to measurement noise and external disturbances and can track quickly time-varying parameters.

  3. Although operating on noisy and analytically unknown systems, the parameter updates have analytically guaranteed constraints, which make the method safe for in-hardware implementation.

This method has been implemented in simulation to automatically tune large systems of magnets and RF set points to optimize beam parameters [11]. It has been utilized in hardware at the proton linear accelerator at the Los Alamos Neutron Science Center to automatically tune two RF buncher cavities to maximize the RF system’s beam acceptance, based only on a noisy measurement of beam current [12]; at the Facility for Advanced Accelerator Experimental Tests, to non-destructively predict electron bunch properties via a coupling of simulation and machine data [13]; for bunch compressor design [45]; and for the automated tuning of magnets in a time-varying lattice to continuously minimize betatron oscillations at SPEAR3 [8]. Furthermore, analytic proofs of convergence for the method are available for constrained systems with general, non-differentiable controllers [9, 10].

9.3.1 Physical Motivation

It has been shown that unexpected stability properties can be achieved in dynamic systems by introducing fast, small oscillations. One example is the stabilization of the vertical equilibrium point of an inverted pendulum by quickly oscillating the pendulum’s pivot point. Kapitza first analyzed these dynamics in the 1950s [46]. The ES approach is in some ways related to such vibrational stabilization as high frequency oscillations are used to stabilize desired points of a system’s state space and to force trajectories to converge to these points. This is done by creating cost functions whose minima correspond to the points of interest, allowing us to tune a large family of systems without relying on any models or system knowledge. The method even works for unknown functions, where we do not choose which point of the state space to stabilize, but rather are minimizing an analytically unknown function whose noisy measurements we are able to sample.

To give an intuitive 2D overview of this method, we consider finding the minimum of an unknown function C(x, y). We propose the following scheme:

$$\begin{aligned} \frac{d x}{d t}= & {} \sqrt{\alpha \omega } \cos \left( \omega t + k C(x,y) \right) \end{aligned}$$
(9.46)
$$\begin{aligned} \frac{d y}{d t}= & {} \sqrt{\alpha \omega } \sin \left( \omega t + k C(x,y) \right) . \end{aligned}$$
(9.47)

Note that although C(x, y) enters the argument of the adaptive scheme, we do not rely on any knowledge of the analytic form of C(x, y); we simply assume that its value is available for measurement at different locations (x, y).

The velocity vector,

$$\begin{aligned} \mathbf {v}= & {} \left( \frac{d x}{d t}, \frac{d y}{d t} \right) = \sqrt{\alpha \omega } \left[ \cos \left( \theta (t) \right) , \sin \left( \theta (t) \right) \right] , \end{aligned}$$
(9.48)
$$\begin{aligned} \theta (t)= & {} \omega t + k C(x(t),y(t)), \end{aligned}$$
(9.49)

has constant magnitude, \(\left\| \mathbf {v} \right\| = \sqrt{\alpha \omega }\), and therefore the trajectory (x(t), y(t)) moves at a constant speed. However, the rate at which the direction of the trajectory's heading changes is a function of \(\omega \), k, and C(x(t), y(t)), expressed as:

$$\begin{aligned} \frac{d \theta }{d t} = \omega + k \left( \frac{\partial C}{\partial x}\frac{d x}{d t} + \frac{\partial C}{\partial y}\frac{d y}{d t} \right) . \end{aligned}$$
(9.50)

Therefore, when the trajectory is heading in the correct direction, towards a decreasing value of C(x(t), y(t)), the term \(k \frac{\partial C}{\partial t}\) is negative, so the overall turning rate \(\frac{\partial \theta }{\partial t}\) in (9.50) is decreased. On the other hand, when the trajectory is heading in the wrong direction, towards an increasing value of C(x(t), y(t)), the term \(k \frac{\partial C}{\partial t}\) is positive, and the turning rate is increased. On average, the system ends up approaching the minimizing location of C(x(t), y(t)) because it spends more time moving towards it than away from it.

Fig. 9.9
figure 9

The subfigure in the bottom left shows the rotation rate, \(\frac{\partial \theta }{\partial t} = \omega + k\frac{\partial C(x,y)}{\partial t}\), for the part of the trajectory that is bold red, which takes place during the first 0.5 s of simulation. The rotation of the parameters’ velocity vector \(\mathbf {v}(t)\) slows down when heading towards the minimum of \(C(x,y)=x^2 + y^2\), at which time \(k \frac{\partial C}{\partial t}<0\), and speeds up when heading away from the minimum, when \(k \frac{\partial C}{\partial t}>0\). The system therefore spends more time heading towards the minimum of C(x, y) than away from it and so approaches the minimum

The ability of this direction-dependent turning rate scheme to locate a minimum is apparent in the simulation of system (9.46), (9.47) shown in Fig. 9.9. The system, starting at initial location \(x(0)=1\), \(y(0)=-1\), is simulated for 5 s with update parameters \(\omega = 50\), \(k=5\), \(\alpha =0.5\), and \(C(x,y)=x^2+y^2\). We compare the actual system’s (9.46), (9.47) dynamics with those of a system performing gradient descent:

$$\begin{aligned} \frac{d \bar{x}}{d t}\approx & {} -\frac{k\alpha }{2} \frac{\partial C(\bar{x},\bar{y})}{\partial \bar{x}} = -k\alpha \bar{x} \end{aligned}$$
(9.51)
$$\begin{aligned} \frac{d \bar{y}}{d t}\approx & {} -\frac{k\alpha }{2} \frac{\partial C(\bar{x},\bar{y})}{\partial \bar{y}} = -k\alpha \bar{y}, \end{aligned}$$
(9.52)

whose behavior our system mimics on average, with the difference

$$\begin{aligned} \max _{t\in [0,T]} \left\| (x(t),y(t)) - \left( \bar{x}(t), \bar{y}(t) \right) \right\| \end{aligned}$$
(9.53)

made arbitrarily small for any value of T, by choosing arbitrarily large values of \(\omega \).

Towards the end of the simulation, when the system’s trajectory is near the origin, \(C(x,y) \approx 0\), and the dynamics of (9.46), (9.47) are approximately

$$\begin{aligned} \frac{\partial x}{\partial t}\approx & {} \sqrt{\alpha \omega } \cos \left( \omega t \right) \Longrightarrow x(t) \approx \sqrt{\frac{\alpha }{\omega }} \sin \left( \omega t \right) \end{aligned}$$
(9.54)
$$\begin{aligned} \frac{\partial y}{\partial t}\approx & {} \sqrt{\alpha \omega } \sin \left( \omega t \right) \Longrightarrow y(t) \approx - \sqrt{\frac{\alpha }{\omega }} \cos \left( \omega t \right) , \end{aligned}$$
(9.55)

a circle of radius \(\sqrt{\frac{\alpha }{\omega }}\), which is made arbitrarily small by choosing arbitrarily large values of \(\omega \). Convergence towards a maximum, rather than a minimum, is achieved by replacing k with \(-k\).
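
The 2D scheme (9.46), (9.47) can be reproduced with a few lines of Python using the parameter values quoted above (\(\omega = 50\), \(k = 5\), \(\alpha = 0.5\), \(C(x,y) = x^2 + y^2\)); the integration step size is an arbitrary choice.

```python
import numpy as np

w, k, alpha = 50.0, 5.0, 0.5
C = lambda x, y: x**2 + y**2                # only measured values are used below

dt, T = 1e-4, 5.0
x, y = 1.0, -1.0                            # ES trajectory, as in the text
xb, yb = 1.0, -1.0                          # averaged gradient-descent trajectory

for i in range(int(T / dt)):
    t = i * dt
    c = C(x, y)                             # scalar cost measurement
    x += dt * np.sqrt(alpha * w) * np.cos(w * t + k * c)   # (9.46)
    y += dt * np.sqrt(alpha * w) * np.sin(w * t + k * c)   # (9.47)
    xb += dt * (-k * alpha * xb)            # (9.51)
    yb += dt * (-k * alpha * yb)            # (9.52)

print("ES endpoint:      (%.4f, %.4f)" % (x, y))
print("gradient descent: (%.4f, %.4f)" % (xb, yb))
print("residual radius sqrt(alpha/w) = %.3f" % np.sqrt(alpha / w))
```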

9.3.2 General ES Scheme

For general tuning, we consider the problem of locating an extremum point of the function \(C(\mathbf {p},t):\mathbb {R}^n\times \mathbb {R}^+ \rightarrow \mathbb {R}\), for \(\mathbf {p} = (p_1, \dots , p_n) \in \mathbb {R}^n\), when only a noise-corrupted measurement \(y(t) = C(\mathbf {p},t) + n(t)\) is available, with the analytic form of C unknown. For notational convenience, in what follows we sometimes write \(C(\mathbf {p})\) or just C instead of \(C(\mathbf {p}(t),t)\).

The explanation presented in the previous section used \(\sin (\cdot )\) and \(\cos (\cdot )\) functions for the x and y dynamics to give circular trajectories. The actual requirement for convergence is that the functions used to perturb different parameters be independent in the frequency domain. In what follows, replacing \(\cos (\cdot )\) with \(\sin (\cdot )\) throughout makes no difference.

Fig. 9.10
figure 10

Tuning of the ith component \(p_i\) of \(\mathbf {p} = (p_1, \ldots , p_n) \in \mathbb {R}^n\). The symbol \(\frac{1}{s}\) denotes the Laplace Transform of an integrator, so that in the above diagram \(p_i(t) = p_i(0)+ \int \nolimits _{0}^{t} u_i(\tau )d\tau \)

Theorem 1

Consider the setup shown in Fig. 9.10 (for maximum seeking we replace k with \(-k\)):

$$\begin{aligned} \dot{p}_i = \sqrt{\alpha \omega _i}\cos \left( \omega _i t + k y \right) , \qquad y = C(\mathbf {p},t) + n(t) \end{aligned}$$
(9.56)

where \(\omega _i = \omega _0 r_i\) such that \(r_i \ne r_j \ \forall i\ne j\) and n(t) is additive noise. The trajectory of system (9.56) approaches the minimum of \(C(\mathbf {p},t)\), with its trajectory arbitrarily close to that of

$$\begin{aligned} \dot{\bar{\mathbf {p}}} = -\frac{k \alpha }{2} \mathbf {\nabla C}, \quad \bar{\mathbf {p}}(0)=\mathbf {p}(0) \end{aligned}$$
(9.57)

with the distance between the two decreasing as a function of increasing \(\omega _0\). Namely, for any given \(T \in [0,\infty )\), any compact set of allowable parameters \(\mathbf {p}\in K \subset \mathbb {R}^n\), and any desired accuracy \(\delta \), there exists \(\omega ^\star _0\) such that for all \(\omega _0 > \omega ^\star _0\), the distance between the trajectory \(\mathbf {p}(t)\) of (9.56) and \(\bar{\mathbf {p}}(t)\) of (9.57) satisfies the bound

$$\begin{aligned} \max _{\mathbf {p},{\bar{\mathbf {p}}}\in K, t \in [0, T]} \left\| \mathbf {p}(t) - {\bar{\mathbf {p}}}(t) \right\| < \delta . \end{aligned}$$
(9.58)

Remark 1

One of the most important features of this scheme is that on average the system performs a gradient descent of the actual, unknown function C despite feedback being based only on its noise corrupted measurement \(y=C(\mathbf {p},t)+n(t)\).

Remark 2

The stability of this scheme in the presence of disturbances is seen from the fact that the addition of an un-modeled, possibly destabilizing perturbation of the form \(\mathbf {f}(\mathbf {p},t)\) to the dynamics of \(\dot{\mathbf {p}}\) results in the averaged system:

$$\begin{aligned} \dot{\bar{\mathbf {p}}} = \mathbf {f}(\bar{\mathbf {p}},t)-\frac{k \alpha }{2} \mathbf {\nabla C} , \end{aligned}$$
(9.59)

which may be made to approach the minimum of C by choosing \(k\alpha \) large enough relative to the values of \(\left\| \left( \mathbf {\nabla C} \right) ^T \right\| \) and \(\left\| \mathbf {f}(\bar{\mathbf {p}},t) \right\| \).

Remark 3

In the case of a time-varying max/min location \(\mathbf {p}^\star (t)\) of \(C(\mathbf {p},t)\), there will be terms of the form:

$$\begin{aligned} \frac{1}{\sqrt{\omega }} \left| \frac{\partial C(\mathbf {p},t)}{\partial t} \right| , \end{aligned}$$
(9.60)

which are made to approach zero by increasing \(\omega \). Furthermore, in the analysis of the convergence of the error \(\mathbf {p}_e(t) = \mathbf {p}(t) - \mathbf {p}^\star (t)\) there will be terms of the form:

$$\begin{aligned} \frac{1}{k\alpha } \left| \frac{\partial C(\mathbf {p},t)}{\partial t} \right| . \end{aligned}$$
(9.61)

Together, (9.60) and (9.61) imply the intuitively obvious fact that for systems whose time-variation is fast, in which the minimum towards which we are descending is quickly moving, both the value of \(\omega \) and the product \(k\alpha \) must be larger than in the time-invariant case.

Remark 4

In the case of different parameters having vastly different response characteristics and sensitivities (such as when tuning both RF and magnet settings in the same scheme), the choices of k and \(\alpha \) may be specified differently for each component \(p_i\), as \(k_i\) and \(\alpha _i\), without change to the above analysis.
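
A minimal n-parameter implementation of the scheme in Theorem 1, written as a finite-difference update, might look as follows; the test cost function, frequencies, gains, and iteration count are all illustrative assumptions.

```python
import numpy as np

def es_step(p, cost_value, step_index, dt, w, k=1.0, alpha=0.1):
    """One finite-difference ES update of all n parameters, cf. (9.56).

    p           current parameter vector, shape (n,)
    cost_value  noisy scalar measurement y = C(p, t) + n(t)
    w           n distinct dithering frequencies w_i = w0 * r_i
    """
    t = step_index * dt
    return p + dt * np.sqrt(alpha * w) * np.cos(w * t + k * cost_value)

# Toy usage on a cost unknown to the algorithm (only sampled values are used)
rng = np.random.default_rng(0)
p_star = np.array([1.0, -2.0, 0.5, 3.0])                 # unknown optimum
cost = lambda p: np.sum((p - p_star)**2) + 0.01 * rng.standard_normal()

w = 500.0 * np.array([1.0, 1.3, 1.7, 2.1])               # distinct r_i
p, dt = np.zeros(4), 1e-4
for i in range(100000):
    p = es_step(p, cost(p), i, dt, w, k=2.0, alpha=0.5)

print("recovered parameters:", np.round(p, 2))
print("true optimum:        ", p_star)
```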

A more general form of the scheme for simultaneous stabilization and optimization of an n-dimensional open-loop unstable system with analytically unknown noise-corrupted output function \(C(\mathbf {x},t)\) is shown in Fig. 9.11, but will not be discussed in detail here.

Fig. 9.11
figure 11

ES for simultaneous stabilization and optimization of an unknown, open-loop unstable system based on a noise corrupted scalar measurement

9.3.3 ES for RF Beam Loading Compensation

The ES method described above has been used in simulation and optimization studies and has been implemented in hardware in accelerators. We now return to the RF problem described in Sect. 9.1.6, where we discussed the fact that, due to delay-limited gains and power limitations, the sudden transient caused by beam loading greatly disturbs the RF fields of accelerating cavities, which must be re-settled to within prescribed bounds before the next bunches can be brought in for acceleration. ES has been applied to this beam loading problem at the LANSCE accelerator via a high speed field programmable gate array (FPGA).

In order to control the amplitude and phase of the RF cavity accelerating field, the \(I(t)=A(t)\cos (\theta (t))\) and \(Q(t)=A(t)\sin (\theta (t))\) components of the cavity voltage signal were sampled as described in Sect. 9.1.6, at a rate of 100 MS/s during a 1000 \(\upmu \)s RF pulse. The detected RF signal was then broken down into 10 \(\upmu \)s long sections and feed forward \(I_{\mathrm {ff},j}(n)\) and \(Q_{\mathrm {ff},j}(n)\) control outputs were generated for each 10 \(\upmu \)s long section, as shown in Fig. 9.12.

Remark 5

In the discussion and figures that follow, we refer to \(I_{\mathrm {cav}}(t)\) and \(Q_{\mathrm {cav}}(t)\) simply as I(t) and Q(t).

Fig. 9.12
figure 12

Top: Iterative scheme for determining I and Q costs during 1–10 \(\upmu \)s intervals. Bottom: ES-based feedforward outputs for beam loading transient compensation

The iterative extremum seeking was performed via finite difference approximation of the ES dynamics:

$$\begin{aligned} \frac{x(t+dt)-x(t)}{dt} \approx \frac{dx}{dt} = \sqrt{\alpha \omega }\cos (\omega t + k C(x,t)), \end{aligned}$$
(9.62)

by updating the feedforward signals according to

$$\begin{aligned} I_{\mathrm {ff},j}(n+1) = I_{\mathrm {ff},j}(n) + \varDelta \sqrt{\alpha \omega }\cos \left( \omega n \varDelta + k C_{I,j}(n) \right) , \end{aligned}$$
(9.63)

and

$$\begin{aligned} Q_{\mathrm {ff},j}(n+1) = Q_{\mathrm {ff},j}(n) + \varDelta \sqrt{\alpha \omega }\sin \left( \omega n \varDelta + k C_{Q,j}(n) \right) , \end{aligned}$$
(9.64)

where the individual I and Q costs were calculated as

$$\begin{aligned} C_{I,j}(n)= & {} \int \limits _{t_j}^{t_{j+1}} \left| I(t) - I_{s}(t) \right| dt, \end{aligned}$$
(9.65)
$$\begin{aligned} C_{Q,j}(n)= & {} \int \limits _{t_j}^{t_{j+1}} \left| Q(t) - Q_{s}(t) \right| dt. \end{aligned}$$
(9.66)

Note that although the \(I_j\) and \(Q_j\) parameters were updated based on separate costs, they were still dithered with different functions, \(\sin (\cdot )\) and \(\cos (\cdot )\), to help maintain orthogonality in the frequency domain. The feed-forward signals were then added to the PI and static feed-forward controller outputs. Running at a repetition rate of 120 Hz, the feedback converges within several hundred iterations, i.e., a few seconds.
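
A conceptual sketch of this iterative feedforward procedure is given below. The function measure_pulse() is a hypothetical stand-in for the FPGA/LLRF measurement of each 10 \(\upmu \)s window (here reduced to one sample per window and driven by a fake, repeatable droop), and all gains are invented; it only illustrates the structure of the updates (9.63)–(9.66), not the actual LANSCE implementation.

```python
import numpy as np

n_win = 100
I_set, Q_set = 1.0, 0.0                        # flat-top set points

def measure_pulse(I_ff, Q_ff):
    """Hypothetical stand-in for one RF pulse measurement, one value per window."""
    droop = np.linspace(0.0, -0.05, n_win)     # fake, repeatable disturbance
    return I_set + droop + I_ff, Q_set + 0.5 * droop + Q_ff

w, k, alpha, dT = 1.0, 100.0, 2.5e-5, 1.0      # illustrative ES dither and gains
I_ff, Q_ff = np.zeros(n_win), np.zeros(n_win)

I_meas, Q_meas = measure_pulse(I_ff, Q_ff)
print("worst |I error| before tuning: %.4f" % np.max(np.abs(I_meas - I_set)))

for n in range(1000):                          # one ES update per RF pulse
    I_meas, Q_meas = measure_pulse(I_ff, Q_ff)
    C_I = np.abs(I_meas - I_set)               # windowed costs, cf. (9.65)
    C_Q = np.abs(Q_meas - Q_set)               # and (9.66)
    I_ff += dT * np.sqrt(alpha * w) * np.cos(w * n * dT + k * C_I)
    Q_ff += dT * np.sqrt(alpha * w) * np.sin(w * n * dT + k * C_Q)

I_meas, Q_meas = measure_pulse(I_ff, Q_ff)
print("worst |I error| after tuning:  %.4f" % np.max(np.abs(I_meas - I_set)))
print("worst |Q error| after tuning:  %.4f" % np.max(np.abs(Q_meas - Q_set)))
```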

These preliminary experimental results are shown in Fig. 9.13 and summarized in Table 9.1. The maximum, rms, and average values are all calculated during a \(150\, \upmu \)s window which includes the beam turn-on transient, in order to capture the worst case scenario. The ES-based scheme is a \({>}2{\times }\) improvement over static feed-forward in terms of maximum error and a \({>}3{\times }\) improvement in terms of rms error. With the currently used FPGA, the ES window lengths can be further reduced from \(10\,\upmu \)s to 10 ns, and with the latest FPGAs down to 1 ns, which will greatly improve the ES performance.

Fig. 9.13
figure 13

Phase and amplitude errors shown before, during, and after beam turn-on transient. The histogram data shown is collected during the dashed histogram window, and cleaned up via 100 point moving average after raw data was sampled at 100 MS/s. Black: Beam OFF. Blue: Beam ON, feedback, and static feed-forward only. Red: Beam ON, feedback, static feed-forward, and iterative ES feed-forward

Table 9.1 ES performance during beam turn on transient

9.3.4 ES for Magnet Tuning

ES has also been tested in hardware for magnet-based beam dynamics tuning, as described in Sect. 9.1.1. At the SPEAR3 synchrotron at SLAC, ES was used for continuous re-tuning of the eight parameter system shown in Fig. 9.14, in which the delay, pulse width, and voltage of two injection kickers, \(K_1\) and \(K_2\), as well as the currents of two skew quadrupoles, \(S_1\) and \(S_2\), were tuned in order to optimize the injection kicker bump match, minimizing betatron oscillations. At SPEAR3, we simultaneously tuned 8 parameters: (1) \(p_1 = K_1\) delay, (2) \(p_2=K_1\) pulse width, (3) \(p_3=K_1\) voltage, (4) \(p_4=K_2\) delay, (5) \(p_5=K_2\) pulse width, (6) \(p_6=K_2\) voltage, (7) \(p_7=S_1\) current, and (8) \(p_8=S_2\) current. The parameters are illustrated in Figs. 9.14 and 9.15. Although the controlled quantities were the voltage for the kicker magnets \(K_1, K_2\) and the current for the skew quadrupole magnets \(S_1, S_2\), in each case a change in the setting resulted in a change in magnetic field strength.

Fig. 9.14

Kicker magnets and skew quadrupole magnets. When the beam is kicked in and out of orbit, because of imperfect magnet matching, betatron oscillations occur, which are sampled at the BPM every time the beam completes a turn around the machine

Fig. 9.15

Left: Kicker magnet delay (d), pulse width (w), and voltage (v) were adaptively adjusted, as well as the skew quadrupole magnet currents (i). Right: Comparison of beam quality with and without adaptation

The cost function used for tuning was a weighted sum of the horizontal, \(\sigma _x\), and vertical, \(\sigma _y\), rms spreads of beam position monitor readings over 256 turns, the minimization of which reduced the betatron oscillations,

$$\begin{aligned} C= & {} \sqrt{\frac{1}{256}\sum _{i=1}^{256}\left( x(i) - \bar{x} \right) ^2 } + \sqrt{\frac{9}{256}\sum _{i=1}^{256}\left( y(i) - \bar{y} \right) ^2 } \nonumber \\= & {} \sigma _x + 3\sigma _y, \end{aligned}$$
(9.67)

where the factor of 3 was added to increase the weight of the vertical oscillations, which require tighter control since the vertical beam size is much smaller and therefore users are more sensitive to vertical oscillations.

The cost was computed from beam position monitor (BPM) measurements in the SPEAR3 ring, namely the centroid x and y position of the beam recorded at each revolution, as shown in Fig. 9.14. The spreads \(\sigma _x\) and \(\sigma _y\) were calculated from this data, as in (9.67). Feedback was implemented via the Experimental Physics and Industrial Control System (EPICS) [47].
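As a minimal sketch, assuming the turn-by-turn centroid readings are available as arrays, the cost (9.67) can be evaluated as follows; the function and array names are placeholders rather than anything from the SPEAR3 control system.

```python
import numpy as np

def betatron_cost(x_turns, y_turns):
    """Cost (9.67): horizontal rms spread plus three times the vertical rms
    spread, computed over 256 turn-by-turn BPM centroid readings."""
    sigma_x = np.std(x_turns[:256])   # sqrt(mean((x - x_bar)**2))
    sigma_y = np.std(y_turns[:256])
    return sigma_x + 3.0 * sigma_y
```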

To demonstrate the scheme's ability to compensate for an uncertain, time-varying perturbation of the system, we purposely varied the voltage (and therefore the resulting magnetic field strength) of the third kicker magnet, \(K_3(t)\). The kicker voltage was varied sinusoidally over a range of \({\pm }\)6% over the course of 1.5 h, which is a very dramatic and fast change relative to actual machine parameter drift rates and magnitudes. The ES scheme was implemented by setting parameter values, kicking an electron beam out of and back into the ring, and recording beam position monitor data for a few thousand turns. From this data the cost was calculated as in (9.67). The magnet settings were then adjusted, the beam was kicked again, and a new cost was calculated. This process was repeated, and the cost was iteratively and continuously minimized.
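One possible realization of this iterate-measure-update loop is sketched below using the pyepics channel-access bindings. The process-variable names, dither frequencies, gains, and iteration count are purely hypothetical assumptions for illustration; they are not the actual SPEAR3 channel names or tuning constants.

```python
import numpy as np
import epics  # pyepics channel-access bindings

# Hypothetical process-variable names; the real SPEAR3 PVs are different.
SETTING_PVS = ["K1:DELAY", "K1:WIDTH", "K1:VOLT",
               "K2:DELAY", "K2:WIDTH", "K2:VOLT",
               "SQ1:CURR", "SQ2:CURR"]

k, alpha, delta = 1.0, 1e-2, 1.0
# Distinct dither frequencies, one per parameter (illustrative values).
omegas = 2 * np.pi * np.linspace(1.0, 1.75, len(SETTING_PVS))

def measure_cost():
    # Kick the beam, read turn-by-turn BPM data, and evaluate (9.67).
    x = np.asarray(epics.caget("BPM:X_TBT"))[:256]   # placeholder waveform PVs
    y = np.asarray(epics.caget("BPM:Y_TBT"))[:256]
    return np.std(x) + 3.0 * np.std(y)

# In practice each setting would be normalized to a common range before a
# shared dither amplitude alpha is applied; that step is omitted here.
p = np.array([epics.caget(pv) for pv in SETTING_PVS], dtype=float)
for n in range(500):                                 # iterative ES loop
    C = measure_cost()
    p += delta * np.sqrt(alpha * omegas) * np.cos(omegas * n * delta + k * C)
    for pv, val in zip(SETTING_PVS, p):
        epics.caput(pv, val)
```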

Figure 9.15 shows the cost, a measure of the betatron oscillation amplitude, versus the magnet setting \(K_3(t)\), with and without ES feedback. For large magnetic field deviations, the improvement is roughly a factor of 2.5.

9.3.5 ES for Electron Bunch Longitudinal Phase Space Prediction

The Facility for Advanced Accelerator Experimental Tests (FACET) at SLAC National Accelerator Laboratory produces high energy electron beams for plasma wakefield acceleration [48]. For these experiments, precise control of the longitudinal beam profile is very important. FACET uses an X-band transverse deflecting cavity (TCAV) to streak the beam and measure the bunch profile (Fig. 9.16a). Although the TCAV provides an accurate measure of the bunch profile, it is a destructive measurement; the beam cannot be used for plasma wakefield acceleration (PWFA) once it has been streaked. In addition, using the TCAV to measure the bunch profile requires adjusting the optics of the final focus system to optimize the resolution and accuracy of the measurement. This makes it a time-consuming process and prevents on-the-fly measurements of the bunch profile during plasma experiments.

Two diagnostics are used as non-destructive alternatives to the TCAV, providing information about the longitudinal phase space. The first is a pyrometer that captures optical diffraction radiation (ODR) produced by the electron beam as it passes through a hole in a metal foil. The spectral content of the ODR changes with bunch length; the pyrometer is sensitive to this spectral content, and the signal it collects is proportional to \(1/\sigma _z\), where \(\sigma _z\) is the bunch length. The pyrometer is an excellent device for measuring shot-to-shot variation in the bunch profile, but it provides no information about the shape of the profile or how that shape changes. The second device is a non-destructive energy spectrometer consisting of a half-period vertical wiggler located in a region of large horizontal dispersion. The wiggler produces a streak of X-rays with an intensity profile that is correlated with the dispersed beam profile. These X-rays are intercepted by a scintillating YAG crystal and imaged by a CCD camera (Fig. 9.16b). The horizontal profile of the X-ray streak is interpreted as the energy spectrum of the beam [49].

Fig. 9.16

The energy spectrum is recorded as the electron bunch passes through a series of magnets and radiates x-rays. The intensity distribution of the X-rays is correlated to the energy spectrum of the electron beam (a). This non-destructive measurement is available at all times, and used as the input to the ES scheme, which is then matched by adaptively tuning machine parameters in the simulation. For the TCAV measurement, the electron bunch is passed through a high frequency (11.4 GHz) RF cavity with a transverse mode, in which it is streaked and passes through a metallic foil (b). The intensity of the optical transition radiation (OTR) is proportional to the longitudinal charge density distribution. This high accuracy longitudinal bunch profile measurement is a destructive technique

The measured energy spectrum is observed to correlate with the longitudinal bunch profile in a one-to-one manner if certain machine parameters, such as the chicane optics, are fixed. To calculate the beam properties based on an energy spectrum measurement, the detected spectrum is compared to a simulated spectrum created with the 2D longitudinal particle tracking code LiTrack [50]. The energy spread of the short electron bunches desirable for plasma wakefield acceleration can be uniquely correlated to the beam profile only if all of the accelerator parameters which influence the bunch profile and energy spread are accounted for accurately. Unfortunately, throughout the 2 km facility there exist systematic phase drifts of various high-frequency devices, mis-calibrations, and time-varying uncertainties due to thermal drifts. Therefore, in order to accurately relate an energy spectrum to a bunch profile, a very large parameter space must be searched and fit by LiTrack, which effectively prevents the use of the energy spectrum as a real-time measurement of the bunch profile.

Fig. 9.17

ES scheme at FACET

Figures 9.16 and 9.17 show the overall setup of the tuning procedure at FACET. A simulation of the accelerator, LiTrack, is run in parallel with the machine's operation. The simulation was initialized with guesses and any available measurements of the actual machine settings, \(\mathbf {p}=\left( p_1, \ldots , p_n \right) \). We emphasize that these are only guesses because even measured values are noisy and contain arbitrary phase-shift errors. The electron beam in the actual machine was accelerated and then passed through a series of deflecting magnets, as shown in Figs. 9.16b and 9.17, which created X-rays whose intensity distribution can be correlated to the electron bunch density via LiTrack. This non-destructive measurement is available at all times and is used as the input to the ES scheme, which matches it by adaptively tuning machine parameters in the simulation. Once the simulated and actual spectra were matched, certain beam properties could be predicted by the simulation.

Each parameter setting has its own influence on the electron beam dynamics, which in turn influences the separation, charge, length, etc., of the leading and trailing electron bunches.

The cost that our adaptive scheme was attempting to minimize was then the difference between the actual, detected spectrum and that predicted by LiTrack:

$$\begin{aligned} C(\mathbf {x},{\hat{\mathbf {x}}},\mathbf {p},{\hat{\mathbf {p}}},t) = \int \left| \tilde{\psi }(\mathbf {x},\mathbf {p},t,\nu ) - \hat{\psi }({\hat{\mathbf {x}}},{\hat{\mathbf {p}}},t,\nu ) \right| ^2 d\nu , \end{aligned}$$
(9.68)

in which \(\tilde{\psi }(\mathbf {x},\mathbf {p},t,\nu )\) was a noisy measurement of the actual, time-varying (due to phase drift, thermal cycling, etc.) energy spectrum and \(\hat{\psi }({\hat{\mathbf {x}}},{\hat{\mathbf {p}}},t,\nu )\) was the simulated LiTrack spectrum. Here \(\mathbf {x}(t)=\left( x_1(t) , \dots , x_n(t) \right) \) represents various properties of the beam, such as bunch length, beam energy, and bunch charge, at certain locations throughout the accelerator, while \(\mathbf {p}(t)=\left( p_1(t) , \dots , p_n(t) \right) \) represents various time-varying, uncertain parameters of the accelerator itself, such as RF system phase drifts and RF field amplitudes throughout the machine. The beam properties \(\mathbf {x}(t)\) are approximated by their simulated estimates \(\hat{\mathbf {x}}(t)=\left( \hat{x}_1(t), \ldots , \hat{x}_n(t) \right) \), and the actual system parameters \(\mathbf {p}(t)\) are approximated by the virtual parameters \(\hat{\mathbf {p}}(t)=\left( \hat{p}_1(t), \ldots , \hat{p}_n(t) \right) \).

The problem was then to minimize the measurable, but analytically unknown, function C by adaptively tuning the simulation parameters \({\hat{\mathbf {p}}}\). The hope was that, by finding simulation machine settings which resulted in matched spectra, we would also match other properties of the real and simulated beams, something that could not be achieved simply by setting the simulation parameters to the reported machine settings, because of unknowns such as time-varying, arbitrary phase shifts.
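In discretized form, and assuming the measured and simulated spectra have been interpolated onto a common grid, the cost (9.68) reduces to a simple sum. The sketch below makes that assumption explicit and is not the production LiTrackES code.

```python
import numpy as np

def spectrum_cost(psi_measured, psi_simulated, d_nu):
    """Cost (9.68): integrated squared difference between the measured and
    LiTrack-simulated energy spectra, sampled on a common grid of bin width d_nu."""
    # In practice both spectra would typically be normalized (e.g. to unit area)
    # before comparison; that choice is an assumption, not stated in the text.
    return np.sum((psi_measured - psi_simulated) ** 2) * d_nu
```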

LiTrackES simulates large components of FACET as single elements. The critical elements of the simulation are the North Damping Ring (NDR), which sets the initial bunch parameters including the bunch length and energy spread; the North Ring to Linac (NRTL), which is the first of three bunch compressors; Linac Sectors 2–10, where the beam is accelerated and chirped; the second bunch compressor in Sector 10 (LBCC); Linac Sectors 11–19, where the beam is again accelerated and chirped; and finally the FACET W-chicane, which is the third and final bunch compressor.

We calibrated the LiTrackES algorithm using simultaneous measurements of the energy spectrum and bunch profile while allowing a set of unknown parameters to converge. After convergence we left a subset of these calibrated parameters fixed, as they are known to vary slowly or not at all, and performed our tuning on a much smaller subset of parameters:

  • \(p_1\): NDR bunch length

  • \(p_2\): NRTL energy offset

  • \(p_3\): NRTL compressor amplitude

  • \(p_4\): NRTL chicane \(T_{566}\)

  • \(p_5\): Phase Ramp

“Phase ramp” refers to a net phase of the NDR and NRTL RF systems with respect to the main linac RF. Changing the phase ramp parameter results in a phase-setpoint offset in the linac relative to the desired phase.

As demonstrated, LiTrackES, the combination of ES and LiTrack, is able to provide a quasi-real-time estimate of many machine and electron beam properties which are either inaccessible or require destructive measurements. We plan to improve the convergence rate of LiTrackES by fine-tuning the adaptive scheme's parameters, such as the gains \(k_i\), perturbing amplitudes \(\alpha _i\), and dithering frequencies \(\omega _i\). Furthermore, we plan to take advantage of several simultaneously running LiTrackES schemes, each with slightly different adaptive parameters and initial parameter guesses, which can communicate with each other in an intelligent way; we believe this can greatly increase both the rate and accuracy of convergence. Another major goal is the extension of this algorithm from monitoring to tuning: we hope to one day utilize LiTrackES as an actual feedback on the machine settings in order to tune for desired electron beam properties.

9.3.6 ES for Phase Space Tuning

For the work described here, a measured XTCAV image was compared to the energy and position spread of an electron bunch at the end of the LCLS as simulated by LiTrack. The electron bunch distribution is given by a function \(\rho (\varDelta E, \varDelta z)\), where \(\varDelta E = E - E_0\) is the energy offset from the mean or design energy of the bunch and \(\varDelta z = z-z_0\) is the position offset from the center of the bunch. We worked with two distributions:

$$\begin{aligned}&\mathrm {XTCAV \ measured:} \ \rho _{\mathrm {TCAV}}(\varDelta E, \varDelta z), \\&\mathrm {LiTrack \ simulated:} \ \rho _{\mathrm {LiTrack}}(\varDelta E, \varDelta z). \end{aligned}$$

These distributions were then projected onto the \(E\) and \(z\) axes in order to calculate 1D energy and charge distributions:

$$\begin{aligned}&\rho _{E,\mathrm {TCAV}}(\varDelta E), \quad \rho _{z,\mathrm {TCAV}}(\varDelta z), \\&\rho _{E,\mathrm {LiTrack}}(\varDelta E), \quad \rho _{z,\mathrm {LiTrack}}(\varDelta z). \end{aligned}$$

Finally, the energy and charge spread distributions were compared to create cost values:

$$\begin{aligned} C_E= & {} \int \left[ \rho _{E,\mathrm {TCAV}}(\varDelta E) - \rho _{E,\mathrm {LiTrack}}(\varDelta E) \right] ^2 d\varDelta E, \end{aligned}$$
(9.69)
$$\begin{aligned} C_z= & {} \int \left[ \rho _{z,\mathrm {TCAV}}(\varDelta z) - \rho _{z,\mathrm {LiTrack}}(\varDelta z) \right] ^2 d\varDelta z, \end{aligned}$$
(9.70)

whose weighted sum was combined into a single final cost:

$$\begin{aligned} C = w_E C_E + w_z C_z. \end{aligned}$$
(9.71)
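Assuming the XTCAV image and the LiTrack output have been binned onto a common \((\varDelta E, \varDelta z)\) grid with the energy axis along the rows, the projections and weighted cost (9.69)–(9.71) might be computed as in the following sketch; the grid spacing, axis ordering, and weights are illustrative assumptions.

```python
import numpy as np

def phase_space_cost(rho_tcav, rho_litrack, dE, dz, w_E=1.0, w_z=1.0):
    """Weighted cost (9.71) built from the 1D projections (9.69)-(9.70).
    Both inputs are 2D densities rho[i_E, i_z] on a common (dE, dz) grid."""
    # Project onto the energy axis (integrate over z) and onto the z axis (integrate over E).
    rhoE_t, rhoz_t = rho_tcav.sum(axis=1) * dz, rho_tcav.sum(axis=0) * dE
    rhoE_l, rhoz_l = rho_litrack.sum(axis=1) * dz, rho_litrack.sum(axis=0) * dE
    C_E = np.sum((rhoE_t - rhoE_l) ** 2) * dE      # (9.69)
    C_z = np.sum((rhoz_t - rhoz_l) ** 2) * dz      # (9.70)
    return w_E * C_E + w_z * C_z                   # (9.71)
```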

Iterative extremum seeking was then performed via finite difference approximation of the ES dynamics (Fig. 9.18):

$$\begin{aligned} \frac{\mathbf {p}(t+dt)-\mathbf {p}(t)}{dt} \approx \frac{d\mathbf {p}}{dt} = \sqrt{\alpha \omega }\cos (\omega t + k C(\mathbf {p},t)), \end{aligned}$$
(9.72)

by updating LiTrack model parameters, \(\mathbf {p}= (p_1, \ldots , p_m)\), according to

$$\begin{aligned} p_j(n+1) = p_j(n) + \varDelta \sqrt{\alpha \omega _j}\cos \left( \omega _j n \varDelta + k C(n) \right) , \end{aligned}$$
(9.73)

where the previous step’s cost is based on the previous simulation’s parameter settings,

$$\begin{aligned} C(n) = C(\mathbf {p}(n)). \end{aligned}$$
(9.74)
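A minimal sketch of the update (9.73) is given below, with one distinct dithering frequency per parameter so that the perturbations remain approximately orthogonal. The frequencies, gains, parameter normalization, and the stand-in cost function are assumptions; the actual LiTrack call and the six tuned parameters (listed below Fig. 9.18) are not reproduced here.

```python
import numpy as np

def run_litrack_and_cost(p):
    """Placeholder for one LiTrack run followed by the cost evaluation (9.71);
    a simple quadratic stands in so that the sketch is self-contained."""
    return float(np.sum((p - 0.3) ** 2))

def es_step(p, C_prev, n, omegas, k=1.0, alpha=1e-3, delta=1.0):
    """One iteration of (9.73): each parameter p_j is dithered at its own
    frequency omega_j, driven by the cost of the previous simulation run."""
    return p + delta * np.sqrt(alpha * omegas) * np.cos(omegas * n * delta + k * C_prev)

# Distinct dithering frequencies, one per tuned parameter (illustrative values).
omegas = 2 * np.pi * np.array([1.00, 1.13, 1.31, 1.47, 1.62, 1.78])

p = np.zeros(6)                      # normalized stand-ins for the tuned parameters
C_prev = run_litrack_and_cost(p)
for n in range(1, 2000):
    p = es_step(p, C_prev, n, omegas)
    C_prev = run_litrack_and_cost(p)
```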
Fig. 9.18

Components of the LCLS beamline

The parameters being tuned were:

  1. L1S phase: typically drifts continuously and is repeatedly corrected via an invasive phase scan. Within some limited range the correct bunch length can be maintained by the existing feedback system. This parameter is used for optimizing machine settings and FEL pulse intensity. When the charge off the cathode is changed, the L1S phase must be adjusted manually.

  2. L1X phase: must be changed if the L1S phase is changed significantly; it linearizes the curvature of the beam's longitudinal phase space.

  3. BC1 energy: controls the bunch length and provides feedback to the L1S amplitude.

  4. L2 phase: drifts continuously with temperature and is set by multiple klystrons, all of which cycle in amplitude and phase. Feedback is required to introduce the correct energy chirp for the BC2 peak current/bunch length set point. Tuned to maximize FEL intensity and minimize jitter.

  5. BC2 energy: drifts due to klystron fluctuations and must be changed to optimize FEL pulse intensity for exotic setups.

  6. L3 phase: drifts continuously with temperature, based on a coupled system of many klystrons.

Machine tuning work has begun with general analytic studies as well as simulation-based algorithm development focused on the LCLS beam line, using SLAC's LiTrack software, a code which captures most aspects of the electron beam's phase space evolution and incorporates noise representative of operating conditions. The initial effort focused on developing ES-based auto-tuning of the electron beam's bunch length and energy spread: LiTrack parameters, namely the bunch compressor energies and RF phases, were varied in order to match LiTrack's output to an actual TCAV measurement taken from the accelerator. The results are shown in Figs. 9.19 and 9.20. Running at a repetition rate of 120 Hz, the simulated feedback would have converged within 2 s on the actual LCLS machine.

Preliminary results have demonstrated that ES is a powerful tool with the potential to automatically tune an FEL between various bunch property requirements, such as energy spread and bunch length, by simultaneously tuning multiple coupled parameters, based only on a TCAV measurement at the end of the machine. Although the simulation results are promising, it remains to be seen what the limitations of the method are in the actual machine in terms of getting stuck in local minima and time of convergence. We plan on exploring the extent of the parameter and phase space through which we can automatically move.

Fig. 9.19

Parameter convergence and cost minimization for matching desired bunch length and energy spread profiles

Fig. 9.20

Measured XTCAV, original LiTrack, and final converged LiTrack energy versus position phase space of the electron bunch

9.4 Conclusions

The intense bunch charges, extremely short bunch lengths, and extremely high energies of next-generation FEL beams result in complex collective effects which couple the transverse and longitudinal dynamics, and therefore couple all of the RF and magnet systems and their influence on the quality of the light being produced. These future light sources, especially 4th generation FELs, face major challenges both in achieving extremely tight constraints on beam quality and in quickly tuning between various exotic experimental setups. We have presented a very brief and simple introduction to some of the beam dynamics important to accelerators and have introduced some methods for achieving better beam quality and faster tuning. Based on preliminary results, it is our belief that a combination of machine learning and advanced feedback methods such as ES has great potential for meeting the requirements of future light sources. Such a combination of ES and machine learning has recently been demonstrated in a proof-of-principle experiment at the Linac Coherent Light Source FEL [51]. During this experiment we quickly trained a simple neural network to obtain an estimate of a complex and time-varying parameter space, mapping the longitudinal electron beam phase space (energy vs. time) to machine parameter settings. For a target longitudinal phase space, the neural network provided an initial guess of the required parameter settings which brought us to within a neighborhood of the correct settings, but did not give a perfect match. We then used ES-based feedback to zoom in on and track the actual, optimal, time-varying parameter settings.