1 Introduction

The theory of diffusive shock acceleration (DSA) originates from the idea, posed by Fermi (1949), that cosmic rays are scattered by waves so that they become isotropic in their local frame. Scattering was proposed to happen in clouds where a turbulent magnetic field would isotropise the cosmic rays. Since on average head-on collisions are more frequent, a net acceleration occurs that naturally creates a powerlaw. Diffusive shock acceleration, proposed by various authors in the late seventies (Axford et al. 1977; Krymskii 1977; Bell 1978a, 1978b; Blandford and Ostriker 1978), is the discovery that this acceleration proceeds much faster in the vicinity of a shock. When crossing the shock, the first collision is always head-on, thus allowing the acceleration to proceed at a significantly faster rate, making it an attractive mechanism to accelerate cosmic rays to high energies. The faster the shock velocity, the larger the energy gain upon crossing the shock. The theory predicts a powerlaw with a spectral index that matches quite closely the observed powerlaw of cosmic rays arriving at Earth.

The first argument that pointed to supernova remnants (SNRs) as the prime sources of Galactic cosmic rays was based on the energy budget (Baade and Zwicky 1934). The energy required to replenish the cosmic rays against their losses from the Galaxy amounts to about 10 % of that available in SNRs. However, it was not until the late seventies that a theory on how to transfer kinetic energy efficiently into the acceleration of cosmic rays had been developed. As shown in this review, even though the basic test particle theory is simple, the intrinsic nonlinearity of this process makes it a difficult one to grasp in full and more work needs to be done to understand the process from start to finish.

Strong support that DSA can work as the major mechanism to accelerate cosmic rays came from observations of the thin X-ray rims at the blast waves of SNRs. The magnetic field required to confine the cosmic rays in the vicinity of the shock is much higher than the mean interstellar magnetic field (Achterberg et al. 1994; Vink and Laming 2003; Völk et al. 2005). Originally it was proposed that the magnetic field could be amplified resonantly (Lerche 1967; Kulsrud and Pearce 1969; Wentzel 1974; Skilling 1975b), but it is not clear that the magnetic field will be able to grow beyond \(\delta B/B_{0}\approx1\), when the resonance condition is lost, as would be required in order to explain the observations. In the past decade, a number of theories have arisen that could potentially explain amplification of the magnetic field to values corresponding to those observed in supernova remnants.

However, there is no reason to believe that only SNR shocks accelerate particles. Shocks are abundant in the universe on all scales, and how and when they become efficient accelerators is an active area of current research. Locally, cosmic rays are accelerated in heliospheric plasmas. On larger scales, evidence exists for accelerated electrons in the lobes of radio galaxies, as suggested early on by Blandford and Rees (1974) and backed up by more detailed observations and modelling (e.g. Carilli et al. 1991; Croston et al. 2009; Blundell and Fabian 2011). Furthermore clusters of galaxies have been observed to contain nonthermal radio emission suggesting active particle acceleration (Bagchi et al. 2006; Ferrari et al. 2008).

Support for DSA on much larger scales has only recently been discovered with detailed spectral observations at radio wavelengths of Megaparsec scale shocks (van Weeren et al. 2010). The spectral hardening downstream from the supposed shock front in these systems indicates that active acceleration proceeds at the shock front itself. The shocks are believed to be the result of mergers of clusters of galaxies. The radio observations indicate a magnetic field strength of the order of μG, which is much higher than would normally be expected in the intra-cluster medium (Brüggen et al. 2011). This is an additional indication that the processes of DSA and magnetic field amplification are intrinsically linked and occur at shocks on all scales. Because of the scales of these shocks, protons can potentially be accelerated to energies of \(10^{19}\) eV, although no direct evidence has been found yet. Magnetic field amplification through streaming cosmic rays has also been suggested as a source for primordial magnetic fields (Miniati and Bell 2011).

On the Galactic scale, in addition to SNRs, shocks around superbubbles have been discussed as accelerators (e.g. Bykov and Fleishman 1992; Bykov and Toptygin 2001; Parizot et al. 2004; Binns et al. 2008; Butt 2009; Ferrand and Marcowith 2010). Superbubbles are formed as a result of cumulative outflows from an assembly of massive stars, possibly enforced by the supernova explosions themselves. The outer region thus formed is a shock of large proportions that could potentially be a cosmic ray accelerator, or various multiple shocks can consecutively act to accelerate particles, which may modify the resulting spectral index.

Although in reality many deviations arise due to e.g. nonlinear modification of the shock structure, magnetic field obliquity, geometric effects, time-dependence, and magnetic field amplification, the basic theory still holds. However, in order to be able to compare the theory with observations, all of the complicating factors need to be taken into account. In this review, we focus on the advances in theory that have been made over the past decade. More specifically, we will discuss how the amplified magnetic field required for efficient acceleration is a direct result of DSA and the presence of cosmic rays. Details of the other processes can be found in earlier review articles (Drury 1983; Blandford and Eichler 1987; Malkov and Drury 2001; Hillas 2005).

In Sect. 2 we will briefly review the original theory of diffusive shock acceleration and demonstrate how the process naturally results in a power law cosmic ray spectrum. In Sects. 3–5 we will discuss a number of theories that couple diffusive shock acceleration to magnetic field amplification. We will mainly focus on the most recent theories. In Sect. 6 we will discuss possible deviations from the source spectrum as a result of shock obliquity, nonlinearity, and time-dependence. We refer to other chapters in this book for a treatment of DSA in relativistic shocks (Spitkovsky 2012, this issue), and for observational insights and developments (Helder et al. 2012, this issue).

2 Diffusive Shock Acceleration

The powerlaw in energy that results from diffusive shock acceleration can be understood in different ways, highlighted by different authors in its period of discovery. A crucial point is that the cosmic rays isotropise on either side of the shock due to small-angle scattering off magnetic field fluctuations. The faster the isotropisation, the faster the particle can recross the shock. Every time the shock is crossed, a net energy gain is received by the particle crossing the shock. Although the acceleration efficiency depends on the effective scattering efficiency, the resulting spectrum is independent of the diffusion coefficient. In many cases the most efficient scattering rate of Bohm diffusion, where the mean free path is of the order of the gyroradius, is used in order to generate cosmic rays with the high energies that are observed.

An intuitive way of approaching the problem is by comparing the number of particles that are located at the shock with the number of particles that escape downstream. Only particles that do not escape qualify for the next round of acceleration. This is the approach originally described by Bell (1978a) and described as the microscopic approach in the review by Drury (1983). The macroscopic approach, originating from Krymskii (1977), Axford et al. (1977), and Blandford and Ostriker (1978), derives the acceleration and resulting powerlaw from the distribution function, requiring continuity at the shock. Below we will briefly summarise both methods and we refer to the original papers or Drury (1983) for a more extensive treatment.

The distribution of relativistic particles can be described by the Vlasov-Fokker-Planck equation:

(1)

The distribution can be separated into an isotropic part (f 0), and anisotropic parts to arbitrarily high order:

(2)

Mostly, the diffusion approximation is used, in which the first order anisotropy (f 1=f i v i /v) is used and eliminated, to arrive at a distribution that depends on the isotropic cosmic ray density (f 0) alone.

(3)
(4)
(5)

where we used \(\kappa=c^{2}/(3\nu)\) for the diffusion coefficient.

From Eq. (5), far upstream the steady state solution (\(\partial f_{0}/\partial t=0\) and \(\partial u/\partial z=0\)) implies that \(f_{0}\) should have the form

(6)

to have a bound solution, where we use that in the shock frame u=−u 1. Or alternatively,

(7)

Downstream the steady state solution gives

(8)

At the shock the solutions have to connect, giving the boundary condition:

(9)

Using the boundary condition for the far upstream we can replace cf 1/3 with u 1 f 0, giving:

(10)

which results in requiring that the cosmic ray density follows a powerlaw distribution:

(11)

with

(12)

where r=u 1/u 2 represents the compression ratio at the shock. This powerlaw is valid in the test particle approach, for a planar shock, where the magnetic field is parallel to the shock normal. More details can be found in the original papers (Krymskii 1977; Axford et al. 1977; Blandford and Ostriker 1978).
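As a rough numerical illustration of how the test-particle index depends on the compression ratio, the sketch below assumes the standard form \(f_{0}\propto p^{-s}\) with \(s=3r/(r-1)\). The symbol \(s\), the function names, and the use of the hydrodynamic Rankine-Hugoniot jump condition for \(r\) are assumptions introduced here for illustration; they are not taken from the text.

```python
def compression_ratio(mach, gamma=5.0 / 3.0):
    """Rankine-Hugoniot compression ratio r = u1/u2 of a hydrodynamic
    shock with sonic Mach number `mach` (assumption, not from the text)."""
    m2 = mach * mach
    return (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)


def momentum_index(r):
    """Test-particle index s of f0 ~ p^(-s), assuming s = 3r/(r-1)."""
    if r <= 1.0:
        raise ValueError("a shock requires r > 1")
    return 3.0 * r / (r - 1.0)


# Weak, moderate and strong shocks: s approaches 4 as r approaches 4.
for mach in (2.0, 5.0, 100.0):
    r = compression_ratio(mach)
    print(f"M = {mach:6.1f}  r = {r:.3f}  s = {momentum_index(r):.3f}")
```

For a strong shock in a monatomic gas the compression ratio tends to 4, so the index tends to 4 and weaker shocks give steeper spectra.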

An alternative approach was used by Bell (1978a) to derive the powerlaw distribution of shock accelerated particles. It is based on the microscopic physics and is helpful to get insight in how the powerlaw may change depending on escape probability and probability of crossing the shock, which will be useful in understanding the physics of the later sections.

The flux of particles downstream is just the number density of particles \(n\) advected with the downstream flow velocity \(u_{2}\): \(nu_{2}\). For an isotropic distribution, half of the particles move towards the shock and their velocity component along the shock normal, averaged over angle, is half of the particle velocity \(c\), so the flux of particles crossing the shock front from upstream to downstream is \(nc/4\). The fraction of particles not returning to the shock is therefore \(nu_{2}/(nc/4)=4u_{2}/c\). For a non-relativistic shock the probability of recrossing the shock is thus high: \(P_{\mathrm{ret}}=1-4u_{2}/c\).

The energy gain of a particle crossing the shock from upstream to downstream can be calculated by transforming the momentum of the particle from the upstream to the downstream frame: \(p'=p\,(1+(u_{1}-u_{2})\cos\theta/c)\), such that the average change in momentum is \(2p(u_{1}-u_{2})/(3c)\). The energy gain from downstream to upstream is exactly the same, as \(u_{1}\) and \(u_{2}\) are interchanged and the angle of integration runs to the opposite side, yielding an extra factor of \(-1\). Thus the gain of momentum after a complete cycle is \(\Delta p=4p(u_{1}-u_{2})/(3c)\). After \(k\) cycles, the number of particles has decreased as \(n=n_{0}(1-4u_{2}/c)^{k}\) and the momentum has increased as \(p=p_{0}(1+4(u_{1}-u_{2})/(3c))^{k}\). The number of particles as a function of momentum can be found to be

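\[\frac{\ln(n/n_{0})}{\ln(p/p_{0})}=\frac{\ln(1-4u_{2}/c)}{\ln\bigl(1+4(u_{1}-u_{2})/(3c)\bigr)}\approx-\frac{3u_{2}}{u_{1}-u_{2}}=-\frac{3}{r-1}\tag{13}\]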

where r again is the compression ratio r=u 1/u 2, such that:

\[\frac{n}{n_{0}}=\left(\frac{p}{p_{0}}\right)^{-3/(r-1)}\tag{14}\]

and the differential energy spectrum is:

\[n(E)\,dE\propto E^{-(r+2)/(r-1)}\,dE\tag{15}\]

In terms of the distribution function we arrive at the same answer as from the macroscopic approach, since:

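\[n(p)\,dp=4\pi p^{2}f(p)\,dp\quad\Rightarrow\quad f(p)\propto p^{-(r+2)/(r-1)-2}=p^{-3r/(r-1)}\tag{16}\]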

For a more detailed treatment we refer to the original papers and earlier reviews (Bell 1978a, 1978b; Drury 1983).
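The microscopic argument can be checked numerically. The sketch below uses only the return probability \(1-4u_{2}/c\) and the momentum gain \(1+4(u_{1}-u_{2})/(3c)\) per cycle given above, and compares the resulting slope of the integral spectrum with the limiting value \(-3/(r-1)\). The shock speed of 5000 km s\(^{-1}\) and the function names are hypothetical choices for illustration.

```python
import math

C = 2.998e10  # speed of light [cm/s]


def cycle_parameters(u1, r):
    """Return (P_ret, gain) for one upstream-downstream-upstream cycle.

    P_ret = 1 - 4 u2 / c            probability of returning to the shock
    gain  = 1 + 4 (u1 - u2) / (3c)  momentum multiplication per cycle
    """
    u2 = u1 / r
    return 1.0 - 4.0 * u2 / C, 1.0 + 4.0 * (u1 - u2) / (3.0 * C)


def integral_slope(u1, r):
    """Slope d ln n / d ln p implied by n = n0 P_ret^k and p = p0 gain^k."""
    p_ret, gain = cycle_parameters(u1, r)
    return math.log(p_ret) / math.log(gain)


u1 = 5.0e8  # hypothetical shock speed, 5000 km/s
r = 4.0     # strong-shock compression ratio
p_ret, gain = cycle_parameters(u1, r)
slope = integral_slope(u1, r)
print(f"P_ret = {p_ret:.4f}, gain per cycle = {gain:.4f}")
print(f"integral slope = {slope:.4f}, limit -3/(r-1) = {-3.0 / (r - 1.0):.4f}")
# n(>p) ~ p^slope implies f(p) ~ p^(slope - 3)
print(f"index of f(p) = {3.0 - slope:.4f}")
```

The small difference between the computed slope and \(-3/(r-1)\) comes from the first-order expansion of the logarithms and shrinks as \(u_{1}/c\) decreases.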

3 Magnetic Field Amplification: Resonance Regime

In the theory of diffusive shock acceleration cosmic rays are accelerated by crossing and recrossing a shock, as shown in the previous section. On each cycle of crossing and recrossing between upstream and downstream the cosmic ray energy increases by a small fraction \(\sim u_{s}/c\), where \(u_{s}\) is the shock velocity. For acceleration to PeV energies a cosmic ray (CR) has to cross the shock \(\sim10c/u_{s}\) times. A shock propagating into a purely uniform magnetic field cannot accelerate CR to PeV energies because charged particles pass easily through the shock and escape upstream or downstream, making only one pass through the shock and gaining little energy. Arguably, it was the realisation that charged particles are not free to escape the shock environment that provoked the development of the theory of shock acceleration in the late 1970s.
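The estimate of \(\sim10c/u_{s}\) cycles can be illustrated with a short calculation. The sketch assumes that each cycle multiplies the energy by exactly \(1+u_{s}/c\) and that acceleration starts at 1 GeV; both the starting energy and the shock speed of 5000 km s\(^{-1}\) are hypothetical values chosen here, not taken from the text.

```python
import math

C_KMS = 2.998e5  # speed of light [km/s]


def cycles_needed(e_start_ev, e_end_ev, u_s_kms):
    """Number of cycles if every cycle multiplies the energy by (1 + u_s/c)."""
    gain = 1.0 + u_s_kms / C_KMS
    return math.log(e_end_ev / e_start_ev) / math.log(gain)


u_s = 5000.0                             # hypothetical shock speed [km/s]
n = cycles_needed(1.0e9, 1.0e15, u_s)    # from GeV to PeV (assumed range)
print(f"cycles: {n:.0f}, in units of c/u_s: {n / (C_KMS / u_s):.1f}")
```

The factor in front of \(c/u_{s}\) is essentially the logarithm of the energy ratio, which is why it is of order 10 for six decades in energy.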

Fast and efficient CR acceleration by the Fermi mechanism requires that particles are multiply scattered by magnetic fluctuations in the acceleration source (e.g. a shock). Magnetic field amplification due to the resonant cosmic-ray streaming instability has been studied in the context of Galactic cosmic-ray origin and propagation since the 1960s (see e.g. Kulsrud and Cesarsky 1971; Wentzel 1974; Achterberg 1981; Berezinskii et al. 1990; Zweibel 2003). It was proposed by Bell (1978a) as a source of magnetic turbulence in the test particle DSA scenario.

CR streaming along magnetic field lines excite unstable growth of Alfvén waves with wavelengths comparable with the CR Larmor radius (Kulsrud and Pearce 1969; Wentzel 1974; Skilling 1975a, 1975b, 1975c). The Alfvén waves consist of circularly polarised distortions to the magnetic field lines. CR gyrating along the field lines in spatial resonance with the fluctuations are strongly scattered and consequently execute a random walk along a field line. The appropriate model for CR transport in the shock environment is therefore diffusion rather than free propagation. CR cross a shock many times with a statistical probability that naturally results in an \(E^{-2}\) energy spectrum for cosmic rays (Krymskii 1977; Axford et al. 1977; Bell 1978a; Blandford and Ostriker 1978), as shown in Sect. 2.

The theory of wave excitation and CR scattering had previously been applied to CR propagation through the Galaxy. Skilling set out the coupled equations for the wave energy density I and the CR distribution function f. When applied to a steady state wave and CR precursor ahead of a shock the equations take the form:

(17)

where \(v_{A}\) is the Alfvén speed, \(p\) is the CR momentum, and \(r_{g}\) is the CR Larmor radius. In back-of-the-envelope terms, \(4\pi cp^{4}f/3\) is the CR pressure. \(I\) is the ratio of the energy density in the Alfvén waves (\(\delta B^{2}/4\pi\)) to the energy density of the unperturbed field \(B_{0}\), \(I =2 \delta B^{2}/B_{0}^{2}\). Further details of the equations can be found in Skilling (1975a, 1975b, 1975c), and a solution of the precursor equations can be found in Bell (1978a). The dominant physics of the interaction between CR and the Alfvén waves is on the one hand that wave growth is driven by the CR pressure gradient and on the other hand that the CR diffusion coefficient is inversely proportional to the wave energy density, with mean free path \(\Lambda\) given by \(\Lambda=4r_{g}/(3I)\). The equation for wave evolution can be integrated to give

(18)

For a characteristic interstellar magnetic field (\(B\sim3\) μG) the Alfvén speed is around \(v_{A}\sim10\) km s\(^{-1}\), so \(u_{s}/v_{A}\) is of the order of \(10^{3}\) for the outer shock of a young supernova remnant (SNR). To account for the energy density of Galactic CR, acceleration by SNR must be efficient and the energy transfer into CR at a SNR shock has to be between \(0.1\rho u_{s}^{2}\) and \(0.5\rho u_{s}^{2}\) (Baade and Zwicky 1934; Longair 2010). In terms of Eq. (18) this gives \(I\gg1\), which would correspond to a perturbed field \(\delta B\) greatly exceeding the zeroth order field \(B_{0}\). Naively this implies a diffusion coefficient much less than \(r_{g}c\) and a CR scattering mean free path much less than the Larmor radius (\(\Lambda\ll r_{g}\)). The linear instability depends upon a resonance between the CR Larmor radius and the Alfvén wavelength. This resonance is destroyed when \(\delta B\) approaches \(B_{0}\), in which case the instability cannot be resonantly driven and is expected to saturate at about \(\delta B\sim B_{0}\). The linear equations lose validity when \(I\sim1\), but Eq. (18) suggests that instabilities driven by CR streaming may be able to amplify the magnetic field far beyond its initial ambient value. In the next sections we will turn to instabilities that do not rely on the resonance condition and may continue well beyond \(\delta B/B_{0}\sim1\).
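The quoted orders of magnitude can be reproduced with a short calculation in Gaussian units. The sketch assumes a pure hydrogen plasma with a density of 1 cm\(^{-3}\) and a shock speed of 5000 km s\(^{-1}\); these two values and the function name are illustrative assumptions, not values given in the text.

```python
import math

M_P = 1.6726e-24  # proton mass [g]


def alfven_speed_kms(b_gauss, n_cm3):
    """Alfven speed v_A = B / sqrt(4 pi rho) for pure hydrogen, in km/s."""
    rho = n_cm3 * M_P
    return b_gauss / math.sqrt(4.0 * math.pi * rho) / 1.0e5


v_a = alfven_speed_kms(3.0e-6, 1.0)  # B = 3 microgauss, n = 1 cm^-3 (assumed)
u_s = 5000.0                         # hypothetical young-SNR shock speed [km/s]
print(f"v_A = {v_a:.1f} km/s, u_s / v_A = {u_s / v_a:.0f}")
```

A lower ambient density raises \(v_{A}\) as \(\rho^{-1/2}\), so the ratio \(u_{s}/v_{A}\) stays within a factor of a few of \(10^{3}\) for typical interstellar conditions.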

4 Magnetic Field Amplification: Short Wavelength Regime

4.1 A Non-linear Estimate of the Amplified Magnetic Field

A more basic understanding of the opportunity for magnetic field amplification can be derived as follows. According to Eq. (17) the rate at which the wave energy density I grows is ≈v A ∂P cr /∂z where P cr is the CR pressure. This can be interpreted as a force ∂P cr /∂z pushing against magnetic fluctuations at the Alfvén velocity v A . Provided the CRs are coupled to the magnetic field fluctuations and the fluctuations propagate at about the Alfvén speed this equation is approximately valid even when the spatial resonance between the wavelength and the Larmor radius breaks down.

In the linear regime it makes sense to think of the magnetic fluctuations moving at the Alfvén velocity. In the non-linear regime, the fluctuations no longer take the form of linear waves, but they can still be estimated to move at a velocity \(v_{f}\sim v_{A}\sim(\text{magnetic pressure}/\text{density})^{1/2}\). Energy transfer to the fluctuations occurs at a rate \(v_{f}\,\partial P_{cr}/\partial z\), and the equation for growth of the turbulent energy density \(U_{f}\) associated with the fluctuations is

(19)

where \(P_{s}\) is the CR pressure at the shock. Assuming that \(v_{f}=(U_{f}/\rho)^{1/2}\) and \(U_{f}=B^{2}/4\pi\) as for Alfvén waves, the turbulent energy density at the shock is

(20)

If the CR acceleration efficiency is \(P_{s}\approx0.1 \rho u_{s}^{2}\) this leads to an estimated magnetic field of the order of 100 μG in young SNR, in good agreement with observations (Vink and Laming 2003; Berezhko et al. 2003; Völk et al. 2005). For a fixed acceleration efficiency \(P_{cr}\propto\rho u_{s}^{2}\), the amplified magnetic field is proportional to \(\rho^{1/2}u_{s}\); see Bell and Lucek (2001) for a more detailed model. A later improved analysis of field amplification suggests a stronger dependence on shock velocity, \(B \propto\rho^{1/2} u_{s}^{3/2}\), as discussed in Sect. 4.5 (Bell 2004, 2009).
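The order of magnitude of this estimate can be checked numerically. The sketch below is a reconstruction under stated assumptions, not the text's own formula: it assumes a steady-state balance \(u_{s}\,dU_{f}/dz=v_{f}\,dP_{cr}/dz\) with \(v_{f}=(U_{f}/\rho)^{1/2}\), which integrates from far upstream to the shock to \(U_{f}=P_{s}^{2}/(4\rho u_{s}^{2})\). The density of 1 cm\(^{-3}\), the shock speed of 5000 km s\(^{-1}\), and the function name are likewise hypothetical.

```python
import math

M_P = 1.6726e-24  # proton mass [g]


def amplified_field_gauss(n_cm3, u_s_cms, eta):
    """Field estimate from U_f = P_s^2 / (4 rho u_s^2) (assumed form),
    with P_s = eta rho u_s^2 and U_f = B^2 / (4 pi)."""
    rho = n_cm3 * M_P
    p_s = eta * rho * u_s_cms ** 2
    u_f = p_s ** 2 / (4.0 * rho * u_s_cms ** 2)
    return math.sqrt(4.0 * math.pi * u_f)


b = amplified_field_gauss(n_cm3=1.0, u_s_cms=5.0e8, eta=0.1)
print(f"B ~ {b * 1e6:.0f} microgauss")
```

Under these assumptions the field scales as \(\rho^{1/2}u_{s}\) at fixed efficiency, so doubling the shock speed doubles the estimate.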

4.2 A Non-resonant Instability

Equation (20) sidesteps the requirement for a resonance between the Larmor radius and the instability wavelength and subsequent theoretical developments have shown that the resonance is not essential for the amplification of magnetic field by CR streaming. The same linearisation of the Vlasov equation that gives rise to the resonant instability also identifies a non-resonant instability as first demonstrated by Bell (2004). Bell (2004) simplified the problem by using the MHD equations to treat the thermal plasma as a magnetised fluid. A kinetic Vlasov treatment is retained for the CR. The CR exert a force on the MHD plasma through the reaction on the j cr ×B force, which is equal and opposite in most of the unstable regime, the details of which will be discussed in Sect. 4.3. Here j cr is the CR electric current density, and the CR trajectories are calculated in the magnetic and electric fields calculated by the MHD model.

The coupled Vlasov-MHD equations give a linear dispersion relation containing two different instabilities for the two different circular polarisations and wavenumbers k parallel to the zeroth order magnetic field. The resonant instability occurs when the circular polarisation is such that CR moving in the same direction as j cr gyrate resonantly around the zeroth order field in the same sense as the helical perturbations in the magnetic field, which we will refer to as the left-hand polarisation. A stronger non-resonant instability is found in the opposite (right-hand) circular polarisation.

The dispersion relation for the non-resonant instability is plotted in Fig. 1 for a power-law CR distribution \(f(p)\propto p^{-4}\) that is non-zero between momenta \(p_{1}\) and \(p_{2}\). The resonant and non-resonant instabilities have similar growth rates when \(k \approx r_{g}^{-1}\), that is when the inverse wavenumber \(k^{-1}\) is approximately equal to the Larmor radius of CR with the lowest momentum \(p_{1}\). The lowest momentum CR are the most important because they carry most of the electric current. At wavelengths less than \(r_{g}\) there are few CR in spatial resonance with the helices in the magnetic field and the growth rate of the resonant instability decreases as \(k\) increases. In contrast, the non-resonant instability does not depend on resonance and its growth rate instead increases with increasing \(k\). The non-resonant growth rate increases with \(k\) until tension in the magnetic field overpowers the driving force, that is when \(\mathbf{j}_{cr}\times\mathbf{B}=\mathbf{B}\times(\mathbf{k}\times\mathbf{B})c/4\pi\) and \(kB\sim4\pi j/c\).

Fig. 1 Dispersion relation for the non-resonant instability with \(v_{A}=6.6\times10^{3}\) m s\(^{-1}\) and \(u_{s}/c=1/30\). \(\omega\) is in units of \(u_{s}^{2}/cr_{g}\) and \(k\) in units of \(r_{g}^{-1}\), where \(r_{g}\) is the Larmor radius of the lowest energy CR. Reproduced from Bell (2004)

The essential difference between the resonant and non-resonant linear instabilities can be explained as follows. In both cases the background MHD fluid motions are driven by the j cr ×B force. j cr and B have unperturbed zeroth order components, j cr0 and B 0, and first order perturbed components j cr1 and B 1 respectively. The j cr ×B driving force has two first order components, j cr0×B 1 and j cr1×B 0. The resonant instability is driven by j cr1×B 0 where j cr1 is the perturbed CR current which is especially strong when the CR trajectories respond resonantly to a helical magnetic field with a wavelength equal to the CR Larmor radius. On the other hand, the non-resonant instability is driven by the other first order force j cr0×B 1. In the non-resonant instability only the uniform zeroth order current j cr0 matters and there is no requirement for a resonance with the CR Larmor radius.

Because the non-resonant instability is driven by the zeroth order CR current, we can derive a growth rate while omitting the first order current from the analysis. The response of the CR trajectories to the perturbed fields can be ignored, and hence there is no need to solve the Vlasov equation for CR. It is sufficient to solve the MHD equations for the thermal plasma with sole addition of the j cr0×B 1 force in the MHD momentum equation. The MHD mass conservation equation can also be omitted since the instability is transverse and the density is unperturbed, ρ 1=0. Similarly the thermal plasma pressure is unperturbed. The first order equations are then

(21)

u 1 can be eliminated between the equations to give

(22)

Harmonic solutions \(\mathbf{B}_{1}\propto\exp[i(kz-\omega t)]\), with \(\mathbf{k}\) parallel to \(\mathbf{B}_{0}\) in the \(z\) direction, give a dispersion relation

\[\omega^{2}-k^{2}v_{A}^{2}=\pm\frac{kB_{0}j_{cr0}}{\rho c}\tag{23}\]

In the absence of a CR driving current, \(j_{cr0}=0\), this is the dispersion relation for an Alfvén wave. The right hand side of the dispersion relation creates an instability provided \({kB_{0} j_{cr0}}/{\rho c}>k^{2} v_{A}^{2}\) and the negative sign is chosen for ±, corresponding to the non-resonant polarisation. If \(k^{2} v_{A}^{2} > | {kB_{0} j_{cr0}}/{\rho c}|\), the tension in the field lines overpowers the driving term and the dispersion relation is that for non-growing Alfvén waves with a modified phase velocity. Omitting the \(\mathbf{j}_{cr1}\times\mathbf{B}_{0}\) force from the analysis ignores the tendency of CR to follow the field lines on wavelengths greater than the CR Larmor radius and therefore misses the reduced instability for \(kr_{g}<1\) that is seen in Fig. 1. For wavenumbers \(k\) in the unstable range \(r_{g}^{-1}<k<4 \pi j_{cr0}/B_{0} c\), equivalent to \(r_{g}^{-1}<k<B_{0}j_{cr0}/(\rho v_{A}^{2} c)\), the instability is purely growing with a growth rate

(24)

and a maximum growth rate, determined by the tension in the magnetic field,

(25)
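The competition between the driving term and magnetic tension can be explored numerically. The sketch below assumes that the unstable branch obeys \(\gamma^{2}=kB_{0}j_{cr0}/(\rho c)-k^{2}v_{A}^{2}\), as implied by the instability condition quoted above, and scans \(k\) up to the tension cut-off \(4\pi j_{cr0}/(B_{0}c)\). Under that assumption the maximum lies at half the cut-off, with \(\gamma_{\max}=(j_{cr0}/c)(\pi/\rho)^{1/2}\). The CR number density used to set the current is purely illustrative and is not a value from the text.

```python
import math

C = 2.998e10        # speed of light [cm/s]
E_ESU = 4.803e-10   # elementary charge [statC]
M_P = 1.6726e-24    # proton mass [g]

B0 = 3.0e-6                      # ambient field [G]
RHO = 1.0 * M_P                  # mass density for n = 1 cm^-3 [g/cm^3]
J_CR = 1.0e-9 * E_ESU * 5.0e8    # j = n_cr e u_s with an illustrative n_cr

V_A2 = B0 ** 2 / (4.0 * math.pi * RHO)  # Alfven speed squared
DRIVE = B0 * J_CR / (RHO * C)           # coefficient of the driving term


def growth_rate(k):
    """gamma(k) from gamma^2 = k B0 j/(rho c) - k^2 v_A^2; zero if stable."""
    g2 = k * DRIVE - k * k * V_A2
    return math.sqrt(g2) if g2 > 0.0 else 0.0


k_cut = 4.0 * math.pi * J_CR / (B0 * C)        # tension cut-off wavenumber
ks = [k_cut * i / 2000 for i in range(2001)]   # uniform scan of 0..k_cut
k_best = max(ks, key=growth_rate)
gamma_max_analytic = (J_CR / C) * math.sqrt(math.pi / RHO)
print(f"k_best / k_cut = {k_best / k_cut:.3f}")
print(f"gamma_max (scan)     = {growth_rate(k_best):.3e} 1/s")
print(f"gamma_max (analytic) = {gamma_max_analytic:.3e} 1/s")
```

The scan neglects the lower bound \(k>r_{g}^{-1}\) of the unstable range, which is a reasonable simplification only when the cut-off wavenumber is much larger than \(r_{g}^{-1}\).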

The CR current j cr0 in the precursor is related to the CR pressure by

If the CR acceleration efficiency is defined by \(P_{cr}=\eta\rho u_{s}^{2}\), the number of e-foldings for the fastest growing mode is

(26)

where \(\omega_{g}\) is the CR Larmor frequency, \(M_{A}\) is the Alfvén Mach number, \(M_{A}^{2}=4\pi\rho u_{s}^{2}/B^{2}\), \(\kappa/\kappa_{B}\) is the ratio of the CR diffusion coefficient to the Bohm coefficient, and \(\tau_{\mathrm{pc}}=({c^{2}}/{u_{s}^{2}}) \omega_{g}^{-1}\) is the time taken for the shock to propagate a distance equal to the scaleheight of the CR precursor if Bohm diffusion applies. Since \(M_{A}\sim10^{3}\) for a young SNR propagating into an interstellar magnetic field, and overall CR efficiencies must be 10–50 %, the number of instability e-folding times is much greater than one. The term \(\kappa/\kappa_{B}\) introduces an element of self-regulation since the number of e-foldings is greater if the magnetic fluctuations are small and diffusion is greater than Bohm. This may be important at the foot of the precursor where the instability has not had the opportunity to grow. If scattering is weak, CR escape a relatively large distance upstream, initiate unstable growth far ahead of the shock, and thereby remedy the lack of a perturbed magnetic field able to scatter the CR.

4.3 Return Currents and Energy Transfer

In the above analysis the force on the background thermal plasma was included as \(-\mathbf{j}_{cr}\times\mathbf{B}\). To conserve momentum the force on the background plasma is equal and opposite to the force on the CR. Another way of looking at this is that the force on the background plasma is exerted through the return current \(\mathbf{j}_{t}\) that the background thermal plasma must carry to neutralise the current carried by the CR, \(\mathbf{j}_{t}\approx-\mathbf{j}_{cr}\). The difference between \(\mathbf{j}_{t}\) and \(-\mathbf{j}_{cr}\) is given by the Maxwell equation when the displacement current is neglected: \(\mathbf{j}_{t}=-\mathbf{j}_{cr}+c\nabla\times\mathbf{B}/4\pi\), in which case \(\mathbf{j}_{t}\times\mathbf{B}=-\mathbf{j}_{cr}\times\mathbf{B}-\mathbf{B}\times(\nabla\times\mathbf{B})c/4\pi\). \(\mathbf{j}_{t}\) and \(\mathbf{j}_{cr}\) very nearly cancel out where the instability is strong. However, their non-cancellation at small wavelengths gives rise to the magnetic tension that limits the instability to the range \(k<B_{0}j_{cr0}/\rho v_{A}^{2} c\). At longer wavelengths the thermal return current follows a path very close to that of the CR current.

A necessary condition for CR-driven instability is that the thermal particles are strongly magnetised with Larmor radii much smaller than the wavelength and that the CR should have a Larmor radius comparable with or larger than the wavelength so they are not tied to magnetic field lines. If the thermal particles are frozen-in to the magnetic field it is not immediately obvious how one can have a j t ×B force with a current j t unaligned with the magnetic field. The difficulty is resolved through the theory of cross-field drifts. The j t ×B force imparts an acceleration to the plasma. This acceleration can be viewed as being equivalent to a gravitational force. A charged particle in a gravitational field executes a cross field drift, and similarly in this case a cross field drift arises through the Lorentz force, which produces the current density j t .

Another conundrum is how energy is extracted from the CR current to drive the turbulence and amplify the magnetic field. The \(\mathbf{j}_{cr}\times\mathbf{B}\) force cannot extract energy from the CR because a magnetic field only deflects particles and cannot change their energy. The solution here comes from the second order electric field \(c\mathbf{E}_{2}=-\mathbf{u}_{1}\times\mathbf{B}_{1}\). \(\mathbf{E}_{2}\) is anti-parallel to \(\mathbf{j}_{cr0}\) and reduces the CR energy, which is transferred to the second order magnetic energy density \(\mathbf{B}_{1}^{2} /8 \pi\) and kinetic energy density \(\rho\mathbf{u}_{1}^{2} /2\).

4.4 The Non-resonant Instability in 3 Dimensions

In the case of a parallel shock, the zeroth order CR current is parallel to the zeroth order magnetic field and the above theory for k aligned with B 0 can be applied directly. A more general theory is needed for an oblique shock where k and B 0 are not parallel. The dispersion relation for general orientations of k, j and B 0 was derived by Bell (2005) for wavelengths shorter than the CR Larmor radius, i.e. kr g >1:

(27)

where \(v_{A}=B_{0}/(4\pi\rho_{0})^{1/2}\) is the Alfvén speed, \(c_{s}=(\partial P/\partial\rho)^{1/2}\) is the sound speed, \(\gamma_{0}^{4}=( \mathbf{k}\cdot\mathbf{B}_{0})^{2} \mathbf{j}^{2}/\rho_{0}^{2} c^{2}\), and a hat denotes a unit vector: \(\hat{\mathbf{k}}=\mathbf{k}/|\mathbf{k}|\), \(\hat{\mathbf{b}}=\mathbf{B}_{0}/|\mathbf{B}_{0}|\), \(\hat{\mathbf{j}}=\mathbf{j}_{cr}/|\mathbf{j}_{cr}|\). The terms involving \(kv_{A}\) are important in the short wavelength limit when magnetic field tension is important, as in the case of \(\mathbf{k}\) aligned with \(\mathbf{B}_{0}\). The terms involving \(kc_{s}\) represent additional short wavelength compressibility effects that are not present when \(\mathbf{k}\), \(\mathbf{B}_{0}\) and \(\mathbf{j}_{cr}\) are all parallel. At longer wavelengths both sets of terms can be neglected and the growth rate simplifies to

(28)

The instability grows most rapidly for wavenumbers parallel to the magnetic field but the growth rate is independent of the mutual orientation of the magnetic field B 0 and the CR current j cr .

The insensitivity of the growth rate to the angle between magnetic field and CR current implies that the instability is present for perpendicular as well as parallel shocks. In fact, the growth rate is faster for perpendicular shocks because the CR current is larger than that upstream of parallel shocks. In the precursor of a parallel shock the CR drift at the shock velocity relative to the thermal plasma to give a current density parallel to the shock normal of \(j_{cr}=n_{cr}eu_{s}\), where \(n_{cr}\) is the CR number density. In contrast, for a perpendicular shock the CR current density is larger in the direction perpendicular to the shock normal. The CR current density at a perpendicular shock can be calculated from the first moment of the Vlasov-Fokker-Planck (VFP) equation in the diffusive limit in which the one-dimensional CR distribution function takes the form \(f(\mathbf{p},z,t)=f_{0}(|\mathbf{p}|,z,t)+\mathbf{f}_{1}(|\mathbf{p}|,z,t)\cdot(\mathbf{p}/|\mathbf{p}|)\).

(29)

where \(\nu\) represents angular scattering by small scale fluctuations. For a mono-energetic CR distribution at a perpendicular shock, the CR current density in the precursor can be separated into a component \(j_{||}\) normal to the shock and a component \(j_{\perp}\) that is perpendicular to the magnetic field and the shock normal:

(30)

where n cr is the CR number density, L is the scalelength of the precursor, Λ=c/ν is the CR mean free path, and r g =c/ω g is the CR Larmor radius.

The components \(j_{||}\) and \(j_{\perp}\) of the CR current density correspond to the CR drift velocities in the local upstream fluid rest frame. In steady state, \(j_{||}=n_{cr}eu_{s}\) and \(j_{\perp}=(\Lambda/r_{g})n_{cr}eu_{s}\). The large scale field is relatively unimportant when the mean free path is comparable with the Larmor radius, \(\Lambda\sim r_{g}\) (Bohm diffusion), since the distinction between parallel and perpendicular shocks is then relatively minor, and the scaleheight in both cases is \(L\sim(c/u_{s})r_{g}\). Bohm diffusion corresponds to the smallest possible mean free path. More usually, the mean free path can be expected to be greater than the Larmor radius and the cases of parallel and perpendicular shocks become quite different. From the above equations, the precursor scaleheight ahead of a perpendicular shock is reduced to \(L\sim(r_{g}/\Lambda)(c/u_{s})r_{g}\), and \(j_{\perp}\) exceeds \(j_{||}\). \(j_{\perp}\) results from the non-cancellation of gyratory currents in the CR density gradient in the precursor. Since \(j_{\perp}\) is greater than \(j_{||}\), the instability is driven more rapidly at a perpendicular shock than at a parallel shock. However, the time during which the instability can grow is reduced because the scaleheight \(L\) is smaller at a perpendicular shock. The increased growth rate and the reduced time for growth cancel out and the number of linear e-foldings is the same for both parallel and perpendicular shocks. Hence the non-resonant instability is equally effective for all shocks whether they are perpendicular, parallel or oblique. From Eq. (28), in any of these cases fastest linear growth occurs for wavenumbers parallel to the large scale magnetic field, independent of its orientation to the shock normal. Unstable growth and magnetic field amplification at perpendicular shocks has been demonstrated numerically in particle-in-cell simulations by Riquelme and Spitkovsky (2010).
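The cancellation between growth rate and growth time can be made explicit with the scalings quoted in this section. As a rough sketch, assume that the maximum growth rate scales linearly with the driving current and that the time available for growth is the precursor crossing time \(L/u_{s}\), so that the number of e-foldings \(N\) (a label introduced here) scales as \(jL/u_{s}\). The parallel case is evaluated with the Bohm scaleheight \(L\sim(c/u_{s})r_{g}\) quoted above, and the perpendicular current is written with a \(\perp\) subscript.

```latex
\begin{align*}
N_{\parallel} &\propto j_{\parallel}\,\frac{L_{\parallel}}{u_{s}}
  = n_{cr}eu_{s}\,\frac{(c/u_{s})\,r_{g}}{u_{s}}
  = \frac{n_{cr}\,e\,c\,r_{g}}{u_{s}}, \\
N_{\perp} &\propto j_{\perp}\,\frac{L_{\perp}}{u_{s}}
  = \frac{\Lambda}{r_{g}}\,n_{cr}eu_{s}\,
    \frac{(r_{g}/\Lambda)(c/u_{s})\,r_{g}}{u_{s}}
  = \frac{n_{cr}\,e\,c\,r_{g}}{u_{s}}.
\end{align*}
```

The factor \(\Lambda/r_{g}\) that enhances the perpendicular current is exactly offset by the factor \(r_{g}/\Lambda\) that shortens the precursor, so under these assumptions the product is independent of \(\Lambda\).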

4.5 Non-linear Magnetic Field Amplification

A linear instability takes a small perturbation δB on the magnetic field and amplifies it until it becomes comparable with the zeroth order field \(B_{0}\). When \(\delta B\sim B_{0}\), the linear assumption that second order terms in δB can be neglected becomes untenable. A crucial question is whether the non-linear terms cause the instability to saturate and stop growing, or whether the magnetic field grows further to a magnitude much greater than \(B_{0}\). In the context of CR acceleration, the field has to continue growing beyond \(\delta B/B_{0}\sim1\) if diffusive shock acceleration is to explain the presence of PeV CR in the Galaxy. Furthermore, saturation at \(\delta B/B_{0}\sim1\) is insufficient to explain the large magnetic fields inferred from X-ray observations of synchrotron emission at an SNR shock. Therefore it is crucial to ascertain not only that linear growth is sufficiently fast but also that the instability continues to grow non-linearly beyond \(\delta B/B_{0}\sim1\) to generate fields exceeding 100 μG at SNR shocks.

Fortunately the non-resonant instability has the unusual property of continuing rapid growth into the non-linear regime. Remarkably, in the restricted geometry of a monochromatic circularly polarised wave with wavenumber k, zeroth order uniform magnetic field B 0 and uniform CR current j cr all parallel, the linear equations remain valid into the non-linear regime and the instability continues to grow exponentially to arbitrary amplitude at the linear growth rate. In this special case, slabs of plasma with frozen in magnetic field continue to be accelerated in directions perpendicular to k, B 0 and j cr . In practice other modes with different k also grow and these interfere to slow the growth. For example, an exponentially expanding spiral field in one part of the plasma is likely to collide with an expanding spiral field seeded in a different part of the plasma. The spirals cannot in general pass through each other and their growth is limited. However, the presence of exponentially growing non-linear modes is a strong hint that growth to large amplitude is possible. Lucek and Bell (2000) showed numerically that the magnitude of the magnetic field can increase by at least an order of magnitude. 3D MHD simulations by Bell (2004) showed similar or larger growth before the calculation was terminated when magnetic structures expanded to the size of the periodic computational box. In Bell (2004) the instability grew exponentially at the expected rate until δB/B 0∼1, whereafter it continued to grow but more slowly.

The linear eigenmodes of the instability consist of spirals of magnetic field with a preferred helicity. The evolution of this basic configuration into the non-linear regime can be seen in Fig. 4 of Bell (2004). Initially small spirals and loops of magnetic field grow non-linearly in radius. Collisions between neighbouring spirals produce walls of strong magnetic field surrounding cavities of very weak magnetic field. The field is far from uniform and does not conform to conventional pictures of randomly phased Fourier modes in k-space. Because the magnetic field is frozen in to the background plasma, the density and magnetic field have closely correlated structures of walls and cavities as shown in Fig. 2, which reproduces two frames from Figs. 2 and 3 of Bell (2005). The same wall-cavity structure has been found by Reville et al. (2008) and Zirakashvili et al. (2008) in MHD simulations, and by Riquelme and Spitkovsky (2009) and Ohira et al. (2009) in particle-in-cell simulations.

Fig. 2

Comparison of structures of density and magnitude of magnetic field. 2D slice through a 3D simulation. Reproduced from Bell (2005)

The growth of large scale structures resulting from the expansion of the small scale structures provides a natural way of producing structures on the scale of a CR Larmor radius. These are especially important since CR are most effectively scattered by fields on this scale. Fields on smaller scales may explain the amplified fields observed at SNR shocks, but they cannot by themselves provide the strong CR scattering needed to accelerate CR to PeV energies. Reville et al. (2008) have modelled CR transport in non-linear CR-driven magnetic fields calculated with a 3D MHD code. They show that the amplified field does inhibit CR transport and reduces diffusion to less than Bohm diffusion in the initial magnetic field. The generation of magnetic structures on large scales is an active field of research (see Sect. 5), and other sources of turbulence contribute to the overall magnetic field structure in SNR (Giacalone and Jokipii 2007; Zirakashvili and Ptuskin 2008; Beresnyak et al. 2009; Inoue et al. 2009; Schure et al. 2009).
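To give a feeling for the scale on which structures are needed, the following sketch evaluates the Larmor radius \(r_{g}=pc/(eB)\) of a PeV proton in Gaussian units, using \(pc\approx E\) for an ultra-relativistic particle. The 100 μG value is the field strength quoted above for SNR shocks; the 3 μG value is an assumed, representative interstellar field and is not taken from the text. Constants are rounded.

```python
E_CHARGE_ESU = 4.803e-10   # elementary charge [statC]
ERG_PER_EV = 1.602e-12     # 1 eV in erg
CM_PER_PC = 3.086e18       # 1 parsec in cm


def larmor_radius_cm(energy_ev: float, b_gauss: float) -> float:
    """Larmor radius r_g = pc/(eB) of an ultra-relativistic proton (pc ~ E)."""
    if b_gauss <= 0.0:
        raise ValueError("magnetic field must be positive")
    return energy_ev * ERG_PER_EV / (E_CHARGE_ESU * b_gauss)


if __name__ == "__main__":
    for b_field in (3e-6, 100e-6):  # assumed ISM field, amplified SNR field [G]
        r_g = larmor_radius_cm(1e15, b_field)
        print(f"B = {b_field * 1e6:5.0f} microgauss: r_g = {r_g / CM_PER_PC:.3f} pc")
```

The amplified field shrinks the Larmor radius in proportion to 1/B, so magnetic structures only have to grow to a correspondingly smaller size to scatter the highest-energy CR.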

It is clear from analyses and simulations that magnetic field amplification continues far into the non-linear regime. In some circumstances amplification may be limited by the time for which the \(\mathbf{j}_{cr}\times\mathbf{B}\) driving force operates. In the case of a shock precursor this limiting time is the time \(L/u_{s}\) it takes for the precursor of scalelength L to be overtaken by the shock. However, two conditions have emerged as requirements for unstable growth. One is that the scale size \(k^{-1}\) of magnetic structures should not exceed the CR Larmor radius, otherwise the CR follow the field lines and \(\mathbf{j}_{cr}\times\mathbf{B}\) becomes small since \(\mathbf{j}_{cr}\) is parallel to \(\mathbf{B}\). The second is that the magnetic tension \(\mathbf{B}\times(\nabla\times\mathbf{B})c/4\pi\sim kB^{2}c/4\pi\) should not exceed the \(\mathbf{j}_{cr}\times\mathbf{B}\) driving force. These conditions reduce to \(k>eB/pc\) and \(k<4\pi j_{cr}/Bc\) respectively. For both conditions to be satisfied simultaneously we need \(B^{2}/4\pi<pj_{cr}/e\). Using the expression for \(j_{cr}\) derived in the discussion between Eq. (25) and (26) in Sect. 4.2, this reduces to an estimate for the saturated magnetic energy density producible in the non-linear phase of the non-resonant instability:

(31)

This estimate is a good match to observations of the magnetic field in SNR (Vink 2008; Bell 2009), but other explanations are possible for the same data (Malkov et al. 2011).
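As a rough numerical illustration of the saturation argument, the sketch below evaluates the bound \(B^{2}/4\pi<pj_{cr}/e\) under stated assumptions that are not taken from the equations of this review: the current is \(j_{cr}=n_{cr}eu_{s}\), the CR energy density of a relativistic population is \(U_{cr}\approx n_{cr}pc\), and a fraction η of \(\rho u_{s}^{2}\) goes into CR. Together these give \(B^{2}/4\pi\lesssim(u_{s}/c)\,\eta\rho u_{s}^{2}\). The function name and the example parameter values are hypothetical.

```python
import math

M_PROTON_G = 1.673e-24   # proton mass [g]
C_CM_S = 2.998e10        # speed of light [cm/s]


def saturated_field_gauss(n_cm3: float, u_s_cm_s: float, eta: float) -> float:
    """Upper estimate of the saturated field from B^2/(4 pi) ~ (u_s/c) U_cr.

    Assumptions (illustrative): pure hydrogen upstream gas of number density
    n_cm3, and a CR energy density U_cr = eta * rho * u_s^2.
    """
    rho = n_cm3 * M_PROTON_G
    u_cr = eta * rho * u_s_cm_s ** 2
    return math.sqrt(4.0 * math.pi * (u_s_cm_s / C_CM_S) * u_cr)


if __name__ == "__main__":
    # Hypothetical young-SNR parameters: n = 1 cm^-3, u_s = 5000 km/s, eta = 0.1
    b_sat = saturated_field_gauss(1.0, 5.0e8, 0.1)
    print(f"B_sat ~ {b_sat * 1e6:.0f} microgauss")
```

With these example values the estimate lands at roughly 100 μG, and it scales as \(u_{s}^{3/2}\) at fixed density and efficiency.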

5 Magnetic Field Amplification: Long Length Scales

As shown in the previous section, the non-resonant Bell instability can act efficiently to amplify magnetic fields on scales smaller than the gyroradius. However, in order to accelerate cosmic rays to higher energies, the magnetic field should also be amplified on scales beyond the cosmic ray gyroradius, which is the ‘long-wavelength regime’ discussed in this section. In 2011 various papers were published on possible instabilities that act to amplify the magnetic field on these scales (Bykov et al. 2011b; Schure and Bell 2011a; Reville and Bell 2012), and we focus on these in this section. We also briefly discuss other long-wavelength instabilities (firehose and acoustic), but for a more extensive review we refer to the original papers or reviews (see e.g. Blandford and Eichler 1987; Malkov and O’C Drury 2001; Bykov et al. 2011a).

5.1 Current-Driven Stress-Tensor Instability

The small-scale instability is essentially a fluid instability. The cosmic ray current is assumed to vary on such large scales that it is unperturbed by the non-resonant growth of small-scale magnetic fields. When looking at growth of magnetic field on scales larger than the gyroradius, this assumption no longer holds since the CR trajectories follow the field lines, and perturbations to the current in perpendicular directions should be taken into account.

One way to determine this effect on the stability of the plasma is by including higher order anisotropic terms in the kinetic equation. The feedback between the cosmic ray particles and the magnetic field in the plasma proceeds through forces acting on the momentum equation. The j×B forces arise when components of the cosmic ray current, or rather the induced return current in the plasma, are perpendicular to local components of the magnetic field. Gradients in the perpendicular current require taking into account higher-order components of the distribution function, at least the stress tensor, effectively representing gradients in the current.

The distribution function of relativistic particles can be written as:

(32)

where f is the particle distribution in phase space, v the particle velocity, and \(\mathbf{D}=\frac{v^{2}\nu}{2} (\mathbf{I} - \hat{\mathbf{n}}\hat{\mathbf{n}})\) the diffusion tensor with ν the collision frequency, I the identity matrix, and \(\hat{\mathbf{n}}\) the unit vector in the direction of the corresponding tensor component.

The collisions, represented by the parameter ν, are not actual collisions in such tenuous systems, but effectively act in the same way. By collisions we mean the cumulative effect of small-angle scattering as a result of the Lorentz force of the perturbed current and magnetic field. When these fluctuate on the same scale, the Lorentz force effectively deflects the path of the cosmic rays. Many of these small-angle scatterings result in isotropisation. The length and time scales on which this occurs are represented by the parameter ν, which is the scattering frequency, i.e. the particle velocity divided by the mean free path. In Bohm diffusion this is taken to be of the order of the gyrofrequency. The short-wavelength non-resonant instability acts to amplify the field on scales that can efficiently deflect the low-energy cosmic rays. In effect, the parameter ν thus describes the momentum exchange of the small-scale instability and can be used to determine its influence on the long range.

The distribution of cosmic rays is dominated by the isotropic component \(f_{0}\), but also contains an anisotropic part, which can be expanded to increasing order as \(f=f_{0}+\mathbf{f}_{1}\cdot\mathbf{v}/v+\mathbf{f}_{2}:\mathbf{v}\mathbf{v}/v^{2}+\cdots\). Here \(\mathbf{f}_{1}\) can be viewed as the directional component of the cosmic ray distribution, or as the gradient of \(f_{0}\), and acts like a current. \(\mathbf{f}_{2}\) is the pressure tensor, of which the isotropic part of the diagonal is normally included in the \(f_{0}\) term. Anisotropy in the diagonal can be responsible for the firehose instability. Off-diagonal terms embody the stress tensor and reflect gradients in the current. Each higher order is a factor of u/c smaller than the previous order, where u is the drift velocity. Evaluation of the transport equation to zeroth order, being the isotropic part, and first and second order anisotropies, gives the following system of equations, where we ignore any contribution of higher (3rd) order:

(33)
(34)
(35)

Here I 2 is the second order unity tensor, and […]2 indicates a summation of the permutations for ijk in two ways divided by 2, such that we get a symmetric tensor with components that satisfy f ij =f ji . In principle also higher order terms can be used in the evaluation, but it turns out these have no significant effect on the instability (Schure and Bell 2011a). This system of equations can be closed in combination with the MHD equations:

(36)
(37)

where \(\rho_{cr}\) is the mass density of the relativistic particles, and \(\mathbf{u}_{cr}-\mathbf{u}\) is the drift speed of the cosmic rays relative to the plasma, which to zeroth order is equal to \(u_{s}\). Since \(\mathbf{j}_{th}=-\mathbf{j}_{cr}+c/(4\pi)\nabla\times\mathbf{B}\), the momentum equation (Eq. (37)) has to satisfy:

(38)

In systems relevant for efficient diffusive shock acceleration, the upstream plasma can be considered cold and isotropic, such that ∇P=0 and ∇⋅Π=0. Additionally, the second term on the r.h.s. is much smaller than the other terms, and only contributes at very short wavelengths where the magnetic tension is sufficient to quench the instability. If we furthermore use \(j_{cr}=n_{cr}eu_{s}=ef_{1}c/3\), divide both sides by ρ, and write \(n=n_{cr}/n_{i}\) for the ratio of cosmic-ray to background nucleon number densities, the momentum equation can be expressed in terms of \(f_{1}\) as follows:

(39)

For a linear analysis it suffices to look at the first order perturbation, for which the above can be expressed, using Eq. (34), as:

(40)

where the subscripts between brackets indicate unperturbed (0) or perturbed (1) variables.

Both of the above equations express momentum conservation, viewed either through the \(\mathbf{j}\times\mathbf{B}\) force and the frictional force \(\rho\nu\mathbf{u}\), or through the pressure gradient and divergence. Feeding this into the induction equation, we can express the perturbed magnetic field in terms of \(f_{1}\) and \(f_{2}\):

(41)

where we assumed a homogeneous background magnetic field, incompressibility, and \(u_{(0)}=0\). For the rest of this section we consider a parallel shock in the z-direction, such that \(\mathbf {B}_{(0)}=B_{(0)} \hat{\mathbf{z}}\), and we consider modes parallel to the original field, such that \(\mathbf{k}\times\mathbf{B}_{(0)}=0\).

Equations (34)–(41) can then be combined (see Schure and Bell 2011a) to arrive at the dispersion relation:

(42)

where ω is the complex frequency, k the wavenumber, c the speed of light, ν the effective scattering frequency, \(\omega_{g}\) the gyrofrequency, and \(\varOmega=\sqrt{k j_{0} B_{(0)}/(\rho c)}\) contains the driving (return) current \(j_{0}\) and is the growth rate of the non-resonant Bell instability. The upper signs correspond to the left-hand polarisation (which is the polarisation of a gyrating proton), and the lower signs to the right-hand polarisation. This dispersion relation is valid in the linear regime as long as the Alfvénic stress due to \((\nabla\times\mathbf{B})\times\mathbf{B}\) is small (for the parameters used to plot Fig. 3 this ceases to hold beyond \(kr_{g}\approx1000\)). Also, since ω is always small compared to k 2 c 2/(5(3ν g )), it can in practice be ignored on the right hand side of Eq. (42).
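The quantity Ω defined above is straightforward to evaluate. The following is a minimal sketch in Gaussian units; the function name and argument names are hypothetical, and no particular parameter values are implied by the text.

```python
import math

C_CM_S = 2.998e10  # speed of light [cm/s]


def bell_growth_rate(k: float, j0: float, b0: float, rho: float,
                     c: float = C_CM_S) -> float:
    """Omega = sqrt(k * j0 * B0 / (rho * c)) in Gaussian units.

    k   : wavenumber [cm^-1]
    j0  : driving (return) current density [statA cm^-2]
    b0  : zeroth order magnetic field [G]
    rho : background mass density [g cm^-3]
    Returns a rate in s^-1.
    """
    if min(k, j0, b0, rho, c) <= 0.0:
        raise ValueError("all arguments must be positive")
    return math.sqrt(k * j0 * b0 / (rho * c))
```

Because Ω scales as \(\sqrt{k}\), quadrupling the wavenumber doubles the rate, for example `bell_growth_rate(4 * k, j0, b0, rho) / bell_growth_rate(k, j0, b0, rho)` evaluates to 2 for any positive inputs.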

Fig. 3

Growth rate of the tensor-mediated instability, from Schure and Bell (2011a). The solid (dashed) lines indicate the right- (left-) hand mode. The different colours give the growth rates for different values of the effective collisionality of the cosmic rays, which aids in coupling the cosmic ray momentum equation to the momentum equation of the plasma

The terms including factors of \(k^{2}\) result from the inclusion of the stress tensor. Effects from the small-scale non-resonant instability are included through ν, the effective scattering frequency. Scattering on small scales is expected to arise earlier than on long scales, since the growth rate increases rapidly for \(kr_{g}\gg1\). Bohm diffusion is the regime where the effective scattering frequency is of the same order as the gyrofrequency, i.e. \(\nu\sim\omega_{g}\). We plot the growth rate as a function of wavenumber for different values of \(\nu/\omega_{g}\) in Fig. 3, for both the left-hand (dashed) and the right-hand (solid) polarisation (for the full plots of the dispersion relation see the original paper: Schure and Bell 2011a).

This method recovers the Bell (2004) growth rate for the right-hand polarisation in the regime where \(kc\gg\omega_{g}\). The resonant instability is only approximately captured in the mono-energetic approach (as can be seen from the peak around \(kr_{g}\approx2\) in Fig. 3), which is why the familiar k-dependent growth rate around \(kr_{g}=1\) is not present in this representation. Care should be taken when including the resonant instability to do the momentum integration with an appropriate upper limit for p, so as to not overestimate the growth rates on the long-wavelength end. It can be seen that both modes are unstable. Which of the polarisations dominates depends on the ratio \(\nu/\omega_{g}\); for \(\nu/\omega_{g} < 1/\sqrt{3}\) the left-hand mode dominates, and vice versa for higher collisionality. When the collisionality is zero, ν=0, only the left-hand mode is unstable and purely growing. The current-driven long-wavelength instability depends on the mediation of the short-scale instability through the stress tensor and the ‘collisionality’.
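The statement about which polarisation dominates can be encoded as a simple rule of thumb. The sketch below only restates the threshold quoted above; it does not solve the dispersion relation, and the function name is hypothetical.

```python
import math

THRESHOLD = 1.0 / math.sqrt(3.0)  # critical value of nu / omega_g


def dominant_polarisation(nu_over_omega_g: float) -> str:
    """Return which long-wavelength mode grows fastest for a given nu/omega_g.

    Below 1/sqrt(3) the left-hand mode dominates; above it the right-hand
    mode does. Exactly at the threshold the choice made here ("right") is
    arbitrary, since the text does not specify that case.
    """
    if nu_over_omega_g < 0.0:
        raise ValueError("the scattering frequency cannot be negative")
    return "left" if nu_over_omega_g < THRESHOLD else "right"


if __name__ == "__main__":
    for ratio in (0.0, 0.1, 1.0):  # collisionless, weakly collisional, Bohm
        print(f"nu/omega_g = {ratio:3.1f}: {dominant_polarisation(ratio)}-hand mode dominates")
```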

Self-consistency between the equations for the cosmic rays and for the fluid is crucial in deriving the dispersion relation. In the following analysis we show that omission of the frictional force in the momentum equation, as can often be done in other circumstances, would result in a completely different dispersion relation that shows a much more rapid growth and only declines at long wavelengths as \(\sqrt{k}\). This is the result of an unbalanced frictional force: the \(\nu f_{1}\) term remains in the momentum equation (Eq. (40)) because there is no corresponding term in the momentum equation for the fluid. This results in the following erroneous dispersion relation:

(43)

The difference is only an additional ν in the numerator. For large k, the additional ν can be ignored such that on the short scales the result does not change. However, on the long-wavelength end the additional ν would change the result. To lowest order in k (thus ignoring the \(k^{2}\) contribution from \(f_{2}\)), and in the limit \(\omega\ll\omega_{g}\), the growth rate would change to:

(44)

In the limit where \(\nu\gg\omega_{g}\), both modes would have similar growth rates, which would read:

(45)

It should be stressed this is not a physical solution.

5.2 Ponderomotive Instability

Bykov et al. (2011b) presented a long-wavelength instability that results from averaging the kinetic equation for the relativistic particles, the equations of the bulk plasma motions and the induction equation over the ensemble of the short scale fluctuations produced by CR instabilities in the collisionless regime, e.g. by the fast Bell instability. To derive the growth rates of the modes in the long-wavelength regime \(k\Lambda<1\), with \(\Lambda=r_{g}/(\nu/\omega_{g})\) the mean free path, the dispersion relation as in Eq. (61) derived from the collisionless kinetic equation approach is not appropriate.

In the presence of the short-scale fluctuations, the momentum exchange between the CRs and the flow in the hydrodynamic regime results in a ponderomotive force that depends on the CR current in the mean-field momentum equation of the bulk plasma (Bykov et al. 2011b). As a result, there exist transverse growing modes with wavevectors along the initial magnetic field, with growth rates that are proportional to the turbulent coefficients determined by the short scale fluctuations:

(46)

where \(\sqrt{\langle\mathbf{b}^{2} \rangle}\) is the magnitude of the short-scale amplified magnetic field and which holds for both polarisations. The magnetic field amplification in that regime depends only weakly on the shock velocity (\(\gamma\tau\propto u_{s}^{-1/2}\) as follows from Eq. (48) in Bykov et al. (2011b), see also Schure and Bell (2011b)), which is important for the evolution of the maximal energy of CRs accelerated by DSA. In the intermediate regime, \(\nu/\omega_{g}<kr_{g}<1\), the growth rate can be approximated as:

(47)

where \(\sqrt{\langle\mathbf{v}^{2} \rangle}\) is the amplitude of the short-scale turbulent bulk velocity.

The ponderomotive instability is a multi-layered phenomenon and the underlying physics is not immediately clear. In the intermediate regime, the growth rate (Eq. (47)) is independent of the CR current, which suggests that it is not directly driven by CR streaming. Instead, the growth time is equal to the time taken to cross a distance 1/k at the characteristic turbulence velocity. This suggests that the magnetic field in this regime grows as a result of field-line stretching by already existing turbulent motions. The long wavelength regime (Eq. (46)) is complicated because it includes magnetic field on three different scales: (i) \(B_{0}\) on a scale comparable with or greater than the wavelength, (ii) the magnetic field \(\sqrt{\langle\mathbf{b}^{2} \rangle}\) associated with the turbulence, and (iii) the small scale magnetic field causing the scattering represented by the collision frequency ν. The challenge is to see why all three fields are important for the instability. It is also important to check that the \(\mathbf{j}\times\mathbf{B}\) momentum transfer between CR and the fluid is treated self-consistently in each case, since errors in total momentum conservation can lead to an incorrect growth rate as shown in Sect. 5.1. In the next section we review the filamentation instability, which is also due to the presence of turbulent magnetic field on scales smaller than \(kr_{g}\sim1\). A further challenge is to ascertain whether there is any overlap in underlying physics between the ponderomotive and filamentation instabilities.
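The regime boundaries used in this section, and the growth time quoted for the intermediate regime, can be summarised in a short sketch. It assumes the mean free path Λ=r g /(ν/ω g ) given above and simply restates the inequalities; the function names and regime labels are hypothetical.

```python
def wavelength_regime(k_rg: float, nu_over_omega_g: float) -> str:
    """Classify a mode by k*r_g for a given collisionality nu/omega_g.

    "short"        : k r_g >= 1
    "intermediate" : nu/omega_g <= k r_g < 1   (k Lambda >= 1)
    "long"         : k Lambda < 1, with Lambda = r_g / (nu/omega_g)
    """
    if k_rg <= 0.0 or nu_over_omega_g <= 0.0:
        raise ValueError("both arguments must be positive")
    if k_rg >= 1.0:
        return "short"
    k_lambda = k_rg / nu_over_omega_g
    return "long" if k_lambda < 1.0 else "intermediate"


def intermediate_growth_time(k: float, v_turb: float) -> float:
    """Time to cross a distance 1/k at the characteristic turbulent velocity."""
    if k <= 0.0 or v_turb <= 0.0:
        raise ValueError("both arguments must be positive")
    return 1.0 / (k * v_turb)
```

Note that for Bohm scattering (ν/ω g =1) the intermediate window closes: every mode with kr g <1 is then classified as long-wavelength.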

5.3 Filamentation Instability

The expansion of the loops generated by the non-resonant instability on small scales can give rise to a further filamentation instability. Because the cosmic rays are focussed into filaments, the cosmic ray current locally increases. As a result, magnetic fields around the loops grow further in strength, which in turn helps to focus the cosmic rays and increase the current. This was recently derived analytically and shown numerically by Reville and Bell (2012). The growth rate turns out to be independent of wavelength. Again, the Vlasov equation is used to determine the distribution of cosmic rays. The local electric field can be expressed in terms of the vector potential such that:

(48)

with A the magnitude of the vector potential parallel to the shock normal. Using this, the distribution function can be written as:

(49)

Since it can be safely assumed that ∂f/∂p<0, the cosmic-ray number density is locally larger when A is positive and can be written as a function of position:

(50)

where n 0=∫4πp 2 f 0dp and f 0 the isotropic part of the distribution function. Using further that j cr=n cr eu s and the MHD equations, the evolution of the filamentation can be expressed in terms of the cosmic ray current as follows (see Reville and Bell 2012):

(51)

The first term on the right-hand side is independent of the wavenumber and obviously dominates the other terms on long scales. Its growth rate is:

(52)

for a cosmic ray spectrum with a power law slope q=4 and given minimum (\(p_{1}\)) and maximum (\(p_{2}\)) momentum cut-offs. The requirement on the cosmic rays driving the instability is that they are not trapped within the cavities, such that \(p_{1}c\gg eAu_{s}/c\), with A the vector potential. This condition was also assumed when deriving the instability. The growth rate is linearly dependent on the value of the amplified small-scale magnetic field and decreases upstream of the shock when \(p_{1}\) increases and \(U_{cr}\) decreases.

When compared to the non-resonant growth rate on small scales, the growth rates are equal when

(53)

Non-linear simulations of the non-resonant instability indicate that the amplified field can reach values of \(\sim30B_{0}\) (Bell 2004; Riquelme and Spitkovsky 2009), such that the above condition can be satisfied for \(kr_{g}>1\). Filling in numbers comparable to those used in the previous sections, the instability operates provided that:

(54)

When compared to the growth rate of the long-wavelength instability, they become comparable when:

(55)

Thus, for the longest wavelengths, the filamentation instability may dominate if the small-scale field is sufficiently amplified and as long as the condition in Eq. (54) is satisfied.

5.4 Firehose Instability

The short-wavelength non-resonant instability in Sect. 4.2 was driven by the current, i.e. the first order anisotropy in the cosmic ray distribution. Asymmetry in the second order anisotropy, the pressure tensor, can drive the firehose or mirror instability. It is distinctly different from the long-wavelength instability that was discussed in Sect. 5.1: that one was driven by the current, and the stress-tensor (off-diagonal terms of the pressure tensor) and small-scale collisions mediate this driving source to cause instability on long length-scales and the opposite (left-hand) polarisation.

Again, we will have to consider the distribution function up to second order anisotropy, which we can write in the form

(56)

where θ is the particle pitch angle, μ=cosθ, and δ(p) is the magnitude of the second harmonic anisotropy, which is normally of the order of \(u_{s}^{2}/c^{2}\). It is this second harmonic anisotropy that drives the CR firehose instability and its contribution to magnetic field amplification. It is instructive to summarize the growth rates for magnetic instabilities that quasi-linear theory predicts for weakly anisotropic CR distributions of the above form. Following the standard linear analysis of the kinetic equation in the intermediate regime \(r_{g}/\Lambda<x_{0}<1\) (see e.g. Bykov et al. 2011b), with Λ again being the mean free path, one obtains the following dispersion relation if collisions can be neglected:

(57)
(58)
(59)
(60)

where \(k_{0}=\frac{4\pi}{c}\frac{en_{cr}u_{s}}{B_{0}}\), \(x=kr_{g}(p)\), \(x_{1}=kr_{g}(p_{1})\), \(x_{2}=kr_{g}(p_{2})\), and δ is the magnitude of the second order anisotropy. The signs ± correspond to the two opposite circularly polarized modes under investigation. The second term on the right hand side is the instability that is driven by the cosmic ray current, and has a non-resonant form (Bell 2004, see previous section) when \(kr_{g}\) is large, and a resonant form when \(kr_{g}\) is of order unity (e.g. Achterberg 1981; Zweibel 2003; Pelletier et al. 2006; Marcowith et al. 2006; Amato and Blasi 2009, and the references therein). The last term on the r.h.s. of Eq. (57) represents the CR firehose instability. In the long-wavelength regime \(x_{m}\ll1\) a simplified form of Eq. (57) can be derived:

(61)

The growth rate of the firehose instability due to the CR pressure anisotropy is found in the last term of Eq. (61). It requires that the parallel pressure exceeds the perpendicular pressure such that \(P_{\parallel}>P_{\perp}+B^{2}/(4\pi)\) (for more details see e.g. Bykov et al. 2011a).
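The firehose threshold quoted above is easy to test numerically. The sketch below, in Gaussian units, checks whether a given pressure anisotropy exceeds the magnetic term; the function name and the example pressures are hypothetical and chosen only for illustration.

```python
import math


def firehose_unstable(p_par: float, p_perp: float, b_gauss: float) -> bool:
    """True if P_par > P_perp + B^2/(4 pi), with pressures in erg cm^-3."""
    if p_par < 0.0 or p_perp < 0.0:
        raise ValueError("pressures cannot be negative")
    return p_par > p_perp + b_gauss ** 2 / (4.0 * math.pi)


if __name__ == "__main__":
    b_field = 3e-6  # assumed field strength [G]
    print(firehose_unstable(2e-10, 1e-10, b_field))  # strongly anisotropic case
    print(firehose_unstable(1e-10, 1e-10, b_field))  # isotropic case
```

An isotropic pressure can never satisfy the criterion, because the magnetic term on the right-hand side is positive for any non-zero field.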

5.5 Instabilities Driven by the Cosmic Ray Pressure Gradient in the Shock Precursor

On scales large compared to the gyro-radius (i.e. scales where the driving particles are strongly magnetised) the action of the cosmic rays can be simplified to a bulk force on the background plasma which is just given by the gradient of the cosmic ray pressure,

(62)

This follows naturally from considerations of momentum balance, the above pressure integral being just the flux of momentum associated with the cosmic rays.

The interesting thing about this bulk force is that it is not related in any simple way to the local density either of mass or of scattering centres (this is a consequence of the collective nature of the electro-magnetic interactions in the plasma). A gravitational field, by contrast, would produce a force that is always strictly proportional to the local mass density (this is just Einstein’s equivalence principle). Similarly a flux of particles interacting by two-body scatterings would produce a force that is proportional to the density of scattering centres, radiation pressure resulting from Thomson scattering on electrons being an example. No such simple relation holds for the cosmic ray scattering, which is a complicated function of the power-spectrum of structure in the magnetic field on scales comparable to the particle gyro-radius and, to lowest order, can be calculated using the methods of quasi-linear theory. For our purposes it is enough to just assume that there is some effective diffusion coefficient κ(p) and that on the scales of interest to us the cosmic ray transport can be represented as a simple diffusion process with a flux proportional to the local gradient,

(63)

where κ is the diffusion tensor (often approximated as a scalar diffusion coefficient).

A given element of background plasma, of local density ρ experiences an acceleration

(64)

and thus, even if the cosmic ray pressure is very uniform, local small-scale variations in density induce acceleration fluctuations which in turn lead to velocity fluctuations which can feed back into density fluctuations. The usual approach to studying instabilities, where one assumes a uniform steady background and then does a Fourier analysis of the modes, fails in this case because the non-stationary and non-uniform nature of the shock precursor region is the ultimate source of free energy driving the instability. Indeed the very question of how to define an instability in such a system is an interesting one with no obvious answer.

In Drury and Falle (1986) a solution to these problems was developed based on a two-scale expansion of the governing equations. In this approach one looks at small wave-length high-frequency modes propagating on a smoothly varying background. In the absence of the cosmic ray pressure the basic modes are then just sound waves and the wave amplitude satisfies a conservation equation for the wave action (this follows from Noether’s theorem because in the high-frequency limit the precise phase of the wave becomes unimportant, and there is thus an asymptotic symmetry related to the arbitrariness of the phase angle). In the case of one spatial dimension this is

(65)

where the wave action \(\mathcal{A}\) is the acoustic wave energy density divided by the local co-moving frequency, c s is the sound speed and the sign of ± corresponds to left- and right-travelling modes.

Including cosmic ray effects in the two-fluid approximation it is then possible to show (see Drury and Falle 1986) that the wave action equation acquires a non-zero right hand side,

(66)

where \(\gamma_{cr}\) is an effective adiabatic index for the cosmic rays and \(\partial\ln\kappa/\partial\ln\rho\) is the extent to which fluctuations in density induce fluctuations in the diffusion coefficient.

The first term is a linear damping term related to the cosmic ray diffusion, as derived earlier by Ptuskin (1981). More interesting for our purpose is the second term which is a potentially de-stabilising term related to the cosmic ray pressure gradient. By formulating the problem in this way it is possible to clearly separate out the conservative effects of the changing background, encapsulated in the conservation of wave action, from the non-conservative effects of the cosmic ray pressure. This gives a precise instability criterion,

(67)

A gradient in the cosmic ray pressure can arise when the cosmic rays make up a significant fraction of the total pressure, and is of interest in the shock precursor region of an efficiently accelerating shock wave.

If we introduce a length-scale for the cosmic ray pressure,

(68)

the condition for instability can be written as

(69)

Noting that in a shock precursor \(L\sim\kappa/u_{s}\), where \(u_{s}\) is the shock velocity, it is clear that the shock precursor region will be generically unstable at high shock Mach numbers unless \(\partial\ln\kappa/\partial\ln\rho\) is identically −1. This is confirmed by numerical simulations of modified shocks in the two-fluid approximation, which exhibit instabilities unless the product ρκ is artificially kept constant. Indeed it was this phenomenon in the early calculations of Dorfi which led to the discovery of the instability. It should be noted that the instability only operates as a fluid element is advected through the precursor, so the maximum growth is limited to an amount of order exp(M), which can however be very large for high Mach numbers.
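The origin of the exp(M) limit can be made plausible with an order-of-magnitude sketch. It rests on assumptions that are not taken from the equations above: the acoustic growth rate is taken to be of order the cosmic ray pressure gradient divided by \(\rho c_{s}\), a fluid element spends a time \(L/u_{s}\) in the precursor, and the cosmic ray pressure built up across the precursor is of order \(\rho u_{s}^{2}\). The number of e-foldings \(N_{e}\) (a symbol introduced here for illustration) is then

```latex
\[
N_{e} \sim \frac{1}{\rho c_{s}}\left|\frac{\partial P_{cr}}{\partial x}\right|\frac{L}{u_{s}}
      \sim \frac{P_{cr}}{\rho c_{s}u_{s}}
      \sim \frac{\rho u_{s}^{2}}{\rho c_{s}u_{s}}
      = \frac{u_{s}}{c_{s}} = M .
\]
```

Factors of order unity, including the dependence on the density scaling of κ, are dropped in this estimate.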

In addition to the acoustic modes there are also non-propagating entropy modes. The entropy of each fluid element is conserved and therefore the entropy modes can not grow, at least until secondary shocks form. The modes do couple to acoustic modes which can then be amplified. Such entropy modes are perhaps better thought of as dense clumps in pressure equilibrium with their surroundings and the differential acceleration forces resulting from the cosmic ray pressure will set these in motion relative to their less dense surroundings. This motion will then be transmitted to the surroundings in the form of acoustic waves which, when propagating parallel to the shock normal, will be amplified unless the diffusion coefficient scales inversely with the density according to the above analysis. If we now consider the very special case of a sinusoidal density perturbation with wave vector perpendicular to the shock normal, it is clear that it too will induce motions unless the diffusion is constant and independent of density. Thus in three dimensions it is impossible to fully stabilize a clumpy shock precursor. The condition for parallel stability implies transverse instability and vice-versa.

It is also worth noting, as pointed out by Giacalone and Jokipii (2007), that even with no precursor effects a clumpy upstream medium will induce strong post-shock vorticity and down-stream magnetic field amplification. The above analysis indicates that similar processes can work upstream if there is a strong cosmic-ray precursor and thus produce magnetic field amplification by what is in essence a bulk hydrodynamic effect. The scales on which this takes place can be large compared to the particle gyro-radii and are determined by the characteristic length-scales of the initial density fluctuations.

6 Deviations from the “-2” Power Law Index

6.1 The Spectral Index at Oblique Shocks

The relative motion of upstream and downstream scatterers imparts energy to CR as they bounce back and forth across a shock. By Lorentz transformation, the mean fractional energy gain is $u_s/c$ on each passage from upstream to downstream and back to upstream for a strong non-relativistic shock. As discussed in Sect. 2, the resulting CR spectrum is determined by the statistical distribution of the number of times a CR crosses the shock before escaping downstream. CR cross the shock at a rate $n_s c/4$, where $n_s$ is the CR number density at the shock. This is balanced by the rate $n_\infty u_s/4$ at which CR advect away downstream of the shock, where $n_\infty$ is the CR number density far downstream. Hence the fractional number of CR lost after each crossing is $(n_\infty/n_s)(u_s/c)$. In the limit of small shock velocity, the VFP equation dictates that $n_\infty = n_s$ and the fractional number lost is $u_s/c$. When combined with the fractional energy gain of $u_s/c$, a power law CR distribution results with differential energy spectrum $n(E)\,dE \propto E^{-\gamma}\,dE$ where $\gamma = 2$. If, as we show below, $n_s$ can differ from $n_\infty$, the ratio of fractional loss to fractional gain becomes $n_\infty/n_s$ and the spectral index is

$$\gamma = 1 + \frac{n_\infty}{n_s} \tag{70}$$
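The loss and gain bookkeeping above can be checked numerically. The following is a minimal sketch, not taken from the original analysis: the function name and the finite-velocity branch are illustrative assumptions. It takes the fractional energy gain per cycle as $u_s/c$ and the fractional loss as $(n_\infty/n_s)(u_s/c)$, as stated in the text, and shows that the resulting differential index tends to $1 + n_\infty/n_s$ when $u_s/c$ is small.

```python
import math


def spectral_index(n_inf_over_ns, us_over_c=None):
    """Differential energy spectral index from the loss/gain argument.

    n_inf_over_ns : ratio of far-downstream to at-shock CR number density.
    us_over_c     : shock speed in units of c. None selects the
                    small-velocity limit, gamma = 1 + n_inf/n_s.
    """
    if us_over_c is None:
        return 1.0 + n_inf_over_ns

    gain = us_over_c                  # fractional energy gain per cycle
    loss = n_inf_over_ns * us_over_c  # fraction of CR lost per cycle
    # After k cycles E = E0 (1 + gain)^k and N = N0 (1 - loss)^k, so the
    # integral spectrum is N(>E) ~ E^(-s) with s = -ln(1-loss)/ln(1+gain).
    s = -math.log(1.0 - loss) / math.log(1.0 + gain)
    # The differential spectrum is one power steeper than the integral one.
    return 1.0 + s


if __name__ == "__main__":
    for ratio in (0.8, 1.0, 1.2):
        print(ratio, spectral_index(ratio), spectral_index(ratio, us_over_c=1e-3))
```

A ratio $n_\infty/n_s$ above one steepens the spectrum and a ratio below one flattens it. The finite-velocity branch only illustrates convergence to the limit, since the first-order expressions for gain and loss are themselves approximate at large $u_s/c$.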

The above argument predicts a universal $E^{-2}$ spectrum for all high Mach number non-relativistic shocks, whether they are perpendicular, parallel or oblique. However, the argument depends on the result that $n_\infty = n_s$. It has been clear for a while that this breaks down when the shock velocity is relativistic (e.g. Achterberg et al. 2001, and Spitkovsky 2012, this issue), but it has recently been shown (Bell et al. 2011) that departures from this spectrum occur at shock velocities as low as $\sim c/30$, as found in young supernova remnants. The departure is particularly strong for shocks that are nearly perpendicular and when the CR mean free path $\Lambda$ is larger than the Larmor radius $r_g$. As shown in the previous section, the precursor scaleheight at a perpendicular shock is $L \sim (r_g/\Lambda)(c/u_s)\,r_g$. According to this formula, the upstream scaleheight can be very short, which would imply a discontinuity in the CR density gradient across the shock. Kinetic theory does not allow discontinuities to occur over distances less than a CR Larmor radius, since CR gyration imposes a smoothing distance of a Larmor radius on the CR distribution function. If the precursor scaleheight is not much larger than the Larmor radius, the overall change in CR density across the shock takes place partly downstream as well as upstream. The result is that the CR density at the shock, $n_s$, is less than the density far downstream, $n_\infty$, the spectral index $\gamma$ is greater than two, and the CR spectrum is steepened as indicated by Eq. (70).
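To give a feel for when the precursor becomes comparable to a Larmor radius, the scaling quoted above can be evaluated directly. This is a rough sketch under the order-of-magnitude relation $L/r_g \sim (r_g/\Lambda)(c/u_s)$. The function name and the sample values of $\Lambda/r_g$ are illustrative assumptions, and order-unity factors are ignored.

```python
def precursor_scaleheight_in_larmor_radii(mfp_over_rg, us_over_c):
    """Order-of-magnitude L / r_g at a perpendicular shock.

    mfp_over_rg : CR mean free path in units of the Larmor radius.
    us_over_c   : shock speed in units of c.
    Uses L ~ (r_g / mean free path) (c / u_s) r_g; order-unity factors
    are not included.
    """
    if mfp_over_rg <= 0.0 or us_over_c <= 0.0:
        raise ValueError("both arguments must be positive")
    return (1.0 / mfp_over_rg) * (1.0 / us_over_c)


if __name__ == "__main__":
    us_over_c = 1.0 / 30.0  # shock speed quoted for young SNRs
    for mfp_over_rg in (1.0, 10.0, 30.0, 100.0):  # hypothetical values
        ratio = precursor_scaleheight_in_larmor_radii(mfp_over_rg, us_over_c)
        print(f"mean free path = {mfp_over_rg:6.1f} r_g -> L ~ {ratio:6.2f} r_g")
```

For $u_s = c/30$ the estimated scaleheight falls to about one Larmor radius once $\Lambda \approx 30\,r_g$, which is the regime in which the smoothing by gyration described above matters.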

In contrast, solution of the VFP equation shows that the spectrum is flattened if the shock is oblique and more than 10–20° from perpendicular, depending on the shock velocity. Compression at the shock increases the component of the magnetic field perpendicular to the shock normal, because the field is frozen into the background plasma, whereas the component parallel to the shock normal is unchanged by the shock. Consequently the magnetic field increases in magnitude and changes direction at an oblique shock. The change in field acts as a partial magnetic mirror which reflects back upstream some CR trying to cross the shock into the downstream plasma. The shock acts as a partial snowplough, pushing CR ahead of it. This produces a local excess in the CR number density at the shock such that $n_s > n_\infty$, and the CR spectrum is flattened in accordance with Eq. (70), giving $\gamma < 2$.
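The change in field magnitude and direction at an oblique shock follows from simple geometry. The sketch below is illustrative only: it assumes the field component along the shock normal is unchanged and the tangential component is multiplied by the compression ratio, taken as 4 for a strong shock, and it ignores any modification of the shock by cosmic rays. The function name is hypothetical, and the angle is measured between the upstream field and the shock normal (0° is a parallel shock, 90° a perpendicular one).

```python
import math


def downstream_field(theta_up_deg, compression=4.0):
    """Return (|B_down| / |B_up|, downstream angle in degrees).

    theta_up_deg : angle between the upstream field and the shock normal.
    compression  : factor applied to the tangential field component
                   (assumed 4, the strong-shock value).
    """
    theta = math.radians(theta_up_deg)
    b_normal = math.cos(theta)                    # unchanged by the shock
    b_tangential = compression * math.sin(theta)  # compressed with the plasma
    magnitude_ratio = math.hypot(b_normal, b_tangential)
    theta_down_deg = math.degrees(math.atan2(b_tangential, b_normal))
    return magnitude_ratio, theta_down_deg


if __name__ == "__main__":
    for theta_up in (0.0, 30.0, 60.0, 80.0, 90.0):
        ratio, theta_down = downstream_field(theta_up)
        print(f"theta_up = {theta_up:5.1f} deg -> "
              f"|B_d|/|B_u| = {ratio:4.2f}, theta_down = {theta_down:5.1f} deg")
```

The jump in field strength at intermediate angles is what provides the partial mirror. This geometric estimate says nothing about how many CR are reflected, which requires the VFP solution referred to above.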

Clearly, the universal strong shock spectrum, γ=2, does not hold for young SNR shocks. This is consistent with radio observations, which show significantly steepened spectra in very young SNRs expanding at high velocity (Bell et al. 2011). The effect is not confined to high velocity shocks. The crucial parameter is the ratio of the shock velocity to the velocity of the accelerating particle, so the spectra of sub-relativistic particles accelerated by heliospheric shocks may be expected to show a related departure from γ=2.

6.2 Non-linearity and Time-Dependence

Observationally, direct evidence exists only for the acceleration of electrons at supernova remnant blast waves, through narrow synchrotron rims in X-rays and radio synchrotron emission from a more extended region. In theory there is no reason why protons and heavier nuclei would not be accelerated through the same process. Gamma-ray observations currently hint at escaping nuclei interacting with molecular clouds, and perhaps also at in situ protons at SNR shock waves. Protons could potentially be much more abundant (estimates of a factor 1000 are not uncommon), due to easier injection and smaller radiative losses, and may reach higher energies. Efficient proton acceleration would therefore have important consequences for the shock structure and temperature. The non-linear effects that arise when cosmic rays constitute a significant fraction of the energy at the shock have been widely discussed and modelled (Ellison et al. 1995; Malkov 1997; Blasi 2002; Kang and Jones 2005; Amato and Blasi 2006; Vladimirov et al. 2008; Kang et al. 2009; Patnaude et al. 2009; Caprioli et al. 2010; Ferrand et al. 2010). Especially towards the higher-energy end of the spectrum, the spectral index can significantly flatten due to the higher overall compression ratio that is probed by the more energetic cosmic rays. At the low-energy end spectral steepening can occur, although this is only important in low-Mach number shocks unless almost all of the energy goes into cosmic rays (e.g. Vink et al. 2010). This may be important to consider in simulations of clusters of galaxies, in which low-Mach number shocks appear to accelerate particles (Ryu et al. 2003; Pfrommer et al. 2006; Vazza et al. 2009). The transfer of energy to cosmic rays may further be seen in temperature deviations behind the shock, where part of the energy that would otherwise heat the plasma has effectively gone into a cosmic ray component (Helder et al. 2009; Patnaude et al. 2009).

Whether a shock can be an efficient accelerator, and what the resulting cosmic ray spectrum looks like, also depends on the environment the shock is running into. The shock velocity of a core-collapse SNR will evolve differently from that of an SNR expanding into a homogeneous medium, as may be the case for most Type Ia SNe. The time-dependent evolution will determine the cumulative spectrum (Schure et al. 2010). The environment may in some cases also affect the damping through ion-neutral collisions (Reville et al. 2007), the detectable emission through various energy-exchange processes (e.g. Raymond et al. 2011) or through surrounding molecular clouds enhancing the target density for pion creation from escaping cosmic ray protons (e.g. Ohira et al. 2011).

Apart from magnetic field amplification, progress has been made on the theory of DSA in situations that deviate from the ideal case. In reality the cosmic ray powerlaw reflects a complex combination of time-dependence, shock obliquity and shock speed, as discussed in Sect. 6.1. The theory of how cosmic rays may escape upstream is also under active development, with recent papers by Drury (2011) and Ohira and Ioka (2011).

7 Discussion and Conclusion

The importance of magnetic field amplification has been recognised since the early developments of the theory of DSA, and in the past decade significant progress has been made in this field. Driven by observations that strongly indicate amplification by factors of 10–100, various theories have been developed to explain this intrinsically nonlinear process. A distinction is made between scales shorter and longer than the gyroradius of the driving cosmic rays. The most rapid amplification can be achieved on short scales, but amplification on longer scales is paramount in accelerating cosmic rays to the PeV energies that they are believed to gain in galactic sources. Currently, SNRs still seem the best candidate, but the process is independent of the type of shock wave and other sources may well contribute. The nonlinear behaviour of the various instabilities remains an area of active research, and more work is required to decide satisfactorily in which regimes the various instabilities operate and dominate. The saturation level also needs to be determined. Observations that determine more precisely the magnetic field direction and degree of polarisation, on different length and time scales, could aid in constraining the theory. The high-energy end of the emission spectrum can be better probed with the current observatories that operate over a range of wavelengths in the gamma-ray regime, most notably Fermi-LAT for the low-energy end and Cherenkov telescopes such as HESS for the high-energy end. The next generation of telescopes, such as CTA, will certainly be extremely useful, provided the observed energy range is pushed high enough to really distinguish between electron- and proton-based emission processes. High-resolution radio measurements of magnetic field strength, and polarimetry in the X-ray band, are other items on the wish list.