1 Introduction

In 1962 Bruno Rossi finished writing his book Cosmic Rays (Rossi 1964), coinciding with the 50th anniversary of the discovery of cosmic rays (CRs), although the book was published only in 1964. In the epilogue of the book he emphasized how the field of CR research had become a complex combination of several fields, from Astronomy to Plasma Physics and Particle Physics. He also argued that “It is quite possible that future historians of science will close the chapter on cosmic rays with the fiftieth anniversary of Hess’s discovery”. Interestingly enough, very little of what will be discussed in the present review was known or even proposed at the time of Rossi’s book: scientists in this field have been extremely active and many new ideas and new observations have changed much of what was believed in the early 1960s. The purpose of this review is to provide an account of these exciting developments, especially the ones that took place in the last decade or so. I am pretty sure that historians of science will not close the chapter on cosmic rays with the 100th anniversary of their discovery. Too many loose ends remain to be tied up.

Cosmic rays are mainly charged particles that contribute an energy density in the Galaxy of about 1 eV cm\(^{-3}\). They are mainly protons (hydrogen nuclei), with a fraction of about 10 % of helium nuclei and smaller abundances of heavier elements. Despite the much lower fluxes of electrons and positrons, these particles provide us with precious information on the sources of CRs and the transport of these particles through the Galactic magnetic field. An even smaller flux of electromagnetic radiation (from radio frequencies to gamma rays) reaches the Earth both from the sources and from the interactions that CRs occasionally suffer during propagation. The models we develop for the origin of CRs are all based on an attempt to interpret these separate pieces of observations within a unified framework.

The flux of all nuclear components present in CRs (the so-called all-particle spectrum) is shown in Fig. 1. At low energies (below ∼30 GeV) the spectral shape bends down, as a result of the modulation imposed by the magnetized wind originating from our Sun, which inhibits very low energy particles from reaching the inner solar system. The prominent steepening of the spectrum at energy \(E_{K}=3\times10^{15}\) eV is named the knee: at this point the spectral slope of the differential flux (flux of particles reaching the Earth per unit time, surface and solid angle, per unit energy interval) changes from ∼−2.7 to ∼−3.1. There is evidence that the chemical composition of CRs changes across the knee region, becoming increasingly dominated by heavy nuclei at high energy (see Hörandel 2006, for a review), at least up to ∼\(10^{17}\) eV. At even higher energies the chemical composition remains a matter of debate. Recent measurements carried out with KASCADE-GRANDE (Apel et al. 2013) reveal an interesting structure in the spectrum and composition of CRs between \(10^{16}\) and \(10^{18}\) eV: the collaboration managed to separate the showers into electron-rich (a proxy for light chemical composition) and electron-poor (a proxy for heavy composition) showers and showed that the light component (presumably protons and He, with some contamination from CNO) has an ankle-like structure at \(10^{17}\) eV. The authors suggest that this feature signals the transition from Galactic to extragalactic CRs (in the light nuclei component). The spectrum of Fe-like CRs continues up to energies of ∼\(10^{18}\) eV, where the flux of Fe and the flux of light nuclei are comparable. A similar conclusion was recently reached by the IceTop Collaboration (Aartsen et al. 2013). This finding does not seem to be in obvious agreement with the results of the Pierre Auger Observatory (Abraham et al. 2010), HiRes (Sokolsky and Thomson 2007) and Telescope Array (Sokolsky 2013), which find a chemical composition at \(10^{18}\) eV that is dominated by the light chemical component.
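As a minimal illustration of the change of slope at the knee, the sketch below models the all-particle differential flux as a sharp broken power law with the slopes quoted above. The normalization `J_KNEE` and the sharp break are assumptions made only for illustration; the measured spectrum changes slope more gradually.

```python
import math

E_KNEE_EV = 3e15     # knee energy quoted in the text [eV]
SLOPE_BELOW = 2.7    # differential flux ~ E^-2.7 below the knee
SLOPE_ABOVE = 3.1    # differential flux ~ E^-3.1 above the knee
J_KNEE = 1.0         # arbitrary normalization at the knee (assumption)


def all_particle_flux(energy_ev):
    """Idealized broken power law, continuous at the knee (arbitrary units)."""
    slope = SLOPE_BELOW if energy_ev < E_KNEE_EV else SLOPE_ABOVE
    return J_KNEE * (energy_ev / E_KNEE_EV) ** (-slope)


def local_slope(energy_ev, step=1.01):
    """Numerical logarithmic slope d ln J / d ln E around energy_ev."""
    j_lo = all_particle_flux(energy_ev / step)
    j_hi = all_particle_flux(energy_ev * step)
    return math.log(j_hi / j_lo) / math.log(step ** 2)


for energy in (1e13, 1e14, 1e16, 1e17):
    print(f"E = {energy:.0e} eV: J = {all_particle_flux(energy):.3e}, "
          f"slope = {local_slope(energy):+.2f}")
```

Away from the break, `local_slope` recovers the two input slopes; it is only meant as a quick way to read off a slope from any tabulated spectrum.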

Fig. 1

Spectrum of cosmic rays at the Earth (courtesy Tom Gaisser). The all-particle spectrum measured by different experiments is plotted, together with the proton spectrum. The subdominant contributions from electrons, positrons and antiprotons as measured by the PAMELA experiment are shown

The presence of a knee and the change of chemical composition around it have stimulated the idea that the bulk of CRs originates within our Galaxy. The knee could for instance result from the superposition of cutoffs in the spectra of the different chemical elements, since most acceleration processes are rigidity dependent: if protons are accelerated in the sources to a maximum energy \(E_{p,\max}\sim5\times10^{15}\) eV, then an iron nucleus will be accelerated to \(E_{\mathrm{Fe},\max}=26E_{p,\max}\sim(1\mbox{--}2)\times10^{17}\) eV (it is expected that at such high energies even iron nuclei are fully ionized, therefore the unscreened charge is Z=26). A knee would naturally arise as the superposition of the cutoffs in the spectra of individual elements (see for instance Hörandel 2004; Blasi and Amato 2012a; Gaisser et al. 2013).
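The rigidity scaling of the cutoffs can be made explicit with a short sketch. It assumes fully ionized nuclei and a common maximum rigidity, so that the maximum energy is Z times the proton value; the selection of species in `CHARGES` is illustrative and not taken from the text.

```python
E_P_MAX_EV = 5e15  # proton maximum energy quoted in the text [eV]

# Nuclear charges of a few representative species (illustrative selection).
CHARGES = {"H": 1, "He": 2, "C": 6, "O": 8, "Si": 14, "Fe": 26}


def max_energy_ev(charge, e_p_max_ev=E_P_MAX_EV):
    """Maximum energy of a fully ionized nucleus of charge Z: Z * E_p,max."""
    return charge * e_p_max_ev


for name, charge in CHARGES.items():
    print(f"{name:>2s} (Z={charge:2d}): E_max = {max_energy_ev(charge):.2e} eV")
```

For iron this gives an energy in the range (1–2)×10^17 eV quoted above, with the lighter species cutting off at proportionally lower energies.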

The apparent regularity of the all-particle spectrum in the energy region below the knee is at odds with the recent detection of features in the spectra of individual elements, most notably protons and helium: the PAMELA satellite has provided evidence that both the proton and helium spectra harden at 230 GeV (Adriani et al. 2011). The spectrum of helium nuclei is also found to be systematically harder than the proton spectrum, though only by a small amount. The slope of the proton spectrum below 230 GeV was measured to be \(\gamma_{1}=2.89\pm0.015\), while the slope above 230 GeV becomes \(\gamma_{2}=2.67\pm0.03\). The slopes of protons and helium spectra at high energies as measured by PAMELA appear to be in agreement with those measured by the CREAM experiment (Ahn et al. 2010) at supra-TeV energies. Some evidence also exists for a similar hardening in the spectra of heavier elements (see Maestro et al. 2010 and references therein).

Different explanations for the feature at 230 GV have been put forward: Thoudam and Hörandel (2012, 2013) suggested that a local source of CRs might appear in the total spectrum as a spectral hardening. On the other hand, the fact that a similar feature has been detected in the spectrum of helium nuclei (and possibly heavier nuclei) might suggest that a new physical phenomenon is showing up, probably due to CR transport. For instance, Tomassetti (2012) showed that a spatially dependent diffusion coefficient may induce a spectral hardening under some assumptions on the functional form of the diffusion coefficient (non-separability between energy and space coordinates is required). Blasi et al. (2012a) and Aloisio and Blasi (2013) showed that a similar feature may naturally appear if CRs can produce their own scattering centers (diffusion) through streaming instability. In the latter model, the feature appears at ∼200 GeV/n as a result of the transition from self-generated diffusion to diffusion in pre-existing turbulence.

Very recently, some preliminary data from the AMS-02 experiment on the International Space Station have been presented and do not confirm the existence of the spectral breaks in the proton and helium spectra observed by PAMELA. Given the preliminary nature of these data and the lack of refereed publications at the time of writing of this review, I cannot comment further on their relevance.

The measurement of the ratio of fluxes of some nuclei that can only be produced by CR spallation and the flux of their parent nuclei provides the best estimate so far of the amount of matter that CRs traverse during their journey through the Galaxy. In order to account for the observed B/C ratio, CRs must travel for times that exceed the ballistic time by several orders of magnitude before escaping the Galaxy (this number decreases with energy). This is the best argument to support the ansatz that CRs travel diffusively in the Galactic magnetic field (Juliusson et al. 1972). A similar conclusion can be drawn from the observed flux of some unstable isotopes such as 10Be (Simpson and Garcia-Munoz 1988). The decrease of the B/C ratio with energy per nucleon is well described in terms of a diffusion coefficient that increases with energy.

In principle a similar argument can be applied to the so-called positron fraction, the ratio of fluxes of positrons and electrons plus positrons, \(\varPhi_{e^{+}}/(\varPhi_{e^{+}}+\varPhi_{e^{-}})\), where, however, special care is needed because of the important role of energy losses for leptons. In first approximation, it is expected that positrons may only be secondary products of inelastic CR interactions that lead to the production and decays of charged pions. In this case it can be proven that the positron fraction must decrease with energy. In fact several past observations, and most recently the PAMELA measurements (Adriani et al. 2009) and the AMS-02 measurement (Aguilar et al. 2013), showed that the positron fraction increases with energy above ∼10 GeV. This anomalous behavior is not reflected in the flux of antiprotons (Adriani et al. 2008): the ratio of the antiprotons to proton fluxes \(\varPhi_{\bar{p}}/\varPhi _{p}\) is seen to decrease, as expected based on the standard model of diffusion. Although the rise of the positron fraction has also been linked to dark matter annihilation in the Galaxy, there are astrophysical explanations of this phenomenon that can account for the data without extreme assumptions (see the review paper by Serpico 2012 for a careful description of both astrophysical models and dark matter inspired models).

The simple interpretation of the knee as a superposition of the cutoffs in the spectra of individual elements, as discussed above, would naively lead to the conclusion that the spectrum of Galactic CRs should end at \(\sim26E_{K}\lesssim10^{17}\) eV. Clearly this conclusion is not straightforward: a rare type of source that can potentially accelerate CRs to much larger energies may leave the interpretation of the knee unaffected and yet change the energy at which Galactic CRs end. This opens the very important question of where one should expect the transition to extragalactic CRs to take place. Although in the present review I will only occasionally touch upon the problem of ultra-high-energy cosmic rays (UHECRs), it is important to realize that the quest for their origin is intimately connected with the nature of the transition from Galactic CRs to UHECRs.

At the time of this review, there is rather convincing and yet circumstantial evidence that the bulk of CRs are accelerated in supernova remnants (SNRs) in our Galaxy, as first proposed by Baade and Zwicky (1934), Ginzburg and Syrovatsky (1961). The evidence is based on several independent facts: gamma rays unambiguously associated with production of neutral pions have been detected from several SNRs close to molecular clouds (Ackermann et al. 2013; Tavani et al. 2010); the gamma-ray emission detected from the Tycho SNR (Giordano et al. 2012; Acciari et al. 2011) also appears to be most likely of hadronic origin (Morlino and Caprioli 2011; Berezhko et al. 2013); the bright X-ray rims detected from virtually all young SNRs (see Vink 2012, Ballet 2006 for a recent review) prove that the local magnetic field in the shock region has been substantially amplified, probably by accelerated particles themselves, due to streaming instability (for recent reviews see Bykov et al. 2011a, 2013, Schure et al. 2012). Despite all this circumstantial evidence, no proof has been found yet that SNRs can accelerate CRs up to the knee energy.

Charged particles can be energized at a supernova shock through diffusive shock acceleration (DSA) (Krymskii 1977, Blandford and Ostriker 1978, Axford et al. 1977, Bell 1978a, 1978b). If SNRs are the main contributors to Galactic CRs, an efficiency of ∼10 % in particle acceleration is required (see Sect. 2). The dynamical reaction of accelerated particles at a SNR shock is large enough to change the shock structure, so as to call for a non-linear theory of DSA (Malkov and Drury 2001). Such a theory should also be able to describe the generation of magnetic field in the shock region as due to CR-driven instabilities (Amato and Blasi 2006; Caprioli et al. 2008, 2009b), although many problems still need to be solved.

The combination of DSA and diffusive propagation in the Galaxy represents what I will refer to as the supernova remnant paradigm. Much work is being done at the time of this review to find solid evidence in favor of or against this paradigm. I will summarize this work here.

The review is structured as follows: in Sect. 2 I will review the basic aspects of the SNR paradigm for the origin of CRs; in Sect. 3 I will provide a pedagogical discussion of the mechanism of diffusive shock acceleration (DSA) at collisionless shocks and the maximum energy achievable. The non-linear version of the theory of DSA is illustrated in Sect. 4, where the dynamical reaction of accelerated particles and magnetic field amplification are discussed in depth. In Sect. 5 I briefly discuss the issue of SN explosions in superbubbles. Several crucial pieces of the SNR paradigm (CR escape, spectra of SNRs and SNRs close to molecular clouds) are discussed in Sect. 6. The phenomenon of DSA in partially ionized material is discussed in Sect. 7, with special emphasis on the implications of CR acceleration for the width of the line in Balmer-dominated shocks. I conclude in Sect. 8.

2 The bases of the SNR paradigm

The abundances of some light elements such as boron, lithium and beryllium in the CRs provide us with the best estimates of the time \(\tau _{\it esc}(E)\) spent by CRs in the Galaxy before escaping. More precisely, the ratio of boron and carbon fluxes is related to the grammage traversed by CRs, \(X(E)=\bar{n} \mu v \tau_{\it esc}(E)\), where \(\bar{n}\) is the mean gas density in the confinement volume of the Galaxy (disc plus halo), μ is the mean mass of the gas, and v is the speed of particles. For particles with energy per nucleon of 10 GeV/n the measured B/C corresponds to X∼10 g cm\(^{-2}\). If the sources are located in the thin disc of the Galaxy with half thickness h=150 pc and the halo extends to a height H, the mean density can be estimated as \(\bar{n}=n_{\it disc} h/H = 5\times10^{-2} \bigl(\frac {n_{\it disc}}{1~\mathrm{cm}^{-3}} \bigr) \bigl(\frac{H}{3~\mathrm{kpc}} \bigr)^{-1}~\mathrm{cm}^{-3}\). For a standard chemical composition of the ISM (\(n_{\it He}\approx0.15 n_{H}\)) the mean mass is \(\mu= (n_{H}+4n_{\it He})/(n_{H}+n_{\it He})\approx1.4 m_{p}\). It follows that for a proton with energy \(E_{*}=10\) GeV the typical escape time is

$$ \tau_{*} \sim\frac{X(E_{*})}{\bar{n} \mu c} = 90 \biggl(\frac{H}{3~\mathrm{kpc}} \biggr)~\mathrm{Myr}, $$
(1)

which exceeds the ballistic propagation time scale by at least three orders of magnitude. This remains the strongest evidence so far for diffusive motion of CRs in the Galaxy. A diffusion coefficient can be introduced as \(\tau_{\it esc}(E)=H^{2}/D(E)=\tau_{*}(E/E_{*})^{-\delta }\), so that at 10 GeV \(D(E)\simeq3\times10^{28} \bigl(\frac{H}{3~\mathrm{kpc}} \bigr)~\mathrm{cm}^{2}\,\mathrm{s}^{-1}\). The grammage (and therefore the escape time) decreases with energy (or rather with rigidity) as inferred from the B/C ratio, illustrated in Fig. 2, which shows a collection of data points on the ratio of fluxes of boron and carbon, as obtained by using the data collection provided by the Cosmic Ray Database (Maurin et al. 2013). Figure 2 illustrates the level of uncertainty in the determination of the slope of the B/C ratio at high energies, which is reflected in the uncertainty in the high-energy behavior of the diffusion coefficient. At low energies the uncertainty is even more severe due to the effects of solar modulation, which suppresses CR fluxes in a different way during different phases of the solar activity (see Potgieter 2013 for a recent review). The high rigidity behavior of the B/C ratio is compatible with a power law grammage \(X(R)\propto R^{-\delta}\) with δ=0.3–0.6.
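The numbers in Eq. (1) and the normalization of the diffusion coefficient can be checked with a few lines of code. This is a sketch using rounded CGS constants; the function names and default arguments are illustrative, with the grammage, disc half thickness, halo size and mean mass taken from the text.

```python
# Rounded CGS constants
M_P = 1.6726e-24   # proton mass [g]
C = 2.998e10       # speed of light [cm/s]
KPC = 3.086e21     # kiloparsec [cm]
MYR = 3.156e13     # megayear [s]


def escape_time_s(grammage=10.0, n_disc=1.0, h_kpc=0.15, halo_kpc=3.0, mu=1.4):
    """Escape time from Eq. (1): tau = X / (n_mean * mu * m_p * c), in seconds."""
    n_mean = n_disc * h_kpc / halo_kpc   # mean density over disc plus halo [cm^-3]
    return grammage / (n_mean * mu * M_P * C)


def diffusion_coefficient(halo_kpc=3.0, **kwargs):
    """Diffusion coefficient D = H^2 / tau_esc, in cm^2/s."""
    return (halo_kpc * KPC) ** 2 / escape_time_s(halo_kpc=halo_kpc, **kwargs)


def ballistic_time_s(halo_kpc=3.0):
    """Straight-line crossing time of the halo, H / c, in seconds."""
    return halo_kpc * KPC / C


tau = escape_time_s()
print(f"escape time        : {tau / MYR:.0f} Myr")
print(f"diffusion coeff.   : {diffusion_coefficient():.1e} cm^2/s")
print(f"tau_esc / (H / c)  : {tau / ballistic_time_s():.1e}")
```

With the default values the escape time comes out close to 90 Myr and the diffusion coefficient close to 3×10^28 cm^2/s, and both scale linearly with H as stated in the text.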

Fig. 2

B/C ratio as a function of energy per nucleon. Data have been extracted from the Cosmic Ray Database (Maurin et al. 2013)

Supernovae exploding in our Galaxy at a rate \(\mathcal{R}_{\it SN}\) liberate a kinetic energy in the form of moving ejecta of \(E_{\it SN}=10^{51}E_{51}\) erg. This number is weakly dependent upon whether the SN is of type Ia or a core-collapse SN, although it might be somewhat different for rare types of SNe (type Ib, Ic), possibly connected with gamma-ray bursts. As I discuss below, particle acceleration in SNRs is believed to occur through diffusive shock acceleration, which leads to power law spectra of accelerated particles, and for the sake of the present discussion I assume that such an injection spectrum is in the form

$$N(p)=\xi_{\it CR}\frac{E_{\it SN}}{m^{2}}I(\gamma) \biggl(\frac{p}{m} \biggr)^{-\gamma},\quad\quad I(\gamma)\approx\frac{2(3-\gamma)(\gamma -2)}{4-\gamma}, $$

where γ>2 is the slope of the differential spectrum of accelerated particles and \(\xi_{\it CR}<1\) is the CR acceleration efficiency. Here I(γ) is a normalization factor obtained by imposing that the total energy at the source equals \(\xi_{\it CR}E_{\it SN}\). It is best to normalize the flux of CRs to the observed proton flux, since it is not expected to be affected by spallation reactions. The flux of protons observed by different experiments is shown in Fig. 3 (data are from the Cosmic Ray Database (Maurin et al. 2013)). Provided we focus on sufficiently high energies, ionization losses can also be neglected and the effects of solar modulation play no role (we can also assume \(E\simeq pc\)). In this case the spectrum of CR protons contributed by SNRs at the Earth can be simply written as

$$\begin{aligned} J(E) =& \frac{c}{4\pi}\frac{N(E)\mathcal{R}_{\it SN}}{\pi R_{d}^{2} 2 H} \tau_{\it esc}(E) \\ =& 8\times10^{5} \xi_{\it CR} I(\gamma) \biggl(\frac{\mathcal{R}_{\it SN}}{30~\mathrm{yr}^{-1}} \biggr) \biggl(\frac{E}{m} \biggr)^{-\gamma-\delta} \biggl(\frac{E_{*}}{m} \biggr)^{\delta}~\mathrm{m}^{-2}\,\mathrm{s}^{-1}\,\mathrm{sr}^{-1} \,\mathrm{GeV}^{-1}, \end{aligned}$$
(2)

where I assumed that the disc of the Galaxy has a radius \(R_{d}=10\) kpc. It is useful to notice that if the escape time is normalized to the B/C ratio at a given energy \(E_{*}\) (see Eq. (1)) then the expected flux becomes independent of the size of the halo H. This reflects the fact that in the simple diffusion model introduced here the CR flux in the absence of losses and the grammage both scale with the ratio H/D(E). This rule of thumb remains valid even in more sophisticated propagation calculations, such as GALPROP.
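The numerical coefficient of Eq. (2), and its independence of H, can be verified with the following sketch. It assumes that the rate normalization in Eq. (2) means one supernova every 30 yr and reuses the escape time of Eq. (1); the function names and the rounded constants are illustrative.

```python
import math

C = 2.998e10          # speed of light [cm/s]
KPC = 3.086e21        # kiloparsec [cm]
YR = 3.156e7          # year [s]
M_P_G = 1.6726e-24    # proton mass [g]
M_P_GEV = 0.938       # proton mass [GeV]
GEV_PER_ERG = 624.15  # conversion factor from erg to GeV


def escape_time_s(halo_kpc, grammage=10.0, n_disc=1.0, h_kpc=0.15, mu=1.4):
    """Escape time at E_* from Eq. (1), in seconds."""
    n_mean = n_disc * h_kpc / halo_kpc
    return grammage / (n_mean * mu * M_P_G * C)


def flux_coefficient(halo_kpc=3.0, disc_radius_kpc=10.0,
                     e_sn_erg=1e51, sn_interval_yr=30.0):
    """Coefficient multiplying xi_CR I(gamma) (E/m)^-(gamma+delta) (E_*/m)^delta
    in Eq. (2), in m^-2 s^-1 sr^-1 GeV^-1."""
    source_norm = e_sn_erg * GEV_PER_ERG / M_P_GEV ** 2   # E_SN / m^2 [GeV^-1]
    sn_rate = 1.0 / (sn_interval_yr * YR)                 # [s^-1]
    volume = math.pi * (disc_radius_kpc * KPC) ** 2 * 2.0 * halo_kpc * KPC
    per_cm2 = (C / (4.0 * math.pi)) * source_norm * sn_rate / volume \
        * escape_time_s(halo_kpc)
    return per_cm2 * 1e4                                  # cm^-2 -> m^-2


for halo in (1.0, 3.0, 6.0):
    print(f"H = {halo:.0f} kpc: coefficient = {flux_coefficient(halo):.2e}")
```

The coefficient comes out at slightly below 10^6 in these units, close to the value quoted in Eq. (2) given the rounding of the constants, and it does not change with H because the factor 1/H in the confinement volume cancels against the escape time.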

Fig. 3

Proton spectrum as measured by different experiments. Data are from the Cosmic Ray Database (Maurin et al. 2013)

Normalizing to the proton flux at \(E_{*}=10\) GeV,

$$E_{*}^{2}J(E_{*})\approx2\times10^{3}~\mathrm{GeV}\,\mathrm{m}^{-2}\,\mathrm{s}^{-1}\,\mathrm{sr}^{-1} $$

(see Fig. 3), one immediately gets

$$ \xi_{\it CR} \approx\frac{2.5\times10^{-3}}{I(\gamma)} (E_{*}/m)^{(\gamma-2)} \biggl(\frac{\mathcal{R}_{\it SN}}{30~\mathrm{yr}^{-1}} \biggr)^{-1}. $$
(3)

The required efficiency turns out to be a weak function of the slope of the injection spectrum γ and is typically \(\xi_{\it CR}\simeq 2\)–3 % when changing the value of δ. The total CR acceleration efficiency is somewhat higher than the estimate in Eq. (3) because of the contribution of nuclei heavier than hydrogen. More refined calculations provide a better estimate of the total acceleration efficiency that is between 5 % and 10 % for the bulk of SNRs, while it can be higher or smaller for individual objects, depending upon the environment in which the supernova event takes place.
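Equation (3) is simple enough to evaluate directly. The sketch below tabulates the required proton efficiency for a few injection slopes; the range of γ scanned is an illustrative choice, and the supernova rate is again interpreted as one event every 30 yr.

```python
def normalization(gamma):
    """Approximate normalization factor I(gamma) defined after the injection spectrum."""
    return 2.0 * (3.0 - gamma) * (gamma - 2.0) / (4.0 - gamma)


def efficiency(gamma, e_star_gev=10.0, m_gev=0.938, sn_interval_yr=30.0):
    """Required acceleration efficiency for protons from Eq. (3)."""
    rate_factor = sn_interval_yr / 30.0   # equals (R_SN / (1 per 30 yr))^-1
    return (2.5e-3 / normalization(gamma)
            * (e_star_gev / m_gev) ** (gamma - 2.0) * rate_factor)


for gamma in (2.1, 2.2, 2.3, 2.4):
    print(f"gamma = {gamma:.1f}: I = {normalization(gamma):.3f}, "
          f"xi_CR = {100.0 * efficiency(gamma):.1f} %")
```

Over this range of slopes the efficiency stays within the 2–3 % interval quoted above, which illustrates how weakly the result depends on γ.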

3 The theory of diffusive shock acceleration of test particles

A supernova explosion in the interstellar medium (ISM) results in the injection of metal enriched ejecta with a total mass \(M_{\it ej}\) moving with a velocity \(V_{\it ej}\). If the total energy output in the form of kinetic energy is \(E_{\it SN}=10^{51} E_{51}\) erg, then the velocity of the ejecta in the initial phases can be written as

$$ V_{\it ej} = 10000 E_{51}^{1/2} M_{{\it ej},\odot}^{-1/2}~\mathrm{km/s}, $$
(4)

where \(M_{{\it ej},\odot}\) is the mass of the ejecta in units of solar masses.

The sound speed in the ISM can be estimated as

$$ c_{s} = \sqrt{\gamma_{g} \frac{k T}{m_{p}}} \approx11 \biggl(\frac {T}{10^{4}K} \biggr)^{1/2}~\mathrm{km/s}, $$
(5)

where \(\gamma_{g}\) is the adiabatic index (assumed here to be \(\gamma_{g}=5/3\)) and T is the temperature. It follows that the typical Mach number of the plasma ejected in a SN explosion is

$$ M_{s} = \frac{V_{\it ej}}{c_{s}} \approx900 E_{51}^{1/2} M_{{\it ej},\odot }^{-1/2} \biggl(\frac{T}{10^{4}K} \biggr)^{-1/2}. $$
(6)
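The estimates in Eqs. (4)–(6) can be reproduced as follows. The sketch assumes that the ejecta velocity follows from \(E_{\it SN}=\frac{1}{2}M_{\it ej}V_{\it ej}^{2}\), which is the relation that gives the normalization of Eq. (4); constants are rounded CGS values and the function names are illustrative.

```python
import math

M_SUN = 1.989e33   # solar mass [g]
M_P = 1.6726e-24   # proton mass [g]
K_B = 1.381e-16    # Boltzmann constant [erg/K]
KM = 1e5           # kilometre [cm]


def ejecta_velocity_kms(e51=1.0, m_ej_sun=1.0):
    """Ejecta velocity from E_SN = (1/2) M_ej V_ej^2, in km/s (Eq. 4)."""
    return math.sqrt(2.0 * e51 * 1e51 / (m_ej_sun * M_SUN)) / KM


def sound_speed_kms(temperature_k=1e4, gamma_g=5.0 / 3.0):
    """Sound speed of the ISM, in km/s (Eq. 5)."""
    return math.sqrt(gamma_g * K_B * temperature_k / M_P) / KM


def mach_number(e51=1.0, m_ej_sun=1.0, temperature_k=1e4):
    """Sonic Mach number of the ejecta (Eq. 6)."""
    return ejecta_velocity_kms(e51, m_ej_sun) / sound_speed_kms(temperature_k)


print(f"V_ej = {ejecta_velocity_kms():.0f} km/s")
print(f"c_s  = {sound_speed_kms():.1f} km/s")
print(f"M_s  = {mach_number():.0f}")
```

The Mach number obtained this way is slightly below 900 because the sound speed is not rounded down to 11 km/s; the scalings with \(E_{51}\), \(M_{{\it ej},\odot}\) and T are those of Eq. (6).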

The motion of the ejecta is highly supersonic and drives the formation of a shock front. The motion of the shock front is heavily affected by the environment around the parent star and by the density profile in the ejecta (see McKee and Truelove 1995 for a review). The matter accumulated behind the shock during the expansion increases the inertia of the expanding shell and eventually slows down the explosion at a time when the accumulated mass equals that of the ejecta. For an explosion in the standard ISM one can write

$$ \frac{4}{3}\pi\rho_{\it ISM} R_{\it ST}^{3} = M_{\it ej} \to R_{\it ST} = \biggl( \frac{3 M_{\it ej}}{4\pi\rho_{\it ISM}} \biggr)^{1/3} \approx2 M_{{\it ej},\odot}^{1/3} \biggl( \frac{n_{\it ISM}}{1~\mathrm{cm}^{-3}} \biggr)^{-1/3}~\mathrm{pc}, $$
(7)

where \(R_{\it ST}\) defines the radius of the expanding shell at the beginning of the so-called Sedov–Taylor (adiabatic) phase. This stage of the SNR evolution starts at the time

$$ T_{\it ST}=\frac{R_{\it ST}}{V_{\it ej}}\approx200 M_{{\it ej},\odot }^{5/6}E_{51}^{-1/2} \biggl(\frac{n_{\it ISM}}{1~\mathrm{cm}^{-3}} \biggr)^{-1/3}~\mathrm{years}. $$
(8)
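A quick numerical check of Eqs. (7) and (8) is given below. The sketch assumes a pure hydrogen ISM, \(\rho_{\it ISM}=n_{\it ISM}m_{p}\), and an ejecta velocity given by \(E_{\it SN}=\frac{1}{2}M_{\it ej}V_{\it ej}^{2}\); both are assumptions made here to fix the numerical prefactors, and the function names are illustrative.

```python
import math

M_SUN = 1.989e33   # solar mass [g]
M_P = 1.6726e-24   # proton mass [g]
PC = 3.086e18      # parsec [cm]
YR = 3.156e7       # year [s]


def ejecta_velocity(e51=1.0, m_ej_sun=1.0):
    """Ejecta velocity in cm/s, assuming E_SN = (1/2) M_ej V_ej^2."""
    return math.sqrt(2.0 * e51 * 1e51 / (m_ej_sun * M_SUN))


def sedov_taylor_radius_pc(m_ej_sun=1.0, n_ism=1.0):
    """Radius at which the swept-up mass equals the ejecta mass (Eq. 7), in pc."""
    rho_ism = n_ism * M_P   # assumption: pure hydrogen ISM
    radius_cm = (3.0 * m_ej_sun * M_SUN / (4.0 * math.pi * rho_ism)) ** (1.0 / 3.0)
    return radius_cm / PC


def sedov_taylor_time_yr(e51=1.0, m_ej_sun=1.0, n_ism=1.0):
    """Start of the Sedov-Taylor phase, R_ST / V_ej (Eq. 8), in years."""
    radius_cm = sedov_taylor_radius_pc(m_ej_sun, n_ism) * PC
    return radius_cm / ejecta_velocity(e51, m_ej_sun) / YR


print(f"R_ST = {sedov_taylor_radius_pc():.2f} pc")
print(f"T_ST = {sedov_taylor_time_yr():.0f} yr")
```

For one solar mass of ejecta expanding into a medium with density 1 cm\(^{-3}\) this gives a radius of about 2 pc and a time of about 200 yr, and the time scales as \(M_{{\it ej},\odot}^{5/6}\) as in Eq. (8).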

These estimates of the Sedov–Taylor radius and time should be considered as orders of magnitude, while the actual values depend on the conditions around the supernova explosion. For instance, for a core-collapse SN explosion the material ejected by the pre-supernova star may dominate the density in the initial phases of the explosion, and the adiabatic phase may start at earlier times than indicated by Eq. (8). On the other hand, in the case of a fast wind with low density (as would be the case for a Wolf–Rayet pre-supernova star) the SN explosion might take place in an underdense bubble of hot dilute gas. In this case the adiabatic phase might start at a later time. In any case, for core-collapse SN explosions the dynamics of the expanding shell is usually much more complex to describe than in the case of type Ia SN explosions in the ISM. This is also reflected in the morphology of the non-thermal emission from SNRs of different types. The morphology of SNRs of core-collapse SN explosions is usually irregular and often asymmetric. This is also due to the fact that the environment in which massive stars explode through a core collapse is often complex, with an inhomogeneous distribution of gas and the presence of molecular clouds that provide the gas material for the formation of these massive, relatively short lived stars. On the other hand, type Ia SNRs are usually more regular and it is not rare to find cases of almost perfectly spherical SN shells as observed at all wavelengths.

In Fig. 4 I show the cases of RX J1713.7-3946 (left panel, from Uchiyama et al. 2002) and Tycho (right panel, from Warren et al. 2005). The former is a SNR originating from a core-collapse SN explosion and its gamma-ray emission (color) and X-ray emission (lines) show the irregular morphology of this remnant. The Tycho SNR is the leftover of a type Ia SN that exploded in 1572 at ∼3 kpc distance from the solar system. The image shows its thermal X-ray emission, mainly contributed by the ejecta in the central part of the explosion region, and the non-thermal X-ray emission which has the rim-like morphology shown in the picture. In Sect. 6 I will discuss at length the implications of the non-thermal X-ray morphology of SNRs and of Tycho in particular. All these aspects are very important whenever the predictions of a theory of acceleration of CRs have to face observations.

Fig. 4

Left panel: Morphology of RX J1713.7-3946. The colors illustrate the high-energy gamma-ray emission as measured by HESS (Aharonian et al. 2007), while the contours show the X-ray emission in the 1–3 keV band measured by ASCA (Uchiyama et al. 2002). Right panel: Morphology of the Tycho SNR as measured with Chandra (Warren et al. 2005). The three colors refer to emission in the photon energy range 0.95–1.26 keV (red), 1.63–2.26 keV (green), and 4.1–6.1 keV (blue). The latter emission is very concentrated in a thin rim and is the result of synchrotron emission of very high-energy electrons

As anticipated above, the supersonic motion of the ejecta of a stellar explosion leads to the formation of a shock front propagating in the ISM or in the circumstellar medium, depending on the type of SN explosion. The Mach number of the shock depends on the conditions in the region in which the explosion takes place. For instance the Mach number of the shock becomes appreciably lower than the one quoted in Eq. (6) if the shock propagates in the hot tenuous gas around a core-collapse SNR.

The first question that we have to face is, however, about the nature of these shock waves. In the section below I will argue that SN shocks (and in fact most astrophysical shock waves) are intrinsically different from the shock waves that we are used to in the Earth's atmosphere, in that the latter are mediated by molecular collisions, while the former could not be formed based on particle-particle collisions in the ISM. The SN shocks expanding in the ordinary ISM belong to the class of collisionless shocks. Since many fundamental concepts of the physics of particle acceleration in astrophysical shocks rely on this property, I dedicate some space here to a discussion of the basic principles that regulate the formation of a collisionless shock.

3.1 Collisionless shocks

Collisionless shocks are formed because of the excitation of electromagnetic instabilities, namely collective effects generated by groups of charged particles in the background plasma. A thorough review of the theory of collisionless shock waves has recently been published by Treumann (2009), and I refer to that paper for a careful discussion of the many subtle aspects of the physics of collisionless shocks. Here I limit myself to a qualitative description of the conditions necessary for their formation, to be used as background material for some of the topics discussed in connection with the physics of particle acceleration. Moreover, since the shocks that will be discussed in this review are non-relativistic, here I will restrict the discussion to non-relativistic shocks (\(v\ll c\)) and to cases where the temperature downstream of the shock is much smaller than the electron mass, so as to avoid pair production. The requirement of a shock being non-relativistic can also be rewritten in terms of the Alfvénic Mach number:

$$ v\ll c \to M_{A}=\frac{v}{v_{A}}\ll \biggl(\frac{m_{p}}{m_{e}} \biggr)^{1/2}\frac{\omega_{p,e}}{\omega_{c,e}}= 1.3\times10^{5} n_{\mathrm{cm}^{-3}}^{1/2} B_{\mu G}^{-1}, $$
(9)

where \(v_{A}=B_{0}/\sqrt{4\pi n m_{p}}\) is the Alfvén velocity, and \(\omega_{p,e}\) and \(\omega_{c,e}\) are the electron plasma frequency and cyclotron frequency, respectively.
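The bound in Eq. (9) is simply \(c/v_{A}\), written in terms of the electron frequencies. The sketch below evaluates it in both forms in Gaussian CGS units and checks that they agree; the function names and the rounded constants are illustrative.

```python
import math

C = 2.998e10          # speed of light [cm/s]
M_P = 1.6726e-24      # proton mass [g]
M_E = 9.109e-28       # electron mass [g]
E_CHARGE = 4.803e-10  # elementary charge [esu]


def alfven_speed(n_cm3=1.0, b_microgauss=1.0):
    """Alfven speed B / sqrt(4 pi n m_p), in cm/s."""
    return b_microgauss * 1e-6 / math.sqrt(4.0 * math.pi * n_cm3 * M_P)


def mach_bound_direct(n_cm3=1.0, b_microgauss=1.0):
    """Upper bound on the Alfvenic Mach number written as c / v_A."""
    return C / alfven_speed(n_cm3, b_microgauss)


def mach_bound_frequencies(n_cm3=1.0, b_microgauss=1.0):
    """Same bound written as (m_p/m_e)^(1/2) * omega_pe / omega_ce, as in Eq. (9)."""
    omega_pe = math.sqrt(4.0 * math.pi * E_CHARGE ** 2 * n_cm3 / M_E)
    omega_ce = E_CHARGE * b_microgauss * 1e-6 / (M_E * C)
    return math.sqrt(M_P / M_E) * omega_pe / omega_ce


print(f"v_A        = {alfven_speed() / 1e5:.2f} km/s")
print(f"c / v_A    = {mach_bound_direct():.2e}")
print(f"Eq. (9) rhs = {mach_bound_frequencies():.2e}")
```

For n = 1 cm\(^{-3}\) and B = 1 μG both expressions give a bound slightly above 1.3×10^5, so SNR shocks with velocities of a few thousand km/s satisfy the condition by a wide margin.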

In an electron–proton plasma, Coulomb scattering acts in three different ways: (1) it leads the electrons to thermalize, namely to reach a Maxwellian distribution; (2) it leads protons to thermalize; (3) it leads electrons and protons to equilibrate with each other. Typically these three processes have a well defined hierarchy: electron thermalization is the fastest process, followed by electron–proton thermalization. The slowest process is the thermalization of protons. This clearly opens several questions: first, electron–proton collisions are likely to occur when the proton distribution is not yet Maxwellian; second, the time scale for electron–proton equilibration may be exceedingly long as compared with the age of the system at hand.

The time scale for equilibration between two generic populations of particles with temperatures \(T_{1}\) and \(T_{2}\), masses \(m_{1}\) and \(m_{2}\), and the same electric charge q and density n is (Spitzer 1962):

$$ \tau_{eq} = \frac{3 m_{1}m_{2} k_{B}^{3/2}}{8 (2\pi)^{1/2} n q^{4} \ln\varLambda} \biggl( \frac{T_{1}}{m_{1}} + \frac{T_{2}}{m_{2}} \biggr)^{3/2}, $$
(10)

where \(k_{B}\) is the Boltzmann constant and \(\ln\varLambda\sim10\) is the Coulomb logarithm. For instance, the equilibration time of electrons with themselves would be:

$$ \tau_{eq,ee} \approx1200 \biggl(\frac{n}{1~\mathrm{cm}^{-3}} \biggr)^{-1} \biggl(\frac{T_{e}}{10^{8} K} \biggr)^{3/2}~\mathrm{years}, $$
(11)

while for protons:

$$ \tau_{eq,pp} \approx2.3\times10^{6} \biggl( \frac {n}{1~\mathrm{cm}^{-3}} \biggr)^{-1} \biggl(\frac{T_{p}}{10^{8} K} \biggr)^{3/2}~\mathrm{years}. $$
(12)

Having in mind the case of a SNR, it is easy to envision that these equilibration times are long compared with the time scale on which the shocks associated with the blast waves are actually observed, thereby raising the question of how such shocks are actually formed. On the other hand, the comparison with some plasma related quantities may be illuminating: for instance, for a velocity of 1000 km/s and density 1 cm\(^{-3}\), the cyclotron radius of a particle is \(mvc/eB_{0}\), which is ∼\(10^{10}\) cm for a proton in a μG magnetic field and about 2000 times smaller for an electron. The electron plasma frequency is \(\omega_{p,e}=(4\pi e^{2} n/m_{e})^{1/2}\sim5.6\times10^{4} n_{\mathrm{cm}^{-3}}^{1/2}~\mathrm{s}^{-1}\), corresponding to a spatial scale \(v/\omega_{p,e}\sim2\times10^{3}\) cm for a velocity \(v\sim10^{8}~\mathrm{cm/s}\).
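The plasma scales quoted above follow directly from their definitions in Gaussian CGS units. The sketch below evaluates them for the reference values of the text (velocity 1000 km/s, density 1 cm\(^{-3}\), field 1 μG); function names and rounded constants are illustrative.

```python
import math

C = 2.998e10          # speed of light [cm/s]
M_P = 1.6726e-24      # proton mass [g]
M_E = 9.109e-28       # electron mass [g]
E_CHARGE = 4.803e-10  # elementary charge [esu]


def cyclotron_radius_cm(mass_g, v_cm_s=1e8, b_microgauss=1.0):
    """Non-relativistic cyclotron radius m v c / (e B), in cm."""
    return mass_g * v_cm_s * C / (E_CHARGE * b_microgauss * 1e-6)


def electron_plasma_frequency(n_cm3=1.0):
    """Electron plasma frequency (4 pi e^2 n / m_e)^(1/2), in rad/s."""
    return math.sqrt(4.0 * math.pi * E_CHARGE ** 2 * n_cm3 / M_E)


def plasma_scale_cm(v_cm_s=1e8, n_cm3=1.0):
    """Spatial scale v / omega_pe, in cm."""
    return v_cm_s / electron_plasma_frequency(n_cm3)


print(f"proton cyclotron radius   : {cyclotron_radius_cm(M_P):.2e} cm")
print(f"electron cyclotron radius : {cyclotron_radius_cm(M_E):.2e} cm")
print(f"electron plasma frequency : {electron_plasma_frequency():.2e} rad/s")
print(f"v / omega_pe              : {plasma_scale_cm():.2e} cm")
```

All of these lengths are many orders of magnitude smaller than the parsec-scale size of a remnant, which is the point of the comparison made in the text.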

The formation of shock waves in these conditions is likely due to collective effects of charged particles. Several aspects of the physics of these collisionless shocks are far from trivial. Since the thermalization of these plasmas is directly linked to isotropization of the directions of motion of particles, it is natural to expect that the temperatures of electrons and protons immediately behind the shock front are proportional to the masses and therefore different for electrons and protons:

$$ k T_{e} \approx\frac{3}{2} m_{e} v^{2} = \frac{m_{e}}{m_{p}} k T_{p}. $$
(13)

Coulomb collisions between electrons and protons eventually lead them to reach the same temperature, but the time necessary to achieve this situation often exceeds the age of the source, hence equilibration is far from guaranteed in collisionless shocks. This is especially true for young SNRs, since for gas densities n∼0.1–1 cm\(^{-3}\), typical of the average ISM, the thermalization time may be several thousand years. For instance, for a strong shock one has \(T_{p}=\frac {3}{16} \frac{m_{p}V_{\it sh}^{2}}{k_{B}} = 5.6\times10^{8} (V_{\it sh}/5000~\mathrm{km/s})^{2}~\mathrm{K}\) and using Eq. (13) for \(T_{e}\), one finds that electrons would need several hundred years to reach the same temperature as protons (even assuming that protons are thermalized in the first place).
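The strong-shock proton temperature quoted above, and the corresponding electron temperature in the absence of any equilibration, can be evaluated as follows. The sketch uses the prefactor 3/16 of the expression above and the mass-ratio scaling of Eq. (13); function names and rounded constants are illustrative.

```python
M_P = 1.6726e-24   # proton mass [g]
M_E = 9.109e-28    # electron mass [g]
K_B = 1.381e-16    # Boltzmann constant [erg/K]


def proton_temperature_k(v_shock_kms):
    """Post-shock proton temperature (3/16) m_p V_sh^2 / k_B for a strong shock, in K."""
    v_cm_s = v_shock_kms * 1e5
    return 3.0 / 16.0 * M_P * v_cm_s ** 2 / K_B


def electron_temperature_k(v_shock_kms):
    """Electron temperature without equilibration: T_e = (m_e / m_p) T_p, as in Eq. (13)."""
    return M_E / M_P * proton_temperature_k(v_shock_kms)


for v_shock in (1000.0, 3000.0, 5000.0):
    print(f"V_sh = {v_shock:.0f} km/s: T_p = {proton_temperature_k(v_shock):.2e} K, "
          f"T_e = {electron_temperature_k(v_shock):.2e} K")
```

At 5000 km/s the proton temperature is close to the value 5.6×10^8 K given in the text, while the unequilibrated electron temperature is lower by the mass ratio.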

On the other hand, even partial equilibration between electrons and protons may produce observational signatures, such as the excitation of lines in the regime of non-equilibrium ionization of heavy atoms such as oxygen, which takes place whenever the electron temperature is above ∼1 keV (Warren et al. 2005). For a shock moving with velocity v the temperatures of protons and electrons immediately downstream can be estimated to be of order \(k T_{p}\sim15 v_{8}^{2}\) keV and \(k T_{e}\sim80 v_{8}^{2}\) eV, where \(v_{8}\) is the shock velocity in units of \(10^{8}\) cm/s = 1000 km/s.

The formation of collisionless shocks raises the important question about the mechanism for dissipation, needed in order to transform part of the kinetic energy of the plasma crossing the shock from upstream into thermal energy of the plasma downstream. The dissipation is expected to be qualitatively different depending upon the orientation of the background magnetic field. For parallel shocks (background field oriented along the normal to the shock surface) the excitation of Weibel instability leads to the generation of small-scale magnetic fields which become part of the dissipation mechanism (see discussion by Aharonian et al. 2007).

It is easy to picture how the physics of dissipation at a collisionless shock also affects the injection of particles into the acceleration cycle. Similar to the case of collisional shocks, where the thickness of the shock front is of the order of the collisional mean free path, for collisionless shocks the thickness of the front is of the order of the typical scale of the instabilities that are responsible for dissipation. As an order of magnitude one can expect that the thickness of the front is several gyration radii of the thermal particles in the plasma. While gyrating in the self-produced magnetic fields, a small fraction of particles on the tail of the distribution may end up in the upstream side of the shock that is being formed, thereby bootstrapping the injection of the first accelerated particles. Injection remains one of the most poorly known aspects of particle acceleration at astrophysical shocks. In the last few years, Particle-in-Cell (PIC) simulations have been instrumental in reaching a better understanding of the formation of collisionless shocks (both relativistic and non-relativistic) and the initial stages of the acceleration process (Warren et al. 2005; Treumann 2009; Spitzer 1962).

Independent of the specific mechanism for dissipation, after the collisionless shock has been formed one can write equations for conservation of mass, momentum and energy across the shock surface. Here I limit myself to the simple case of a plane parallel infinite shock, with accelerated particles treated as test particles, having no dynamical role. For simplicity I also assume that on the scales we are interested in the shock can be considered stationary in time. In a realistic situation, basically all of these conditions are violated to some extent, and it becomes important to always keep in mind the limitations of the calculations we carry out, depending on their application.

The equations for conservation of mass, momentum and energy across the shock read

$$\begin{aligned} &\frac{\partial}{\partial x} ( \rho u )=0, \end{aligned}$$
(14)
$$\begin{aligned} &\frac{\partial}{\partial x} \bigl( \rho u^{2} + P_{g} \bigr)=0, \end{aligned}$$
(15)
$$\begin{aligned} &\frac{\partial}{\partial x} \biggl( \frac{1}{2}\rho u^{3}+ \frac {\gamma_{g}}{\gamma_{g}-1}u P_{g} \biggr)=0, \end{aligned}$$
(16)

where \(\gamma_{g}\) is the adiabatic index, \(P_{g}\) is the gas pressure and ρ and u are the density and velocity of the plasma as seen in the reference frame of the shock. These conservation equations have the trivial solution ρ=constant, u=constant, \(P_{g}\)=constant, but they also admit the discontinuous solutions:

$$\begin{aligned} &\frac{\rho_{2}}{\rho_{1}}=\frac{u_{1}}{u_{2}}=\frac{(\gamma _{g}+1)M_{1}^{2}}{(\gamma_{g}-1)M_{1}^{2}+2}, \end{aligned}$$
(17)
$$\begin{aligned} &\frac{P_{g,2}}{P_{g,1}}=\frac{2\gamma_{g} M_{1}^{2}}{\gamma _{g}+1}-\frac{\gamma_{g}-1}{\gamma_{g}+1}, \end{aligned}$$
(18)
$$\begin{aligned} &\frac{T_{2}}{T_{1}}=\frac{(2\gamma_{g} M_{1}^{2}-(\gamma _{g}-1))((\gamma_{g}-1)M_{1}^{2}+2)}{(\gamma_{g}+1)^{2}M_{1}^{2}}. \end{aligned}$$
(19)

For a plasma with adiabatic index \(\gamma_{g}=5/3\) and \(M_{1}\gg1\) the jump conditions simplify considerably. I refer to this case as the strong shock limit and it is easy to show that in this asymptotic limit

$$ \frac{\rho_{2}}{\rho_{1}}=\frac{u_{1}}{u_{2}} = 4,\quad\quad\frac {P_{g,2}}{P_{g,1}} = \frac{5}{4} M_{1}^{2},\quad\quad\frac{T_{2}}{T_{1}} = \frac{5}{16} M_{1}^{2}. $$
(20)

Recalling that \(M_{1}^{2}=u_{1}^{2}/c_{s,1}^{2}\) and \(c_{s,1}^{2}=\gamma_{g} P_{g,1}/\rho_{1}\) one easily obtains that

$$ k T_{2} = \frac{3}{16} m_{p} u_{1}^{2}, $$
(21)

namely for a strong shock a large fraction of the kinetic energy of the particles upstream is transformed into internal energy of the gas behind the shock. The downstream temperature becomes basically independent of the temperature upstream, \(T_{1}\).
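The jump conditions and their strong shock limit can be checked numerically. The sketch below is a minimal illustration: the function names are hypothetical, the temperature jump is obtained as the ratio of the pressure and density jumps (ideal gas), and Eq. (21) is evaluated for a gas of particles of mass \(m_{p}\).

```python
K_B = 1.3807e-16  # Boltzmann constant [erg/K]
M_P = 1.6726e-24  # proton mass [g]


def jump_conditions(mach, gamma=5.0 / 3.0):
    """Return (rho2/rho1, P2/P1, T2/T1) for upstream Mach number `mach`."""
    m2 = mach * mach
    density = (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)                     # Eq. (17)
    pressure = 2.0 * gamma * m2 / (gamma + 1.0) - (gamma - 1.0) / (gamma + 1.0)  # Eq. (18)
    temperature = pressure / density  # ideal gas: T is proportional to P / rho
    return density, pressure, temperature


def strong_shock_temperature_K(u1_cm_s, mass_g=M_P):
    """Downstream temperature from Eq. (21), k T2 = (3/16) m u1^2, in kelvin."""
    return 3.0 / 16.0 * mass_g * u1_cm_s**2 / K_B


for mach in (1.0, 3.0, 10.0, 1000.0):
    d, p, t = jump_conditions(mach)
    print(f"M1={mach:7.1f}  rho2/rho1={d:.4f}  "
          f"P2/(P1 M1^2)={p / mach**2:.4f}  T2/(T1 M1^2)={t / mach**2:.4f}")

t_5000 = strong_shock_temperature_K(5.0e8)  # shock velocity of 5000 km/s
print(f"T2 for u1 = 5000 km/s: {t_5000:.2e} K")
```

For large Mach numbers the three ratios approach 4, \(\frac{5}{4}M_{1}^{2}\) and \(\frac{5}{16}M_{1}^{2}\), while for \(M_{1}=1\) all jumps reduce to unity, as expected for a vanishing discontinuity.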

The presence of non-thermal particles accelerated at the shock front, and of magnetic fields in the shock region both change the conservation equations written above, as described in Sect. 4. It is important to realize that the processes involved in the formation of a collisionless shock also determine the injection of a few particles in the acceleration cycle that may lead to CRs. At the same time CRs change the structure of the collisionless shock, thereby affecting their own injection. This complex chain of effects illustrates in a qualitative way what is known as non-linear particle acceleration.

3.2 Transport of charged particles in magnetic fields: basic concepts

The original idea that the bulk motion of magnetized clouds could be transformed into the kinetic energy of individual charged particles was first introduced by Enrico Fermi (Fermi 1949, 1954) and is currently widely referred to as second order Fermi acceleration. Each interaction of a test particle with a magnetized cloud results in either an energy gain or an energy loss, depending upon the relative direction of motion at the time of the scattering. On average, however, head-on collisions dominate over tail-on collisions and the momentum vector of the charged particle performs a random walk in momentum space, in which the length of the vector increases on average by an amount \(\Delta E/E\sim(4/3)(V/c)^{2}\), where V/c is the modulus of the velocity of the clouds in units of the speed of light. The scaling with the second power of V/c is the reason why the mechanism is named second order Fermi mechanism. In the ISM the role of the magnetized clouds is played by plasma waves, most notably Alfvén waves, which move at speed \(v_{A}=B/\sqrt{4\pi\rho_{i}}=2 B_{\mu} n_{i,\mathrm{cm}^{-3}}^{-1/2}\) km/s, where \(\rho_{i}=n_{i} m_{p}\) is the mass density of ionized material and \(B_{\mu}\) is the magnetic field in units of μG. Given the smallness of the wave velocity it is easy to understand that the role of second order Fermi acceleration is, in general, rather limited. However, the revolutionary concept that it bears is still of the utmost importance: the electric field induced by the motion of the magnetized cloud (or wave) may accelerate charged particles. Given the importance of this phenomenon, not only for particle acceleration but for propagation as well, in this section I will illustrate some basic concepts that turn out to be useful in order to understand the behavior of a charged particle in a background of waves.
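To see how small the second order gain is when Alfvén waves play the role of the clouds, the following sketch evaluates \(v_{A}=B/\sqrt{4\pi\rho_{i}}\) and the mean fractional gain \((4/3)(V/c)^{2}\) per scattering. The function names and the choice of a 1 μG field with \(n_{i}=1~\mathrm{cm}^{-3}\) are assumptions made for illustration.

```python
import math

M_P = 1.6726e-24    # proton mass [g]
C_LIGHT = 2.998e10  # speed of light [cm/s]


def alfven_speed_cm_s(b_gauss, n_ion_cm3):
    """Alfven speed B / sqrt(4 pi rho_i), with rho_i = n_i m_p, in cm/s."""
    return b_gauss / math.sqrt(4.0 * math.pi * n_ion_cm3 * M_P)


def second_order_gain(v_cm_s):
    """Mean fractional energy gain per scattering, (4/3) (V/c)^2."""
    return 4.0 / 3.0 * (v_cm_s / C_LIGHT) ** 2


v_a = alfven_speed_cm_s(1.0e-6, 1.0)        # B = 1 microgauss, n_i = 1 cm^-3
v_a_dilute = alfven_speed_cm_s(1.0e-6, 0.01)
gain = second_order_gain(v_a)

print(f"v_A (n_i = 1 cm^-3)    : {v_a / 1.0e5:.2f} km/s")
print(f"v_A (n_i = 0.01 cm^-3) : {v_a_dilute / 1.0e5:.2f} km/s")
print(f"mean gain per scattering: {gain:.2e}")
```

A wave speed of a few km/s gives a fractional gain per scattering of order \(10^{-10}\), which illustrates why the second order process alone is inefficient in the ISM.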

The motion of a particle moving in an ordered magnetic field \(\mathbf{B}_{0} = B_{0} \hat{z}\) conserves the component of the momentum in the \(\hat{z}\) direction and since the magnetic field cannot do work on a charged particle, the modulus of the momentum is also conserved. This implies that the particle trajectory consists of a rotation in the xy plane perpendicular to \(\hat{z}\), with a frequency \(\varOmega=qB_{0}/(mc\gamma)\) (gyration frequency) and a regular motion in the \(\hat{z}\)-direction with momentum \(p_{z}=p\mu\), where μ is the cosine of the pitch angle of the particle (see Fig. 5). The velocity of the particle in the three spatial dimensions can therefore be written as:

$$\begin{aligned} &v_{x}(t)=v_{\perp} \cos (\varOmega t+\phi ), \end{aligned}$$
(22)
$$\begin{aligned} &v_{y}(t)=-v_{\perp} \sin (\varOmega t+\phi ), \end{aligned}$$
(23)
$$\begin{aligned} &v_{z}(t)=v_{\parallel}= v\mu= \mathrm{constant}, \end{aligned}$$
(24)

where ϕ is an arbitrary phase and \(v_{\parallel}\) and \(v_{\perp}\) are the parallel and perpendicular components of the particle velocity.

Fig. 5

Trajectory of a charged particle moving with a pitch angle θ with respect to an ordered magnetic field \(\mathbf{B}_{0}\), along the \(\hat{z}\) axis

Let us assume now that on top of the background magnetic field \(B_{0}\) there is an oscillating magnetic field consisting of the superposition of Alfvén waves polarized linearly along the x-axis. In the reference frame of the waves (\(v_{A}\ll c\)) the electric field vanishes and one can write the individual Fourier modes as

$$ \delta\mathbf{B} = \delta B \hat{x} \sin(kz - \omega t) \approx\delta B \hat{x} \sin(kz), $$
(25)

where the z coordinate of the particle is z=vμt. The Lorentz force on the particle in the z-direction is

$$ mv\gamma\frac{d \mu}{dt} = -\frac{q}{c}\delta B v_{y} \to \frac {d\mu}{dt} = \varOmega\frac{\delta B}{B_{0}} \bigl(1-\mu^{2} \bigr)^{1/2} \sin (\varOmega t+\phi ) \sin(k v \mu t), $$
(26)

which can also be rewritten as

$$ \frac{d\mu}{dt} = \frac{1}{2}\varOmega\frac{\delta B}{B_{0}} \bigl(1-\mu ^{2}\bigr)^{1/2} \bigl\{ \cos \bigl[(\varOmega-kv\mu) t+\phi \bigr] - \cos \bigl[(\varOmega+kv\mu) t+\phi \bigr] \bigr\} . $$
(27)

From this expression it is clear that for μ>0 (particles moving in the positive direction) Ω+kvμ>0 and the cosine averages to zero on a long time scale. The first cosine also averages to zero unless Ω=kvμ, in which case the sign of δμ depends on cos(ϕ) and it is random if the phase is random. The average over the phase also vanishes, but the mean square variation of the pitch angle does not vanish:

$$ \biggl\langle \frac{\Delta\mu\Delta\mu}{\Delta t}\biggr\rangle _{\phi} = \pi \varOmega^{2} \biggl(\frac{\delta B}{B_{0}} \biggr)^{2} \frac{(1-\mu ^{2})}{\mu} \delta \biggl(k-\frac{\varOmega}{v \mu} \biggr). $$
(28)

The linear scaling of the square of the pitch angle cosine with time is indicative of the diffusive motion of the particles. The rate of scattering in pitch angle is usually written in terms of pitch angle diffusion coefficient:

$$ \nu= \biggl\langle \frac{\Delta\theta\Delta\theta}{\Delta t}\biggr\rangle _{\phi} = \pi \varOmega^{2} \biggl(\frac{\delta B}{B_{0}} \biggr)^{2} \frac{1}{\mu} \delta \biggl(k-\frac{\varOmega}{v \mu} \biggr). $$
(29)

If P(k)dk is the wave energy density in the wave number range dk at the resonant wave number \(k=\varOmega/(v\mu)\), the total scattering rate can be written as:

$$ \nu= \frac{\pi}{4} \biggl( \frac{kP(k)}{B_{0}^{2}/8\pi} \biggr) \varOmega. $$
(30)

The time required for the particle direction to change by δθ∼1 is

$$ \tau\sim1/\nu\sim\varOmega^{-1} \biggl( \frac{kP(k)}{B_{0}^{2}/8\pi } \biggr)^{-1} $$
(31)

so that the spatial diffusion coefficient can be estimated as

$$ D(p) = \frac{1}{3} v (v\tau) \simeq \frac{1}{3} v^{2} \varOmega ^{-1} \biggl( \frac{kP(k)}{B_{0}^{2}/8\pi} \biggr)^{-1} = \frac{1}{3} \frac{r_{L}v}{\mathcal{F}}, $$
(32)

where \(r_{L}=v/\varOmega\) is the Larmor radius of the particles and \(\mathcal{F}= \bigl( \frac{kP(k)}{B_{0}^{2}/8\pi} \bigr)\).
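Returning to Eq. (27), the role of the resonance can be illustrated with a few lines of Python. The sketch below integrates the right-hand side of Eq. (27) in time for a single wave, keeping μ fixed (the perturbative assumption used in the text). Units with Ω=1 and v=1, the function name and the chosen parameter values are illustrative assumptions.

```python
import math


def delta_mu(t_end, omega, k, v, mu, phi, db_over_b):
    """Time integral of Eq. (27) over [0, t_end], with mu held fixed."""
    amp = 0.5 * omega * db_over_b * math.sqrt(1.0 - mu * mu)

    def cos_integral(w):
        # Integral of cos(w t + phi) dt over [0, t_end]
        if abs(w) < 1.0e-12 * abs(omega):
            return t_end * math.cos(phi)
        return (math.sin(w * t_end + phi) - math.sin(phi)) / w

    return amp * (cos_integral(omega - k * v * mu) - cos_integral(omega + k * v * mu))


omega, v, mu, phi, db_over_b = 1.0, 1.0, 0.5, 0.0, 1.0e-4
k_res = omega / (v * mu)  # resonant wave number, Omega = k v mu
k_off = 1.5 * k_res       # a wave number away from resonance

for t_end in (100.0, 1000.0):
    on = delta_mu(t_end, omega, k_res, v, mu, phi, db_over_b)
    off = delta_mu(t_end, omega, k_off, v, mu, phi, db_over_b)
    print(f"t = {t_end:6.0f}  resonant: {on:+.3e}  off-resonance: {off:+.3e}")
```

At resonance the change in μ accumulates coherently with time, with a sign set by cos(ϕ), while away from resonance it merely oscillates with bounded amplitude. For a single mode the resonant growth is coherent; the diffusive behavior described by Eq. (28) arises once a continuous spectrum of waves with random phases is considered.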

It is interesting to notice that the escape time of CRs as measured from the B/C ratio and/or from unstable elements, namely a time of order \(10^{7}\) years in the energy range ∼1 GeV, corresponds to requiring \(H^{2}/D(p)\sim10^{7}\) years, where H∼3 kpc is the estimated size of the galactic halo. This implies \(D\approx10^{29}~\mathrm{cm^{2}\,s^{-1}}\), which corresponds to requiring \(\delta B/B\sim6\times10^{-4}\) at the resonant wave number. A very small power in the form of Alfvén waves can easily account for the level of diffusion necessary to confine CRs in the Galaxy. The requirements become even less demanding when higher energy CRs are considered.
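The numbers in this estimate can be reproduced with a short calculation. The sketch below assumes a background field of 1 μG, relativistic particles with \(E\approx pc=1\) GeV and \(v\approx c\), and identifies \(\delta B/B\) with \(\mathcal{F}^{1/2}\); these choices, as well as the function names, are assumptions made for illustration and are not stated explicitly in the text.

```python
import math

KPC = 3.086e21        # kiloparsec [cm]
YEAR = 3.156e7        # year [s]
GEV = 1.602e-3        # GeV [erg]
E_CHARGE = 4.803e-10  # elementary charge [statC]
C_LIGHT = 2.998e10    # speed of light [cm/s]


def diffusion_from_escape(h_cm, tau_s):
    """Diffusion coefficient implied by an escape time tau = H^2 / D."""
    return h_cm**2 / tau_s


def larmor_radius_cm(energy_erg, b_gauss):
    """Larmor radius of a relativistic particle, r_L = E / (e B)."""
    return energy_erg / (E_CHARGE * b_gauss)


def required_wave_power(d_cm2_s, r_l_cm, v_cm_s):
    """Invert Eq. (32): F = r_L v / (3 D)."""
    return r_l_cm * v_cm_s / (3.0 * d_cm2_s)


d_escape = diffusion_from_escape(3.0 * KPC, 1.0e7 * YEAR)
r_l = larmor_radius_cm(1.0 * GEV, 1.0e-6)
f_needed = required_wave_power(1.0e29, r_l, C_LIGHT)  # rounded D used in the text
db_over_b = math.sqrt(f_needed)

print(f"D from H^2/tau  : {d_escape:.2e} cm^2/s")
print(f"r_L (1 GeV)     : {r_l:.2e} cm")
print(f"F at resonance  : {f_needed:.2e}")
print(f"delta B / B     : {db_over_b:.1e}")
```

The escape time gives a diffusion coefficient of a few \(10^{29}~\mathrm{cm^{2}\,s^{-1}}\), and the rounded value \(10^{29}~\mathrm{cm^{2}\,s^{-1}}\) corresponds to \(\delta B/B\) of order \(6\times10^{-4}\) under the stated assumptions.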

The simple treatment presented here should also clarify the main physical aspects of particle scattering in the ISM, not only in terms of CR confinement in the Galaxy, but also in terms of particle transport inside the accelerators. Alfvén waves in proximity of a shock front can lead to a diffusive motion of particles on both sides of the shock surface. This apparently simple conclusion is the physical basis of diffusive shock acceleration, which will be discussed in the sections below. However, it is also important to realize the numerous limitations involved in the simple description illustrated above.

First, the perturbative nature of the formalism introduced here limits its applicability to situations in which δB/B≪1. Second, as discussed already by Treumann (2009) and Spitkovsky (2008a), when δB/B becomes closer to unity, the random walk of magnetic field lines may become the most important reason for particle transport perpendicular to the background magnetic field. The combined transport of particles as due to diffusion parallel to the magnetic field and perpendicular to it is not yet fully understood, and in fact it is not completely clear that the overall motion can be described as purely diffusive. In other words, the mean square displacement \(\langle z^{2}\rangle\) may not scale linearly with time (see for instance Spitkovsky 2008b and references therein). The particle transport perpendicular to the background field most likely plays a very important role in terms of confinement of CRs in the Galaxy, especially when realistic models of the galactic magnetic field are taken into account (Sironi and Spitkovsky 2011; Gargaté and Spitkovsky 2012).

Third, as discussed by Fermi (1949, 1954), the cascade of Alfvénic turbulence from large to small spatial scales proceeds in an anisotropic way, so that at the resonant wavenumbers relevant for particle scattering, little power might be left in the parallel wavenumbers. The CR transport in these conditions might be better modeled as diffusion in a slab plus two dimensional turbulence and the diffusion of particles in such turbulence can be described by the so-called non-linear guiding center theory, first developed by Jokipii and Parker (1969a). The main physical characteristic of this theory of CR transport is that the diffusion coefficient perpendicular to the magnetic field is a non-trivial function of the diffusion coefficient parallel to the field. This non-linearity makes it difficult to achieve a fully self-consistent treatment of CR propagation either in the Galaxy or in the accelerators. This point has recently been investigated in detail by Jokipii and Parker (1969b).

3.3 DSA through the transport equation

Let us consider a shock front characterized by a Mach number \(M_{s}\). The compression factor at the shock \(r=u_{1}/u_{2}\) is then

$$ r=\frac{4 M_{s}^{2}}{M_{s}^{2}+3}, $$
(33)

which tends to 4 in the limit of strong shocks, \(M_{s}\to\infty\). A test particle diffusing in the upstream plasma does not gain or lose energy (although the second order Fermi process discussed above may be at work).

For a stationary parallel shock, namely a shock for which the normal to the shock is parallel to the orientation of the background magnetic field (see Fig. 6), the transport of particles is described by the diffusion–convection equation (Giacalone 2013; see DeMarco et al. 2007 and Effenberger et al. 2012 for a detailed derivation), which in the shock frame reads

$$ u \frac{\partial f}{\partial z} = \frac{\partial}{\partial z} \biggl[ D \frac{\partial f}{\partial z} \biggr] + \frac{1}{3} \frac{du}{dz} p \frac{\partial f}{\partial p} + Q, $$
(34)

where f(z,p) is the distribution function of accelerated particles, normalized in a way that the number of particles with momentum p at location z is \(\int dp\,4\pi p^{2} f(z,p)\). In Eq. (34) the LHS is the convection term and the first term on the RHS is the spatial diffusion term. The second term on the RHS describes the effect of fluid compression on the accelerated particles, while Q(z,p) is the injection term.

Fig. 6

Illustration of test-particle acceleration at a collisionless shock. In the shock frame the plasma enters from the left with velocity \(u_{1}\) and exits to the right with velocity \(u_{2}<u_{1}\). Here the test particle is shown to enter downstream with cosine of the pitch angle μ (as measured in the upstream plasma frame) and exit with a cosine of the pitch angle μ′ (as measured in the downstream plasma frame)

A few comments on Eq. (34) are in order: (1) the shock will appear in this equation only in terms of a boundary condition at z=0, and the shock is assumed to have infinitely small size along z. This implies that this equation cannot properly describe the thermal particles in the fluid. The distribution function of accelerated particles is continuous across the shock. (2) In a self-consistent treatment in which the acceleration process is an integral part of the processes that lead to the formation of the shock one would not need to specify an injection term. Injection would result from the microphysics of the particle motions at the shock. This ambiguity is usually faced in a phenomenological way, by adopting recipes such as the thermal leakage one (Goldreich and Sridhar 1995) that allow one to relate the injection to some property of the plasma behind the shock. This aspect becomes relevant only in the case of non-linear theories of DSA, while for the test particle theory the injection term only determines the arbitrary normalization of the spectrum. However, it is worth recalling that while these recipes may apply to the case of protons as injected particles, the injection of heavier nuclei may be much more complex. In fact, it has been argued that nuclei are injected at the shock following the process of sputtering of dust grains (Matthaeus et al. 2003).

For the purpose of the present discussion I will assume that injection only takes place at the shock surface, immediately downstream of the shock, and that it only consists of particles with given momentum \(p_{\it inj}\):

$$ Q(z,p)=\frac{\eta n_{1}u_{1}}{4\pi p_{\it inj}^{2}}\delta (p-p_{\it inj})\delta(z)=q_{0} \delta(z), $$
(35)

where \(n_{1}\) and \(u_{1}\) are the fluid density and fluid velocity upstream of the shock and η is the acceleration efficiency, defined here as the fraction of the incoming number flux across the shock surface that takes part in the acceleration process. Hereafter I will use the indices 1 and 2 to refer to quantities upstream and downstream respectively.

The compression term vanishes everywhere but at the shock since \(du/dz=(u_{2}-u_{1})\delta(z)\). Integration of Eq. (34) around the shock surface (between \(z=0^{-}\) and \(z=0^{+}\)) leads to:

$$ \biggl[D\frac{\partial f}{\partial z} \biggr]_{2} - \biggl[D\frac {\partial f}{\partial z} \biggr]_{1} + \frac{1}{3} (u_{2}-u_{1}) p \frac{df_{0}}{dp} + q_{0}(p) = 0, $$
(36)

where \(f_{0}(p)\) is now the distribution function of accelerated particles at the shock surface. Particle scattering downstream leads to a homogeneous distribution of particles, at least for the case of a parallel shock, so that \([\partial f/\partial z]_{2}=0\). In the upstream region, where du/dz=0, the transport equation reduces to:

$$ \frac{\partial}{\partial z} \biggl[ u f - D\frac{\partial f}{\partial z} \biggr]=0, $$
(37)

and since the quantity in parentheses vanishes at upstream infinity, it follows that

$$ \biggl[D\frac{\partial f}{\partial z} \biggr]_{1} = u_{1} f_{0}. $$
(38)

Using this result in Eq. (36) we obtain an equation for \(f_{0}(p)\)

$$ u_{1} f_{0} = \frac{1}{3}(u_{2}-u_{1}) p \frac{d f_{0}}{dp} + \frac {\eta n_{1}u_{1}}{4\pi p_{\it inj}^{2}}\delta(p-p_{\it inj}), $$
(39)

which is easily solved to give

$$ f_{0}(p) = \frac{3r}{r-1} \frac{\eta n_{1}}{4\pi p_{\it inj}^{2}} \biggl( \frac{p}{p_{\it inj}} \biggr)^{-\frac{3r}{r-1}}. $$
(40)

The spectrum of accelerated particles is a power law in momentum (and not in energy as is often assumed in the literature) with a slope α that only depends on the compression ratio r:

$$ \alpha=\frac{3r}{r-1}. $$
(41)

The slope tends asymptotically to α=4 in the limit \(M_{s}\to\infty\) of an infinitely strong shock front. The number of particles with energy ϵ is \(n(\epsilon)=4\pi p^{2} f_{0}(p)(dp/d\epsilon)\), therefore \(n(\epsilon)\propto\epsilon^{2-\alpha}\) for relativistic particles and \(n(\epsilon)\propto\epsilon^{(1-\alpha)/2}\) for non-relativistic particles. In the limit of strong shocks, \(n(\epsilon)\propto\epsilon^{-2}\) (\(n(\epsilon)\propto\epsilon^{-3/2}\)) in the relativistic (non-relativistic) regime.
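The chain from Mach number to spectral slope in Eqs. (33) and (41) is simple enough to tabulate. In the sketch below the function names are hypothetical; the energy index s is defined through \(n(\epsilon)\propto\epsilon^{-s}\), obtained from \(n(\epsilon)=4\pi p^{2}f_{0}(p)\,dp/d\epsilon\) with \(\epsilon\propto p\) (relativistic) or \(\epsilon\propto p^{2}\) (non-relativistic).

```python
def compression_ratio(mach):
    """Eq. (33): r = 4 M^2 / (M^2 + 3) for adiabatic index 5/3."""
    m2 = mach * mach
    return 4.0 * m2 / (m2 + 3.0)


def momentum_slope(r):
    """Eq. (41): f0(p) is proportional to p**(-alpha), alpha = 3 r / (r - 1)."""
    return 3.0 * r / (r - 1.0)


def energy_index(alpha, relativistic=True):
    """Index s such that n(eps) is proportional to eps**(-s)."""
    return alpha - 2.0 if relativistic else (alpha - 1.0) / 2.0


for mach in (2.0, 3.0, 5.0, 10.0, 100.0):
    r = compression_ratio(mach)
    alpha = momentum_slope(r)
    print(f"M={mach:6.1f}  r={r:.3f}  alpha={alpha:.3f}  "
          f"s_rel={energy_index(alpha):.3f}  "
          f"s_nonrel={energy_index(alpha, relativistic=False):.3f}")
```

The table shows how quickly the spectrum steepens for weak shocks: a Mach number of 3 already gives r=3 and α=4.5, while the strong shock values α=4 and s=2 are approached only for large Mach numbers.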

Some points are worth mentioning: the shape of the spectrum of the accelerated particles does not depend upon the diffusion coefficient. On one hand this is good news, in that the knowledge of the diffusion properties of the particles represents the greatest challenge for any theory of particle acceleration. On the other hand, this implies that the concept of maximum energy of accelerated particles is not naturally embedded in the test particle theory of DSA. In fact, the power law distribution derived above does extend (in principle) up to infinite particle energy. In the strong shock limit, such a spectrum contains a divergent energy, thereby implying a failure of the test particle assumption. Clearly the absence of a maximum energy mainly derives from the assumption of stationarity of the acceleration process, which can be achieved only in the presence of effective escape of particles from the accelerator, a point which is directly connected to the issue of maximum energy, as discussed in Sect. 3.4.

3.4 Maximum energy: time versus space

There is some level of ambiguity in the definition of the maximum energy achieved in a SNR shock expanding in the ISM. The ambiguity arises from the fact that the maximum energy may be due to a finite time of acceleration (the age of the remnant) or to the existence of a spatial boundary, such that particles can leak out of the system when they diffuse out to such boundary. Clearly in this second case, the physical nature of such a boundary should be discussed.

At least three different definitions of the maximum energy should be considered, and it is not always clear which definition works the best or best describes reality. The first definition consists in requiring that the acceleration time be smaller than the age of the SNR (in case of electrons as accelerated particles the age of the remnant should be replaced by the minimum between the age of the SNR and the time scale for energy losses due to synchrotron and inverse Compton scattering (ICS) radiative processes).

A rigorous calculation of the acceleration time was carried out by Shalchi et al. (2010), while a generalization of such a derivation in the context of the non-linear theory of DSA was presented by Skilling (1975a). In this section I will illustrate a simple derivation of the acceleration time based on the very essential feature of DSA, namely the fact that it proceeds through repeated shock crossings of individual particles. Let us consider a particle that from the upstream crosses the shock towards the downstream, with pitch angle cosine \(\mu_{1}\) and an energy \(E_{1}\). For simplicity let us assume that the particle is already relativistic, so that \(E\approx pc\). As seen in the reference frame of the downstream plasma the particle has energy

$$ E_{2} = \varGamma E_{1} ( 1+\beta\mu_{1} ), \quad 0 \leq\mu _{1} \leq1, $$
(42)

where \(\beta=(u_{1}-u_{2})/c\) is the relative velocity between the upstream and the downstream fluid in units of the speed of light c, and \(\varGamma=(1-\beta^{2})^{-1/2}\). While in the downstream region, the particle does not gain or lose energy to first order (there are the usual second order effects that are neglected here). If the particle returns to the shock it may recross its surface with a pitch angle with cosine \(-1\leq\mu_{2}\leq0\), so that the particle energy as seen again by an observer in the upstream fluid is

$$ E_{1}' = \varGamma E_{2} ( 1-\beta \mu_{2} ) = \varGamma^{2} E_{1} ( 1+\beta \mu_{1} ) ( 1-\beta\mu_{2} ). $$
(43)

Notice that the final energy of the particle after one full cycle upstream-downstream-upstream (or downstream-upstream-downstream) is always \(E_{1}'>E_{1}\), namely particles gain energy at each cycle. Under the assumption that the distribution of particles is isotropized by scatterings (diffusion) both upstream and downstream, the fluxes on both sides are normalized as 2|μ|. In other words \(\int_{0}^{1} d\mu A\mu=\int_{-1}^{0} d\mu A|\mu|=1\to A=2\). The mean value of the energy change per cycle is therefore (Blandford and Eichler 1987):

$$\begin{aligned} \biggl\langle \frac{E_{1}'-E_{1}}{E_{1}}\biggr\rangle _{\mu_{1},\mu_{2}} =& -\int _{0}^{1} d\mu_{1} 2 \mu_{1} \int_{-1}^{0} d\mu_{2} 2 \mu_{2} \bigl[ \varGamma^{2} ( 1+\beta \mu_{1} ) ( 1-\beta \mu_{2} ) - 1 \bigr] \\ \approx& \frac{4}{3} \beta. \end{aligned}$$
(44)

The scaling of \(\langle\frac{\Delta E}{E}\rangle\) with the first power of β is the reason why DSA is often named first order Fermi acceleration.
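The flux-weighted average in Eq. (44) can be verified by direct numerical quadrature. The sketch below uses a midpoint rule over \(\mu_{1}\in[0,1]\) and \(\mu_{2}\in[-1,0]\) with weights \(2\mu_{1}\) and \(2|\mu_{2}|\); the function name and the grid size are illustrative choices.

```python
def mean_fractional_gain(beta, n=200):
    """Flux-weighted mean of E1'/E1 - 1 over one cycle, from Eq. (43)."""
    gamma2 = 1.0 / (1.0 - beta * beta)  # Lorentz factor squared
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        mu1 = (i + 0.5) * h          # 0 <= mu1 <= 1 (upstream to downstream)
        for j in range(n):
            mu2 = -(j + 0.5) * h     # -1 <= mu2 <= 0 (downstream to upstream)
            weight = (2.0 * mu1) * (2.0 * abs(mu2)) * h * h
            total += weight * (gamma2 * (1.0 + beta * mu1) * (1.0 - beta * mu2) - 1.0)
    return total


for beta in (0.001, 0.01, 0.03):
    gain = mean_fractional_gain(beta)
    print(f"beta={beta:.3f}  <dE/E>={gain:.6f}  (4/3) beta={4.0 * beta / 3.0:.6f}")
```

For the small values of β typical of SNR shocks the two columns agree to within terms of order \(\beta^{2}\), confirming the first order scaling of the energy gain.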

Under the assumption of isotropy, the flux of particles that cross the shock from downstream to upstream is \(n_{s}c/4\), which means that the upstream section is filled through a surface Σ of the shock in one diffusion time upstream with a number of particles \(n_{s}(c/4)\tau _{{\it diff},1}\varSigma\) (\(n_{s}\) is the density of accelerated particles at the shock). This number must equal the total number of particles within a diffusion length upstream \(L_{1}=D_{1}/u_{1}\), namely:

$$ n_{s} \frac{c}{4} \varSigma\tau_{{\it diff},1}= n_{s} \varSigma\frac{D_{1}}{u_{1}}, $$
(45)

which implies for the diffusion time upstream \(\tau_{{\it diff},1}=\frac {4D_{1}}{c u_{1}}\). A similar estimate downstream leads to \(\tau _{{\it diff},2}=\frac{4D_{2}}{c u_{2}}\), so that the duration of a full cycle across the shock is \(\tau_{{\it diff}}=\tau_{{\it diff},1}+\tau_{{\it diff},2}\). The acceleration time is now:

$$ \tau_{\it acc} = \frac{E}{\Delta E/\tau_{{\it diff}}}=\frac {3}{u_{1}-u_{2}} \biggl[ \frac{D_{1}}{u_{1}}+\frac{D_{2}}{u_{2}} \biggr]. $$
(46)

This should be compared with the formally correct and more general expression (Malkov 1998; Gieseler et al. 2000; Meyer et al. 1997; Ellison et al. 1997):

$$ \tau_{\it acc} = \frac{3}{u_{1}-u_{2}} \int_{0}^{p} \frac{dp'}{p'} \biggl[ \frac{D_{1}(p')}{u_{1}}+\frac{D_{2}(p')}{u_{2}} \biggr]. $$
(47)

The two expressions return the same order of magnitude provided D(p) is an increasing function of momentum.

Equation (47) effectively illustrates the fact that the acceleration time is dominated by particle diffusion in the region with less scattering (larger diffusion coefficient) which in normal conditions is the region of the upstream fluid.
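A minimal numerical sketch of Eqs. (46) and (47) follows. The function names are hypothetical. For a power law \(D(p)\propto p^{\delta}\) with δ>0 the integral in Eq. (47) evaluates to \(D(p)/\delta\), so the two expressions differ by a factor 1/δ. The example assumes Eq. (32) with \(\mathcal{F}=1\), a 1 μG field, a 1 TeV particle and a strong shock moving at 5000 km/s.

```python
E_CHARGE = 4.803e-10  # elementary charge [statC]
C_LIGHT = 2.998e10    # speed of light [cm/s]
EV = 1.602e-12        # electron volt [erg]
YEAR = 3.156e7        # year [s]


def acceleration_time(d1, d2, u1, r):
    """Eq. (46): 3 / (u1 - u2) * (D1/u1 + D2/u2), with u2 = u1 / r."""
    u2 = u1 / r
    return 3.0 / (u1 - u2) * (d1 / u1 + d2 / u2)


def acceleration_time_power_law(d1, d2, u1, r, delta):
    """Eq. (47) for D proportional to p**delta (delta > 0), D1 and D2 taken at p."""
    return acceleration_time(d1, d2, u1, r) / delta


def diffusion_coefficient(energy_ev, b_gauss, f_turb=1.0):
    """Eq. (32) for a relativistic particle: D = r_L c / (3 F), with r_L = E / (e B)."""
    r_l = energy_ev * EV / (E_CHARGE * b_gauss)
    return r_l * C_LIGHT / (3.0 * f_turb)


u1 = 5.0e8                                # 5000 km/s
d_tev = diffusion_coefficient(1.0e12, 1.0e-6)
tau46 = acceleration_time(d_tev, d_tev, u1, 4.0)
tau47_kolmogorov_like = acceleration_time_power_law(d_tev, d_tev, u1, 4.0, 0.5)

print(f"D(1 TeV)              : {d_tev:.2e} cm^2/s")
print(f"tau_acc, Eq. (46)     : {tau46 / YEAR:.1f} yr")
print(f"tau_acc, Eq. (47), delta=0.5: {tau47_kolmogorov_like / YEAR:.1f} yr")
```

For a strong shock with equal diffusion coefficients on the two sides, Eq. (46) reduces to \(20D/u_{1}^{2}\), where the downstream term contributes four times the upstream one only because \(D_{2}=D_{1}\) was assumed; in practice the larger upstream diffusion coefficient is expected to dominate.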

The first definition of maximum energy is that the acceleration time be smaller than the age of the SNR \(\tau_{\it SNR}\). Using Eq. (32) for the diffusion coefficient, and concentrating our attention on the upstream fluid, one can write the condition for the maximum energy as

$$ \frac{1}{3}\frac{r_{L}(p_{\max}) c}{v_{s}^{2}\mathcal{F}(k_{\min})}\approx \tau_{\it SNR}, $$
(48)

where \(k_{\min}=1/r_{L}(p_{\max})\) is the wave number resonant with particles with momentum \(p_{\max}\). Using the fact that for a SNR in its ejecta dominated phase \(v_{s}\tau_{{\it SNR}}\approx R_{\it SNR}\), the radius of the SNR shell, the condition becomes

$$ \mathcal{F}(k_{\min}) \approx\frac{1}{3} \frac{c}{v_{s}} \frac {r_{L}(p_{\max})}{R_{\it SNR}}. $$
(49)

This condition is rather interesting since at \(p_{\max}\), for reference values of the parameters, one has

$$ r_{L}(p_{\max}) = 1~\mathrm{pc} \biggl( \frac{E}{10^{15}~\mathrm{eV}} \biggr) B_{\mu}^{-1}, $$
(50)

which is a fraction of order ∼0.1 of the size of young known SNRs in the ejecta dominated phase or early stages of the Sedov–Taylor phase. Since \(c/v_{s}\sim100\) for the same cases, one immediately infers that in order for a SNR to be a PeVatron one has to have \(\mathcal{F}(k_{\min})\gg1\), namely the random component of the magnetic field on the scale \(\sim r_{L}(p_{\max})\) must be much larger than the pre-existing ordered magnetic field, \(\delta B/B_{0}\gg1\). Clearly in these conditions the calculations that led us to the expression Eq. (32) for the diffusion coefficient fail since the random field can no longer be considered as a perturbation. These last few lines are sufficient to illustrate one of the problems that the field of CR research has been facing for the last several decades: for SNRs to behave as PeVatrons one has to invoke a physical mechanism that enhances the turbulent magnetic field upstream of a SNR shock by a factor ∼10–100 on all scales up to \(r_{L}(p_{\max})\). Notice that in the absence of such a mechanism, the maximum energy achieved at a SNR shock is rather uninteresting. For instance, if the diffusion coefficient close to the shock were the same as inferred in the ISM from measurements of the B/C ratio, the maximum energy that could be achieved at a ∼1000 year old SNR with the shock moving at 3000 km/s is only a fraction of a GeV.
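The requirement on the turbulence level can be made concrete by evaluating Eqs. (49) and (50) for reference parameters. In the sketch below the function names are hypothetical, and the shock velocity of 3000 km/s, the shell radius of 10 pc and the background field of 1 μG are assumed values chosen to match the orders of magnitude quoted in the text.

```python
E_CHARGE = 4.803e-10  # elementary charge [statC]
C_LIGHT = 2.998e10    # speed of light [cm/s]
EV = 1.602e-12        # electron volt [erg]
PC = 3.086e18         # parsec [cm]


def larmor_radius_pc(energy_ev, b_microgauss):
    """Eq. (50): Larmor radius of a relativistic particle, in parsec."""
    return energy_ev * EV / (E_CHARGE * b_microgauss * 1.0e-6) / PC


def required_turbulence(v_s_cm_s, r_l_pc, r_snr_pc):
    """Eq. (49): F(k_min) = (1/3) (c / v_s) (r_L / R_SNR)."""
    return (C_LIGHT / v_s_cm_s) * (r_l_pc / r_snr_pc) / 3.0


r_l_pev = larmor_radius_pc(1.0e15, 1.0)           # 1 PeV in a 1 microgauss field
f_pev = required_turbulence(3.0e8, r_l_pev, 10.0)  # assumed v_s and R_SNR

print(f"r_L(1 PeV, 1 microgauss) : {r_l_pev:.2f} pc")
print(f"required F(k_min)        : {f_pev:.2f}")
```

Even with these favorable numbers the required \(\mathcal{F}(k_{\min})\) exceeds unity, which places the problem outside the perturbative regime in which Eq. (32) was derived.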

It is important to stress that since the acceleration time is dominated by the upstream conditions, the large magnetic field amplification is needed upstream, where only accelerated particles can reach. It is therefore natural to expect, as was initially proposed by Bell (Drury 1983) and Lagage and Cesarsky (Blasi et al. 2007), that the magnetic field may be excited by the same particles that are being accelerated. This important aspect of DSA will be discussed in the context of the non-linear theory of DSA in Sect. 4.

One last point is worth mentioning concerning Eq. (49). One might argue that by increasing the radius of the SNR the condition on \(\mathcal{F}\) may be relaxed, and that acceleration of very high-energy CRs may take place at the late stages of the SNR. This is, however, not plausible for several reasons: (1) after the beginning of the Sedov–Taylor phase, the radius of the remnant increases slowly, therefore not much changes in the constraint on \(\mathcal{F}(k_{\min})\); (2) during the Sedov–Taylor phase the velocity of the shock drops with time, therefore the acceleration time starts increasing, unless the rate of magnetic field amplification gets larger, but in this case the constraint on \(\mathcal{F}(k_{\min})\) becomes even more severe. It is therefore plausible that the highest energy in a SNR is reached sometime during the ejecta dominated phase, and most likely right before the beginning of the Sedov–Taylor phase.

An alternative definition of the maximum energy is inspired by the possibility of free particle escape from a boundary located at some distance \(z_{0}=\chi R_{\it sh}\), with χ<1. This definition is more often used to describe the maximum energy during the Sedov–Taylor phase, when particle escape should be easier because the shock slows down, so that not only the probability for the highest energy particles to return to the shock decreases (see discussion in Sect. 6.1) but also the strength of the amplified magnetic field is likely to drop. The condition for the maximum momentum in this case can be written as:

$$ \frac{D(p_{\max})}{V_{\it sh}} \approx\chi R_{\it sh}. $$
(51)

Again the highest value of \(p_{\max}\) can be reached at the beginning of the Sedov–Taylor phase, when one can approximately estimate the SNR radius as \(R_{\it sh}\approx V_{\it sh} T_{\it ST}\), so that Eq. (51) becomes

$$ \frac{D(p_{\max})}{V_{\it sh}^{2}} \approx\chi T_{\it ST}. $$
(52)

Recalling that \(D(p)/V_{\it sh}^{2}\) is a rough estimate of the acceleration time, one easily realizes that the condition in Eq. (52) is somewhat more restrictive than the one based on comparing the acceleration time with the age of the SNR, since χ<1.

The third definition of the maximum energy is purely geometric in nature and should be used more as a solid upper limit rather than as an estimate of \(p_{\max}\). The condition, which I will only mention here, consists in requiring that the Larmor radius of the highest energy particles equal the size of the system, \(r_{L}(p_{\max})=R_{\it sh}\). Typically this condition overestimates the value of \(p_{\max}\) by \({\sim} c/V_{\it sh}\) with respect to the second definition discussed above.
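The three definitions can be compared side by side with a small sketch. It assumes relativistic particles, the diffusion coefficient of Eq. (32) with \(v\approx c\), that is \(D=r_{L}c/(3\mathcal{F})\), and illustrative parameter values (\(\mathcal{F}=1\), χ=0.1, a 1 μG field, a 1000 year old remnant with a 3000 km/s shock and \(R_{\it sh}=V_{\it sh}\tau_{\it SNR}\)). All function names are hypothetical.

```python
E_CHARGE = 4.803e-10  # elementary charge [statC]
C_LIGHT = 2.998e10    # speed of light [cm/s]
EV = 1.602e-12        # electron volt [erg]
YEAR = 3.156e7        # year [s]


def energy_ev_from_larmor(r_l_cm, b_gauss):
    """Energy of a relativistic particle with Larmor radius r_L: E = e B r_L."""
    return E_CHARGE * b_gauss * r_l_cm / EV


def emax_age(b_gauss, v_sh, age_s, f_turb=1.0):
    """First definition, Eq. (48): (1/3) r_L c / (v_sh^2 F) = age."""
    return energy_ev_from_larmor(3.0 * f_turb * v_sh**2 * age_s / C_LIGHT, b_gauss)


def emax_escape(b_gauss, v_sh, r_sh, chi, f_turb=1.0):
    """Second definition, Eq. (51): D(p_max) / v_sh = chi R_sh."""
    return energy_ev_from_larmor(3.0 * f_turb * chi * v_sh * r_sh / C_LIGHT, b_gauss)


def emax_geometric(b_gauss, r_sh):
    """Third definition: r_L(p_max) = R_sh."""
    return energy_ev_from_larmor(r_sh, b_gauss)


b, v_sh, age, chi = 1.0e-6, 3.0e8, 1000.0 * YEAR, 0.1
r_sh = v_sh * age

e_age = emax_age(b, v_sh, age)
e_esc = emax_escape(b, v_sh, r_sh, chi)
e_geo = emax_geometric(b, r_sh)

print(f"age-limited    : {e_age:.2e} eV")
print(f"escape-limited : {e_esc:.2e} eV")
print(f"geometric      : {e_geo:.2e} eV")
```

With these assumptions the escape-limited value is smaller than the age-limited one by the factor χ, while the geometric limit exceeds the escape-limited one by \(c/(3\mathcal{F}\chi V_{\it sh})\), which is of order \(c/V_{\it sh}\) up to numerical factors.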

All estimates of the diffusion coefficient presented above are based on the framework of particle acceleration at a quasi-parallel shock. Jokipii (Bell 1978a) argued that particle acceleration may be faster at oblique shocks (angle to the shock normal larger than ∼30°) and be the fastest at perpendicular shocks (magnetic field perpendicular to the shock normal), even for δB/B<1. At such shocks, particles can cross the shock surface several times during Larmor gyrations while moving along the magnetic field, and be thereby accelerated by the drifts associated with the electric fields that the particles experience because of the different plasma velocity upstream and downstream of the shock. The weak point of this simple scenario is that the particles get advected at the plasma speed, with the magnetic field line that they are trapped on, thereby reducing the time that they can stay in the shock region. On the other hand, the random walk of magnetic field lines may solve this problem, as discussed by Lagage and Cesarsky (1983a). The role of particle transport perpendicular to the field lines is, however, not yet completely understood: the theory that currently best describes particle diffusion perpendicular to field lines was formulated by Lagage and Cesarsky (1983b), and shows how the perpendicular diffusion coefficient depends in a non-trivial way upon the parallel diffusion coefficient, thereby creating serious problems in building a self-consistent picture of particle acceleration at perpendicular shocks. However, numerical simulations have shown that particle acceleration at perpendicular shocks may be a promising mechanism to increase the maximum energy of accelerated particles beyond the limits discussed above (Bell 1978a, 1978b).

The two scenarios of effective magnetic field amplification and of perpendicular shock configuration (without magnetic field amplification) are often considered as two alternative possibilities to shorten the acceleration time and lead to higher energy particles. In fact, reality can be appreciably more complex than that. For instance, the field at the shock can become predominantly perpendicular as a result of magnetic field amplification upstream with δB/B≫1, since the perturbations are likely to evolve mainly in the plane perpendicular to the pre-existing magnetic field. Moreover, as discussed by 1983a, 1983b, the large scale behavior of the magnetic field lines is likely to speed up acceleration even for the case of parallel shocks, because when a magnetic field line crosses the shock, there is a finite probability that it happens to be oblique with respect to the shock normal, so that drifts enhance the particle energy gain. This complexity and its implications for particle acceleration to the highest energies deserve much more attention than they have received until now.

4 The non-linear theory of diffusive shock acceleration

In the previous section I have outlined the main principles and the main limitations of the test-particle theory of CR acceleration in SNR shocks. There are three main reasons that justify the need for a non-linear theory of DSA:

  1. Dynamical reaction of accelerated particles.

    For the typical rate of SNRs in the Galaxy, the acceleration efficiency per supernova required to reproduce the CR energetics observed at Earth is of order ∼10 %. This implies that the pressure exerted by accelerated particles on the plasma around the shock affects the shock dynamics as well as the acceleration process. The non-linearity appears through the modification of the compression factor which in turn changes the spectrum of accelerated particles in a way that in general depends upon particle rigidity.

    Note also that while ∼10 % may be a reasonable estimate of the CR acceleration efficiency averaged over the entire history of the remnant, there may be stages during which the efficiency may be appreciably larger.

  2. Plasma instabilities induced by accelerated particles.

    As I discussed above, SNRs can be the source of the bulk of CRs in the Galaxy, up to rigidities of order \({\sim}10^{6}\) GV, only if substantial magnetic field amplification takes place at the shock surface. Since this process must take place upstream of the shock in order to reduce the acceleration time, it is likely that it is driven by the same accelerated particles, which would therefore determine the diffusion coefficient that describes their motion. The existence of magnetic field amplification is also the most likely explanation of the bright, narrow X-ray rims of non-thermal emission observed in virtually all young SNRs (see 1982, 1987 for recent reviews). The non-linearity here is reflected in the fact that the diffusion coefficient becomes dependent upon the distribution function of accelerated particles, which is in turn determined by the diffusion coefficient in the acceleration region.

  3. Dynamical reaction of the amplified magnetic field.

    The magnetic fields required to explain the X-ray filaments are of order 100–1000 μG. The magnetic pressure is therefore still a fraction of order \(10^{-2}\)–\(10^{-3}\) of the ram pressure \(\rho v_{s}^{2}\) for typical values of the parameters. However, the magnetic pressure may easily become larger than the upstream thermal pressure of the incoming plasma, so as to affect the compression factor at the shock. A change in the compression factor affects the spectrum of accelerated particles, which in turn determines the level of magnetic field amplification, another non-linear aspect of DSA.

While a review of non-linear DSA (NLDSA) can be found in Giacalone (2005), here I will focus on the physical aspects of relevance for the calculations of the spectrum and multifrequency appearance of SNRs. Mathematical subtleties, when present, will be pointed out but not discussed in detail.

4.1 Dynamical reaction of accelerated particles

The dynamical reaction that accelerated particles exert on the shock is due to two different effects: (1) The pressure in accelerated particles slows down the incoming upstream plasma as seen in the shock reference frame, thereby creating a precursor. In terms of the dynamics of the plasma, this leads to a compression factor that depends on the location upstream of the shock (Matthaeus et al. 2003). (2) The escape of the highest energy particles from the shock region makes the shock radiative (Giacalone 2005, 2013), thereby inducing an increase of the compression factor between upstream infinity and downstream. Both these effects result in a modification of the spectrum of accelerated particles, which turns out to be no longer a perfect power law (Giacalone 2005).

Before outlining a theory of NLDSA, it is useful to get a feeling for the physical effects expected from the dynamical reaction of accelerated particles on the shock. A pictorial representation of the shock modification induced by accelerated particles is shown in Fig. 7: the plasma velocity at upstream infinity (x=−∞) is \(u_{0}\). While approaching the shock, a fluid element experiences an increasing pressure due to accelerated particles. This is a result of the fact that the diffusion coefficient is an increasing function of momentum, so that at a position z upstream only particles with energy \(E\gtrsim E_{\min}(z)\), with \(D(E_{\min})/v_{s}\approx|z|\), can reach that far. The pressure of accelerated particles tends to slow down the incoming fluid, so that a precursor is created, with the gas getting slower while approaching the shock surface. Since the shock region becomes more complex in the presence of particle acceleration, the term shock is usually used to refer to the whole region between upstream infinity and downstream infinity, which is made of a precursor and a subshock, the latter being the sharp discontinuity produced in the background gas. If the spectrum were \({\sim}E^{-2}\), the energy density would only scale logarithmically with \(E_{\min}\) (for a given \(E_{\max}\)), therefore the precursor is spatially extended. For spectra steeper than \(E^{-2}\), the energetics is dominated by low energies, therefore the precursor is concentrated toward the subshock. On the other hand, it will be shown later that in the presence of efficient CR acceleration, the spectrum at high energies can become appreciably harder than \(E^{-2}\), so as to make the total energy in the form of accelerated particles dominated by the maximum energy.

Fig. 7 Schematic view of a cosmic-ray modified shock wave in the shock frame. Upstream infinity is on the left (x=−∞), where the plasma velocity is \(u_{0}\). The CR pressure slows down the inflowing plasma, so as to reduce its bulk velocity to \(u_{1}<u_{0}\) immediately upstream of the subshock. The plasma is then compressed and slowed down at the subshock so that the plasma velocity downstream is \(u_{2}=u_{1}/R_{\it sub}\). The total compression factor is \(R_{\it tot}=u_{0}/u_{2}\)
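The statement that the energy content scales only logarithmically with the minimum energy for an \(E^{-2}\) spectrum, and is dominated by low or high energies for steeper or harder spectra, can be checked with a few lines of Python. This is a minimal sketch, assuming a pure power-law differential spectrum \(N(E)\propto E^{-s}\) of relativistic particles between `e_min` and `e_max`; the function names and the choice of five decades in energy are illustrative and not taken from the text.

```python
import math


def energy_integral(e_lo, e_hi, s):
    """Integral of E * E**(-s) dE between e_lo and e_hi (arbitrary units)."""
    if abs(s - 2.0) < 1e-12:
        # For s = 2 every decade in energy carries the same energy content.
        return math.log(e_hi / e_lo)
    return (e_hi ** (2.0 - s) - e_lo ** (2.0 - s)) / (2.0 - s)


def top_decade_fraction(s, e_min=1.0, e_max=1.0e5):
    """Fraction of the total energy carried by the highest decade in energy."""
    return energy_integral(e_max / 10.0, e_max, s) / energy_integral(e_min, e_max, s)


if __name__ == "__main__":
    for slope in (2.3, 2.0, 1.7):
        print(f"s = {slope}: top-decade energy fraction = {top_decade_fraction(slope):.3f}")
```

For a slope of 2 the highest of the five decades carries one fifth of the energy, while steeper spectra put most of the energy near `e_min` (precursor concentrated toward the subshock) and harder spectra put it near `e_max`.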

Although the energy density in the form of accelerated particles may become comparable with the ram pressure \(\rho u^{2}\), the number density of these particles remains negligible with respect to the density of the background plasma. Therefore the equation for mass conservation is

$$ \frac{\partial\rho}{\partial t} + \frac{\partial}{\partial z} ( \rho u ) = 0. $$
(53)

The equation of motion of a fluid element under the action of a gradient in the total pressure is

$$ \rho\frac{D u}{D t} = -\frac{\partial}{\partial z} ( P_{g} + P_{c} ), $$
(54)

where D/Dt=∂/∂t+u∂/∂z denotes the total time derivative and \(P_{g}\) and \(P_{c}\) are the gas and cosmic-ray pressure respectively. After some simple algebra and using Eq. (53) for mass conservation, one can easily rewrite this as

$$ \frac{\partial}{\partial t} ( \rho u ) = -\frac{\partial }{\partial z} \bigl[ \rho u^{2} + P_{g} + P_{c} \bigr], $$
(55)

which can be viewed as the equation for momentum conservation in the presence of accelerated particles. It is useful to introduce the energy per unit mass of fluid as \(\epsilon=\frac{1}{2} u^{2} + \frac {P_{g}}{\rho(\gamma_{g}-1)}\), so that the energy per unit volume is ρϵ. The time derivative of the energy per unit volume can therefore be written as:

$$ \frac{\partial}{\partial t} (\rho\epsilon ) = -\frac {\partial}{\partial z} \bigl[ ( \rho \epsilon+ P_{g} ) u \bigr] - u \frac{\partial P_{c}}{\partial z}, $$
(56)

where I used the equations for conservation of mass and momentum and the condition that on both sides of the subshock (but not at the subshock itself) the gas evolves adiabatically:

$$ \frac{D P_{g}}{D t} = -\gamma_{g} P_{g} \frac{du}{dz}. $$
(57)

Equations (53), (55) and (56) represent mass, momentum and energy conservation in a plasma in which there are accelerated particles contributing a pressure \(P_{c}\). In the assumption of stationarity that is often adopted in calculations of particle acceleration at SNR shocks, the three equations read

$$\begin{aligned} &\frac{\partial}{\partial z} ( \rho u ) = 0, \end{aligned}$$
(58)
$$\begin{aligned} &\frac{\partial}{\partial z} \bigl( \rho u^{2} + P_{g} +P_{c} \bigr) = 0, \end{aligned}$$
(59)
$$\begin{aligned} &\frac{\partial}{\partial z} \biggl( \frac{1}{2} \rho u^{3} + \frac {\gamma_{g}}{\gamma_{g}-1}u P_{g} \biggr) = -u\frac{\partial P_{c}}{\partial z}. \end{aligned}$$
(60)

It is useful to notice that since the distribution function of accelerated particles is continuous across the subshock, \(P_{c}(z=0^{-})=P_{c}(z=0^{+})\), the conservation equations at the subshock are those of an ordinary gas shock. The effect of accelerated particles is reflected only in the fact that the fluid velocity immediately upstream of the subshock is different from the one at upstream infinity. In this sense, the subshock is a standard gaseous shock, while the overall structure of the shock region may be heavily affected by cosmic rays.

The dynamics of accelerated particles is defined by the transport equation, which I report here in its time-dependent form:

$$ \frac{\partial f}{\partial t} + u \frac{\partial f}{\partial z} = \frac{\partial}{\partial z} \biggl[ D \frac{\partial f}{\partial z} \biggr] + \frac{1}{3} \frac{du}{dz} p \frac{\partial f}{\partial p} + Q. $$
(61)

If T(p) is the kinetic energy of particles with momentum p, the energy density and pressure of accelerated particles can be written as

$$\begin{aligned} &E_{c} (z)= \int_{0}^{\infty} dp\,4\pi p^{2} T(p) f(p,z), \end{aligned}$$
(62)
$$\begin{aligned} &P_{c} (z) = \frac{1}{3} \int_{0}^{\infty} dp 4\pi p^{3} v(p) f(p,z). \end{aligned}$$
(63)

Integrating Eq. (61) in momentum space, and neglecting the small energy input at the shock as due to injection, one gets:

$$ \frac{\partial E_{c}}{\partial t} + \frac{\partial}{\partial z} ( u E_{c} ) = \frac{\partial}{\partial z} \biggl[ \bar{D} \frac{\partial E_{c}}{\partial z} \biggr] - P_{c} \frac{d u}{d z} + \frac{1}{3} \biggl(\frac{d u}{d z} \biggr) \bigl[ 4\pi p^{3} T(p) f(p,z) \bigr]_{p=0}^{p=\infty}, $$
(64)

where I introduced the mean diffusion coefficient defined as:

$$ \bar{D}(z) = \frac{\int_{0}^{\infty} dp\,4\pi p^{2} T(p) D(p) \frac {\partial f}{\partial z}}{\int_{0}^{\infty} dp\,4\pi p^{2} T(p) \frac {\partial f}{\partial z}}. $$
(65)

The last term in Eq. (64) requires some comments: in test-particle theory, the transport equation (61) has a time-dependent solution with a steadily increasing maximum momentum (if the shock velocity remains constant), namely there is no stationary solution of that equation. A stationary solution would correspond to a power law extending to infinite energy, and for a strong shock this would lead to the last term in Eq. (64) being finite. In the context of a non-linear theory of particle acceleration, the situation is even worse since spectra can become harder than \(p^{-4}\), thereby making the same term diverge. Clearly the system would be destroyed by CR pressure before reaching that situation. A meaningful stationary solution (or a quasi-stationary solution) can only be obtained by assuming the existence of a physical boundary at a finite location \(z_{0}\) upstream, where particles escape from the acceleration region. This corresponds to requiring \(f(z_{0},p)=0\), so as to have an escape flux proportional to the space derivative of the distribution function at \(z_{0}\) (which does not vanish). Within this framework the distribution function has a strong suppression at \(p_{\max}\) (see below) and the last term in Eq. (64) vanishes. Hence Eq. (64) becomes

$$ \frac{\partial E_{c}}{\partial t} + \frac{\partial}{\partial z} \biggl[ \frac{\gamma_{c}}{\gamma_{c}-1}u P_{c} \biggr] = \frac {\partial}{\partial z} \biggl[ \bar{D} \frac{\partial E_{c}}{\partial z} \biggr] + u \frac{\partial P_{c}}{\partial z}, $$
(66)

where I introduced the adiabatic index of accelerated particles as \(E_{c}=P_{c}/(\gamma_{c}-1)\). Equation (66) can be used to derive \(u\,\partial P_{c}/\partial z\), to be substituted in Eq. (56) (Vink 2012; Ballet 2006):

$$\begin{aligned} \frac{\partial}{\partial t} \biggl[ \frac{1}{2} \rho u^{2} + \frac {P_{g}}{\gamma_{g}-1}+E_{c} \biggr] =& - \frac{\partial}{\partial z} \biggl[ \frac{1}{2} \rho u^{3} + \frac{\gamma_{g}}{\gamma_{g}-1}u P_{g} + \frac{\gamma_{c}}{\gamma_{c}-1}u P_{c} \biggr] \\ &{}+ \frac {\partial}{\partial z} \biggl[\bar{D} \frac{\partial E_{c}}{\partial z} \biggr]. \end{aligned}$$
(67)

In the stationary regime the compression factor at the subshock can be written as a function of the Mach number \(M_{1}\) of the fluid immediately upstream of the subshock in the usual way:

$$ R_{\it sub}=\frac{u_{1}}{u_{2}}=\frac{\rho_{2}}{\rho_{1}}=\frac {(\gamma_{g}+1) M_{1}^{2}}{(\gamma_{g}-1) M_{1}^{2}+2}, $$
(68)

which can be obtained by integrating the equations of conservation of mass and momentum around the subshock. Integrating the same equations between immediately upstream (\(z=0^{-}\)) and far upstream (\(z=z_{0}\)) one also derives

$$ R_{\it tot}=\frac{u_{0}}{u_{2}}=M_{0}^{\frac{2}{\gamma_{g}+1}} \biggl[ \frac{(\gamma_{g}+1)R_{\it sub}^{\gamma_{g}}-(\gamma _{g}-1)R_{\it sub}^{\gamma_{g}+1}}{2} \biggr]^{\frac{1}{\gamma_{g}+1}}, $$
(69)

where I used the condition of adiabaticity of the upstream gas: \(M_{1}^{2}=M_{0}^{2} (\frac{R_{\it sub}}{R_{\it tot}} )^{\gamma _{g}+1}\). The total compression factor changes in case of non-adiabatic heating of the precursor, for instance due to the damping of waves induced by accelerated particles (see for instance Malkov and Drury (2001)).
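The relation between the two compression factors can be made concrete with a short numerical sketch of Eqs. (68) and (69). The function names and the sample Mach numbers are assumptions introduced only for illustration, and the sketch holds only for an adiabatic precursor (no wave damping), as stated above. Here the total compression factor is understood as \(u_{0}/u_{2}\), as in the caption of Fig. 7.

```python
def subshock_compression(mach_1, gamma_g=5.0 / 3.0):
    """Eq. (68): compression factor of the gaseous subshock for Mach number M_1."""
    m2 = mach_1 ** 2
    return (gamma_g + 1.0) * m2 / ((gamma_g - 1.0) * m2 + 2.0)


def total_compression(mach_0, r_sub, gamma_g=5.0 / 3.0):
    """Eq. (69): total compression factor for an adiabatic precursor."""
    bracket = ((gamma_g + 1.0) * r_sub ** gamma_g
               - (gamma_g - 1.0) * r_sub ** (gamma_g + 1.0)) / 2.0
    return mach_0 ** (2.0 / (gamma_g + 1.0)) * bracket ** (1.0 / (gamma_g + 1.0))


def subshock_mach(mach_0, r_sub, r_tot, gamma_g=5.0 / 3.0):
    """Adiabaticity condition: M_1^2 = M_0^2 (R_sub / R_tot)^(gamma_g + 1)."""
    return mach_0 * (r_sub / r_tot) ** ((gamma_g + 1.0) / 2.0)


if __name__ == "__main__":
    m0, m1 = 10.0, 6.0  # illustrative values: the precursor lowers M from 10 to 6
    r_sub = subshock_compression(m1)
    r_tot = total_compression(m0, r_sub)
    print(f"R_sub = {r_sub:.3f}, R_tot = {r_tot:.3f}")
    print(f"M_1 recovered from adiabaticity = {subshock_mach(m0, r_sub, r_tot):.3f}")
```

When the precursor is absent (\(M_{1}=M_{0}\)) the two functions return the same value, which is a quick consistency check of the two formulas.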

Finally, Eq. (67) can be used to determine \(F_{\it esc}=\bar{D} \frac{\partial E_{c}}{\partial z}|_{z=z_{0}}\), which has the meaning of an escape flux of energy in the form of accelerated particles. These equations illustrate very clearly the formation of a cosmic-ray-induced precursor: for instance in the limit in which the gas pressure upstream remains negligible compared with \(\rho u^{2}\), which is always true for strong shocks, the equation of conservation of momentum can be written as

$$ \xi_{\it CR}(z)\approx\frac{P_{c} (z)}{\rho_{0}u_{0}^{2}}\approx 1-\frac{u(z)}{u_{0}}, $$
(70)

where u(z) is the gas velocity at the position z upstream. Immediately upstream of the shock the gas feels the largest CR pressure \(\xi_{\it CR}(0) = 1-\frac{u_{1}}{u_{0}}\). In other words the upstream gas is slowed down by the CR pressure by an amount which is directly related to the fraction of the ram pressure \(\rho_{0}u_{0}^{2}\) that gets converted to accelerated particles.

Since the subshock is a gaseous shock (namely its dynamics is not affected by the presence of accelerated particles), its compression factor is bound to be \(R_{\it sub}<4\), while the total compression factor can potentially become large. In the absence of particle escape, the net effect of the accelerated particles would be to change the adiabatic index toward ∼4/3 (appropriate for a relativistic gas), therefore \(R_{\it tot}\sim7\). However, the escape of particles makes the shock radiative-like, so that \(R_{\it tot}\) can become larger than 7, although I will show below that in all realistic calculations of CR modified shocks both \(R_{\it sub}\) and \(R_{\it tot}\) stay rather close to 4, as a consequence of additional non-linear effects that reduce the CR reaction.
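The values 4 and 7 quoted above are the strong-shock limits of Eq. (68) for adiabatic indices 5/3 and 4/3, and Eq. (70) links the two compression factors to the CR pressure at the subshock through \(u_{1}/u_{0}=R_{\it sub}/R_{\it tot}\). The sketch below makes both statements explicit; the function names and the sample values of the compression factors are illustrative assumptions, and Eq. (70) itself neglects the upstream gas pressure.

```python
def strong_shock_compression(gamma):
    """Limit of Eq. (68) for M_1 -> infinity: R = (gamma + 1) / (gamma - 1)."""
    return (gamma + 1.0) / (gamma - 1.0)


def cr_pressure_fraction(r_sub, r_tot):
    """Eq. (70) at the subshock: xi_CR(0) = 1 - u_1/u_0 = 1 - R_sub/R_tot."""
    return 1.0 - r_sub / r_tot


if __name__ == "__main__":
    print("ideal monoatomic gas :", strong_shock_compression(5.0 / 3.0))
    print("relativistic gas     :", strong_shock_compression(4.0 / 3.0))
    # Illustrative compression factors, not values taken from the text.
    print("xi_CR(0) for R_sub=3.7, R_tot=5.4:", cr_pressure_fraction(3.7, 5.4))
```

An unmodified shock has \(R_{\it sub}=R_{\it tot}\) and therefore a vanishing CR pressure fraction in this approximation.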

The formation of a precursor upstream implies that the spectra of accelerated particles are no longer power laws. Physically this is easy to understand: particles with momentum p diffuse upstream by a distance that is proportional to the diffusion coefficient D(p), which is usually a growing function of momentum. This implies that particles with low momentum experience a compression factor closer to \(R_{\it sub}<4\), while higher momentum particles trace a compression factor closer to \(R_{\it tot}>4\). As a consequence the spectrum is expected to be steeper than \(p^{-4}\) at low momenta and harder than \(p^{-4}\) at high momenta, with the transition typically taking place around a few \(\mathrm{GeV/c}\). From the mathematical point of view, the spectrum can be calculated by solving together the non-linear CR transport equation and the equations for conservation of mass, momentum and energy. This has been done in at least three different ways: (1) finite schemes of numerical integration (Drury and Voelk 1981; Axford et al. 1982), (2) Monte Carlo methods (Berezhko and Ellison 1999) and (3) semi-analytical methods (Berezhko et al. 1994; Berezhko and Völk 1997; Berezhko and Ellison 1999; Malkov 1999; Blasi 2002). Each of these methods has its pros and cons: calculations of the CR transport based on finite schemes of integration are best at tracking the temporal evolution of the whole system. Monte Carlo methods could in principle be used to investigate non-diffusive effects of CRs close to the maximum momentum. Both these methods are rather time-consuming and in general it is problematic to use them together with hydrodynamical simulations of a supernova evolution. Semi-analytical methods are computationally very fast and easy to implement in more complex calculations involving simulations of supernova evolution. The quasi-stationary solutions derived with semi-analytical methods are excellent approximations to the time-dependent solutions for given parameters, as discussed by Caprioli et al. (2009a).

The encouraging agreement among these different methods of calculation of the CR dynamical reaction allows us to draw some general conclusions on these non-linear effects: (1) the spectra of particles accelerated at a shock in the non-linear regime are not perfect power laws. (2) Since a fraction \(\xi_{\it CR}\) of the ram pressure \(\rho _{0}u_{0}^{2}\) is channeled into accelerated particles, the thermal energy of particles downstream of the shock is less than would have been found in the absence of particle acceleration. Both these effects are well illustrated in Fig. 8 (from Berezhko and Ellison 1999), where the distribution function of particles (thermal plus accelerated) is plotted (multiplied by \(p^{4}\)). The three curves are obtained by changing the Mach number of the shock by lowering the temperature of the upstream gas (the shock velocity is fixed at \(u_{0}=5\times10^{8}\) cm s−1). Increasing the Mach number causes the CR acceleration efficiency to increase (the value of the maximum momentum is fixed at \(p_{\max}=10^{5}\) GeV/c) and the spectra become increasingly concave, reflecting a more pronounced CR-induced shock modification. Moreover, as the CR acceleration efficiency increases, the temperature of the downstream plasma drops, as shown by the peak of the Maxwellian distributions in Fig. 8 moving leftward. In Sect. 7 I will discuss the implications of this phenomenon on the width of the broad Balmer line emission in shocks where CR acceleration is efficient.

Fig. 8 Particle spectra (thermal plus non-thermal) at a CR modified shock with Mach number \(M_{0}=10\) (solid line), \(M_{0}=50\) (dashed line) and \(M_{0}=100\) (dotted line). The vertical dashed line is the location of the thermal peak as expected for an ordinary shock with no particle acceleration (this value depends very weakly on the Mach number, for strong shocks). The plasma velocity at upstream infinity is \(u_{0}=5\times10^{8}\) cm/s, \(p_{\max}=10^{5}\,m_{p}c\) and the injection parameter is ξ=3.5 (Berezhko and Völk 1997, 2000; Zirakashvili and Ptuskin 2012)

The curvature in the spectrum is directly related to the formation of a precursor upstream of the shock. The plasma compression in the precursor is directly related not only to the pressure in the form of accelerated particles but also to any form of non-adiabatic heating possibly associated with the presence of accelerated particles. Non-adiabatic heating leads in general to a weakening of the precursor and in turn to a reduction of the concavity in the spectra of accelerated particles. Since the most natural source of non-adiabatic heating upstream is due to damping of the turbulent component of magnetic fields, this phenomenon is related to the magnetic field generation, discussed in the next section.

4.2 Magnetic field amplification

The phenomenon of magnetic field amplification is probably the most important manifestation of the non-linearity of DSA. This role is related to both observational and phenomenological reasons. From the observational point of view, the main evidence for large magnetic fields in the shock region is represented by the observation of narrow filaments of non-thermal X-ray radiation in virtually all young SNRs (Ellison and Eichler 1984; Knerr et al. 1996; Vladimirov et al. 2008). This radiation is the result of synchrotron emission from electrons with energy \(E_{e}\approx 8 ( \frac{E_{\gamma}}{100~\mathrm{eV}} )^{1/2} B_{100}^{-1/2}\) TeV, where \(E_{\gamma}\) is the energy of the synchrotron photons and \(B_{100}\) is the magnetic field in units of 100 μG. One can clearly see that only electrons in the ∼10 TeV energy range can produce the X-rays observed from the rims. Assuming Bohm diffusion for simplicity, the acceleration time can be estimated as

$$ \tau_{\it acc}\approx3.3\times10^{7} E_{\mathrm{TeV}}B_{100}^{-1}V_{\mathit{sh},8}^{-2}~\mathrm{s}, $$
(71)

where \(E_{\mathrm{TeV}}\) is the electron energy in TeV and \(V_{\mathit{sh},8}=V_{\it sh}/(10^{8}~\mathrm{cm/s})\). The synchrotron loss time is

$$ \tau_{\mathit{syn}} = 4\times10^{10} B_{100}^{-2} E_{\mathrm{TeV}}^{-1}~\mathrm{s}. $$
(72)

Therefore the maximum electron energy is

$$ E_{e,\max} \approx34 B_{100}^{-1/2} V_{\mathit{sh},8}~\mathrm{TeV} $$
(73)

and the energy of the synchrotron photons reads

$$ E_{\gamma,\max} \approx1.7 V_{\mathit{sh},8}^{2}~\mathrm{keV}, $$
(74)

independent of the strength of the local magnetic field. The independence of \(E_{\gamma,\max}\) of the value of \(B_{100}\) is a consequence of having assumed Bohm diffusion, and is not a general result. In the same approximation of Bohm diffusion, the distance covered by electrons with energy close to \(E_{e,\max}\) before losing their energy can be estimated as

$$ \sqrt{D(E_{e,\max})\tau_{\mathit{syn}}} \approx3.7 \times10^{-2} B_{100}^{-3/2}~\mathrm{pc}. $$
(75)

At the distance of the young SNRs in which the bright X-ray rims have been observed, the thickness of the rims corresponds to a physical scale of \({\sim}10^{-2}\) pc, thereby implying the presence of a magnetic field of order several hundred μG, to be compared with the 1–6 μG typically found in the ISM. The filaments are the best evidence so far that magnetic fields in the shock region have been amplified by a factor ∼10 with respect to the interstellar magnetic field compressed at the shock.
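The chain of estimates in Eqs. (71)–(75) can be reproduced numerically. The sketch below uses only the numerical coefficients quoted in those equations; the function names are illustrative, and the Bohm diffusion coefficient is obtained under the assumption \(\tau_{\it acc}\approx D/V_{\it sh}^{2}\), so that it is read off from Eq. (71) rather than computed from first principles.

```python
import math

PC_IN_CM = 3.086e18  # one parsec in cm


def tau_acc_s(e_tev, b_100=1.0, v_sh8=1.0):
    """Eq. (71): Bohm acceleration time in seconds."""
    return 3.3e7 * e_tev / (b_100 * v_sh8 ** 2)


def tau_syn_s(e_tev, b_100=1.0):
    """Eq. (72): synchrotron loss time in seconds."""
    return 4.0e10 / (b_100 ** 2 * e_tev)


def e_max_tev(b_100=1.0, v_sh8=1.0):
    """Eq. (73): electron energy at which tau_acc equals tau_syn, in TeV."""
    return math.sqrt(4.0e10 / 3.3e7) * v_sh8 / math.sqrt(b_100)


def bohm_diffusion_cm2_s(e_tev, b_100=1.0):
    """Assumption: D = tau_acc * V_sh^2, which makes V_sh drop out of Eq. (71)."""
    return tau_acc_s(e_tev, b_100, 1.0) * (1.0e8) ** 2


def rim_thickness_pc(b_100=1.0):
    """Eq. (75): sqrt(D * tau_syn); the electron energy cancels in the product."""
    e_tev = e_max_tev(b_100)
    length_cm = math.sqrt(bohm_diffusion_cm2_s(e_tev, b_100) * tau_syn_s(e_tev, b_100))
    return length_cm / PC_IN_CM


def field_from_rim_b100(thickness_pc):
    """Invert Eq. (75): field (in units of 100 muG) for a given rim thickness."""
    return (rim_thickness_pc(1.0) / thickness_pc) ** (2.0 / 3.0)


if __name__ == "__main__":
    print(f"E_e,max        = {e_max_tev():.1f} TeV for B_100 = V_sh,8 = 1")
    print(f"rim thickness  = {rim_thickness_pc():.3e} pc for B_100 = 1")
    print(f"B_100 for a 0.01 pc rim = {field_from_rim_b100(1.0e-2):.2f}")
```

With these coefficients a rim thickness of \(10^{-2}\) pc corresponds to \(B_{100}\) of roughly 2.4, in line with the field of several hundred μG inferred above.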

Establishing the nature of this phenomenon is of the utmost importance. Magnetic field amplification could be produced by the shock corrugation, through a sort of Richtmyer–Meshkov instability (Malkov 1999, 1997; Blasi 2002, 2004; Amato and Blasi 2005, 2006) or could be induced by the streaming of accelerated particles (see Caprioli et al. (2010b) for a recent review), thereby representing a different aspect of the non-linear reaction of CRs on the shock. There is a qualitative, extremely important difference between these two scenarios: in the former, the field is wrapped around and its strength enhanced in the downstream region only, while in the latter case the amplification only occurs upstream of the shock, and the field is further compressed at the shock surface. These two possibilities have profoundly different implications in terms of particle acceleration, as discussed below.

Besides being needed in order to explain the thickness of the X-ray rims, magnetic field amplification is also required as a crucial aspect of the SNR paradigm. Particle acceleration as due to DSA requires effective diffusive confinement of CRs close to the shock surface in order to make it possible for the maximum energy to rise up to \({\sim}10^{15}\)–\(10^{16}\) eV, as required by observations of CRs at Earth. This need is well illustrated by the following simple estimate. If the diffusion coefficient relevant for particle acceleration at SNR shocks were the one derived in the ISM from the measurement of the B/C ratio, \(D(E)\approx3\times10^{28}(E/10~\mathrm{GeV})^{\delta}~\mathrm{cm}^{2}\,\mathrm{s}^{-1}\), with δ≈0.3–0.6, the acceleration time would be

$$ \tau_{\it acc}(E) \sim\frac{D(E)}{V_{\it sh}^{2}} \approx10^{5} \biggl( \frac{E}{10~\mathrm{GeV}} \biggr)^{\delta} V_{{\it sh},8}^{-2}~\mathrm{years}, $$
(76)

which exceeds the typical duration of the free-expansion phase of a SNR in the ISM even for low energies (for any reasonable value of δ). During the Sedov–Taylor phase the velocity of the expanding shock decreases, so that it is unlikely that the maximum energy can appreciably increase during this stage, unless the magnetic field increases with time during this phase, which is not expected to happen.

This simple argument shows that CR acceleration in SNRs requires that the magnetic field be disordered and amplified in the proximity of the shock so as to shorten the acceleration time. For instance, if the diffusion coefficient were Bohm-like in a magnetic field of strength 100 μG, as suggested by X-ray observations, then the acceleration time would read

$$ \tau_{\it acc}(E) \sim\frac{D(E)}{V_{\it sh}^{2}} \approx3.3\times10^{4} E(\mathrm{GeV}) V_{{\it sh},8}^{-2} B_{100}^{-1}~\mathrm{s}. $$
(77)

Comparing this time with the duration of the ejecta dominated phase of a supernova, \(T_{s}\), which for typical parameters is of order a few hundred years, one easily obtains

$$ E_{\max} \approx3\times10^{5}~\mathrm{GeV}\ B_{100} \biggl( \frac {T_{s}}{300~\mathrm{years}} \biggr) \biggl( \frac{V_{\it sh}}{1000~\mathrm{km}\,\mathrm{s}^{-1}} \biggr)^{2}. $$
(78)

Clearly faster shocks help reach higher values of \(E_{\max}\) by decreasing the advection time \({\sim} D(E)/V_{\it sh}^{2}\), although it is worth keeping in mind that this also implies that there is less time available for magnetic field amplification.

More realistic estimates of the maximum energy usually return somewhat lower values. Equation (78) illustrates in a simple way the difficulty in reaching the energy of the knee in SNR shocks. All parameters have to be chosen in the most optimistic way so as to maximize \(E_{\max}\).
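The contrast between the two regimes in Eqs. (76)–(78) can be seen by evaluating them side by side. This is a minimal sketch that uses only the coefficients quoted in those equations; the function names, the default parameter values and the conversion factor for one year are illustrative assumptions.

```python
YEAR_IN_S = 3.156e7  # one year in seconds


def tau_acc_ism_years(e_gev, delta=0.3, v_sh8=1.0):
    """Eq. (76): acceleration time with the ISM diffusion coefficient, in years."""
    d_ism = 3.0e28 * (e_gev / 10.0) ** delta      # cm^2/s, from the B/C ratio
    v_sh = v_sh8 * 1.0e8                          # cm/s
    return d_ism / v_sh ** 2 / YEAR_IN_S


def tau_acc_bohm_s(e_gev, b_100=1.0, v_sh8=1.0):
    """Eq. (77): acceleration time with Bohm diffusion, in seconds."""
    return 3.3e4 * e_gev / (v_sh8 ** 2 * b_100)


def e_max_gev(b_100=1.0, t_s_years=300.0, v_sh8=1.0):
    """Eq. (78): energy at which the Bohm acceleration time equals T_s, in GeV."""
    return t_s_years * YEAR_IN_S * b_100 * v_sh8 ** 2 / 3.3e4


if __name__ == "__main__":
    print(f"ISM diffusion, 10 GeV : {tau_acc_ism_years(10.0):.2e} years")
    print(f"Bohm, 10 GeV          : {tau_acc_bohm_s(10.0) / YEAR_IN_S:.2e} years")
    print(f"E_max (Bohm, 300 yr)  : {e_max_gev():.2e} GeV")
    print(f"E_max (Bohm, 5000 km/s): {e_max_gev(v_sh8=5.0):.2e} GeV")
```

Even a 10 GeV particle needs about \(10^{5}\) years with the ISM diffusion coefficient, while in the Bohm case the same shock reaches a few \(10^{5}\) GeV in 300 years; the quadratic dependence on the shock velocity is what makes fast, young shocks the most favorable case.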

As mentioned above, magnetic field amplification can also be due to plasma related phenomena rather than to the presence of accelerated particles. One implementation of this idea was illustrated by Blasi et al. (2005): the shock propagates in an inhomogeneous medium with density fluctuations δρ/ρ∼1. While crossing the shock surface these inhomogeneities lead to shock corrugation and to the development of eddies in which the magnetic field is frozen. The twisting of the eddies may lead to magnetic field amplification on time scales \({\sim}L_{c}/u_{2}\), where \(L_{c}\) is the spatial size of these regions with larger density and \(u_{2}\) is the plasma speed downstream of the shock. Smaller scales also grow so as to form a power spectrum downstream. This phenomenon could well be able to account for the observed thin X-ray filaments. The acceleration time for particles at the shock is, however, not necessarily appreciably reduced, in that no field amplification occurs upstream. It turns out that this mechanism may be effective in accelerating particles in the cases where the initial magnetic field is perpendicular to the shock normal. In fact in this case the particles’ return from the upstream region is geometrically easier and may potentially occur in just one Larmor gyration. It seems unlikely that this scenario, so strongly dependent upon the geometry of the system, may lead to a general solution of how to reach the highest energies in Galactic CRs, but this possibility definitely deserves more attention.

It has been known for quite some time that the super-Alfvénic streaming of charged particles in a plasma leads to the excitation of an instability (Blasi et al. 2005). The role of this instability in the process of particle acceleration in SNR shocks was recognized and its implications were discussed by many authors, most notably (Blasi et al. 2005) and (Vink 2012; Ballet 2006). The initial investigation of this instability identified as crucial the growth of resonant waves with wavenumber \(k=1/r_{L}\), where \(r_{L}\) is the Larmor radius of the particles generating the instability. The waves are therefore generated through a collective effect of the streaming of CRs but can be resonantly absorbed by individual particles, thereby leading to their pitch angle diffusion. The resonance condition, taken at face value, would lead one to expect that the growth stops when the turbulent magnetic field becomes of the same order as the pre-existing ordered magnetic field, \(\delta B\sim B_{0}\), so that the saturation level of this instability has often been assumed to occur when δB/B∼1. Lagage and Cesarsky ((Giacalone and Jokipii 2007; Sano et al. 2012)) used this fact to conclude that the maximum energy that can possibly be reached in SNRs when the accelerated particles generate their own scattering centers is \(\lesssim10^{4}\)–\(10^{5}\) GeV/n, well below the energy of the knee. Hence, though the streaming instability leads to an appealing self-generation of the waves responsible for particle diffusion, the intrinsic resonant nature of the instability would inhibit the possibility of reaching sufficiently high energy. It is important to notice that the problem with this instability is not the time scale, but again the resonant nature that forces δB/B∼1. In fact, the growth rate of the streaming instability can easily be found to be (see Sect. 4.2.1):

$$ \varGamma_{\it CR}(k) = \frac{\pi}{8} \varOmega_{p}^{*} \frac {V_{\it sh}}{V_{A}}\frac{n_{\it CR}(p>p_{\it res}(k))}{n_{i}}, $$
(79)

where \(\varOmega_{p}^{*}\) is the proton cyclotron frequency, \(n_{\it CR}(p>p_{\it res}(k))\) is the CR density with momentum \(p>p_{\it res}(k)\), where \(p_{\it res}(k)=\varOmega_{p}^{*}m_{p}/k\) is the minimum momentum of particles that can resonate with waves with wavenumber k, and \(n_{i}\) is the density of ionized gas in the background plasma (here it was assumed that \(V_{\it sh}\gg V_{A}\)).

If we introduce the power per unit logarithmic wavenumber \(\mathcal{F}(k)\), the diffusion coefficient that rules particle acceleration is \(D(p)\simeq\frac{1}{3}r_{L}(p) v(p) \frac{1}{\mathcal{F}(k)}\), and \(\mathcal{F}(k)\) satisfies the advection equation

$$ V_{\it sh} \frac{\partial\mathcal{F}}{\partial z} = \sigma(k) \mathcal{F}(z,k), $$
(80)

where \(\sigma(k)=2\varGamma_{\it CR}(k)\) is the growth rate for the quantity \(\delta B^{2}\). It is easy to solve this equation analytically if we assume that the spectrum of accelerated particles is the standard \(\propto p^{-4}\), so as to obtain the result that the power spectrum at the location of the shock is

$$ \mathcal{F}_{0}(k)=\frac{\pi}{4} \xi_{\it CR} \frac{V_{\it sh}}{V_{A}}\frac {1}{\varLambda}, $$
(81)

where \(\xi_{\it CR}\) is the fraction of \(\rho V_{\it sh}^{2}\) that is converted to accelerated particles and \(\varLambda=\log(p_{\max}/m_{p}c)\). Equation (80) reflects the fact that waves grow upstream of the shock while advecting towards the shock. In other words, the time available for wave growth is the advection time of a fluid element in the upstream, which is of order \(D(p)/V_{\it sh}^{2}\). This is a sort of upper limit to the growth of waves, in that saturation might intervene at earlier times because of damping or, as mentioned above, because the growth rate gets suppressed when \(\mathcal{F}\sim1\). For canonical values of the parameters in Eq. (81) (\(\xi_{\it CR}=0.1\), \(V_{\it sh}=5000\) km/s, \(V_{A}=3\) km/s, Λ≈10), one can see that \(\mathcal{F}_{0}\gg1\), hence the CR-induced streaming instability may potentially play a crucial role in amplifying the magnetic field upstream of the shock and enhancing the scattering of accelerating particles. Moreover, for a spectrum \(n_{\it CR}(p)\sim p^{-4}\) the power spectrum \(\mathcal{F}_{0}(k)\) is independent of k, thereby implying that the diffusion coefficient is Bohm-like, D(p)∝v(p)p.

This qualitative conclusion is, however, challenged by numerous theoretical and practical difficulties: first, when \(\mathcal{F}>1\) one qualitatively expects the resonance condition to be broken, which considerably reduces the impact of this instability; second, as I show in the next section, for acceleration efficiencies \(\xi_{\it CR}\sim10~\%\) or larger the growth rate is profoundly changed, the excited waves are no longer Alfvén waves and the saturation level is considerably reduced.

4.2.1 Resonant streaming instability induced by accelerated particles

In the reference frame of the shock the distribution of accelerated particles is approximately isotropic, while the background upstream plasma (made of protons and electrons) moves towards the shock with velocity \(V_{\it sh}\). The condition that the total charge density vanishes at any point upstream reads

$$ n_{\it CR} + n_{i} =n_{e}, $$
(82)

where \(n_{\it CR}\), n i and n e are the density of accelerated particles, ions and electrons respectively. Moreover the total electric current must also vanish, which implies

$$ n_{i}v_{i} = n_{e} v_{e}. $$
(83)

Since \(m_{p}\gg m_{e}\) we can make the assumption that electrons react more promptly than protons, so that \(v_{i}\approx V_{\it sh}\) and

$$ v_{e} = V_{\it sh}\frac{n_{i}}{n_{\it CR}+n_{i}} \approx V_{\it sh} \biggl( 1 - \frac{n_{\it CR}}{n_{i}} \biggr), $$
(84)

where I also assumed \(n_{\it CR}\ll n_{i}\), which is usually the case. The background plasma reacts to CRs moving with the shock by creating a current (relative drift between electrons and ions) that cancels the excess positive charge contributed by CRs, assumed here to be all protons. The dispersion relation of waves with wavenumber k and frequency ω allowed in a system made of CRs, background ions and electrons can be written in a general form as:

$$ \frac{c^{2} k^{2}}{\omega^{2}} = 1 + \sum_{\alpha} \frac{4\pi^{2} q_{\alpha}^{2}}{\omega} \int dp \int d\mu\frac{p^{2} v(p) (1-\mu ^{2})}{\omega- k v \mu\pm\varOmega_{\alpha}} \biggl[ \frac{\partial f_{0,\alpha}}{\partial p} + \frac{1}{p} \biggl( \frac{v k}{\omega }-\mu \biggr) \frac{\partial f_{0,\alpha}}{\partial\mu} \biggr], $$
(85)

where \(f_{0,\alpha}(p,\mu)\) is the unperturbed distribution function of particles of type \(\alpha=CR,i,e\). Here \(\varOmega_{\alpha}=q_{\alpha}B_{0}/m_{\alpha}c\) is the cyclotron frequency of the species α.

Here we first consider the solutions of the dispersion relation in the regime of low frequency waves, \(\omega\ll k V_{\it sh}\). The resulting frequency is in general a complex function of k, and the sign of the imaginary part of the frequency provides information on the growth or damping of the waves. The real part of the frequency describes the type of waves that get excited.

For simplicity let us consider the case of a spectrum of accelerated particles coincident with the canonical DSA spectrum \(f_{CR,0}(p)\propto p^{-4}\) for \(\gamma_{\min}\le p/m_{p}c\le\gamma_{\max}\). In the case of small CR efficiency, namely when the condition

$$ \frac{n_{\it CR}}{n_{i}} \ll\frac{v_{A}^{2}}{V_{\it sh}c} $$
(86)

is fulfilled (Schure et al. 2012), it is easy to show that Alfvén waves are excited (namely \(\operatorname{Re} [\omega ]\approx k v_{A}\)) and their growth rate is

$$ \operatorname{Im} [\omega ](k) \equiv\omega_{I} (k)= \frac{\pi}{8} \varOmega_{p}^{*} \frac{V_{\it sh}}{v_{A}}\frac{n_{\it CR}(p>p_{\it res}(k))}{n_{i}}. $$
(87)

This is the same growth rate as quoted in the previous section and used for the estimates of the maximum energy of accelerated particles (the factor 2 difference between Eqs. (87) and (79) is due to the fact that \(\sigma=2\omega_{I}\)). The same expression can also be used to estimate the growth rate of Alfvén waves excited in the Galaxy during propagation of CRs, if \(V_{\it sh}\) is replaced with \(\sim2v_{A}\). It is, however, very important to notice that for the usual nominal values of parameters, the condition in Eq. (86) reads \(\frac {n_{\it CR}}{n_{i}}\ll10^{-7}\). As an order of magnitude the density of CRs can be related to the efficiency as \(\frac{n_{\it CR}}{n_{i}} \approx \frac{3\xi_{\it CR}}{\gamma_{\min}\varLambda} ( \frac {V_{\it sh}}{c} )^{2}\). Therefore Eq. (86) becomes

$$ \xi_{\it CR} \ll\frac{\gamma_{\min}\varLambda}{3} \biggl( \frac {v_{A}}{V_{\it sh}} \biggr)^{2}\frac{c}{V_{\it sh}}\approx8\times10^{-4} \biggl( \frac{V_{\it sh}}{5\times10^{8}~\mathrm{cm/s}} \biggr)^{-3}, $$
(88)

which is typically much smaller than the value \(\xi_{\it CR}\sim10~\%\) that is required for SNRs to be the sources of the bulk of Galactic CRs. It follows that in the phases in which the SNR accelerates CRs most effectively, wave growth proceeds in a different regime.
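The thresholds in Eqs. (86) and (88) can be checked numerically with the short sketch below. The text does not state which Alfvén speed is meant by the usual nominal values; here \(v_{A}=10\) km/s, \(\gamma_{\min}=1\) and \(\varLambda=10\) are assumed, because this choice reproduces the quoted prefactor of \(8\times10^{-4}\). The function names are hypothetical.

```python
C_LIGHT = 2.998e10  # speed of light in cm/s

def ncr_over_ni_threshold(v_a, v_sh):
    """Right-hand side of Eq. (86): v_A^2 / (V_sh c), velocities in cm/s."""
    return v_a ** 2 / (v_sh * C_LIGHT)

def xi_cr_threshold(v_a, v_sh, gamma_min=1.0, lam=10.0):
    """Right-hand side of Eq. (88): (gamma_min Lambda / 3) (v_A/V_sh)^2 (c/V_sh)."""
    return (gamma_min * lam / 3.0) * (v_a / v_sh) ** 2 * (C_LIGHT / v_sh)

v_a = 1.0e6    # ASSUMED Alfven speed: 10 km/s
v_sh = 5.0e8   # shock speed: 5000 km/s

ncr_limit = ncr_over_ni_threshold(v_a, v_sh)
xi_limit = xi_cr_threshold(v_a, v_sh)
print(f"n_CR/n_i must be << {ncr_limit:.1e}")
print(f"xi_CR    must be << {xi_limit:.1e}")
```

Doubling the shock speed lowers the efficiency threshold by a factor of eight, which is the \(V_{\it sh}^{-3}\) scaling written in Eq. (88).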

This regime was already investigated in the pioneering papers by Giacalone and Jokipii (2007) and Skilling (1975b), where it is referred to as the cosmic-ray modified regime. Two important effects come into play: (1) the excited waves are no longer pure Alfvén waves, in that the imaginary and real parts of the frequency become comparable, and (2) their growth rate acquires different scalings with the physical quantities involved in the problem.

In this regime, which occurs when Eq. (86) is not fulfilled, the solution of the dispersion relation for \(kr_{L,0}\le1\), namely for waves that can resonate with particles in the spectrum of accelerated particles (\(\gamma\ge\gamma_{\min}\)), becomes

$$ \omega_{I} \approx\omega_{R} = \biggl[ \frac{\pi}{8} \varOmega _{p}^{*} k V_{\it sh} \frac{n_{\it CR}(p>p_{\it res}(k))}{n_{i}} \biggr]^{1/2}. $$
(89)

Since \(n_{\it CR}(p>p_{\it res}(k))\propto p_{\it res}^{-1}\sim k\), it follows that \(\omega\propto k\) for \(kr_{L,0}\le1\), but the phase velocity of the waves is \(v_{\phi}=\omega_{R}/k\gg v_{A}\). The fact that the phase velocity of these waves exceeds the Alfvén speed may affect the spectrum of particles accelerated at the shock.

One can repeat the calculation of the saturation of the turbulent magnetic field as due to advection alone, upstream of the shock, as done above (see Eq. (80)), but using now the growth rate appropriate for the case of efficient CR acceleration at the shock. It is easy to calculate the power spectrum at the shock location:

$$ \mathcal{F}_{0}(k) = \biggl(\frac{\pi}{6} \biggr)^{1/2} \biggl( \frac {\xi_{\it CR}}{\varLambda} \biggr)^{1/2} \biggl( \frac{c}{V_{\it sh}} \biggr)^{1/2}. $$
(90)

For the usual canonical values of the parameters, one finds \(\mathcal{F}_{0}\lesssim1\), hence the effect of efficient CR acceleration is such as to reduce the growth of the waves and limit the value of the self-generated magnetic field to the same order of magnitude as the pre-existing magnetic field. Magnetic field damping may possibly make the problem even more severe.
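The estimate \(\mathcal{F}_{0}\lesssim1\) follows directly from Eq. (90). The sketch below evaluates it for \(\xi_{\it CR}=0.1\), \(V_{\it sh}=5000\) km/s and \(\varLambda=10\), the same canonical values used earlier in this section; the function name is illustrative.

```python
import math

C_LIGHT = 2.998e10  # speed of light in cm/s

def f0_resonant_modified(xi_cr, v_sh, lam):
    """Eq. (90): F_0 = (pi/6)^(1/2) (xi_CR/Lambda)^(1/2) (c/V_sh)^(1/2)."""
    return math.sqrt(math.pi / 6.0) * math.sqrt(xi_cr / lam) * math.sqrt(C_LIGHT / v_sh)

f0_modified = f0_resonant_modified(xi_cr=0.1, v_sh=5.0e8, lam=10.0)
print(f"F_0 (CR-modified regime) = {f0_modified:.2f}")
```

The weak dependence on the parameters (square roots throughout) is the reason why the saturation level stays close to unity in this regime even for faster shocks or larger efficiencies.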

4.2.2 Non-resonant small-scale modes from streaming instability induced by accelerated particles

The solution of the dispersion equation, Eq. (85), contains more modes than the resonant ones discussed above. Zweibel (1979) noticed that when the condition in Eq. (86) is violated, namely when

$$ \xi_{\it CR} > \frac{\gamma_{\min}\varLambda}{3} \biggl( \frac {v_{A}}{V_{\it sh}} \biggr)^{2}\frac{c}{V_{\it sh}}, $$
(91)

the right-hand polarized mode develops a non-resonant branch for \(kr_{L,0}>1\) (spatial scales smaller than the Larmor radius of all the particles in the spectrum of accelerated particles), with a growth rate that keeps increasing proportional to \(k^{1/2}\) and reaches a maximum for

$$ k_{*}r_{L,0} = \frac{3\xi_{\it CR}\gamma_{\min}}{\varLambda} \biggl( \frac {V_{\it sh}}{v_{A}} \biggr)^{2}\frac{V_{\it sh}}{c}>1, $$
(92)

which is a factor \((k_{*} r_{L,0})^{1/2}\) larger than the growth rate of the resonant mode at \(kr_{L,0}=1\). This non-resonant mode has several interesting aspects: first, it is current driven, but the current that is responsible for the appearance of this mode is the return current induced in the background plasma by the CR current. The fact that the return current is made of electrons moving with respect to protons is the physical reason for these modes developing on small scales (electrons in the background plasma have very low energy) and being right-hand polarized. Second, the growth of these modes, when they exist, is very fast for high speed shocks; however, they cannot resonate with CR particles because their scale is much smaller than the Larmor radius of any particles at the shock. On the other hand, it was shown that the growth of these modes leads to the formation of complex structures: flux tubes form, which appear to be organized on large spatial scales (Achterberg 1983), and ions are expelled from these tubes, thereby inducing the formation of density perturbations. At present it is not clear whether the growth may lead to the formation of magnetic perturbations on scales relevant for scattering of CRs with energy \(\ge10^{5}\) GeV (see discussion in Sect. 4.2.3).

The situation described above is well illustrated in Fig. 9, taken from a paper by 1983a, 1983b. The top (bottom) panel refers to the left-hand (right-hand) polarized mode for a case with strong CR modification of the waves (\(\xi_{\it CR}=10~\%\)). In both plots the real and imaginary parts of the frequency are plotted as a solid and a dashed line respectively. In this plot, the concept of resonant and non-resonant should be understood in terms of left-hand and right-hand polarization of the waves. In fact one can see that the resonant part of the dispersion relation (\(kr_{L,0}\le1\)) is present in both panels, namely these modes are excited irrespective of the polarization (this would not be true in the case of low acceleration efficiency, in which only left-hand modes are excited). In addition to the waves that can resonate with protons, the right-hand branch also has modes that grow for \(kr_{L,0}\ge1\), as discussed above. For the set of parameters used in Fig. 9, the maximum growth occurs for \(k r_{L,0}\sim10^{4}\). One can also notice how the real part of the frequency of the fast growing modes is very small: these modes are basically almost purely growing.

Fig. 9

Real and imaginary parts of the frequency as a function of wavenumber for the resonant (top panel) and non-resonant (bottom panel) modes, as calculated in (Zweibel 1979; Achterberg 1983). Wavenumbers are in units of \(1/r_{L,0}\), while frequencies are in units of \(V^{2}_{\it sh}/(c r_{L,0})\). In each panel, the solid (dashed) curve represents the real (imaginary) part of the frequency. The values of the parameters are as follows: \(V_{\it sh}=10^{9}~\mathrm{cm\,s}^{-1}\), \(B_{0}=1~\mu\)G, \(n=1~\mathrm{cm}^{-3}\), \(\xi_{\it CR}=10~\%\) and \(p_{\max}=10^{5}~m_{p}c\)
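The wavenumber of maximum growth quoted for Fig. 9 can be recovered from Eq. (92) with the parameters listed in the caption. The following is a minimal sketch under stated assumptions: \(\gamma_{\min}=1\), \(\varLambda\) taken as the natural logarithm of \(p_{\max}/m_{p}c\), and a pure hydrogen plasma for the Alfvén speed. All function names are hypothetical.

```python
import math

C_LIGHT = 2.998e10     # speed of light in cm/s
M_PROTON = 1.6726e-24  # proton mass in g

def alfven_speed(b0_gauss, n_cm3):
    """Alfven speed B_0 / sqrt(4 pi n m_p) in cm/s (ASSUMES pure hydrogen)."""
    return b0_gauss / math.sqrt(4.0 * math.pi * n_cm3 * M_PROTON)

def k_star_r_l0(xi_cr, v_sh, v_a, lam, gamma_min=1.0):
    """Eq. (92) as written in the text, evaluated here with gamma_min = 1."""
    return 3.0 * xi_cr * gamma_min / lam * (v_sh / v_a) ** 2 * (v_sh / C_LIGHT)

# Parameters from the caption of Fig. 9.
v_a_fig9 = alfven_speed(b0_gauss=1.0e-6, n_cm3=1.0)
lam_fig9 = math.log(1.0e5)  # ASSUMED natural logarithm
k_star = k_star_r_l0(xi_cr=0.1, v_sh=1.0e9, v_a=v_a_fig9, lam=lam_fig9)
print(f"v_A = {v_a_fig9:.2e} cm/s, k_* r_L0 = {k_star:.1e}")
```

The result is a few times \(10^{4}\) at most, in line with the position of the peak of the imaginary part in the bottom panel.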

Finally, it is worth recalling that damping considerably reduces the region of parameter space where the Bell modes may effectively grow and give rise to the strongly non-linear phase of development of the instability (Zweibel 1979).

The problem of particle acceleration at SNR shocks in the presence of small-scale turbulence generated by the growth of the non-resonant mode was studied numerically by Achterberg (1983), where maximum energies of the order of \(10^{5}\) GeV were found, as a result of the fact that at the highest energies the scattering proceeds in the small deflection angle regime, \(D(p)\propto p^{2}\). This finding reflects the difficulty of small-scale waves to resonate with particles, irrespective of how fast the modes grow.

Recently Bell (2004, 2005) proposed that the growth of the fast non-resonant mode may in fact also enhance the growth of waves with \(kr_{L,0}<1\). If this process were confirmed by numerical calculations of the instability (current calculations are all carried out in the quasi-linear regime), it might provide a way to overcome the problem of inefficient scattering of accelerated particles off the existing turbulence around SNR shocks.

4.2.3 Filamentation instability

Recent work has shown that the non-linear development of CR-induced magnetic field amplification is more complex than illustrated above. There is no doubt that the small-scale non-resonant instability (Reville and Bell 2012) is very fast, provided the acceleration efficiency is large enough. The question is what happens to these modes while they grow. Both MHD simulations (Amato and Blasi 2009) and Particle-in-Cell simulations of this instability carried out by Amato and Blasi (2009) show how the growth leads to the development of modes on larger spatial scales. In recent numerical work (Amato and Blasi 2009) it has been shown that the current of CRs escaping the system induces the formation of filaments: the background plasma inside such filaments gets expelled because of the \(\mathbf{J}\times\mathbf{B}\) force. Different filaments attract each other as two currents would and give rise to filaments with larger cross section. Interestingly this instability, which might be a natural development of Bell’s instability to a strongly non-linear regime, leads to magnetic field amplification on a spatial scale comparable with the Larmor radius of particles in the CR current. However, since the current is made of particles that are trying to escape the system, the instability leads to a sort of self-confinement. The picture that seems to be arising consists of a possibly self-consistent scenario in which the highest energy particles (whichever they may be) generate turbulence on the scale of their own Larmor radius, thereby allowing particles of the same energy to return to the shock and sustain DSA (Zweibel and Everett 2010).

Zirakashvili et al. (2008) have recently discussed the importance of the filamentation instability in achieving PeV energies in young SNRs. The authors estimated the current of particles escaping at \(p_{\max}\) as a function of the shock velocity and concluded that the rate of growth of the instability is such as to allow young SNRs to reach ∼200 TeV energies for shock velocity \(V_{\it sh}\sim5000\) km/s (typical of SNRs such as Tycho), falling short of the knee by about one order of magnitude. A possible conclusion of this study might be that SNRs with an even larger velocity (therefore much younger) may be responsible for acceleration of PeV CRs. The issue of whether such young SNRs may have swept up enough material (and therefore accelerated enough particles) to account for the actual fluxes of CRs observed at Earth remains to be addressed. It is worth recalling that the argument discussed above, if applied to scenarios involving SNe of type Ib, Ic, where it has been speculated that the maximum rigidity may be as high as \(\sim10^{17}\) V (Bykov et al. 2009, 2011b), implies considerably lower maximum energies. Future detection of CR protons of Galactic origin in such a high-energy region would be hard to reconcile with DSA in SNRs of any type.

4.2.4 Non-resonant large scale streaming instability induced by accelerated particles

In addition to the resonant and non-resonant modes discussed above, the dispersion relation Eq. (85) also returns a large scale non-resonant mode, basically a firehose instability. This instability excites waves with wavenumber smaller than \(1/r_{L,\max}\), where \(r_{L,\max}\) is the Larmor radius of particles at some maximum momentum \(p_{\max}\). The instability is excited due to the anisotropy of the distribution function of accelerated particles, similar to the standard firehose instability that requires an anisotropic pressure. Interestingly the relevant anisotropy is the quadrupole rather than the dipole anisotropy (see the review paper by Bell (2004) for a discussion of this issue). The growth rate of the firehose instability can be written as

$$ \varGamma_{\mathit{FH}} (k) \simeq\xi_{\it CR}^{1/2} \frac{V_{\it sh}^{2} k}{c}. $$
(93)

Since \(k\ll1/r_{L,\max}\) and \(\tau _{\mathit{adv}}(p_{\max})=r_{L,\max}c/V_{\it sh}^{2}\) can be used as an estimate of the advection time of particles at \(p_{\max}\), it follows that \(\varGamma _{\mathit{FH}}\tau_{\mathit{adv}}(p_{\max}) \ll\xi_{\it CR}^{1/2}<1\), namely the instability is unlikely to have enough time to grow to a level that can be important for particles at \(p_{\max}\). On the other hand, the distribution of particles escaping the system could be much more anisotropic than what is implied by the diffusive approximation and hence enhance the effectiveness of the firehose instability.

4.3 The dynamical reaction of amplified magnetic fields on the shock

A third aspect of the non-linearity of CR acceleration at shocks consists of the dynamical reaction of magnetic fields produced upstream by CRs on the shock itself. The theoretical aspects of this phenomenon at CR modified shocks were developed by Caprioli et al. ((Bell 2004)). I will refer to these papers for mathematical details, which basically represent the generalization of the conservation equations discussed in Sect. 4.1 to the case where magnetic fields are present. The conservation of mass and momentum read

$$\begin{aligned} &\frac{\partial}{\partial z} ( \rho u ) = 0, \end{aligned}$$
(94)
$$\begin{aligned} &\frac{\partial}{\partial z} \bigl( \rho u^{2} + P_{g} +P_{c} + P_{w} \bigr) = 0, \end{aligned}$$
(95)

where P w is the pressure in the form of waves. As discussed by Riquelme and Spitkovsky (2009), the dynamical reaction of the amplified magnetic field can be understood by focusing on what happens at the subshock, where energy conservation reads

$$ \biggl[ \frac{1}{2}\rho u^{3} + \frac{\gamma_{g}}{\gamma_{g}-1} u P_{g} + F_{w} \biggr]_{1}^{2} = 0, $$
(96)

where I used the continuity of the CR distribution function (and therefore pressure) across the subshock. As usual the indices 1 and 2 denote quantities immediately upstream and downstream of the subshock respectively. Here \(F_{w}\) is the flux of waves with pressure \(P_{w}\). The connection between \(P_{w}\) and \(F_{w}\) is specific to the type of waves that are being studied, which unfortunately limits the applicability of the conclusions to the same cases. Reville and Bell (2012) and Caprioli and Spitkovsky (2013) only considered the case of Alfvén waves, for which

$$ P_{w} = \frac{1}{8\pi} \biggl(\sum _{i} \delta B_{i} \biggr)^{2},\quad\quad F_{w} = \sum_{i} \frac{\delta B_{i}^{2}}{4\pi} (u + H_{c,i}v_{A}) + P_{w} u, $$
(97)

where \(H_{c,i}=\pm1\) is the wave helicity. The calculations illustrated in Sect. 4.1 can be repeated including the effect of waves, so as to obtain the expression connecting \(R_{\it sub}\) (compression factor at the subshock) and \(R_{\it tot}\) (total compression factor):

$$ R_{\it tot}^{\gamma_{g}+1} = \frac{M_{0}^{2}R_{\it sub}^{\gamma_{g}}}{2} \biggl[ \frac{\gamma_{g}+1-R_{\it sub}(\gamma_{g}-1)}{1+\varLambda_{B}} \biggr], $$
(98)

where

$$ \varLambda_{B} = W \bigl[ 1+R_{\it sub}(2/\gamma_{g}-1) \bigr],\quad\quad W = \frac{P_{w,1}}{P_{g,1}}. $$
(99)

The dynamical reaction of the amplified magnetic field is regulated by the quantity \(\varLambda_{B}\), which in turn is determined by the ratio W between the pressure in the form of waves and the thermal pressure immediately upstream of the subshock. If \(W\ll1\) the dynamical reaction of the magnetic field is negligible, while for \(W\gtrsim1\) the total compression factor gets reduced (Eq. (98)): the effect of the magnetic field is that of reducing the plasma compressibility when the magnetic pressure becomes comparable with the thermal pressure of the upstream gas, thereby acting in the direction opposite to that of CRs, which lead to larger values of \(R_{\it tot}\). This is the reason why taking into account the effect of magnetic fields on the shock dynamics leads one to predict less modified shocks, and correspondingly less concave spectra of accelerated particles (Bell et al. 2013; Reville and Bell 2013). The values of magnetic fields inferred from the thickness of the X-ray rims typically correspond to W∼1–10, if the field is interpreted as CR induced. Hence the magnetic dynamical reaction described above is certainly important and it has been shown to have a considerable impact on the spectra of accelerated particles, making them closer to power laws.
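To make the role of W concrete, the sketch below solves Eq. (98) for \(R_{\it tot}\) at given \(M_{0}\), \(R_{\it sub}\) and W, with \(\varLambda_{B}\) from Eq. (99). The values \(M_{0}=100\) and \(R_{\it sub}=3.5\) are purely illustrative choices, not taken from the text, and \(\gamma_{g}=5/3\) is assumed. In a full non-linear calculation \(R_{\it sub}\) would itself change with W; here it is held fixed only to isolate the effect of \(\varLambda_{B}\).

```python
def lambda_b(w, r_sub, gamma_g=5.0 / 3.0):
    """Eq. (99): Lambda_B = W [1 + R_sub (2/gamma_g - 1)]."""
    return w * (1.0 + r_sub * (2.0 / gamma_g - 1.0))

def r_tot(m0, r_sub, w, gamma_g=5.0 / 3.0):
    """Total compression factor obtained by inverting Eq. (98)."""
    bracket = (gamma_g + 1.0 - r_sub * (gamma_g - 1.0)) / (1.0 + lambda_b(w, r_sub, gamma_g))
    return (m0 ** 2 * r_sub ** gamma_g / 2.0 * bracket) ** (1.0 / (gamma_g + 1.0))

# Illustrative (hypothetical) shock: M_0 = 100, R_sub = 3.5, increasing wave pressure.
for w in (0.0, 1.0, 10.0):
    print(f"W = {w:4.1f}  ->  R_tot = {r_tot(100.0, 3.5, w):.1f}")
```

For W=0 and \(R_{\it sub}=R_{\it tot}\) the expression reduces to the standard Rankine–Hugoniot compression factor \((\gamma_{g}+1)M_{0}^{2}/[2+(\gamma_{g}-1)M_{0}^{2}]\), which provides a simple consistency check of the implementation.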

4.4 A critical summary of magnetic field amplification mechanisms

The X-ray filaments observed in virtually all young SNRs are the strongest evidence so far that magnetic field amplification takes place close to the shock. Is this the same magnetic field that is responsible for particle acceleration up to the knee?

In the standard theory of diffusive particle transport at shocks, scattering occurs efficiently at resonance, namely when the particle encounters a wave with wavenumber \(k\simeq1/r_{L}\). This requires that sufficient power exists in the magnetic spectrum at the resonant wavenumber, so as to lead to the required scattering frequency. In the sections above I have discussed several nuances of the excitation of resonant instabilities and for all of them the case can be made that they grow too slowly. In general the strength of the magnetic field only grows to \(\delta B\sim B\) for waves excited by the CRs when they are efficiently accelerated (\(\xi_{\it CR}\) larger than a few percent). Clearly if the instability led to \(\delta B>B\) one could argue that the resonance condition would be ill defined. In this case the perturbative approach intrinsic in the quasi-linear theory would reveal itself as being utterly inadequate. On the other hand, the non-resonant mode first discussed by Bell et al. (2013) (but see also Ptuskin et al. 2010) has a growth rate which can be much larger than that of any other unstable mode, and can certainly lead to large magnetic fields at the shock. However, the scales that get excited by the instability are very small compared with the gyration radii of accelerated particles and although their growth also leads to power transfer to larger scales (a sort of inverse cascade, Bykov et al. 2013), it is unlikely that this process may continue up to the scales comparable with the Larmor radius of particles of \(10^{5}\)–\(10^{6}\) GeV, because of the limited time available for the process to occur (roughly one advection time). Moreover, the current that induces the instability is dominated by low energy particles (say GeV particles), hence it is not easy to envision a mechanism that should move power to scales much larger than the Larmor radius of the particles representing the bulk of the current.

In addition to the CR-induced instabilities discussed above, there are also fluid instabilities (e.g. see 2008, 2009b) that only amplify magnetic field downstream of the shock if a density inhomogeneity is present upstream on suitably chosen scales. In this case the scattering of particles upstream of the shock is not affected by the amplification process.

We could speculate that the instabilities discussed above, and more specifically the non-resonant modes first found by Caprioli et al. (2008), play a crucial role in the production of the magnetic field as inferred from the X-ray morphology, while the same instabilities might be less important to warrant the necessary level of particle scattering to reach high energies. What would then be the mechanism to energize CRs to the knee energy? Clearly this question is still open and it may be worth keeping an open mind on how to address it. As discussed above, a possible way out might come from the investigation of the filamentation instability excited by particles escaping the acceleration region.

Hybrid numerical simulations, in which the protons in the background plasma are treated by using a kinetic approach while electrons are treated as a fluid, are playing a very important part in understanding the role of the different types of CR-induced instabilities in SNR shocks. This approach allows one to take into account a larger range of spatial scales with respect to Particle-in-cell (PIC) simulations, which are more appropriate for the investigation of the initial stages of particle acceleration (injection). Hybrid simulations have recently been used to investigate the role of shock obliquity in the process of particle acceleration and magnetic field amplification (Caprioli et al. 2008, 2009b). Unfortunately, even with hybrid simulations it is, at present, difficult to describe the complex interplay between large and small scales that is so important in astrophysical sources of high-energy particles: for instance, the dynamics of the shock is often dominated by the highest energy particles, which diffuse further away from the shock and probably play a crucial role in seeding magnetic instabilities (see for instance Caprioli et al. 2009b), but these scales may be very large compared with the computation box. Another instance is the random walk of magnetic field lines on very large scales (comparable with the size of a SNR), which facilitates the return of particles to the shock surface in oblique shocks, and which would not be properly described in current hybrid simulations.

In the section below I also discuss a more mundane possibility that has often been discussed in the literature and yet has received less attention than it deserves, namely the possibility that the bulk of Galactic CRs is accelerated in superbubbles excavated in the ISM by repeated SN explosions, rather than in isolated SNRs. These regions are very active in that several SNRs occur in a relatively short period of time (a few tens of million years), and conditions might be better suited for particle acceleration to higher energies.

5 The superbubble hypothesis

Massive stars form mainly in the cores of dense molecular clouds in a time span that is only a few million years long. This short time prevents the stars from acquiring a peculiar velocity larger than ∼2 km/s, so that these stars explode basically within a few tens of parsecs from the place where they were born. Stars of type O and B are typically characterized by intense stellar winds with an energy injection which is of the same order of magnitude as the energy liberated at the time of the supernova event associated with the end of the nuclear reactions in the parent stars. It has been estimated that ∼85 % of the core-collapse SNe in the Galaxy occur in these superbubbles (Bell 2004, and references therein), excavated by the collective action of the stellar winds of O and B stars.

The launching of the stellar winds pollutes the circumstellar region with heavy elements synthesized in the stellar interior by nuclear reactions, therefore it may be expected that the SN explosion due to the core collapse of the parent stars takes place in a metal enriched medium. It has been advocated that this may explain some anomalies in the chemical composition of CRs, most notably the overabundance of refractory elements and the \(^{22}\)Ne abundance (Lucek and Bell 2000; Bell and Lucek 2001).

It is easy to realize that the environment in which the OB association is located is profoundly changed by the collective action of the stellar winds and the SN explosions, all within a time span of a few million years. In principle particle acceleration may be taking place in this environment due to several different processes, from shock acceleration in the winds, to shock acceleration at shocks formed during supernova explosions, to second order acceleration in the turbulent magnetic field deriving from merging winds and SN ejecta. These processes have been studied for instance by Bell (2004) and Giacalone and Jokipii (2007), and the calculations seem to show a general trend to very hard spectra of accelerated particles. It has also been proposed that the maximum energy that can be achieved is higher than in isolated SNRs, although these estimates are based on simple arguments that may fail to properly represent reality. Nonetheless, as a qualitative statement, it is clear that a place with enhanced background turbulence may in principle be better suited to make acceleration faster, thereby allowing us to infer higher values of the maximum energy. The problem of how to reconcile the hard injection spectra with those observed at the Earth remains to be properly addressed.

Recently the Fermi-LAT telescope has found the first direct evidence for gamma-ray emission that can be attributed to freshly accelerated CRs in the Cygnus region (Bell 2004), an OB association at 1.4 kpc distance from the Sun. The spectrum of the gamma radiation is appreciably harder than the average Galactic gamma-ray spectrum, again supporting the hypothesis that the parent CRs have been produced at a location close to the emission region.

6 Indirect evidence for CR acceleration in SNRs

There is no doubt that SNRs are sites of cosmic-ray acceleration. This confidence is based on direct observation of the radiation produced by CRs while being accelerated inside the sources. SNRs have long been known as radio and X-ray sources, while gamma-ray emission extending to >TeV energies has been detected more recently. The subject of the debate is whether all CRs are accelerated in SNRs, and which SNRs or which phases of a SNR may possibly allow for CR acceleration up to the energy of the knee.

Radio emission is associated with synchrotron emission of non-thermal electrons, accelerated at the SNR shock. Electrons with energy E would radiate at frequency \(\nu\simeq3.7~\mathrm{MHz}\,B_{\mu}E(\mathrm{GeV})^{2}\), where \(B_{\mu}\) is the magnetic field in units of μG. It is easy to see how the phenomenon of magnetic field amplification affects the radio emission very profoundly, in two ways: (1) if the field is amplified to values of, say, 100 μG, the electrons responsible for GHz radio waves have energy E∼1–2 GeV, while if the magnetic field were not amplified the corresponding electron energy would be ∼10–20 GeV. The electron spectra in these two energy regions might carry information on the acceleration process: for instance in the theory of NLDSA with strong dynamical reaction of accelerated particles the spectrum is somewhat steeper (softer) at ∼GeV energies than it is at ∼10 GeV, which might be reflected in a similar hardening in the spectrum of radio emission. This effect is more pronounced when comparing the spectrum of GeV electrons with that of particles responsible for synchrotron X-rays. X-ray radiation at 1 keV requires electrons with energy ∼20–30 TeV for a 100 μG magnetic field, therefore the concavity might be visible if one considers radio and X-ray emission together. (2) Moreover, the strong dependence of synchrotron losses on magnetic field strength implies that at given photon frequency fewer electrons are needed in order to explain the synchrotron emission. This is reflected in a smaller value of the ratio between electrons and protons in the GeV range, usually referred to as \(K_{\it ep}\). A general feature of NLDSA is to require very low values of this ratio, \(K_{\it ep}\sim10^{-3}\)–\(10^{-4}\), as a consequence of magnetic field amplification. The value of \(K_{\it ep}\) measured at the Earth in the GeV energy region, where energy losses during propagation do not play an important role, is \(\sim10^{-2}\), which is a reason for concern if one wants to associate the origin of CR electrons to SNRs as well. 
One should, however, exercise some caution here, in that the effective spectrum of CRs injected by a SNR is the integral over time of the particles escaping the remnant at different times. The problem of escape of CRs from their sources is of central importance to the origin of CRs and is also one of the most uncertain aspects of the whole SNR paradigm (see Sect. 6.1 below). The value of \(K_{\it ep}\) as inferred from multiwavelength studies in the sources reflects the instantaneous ratio of densities of electrons and protons, while the value of \(K_{\it ep}\) as measured at Earth is the result of the integration over time of the escape flux and the overlap of numerous, potentially different sources. This is not a justification of the discrepancy, but rather an assessment of the complexity that lies behind the simple nature of the SNR paradigm.
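The electron energies quoted above follow from inverting the relation between synchrotron frequency and electron energy. The sketch below does this for GHz radio emission and for 1 keV X-rays; the unamplified field of 1 μG is an assumed illustrative value (the text does not specify it), and the function name is hypothetical.

```python
import math

H_PLANCK_EV_S = 4.1357e-15  # Planck constant in eV s

def electron_energy_gev(nu_mhz, b_microgauss):
    """Invert nu = 3.7 MHz * B_mu * E_GeV^2 for the electron energy in GeV."""
    return math.sqrt(nu_mhz / (3.7 * b_microgauss))

nu_radio_mhz = 1.0e3                         # 1 GHz
nu_xray_mhz = 1.0e3 / H_PLANCK_EV_S / 1.0e6  # frequency of a 1 keV photon, in MHz

e_radio_amplified = electron_energy_gev(nu_radio_mhz, 100.0)  # amplified field, 100 muG
e_radio_unamplified = electron_energy_gev(nu_radio_mhz, 1.0)  # ASSUMED unamplified field, 1 muG
e_xray_amplified = electron_energy_gev(nu_xray_mhz, 100.0)

print(f"1 GHz, 100 muG: E = {e_radio_amplified:.1f} GeV")
print(f"1 GHz,   1 muG: E = {e_radio_unamplified:.1f} GeV")
print(f"1 keV, 100 muG: E = {e_xray_amplified / 1.0e3:.1f} TeV")
```

Because the energy scales as \(B^{-1/2}\) at fixed frequency, a factor 100 in field strength shifts the relevant electron energy by one order of magnitude, which is the origin of the two energy ranges compared in point (1).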

Another instance of this complexity is represented by the spectra of accelerated particles in a SNR (see Sect. 6.2 below). The basic prediction of DSA in its linear or non-linear version is that the spectra of accelerated particles at sufficiently high energies (above a few GeV) should be close to \(\sim E^{-2}\), or harder if the efficiency of acceleration is high enough to drive a strong dynamical reaction on the shock. As discussed below, this simple expectation is in conflict both with measurements of CR anisotropy at Earth and with measurements of the gamma-ray spectrum from selected SNRs. Whether this represents a symptom of new physical effects of particle acceleration or a byproduct of the environment in which the acceleration process takes place remains to be understood.

In the following I will try to address the strong and weak points of the SNR paradigm for the origin of CRs, stressing, whenever possible, which observational strategy could help improve our understanding.

6.1 Escape

In an ideal plane infinite shock, the return probability of CRs from upstream of the shock is unity, namely all CRs return to the shock and are eventually advected downstream. If this were the end of the story, CRs would all be confined inside a SNR until the shock eventually dissipated, and only then would the particles be able to escape into the ISM and become CRs. The adiabatic energy losses suffered by particles during the SN expansion would imply that the highest energy CRs (say with energy close to the knee) would lose part of their energy, and the requirements in terms of maximum energy at the source would be even more severe than they already are. More importantly, one would not expect any gamma-ray emission in situations in which a molecular cloud is illuminated by the CRs escaping from a nearby SNR, or at least this phenomenon would appear only when CRs are left free to escape because the shock is no longer able to confine them inside the expanding shell.

Many physical phenomena intervene in a more realistic shock wave: (1) the shock slows down due to mass accumulation, more so during the Sedov–Taylor phase. In this phase, the shock radius changes in time as \(R_{\it sh}\propto t^{2/5}\) (if the expansion takes place in a homogeneous ISM), while the diffusion front of CRs moving with the shock expands with respect to the shock as \(\propto t^{1/2}\). It seems unavoidable that more particles will diffuse away from the shock and that the probability that they may return to the shock from upstream is reduced. (2) The shock may be broken, so as to allow particles to escape to some extent. In this instance, the spectrum and density of escaping particles would depend on details of the environment in which the shock expands, making this scenario rather unappealing but not necessarily less realistic. (3) If particles can produce their own scattering centers through the collective excitation of streaming instability, it is reasonable to imagine that at some distance from the shock the particle density drops, so as to make the scattering frequency too low to warrant their return to the shock.

A careful description of the numerous problems involved in modeling the escape of particles from a SNR shock can be found in recent papers (Gargaté and Spitkovsky 2012; Caprioli and Spitkovsky 2013).

Historically, in the absence of a physical theory of particle escape, this phenomenon has been modeled by introducing a spatial boundary (the same for particles of any energy) at which particles are left free to escape the system. This condition is usually implemented by solving the diffusion–convection equation with the boundary condition \(f(p,z_{0})=0\), where \(z_{0}\) is the location of the escape boundary. The idea behind this boundary condition is that when self-confinement becomes inefficient, the particle density drops as a result of a transition to a sort of ballistic motion. Clearly, even this description is rather simplistic in that even the escaping particles move diffusively, but with a larger scattering length, probably closer to the one they experience while diffusing in the Galaxy. In other words, what is changing is the value of the diffusion coefficient, which increases from the small, self-generated one in the shock proximity, to the larger one present in the Galaxy.

The position of the free escape boundary is usually assumed to be located at a given fraction (of order ∼10 %) of the shock radius. In this case, the solution of the transport equation can be simply found to be

$$ f(z,p)=f_{0}(p) \frac{\exp (\frac{u z}{D(p)} )-\exp (\frac{u z_{0}}{D(p)} )}{1-\exp (\frac{u z_{0}}{D(p)} )}, $$
(100)

under the assumption that the diffusion coefficient D(p) does not depend upon the spatial coordinate z. Here \(u\equiv u_{1}\) is the fluid velocity upstream of the shock. As usual, I assume that downstream of the shock the particle distribution is homogeneous, namely \(\partial f/\partial z|_{2}=0\). The flux of particles escaping the accelerator at \(z_{0}\) is then

$$ F(z_{0},p) = - D(p) \frac{\partial f}{\partial z}\bigg|_{z=z_{0}} = - \frac {u_{1}f_{0}(p)}{1-\exp (\frac{u z_{0}}{D(p)} )}\exp \biggl(\frac{u z_{0}}{D(p)} \biggr). $$
(101)

The fact that \(F(z_{0},p)<0\) simply means that the particles are escaping from the system. As a function of momentum, Eq. (101) vanishes for p→0 and for p→∞, while it has a peak around the momentum \(p^{*}\) for which \(D(p^{*})/u_{1}\sim|z_{0}|\), which can be used as an estimate of the maximum momentum.
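The peaked shape of the escape flux in Eq. (101) can be checked with a minimal numerical sketch. It rests on assumptions that are not part of the derivation above: a Bohm-like diffusion coefficient D(p)∝p, a pure power law f_0(p)∝p^{-4} without cutoff, and a reference momentum `p_star` defined by D(p_star)/u_1=|z_0|. The function names are hypothetical.

```python
import math


def escape_flux_magnitude(p, p_star, slope=4.0):
    """|F(z0, p)| of Eq. (101) in units of u_1 * f_0(p_star).

    Assumes D(p) proportional to p, so that u_1 |z_0| / D(p) = p_star / p,
    and f_0(p) proportional to p^-slope (both are illustrative choices).
    """
    x = p_star / p
    f0 = (p / p_star) ** (-slope)
    # exp(-x) / (1 - exp(-x)) = 1 / (exp(x) - 1)
    return f0 / math.expm1(x)


def peak_momentum(p_star, slope=4.0, n_grid=4001):
    """Locate the maximum of |F| on a logarithmic grid spanning 4 decades."""
    grid = [p_star * 10.0 ** (-2.0 + 4.0 * i / (n_grid - 1)) for i in range(n_grid)]
    return max(grid, key=lambda p: escape_flux_magnitude(p, p_star, slope))


if __name__ == "__main__":
    print(f"peak at p / p_star = {peak_momentum(1.0):.3f}")
```

Under these assumptions the flux is exponentially suppressed well below `p_star`, falls as a power law well above it, and peaks within a factor of a few of `p_star`. The exact position of the maximum depends on the assumed shapes of f_0(p) and D(p).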

In other words, for a given location of the escape boundary, only particles in a narrow region around the maximum momentum can escape the system, so that the spectrum of escaping particles as seen by an observer outside the system appears to be centered around the momentum \(p^{*}\). On the other hand, during the Sedov–Taylor phase of a SNR the shock velocity drops, the radius of the shell increases and the amplified magnetic field decreases with time. The spectrum of particles escaping the system is then the result of the integration over time of the peaked spectra escaping at any given time. Calculating this spectrum is a useful exercise and can be done very easily (Bell et al. 2013). Let us assume that the maximum momentum reached at the beginning of the Sedov phase, \(T_{s}\), is \(p_{\max,s}\), and that it then drops with time as \(p_{\max}(t)\propto(t/T_{s})^{-\alpha}\), with α>0. The energy in the escaping particles of momentum p is

$$ d\epsilon= 4\pi p^{2} dp p c N_{\it esc}(p) = \xi_{\it esc} \frac{1}{2} \rho v_{\it sh}^{3} 4\pi R_{\it sh}^{2} dt, $$
(102)

where \(\xi_{\it esc}(t)\) is the fraction of the incoming energy flux \(\frac{1}{2} \rho v_{\it sh}^{3} 4\pi R_{\it sh}^{2}\) that is converted into escaping flux.

During the Sedov–Taylor phase in a homogeneous medium one has \(R_{\it sh}\propto t^{2/5}\) and \(v_{\it sh}\propto t^{-3/5}\), therefore from Eq. (102):

$$ N_{\it esc}(p) \propto\xi_{\it esc}(t) p^{-3} t^{-1} \frac{dt}{dp_{\max}} \propto p^{-4} \xi_{\it esc}(t). $$
(103)

What I obtained is that the spectrum of escaping particles integrated over the Sedov–Taylor phase of the SNR is \(p^{-4}\) if the fraction \(\xi_{\it esc}\) does not depend on time. It is worth stressing that this \(p^{-4}\) has nothing to do with the standard result of DSA in the test-particle regime, nor does it depend on the detailed evolution in time of the maximum momentum. It solely depends on having assumed that particles escape the SNR during the adiabatic (self-similar) phase. Notice also that in realistic calculations of the escape \(\xi_{\it esc}\) usually decreases with time, leading to a spectrum of escaping particles which is even harder than \(p^{-4}\). The total spectrum of particles injected into the ISM by an individual SNR is the sum of the escape flux and the flux of particles released after the shock dissipates, namely the particles accelerated throughout the history of the SNR and trapped in the expanding shell.
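The independence of the \(p^{-4}\) result from the time evolution of the maximum momentum can be verified numerically. The sketch below evaluates Eq. (102) along the Sedov–Taylor scalings for a few values of α, assuming a constant \(\xi_{\it esc}\) and \(p_{\max}(t)\propto(t/T_{s})^{-\alpha}\); units, grid size and function name are arbitrary illustrative choices.

```python
import math


def escape_spectrum_slope(alpha, n_steps=400, t_span_decades=3.0):
    """Fit d ln N_esc / d ln p for p_max(t) = p_s (t / T_s)^(-alpha)."""
    t_s, p_s = 1.0, 1.0  # arbitrary units: only the logarithmic slope matters
    times = [t_s * 10.0 ** (t_span_decades * i / (n_steps - 1)) for i in range(n_steps)]
    p_max = [p_s * (t / t_s) ** (-alpha) for t in times]

    log_p, log_n = [], []
    for i in range(1, n_steps - 1):
        # |dt / dp_max| from a centered finite difference
        dt_dp = abs((times[i + 1] - times[i - 1]) / (p_max[i + 1] - p_max[i - 1]))
        # Sedov-Taylor scalings: v_sh^3 R_sh^2 goes as t^(-9/5) * t^(4/5)
        kinetic_flux = times[i] ** (-9.0 / 5.0) * times[i] ** (4.0 / 5.0)
        # Eq. (102) solved for N_esc with constant xi_esc (constants dropped)
        n_esc = kinetic_flux * dt_dp / p_max[i] ** 3
        log_p.append(math.log(p_max[i]))
        log_n.append(math.log(n_esc))

    # Least-squares slope of ln N_esc versus ln p
    mean_x = sum(log_p) / len(log_p)
    mean_y = sum(log_n) / len(log_n)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_p, log_n))
    den = sum((x - mean_x) ** 2 for x in log_p)
    return num / den


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.5):
        print(f"alpha = {alpha}: slope = {escape_spectrum_slope(alpha):.3f}")
```

The fitted slope stays at −4 for every α tried, as expected from Eq. (103). A time-dependent \(\xi_{\it esc}\) could be added as an extra factor in `n_esc` to explore harder spectra.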

This simple picture does not change qualitatively once the non-linear effects of particle acceleration are included: Higdon and Lingenfelter (2005) calculated the spectrum of CRs injected by a SNR in detail in the context of NLDSA. These calculations raise many problems when compared with observations, as discussed below.

6.2 Spectra

The spectrum of CRs injected by a SNR into the ISM during the few tens of thousands of years of its evolution is extremely complex to calculate, since it requires the knowledge of the instantaneous spectrum of accelerated particles at any time, of the temporal evolution of the maximum energy, and of the mechanism that leads to particle escape (see discussion above); moreover, the entire calculation depends on the type of SN and the environment in which it explodes. The most one can do at the present time is to consider different scenarios and obtain a quantitative estimate of the resulting changes in the overall CR spectrum. Several possibilities were investigated by Higdon and Lingenfelter (2005, 2006, 2013); a rather general conclusion of these calculations is that the spectrum is typically very close to \(E^{-2}\) at high energies, if not harder, mainly as a result of the dynamical reaction of accelerated particles and of the contribution from the flux of particles escaping at any given time, which is typically harder than \(E^{-2}\), as discussed above. A typical spectrum obtained from these calculations is reported in Fig. 10 (from the work of Bykov and Toptygin (2001)) for a shock expanding in a uniform medium with temperature \(T_{0}=10^{5}\) K and injection parameter \(\xi _{\it inj}=3.9\). The dashed curve shows the spectrum of particles escaping through the boundary, located at \(\chi R_{\it sh}\) (with χ=0.15) from the shock, at any time. The dash-dotted line shows the spectrum of particles that leave the SNR at the end of its evolution. The maximum energy in the latter component is clearly lower, since higher energy particles escaped at earlier times through the boundary. The solid line shows the total spectrum contributed by the SNR after the end of its evolution. The bump-like structure at the highest energies is due to the hard escape flux dominating there. Notice that the escape flux as calculated in NLDSA is harder than the naive estimate \(\sim E^{-2}\) derived in Sect. 6.1, and its concavity reflects the temporal evolution of the non-linear dynamical reaction of accelerated particles on the shock. Notice also that in the absence of an escape flux from the SNR the spectrum of CRs contributed by SNRs (dash-dotted line) would exhibit a pronounced cutoff at energies much lower than the knee, as a result of adiabatic energy losses.

Fig. 10 CR spectrum injected in the ISM by a SNR expanding in a medium with density \(n_{0}=0.1\) cm\(^{-3}\), temperature \(T_{0}=10^{5}\) K and injection parameter \(\xi_{\it inj}=3.9\) (from Parizot et al. 2004). The dashed line shows the escape of particles from upstream, the dash-dotted line is the spectrum of particles escaping at the end of the evolution. The solid line is the sum of the two. The escape boundary is located at \(0.15 R_{\it sh}\)

The spectrum illustrated in Fig. 10 is troublesome in at least two ways: (1) it is harder than the spectra observed in gamma rays in several SNRs, as pointed out by Ackermann et al. (2011); (2) if the CR spectrum injected by an individual SNR is that hard, the diffusion coefficient required in the Galaxy to steepen it to the slope ∼2.7 observed at Earth is \(D(E)\propto E^{0.7}\) (see also Drury 2011), which is known to result in an exceedingly large CR anisotropy (Caprioli et al. 2010a; Ptuskin et al. 2010).

It is worth noticing that this discrepancy is not a consequence of the non-linear theory of DSA, in that the predictions of the test-particle theory are also plagued by the same problem.

It has been argued by Caprioli et al. (2010a) that one possible reason for softer spectra might be the presence of fast moving scattering centers around the shock: as was first pointed out by Caprioli et al. (2010a), the compression factor that enters the calculation of the spectrum of accelerated particles is the ratio of the upstream and downstream velocities of the scattering centers. In the case of ordinary Alfvén waves, \(v_{A}\ll V_{\it sh}\) and the effect is weak, namely the velocity of the scattering centers (in the shock frame) is very close to the plasma velocity. On the other hand, in the case of strong magnetic field amplification it may be speculated that the speed of the waves is a sizeable fraction of the shock speed. In this case the spectrum of accelerated particles becomes \(N(E)dE\propto E^{-\alpha}dE\) with (Caprioli et al. 2010a):

$$ \alpha=\frac{\tilde{r}+2}{\tilde{r} - 1},\quad\quad \tilde{r} = \frac {u_{1}\pm v_{W,1}}{u_{2}\pm v_{W,2}}. $$
(104)

While it is customary to assume that waves get isotropized downstream (\(v_{W,2}=0\)), the compression factor can be either decreased or increased depending on the helicity of the waves upstream. This results in either softer or harder spectra of accelerated particles.
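The sensitivity of the slope in Eq. (104) to the velocity of the scattering centers can be illustrated with a short sketch. The wave velocities are passed as signed quantities in the shock frame (a negative value means waves moving against the fluid); the function name and the numerical values are hypothetical and serve only as an illustration.

```python
def spectral_index(u1, u2, v_w1=0.0, v_w2=0.0):
    """Slope alpha of N(E) dE ~ E^-alpha dE from Eq. (104).

    u1, u2     : upstream and downstream fluid velocities (shock frame)
    v_w1, v_w2 : signed wave velocities relative to the fluid; v_w2 = 0
                 corresponds to waves isotropized downstream
    """
    r_eff = (u1 + v_w1) / (u2 + v_w2)
    if r_eff <= 1.0:
        raise ValueError("effective compression factor must exceed 1")
    return (r_eff + 2.0) / (r_eff - 1.0)


if __name__ == "__main__":
    u1, u2 = 4.0, 1.0  # strong shock, velocities in units of u2
    for v_w1 in (0.0, -0.6, 0.6):  # assumed values, 15% of u1 in magnitude
        print(f"v_W,1 = {v_w1:+.1f}: alpha = {spectral_index(u1, u2, v_w1):.2f}")
```

With these numbers, waves moving against the upstream fluid at 15 % of the shock speed steepen the spectrum from α=2 to α=2.25, while waves moving with the fluid harden it.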

Another possibility to obtain softer spectra has been discussed by Caprioli et al. (2010a): the authors claim that in the case of a mainly perpendicular shock geometry, the return probability of particles from downstream can become smaller, thereby leading to steeper spectra.

It is rather disappointing that both these effects rely on details of the theory, and one is left to wonder whether observations may actually allow us to find the correct explanation for this rather serious discrepancy between theory and observational evidence.

6.3 Gamma-ray emission from isolated SNRs

The best chance of testing our theories of the origin of CRs in SNRs is in the modeling of the multifrequency spectrum and morphology of selected SNRs. The purpose of this section is, however, not that of listing the individual SNRs that have been detected in gamma rays, but rather to choose a few cases of SNRs that are sufficiently isolated so as to be modeled as individual sources, and use them to illustrate the type of information that we can gather by comparing observations with theory.

The first clear detection of TeV gamma-ray emission from a SNR came from the SNR RXJ1713.7-3946 (Caprioli 2011), later followed by the detection of the same remnant in the GeV energy range with the Fermi-LAT telescope (Berezhko and Völk 2007). Here I will briefly discuss this case because it is instructive of how the comparison of theoretical predictions with data can drive our understanding of the acceleration environment.

A discussion of the implications of the TeV data, together with the X-ray data on spectrum and morphology, was presented by Ptuskin (2006) and Blasi and Amato (2012b). A hadronic origin of the gamma-ray emission would easily account for the bright X-ray rims (requiring a magnetic field of ∼160 μG), as well as for the gamma-ray spectrum. If electrons were to share the same temperature as protons, the model would predict a powerful thermal X-ray emission, which is not detected. Rather than disproving this possibility, this finding might be the confirmation of the expectation that at fast collisionless shocks electrons fail to reach thermal equilibrium with protons. In fact, the Coulomb collision time scale for this remnant turns out to exceed its age. On the other hand, it was pointed out by Ptuskin et al. (2010) and Caprioli et al. (2010a) that even a slow rate of Coulomb scattering would be able to heat electrons to a temperature ≳1 keV, so that oxygen lines would be excited and would dominate the thermal emission. These lines are not observed, thereby leading to a severe upper limit on the density of gas in the shock region, which would result in too little pion production. Bell (1978a) concluded that the emission is of leptonic origin. This interpretation appears to be confirmed by the more recent Fermi-LAT data, which show a very hard gamma-ray spectrum, incompatible with an origin related to pion production and decay. Clearly this does not mean that CRs are not efficiently accelerated in this remnant. It simply implies that the gas density is too low for efficient pp scattering.

However, it should be pointed out that models based on ICS of high-energy electrons are not problem free: first, as pointed out by Bell (2004, 2005), the density of IR light necessary to explain the HESS data as the result of ICS is ∼25 times larger than expected. Second, the ICS interpretation requires a weak magnetic field of order ∼10 μG, incompatible with the observed X-ray rims. Finally, recent data on the distribution of atomic and molecular hydrogen around SNR RXJ1713.7-3946 (Caprioli 2012) suggest a rather good spatial correlation between the distribution of this gas and the TeV gamma-ray emission, which would be easier to explain if gamma rays were the result of pp scattering. In conclusion, despite the fact that the shape of the gamma-ray spectrum would suggest a leptonic origin, SNR RXJ1713.7-3946 will probably turn out to be one of those cases in which the complexity of the environment around the remnant plays a crucial role in determining the observed spectrum. Future high resolution gamma-ray observations, possibly with the Cherenkov Telescope Array (CTA), will help clarify this situation.

A somewhat clearer case is that of the Tycho SNR, the leftover of a type Ia SN that exploded in a roughly homogeneous ISM, as confirmed by the regular circular shape of the remnant. Tycho is one of the historical SNRs, as it was observed by Tycho Brahe in 1572. The multifrequency spectrum of Tycho extends from the radio band to gamma rays, and a thin X-ray rim is observed all around the remnant (see the right panel of Fig. 4). It has been argued that the spectrum of gamma rays observed by Fermi-LAT (Schure and Bell 2013) in the GeV range and by VERITAS (Aharonian et al. 2004, 2006, 2007) in the TeV range can only be compatible with a hadronic origin (Abdo et al. 2011). The morphology of the X-ray emission, resulting from synchrotron radiation of electrons in the magnetic field at the shock, is consistent with a magnetic field of ∼300 μG, which implies a maximum energy of accelerated protons of ∼500 TeV. A hadronic origin of the gamma-ray emission has also been claimed by Morlino et al. (2009), where, however, the steep gamma-ray spectrum measured from Tycho is attributed to an environmental effect: the gamma-ray flux is assumed to be made of two components, one due to gamma-ray production in a roughly homogeneous medium and another due to gamma-ray production in denser, compact clumps where the maximum energy of CRs is lower. In the calculations of Ellison et al. (2010) the steep spectrum is instead explained as a result of NLDSA in the presence of waves moving with the Alfvén velocity calculated in the amplified magnetic field. In the latter case the shape of the spectrum is related, though in a model-dependent way, to the strength of the amplified magnetic field, which is the same quantity relevant to determine the X-ray morphology. In the former model the steep spectrum might not be found in another SNR in the same conditions, in the absence of the small-scale density perturbations assumed by the authors.

The multifrequency spectrum of Tycho (left) and the X-ray brightness of its rims (right) are shown in Fig. 11 (from Ellison et al. (2010)). The dash-dotted line in the left panel shows the thermal emission from the downstream gas (here the electron temperature is assumed to be related to the proton temperature as \(T_{e}=(m_{e}/m_{p})T_{p}\) immediately behind the shock, and increases with time solely due to Coulomb scattering, which couples electrons with the warmer protons), the short-dashed line shows the ICS contribution to the gamma-ray flux, while the dashed line refers to gamma rays from pion decays. The solid lines show the total flux. The figure shows rather impressively how the magnetic field necessary to describe the radio and X-ray radiation as synchrotron emission also describes the thickness of the X-ray rims (right panel) and pushes the maximum energy of accelerated particles to ∼500 TeV (in the assumption of Bohm diffusion).

Fig. 11 Left Panel: Spatially integrated spectral energy distribution of Tycho. The curves show synchrotron emission, thermal electron bremsstrahlung and pion decay as calculated by Morlino et al. (2009). Gamma-ray data from Fermi-LAT (Fukui et al. 2012) and VERITAS (Giordano et al. 2012) are shown. Right Panel: Projected X-ray brightness at 1 keV. Data points are from Acciari et al. (2011). The solid line shows the result of the calculations by Morlino and Caprioli (2012) after convolution with the Chandra point spread function

The case of Tycho is instructive as an illustration of the level of credibility of calculations based on the theory of NLDSA: the different techniques agree fairly well (see Berezhko et al. (2013) for a discussion of this point) as long as only the dynamical reaction of accelerated particles on the shock is included. When magnetic effects are taken into account, the situation becomes more complex: in the calculations based on the semi-analytical description of Morlino and Caprioli (2012) the field is estimated from the growth rate and the dynamical reaction of the magnetic field on the shock is taken into account (Morlino and Caprioli 2012). Similar assumptions are adopted by Morlino and Caprioli (2012), although the technique is profoundly different. Similar considerations hold for Giordano et al. (2012). On the other hand, Acciari et al. (2011) take the magnetic field as a parameter of the problem, chosen to fit the observations, and its dynamical reaction is not included in the calculations. The magnetic backreaction, as discussed by Cassam-Chenaï et al. (2007), comes into play when the magnetic pressure exceeds the thermal pressure upstream, and leads to a reduction of the compression factor at the subshock, namely to less concave spectra. Even softer spectra are obtained if one introduces a recipe for the velocity of the scattering centers (Morlino and Caprioli 2012). This effect, still speculative, is not included in any of the other approaches.

Even more pronounced differences arise when environmental effects are included. The case of Tycho is again useful in this respect: the predictions of the standard NLDSA theory would not be able to explain the observed gamma-ray spectrum from this SNR. However, assuming the existence of ad hoc density fluctuations may change the volume-integrated gamma-ray spectrum so as to make it similar to the observed one (Morlino and Caprioli 2012). Spatially resolved gamma-ray observations would help clarify the role of these environmental effects in shaping the gamma-ray spectrum of a SNR.

6.4 SNRs near molecular clouds

There is no lack of evidence of CR proton acceleration in SNRs close to molecular clouds (MC), which act as a target for hadronic interactions resulting in pion production. Recently the AGILE (Giordano et al. 2012) and Fermi-LAT (Acciari et al. 2011; Cassam-Chenaï et al. 2007; Morlino and Caprioli 2012) collaborations claimed the detection of the much sought-after pion bump in the gamma-ray spectrum. This spectral feature confirms that the bulk of the gamma-ray emission in these objects is due to \(pp\to\pi^{0}\to2\gamma\).

Figure 12 (from Caprioli et al. 2010b) shows the gamma-ray spectra of SNRs IC443 (left panel) and W44 (right panel), where the pion bump is clearly visible. The steep gamma-ray spectrum at high energies suggests that the acceleration process is no longer very active, as one may qualitatively expect for old SNRs.

Fig. 12 Pion bump in the gamma-ray emission of SNRs IC 443 and W44 as measured by Fermi-LAT and reported by Amato and Blasi (2006)

SNRs close to molecular clouds are very interesting astrophysical objects, not so much for investigating CR acceleration (as these are old objects in which one would not expect acceleration to very high energies), but rather as laboratories to investigate CR propagation around sources and escape from sources. In this respect, it is useful to separate the SNR-MC associations into two types: (1) the ones in which the shock is directly propagating inside the cloud, and (2) the ones in which the MC is illuminated by CRs propagating out of a nearby SNR, which is, however, at some distance from the cloud.

In the first instance, several new effects intervene: for a density of molecular gas \(n=10^{3}\) cm\(^{-3}\), the interaction length between molecules, assuming a geometric cross section of \(\sigma\sim10^{-14}\) cm\(^{2}\), becomes \(\lambda\sim1/(n\sigma)\sim10^{11}\) cm. Moreover, the typical fraction of ionized gas in a molecular gas is so small that collisionless processes of formation of a shock wave may be less important than the ones associated with molecular collisions. The SNR shock impacting a molecular gas might become collisional, thereby leading to heating of the molecular gas on a scale ∼λ downstream. This picture appears to be supported by the presence of maser emission from behind such shocks (Caprioli et al. 2008, 2009b), which proves the presence of heated molecular gas. The possibility that such shocks may accelerate particles is far from demonstrated. In fact the gamma-ray emission from a MC in these conditions might be the result of the streaming of particles accelerated at previous times at the collisionless SNR shock and liberated once the shock impacts the MC.

The second scenario has received more attention (see for instance Vladimirov et al. (2008)). The propagation of escaping CRs from a SNR shock to a MC in its vicinity is a rather complex phenomenon to describe and model: the spectrum of CRs reaching the MC is in general time dependent, in that it is affected by both the time dependence of the escape flux (see discussion in Sect. 6.1) and by the finite time that CRs have to diffuse out to the distance of the MC, \(R_{\it MC}\). Several authors have argued that a low energy cutoff can be expected in the CR spectrum, at the energy for which \([D(E) \tau_{\it SNR} ]^{1/2}\simeq R_{\it MC}\). This reflects the fact that higher energy particles diffuse faster, thereby reaching the MC when lower energy particles are still lagging behind. It is important to notice that a low energy cutoff in the spectrum of CRs reaching the MC at a given time does not result in a cutoff in the gamma-ray spectrum: the cross section for pion production from a proton of given energy scales approximately as \(1/E_{\pi}\), so that low energy gamma rays are expected to have a spectrum approximately \(\propto E_{\gamma}^{-1}\), a signature of a low energy cut in the CR spectrum at the MC location. Possible indications of this phenomenon might have been already detected in the SNR W28 (Ptuskin et al. 2010), where two clouds at different distances from the SNR appear to be illuminated in a different way (different flux of CRs) and to be characterized by a low energy spectral break that starts at higher energies for the most distant MC, as one would expect if the break is related to CR propagation.
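The condition \([D(E)\tau_{\it SNR}]^{1/2}\simeq R_{\it MC}\) can be turned into an estimate of the low energy cutoff once a diffusion coefficient is chosen. The sketch below assumes a power law \(D(E)=D_{0}(E/E_{0})^{\delta}\); the normalization `d0`, the reference energy `e0_gev` and the slope `delta` are placeholder values chosen only for illustration, and the diffusion coefficient near a source may well be very different.

```python
import math

PC_CM = 3.0857e18  # one parsec in cm
YR_S = 3.156e7     # one year in s


def diffusion_coefficient(e_gev, d0=1.0e28, e0_gev=10.0, delta=0.5):
    """Assumed power-law diffusion coefficient in cm^2/s (placeholder values)."""
    return d0 * (e_gev / e0_gev) ** delta


def cutoff_energy_gev(r_mc_pc, age_yr, d0=1.0e28, e0_gev=10.0, delta=0.5):
    """Energy at which sqrt(D(E) * tau_SNR) equals the distance R_MC."""
    r_cm = r_mc_pc * PC_CM
    tau_s = age_yr * YR_S
    return e0_gev * (r_cm * r_cm / (d0 * tau_s)) ** (1.0 / delta)


def diffusion_distance_pc(e_gev, age_yr, **kwargs):
    """Distance sqrt(D(E) * tau_SNR) reached by particles of energy E, in pc."""
    return math.sqrt(diffusion_coefficient(e_gev, **kwargs) * age_yr * YR_S) / PC_CM


if __name__ == "__main__":
    age = 1.0e4  # hypothetical SNR age in years
    for r_mc in (20.0, 40.0):  # hypothetical cloud distances in pc
        print(f"R_MC = {r_mc} pc -> E_cut ~ {cutoff_energy_gev(r_mc, age):.3g} GeV")
```

Whatever the adopted parameters, the cutoff energy grows as \(R_{\it MC}^{2/\delta}\), so the more distant cloud is always expected to show a break at higher energy, which is the qualitative trend discussed above.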

Two phenomena add to the complexity of the picture presented above: (1) for isotropic diffusion, the density of CRs from the SNR dominates over that of Galactic CRs for distances of a few tens of parsecs (see discussion in Berezhko et al. (2013)). This may imply that the diffusion properties of CRs within such a distance are self-produced by the diffusing CRs, and therefore possibly very different from the average conditions inside the Galaxy at large. In the case of dominant parallel diffusion, this effect becomes even more important. (2) If there is a dominant orientation of the background Galactic magnetic field where the SNR and the MC are located, one can expect anisotropic diffusive effects to play a prominent role. Below I briefly discuss these issues, which might represent major sources of interesting discoveries in the near future.

As I pointed out several times throughout this review, CRs play a crucial role in determining the diffusion properties of the medium in which they propagate. This is equally true at SNR shocks, in the Galaxy while CRs propagate, and near sources due to the CR gradient that is established there. A self-consistent solution of the propagation of CRs near their sources has recently been presented by Caprioli et al. (2008, 2009b), where effects of diffusion parallel and perpendicular to the local magnetic field have also been discussed.

The expected pattern of diffusion mainly parallel to the background local magnetic field results in a spatial distribution of CRs which is elongated in the direction of the field (Ptuskin et al. 2010; Caprioli et al. 2010a; Morlino and Caprioli 2012), at least for a time smaller than the diffusion time over a scale of the order of the coherence scale \(L_{c}\sim50\)–100 pc of the magnetic field. When CRs diffuse farther than \(L_{c}\) they start feeling the random walk of magnetic field lines and their distribution spreads in three spatial dimensions. If a nearby MC is located along the direction of the magnetic field it eventually gets illuminated by CRs escaping the SNR. If on the other hand the MC is not connected to the SNR by a flux tube, it is unlikely to be illuminated by CRs (because perpendicular diffusion is suppressed on these scales), and virtually no gamma-ray emission is expected. This picture is strikingly more complex and richer in information than the simple picture of CRs escaping a SNR isotropically that is usually adopted in studying MCs.

7 Hα line as a cosmic-ray calorimeter in SNRs

Hα optical emission from Balmer-dominated SNR shocks is a powerful indicator of the conditions around the shock (Berezhko et al. 2013), including the presence of accelerated particles (see Giuliani et al. 2010, 2011 for a review). The Hα line is produced when neutral hydrogen is present in the shock region: it gets excited by collisions with thermal ions and electrons to the level n=3 and decays to n=2. In the following I describe the basic physics aspects of this phenomenon and how it can be used to gather information on the CR energy content at the shock.

A collisionless shock propagating in a partially ionized background goes through several interesting new phenomena: first, neutral atoms cross the shock surface without suffering any direct heating, due to the collisionless nature of the shock (all interactions are of electromagnetic nature, therefore the energy and momentum of neutral hydrogen cannot be changed). However, a neutral atom has a finite probability of undergoing either ionization or a charge exchange reaction, whenever there is a net velocity difference between ions and atoms. Behind the shock, ions are slowed down (their bulk velocity drops) and heated up, while neutral atoms remain colder and faster. The reactions of charge exchange lead to the formation of a population of hot atoms (a hot ion downstream catches an electron from a fast neutral), which also have a finite probability of getting excited. The Balmer line emission from this population corresponds to a Doppler broadened line with a width that reflects the temperature of the hot ions downstream. Measurements of the width of the broad Balmer line have often been used to estimate the temperature of protons behind the shock, and in fact this is basically the only method to do so, since at collisionless shocks electrons (which are responsible for the continuum X-ray emission) typically have a lower temperature than protons. Equilibration between the two populations of particles (electrons and protons) may eventually occur either collisionally (through Coulomb scattering) or through collective processes. The broad Balmer line is produced by hydrogen atoms that suffer at least one charge exchange reaction downstream of the shock. The atoms that enter downstream and are excited before suffering a charge exchange also contribute to the line, but the width of this component reflects the gas temperature upstream, and is therefore narrow (for a temperature of \(10^{4}\) K, the width is 21 km/s). 
In summary, the propagation of a collisionless shock through a partially ionized medium leads to Hα emission consisting of a broad and a narrow line (see the recent review by Abdo et al. 2009).
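The width quoted above for the narrow line can be recovered with a simple estimate. The sketch below assumes that the quoted width is the full width at half maximum (FWHM) of a purely thermal, Maxwellian Doppler profile of hydrogen atoms, and it is meant for the narrow component only; the function names are illustrative.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K
M_H = 1.6735575e-27  # mass of the hydrogen atom in kg


def thermal_fwhm_kms(temperature_k, mass_kg=M_H):
    """FWHM (km/s) of a thermal Doppler line: sqrt(8 ln2 k T / m)."""
    return math.sqrt(8.0 * math.log(2.0) * K_B * temperature_k / mass_kg) / 1.0e3


def temperature_from_fwhm(fwhm_kms, mass_kg=M_H):
    """Invert the relation above: gas temperature (K) for a given FWHM."""
    v = fwhm_kms * 1.0e3
    return mass_kg * v * v / (8.0 * math.log(2.0) * K_B)


if __name__ == "__main__":
    print(f"T = 1e4 K -> FWHM = {thermal_fwhm_kms(1.0e4):.1f} km/s")
    print(f"FWHM = 40 km/s -> T = {temperature_from_fwhm(40.0):.2e} K")
```

For \(T=10^{4}\) K the estimate gives a width of about 21 km/s, in agreement with the value quoted in the text. Since the width scales as \(T^{1/2}\), a measured narrow line wider than this requires a correspondingly hotter upstream gas.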

When CRs are efficiently accelerated, two phenomena occur, as discussed in Sect. 4: (1) the temperature of the gas downstream of the shock is lower than in the absence of accelerated particles. (2) A precursor is formed upstream, as a result of the pressure exerted by accelerated particles.

Both these phenomena have an impact on the shape and brightness of the Balmer line emission. The lower temperature of the downstream gas leads to a narrower broad Balmer line, whose width bears now information on the pressure of accelerated particles, through the conservation equations at the shock.

The CR-induced precursor slows down the upstream ionized gas with respect to the hydrogen atoms, which again do not feel the precursor except through charge exchange. If ions are heated in the precursor (not only adiabatically, but also because of turbulent heating) the charge exchange reactions transfer some of the internal energy to neutral hydrogen, thereby heating it. This phenomenon results in a broadening of the narrow Balmer line.

A narrower broad Balmer line and a broader narrow Balmer line are both signatures of CR acceleration at SNR shocks (2010a, 2010b, 2010c). The theory of CR acceleration at collisionless SNR shocks in the presence of neutral hydrogen has only recently been formulated (Ackermann et al. 2013) and has led to the prediction of several new interesting phenomena, discussed below.

7.1 Acceleration of test particles at shocks in partially ionized media

The presence of neutrals in the shock region changes the structure of the shock even in the absence of appreciable amounts of accelerated particles, due to the phenomenon of neutral return flux (Ackermann et al. 2013). A neutral atom that crosses the shock and suffers a charge exchange reaction downstream gives rise to a new neutral atom moving with high bulk velocity. There is a sizeable probability (dependent upon the shock velocity) that the resulting atom moves towards the shock and crosses it back into the upstream region. A new reaction of either charge exchange or ionization upstream leads the atom to deposit energy and momentum in the upstream plasma, within a distance of the order of its collision length. On the same distance scale, the upstream plasma gets heated up and slows down slightly, thereby resulting in a reduction of the plasma Mach number immediately upstream of the shock (within a few pathlengths of charge exchange and/or ionization). This implies that the shock strength drops, namely its compression factor becomes less than 4 (even for strong shocks).
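
The link between a reduced Mach number and a compression factor below 4 can be sketched with the standard Rankine-Hugoniot jump condition for a plane, non-radiative shock in an ideal gas. The function name and the adiabatic index of 5/3 are assumptions made for illustration, and the neutral return flux itself is not modeled here:

```python
def compression_ratio(mach, gamma=5.0 / 3.0):
    """Density compression factor of a plane shock with upstream Mach number `mach`.

    Standard Rankine-Hugoniot result for an ideal gas with adiabatic index
    `gamma`; it tends to (gamma + 1) / (gamma - 1) = 4 for gamma = 5/3 and
    very large Mach numbers.
    """
    if mach <= 1.0:
        raise ValueError("a shock requires an upstream Mach number larger than 1")
    m2 = mach * mach
    return (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)


if __name__ == "__main__":
    # Heating and slowing of the upstream plasma lowers the Mach number,
    # and with it the compression factor.
    for mach in (100.0, 10.0, 5.0, 3.0, 2.0):
        print(f"Mach = {mach:6.1f}  ->  r = {compression_ratio(mach):.3f}")
```

The compression factor stays close to 4 as long as the Mach number is large and falls to 3 when the Mach number immediately upstream is reduced to 3.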

This neutral return flux (Ackermann et al. 2013) plays a very important role in the shock dynamics for velocity \(V_{\it sh}\lesssim3000\) km/s. For faster shocks, the cross section for charge exchange drops rather rapidly and ionization is more likely to occur downstream. This reduces the neutral return flux and the shock modification it produces.

The neutral return flux has important consequences both for the process of particle acceleration and for the shape of the Balmer line: some hydrogen atoms undergo charge exchange immediately upstream of the shock, with ions that have been heated by the neutral return flux. These atoms give rise to Balmer line emission with a width corresponding to the temperature of the ions immediately upstream of the shock. As demonstrated by Ackermann et al. (2013), this contribution consists of an intermediate Balmer line, with a typical width of ∼100–300 km/s. Some tentative evidence of this intermediate line might have already been found in existing data (see, e.g., Hewitt et al. 2009).

The most striking consequence of the neutral return flux is, however, the steepening of the spectrum of test particles accelerated at the shock, first discussed by Gabici et al. 2007, 2009, Rodriguez Marrero et al. 2008. The effect is caused by the reduction of the compression factor of the shock, which makes the spectrum of accelerated particles softer. This effect is, however, limited to particles that diffuse upstream of the shock out to a distance of the order of a few collision lengths of charge exchange/ionization. It follows that the steepening of the spectrum is limited to particle energies low enough that their diffusion length is shorter than the pathlength for charge exchange and ionization. In Fig. 13 (from Giuliani et al. 2010) I show the spectral slope as a function of shock velocity for particles with energy 1, 10, 100, 1000 GeV, as labeled (background gas density, magnetic field and ionization fraction are as indicated). One can see that the standard slope ∼2 is recovered only for shock velocities >3000 km/s. For shocks with velocity ∼1000 km/s the effect may make the spectra extremely steep, to the point that the energy content may be dominated by the injection energy, rather than, as it usually is, by the particle mass. This situation, for all practical purposes, corresponds to not having particle acceleration but rather a strong modification of the distribution of thermal particles. For milder neutral induced shock modifications, the effect is that of making the spectra of accelerated particles softer. It is possible that this effect may play a role in reconciling the predicted CR spectra with those inferred from gamma-ray observations (see Blasi and Amato 2012a and Sect. 6.2 for a discussion of this problem), although the effect is expected to be prominent only for shocks slower than ∼3000 km/s.
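
The dependence of the slope on the compression factor can be illustrated with the standard test-particle result of diffusive shock acceleration, for which the differential spectrum of relativistic particles is a power law with slope \((r+2)/(r-1)\). The sketch below assumes a single, energy-independent compression factor, so it does not reproduce the energy dependence shown in Fig. 13; the function name is hypothetical:

```python
def spectral_slope(compression):
    """Slope s of the test-particle spectrum N(E) ~ E**(-s) at relativistic energies.

    `compression` is the compression factor r felt by the particles;
    the standard test-particle result is s = (r + 2) / (r - 1).
    """
    if compression <= 1.0:
        raise ValueError("the compression factor must be larger than 1")
    return (compression + 2.0) / (compression - 1.0)


if __name__ == "__main__":
    # A compression factor below 4 gives a slope steeper than the standard value 2.
    for r in (4.0, 3.5, 3.0, 2.5, 2.0):
        print(f"r = {r:.1f}  ->  slope = {spectral_slope(r):.2f}")
```

A slope steeper than 2 means that the energy integral of the spectrum is dominated by its lower limit, which is why very steep spectra carry most of their energy near the injection energy.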

Fig. 13

Slope of the differential spectrum of test particles accelerated at a shock propagating in a partially ionized medium, with density 0.1 cm\(^{-3}\), magnetic field 10 μG and ionized fraction of 50 %, as a function of the shock velocity. The lines show the slope for particles at different energies, as indicated. The figure is taken from the paper by Malkov et al. (2013)

7.2 NLDSA in partially ionized media

The theory of NLDSA in the presence of partially ionized media was fully developed by Nava and Gabici (2013) and Giacinti et al. (2013), using the kinetic formalism introduced by Chevalier and Raymond (1978) and Chevalier et al. (1980) to account for the fact that neutral atoms do not behave as a fluid, and that their distribution in phase space can hardly be approximated as a Maxwellian. The theory describes the physics of particle acceleration, taking into account the shock modification induced by accelerated particles as well as by neutrals, and magnetic field amplification. It is based on a mixed technique in which neutrals are treated through a Boltzmann equation while ions are treated as a fluid. The collision term in the Boltzmann equation is represented by the interaction rates of hydrogen atoms due to charge exchange with ions and ionization, at any given location. The Boltzmann equation for neutrals, the fluid equations for ions and the non-linear partial differential equation for accelerated particles are coupled together and solved by using an iterative method. The calculation returns the spectrum of accelerated particles, all thermodynamical quantities of the background plasma (density, temperature, pressure), the magnetic field distribution, and the distribution function of neutral hydrogen in phase space at any location from far upstream to far downstream.

These quantities can then be used to infer the Balmer line emission from the shock region, taking into account the excitation probabilities to the different atomic levels in hydrogen. An instance of such a calculation is shown in Fig. 14, where I show the shape of the Balmer line for a shock moving with velocity \(V_{\it sh}=4000\) km/s in a medium with density 0.1 cm\(^{-3}\), with a maximum momentum of accelerated particles \(p_{\max}=50\) TeV/c. The left panel shows the whole structure of the line, including the narrow and broad components, while the right panel shows a zoom-in on the narrow Balmer line region (gray shadowed region in the left panel). The black line is the Balmer line emission in the absence of accelerated particles. Allowing for particle acceleration to occur leads to a narrower broad Balmer line (left panel) and to a broadening of the narrow component (right panel). The latter is rather sensitive, however, to the level of turbulent heating in the upstream plasma, namely the amount of wave energy that is damped into thermal energy of the background plasma. In fact, turbulent heating is also responsible for a more evident intermediate Balmer line (better visible in the left panel) with a width of a few 100 km/s. It is worth recalling that observations of the Balmer line width are usually aimed at either the narrow or the broad component, but usually not both, because of the very different velocity resolution necessary for measuring the two lines. Therefore the intermediate line is usually absorbed in either the broad or the narrow component, depending on which component is being measured. This implies that an assessment of the observability of the intermediate Balmer component requires a proper convolution of the predictions with the velocity resolution of the instrument.

Fig. 14

Left Panel: Shape of the Balmer line emission for a shock moving with velocity \(V_{\it sh}=4000\) km/s in a medium with density 0.1 cm\(^{-3}\), as calculated by Heng 2010. The thick (black) solid line shows the result in the absence of particle acceleration. The other lines show the broadening of the narrow component and the narrowing of the broad component when CRs are accelerated with an injection parameter \(\xi_{\it inj}=3.5\) and different levels of turbulent heating (\(\eta_{TH}\)), as indicated. Right Panel: Zoom-in of the left panel on the region of the narrow Balmer line, in order to emphasize the broadening of the narrow component in the case of efficient particle acceleration

At the time of this review, an anomalous shape of the broad Balmer line has been reliably measured in a couple of SNRs, namely SNR 0509-67.5 (Ghavamian et al. 2013) and SNR RCW86 (Heng 2010). As I discuss below, the main problem in making a case for CR acceleration is the uncertainty in the knowledge of the shock velocity and of the degree of electron-ion equilibration downstream of the shock. The ratio of the electron and proton temperatures downstream is indicated here as \(\beta_{\it down}=T_e/T_p\). The other parameters of the problem have a lesser impact on the inferred value of the CR acceleration efficiency.

The SNR 0509-67.5 is located in the Large Magellanic Cloud (LMC), therefore its distance is very well known, 50±1 kpc. Blasi et al. (2012) and Morlino et al. (2012, 2013c) carried out a measurement of the broad component of the line emission in two different regions of the blast wave of SNR 0509-67.5, located in the southwest (SW) and northeast (NE) rim, obtaining a FWHM of 2680±70 km/s and 3900±800 km/s, respectively. The shock velocity was estimated to be \(V_{\it sh} = 6000 \pm300\) km/s when averaged over the entire remnant, and 6600±400 km/s in the NE part, while a value of 5000 km/s was used by Blasi et al. (2012) for the SW rim. The width of the broad Balmer line was claimed by the authors to be suggestive of efficient CR acceleration. In order to infer the CR acceleration efficiency the authors made use of the calculations by Blasi et al. (2012), which, as discussed by Morlino et al. (2012), adopt some assumptions on the distribution function of neutral hydrogen that may lead to a serious overestimate of the acceleration efficiency for fast shocks. Moreover, a closer look at the morphology of this SNR reveals that the SW rim might be moving with a lower velocity than assumed by Ghavamian et al. 2000, possibly as low as ∼4000 km/s. Both these facts have the effect of implying a lower CR acceleration efficiency, as found by Blasi et al. (2012).

In Fig. 15 (from Blasi et al. 2012) I show the FWHM of the broad Balmer line in the SW rim of SNR 0509-67.5 as a function of the acceleration efficiency, for shock velocity \(V_{\it sh}=4000\) km/s (on the left) and \(V_{\it sh}=5000\) km/s (on the right) and a neutral fraction \(h_N=10\) %. The shaded area represents the FWHM as measured by Blasi et al. (2012), with a 1σ error bar. The curves refer to \(\beta_{\it down}=0.01\), 0.1, 0.5, 1 from top to bottom. For low shock speed and for full electron-ion equilibration (\(\beta_{\it down}=1\)) the measured FWHM is still compatible with no CR acceleration. On the other hand, for such fast shocks, it is found that \(\beta_{\it down}\ll 1\) (Morlino et al. 2013c), in which case one can see that acceleration efficiencies of ∼10–20 % can be inferred from the measured FWHM.

Fig. 15

FWHM of the broad Balmer line as a function of the CR acceleration efficiency for the SNR 0509-67.5, as calculated by Blasi et al. 2012, assuming a shock velocity \(V_{\it sh}=4000\) km/s (left panel) and \(V_{\it sh}=5000\) km/s (right panel) and a neutral fraction \(h_N=10\) %. The lines (from top to bottom) refer to different levels of electron-ion equilibration, \(\beta_{\it down}=0.01\), 0.1, 0.5, 1. The shadowed region is the FWHM with 1σ error bar, as measured by Caprioli 2011

The case of RCW86 is more complex: the results of a measurement of the FWHM of the broad Balmer line were reported by Blasi et al. (2012), where the authors claimed a FWHM of 1100±63 km/s with a shock velocity of 6000±2800 km/s and deduced a very large acceleration efficiency (∼80 %). In a more recent paper by the same authors (Morlino et al. 2013c), the results of Morlino et al. (2013c) were basically retracted: several regions of the SNR RCW86 were studied in detail and lower values of the shock velocity were inferred. Only marginal evidence for particle acceleration was found in selected regions. The morphology of this remnant is very complex and it is not easy to define global properties, so that different parts of the SNR shock need to be studied separately. In addition, the uncertainty in the distance to SNR RCW86 makes the estimate of the acceleration efficiency even more difficult.

Anomalous widths of narrow Balmer lines have also been observed in several SNRs (see, e.g., Helder et al. 2010, 2011). The width of such lines is in the 30–50 km/s range, implying a pre-shock temperature around 25000–50000 K. If this were the ISM equilibrium temperature there would be no atomic hydrogen, implying that the pre-shock hydrogen is heated by some form of shock precursor in a region that is sufficiently thin that collisional ionization equilibrium cannot be reached before the shock. The CR precursor is the most plausible candidate to explain such a broadening of the narrow line.
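
The conversion from a narrow-line width to a pre-shock temperature can be sketched by inverting the thermal Doppler relation for a Gaussian line, FWHM = sqrt(8 ln 2 k T / m_H). This is a rough estimate that assumes purely thermal broadening of hydrogen atoms; the function name and the rounded physical constants are choices made here for illustration:

```python
import math

K_B = 1.380649e-23   # Boltzmann constant [J/K]
M_H = 1.6735575e-27  # mass of a hydrogen atom [kg]


def temperature_from_fwhm(fwhm_kms, mass_kg=M_H):
    """Temperature [K] of a Maxwellian gas whose Gaussian line has the given FWHM [km/s]."""
    fwhm_ms = fwhm_kms * 1.0e3                     # km/s -> m/s
    sigma2 = fwhm_ms ** 2 / (8.0 * math.log(2.0))  # 1D velocity variance [m^2/s^2]
    return mass_kg * sigma2 / K_B


if __name__ == "__main__":
    for width in (21.0, 30.0, 50.0):
        temperature = temperature_from_fwhm(width)
        print(f"FWHM = {width:4.0f} km/s  ->  T ~ {temperature:.2e} K")
```

With these assumptions, widths of 30 and 50 km/s correspond to roughly 2×10^4 K and 5.5×10^4 K, comparable to the range quoted above and well above the ∼10^4 K associated with a 21 km/s line.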

It would be most important to have measurements of the widths of the narrow and broad components (and possibly of the intermediate component) of the Balmer line at the same location, in order to allow for a proper estimate of the CR acceleration efficiency. Co-spatial observations of the thermal X-ray emission would also provide important constraints on the electron temperature. So far, this information is not available with the necessary accuracy for any of the astrophysical objects of relevance.

Recent observations of the Balmer emission from the NW rim of SN1006 (Helder et al. 2009) have revealed a rather complex structure of the collisionless shock. That part of the remnant acts as a bright Balmer source, but does not appear to be a site of effective particle acceleration, as one can deduce from the absence of non-thermal X-ray emission from that region. This is reflected in a width of the broad Balmer line that appears to be compatible with the estimated shock velocity in the same region, with no need for the presence of accelerated particles. The observations of Helder et al. (2010, 2011) provide, however, a rather impressive demonstration of the huge potential of Balmer line observations, not only to infer the CR acceleration efficiency, but also as a tool to measure the properties of collisionless shocks.

8 Conclusions

The problem of the origin of cosmic rays is a complex one: what we observe at the Earth results from the convolution of acceleration inside sources, escape from the sources and propagation in the Galaxy (or in the Universe, for extragalactic cosmic rays). Each of these stages involves a complex and often non-linear combination of physical processes. This intricate chain of processes and the fact that wildly different spatial and temporal scales are involved are the very reasons why we are still discussing the problem of the origin of cosmic rays, one century after the discovery of their existence.

Here I summarized the main aspects of the physics of acceleration of CRs in SNRs, emphasizing the progress made in the last decade or so, as well as the numerous loose ends deriving from the comparison between theoretical predictions and observational findings.

At the time of writing this review, there is enough circumstantial evidence suggesting that SNRs accelerate the bulk of Galactic CRs to justify introducing the concept of an SNR paradigm. This evidence is mainly based on the following pieces of observation: (1) gamma-ray measurements, both from the ground and from space, prove that SNRs accelerate particles up to at least 50–500 TeV (Helder et al. 2010, 2011; van Adelsberg et al. 2008; Morlino et al. 2013a). In some of these cases (for instance in Tycho) one can make the case that the observed gamma-ray emission is most likely due to the decay of neutral pions, thereby supporting the hypothesis that CR protons are being accelerated. (2) The X-ray spectrum and morphology strongly suggest that magnetic field amplification is taking place at SNR shocks (Morlino et al. 2013b), in virtually all young SNRs that we are aware of, with field strength of the order of a few 100 μG (Morlino et al. 2013b). This phenomenon is most easily explained if accelerated particles induce the amplification of the fields through the excitation of plasma instabilities. In this way, particles scatter on waves that are produced by the same particles that are being accelerated (Helder et al. 2010). (3) In selected SNRs there is evidence for anomalous widths of the Balmer lines, which can be interpreted as the result of efficient CR acceleration at SNR shocks (Morlino et al. 2013b).

Despite the confidence that SNRs may act as the main sources of the bulk of Galactic CRs, at present there is not yet any evidence of an individual SNR accelerating CRs up to the knee, although, as discussed by Helder et al. (2010), this may not be surprising, because of the relatively short duration of the phase during which acceleration to the highest energies is expected to take place. More disturbing is the lack of a complete understanding of the physical mechanisms responsible for magnetic field amplification. I discussed here several ideas on how magnetic field amplification may occur and how this phenomenon feeds back on the distribution function of accelerated particles. While it appears that there are several ways of describing the large magnetic fields inferred from X-ray morphology, it seems harder to produce these fields on spatial scales relevant for particle scatterings at the highest energies. In other words, the issue of the highest energy achievable at SNR shocks remains open. Promising results in this direction are, however, recently arising from numerical investigations of the development of a filamentation instability Morlino et al. 2013b, which might represent a breakthrough in our understanding of the connection between particle escape from the accelerator and generation of turbulence on the necessary spatial scales.

Magnetic field amplification and CR dynamical reaction on the accelerator represent the two main ingredients of the non-linear theory of particle acceleration at SNR shocks. The main predictions of the theory are that (1) the spectra of accelerated particles are no longer power laws, being concave in shape and possibly harder than predicted by the test-particle theory of DSA, and (2) that the temperature of the plasma behind the shock is expected to be lower at a SNR shock that is accelerating CRs effectively than it would be in the absence of particle acceleration.

The spectra of accelerated particles predicted by NLDSA, as well as the test-particle spectra, are at odds with the current observations of gamma-ray emission from SNRs and with the anisotropy observed at Earth. The physical reason for this discrepancy is that, since the spectra of particles accelerated at SNRs are so hard, the required diffusion coefficient in the Galaxy is a rather steep function of energy, \(D(E)\propto E^{0.7}\) at relativistic energies (Helder et al. 2010, 2011). Such a dependence is known to be incompatible with the measured anisotropy at energies E≳10 TeV (Ghavamian et al. 2007, 2013). The hard spectra inside the sources also appear to be incompatible with the gamma-ray spectra from a sample of SNRs (Helder et al. 2009). It is worth recalling that the spectra of particles escaping a SNR are not as concave as the spectra of particles accelerated at any given time at the shock (Helder et al. 2013), but this effect is not sufficient to solve the anisotropy problem. Several authors (Helder et al. 2009) suggested that appreciably steeper spectra may be obtained by assuming fast moving scattering centers in the upstream fluid, but this effect appears to depend on rather poorly known characteristics of the waves responsible for the scattering.
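
The link between hard source spectra and a steep diffusion coefficient can be made explicit with a simple leaky-box style estimate, given here only as a sketch. The symbols \(\gamma\) (slope of the injected spectrum), \(\delta\) (slope of the diffusion coefficient), \(H\) (size of the confinement region) and \(\delta_A\) (anisotropy amplitude) are notation assumed for illustration:

```latex
\begin{aligned}
N(E) &\propto Q(E)\,\frac{H^2}{D(E)} \propto E^{-\gamma-\delta},
\qquad Q(E)\propto E^{-\gamma},\quad D(E)\propto E^{\delta}, \\
\gamma+\delta &\simeq 2.7 \;\;\Longrightarrow\;\; \delta \simeq 0.7 \ \ \text{for}\ \ \gamma\simeq 2, \\
\delta_A(E) &\propto D(E) \propto E^{\delta}.
\end{aligned}
```

An injected slope of about 2 (or harder, as for concave NLDSA spectra) therefore requires \(\delta\gtrsim 0.7\) to reproduce the observed slope of about 2.7 below the knee, and the same \(\delta\) makes the predicted anisotropy grow quickly with energy.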

A deeper look into the physics of particle acceleration in SNRs will be possible with the upcoming new generation of gamma-ray telescopes, most notably the Cherenkov Telescope Array (CTA; Sollerman et al. 2003). The increased sensitivity of CTA is likely to lead to the discovery of a considerable number of other SNRs that are in the process of accelerating CRs in our Galaxy. The high angular resolution will allow us to measure the spectrum of gamma-ray emission from different regions of the same SNR so as to achieve a better description of the dependence of the acceleration process upon the environment in which acceleration takes place.

Interestingly, it has recently been realized that the presence of accelerated particles in the shock region of a SNR exploding in a partially ionized medium leads to considerable modification of the acceleration process (Nikolić et al. 2013), as well as to modification of the shape of the Balmer line emission from hydrogen atoms (Nikolić et al. 2013). Measurements of the Balmer emission from SNRs that show evidence of particle acceleration are a unique tool to measure the CR acceleration efficiency. The very high angular resolution of optical observations may, in principle, make it possible to carry out a detailed investigation of the CR acceleration process in SNRs.

The general picture that arises from the SNR paradigm inspires some confidence that we may unfold the mechanism responsible for the acceleration of CR protons up to a few PeV, and of nuclei of charge Z to an energy Z times larger. For iron nuclei this implies that the maximum energy should be ∼\(10^{17}\) eV. This energy should also flag the end of the Galactic CR spectrum. The fact that this energy is much lower than the ankle, where traditionally the transition from Galactic to extragalactic CRs has been placed, has stimulated considerable interest in the development of models that may be able to describe at once the CR spectrum in the transition region and the chemical composition observed by different experiments in the relevant energy region (see Aharonian 2013 for a review). At the time of writing of this review, it is unclear whether the low maximum energy inferred from the SNR paradigm is compatible with the observed chemical composition and spectra. Recent data collected with the KASCADE-Grande experiment (Brandt et al. 2013a) and ICETOP (2013b) suggest that some additional CR component is needed in the energy region between \(10^{17}\) eV and \(10^{19}\) eV. The chemical composition required by these data at \(10^{18}\) eV is a roughly equal mix of light and heavy nuclei, which does not appear to be in obvious agreement with the chemical composition observed by the Pierre Auger Observatory (Holder 2012), HiRes (Völk et al. 2005) and Telescope Array (Vink 2012), which find a chemical composition at \(10^{18}\) eV that is dominated by a light chemical component. The understanding of the transition region through increasingly accurate measurements of chemical composition is a crucial step towards figuring out the origin of ultra high-energy cosmic rays, which still represents a big unsolved problem.
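
The scaling of the maximum energy with charge mentioned above can be illustrated with a few lines of arithmetic. In this sketch the proton maximum energy is set to 3×10^15 eV, the knee energy quoted in the Introduction; this value, like the function name and the choice of nuclei, is an assumption made only for illustration:

```python
PROTON_EMAX_EV = 3.0e15  # assumed maximum energy of protons [eV], set to the knee energy

# Nuclear charge Z of a few representative CR species
CHARGES = {"H": 1, "He": 2, "C": 6, "O": 8, "Fe": 26}


def max_energy_ev(charge, proton_emax_ev=PROTON_EMAX_EV):
    """Maximum energy [eV] of a nucleus of charge Z for rigidity-dependent acceleration."""
    if charge < 1:
        raise ValueError("the nuclear charge must be at least 1")
    return charge * proton_emax_ev


if __name__ == "__main__":
    for name, charge in CHARGES.items():
        print(f"{name:>2} (Z = {charge:2d}):  E_max ~ {max_energy_ev(charge):.1e} eV")
```

For iron (Z = 26) this gives about 8×10^16 eV, consistent with the estimate of ∼10^17 eV for the end of the Galactic spectrum.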