1 Introduction

Coronal mass ejections (CMEs) are certainly one of the most impressive consequences of solar dynamics. They have acquired growing importance in the field of space-weather forecasting, mainly motivated by their recognized capability to interact with Earth’s magnetosphere and ultimately trigger geomagnetic storms, with possible harmful consequences for various human technologies. CMEs can drive extensive magnetohydrodynamic (MHD) shocks as they travel in the interplanetary (IP) medium, carrying vast amounts of plasma and magnetic fields at velocities that can surpass \(2000~\mbox{km}\,\mbox{s}^{-1}\) (e.g. Yashiro et al., 2004). These shocks excite electrons, which in turn produce radio emission at the local plasma frequency, related to the local electron density through \(f[\mbox{kHz}]=9\sqrt{n_{e}}\) [\(\mbox{cm}^{-3}\)], and/or its first harmonic (e.g. Reiner et al., 1997). As the shock encounters regions of lower density, the frequency of the emission decreases, giving rise to a slowly drifting radio emission called a Type-II radio burst (TII). Metric TII bursts may start at around 400 MHz close to the Sun, while kilometric emissions may reach down to 20 kHz at \(\mathrm{L}_{1}\) spacecraft. As a result of the filtering effect of Earth’s ionosphere, the detection of these longer wavelengths is only possible by means of space-borne instruments.

The first in-situ observation of the source region of a TII radio emission by Bale et al. (1999) confirmed previous analyses (Reiner et al., 1997, 1998b,a) that indicated that TII emissions are generated in the upstream region of CME-driven shocks (hereafter CMEs/shocks). Multiple studies on the source regions of TII emissions have followed. For instance, Knock et al. (2001, 2003) and Knock and Cairns (2005) analyzed the TII emissions originating at shocks moving through various environments, such as a quiet corona and solar wind, coronal loops, CIRs, and preexisting CMEs, on the basis of a theoretical model that predicts the source of the TII radio emission to lie in the foreshock region upstream of an MHD shock front. Reiner et al. (1998b) and Gopalswamy et al. (2001b) revealed the importance of upstream plasma conditions on the detected spectra through correlations between pre-existing plasma structures and changes in emission levels, while Cho et al. (2011) identified the CME nose and CME-streamer interaction as the sites of the multiple TII emissions that they analyzed.

The capability of TII emissions to track CMEs/shocks in the IP medium has been exploited by previous studies focusing on a few specific events (e.g. Pinter, 1982; Smart and Shea, 1985; Pinter and Dryer, 1990; Reiner et al., 1998b; Dulk, Leblanc, and Bougeret, 1999; Leblanc et al., 2001; Hoang et al., 2007) and by studies combining information extracted from white-light coronagraph images, low-frequency radio spectra, in-situ spacecraft detections, and/or interplanetary scintillation (IPS) data (Pohjolainen et al., 2007; Cho et al., 2007; Feng et al., 2009; Gonzalez-Esparza and Aguilar-Rodriguez, 2009; Bisi et al., 2010; Liu et al., 2013; Iju, Tokumaru, and Fujiki, 2013). The first joint analysis of white-light, radio, and in-situ observations on a statistically relevant number of cases is that of Reiner, Kaiser, and Bougeret (2007). They described the propagation profiles of 42 CMEs/shocks that occurred during Solar Cycle 23 by assuming a constant deceleration up to a certain distance, congruent with the in-situ shock arrival time and speed. Their study does not take into account kilometric TII emissions, which are only evident in the dynamic spectra of the Wind/WAVES Thermal Noise Receiver (TNR: see next section). The kilometric range of TII emissions (kmTII: 300 – 30 kHz) is of particular interest for the present study because the distances at which these emissions take place (≈ 20 – 170 \(\mathrm{R}_{\odot}\)) are favorable for issuing forecasts of shock arrival time (SAT).

In addition to their usefulness in tracking and describing the propagation of the shocks driven by CMEs, TII emissions have also been employed to predict the arrival time of their associated shocks at Earth’s geospace. For an overview of TII emission-based models aimed at forecasting the arrival time of shocks at Earth, see the review by Pick and Vilmer (2008). Some good proxies were also obtained from empirical models based on, or combined with, other data sets, such as white-light coronal observations (e.g. Gopalswamy et al., 2000, 2001a; Smith et al., 2003; Michałek et al., 2004; Schwenn et al., 2005), ground-based interplanetary-scintillation measurements (Manoharan, 2006), solar energetic particles (Qin, Zhang, and Rassoul, 2009), and novel images from the Heliospheric Imager (HI: Eyles et al., 2009) instruments (Möstl and Davies, 2013; Mishra and Srivastava, 2013) onboard the Solar-Terrestrial Relations Observatory (STEREO: Kaiser et al., 2008). Additional research comparing Enlil model (Odstrcil, Pizzo, and Arge, 2005) runs for 16 events with the kmTII-based technique described below was reported by Xie et al. (2013).

The empirical technique based on kmTII emissions by Cremades, St. Cyr, and Kaiser (2007) (hereafter CSK2007) was conceived to obtain proxies of SAT at Earth. They analyzed distinct data sources for both radio and in-situ observations covering the years 1997 to 2004. For the 92 matched pairs of kmTII emissions and MHD shocks, they derived the approximate location and radial speed of the MHD shock at the times of the radio emission. Estimated values of SAT at Earth were then obtained by assuming constant speed. These SAT values were, however, highly dependent on the electron density value at 1 AU required by the density model, while in the dynamic-spectrum images only two patches of radio emission were allowed by the selection procedure. These issues are addressed in Section 3 and their impact on the prediction technique is described in Section 5.1. Furthermore, an approach that departs from the CSK2007 dataset and introduces information of the associated coronal counterparts helps to describe the propagation of the analyzed events from Sun to Earth (Section 3). The outcome provides information on the characteristics of the dataset (Section 4) and stimulates the outline of another predictive version of the technique, presented in Section 5.2.

2 Data Sets

A vast amount of data and catalogs were employed to inspect different propagation stages of CMEs/shocks during the years 1997 – 2007. The starting point of the study is the CSK2007 list of kmTII-in-situ shock pairs, introduced in the following subsection, while CME counterparts were associated with the IP pairs of events as described in Section 2.2.

2.1 Radio and in-situ Data

All radio data used in this study were provided by the Radio and Plasma Wave (WAVES) experiment onboard the Wind spacecraft (Bougeret et al., 1995). WAVES detects radio emission in three different spectral ranges by means of three receivers: Radio Receiver Band 2 (RAD2: 13.825 – 1.075 MHz), Radio Receiver Band 1 (RAD1: 1040 – 20 kHz), and Thermal Noise Receiver (TNR: 256 – 4 kHz). Solar kilometric emissions are filtered by Earth’s ionosphere, thus the only way to detect them is by means of space-based instrumentation. As mentioned previously, one of the peculiarities of the present study is that it relies on TNR kilometric emissions, given TNR’s much better spectral resolution in most of the kilometric frequency range in comparison with that of RAD1, although the latter also covers the kilometric wavelength range. TII emissions occurring at or extending to kilometric frequency ranges, i.e. lower than 300 kHz, were extracted from the Wind/WAVES Type-II radio bursts list maintained by Kaiser ( www-lep.gsfc.nasa.gov/waves/data_products.html ). From 1997 to 2007, a total of 181 kmTII radio emissions were found.

MHD shocks detected at \(\mathrm{L}_{1}\) by the Advanced Composition Explorer (ACE: Stone et al., 1998) and Wind spacecraft were obtained from the IP shocks catalogs developed by Berdichevsky et al. ( pwg.gsfc.nasa.gov/wind/current_listIPS.htm ; Wind), the Space Plasma Group at the Massachusetts Institute of Technology – now available at the Harvard-Smithsonian Center for Astrophysics ( www.cfa.harvard.edu/shocks/ ; Wind and ACE); and the Experimental Space Plasma Group at the University of New Hampshire ( www.ssg.sr.unh.edu/mag/ace/ACElists/obs_list.html ; ACE). After cross-referencing information from all of these catalogs, 333 forward shocks were identified during the years 1997 to 2007. In principle, a shock is considered to be potentially associated with a kmTII emission if it occurs within 72 hours after the appearance of the radio emission. If no candidate shock was found within that time span, the following time interval was investigated. Only in three cases (out of the 71 analyzed events) was a shock detected after three days, in particular after 79 hours (event #3 – see Table 1), 93 hours (#4), and 82 hours (#10).

Table 1 CME-kmTII-shock associations and main characteristics of the 71 analyzed events. Column (1): event number. (2) CME start time. (3) TII start time. (4) SAT at Wind. (5) CME source region heliocentric coordinates. (6) Halo CME type (PH = partial halo, OA = outline asymmetry, BA = brightness asymmetry, S = symmetry; a number indicates the angular width, in degrees, of a non-halo CME). (7) Kinematic profile type exhibited in SOHO/LASCO. (8) Frequency range of the TII [kHz]. (9) TII duration [h]. (10) Adopted values of \(n_{0}\) [\(\mbox{cm}^{-3}\)]. (11) Speed solely derived from the TII emission. (12) in-situ shock speed [\(\mbox{km}\,\mbox{s}^{-1}\)]. (13) Shock-transit speed [\(\mbox{km}\,\mbox{s}^{-1}\)]. (14) Existence of BDEs in in-situ data. (15) Presence of MC signatures (MFR = only rotation in field direction, (S) = “Schwenn”, (B) = “Berdichevsky et al.”). (16) Descriptive profile propagation type. (17) Shock initial speed [\(\mbox{km}\,\mbox{s}^{-1}\)]. (18) Average acceleration [\(\mbox{m}\,\mbox{s}^{-2}\)]. The latter two are derived from the descriptive fit.

2.2 Coronal Data

Backtracking of the kmTII-shock IP features to CMEs in the solar corona involved white-light data provided by the Large Angle Spectroscopic Coronagraph (LASCO: Brueckner et al., 1995) onboard the Solar and Heliospheric Observatory (SOHO: Domingo, Fleck, and Poland, 1995). First, the existence of CME events in agreement with temporal and spatial considerations was ascertained by inspection of the Coordinated Data Analysis Workshop (CDAW) SOHO/LASCO CME Catalog ( cdaw.gsfc.nasa.gov/CME_list ; Yashiro et al., 2004). Preliminary associations were subsequently compared and verified with the aid of CME-interplanetary CME (ICME) lists, namely: the Richardson and Cane Near-Earth ICMEs ( www.srl.caltech.edu/ACE/ASC/DATA/level3/icmetable2.htm ; henceforth RC ICME list), the list of associations during 1997 – 2005 in Table 1 of Gopalswamy et al. (2005c), and Schwenn’s list of CME–ICME associations during 1997 – 2001 (R. Schwenn, personal communication, 2004).

Thirteen kmTII-shock pairs from the original CSK2007 list with 94 events were discarded, either because their speed was implausibly slow, because the CME onset time and the backtracking of the kmTII distance–time profile were incompatible, or because the CME source location and/or angular width was inconsistent with an Earth-directed radio emission. To the remaining 81 events, four corresponding to the year 2005 were added, while 14 kmTII-shock pairs that occurred after or during SOHO/LASCO data gaps were excluded. These changes reduced the data set to 71 events. We also note that none of the kmTII-shock pairs during 2006 and 2007 could be associated with a CME. Figure 1 shows the yearly frequency of forward shocks (black bars) and kmTII radio emissions (dark-gray bars), in comparison with the number of kmTII-shock pairs that could be associated with a CME (light-gray bars). The solar-cycle variation agrees with that found by Gopalswamy et al. (2010) during 1996 – 2006 for associations between in-situ shocks and metric–kilometric TII emissions. The 71 triple associations are listed below in Table 1 of Section 4.

Figure 1

Yearly frequency of in-situ forward shocks (black bars), kmTII radio emissions (dark-gray bars), and kmTII-shock pairs that could be associated with a CME (light-gray bars).

3 Propagation Profiles

3.1 Methodology

The various stages of propagation were put together to assemble distance–time plots for each of the 71 CME-kmTII-shock events: CME height in the solar corona, IP distance derived from the kmTII emissions, and arrival time of the shock detected in-situ at Wind or ACE. Height information for CMEs in the solar corona was obtained from the CDAW SOHO/LASCO CME Catalog. Since the catalog provides height values of three-dimensional entities projected in the two-dimensional plane of the sky, these should be interpreted with caution. Projection effects in these coronal height–time points are addressed in the next subsection.

Interplanetary propagation of shocks ahead of CMEs is tracked by means of kmTII emissions discernible in TNR dynamic spectra, on the basis of the relationship between local plasma frequency and density. A typical kmTII emission detected by TNR is displayed in Figure 2, where the vertical axis represents the inverse of the frequency in units of \(\mbox{kHz}^{-1}\), and the horizontal axis shows the time in hours. Dynamic spectral plots in the 1/\(f\) space show drifting radio emission organized approximately along straight lines, given that 1/\(f\) can be assumed to be equivalent to the heliocentric distance [\(R\)] by considering the IP plasma density to roughly vary as 1/\(R^{2}\) (Bougeret, King, and Schwenn, 1984; Reiner et al., 1998a). Naturally, this does not hold true for complex events and when pre-existing structures are present in the IP medium (Knock and Cairns, 2005). In Figure 2, white crosses are the points manually selected as representative of the kmTII emission. While CSK2007 took into account only two representative points to obtain the slope of the drifting emission, here we introduce the possibility of selecting several points. This methodology reduces the susceptibility of the slope determination to errors in the selected data points and thus helps to relax the selection process, given that TII emissions are commonly intermittent and of varying bandwidth, in a noisy and contaminated environment. The effect of varying bandwidth is neglected in this approach, which systematically considers the selection of the central point of each kmTII patch.
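As an illustration of why several selected points make the slope less sensitive to any single selection, the drift in 1/\(f\) space can be obtained from an ordinary least-squares line. The following is a minimal sketch, not the procedure actually used for the analysis; the function name and the sample points are hypothetical.

```python
def fit_inverse_frequency_drift(times_h, freqs_khz):
    """Least-squares straight line through (t, 1/f) points.

    times_h   : selection times in hours
    freqs_khz : selected emission frequencies in kHz
    Returns (slope, intercept) of 1/f versus t, in kHz^-1 per hour and kHz^-1.
    """
    n = len(times_h)
    if n != len(freqs_khz) or n < 2:
        raise ValueError("need at least two (time, frequency) points")
    inv_f = [1.0 / f for f in freqs_khz]
    mean_t = sum(times_h) / n
    mean_y = sum(inv_f) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0.0:
        raise ValueError("all points share the same time")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, inv_f))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return slope, intercept


# Hypothetical selection of four patch centres (not taken from Figure 2).
example_times = [0.0, 2.0, 4.0, 6.0]
example_freqs = [200.0, 143.0, 111.0, 91.0]
example_slope, example_intercept = fit_inverse_frequency_drift(
    example_times, example_freqs
)
```

With only two points the line passes exactly through both, so an error in either one translates directly into the slope; with more points the errors are averaged.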

Figure 2

KmTII emission detected by Wind/WAVES TNR on 13 September 2004. The ordinates are plotted as the inverse of the frequency [\(\mbox{kHz}^{-1}\)], while abscissae represent time [hours]. Intensity in dB is color-coded.

TII emission points in the frequency domain are translated into distance from the Sun by means of the relation between local plasma frequency and density \(f[\mbox{kHz}]=9\sqrt{n_{e}}\) [\(\mbox{cm}^{-3}\)] in combination with the Leblanc, Dulk, and Bougeret (1998) coronal/interplanetary density model. According to this empirical model, derived from Wind/WAVES and ground-based radio observations, the electron density [\(n_{e}\)] decreases with increasing heliospheric distance [\(r\)] in units of \(\mathrm {R}_{\odot}\) as \(n_{e}(r)=3.3\times10^{5} r^{-2}+4.1\times10^{6}r^{-4} + 8.0 \times 10^{7} r^{-6}~\mbox{cm}^{-3}\). The equation is solved for \(r\) by using the globally converging Newton method. This model is valid for a density at 1 AU \(n_{0} = 7.2~\mbox{cm}^{-3}\), while for individual bursts with different \(n_{0}\) the model is multiplied by \(n_{0} / 7.2\). The electron-density value at 1 AU [\(n_{0}\)] is a crucial input to the model that directly affects the calculation of the shock location in the IP medium. According to Leblanc, Dulk, and Bougeret (1998), assuming \(n_{0} = 7.2~\mbox{cm}^{-3}\) is a good approximation for all cases when the fluctuation of its real value prevents adopting a better one. This was the criterion used by CSK2007 in most cases. However, this becomes a major source of error due to the wide variation range of \(n_{0}\) (2 to \(39~\mbox{cm}^{-3}\)), especially at solar maximum. To reduce the uncertainty in \(n_{0}\), a more realistic value was adopted, based on results of a neural-network procedure that detects the local plasma density at the spacecraft (Bougeret et al., 1995), whose outcome is available at the Coordinated Data Analysis Web (CDAWeb; cdaweb.gsfc.nasa.gov/ ). More precisely, the \(n_{0}\)-value used to feed the Leblanc, Dulk, and Bougeret (1998) density model was computed as the mean value during the day(s) in which a specific kmTII event was observed. 
Figure 3 displays the dynamic spectrum corresponding to the kmTII on 13 September 2000. This is an example of the plasma frequency fluctuating at \(\mathrm{L}_{1}\), with an average value considerably differing from \(7.2~\mbox{cm}^{-3}\). The white line following the plasma-frequency line represents the high-resolution values obtained from CDAWeb, after a filtering process that eliminates high-frequency noise. The pink-horizontal line at ≈ 31 kHz (or \(11.6~\mbox{cm}^{-3}\)) stands for the mean value used as input for the density model, while the green line at 24 kHz corresponds to \(7.2~\mbox{cm}^{-3}\).
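The conversion from frequency to heliocentric distance described above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses a plain Newton iteration in \(\ln r\) rather than the globally converging routine mentioned in the text, it treats the emission as fundamental (no harmonic), and the function names, starting guess, and tolerance are hypothetical choices.

```python
import math

# Coefficients and exponents of the Leblanc, Dulk, and Bougeret (1998) model
# as quoted in the text: r in solar radii, n_e in cm^-3, normalized to
# n0 = 7.2 cm^-3 at 1 AU.
COEFFS = ((3.3e5, 2), (4.1e6, 4), (8.0e7, 6))


def electron_density(r_rsun, n0=7.2):
    """Model electron density [cm^-3] at r [R_sun], scaled by n0 / 7.2."""
    return sum(c * r_rsun ** -p for c, p in COEFFS) * n0 / 7.2


def plasma_frequency_khz(n_e):
    """Plasma frequency [kHz] from electron density [cm^-3]: f = 9 sqrt(n_e)."""
    return 9.0 * math.sqrt(n_e)


def distance_from_frequency(f_khz, n0=7.2, tol=1e-12, max_iter=100):
    """Solve electron_density(r, n0) = (f / 9)^2 for r [R_sun].

    Newton iteration on g(x) = ln n(exp(x)) - ln n_target with x = ln r.
    """
    target = math.log((f_khz / 9.0) ** 2)
    x = math.log(10.0)  # arbitrary starting guess of 10 R_sun (assumption)
    for _ in range(max_iter):
        r = math.exp(x)
        terms = [c * r ** -p * n0 / 7.2 for c, p in COEFFS]
        n = sum(terms)
        g = math.log(n) - target
        # d(ln n)/d(ln r): weighted mean of the exponents, always negative
        dg = -sum(p * t for (_, p), t in zip(COEFFS, terms)) / n
        step = g / dg
        x -= step
        if abs(step) < tol:
            return math.exp(x)
    raise RuntimeError("Newton iteration did not converge")
```

Because a larger \(n_{0}\) scales the whole density profile upward, the same observed frequency maps to a larger distance, which is why the choice of \(n_{0}\) directly shifts the inferred shock location.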

Figure 3

TNR dynamic spectrum during 12 – 13 September 2000. The white line on top of the plasma frequency line corresponds to the electron density at \(\mathrm{L}_{1}\) derived from the CDAWeb data. The mean value during that time period is represented by the pink line at ≈ 31 kHz (\(11.6~\mbox{cm}^{-3}\)). For comparison, the green line is drawn at the typical value of 24 kHz (\(7.2~\mbox{cm}^{-3}\)). The considered kmTII emission takes place from 0 to 12 hours on 13 September.

The interplanetary distance of the CME/shock derived in this way from the kmTII radio emissions, together with the CME height–time measurements from coronal data and the SAT at 1 AU, can be combined in single distance–time plots for each event. Two examples are shown in Figure 4. As mentioned earlier, the SOHO/LASCO height–time points have inherent projection effects. As for the kmTII distance–time points, it is assumed here that the source of the radio emission (namely the shock’s emitting parcel) is traveling approximately along the Sun–Earth line, even if the bulk of the ICME propagates off it.
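For reference, the constant-speed extrapolation attributed to CSK2007 in Section 1 can be sketched from such kmTII distance–time points. This is only an illustration of that assumption, not the authors' implementation: the function name, the least-squares line, the nominal solar radius, and the use of 1 AU as the target distance are all choices made here.

```python
R_SUN_KM = 695_700.0        # nominal solar radius [km] (assumed value)
AU_KM = 149_597_870.7       # astronomical unit [km]
AU_RSUN = AU_KM / R_SUN_KM  # about 215 solar radii


def constant_speed_arrival(times_h, dist_rsun, target_rsun=AU_RSUN):
    """Fit r = v t + r0 to kmTII points and extrapolate to target_rsun.

    times_h   : times of the kmTII points [hours from a common reference]
    dist_rsun : heliocentric distances of those points [R_sun]
    Returns (speed in km/s, arrival time in hours from the same reference).
    """
    n = len(times_h)
    if n != len(dist_rsun) or n < 2:
        raise ValueError("need at least two distance-time points")
    mean_t = sum(times_h) / n
    mean_r = sum(dist_rsun) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0.0:
        raise ValueError("all points share the same time")
    slope = sum(
        (t - mean_t) * (r - mean_r) for t, r in zip(times_h, dist_rsun)
    ) / sxx  # radial speed in R_sun per hour
    if slope <= 0.0:
        raise ValueError("points do not describe outward motion")
    intercept = mean_r - slope * mean_t
    arrival_h = (target_rsun - intercept) / slope
    speed_km_s = slope * R_SUN_KM / 3600.0
    return speed_km_s, arrival_h
```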

Figure 4

The three stages of CME/shock propagation: CME (asterisks) and kmTII (crosses) distance–time points, and shock arrival at 1 AU (triangle). The solid-black line is the best fit, while the dash–dotted line is a linear fit through the SOHO/LASCO points. The right panel includes several models applied to SOHO/LASCO height–time points to remove projection effects (colored lines): XOL (Xie, Ofman, and Lawrence, 2004), magenta line; MGY (Michałek, Gopalswamy, and Yashiro, 2003), orange line; ZPL (Zhao, Plunkett, and Liu, 2002), green line; HNK (Howard et al., 2008), cyan line; and DSG (Dal Lago, Schwenn, and Gonzalez, 2003), purple line.

3.2 Descriptive Profiles

After plotting data points together, and to gain understanding of the various propagation profiles exhibited by this set of events from the Sun to 1 AU, the distance–time points were fitted to curves. Equations (1) and (2) were used to represent the accelerated and decelerated cases, respectively:

$$d(t) = a t^{2} + b t + c, \tag{1}$$
$$d(t) = \sqrt{a t + b} + c, \tag{2}$$

where \(d\) is the distance from the Sun, \(t\) is time, and \(a\), \(b\), and \(c\) are coefficients that arise from the fitting, performed by means of a Levenberg–Marquardt least-squares fit (Markwardt, 2009). Naturally, time [\(t\)] and distance [\(d\)] must be positive. The equation for the decelerated case is none other than Equation (1) solved for \(t\), i.e. its inverse, with variable names \(t\) and \(d\) swapped. Its behavior is very similar to that of Equation (1) with a negative value of \(a\), but with a steeper slope close to the Sun, which approximates the sets of data points gathered at different stages of propagation.
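The inverse relation between the two expressions can be checked directly. As a sketch, write Equation (1) with \(t\) and \(d\) swapped, using auxiliary coefficients \(\alpha\), \(\beta\), \(\gamma\) (introduced here only for illustration), and take the positive root for \(\alpha>0\):

$$\begin{aligned} t &= \alpha d^{2} + \beta d + \gamma, \\ d &= -\frac{\beta}{2\alpha} + \sqrt{\frac{t}{\alpha} + \frac{\beta^{2}-4\alpha\gamma}{4\alpha^{2}}}. \end{aligned}$$

This has the form of Equation (2) with \(a = 1/\alpha\), \(b = (\beta^{2}-4\alpha\gamma)/(4\alpha^{2})\), and \(c = -\beta/(2\alpha)\).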

The left panel of Figure 4 is an example of a typical decelerated case, while the right panel exhibits a nearly linear propagation profile. In the figure, the solid-black line is henceforth referred to as the “descriptive profile”, namely the best fit to the kmTII points, connecting only the first appearance of the CME in the coronagraph (to avoid including projection effects) and the shock arrival at 1 AU. The black dash–dotted line is a linear fit through the SOHO/LASCO height–time points, projected to 1 AU.

The right panel of Figure 4 additionally includes several models that have been applied to the SOHO/LASCO height–time points to remove projection effects from a nearly Earth-directed event. There are two reasons why it is not possible to straightforwardly compare the LASCO height–time points with the kmTII and shock arrival points: i) the SOHO/LASCO points are a projection in the plane of the sky of an approximately Earthward-traveling event, and ii) most often it is the projected height of the leading edge, and not of the shock, that is measured in coronagraph images. In an attempt to account for the effects of i), the propagation profiles corrected by several methods (colored lines in Figure 4, right panel) have been compared with the descriptive profile (solid-black line). The methods considered are DSG (Dal Lago, Schwenn, and Gonzalez, 2003), ZPL (Zhao, Plunkett, and Liu, 2002), MGY (Michałek, Gopalswamy, and Yashiro, 2003), XOL (Xie, Ofman, and Lawrence, 2004), and HNK (Howard et al., 2008). The DSG is based on the concepts of radial and expansion speed, and its application to all events is straightforward. Radial values of speed derived from the ZPL, MGY, and XOL cone models were used only when available in Xie et al. (2006), accounting for 17 events in common – enough for the purposes of this study. The HNK makes use of the CME source region location and 3D aspects of its trajectory, and corrected values of speed and acceleration were provided for 24 of the events here analyzed by D. Nandy (private communication, 2008).

The right panel of Figure 4 is one of the 12 cases for which values corrected for projection effects by all methods were available. It is evident from the figure that none of the corrected speed profiles (solid-colored lines) closely approximates the descriptive profile based on the kmTII points, the first appearance of the CME, and the shock arrival at 1 AU (solid-black line). This was a common situation when attempting to assess the performance of these five methods, with only a few cases favoring one or two of them, and not consistently so. It must be noted that the DSG, ZPL, MGY, and XOL methods assume a constant velocity, which is nearly true for 19 events according to the analysis of their speed profiles (see next section). HNK, in spite of being a second-order method, in the general case does not approach the behavior of the descriptive propagation profile. Although all of these methods are oriented to deduce the radial speed of a CME and not the Earth-directed component – not necessarily the same – it is assumed here that this subset of 12 events propagates approximately along the Sun–Earth line, given that their source-region coordinates lie within \(30^{\circ}\) of the central meridian, except for two outliers in longitude (see Table 1). In addition, it must be taken into account that even if these models were successful in removing projection effects, their validity is limited to coronal heights, since propagation conditions in the IP medium may drastically differ due to inhomogeneities in the ambient solar wind (e.g. Pohjolainen et al., 2007).

4 Characteristics of the Events Under Study

The set of 71 CME-kmTII-shock triplets during 1997 – 2007 is peculiar in its own right, given the low rate of associations found (see Figure 1). Therefore, it is of particular interest to investigate various properties of these events starting at their source regions at the Sun and in their different stages of propagation. Table 1 summarizes some of the main characteristics compiled for all analyzed events. In all cases an empty cell indicates absence of information in the corresponding data source. The first column assigns an event number to each entry, while the following three columns refer to the triple associations made: CME start time as reported by the CDAW SOHO/LASCO CME Catalog, start time of the TII emission as reported by the Wind/WAVES Type-II radio bursts list, and shock arrival time at the Wind spacecraft.

The solar origins of the CME events associated with the kmTII/shock pairs were ascertained with the aid of low coronal images provided by the Extreme ultraviolet Imaging Telescope (EIT: Delaboudinière et al., 1995) onboard SOHO. The central heliographic coordinates of the candidate source regions are listed in Column 5 of Table 1. For four events it was not possible to obtain the coordinates, either because there were no SOHO/EIT images, because the source could not be unequivocally determined, or because it was located behind the western limb. The obtained central coordinates graphed as histograms in Figure 5 show a two-peak latitudinal distribution in agreement with the two activity belts, and a clear preference for western longitudes. Cliver, Kahler, and Reames (2004) as well as Gopalswamy et al. (2008) have found a weaker western bias in the sources of CMEs associated with metric and/or decametric–hectometric TIIs. Furthermore, the visibility of solar energetic particles drastically increases toward the west limb in the event of metric and particularly decametric–hectometric TIIs (Cliver, Kahler, and Reames, 2004; Gopalswamy et al., 2008).

Figure 5

Central latitude (left panel) and longitude (right panel) distributions of the candidate source regions of the CME-kmTII-shock triplets.

Ninety-seven percent of the CMEs composing the set of analyzed events were either full or partial halo CMEs, i.e. with an angular width greater than \(120^{\circ}\) (see Column 6 of Table 1). The two CMEs that account for the remaining events (#5 and #34) are particular cases of CMEs that arose close to disk center but exhibited very dim coronal signatures, which is the reason for the low measured angular width. Partial-halo CMEs account for 20 % of the events. According to the CDAW SOHO/LASCO Catalog classification of full-halo CMEs, 38 % of the events were of the “outline asymmetry” (OA) type, 34 % of the “brightness asymmetry” (BA), and 6 % of them were symmetric (S). The large portion of full-halo CMEs of the OA type is in accordance with the predominance of CME source regions with western longitudes. Although several CMEs in Table 1 originate close to the west limb, their condition of full-halo CMEs indicates their large extent in angular width. Therefore, it is plausible to assume that the kmTII emissions associated with these events originate in the portion of the shock that is traveling in the direction of Earth. Limb full-halo CMEs are worth considering, given that they have the potential to be geoeffective (e.g. Cid et al. 2012).

CME height–time diagrams were built for the 71 events from projected-height information provided by the CDAW SOHO/LASCO CME Catalog, i.e. points are confined to the coronal heights covered by the SOHO/LASCO coronagraphs. Column 7 of Table 1 indicates the type of propagation profile exhibited by each coronal event in the SOHO/LASCO field of view: 44 % appear accelerated, 45 % decelerated, and 11 % linear. Kinematic profiles are considered linear if their average acceleration (second derivative of the distance–time expression of each event’s propagation profile) is within \({\pm}\,1.5~\mbox{m}\,\mbox{s}^{-2}\). As mentioned before, these values have projection effects attached and must be interpreted with caution. The behavior of descriptive profiles obtained as addressed in Section 3.2, i.e. after including distance–time points derived from the kmTII information and the shock arrival at 1 AU, is listed in Column 16 of Table 1. They do not exhibit the same behavior as in the corona, presumably because of varying conditions in the IP medium that modify the propagation of CMEs/shocks. Statistics of the descriptive profiles yield 39 % of the decelerated type, 34 % accelerated, and 27 % of the linear type, when considering as linear events those with an acceleration within \({\pm}\, 1.5~\mbox{m}\,\mbox{s}^{-2}\).

Column 8 of Table 1 lists the frequency range of the 71 TII emissions analyzed here, as reported by the Wind/WAVES Type-II radio bursts list. All of them naturally extend down to frequencies corresponding to kilometric wavelengths. Almost 23 % of these TII events were limited to the kilometric domain, while 32 % began in the upper detection limit of RAD2. The rest of the reported TII start frequencies were spread in the frequency detection range of the Wind/WAVES detectors. Likewise, the duration in hours of the complete TII emission as reported by the Wind/WAVES list is presented in Column 9 of Table 1. The TII durations, whose distribution is presented in Figure 6 (left panel), range from nearly 1 hour to 65 hours, with an average of 25 hours and a standard deviation of 13 hours.

Figure 6

Left panel: Distribution of the TII emission duration for the 71 analyzed cases. The bin size is ten hours. Right panel: Distribution of the \(n_{0}\)-value used as input for the coronal/interplanetary density model. The third bin (44 events) appears truncated in the graph due to its high frequency.

The values of \(n_{0}\) used as input for the coronal/interplanetary density model are presented in Column 10 of Table 1. For 35 of the 71 cases it was possible to deduce \(n_{0}\) using the technique introduced in Section 3. For 36 cases \(n_{0}\) was user-specified, either because the results of the neural-network procedure did not properly reproduce the local plasma density at the spacecraft (33 cases, typically during highly fluctuating intervals) or because there was no output from the procedure (last three events of the list). The value of \(n_{0}\) manually specified (bold values in Column 10 of the table) was \(7.2~\mbox{cm}^{-3}\) for periods of rapidly varying density, while for seven cases the plasma line was quite stable and a better representative mean value could be adopted. The right panel of Figure 6 shows the distribution of \(n_{0}\) for the 71 analyzed cases, where the third interval (6 – \(9~\mbox{cm}^{-3}\)) has been truncated due to its high occurrence (44 events), in view of the fact that 29 events were manually assigned with the \(7.2~\mbox{cm}^{-3}\) average value.

Speeds derived solely from the kmTII radio emission by means of the technique described in Section 3.1 are presented in Column 11 of Table 1. They do not correlate well with shock speeds derived from in-situ measurements, as also noted by CSK2007, but correlate well with shock-transit speeds. Shock-transit speeds, determined from the time difference between the shock arrival time at 1 AU and the time of the first CME detection, are listed in Column 13 of Table 1. Their distribution is presented in the left panel of Figure 7 in black columns, together with that of the corresponding in-situ speeds at Wind in dark-gray columns. The latter were obtained from the shock properties at www.cfa.harvard.edu/shocks/ , which are determined using the methods described by Pulupa, Bale, and Kasper (2010). Values are presented in Column 12 of Table 1 and were available for 73 % of the events (52) in the mentioned data source. Average values of the shock transit and the in-situ speeds are \(895~\mbox{km}\,\mbox{s}^{-1}\) and \(613~\mbox{km}\,\mbox{s}^{-1}\), respectively. This discrepancy is also evident in the right panel of Figure 7. For comparison, in the speed histogram the CME speed in the corona is also shown (light-gray columns), corresponding to the linear fit through the plane-of-sky projected height–time points available in the CDAW SOHO/LASCO CME Catalog. We note that there are 52 CMEs (73 %) with coronal speeds above \(1000~\mbox{km}\,\mbox{s}^{-1}\) and that the last bin of the histogram includes CMEs with speeds between 2000 and \(2800~\mbox{km}\, \mbox{s}^{-1}\).
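The shock-transit speed defined above amounts to a single division, sketched below. The function name and the example timestamps are hypothetical, and the travel distance is taken as exactly 1 AU, following the wording of the text, although the spacecraft at \(\mathrm{L}_{1}\) sit slightly sunward of Earth.

```python
from datetime import datetime

AU_KM = 149_597_870.7  # astronomical unit [km]


def shock_transit_speed(cme_first_seen, shock_arrival, distance_km=AU_KM):
    """Mean Sun-to-1 AU speed [km/s] from the first CME detection time
    and the in-situ shock arrival time (both datetime objects)."""
    elapsed_s = (shock_arrival - cme_first_seen).total_seconds()
    if elapsed_s <= 0:
        raise ValueError("shock arrival must follow the first CME detection")
    return distance_km / elapsed_s


# Hypothetical event with a 48-hour transit (not an entry of Table 1).
example_speed = shock_transit_speed(
    datetime(2001, 1, 1, 12, 0), datetime(2001, 1, 3, 12, 0)
)
```

Being an average over the whole Sun–Earth path, this quantity exceeds the in-situ speed whenever the shock has decelerated on the way, which is consistent with the difference between the two mean values quoted above.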

Figure 7

Left: Speed distribution in bins of \(200~\mbox{km}\,\mbox{s}^{-1}\), with the shock-transit speed in black columns, the in-situ shock speed in dark gray, and CME projected speed in light-gray columns. Right: shock-transit speeds (vertical axis) vs. in-situ shock speeds (horizontal axis) for 52 out of the 71 events.

Columns 14 and 15 of Table 1 display ICME characteristics when available in the RC ICME list. Evidence of bidirectional suprathermal electron strahls (BDEs) in ACE observations is indicated for 43 events by “Y” (yes) and “N” (no). BDEs were identified in 39 of these events (91 %). This percentage contrasts with the values obtained when considering the complete RC ICME list: out of 303 ICMEs, 204 (67 %) exhibit BDEs while 99 (33 %) do not. The presence of magnetic-cloud (MC) signatures is indicated in Column 15, where “Y” means that an MC has been reported in association with the ICME or that the ICME has clear features of an MC, “MFR” indicates that the ICME shows evidence of a rotation in field direction but lacks some other characteristics of an MC, for example a smoothly rotating and enhanced magnetic field, and “N” means that the ICME is not a reported MC and lacks most of its typical features. The letter “S” in parentheses after the “N” stands for “Schwenn” and “B” for “Berdichevsky et al.”; this indicates that the event was not listed in the RC ICME list, but was listed by the author in parentheses and not recognized as an MC. Information on this matter was found for 57 events. Fourteen events were identified as MCs in the RC ICME list, 16 more as only exhibiting magnetic-field rotation, and 27 as not being an MC by various authors.

Information on different kinematical parameters arises from the descriptive profiles of the CME/shock, i.e. the fit to the kmTII points connecting the first appearance of the CME in the coronagraph and the shock arrival at 1 AU. Some of the quantities that can be derived from the descriptive fit are the initial speed and the average acceleration, listed in Columns 17 and 18 of Table 1. The terms “initial speed” and “average acceleration” refer to the slope of the descriptive profile at \(t=0\) and to the second derivative of the distance–time expression of each event’s propagation profile, respectively. Although for simplicity the first and second derivatives of the descriptive profiles are here called speed and acceleration, physical interpretations should be made with care. The initial speed shows no clear dependence on source-region latitude, although it is notable that all CMEs with initial speeds higher than \(2000~\mbox{km}\,\mbox{s}^{-1}\) originated in the southern hemisphere. This could be either fortuitous or due to a north–south asymmetry. Figure 8 is a scatter plot of average acceleration vs. initial speed from the descriptive profiles. Two outliers have been removed from the plot, namely those corresponding to events #31 and #61, given their implausibly high initial speed. We note that the corrected speed values as derived by Howard et al. (2008) are \(5455~\mbox{km}\,\mbox{s}^{-1}\) and \(7531~\mbox{km}\,\mbox{s}^{-1}\), respectively, for these events. In the figure, events have been classified according to their speed profile after the fitting: decelerated events are plotted as blue diamonds, accelerated events as red squares, and linear events as green triangles. As before, linear events are those with an acceleration within \({\pm}\, 1.5~\mbox{m}\,\mbox{s}^{-2}\). It is evident that the initially fastest events are those that decelerate the most, while slower CMEs tend to accelerate. In addition, the decelerating process appears more uniform than the accelerating one, with initial-speed values more varied than for acceleration.
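The classification used in Figure 8 can be written as a simple threshold rule. The sketch below is only illustrative: the function name is hypothetical, and treating the \({\pm}\, 1.5~\mbox{m}\,\mbox{s}^{-2}\) limits as inclusive is an assumption, since the text does not state how boundary values are handled.

```python
LINEAR_BAND = 1.5  # m/s^2, half-width of the "linear" band quoted in the text


def classify_profile(avg_acceleration, band=LINEAR_BAND):
    """Label a descriptive profile from its average acceleration (m/s^2)."""
    # Assumption: values exactly on the band limits count as linear.
    if abs(avg_acceleration) <= band:
        return "linear"
    return "accelerated" if avg_acceleration > 0 else "decelerated"


# Hypothetical accelerations, not taken from Table 1.
for a in (-6.3, 0.4, 2.9):
    print(a, classify_profile(a))
```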

Figure 8

Initial speed vs. average acceleration, both deduced from the fit connecting the CME, kmTII, and shock arrival points.

5 SAT Forecasting

5.1 Improved Linear Method

The linear method presented here is a modified version of the SAT forecasting technique developed by CSK2007. The analysis is implemented, on the one hand, to assess the impact of the improvements introduced in Section 3.1 on the quality of the predictions, and on the other hand, to obtain a reference for direct comparison with the results obtained in Section 5.2. This kmTII-based methodology assumes a constant speed of the MHD shock at distances corresponding to the kmTII emissions along the Sun–Earth line. Although this may seem to contradict the propagation profiles described in Section 3, it is a good approximation at the distances at which kmTII emissions usually occur, as suggested by reports that support the concept of an acceleration–cessation distance (e.g. Gopalswamy et al., 2001a; Reiner, Kaiser, and Bougeret, 2007; Temmer et al., 2011).

The speed [\(v\)] is calculated from the empirical expression \(v=\mathit{slope} \times a \times R_{0} \times \sqrt{n_{0}}\) (Reiner et al., 1998a), where \(\mathit{slope}\) is the drift rate of the associated kmTII emission, measured as the slope of \(1/f\) against time, \(a\) is a constant (9 or 18 if the radio emission occurs at the fundamental or the harmonic of the local plasma frequency, respectively), \(R_{0}=1.5 \times10^{8}~\mbox{km}\) is the Sun–Earth distance, and \(n_{0}\) is the electron density at Earth. Once the shock speed is known, its distance from the Sun and arrival time can be obtained, provided that the distance of the emission along the Sun–Earth line is known. This is achieved by means of a coronal/interplanetary electron-density model, in particular the Leblanc, Dulk, and Bougeret (1998) model introduced in Section 3.1.
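The two steps above, speed from the drift rate and distance from the density model, can be sketched in Python as follows. Several points are assumptions rather than statements from the text: the slope is taken to be the drift of \(1/f\) in \(\mbox{kHz}^{-1}\,\mbox{s}^{-1}\) so that \(v\) comes out in \(\mbox{km}\,\mbox{s}^{-1}\); the density coefficients are the commonly quoted form of the Leblanc, Dulk, and Bougeret (1998) model, rescaled by \(n_{0}/7.2\); the solar radius is the nominal \(6.957 \times10^{5}~\mbox{km}\); and all function names are hypothetical.

```python
import math

R0_KM = 1.5e8            # Sun-Earth distance used in the text, km
RSUN_KM = 6.957e5        # nominal solar radius, km
AU_RSUN = R0_KM / RSUN_KM


def shock_speed(slope, n0, harmonic=False):
    """v = slope * a * R0 * sqrt(n0), in km/s.

    slope: drift of 1/f with time, in 1/(kHz s)  (assumed units)
    n0:    electron density at 1 AU, in cm^-3
    """
    a = 18.0 if harmonic else 9.0
    return slope * a * R0_KM * math.sqrt(n0)


def leblanc_density(r_rsun, n0=7.2):
    """Electron density (cm^-3) at r solar radii, rescaled by n0 / 7.2.

    Coefficients as commonly quoted for the model (assumption).
    """
    base = 3.3e5 * r_rsun**-2 + 4.1e6 * r_rsun**-4 + 8.0e7 * r_rsun**-6
    return base * n0 / 7.2


def emission_distance(f_khz, n0=7.2, harmonic=False):
    """Heliocentric distance (solar radii) at which the model emits at f_khz."""
    a = 18.0 if harmonic else 9.0
    target = (f_khz / a) ** 2            # density required for this frequency
    lo, hi = 1.0, 10.0 * AU_RSUN         # search bracket in solar radii
    if not leblanc_density(hi, n0) < target < leblanc_density(lo, n0):
        raise ValueError("frequency outside the range covered by the bracket")
    for _ in range(200):
        mid = math.sqrt(lo * hi)         # bisection in log(r)
        if leblanc_density(mid, n0) > target:
            lo = mid                     # still too dense: move outward
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Hypothetical inputs, for illustration only.
print(shock_speed(2.2e-7, 7.2))          # km/s
print(emission_distance(100.0, 7.2))     # solar radii
```

The density decreases monotonically with distance in this model, which is what makes a simple bisection sufficient for the inversion.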

In addition to the assumption of nearly constant-speed propagation at the distances associated with kmTIIs, there are other factors that must be taken into account. To begin with, it is assumed that the source of the radio emission propagates along the Sun–Earth line, meaning that in the case of a limb full-halo CME the shock parcels responsible for the emission are those corresponding to the component traveling in the Earth’s direction. It is also assumed here that events are not complex, i.e. they are not interacting with other solar-wind structures, and that they travel through a stationary and quiet IP medium. The pre-existing conditions of the IP medium through which CMEs/shocks propagate may modify their three-dimensional propagation, as found by several authors (e.g. Pohjolainen et al., 2007; Kilpua et al., 2009; Zuccarello et al., 2012; Panasenco et al., 2013; Lugaz, Farrugia, and Al-Haddad, 2014).

As mentioned earlier, one of the main drawbacks of the CSK2007 method is the fixed value of \(7.2~\mbox{cm}^{-3}\) adopted for \(n_{0}\). This value is required not only to determine \(v\), but also by the coronal/interplanetary density model to deduce the distance at which a particular emission takes place. To overcome the limitations imposed by the use of a fixed \(n_{0} = 7.2~\mbox{cm}^{-3}\), new values of \(n_{0}\) were determined by means of the procedure presented in Section 3.1, whenever feasible. The other major shortcoming is the computation of the \(\mathit{slope}\) parameter, formerly calculated as the linear interpolation between two points selected in the \(1/f\) dynamic power spectrum of the emission. As explained in Section 3.1, a variable number of frequency points can now be selected in the spectrum to incorporate all of the “patches” of the usually noisy and intermittent kmTII emission. A linear fit to all of the points is then applied (see white line in Figure 2), obtaining in this way a more robust estimate of the \(\mathit{slope}\).
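A minimal sketch of the multipoint estimate is given below: an ordinary least-squares line through all selected points in \(1/f\) against time, whose gradient is the \(\mathit{slope}\) parameter. The function name, the units (seconds and kHz), and the sample points are assumptions made for illustration; the point selection itself remains manual, as described above.

```python
def drift_slope(times_s, freqs_khz):
    """Least-squares slope and intercept of 1/f versus time.

    times_s:   times of the selected emission patches, in seconds
    freqs_khz: corresponding frequencies, in kHz
    Returns (slope in 1/(kHz s), intercept in 1/kHz).
    """
    n = len(times_s)
    if n < 2 or n != len(freqs_khz):
        raise ValueError("need at least two (time, frequency) pairs")
    inv_f = [1.0 / f for f in freqs_khz]
    t_mean = sum(times_s) / n
    y_mean = sum(inv_f) / n
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("all points share the same time")
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_s, inv_f))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


# Hypothetical patches selected on a dynamic spectrum (not real data).
times = [0.0, 3600.0, 7200.0, 10800.0]
freqs = [500.0, 430.0, 370.0, 325.0]
slope, intercept = drift_slope(times, freqs)
print(slope, intercept)
```

With only two points this reduces to the two-point slope of the earlier version; additional points average over the scatter of individual patches.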

While this is a relatively simple and highly empirical technique, it has proved to be accurate and flexible enough to be applied to emission spectra of widely varying quality. Other techniques based on image processing or automatic spectrum analyses (e.g. Gonzalez-Esparza and Aguilar-Rodriguez, 2009) only obtain good results if the emission spectrum has a relatively well-defined and isolated shape, i.e. is not contaminated by other radio phenomena. These conditions are typical for energetic events that do not overlap with other major solar phenomena occurring in the same bands (e.g. Type-III events) and present relatively stable and homogeneous values of the ambient plasma density. This scenario represents only 25 % of TII events (Cane and Erickson, 2005) and is not the case for the majority of the events analyzed here, since most of them took place during solar maximum.

The error in the SAT estimation is defined as the predicted SAT minus the true SAT as measured by Wind at 1 AU, so that a positive (negative) error means that the event is forecast to arrive after (before) the real SAT. The application of this technique to the 71 events in Table 1 produces an average SAT error of ≈ four hours, while 85 % of the events are predicted with an absolute error smaller than six hours. This represents a significant improvement with respect to the average SAT error of 7.8 hours of the original CSK2007 technique. See Table 2 for the statistics, where “Linear 2P” stands for the original version of the technique, and “Linear MP” represents the multipoint improved version of the method. The performance of these techniques is discussed in Section 6, also in comparison with that of the predictive-profile method presented below.
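The summary statistics quoted here can be computed as in the following sketch, which uses the sign convention in which a positive error corresponds to a forecast later than the observed arrival. The function name, the dictionary keys, and the sample values are hypothetical; times are expressed in hours from an arbitrary common reference.

```python
def sat_error_stats(predicted_h, observed_h, threshold_h=6.0):
    """Mean signed error, mean absolute error, and fraction within threshold."""
    if not predicted_h or len(predicted_h) != len(observed_h):
        raise ValueError("need two non-empty lists of equal length")
    # Positive error: the shock is forecast to arrive after the observed SAT.
    errors = [p - o for p, o in zip(predicted_h, observed_h)]
    abs_errors = [abs(e) for e in errors]
    n = len(errors)
    return {
        "mean_error_h": sum(errors) / n,
        "mean_abs_error_h": sum(abs_errors) / n,
        "fraction_within": sum(e < threshold_h for e in abs_errors) / n,
    }


# Hypothetical predicted and observed arrival times (hours), not from Table 1.
stats = sat_error_stats([50.0, 40.0, 70.0, 62.0], [48.0, 45.0, 60.0, 62.0])
print(stats)
```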

Table 2 Statistics of SAT errors for the CSK2007 technique (Column Linear 2P), the improved linear method (Linear MP), and the predictive profile method (Pred–Profile).

5.2 Predictive Profile Method

The descriptive profiles introduced in Section 3.2, which consider CME-onset information, distance–time points derived from kmTII emissions, and the shock arrival at 1 AU, stimulated the formulation of a predictive method. The latter is more comprehensive than the linear version presented in the previous section because it considers not only kmTII information but also CME data, and thus provides a more complete scenario that is potentially helpful for forecasting. Furthermore, the linear fit through the kmTII points presented above does not appear representative of the various descriptive profiles found for the 71 events (Section 4). Therefore, the same simple Equations (1) and (2) used to simulate the descriptive profiles introduced in Section 3.2 are now used to fit the CME and kmTII points in a predictive fashion, i.e. disregarding the shock arrival. The same assumptions hold as were made for the previous method: propagation of the shock-emitting parcel nearly along the Sun–Earth line, non-complex events, and propagation through a stationary and quiet IP medium.
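The predictive idea can be illustrated with a simple stand-in. The sketch below does not implement Equations (1) and (2), which are defined in Section 3.2; instead it assumes a constant-acceleration quadratic \(r(t)=c_{0}+c_{1}t+c_{2}t^{2}\), fits it by least squares to distance–time points that exclude the shock arrival, and then solves for the time at which the fitted curve reaches 1 AU. Function names, units (hours and solar radii), and the sample points are hypothetical.

```python
import math


def fit_quadratic(t, r):
    """Least-squares coefficients [c0, c1, c2] of r(t) = c0 + c1*t + c2*t**2."""
    if len(t) < 3 or len(t) != len(r):
        raise ValueError("need at least three (t, r) pairs")
    s = [sum(x**k for x in t) for k in range(5)]              # sums of t^k
    b = [sum(y * x**k for x, y in zip(t, r)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the normal equations.
    for col in range(3):
        piv = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[piv] = m[piv], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [u - factor * p for u, p in zip(m[row], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def arrival_time(coeffs, r_target):
    """Earliest t > 0 at which the fitted profile reaches r_target, else None."""
    c0, c1, c2 = coeffs
    if abs(c2) < 1e-12:                   # effectively a straight line
        return (r_target - c0) / c1
    disc = c1**2 - 4.0 * c2 * (c0 - r_target)
    if disc < 0:
        return None                       # the profile never reaches r_target
    roots = [(-c1 + sign * math.sqrt(disc)) / (2.0 * c2) for sign in (1, -1)]
    positive = [x for x in roots if x > 0]
    return min(positive) if positive else None


# Hypothetical CME and kmTII points: time in hours, distance in solar radii.
t_pts = [0.0, 1.0, 2.0, 20.0, 30.0, 40.0]
r_pts = [3.0, 8.0, 12.9, 95.0, 135.0, 171.0]
coeffs = fit_quadratic(t_pts, r_pts)
print(coeffs, arrival_time(coeffs, 215.0))
```

In this stand-in, \(c_{1}\) plays the role of the initial speed and \(2c_{2}\) that of the average acceleration; the same caution about physical interpretation applies.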

Figure 9 shows the various propagation profiles traced for the events starting on 6 November 1997 (left panel – decelerated case) and 20 January 2005 (right panel – accelerated case). The solid line again represents the descriptive profile through the CME, kmTII, and shock-arrival points, while the dash–dotted line is the linear fit through the CME projected height–time points (asterisks). The predictive profile, i.e. the fit that disregards the shock-arrival data point, is represented by the dotted line and accompanied by the corresponding SAT error in hours. Although the Sun–Earth propagation of CMEs/shocks is clearly not simple, the second-order predictive method approximates the CME and kmTII distance–time points and forecasts almost half of the events with a SAT error of less than six hours (see last column of Table 2). The average SAT error obtained with this technique for all events is 9.1 hours, much larger than that achieved with the improved linear method. This average error is still comparable to those of SAT predictions based on metric TII emissions, which range from ≈ 8 to 12 hours (e.g. Dryer and Smart, 1984; Smith and Dryer, 1990; Fry et al., 2001).

Figure 9

The predictive-profile method (dotted line) in comparison with the descriptive profile (solid line) and the linear fit through the SOHO/LASCO points (dash–dotted line) for the events starting on 6 November 1997 and 20 January 2005. The SOHO/LASCO projected height–time points are represented by asterisks, the kmTII distance–time points by crosses, and the shock arrival at 1 AU by a triangle.

6 Summary and Discussion

We investigated the individual propagation of 71 ejective events after association of coronal, interplanetary, and in-situ counterparts during 1997 – 2007. Combining coronagraph images of CMEs, TII radio emission in the kilometric range, and in-situ information on shocks, it is possible to build distance–time diagrams that cover the Sun–Earth journey. These give an overview of the general kinematics of CMEs and their driven shocks in the IP medium. Propagation at coronal heights is given by CME height–time information projected in the plane of the sky, while the shock-arrival time at 1 AU is obtained from in-situ shock lists. The IP distance of the kmTII emissions is determined by means of a coronal/interplanetary-density model, fed with a refined value of the electron density at 1 AU.

Distance–time diagrams built from CME-kmTII-shock data points were approximated by two simple fitting expressions (Equations (1) and (2)) used to describe the propagation of the ejective events from the Sun to Earth. These descriptive profiles can be classified according to their behavior into decelerated (39 %), accelerated (34 %), and linear (27 %), when considering as linear events those with an average acceleration within \({\pm}\, 1.5~\mbox{m}\,\mbox{s}^{-2}\). In contrast, Reiner, Kaiser, and Bougeret (2007) found few accelerated CME/shock events as seen in radio data. The discrepancy could be due either to a bias in the data-selection procedure in one or both investigations or to incorrect CME-kmTII-shock associations, in spite of a careful examination of candidate events and double-checking with other lists of CME–ICME shock associations. The propagation profiles agree with the findings of Gopalswamy et al. (2001a), Reiner, Kaiser, and Bougeret (2007), and Liu et al. (2013). Gopalswamy et al. (2001a) proposed a model based on coronagraph and in-situ data that assumes a constant acceleration from the Sun until an acceleration–cessation distance of 0.76 AU, common to all CMEs. From there onward, a constant-speed propagation is assumed. Reiner, Kaiser, and Bougeret (2007) investigated three stages of propagation of a set of 42 events and formulated a model that assumes a constant deceleration out to a variable heliocentric distance, followed by a constant-speed propagation at the in-situ-derived shock speed. Liu et al. (2013) suggested three propagation phases of fast events: an impulsive acceleration, a rapid deceleration, and a nearly constant-speed propagation (or gradual deceleration). Previous studies solely based on coronagraph observations suggest that fast CMEs undergo an impulsive acceleration phase followed by constant or slowly decreasing speed (e.g. Zhang et al., 2001). They found differing propagation profiles (e.g. Vršnak, 2001), which can be regarded as indicative of different physical mechanisms playing a role (Joshi and Srivastava, 2011).

Various quantities arise from the descriptive curves of CME-kmTII-shock propagation through the first and second derivatives of the descriptive profiles. These are respectively referred to as speed and acceleration, although caution must be taken to avoid direct physical interpretations. Although the CME and the shock-transit speed do not correlate well (as found also by Reiner, Kaiser, and Bougeret, 2007), the initial speed shows a tight relationship with the effective deceleration, showing that the fastest CMEs decelerate at a greater rate, as previously found by several authors and in agreement with the drag-force concept (e.g. Vršnak and Gopalswamy, 2002; Cargill, 2004; Manoharan, 2006; Vršnak et al., 2010, 2013).

The characteristics of the set of analyzed events were presented to typify events exhibiting kmTII emissions directly linked to shocks reaching Earth, certainly a relevant group for space-weather forecasting. Nearly all CMEs related to the kmTII emissions are partial or full-halo CMEs, mainly asymmetric events, in accordance with the strong bias of the source-region longitude toward western values. Seventy-three percent of the CMEs exhibited projected coronal speeds greater than \(1000~\mbox{km}\,\mbox{s}^{-1}\). For their part, TII emissions appear limited to the kilometric wavelength range (starting at \({\approx}\,20~\mathrm{R}_{\odot}\)) for 23 % of the cases, while 30 % began at the upper detection limit of RAD2 and extended down in frequency to a few kHz (i.e. to distances larger than 0.5 AU). A large fraction of the in-situ events for which the RC ICME list provides information presented BDEs (91 %), while 25 % showed MC signatures and 28 % solely magnetic-field rotation.

The descriptive propagation profiles of CME-kmTII-shocks have the potential to help evaluate the performance of various methods that correct for projection effects (Section 3.2). None of them has been systematically demonstrated to follow the descriptive profiles at coronal heights. The implementation of these methods for deprojecting coronal height–time points with a view to running SAT prediction methods can have an adverse impact on the predictions, yielding errors of 12 to 24 hours. Incorrect models and/or inhomogeneities in the IP medium modifying the propagation conditions may be responsible for these errors.

To improve the performance of the kmTII-based SAT forecasting technique originally developed by CSK2007, which is a constant-speed approach, two key modifications were introduced. These were aimed at mitigating the main drawbacks of that technique, concerning the determination of the crucial value of density at 1 AU required by the density model, and the selection procedure of radio-emission patches in the dynamic spectrum images. First, instead of the \(n_{0}=7.2~\mbox{cm}^{-3}\) used for almost all cases in the original technique, we adopted when possible the mean value of \(n_{0}\) derived from the automatic detection of the plasma line for the day(s) when the kmTII was observed. This establishes a better constraint on the Leblanc, Dulk, and Bougeret (1998) density model and contributes to reducing the SAT error. However, given the sensitivity of the model to \(n_{0}\), it proved more convenient to use \(n_{0} = 7.2~\mbox{cm}^{-3}\) for events with an unstable plasma-frequency line, which is a common case during times of solar maximum. Issues in the estimation of the location of CMEs/shocks were addressed by Pohjolainen et al. (2007) and include the interaction with slower CMEs and passage through a perturbed medium. Second, the methodology used to estimate the frequency drift rate of the kmTII emission adds the possibility to select several points on its dynamic spectrum and use a linear fit to obtain the \(\mathit{slope}\) parameter. This is in contrast with the previous version, which was limited to the subjective selection of only two points of the kmTII emission. TII emissions are known to exhibit a variety of behaviors that complicate their discernibility, such as multiple lanes, differing drift rates, and intermittent emission, as a consequence of coronal and interplanetary plasma structure, magnetic-field topology, and relative motion of the TII source with respect to the global shock evolution (Knock and Cairns, 2005). The implemented improvement increases the accuracy of the technique when dealing with the very frequent patchy and/or noisy events, without degrading the characteristic simplicity of the original method. The reduction of the average absolute SAT error to 3.8 hours is attributed to the combined effect of these two improvements. For comparison, Xie et al. (2006) tested the performance in SAT prediction by combining input parameters of three different cone models with the ESA model (Gopalswamy et al., 2005a,b) and achieved average errors of almost six hours. The statistical studies of Taktakishvili et al. (2009) and Xie et al. (2013), which tested the predictive performance of the Enlil model (Odstrcil, Pizzo, and Arge, 2005) when fed with various sets of cone-model parameters, yielded errors ranging from six to eight hours.

In addition to the improved linear method of forecasting, we attempted to employ the same expressions used for the descriptive-propagation profiles, but in a predictive fashion. This implies that the kmTII-derived distance–time points are only combined with the first observation of a CME in the coronagraph while neglecting the SAT data point. In spite of being more realistic than a simple linear fit through the radio data, the average absolute error yielded by this method is comparable to the errors obtained by predictive models based on metric radio bursts (Dryer and Smart, 1984; Smith and Dryer, 1990; Fry et al., 2001), which yield average errors within ≈ 8 to 14 hours (Fry et al., 2003; McKenna-Lawlor et al., 2002, 2006). The average errors obtained with the presented predictive techniques may be partially explained by two factors: i) the linear projection to 1 AU of the kmTII points better approximates the last propagation phase of the shock wave, while the prediction worsens if coronagraph data are introduced; and ii) although the best efforts were undertaken to find the most suitable association between each kmTII event and the CME that originated it, discrepancies found in a few cases would indicate that the two phenomena are not related. Because of the latter, events with an absolute SAT error greater than six hours reach 52 % for the predictive-profile method, while for the improved linear method they account for only 15 %.

The shortcomings of these kmTII-based forecasting techniques include i) the need for space-based instrumentation capable of performing low-frequency radio measurements; ii) the shorter lead time of the forecasts, given that the kmTII emissions take place at larger distances from the Sun (from \({\approx}\,30~\mathrm{R}_{\odot}\) onward); iii) the lack of information on which portion of the three-dimensional shock is actually producing the emission; and iv) the low number of shock waves arriving at 1 AU that are effectively associated with kmTII emissions (28 % for the period investigated). In this regard, it must be noted that there may have been more kmTII emissions than actually reported in the Wind/WAVES Type-II radio bursts list because they remained hidden in the RAD1 dynamic spectra due to its low frequency resolution close to its lower detection limit.

Furthermore, these techniques also rely on the assumptions of non-complex events and propagation through a stationary and quiet IP medium. Therefore, their performance may be affected by interaction with other solar-wind structures and by deviations from stationary propagation. Nonetheless, more than half of the events analyzed here took place during the maximum of Solar Cycle 23, when complex events and highly disorganized IP conditions dominated the background IP medium, and the improved linear method still reduced the SAT error for this set.

The present study appears promising for achieving better SAT forecasts based solely on data from spacecraft located on the Sun–Earth line, thus emphasizing the need for continuity of space missions that monitor wavelength ranges filtered out by the Earth’s atmosphere. Although the success of the predictive-profile method is relative, the descriptive profiles built for 71 Earth-directed events provide some insights into their Sun–Earth propagation up to 1 AU. The improved linear method, based only on kmTII emissions, nevertheless yields considerable improvements in SAT predictions of (I)CME-driven shocks.