1 Introduction

Many institutions have completed numerous high-resolution (0.11° and 0.22°) climate projections with several regional climate models (RCMs) over regions worldwide as part of the COordinated Regional climate Downscaling EXperiment (CORDEX; Giorgi et al. 2009; Jones et al. 2011; Gutowski et al. 2016). In particular, 55 simulations were completed within the EURO-CORDEX initiative (Jacob et al. 2013, 2020). Downscaling low-resolution data to high resolution with RCMs is computationally costly. Despite the statistical analysis and validation of these simulations against various observation sources (Sanchez-Gomez et al. 2009; Kjellström et al. 2010; Lenderink 2010; Jacob et al. 2011, 2014; Kotlarski et al. 2014; Aalbers et al. 2018), a comprehensive assessment of the added value provided by such downscaling has not been carried out, and the added value issue is still a point of debate (Di Luca et al. 2013, 2015, 2016; Hong and Kanamitsu 2014; Laprise 2014; Xue et al. 2014; Torma 2015; Giorgi et al. 2016; Prein et al. 2016; Soares and Cardoso 2018; Qiu et al. 2019).

Although one might argue that higher resolution should in principle improve all aspects of a simulation, the added value of downscaling depends on the variable and regional context of interest. For example, a higher resolution is always better at resolving complex topography and coastlines, and consequently the intensity and spatial distribution of precipitation over such regions should be improved when downscaled. Similarly, extreme precipitation events are most often very localized in space and time, and thus increasing resolution should lead to better simulations. The simulation of fine-scale circulations and their effects on regional climates, such as those due to sea breezes or mesoscale convective systems, would also in general benefit from increased resolution (e.g., Rummukainen 2016; Giorgi 2019).

While the effects of improved horizontal resolution in such cases are easily observed, they may not translate into more accurate or credible climate change information (Barsugli et al. 2013). This raises the issue of how to assess improvements in downscaled simulations over those provided by the forcing reanalyses or GCMs, and thus how to assess their added value. Towards this goal, there have been many attempts to identify the added value of a RCM compared to the driving GCM (e.g., Giorgi et al. 1994; Kanamitsu and Kanamaru 2007; Coppola et al. 2010; Kanamitsu and DeHaan 2011; Di Luca et al. 2013, 2016; Torma 2015; Giorgi et al. 2016; Lucas-Picher et al. 2017; Fantini et al. 2018; Soares and Cardoso 2018). In particular, one of the quantities often used to assess added value is the probability distribution function (PDF) of a given variable (e.g., Torma 2015; Fantini et al. 2018), as it describes the complete characteristics of the variable.

One of the quantitative metrics used to measure how well a model reproduces observed PDFs is the Kolmogorov–Smirnov distance (Chakravarti et al. 1967), which Torma (2015) used to measure the maximum difference between the cumulative distribution functions (CDFs) of a model and of the observations. Fantini et al. (2018) employed a similar metric, the Kullback–Leibler divergence (Kullback and Leibler 1951), which measures the mean logarithmic difference between two PDFs. Both metrics, applied to daily precipitation PDFs, indicated that high-resolution RCMs performed better than the coarse-resolution driving models. Fantini et al. (2018) also showed that the greatest added value was found in regions of complex topography, such as the Alps, Italy, and Norway. Instead of focusing on differences, Soares and Cardoso (2018) used the Perkins skill score (Perkins et al. 2007) to measure the common area between the simulated and observed distributions, which was then used to compare the gain (or loss) resulting from high-resolution downscaling. They showed that added value was present throughout the European region (especially for extreme precipitation), with some of the highest values obtained in the Alpine region.
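For orientation, the three distribution metrics named above can be sketched in a few lines. This is an illustrative sketch using their standard textbook definitions, not the implementation of any of the cited studies; the function names, the four-bin PDFs, and the direction of the Kullback–Leibler divergence (observations relative to model) are assumptions made for the example.

```python
import math

def ks_distance(p_model, p_obs):
    """Largest absolute gap between the two cumulative distributions."""
    cdf_model = cdf_obs = 0.0
    largest = 0.0
    for m, o in zip(p_model, p_obs):
        cdf_model += m
        cdf_obs += o
        largest = max(largest, abs(cdf_model - cdf_obs))
    return largest

def kl_divergence(p_obs, p_model):
    """Sum of p_obs * log(p_obs / p_model); needs p_model > 0 wherever p_obs > 0."""
    return sum(o * math.log(o / m) for o, m in zip(p_obs, p_model) if o > 0)

def perkins_skill_score(p_model, p_obs):
    """Overlap of the two PDFs: 1 for identical distributions, 0 for disjoint ones."""
    return sum(min(m, o) for m, o in zip(p_model, p_obs))

# Hypothetical normalised PDFs over four precipitation bins.
obs = [0.60, 0.20, 0.10, 0.10]
model = [0.50, 0.30, 0.15, 0.05]

ks = ks_distance(model, obs)
kl = kl_divergence(obs, model)
pss = perkins_skill_score(model, obs)
```

The first two metrics decrease as the model improves, whereas the Perkins skill score increases, which is why Soares and Cardoso (2018) express added value as a gain in overlap rather than a reduction in difference.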

The temporal correlation skill has proven effective for assessing the spatial distribution of an added value metric in a point-by-point analysis (Kanamitsu and Kanamaru 2007; Kanamitsu and DeHaan 2011; Prein et al. 2016). In these studies, a substantial geographical variability of the added value metric was shown, even with areas of negative added value, thus highlighting the importance of showing the geographical distribution of relevant metrics. However, this correlation-based added value index cannot be used within the context of simulations driven by GCMs, since no substantial temporal correlation can be expected with an observation time-series due to the lack of real-world data assimilation.

A good alternative is to use spatial correlation (Di Luca et al. 2016; Prein et al. 2016). Prein et al. (2016) used the fraction skill score (Roberts and Lean 2008) and spatial correlation of each model with observations to compare the added value of low and high-resolution runs of RCMs. The study analysed European observation data-sets separately in order to visualise the spatial variation of added value. Di Luca et al. (2016) also used spatial correlation and the mean square error to quantify added value (Di Luca et al. 2013). These studies showed substantial improvements in the RCM simulations in most regions analysed, with some exceptions during different seasons.

A point-by-point analysis of PDFs can thus be an optimal solution to spatially assess the added value of a RCM, since it includes both a comprehensive representation of the characteristics of a variable and its geographical variation. Therefore this paper presents a new metric to quantify the added value of a RCM with respect to its driving GCM based on a point-by-point PDF analysis of daily precipitation. We apply our approach to the European region via the large ensemble of RCM projections produced as part of the EURO-CORDEX program (Jacob et al. 2013, 2020) and on different continents via the ensemble of projections recently completed as part of the CORDEX-CORE program (Gutowski et al. 2016). The choice of precipitation is due to the availability of high-resolution observation data in Europe and the rest of the world, and to be able to compare with past studies (Torma 2015; Giorgi et al. 2016; Prein et al. 2016; Fantini et al. 2018). Moreover, precipitation is strongly affected by topography and by fine-scale spatial and temporal processes, and thus downscaling can be especially useful in improving its simulation.

Quantification of the added value for a present-day simulation can be a relatively straightforward task if appropriate observations are available, but it is difficult to quantify the existence of added value in a future climate simulation. We propose a novel way to assess the potential for added value in climate change signals: applying the same metric as for the present-day simulations to the RCM and GCM change signals. This allows us to identify when and where the change signals diverge and how different they are (Giorgi et al. 2016). If these differences are shown to be large over the same locations where an added value was proven in the present climate validation exercise, then one could assume that the RCM projection could potentially be more accurate than the GCM’s. The proposed methods are described in the next section.

2 Materials and methods

We introduce here a new method for quantifying the added value of a variable and representing it spatially. This method stems from the spatial downscaling signal described by Giorgi et al. (2016) and the spatial correlation skill mentioned in Rummukainen (2016). Other studies (Kanamitsu and DeHaan 2011; Torma 2015; Fantini et al. 2018) use different metrics to describe the difference between simulated and observed PDFs; however, these are based only on parts of the distribution. Instead, our method quantifies the added value by summing the absolute values of the differences across the entire PDFs, so that these differences do not cancel each other out. We then apply this method at each grid-point of the model domain so that we provide information on the spatial distribution of the added value.

For a variable of interest (in this case daily precipitation, including dry days), the method requires data from a RCM, the driving GCM, and an observation source (OBS; ideally of high-resolution) for the same time-period and frequency. Once the three datasets are interpolated onto a common grid, the PDFs can be calculated in a consistent way so that each grid point (for the 3 data-sets) has its own distribution, resulting in a grid of PDFs (hereafter referred to as PDF-grid). In order to ensure a fair comparison, the bin size should be identical for each grid point; however, the number of bins must be independent to properly represent the different PDFs. In this paper, a bin-size of 1 mm/day is used in order to resolve high precipitation events in the tail-end of the PDFs, since the analysis is focused on wet extremes. The calculation of the added value index (see below) depends on the bin size, and in the “Appendix” we present a sensitivity analysis of our results to a range of bin sizes. Furthermore, the grid-point maximum necessary for the computation of each PDF is taken as the maximum of all datasets at that grid point.
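The binning step can be sketched for a single grid point as follows. This is a minimal illustration rather than the code used in this study: the function name `bin_counts`, the sample values, and the left-closed bin convention are assumptions. The upper edge is taken from the maximum across the three datasets, so the number of bins can differ from one grid point to another while the bin size stays fixed.

```python
import math

def bin_counts(values, bin_size, upper):
    """Count events in bins of width bin_size covering [0, upper].

    Bin k holds bin_size*k <= v < bin_size*(k+1) (assumed convention);
    a value equal to the upper edge is placed in the last bin.
    """
    n_bins = max(1, math.ceil(upper / bin_size))
    counts = [0] * n_bins
    for v in values:
        k = min(int(v // bin_size), n_bins - 1)
        counts[k] += 1
    return counts

# Hypothetical daily precipitation (mm/day) at one grid point, dry days included.
obs = [0.0, 0.0, 0.4, 1.2, 3.7, 0.0, 5.1]
rcm = [0.0, 0.2, 0.9, 1.5, 4.2, 0.0, 6.3]
gcm = [0.1, 0.3, 0.8, 1.1, 1.9, 0.6, 2.4]

bin_size = 1.0  # mm/day, the value used in this paper
upper = max(max(obs), max(rcm), max(gcm))  # grid-point maximum over all datasets

pdf_obs = bin_counts(obs, bin_size, upper)
pdf_rcm = bin_counts(rcm, bin_size, upper)
pdf_gcm = bin_counts(gcm, bin_size, upper)
```

Because all three histograms share the same edges, they can be compared bin by bin; in this toy case the GCM has no events beyond the third bin while the observations and the RCM do.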

The resulting PDF-grid for a model is compared to the PDF-grid of the OBS by using the sum of the absolute differences between the model (M) and the observation (O) across all \({\nu }_{t}\) bins, divided by the sum of O. Here, we refer to this as the Relative Probability Difference, \(D\) (described in Eq. (1); Fig. 1), where \(N\) is the number of events in the dataset for a given bin \(\nu\), and \(\Delta \nu\) is the bin size of the variable. This calculation is done for both the RCM and GCM and the resulting plots describe the spatial distribution of \(D_M\) with respect to the observations. In this manner, the difference value \(D_M\) is a unitless quantity which represents the compounded discrepancies between the distributions. A smaller value of \(D_M\) indicates a better performance by the model.

Fig. 1 An illustrative plot of the precipitation distribution of a single grid point. The lines describe the distribution of a hypothetical model and an observation data-set. The shaded area represents the sum of the relative probability difference between the model and observations (\(D_M\))

$${D}_{M}=\frac{\sum_{\nu =1}^{{\nu }_{t}}\left|\left({N}_{M}-{N}_{O}\right)\Delta \nu \right|}{\sum_{\nu =1}^{{\nu }_{t}}\left({N}_{O}\Delta \nu \right)}. \tag{1}$$
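Eq. (1) can be evaluated at one grid point with a few lines of code. The sketch below is illustrative only; the function name and the bin counts (100 hypothetical days per dataset) are assumptions.

```python
def relative_probability_difference(n_model, n_obs, bin_size=1.0):
    """Eq. (1): sum of |N_M - N_O| * dv over all bins, divided by sum of N_O * dv."""
    if len(n_model) != len(n_obs):
        raise ValueError("model and observation PDFs must share the same bins")
    numerator = sum(abs(m - o) * bin_size for m, o in zip(n_model, n_obs))
    denominator = sum(o * bin_size for o in n_obs)
    if denominator == 0:
        raise ValueError("observation PDF has no events")
    return numerator / denominator

# Hypothetical event counts per 1 mm/day bin at one grid point.
n_obs = [60, 20, 10, 6, 3, 1]
n_rcm = [58, 21, 11, 5, 3, 2]
n_gcm = [50, 30, 15, 5, 0, 0]

d_rcm = relative_probability_difference(n_rcm, n_obs)  # 6 / 100
d_gcm = relative_probability_difference(n_gcm, n_obs)  # 30 / 100
```

With a constant bin size the factor \(\Delta \nu\) cancels between numerator and denominator, so the value of the index depends on the bin size only through the way events are grouped into bins.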

The added value index (\(A_i\)) is thus quantified by comparing \(D_{GCM}\) to \(D_{RCM}\) (Eq. 2), where a positive (negative) index represents an improvement (degradation) of the RCM results compared to the GCM ones, as suggested by Di Luca et al. (2015). The quantity \(A_i\) is also unitless, and is given by

$${A}_{i}={D}_{GCM}-{D}_{RCM}. \tag{2}$$

A problem can arise when the PDF of the GCM is missing some bin-data, for which the corresponding RCM and OBS bin-data exist. This is common, for example, at the tail-end of the distribution which GCMs tend to fail to capture (Fantini et al. 2018; Torma 2015). Such cases represent an important contribution to the added value calculation, but they cannot be quantified properly by this method because in such situations \(D_{GCM}\) is always equal to 1, while \(D_{RCM}\) can exceed this value and thus produce a misleading negative value of \(A_i\). Therefore, a conditional assumption is introduced by which if \(N_{GCM}\) of a specific bin is zero, but the corresponding \(N_{RCM}\) and \(N_O\) are non-zero, that bin contributes zero relative probability difference to the final \(D_{RCM}\), thereby ensuring a positive contribution to the index \(A_i\). In other words, we assume that the RCM adds value to the GCM if it simulates events in bins for which the observations have events and the GCM does not simulate any, regardless of how many events the RCM simulates. The inverse situation can also occur (although in fact it rarely does), in which the RCM misses data in a bin where both the OBS and GCM simulate events. In this case the same procedure is applied, so that neither the RCM nor the GCM is favored. We acknowledge that this is an assumption based on a subjective assessment that it is more important to capture the existence of events in a bin than to exactly simulate the number of such events, an assumption especially important for the tail end of the distribution, which is characterized by small numbers of rare events.

Some studies (Torma 2015; Prein et al. 2016; Fantini et al. 2018) have shown how GCMs do not resolve precipitation extremes as well as RCMs. For this reason, the method above can also be modified to focus on a particular segment of the distribution, for example the 95–100 percentile interval. In such a case, the percentile values of the observation dataset are used as thresholds for the PDFs, and the part of the complete PDF (as in Fig. 1) that contributes to this percentile interval would be the only data included in Eqs. (1) and (2). Since the 95th percentile varies from one grid-point to another, the threshold applied must be specific to that grid-point and cannot be the field-mean over the analysis domain. The 95–100 percentile interval is not an arbitrary choice, as studies have shown that substantial added value in a RCM can be found at the tail-end of the precipitation distribution (Torma 2015; Fantini et al. 2018).
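The selection of bins for a percentile interval at one grid point might look like the sketch below. The text does not specify how percentiles are interpolated or how a bin straddling the threshold is treated, so the linear-interpolation percentile and the "any overlap" rule used here are assumptions, as are the function names and sample values.

```python
import math

def percentile(values, q):
    """Percentile q (0-100) of values, using linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

def bins_in_interval(obs_values, q_low, q_high, bin_size, n_bins):
    """Indices of bins that overlap the observed [q_low, q_high] percentile range."""
    low = percentile(obs_values, q_low)
    high = percentile(obs_values, q_high)
    return [k for k in range(n_bins)
            if (k + 1) * bin_size > low and k * bin_size <= high]

# Hypothetical observed daily precipitation (mm/day) at one grid point.
obs_days = [0.0] * 10 + [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 8.5, 12.5, 20.5]

# Thresholds come from the observations at this grid point, not a domain mean.
tail_bins = bins_in_interval(obs_days, 95, 100, bin_size=1.0, n_bins=21)
```

Only the bins returned here would then enter the sums in Eqs. (1) and (2), for the model and observation PDFs alike.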

In an analogous way, a climate change downscaling signal (\(A_{DS}\) in Eq. 4) can be defined from the change between a PDF in a future climate period and a corresponding PDF in a historical period of the simulation. In this case, instead of comparing the model data to an observation dataset, we compare the future data (f) to the historical data (h) of the same simulation (as shown in Eq. 3). This is similar to the method described by Giorgi et al. (2016). In this case, the conditional assumption applied to Eq. (2) (where a model does not resolve a particular bin) cannot be applied, as these data are not compared to any observations. The larger the value of this downscaling signal, the more different the projected and reference PDFs are, and the magnitude of \(A_{DS}\) is proportional to this difference. The climate change downscaling signal \(A_{DS}\) is described in the same manner as \(A_i\), i.e., as a unitless quantity, expressed as

$${D}_{Mf}=\frac{\sum_{\nu =1}^{{\nu }_{t}}\left|\left({N}_{Mf}-{N}_{Mh}\right)\Delta \nu \right|}{\sum_{\nu =1}^{{\nu }_{t}}\left({N}_{Mh}\Delta \nu \right)}. \tag{3}$$

$${A}_{DS}={D}_{GCMf}-{D}_{RCMf}. \tag{4}$$

The quantity \(D_{Mf}\) in Eq. (3) describes a relative climate change signal within a given model M; \(N_{Mf}\) is the value of the future period PDF at bin \(\nu\) for model M; and \(N_{Mh}\) is the corresponding bin value in the historical period PDF of the same model M. The \(A_{DS}\) is the difference of the \(D_{Mf}\) signals of the RCM and GCM, i.e. it is based on the climate change signals in the driving and downscaling models (hence climate change downscaling signal). Here, large positive or negative values of \(A_{DS}\) indicate a larger climate change downscaling signal, hence a greater difference between the RCM and GCM resulting in the potential for added value. \(A_{DS}\) values close to 0 describe a weak downscaling signal. The sign of \(A_{DS}\) does not quantify which model is ‘better’, but rather how different the two PDFs are. A positive (negative) value of \(A_{DS}\) indicates a situation where the climate change signal of the GCM (RCM) in a given segment of the PDF is greater than that of the RCM (GCM). When the analysis is restricted to a specific percentile interval (such as 95–100, as mentioned above), since no observation data is included in this comparison, the percentile threshold is obtained from the historical data-set.
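Eqs. (3) and (4) can be sketched for one grid point as follows. The function names and the bin counts (100 hypothetical days per period) are assumptions made for illustration, and no conditional bin rule is applied, in line with the description above.

```python
def relative_change_signal(n_future, n_hist):
    """Eq. (3): sum of |N_Mf - N_Mh| over all bins, divided by sum of N_Mh."""
    if len(n_future) != len(n_hist):
        raise ValueError("future and historical PDFs must share the same bins")
    total_hist = sum(n_hist)
    if total_hist == 0:
        raise ValueError("historical PDF has no events")
    return sum(abs(f - h) for f, h in zip(n_future, n_hist)) / total_hist

def downscaling_signal(gcm_future, gcm_hist, rcm_future, rcm_hist):
    """Eq. (4): A_DS = D_GCMf - D_RCMf."""
    return (relative_change_signal(gcm_future, gcm_hist)
            - relative_change_signal(rcm_future, rcm_hist))

# Hypothetical event counts per bin for the historical and future periods.
gcm_hist = [50, 30, 15, 5, 0, 0]
gcm_future = [52, 27, 15, 6, 0, 0]
rcm_hist = [57, 20, 10, 6, 3, 4]
rcm_future = [59, 16, 9, 6, 5, 5]

d_gcm_f = relative_change_signal(gcm_future, gcm_hist)  # 6 / 100
d_rcm_f = relative_change_signal(rcm_future, rcm_hist)  # 10 / 100
a_ds = downscaling_signal(gcm_future, gcm_hist, rcm_future, rcm_hist)
```

The negative value obtained here corresponds to the case in which the RCM change signal is the larger of the two.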

2.1 Simulated data

For our analysis we use two GCM-RCM projection ensembles. The first is the EURO-CORDEX ensemble (Jacob et al. 2013) of 55 RCM simulations at 0.11° (Table 1). This consists of 130-year climate projections (from 1970 to 2100) for the representative concentration pathway, RCP 8.5 (Moss et al. 2008), with an incomplete matrix of 12 RCMs driven by 8 different GCMs (one should note that simulations run by MOHC-HadGEM2-ES do not include the year 2100). The analysis is carried out on daily precipitation, with a special focus on the higher percentiles of the distributions. The second data-set is the CORDEX-CORE ensemble (described in Table 2; Mearns et al. 2017; Remedio et al. 2019; Coppola et al. 2020a, b; Teichmann et al. 2020), which includes 0.22° resolution simulations run by two RCMs, the RegCM4 (Giorgi et al. 2012) and REMO2015 (Jacob et al. 2012; Remedio et al. 2019), each driven by three GCMs, for 8 non-European CORDEX domains: Africa; North, Central, and South America; East, South-East, and South Asia; and Australasia.

Table 1 EURO-CORDEX RCM ensemble members and their corresponding driving GCMs (with variant label) used for this analysis
Table 2 CORDEX-CORE RCM ensemble members for each domain (excluding Europe) and their corresponding driving GCMs used for this analysis

The method requires that all data, i.e., RCM, GCM, and observations, are defined on the same horizontal grid. This raises two issues: interpolating the GCM to a higher resolution grid may create unrealistic values, while interpolating the RCM to a lower resolution grid degrades the spatial signal and the PDF (Prein et al. 2016). The latter is especially true at the tail-end of the distribution (Torma 2015), where the largest added value is expected. To account for both issues, the analysis is conducted on two grids (using distance-weighted average interpolation): the RCM grid (0.11°), which allows us to have a more accurate representation of the spatial distribution of the index, and a 1.00° grid to ensure that the results are inter-comparable.
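The idea behind a distance-weighted average interpolation can be sketched as below. The text does not state the tool or settings used, so this is a generic inverse-distance scheme and not the procedure applied in this study; the use of the four nearest neighbours, great-circle distances, and all names and values are assumptions.

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

def idw_value(target, sources, k=4):
    """Inverse-distance-weighted mean of the k nearest source points.

    target is (lat, lon); sources is a list of (lat, lon, value).
    """
    nearest = sorted(
        (great_circle_km(target[0], target[1], lat, lon), value)
        for lat, lon, value in sources
    )[:k]
    if nearest[0][0] == 0.0:
        return nearest[0][1]  # target coincides with a source point
    weights = [1.0 / dist for dist, _ in nearest]
    weighted = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted / sum(weights)

# Hypothetical coarse-grid cell corners and precipitation values (mm/day).
sources = [(45.0, 10.0, 1.0), (45.0, 11.0, 2.0),
           (46.0, 10.0, 3.0), (46.0, 11.0, 4.0)]

centre = idw_value((45.5, 10.5), sources)
exact = idw_value((45.0, 10.0), sources)
```

Because each interpolated value is a weighted mean of its neighbours, it can never exceed the largest source value, which illustrates why regridding tends to damp the tail of the distribution.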

2.2 Observation sources

The added value calculations are dependent on the observation data used as reference, thus multiple observation datasets are used to test the method. These are reported in Table 3, and additional information on station density can be found in Prein et al. (2016) and Fantini et al. (2018). The time period available for the different datasets is not uniform so a different time period is used for each dataset. The analysis of the EURO-CORDEX data is compared to two observation sources: the EOBS v20e and a composite of 9 sub-regional observations, hereafter referred to as ‘European Composite Observations’ (ECO).

Table 3 Observation datasets used to assess the added value of the EURO-CORDEX ensemble

The CORDEX-CORE analysis is also carried out using multiple observation sources (reported in Table 4). The CPC data-set is used to assess the added value compared to low-resolution observations, while the TRMM dataset provides a comparison between satellite and station-based observations. The regional observation datasets GCOSGHCN, IMD, and APHRODITE were combined into a single data-source, hereafter referred to as global composite observations (GCO). Similarly to the European observation sources, the time periods used here are different for each dataset. However, since the indices are calculated using the entire dataset and not on a year by year basis, this should not affect the basic conclusions of the analysis.

Table 4 Observation datasets used to assess the added value of the CORDEX-CORE ensemble

3 EURO-CORDEX analysis

3.1 Added value for the present-day validation

Figure 2 shows the relative probability differences for the GCM and RCM ensembles, and the resulting added value index for the EURO-CORDEX ensemble. The relative probability difference of the GCM ensemble is shown to be significantly larger than that of the RCM ensemble virtually everywhere, resulting in a positive added value throughout the EURO-CORDEX domain. This is particularly the case in areas of complex topography, where the added value is therefore largest.

Fig. 2 Relative probability difference (\(D\)) for the GCM (left) and RCM (mid) ensemble, and added value (\(A_i\)) for the RCM ensemble (right) compared to ECO at 0.11° (top) and 1.00° (mid-top) and EOBS at 0.11° (mid-bottom) and 1.00° (bottom)

Figure 2 also provides a comparison of the added value calculated on the 0.11° and 1.00° grids. Clearly, the results are consistent for both the high and low resolution grids, and the geographical distribution of the added value is also maintained (although with less spatial detail) for the lower resolution grid. The added values calculated using the ECO and EOBS observation datasets are very similar, although slightly smaller in EOBS over some regions (e.g., Scandinavia, the British Isles, France, and the Carpathians).

Figures S1–S6 show the added value plots for individual simulations, and show a greater dependency on the GCM field than the RCM field (as also reported by Di Luca et al. 2016). The ensemble members show a large predominance of positive added value, although some members exhibit some areas of negative added value. This latter result mostly occurs due to a low relative probability difference (i.e., good performance) obtained by the driving GCM, such as HadGEM2-ES and MPI-ESM-LR (Figures S1 and S4). Conversely, the simulations providing the highest added value (Figures S3 and S6) are the ones driven by NorESM1-M and CNRM-CM5, where both GCMs display the highest relative probability difference (low performance). The only exception is the ALADIN53 driven by CNRM-CM5 which displays a very high relative probability difference (low performance) compared to the other RCMs.

Although the added value at 0.11° resolution (Fig. 2) is larger over areas of complex topography (for both ECO and EOBS), the signal appears to be smaller around the highest peaks. For example, over the Alpine region this may be attributed to localized areas with a low density of stations in the observation source (Isotta et al. 2014) which produce an apparent reduction of added value. Another reason might be the lack of an undercatch gauge correction, which is especially relevant during windy and snowy conditions and can account for up to 30% underestimation of real precipitation by gauge data (e.g., Adam and Lettenmaier 2003).

The added value shown in Fig. 2 is calculated using the entire PDF. If different percentiles of the distribution are considered, the resulting added value may be quite different. Figure 3 shows the added value as a function of the percentile interval. There are two possible choices of intervals, the first keeps one end of the interval fixed to zero and moves the other end from zero to 100 (0-x), and the second keeps the far end fixed to 100 and moves the near end from 0 to 100 (x-100).

Fig. 3 The variability of spatial mean added value index at different percentile intervals compared to ECO (top) and EOBS (bottom) at 0.11°. The EOBS data in this figure has been masked to match the locations of ECO. Each point x describes the added value of the percentile fraction ‘0-x’ (left), and ‘x-100’ (right). The shaded area shows the standard deviation of the data

When the 0th percentile is included (case 0-x), the added value of the RCM ensemble-mean gradually increases with the upper-bound threshold. This suggests that a higher added value is found at the tail-end of the distribution. The intervals that do not include the 0th percentile (case x-100), show a substantially higher added value, even for the lower percentile intervals. This implies that the RCMs perform less adequately at the 0th percentile.

When omitting the 0th percentile (case 5–100), the added value is relatively constant until about the 50th percentile and then decreases gradually until it reaches a minimum around the 90th percentile, after which the added value increases sharply when compared to ECO. To understand this behaviour, in Fig. 4 we show the observed and simulated PDFs over different sub-regions covered by the ECO data-set. It can be seen that while the RCM PDF reproduces the observed one quite well, the GCM overpredicts the frequency of low intensity events and underpredicts that of high intensity ones. In other words, there is a point at which the GCM PDF intersects the observed PDF, and this point is located around the 90–95th percentile of the observed distribution. For this reason, as the percentile interval approaches this intersection, the relative probability difference of the GCM ensemble becomes closer to that of the RCM, thus resulting in the dip seen in Fig. 3.

Fig. 4 PDFs of the RCM and GCM ensemble member data compared to all 9 regional observations at 0.11°. Each PDF includes a marker for the 75th, 95th, 99th and 99.9th percentiles

Figure 5 shows the geographical distribution of the added value calculated using ECO (at 0.11°) at different percentile intervals. Here, the ‘0-x’ intervals are positive throughout the PDF spectrum and increase in magnitude at higher intervals. This is consistent with the results presented in Fig. 3. When the 0th percentile is omitted, the ‘x-100’ intervals show a variability that is also consistent with that of Fig. 3. The 50–100 percentile interval has a larger magnitude than the 0–50 interval, and at higher intervals the added value decreases slightly in many regions due to the GCM PDF crossing the observed PDF, as explained earlier and shown in Fig. 4. The added value increases again (and peaks) at the 99–100 percentile interval, after which a second slight decrease in the added value is observed (as explained below).

Fig. 5 Added value for RCM ensemble-mean at different percentile intervals compared to ECO at 0.11°

The added value compared to the EOBS data shows similar results when looking at the same regions covered by ECO (Fig. 3). However, the results in the other areas are very different (as seen in Fig. 6), and in many regions, e.g. Russia, we even see a decrease in the added value. This is likely caused by the low density of station observations over these areas (Haylock et al. 2008). The pronounced spatial diversity in these results also illustrates the importance of using an observation data-set of equal or higher resolution than the model’s throughout the entire analysis domain when assessing the added value.

Fig. 6 Added value for RCM ensemble mean at different percentile intervals compared to EOBS at 0.11°

Another interesting feature of the added value of the ensemble mean is the slight decrease at the 99.9–100 fraction when compared to ECO (Fig. 5). The number of events occurring above the 99.9 percentile threshold tends to be very small, with numerous bins having zero events. Despite the improvement in the RCM representation of the tail, the magnitude of these extreme events is often different from the observations (and would thus correspond to a different bin value). Since this added value metric is comparing the frequency of the events in each bin, some of these cases would not be comparable. This means that the non-zero events for the GCM, RCM, and observations above this threshold may not always coincide in the same bin, which results in a more negative apparent added value. Since the frequency of events at this extreme percentile interval is very small compared to the rest of the distribution, this problem does not influence the calculations for the entire distribution.
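A toy example illustrates how mismatched bins in the extreme tail can yield a negative apparent added value. The counts below are hypothetical and chosen so that the RCM produces as many extreme events as observed, but each falls one bin away from the observed one; the function name is likewise an assumption.

```python
def relative_difference(n_model, n_obs):
    """Eq. (1) with a constant bin width, which cancels out."""
    return sum(abs(m - o) for m, o in zip(n_model, n_obs)) / sum(n_obs)

# Hypothetical event counts in six bins above the 99.9th percentile threshold.
n_obs = [0, 1, 0, 0, 1, 0]
n_rcm = [1, 0, 0, 1, 0, 0]  # same number of extremes, each shifted by one bin
n_gcm = [0, 0, 0, 0, 0, 0]  # no events this far into the tail

d_rcm = relative_difference(n_rcm, n_obs)  # 4 / 2
d_gcm = relative_difference(n_gcm, n_obs)  # 2 / 2
a_i = d_gcm - d_rcm
```

Since no bin is populated by both the RCM and the observations, the conditional assumption of Sect. 2 never applies, and the RCM is penalised twice for each displaced event (once for the missed bin and once for the spurious one), giving an index of -1 even though the RCM captures the tail and the GCM does not.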

A similar effect is also seen in the 99.9–100 added value compared to EOBS (Fig. 6), which shows a large positive added value for the 99.9–100 percentile interval fraction in many areas, but a large negative added value in others. This is likely a combined effect from the small number of events occurring at this percentile interval, and the low station density of some areas.

Our results indicate that the best observation source to use in order to assess the 0.11° EURO-CORDEX simulations is the ECO, since all of its constituent observation data-sets are of the same horizontal resolution as the model or finer. Figure 7 shows the added value of each ensemble member at the 99–100 percentile interval compared to ECO. This portion of the PDF is where the ensemble mean shows the highest added value. The positive added value is consistent in all RCM members, with the NorESM1-M-driven simulations displaying the greatest values.

Fig. 7 Added value for 99–100 percentile fraction of the EUR-11 ensemble members compared to the ECO at a resolution of 0.11°

3.2 Added value for the climate change projections

An observation-based analysis cannot be used to quantify the added value of future simulations. To address this issue, the downscaling signal described by Giorgi et al. (2016) is combined with our method, as described in Sect. 2, to provide a downscaling signal based on the PDFs. As an illustrative example, the far future time slice (2080–2099) is compared to the 1995–2014 reference period. The 90, 95, and 99 ‘x-100’ percentile intervals are shown in Fig. 8, together with the added value compared with ECO for the same intervals.

Fig. 8 Added value (top) and climate change downscaling signal (bottom) of the EURO-CORDEX ensemble for the RCP 8.5 far future at different percentile intervals and at a resolution of 0.11°

Here, the downscaling signal near complex topographic regions and coastal areas becomes increasingly visible at the higher percentile intervals, which is consistent with the added value in the same regions. The strongest downscaling signal is found in the 99–100 percentile interval, and is visible in areas such as Scandinavia, the British Isles, and mainland Europe, where the latter shows a pronounced RCM signal (negative) that does not always appear to be linked to topography or coastal areas. This implies that the RCM projects a larger climate change signal than that obtained by the GCM, as also shown by Coppola et al. (2020a, b). The high added value obtained over these regions when comparing to observations might suggest that the RCM climate change signal is more reliable than that of the GCM, similar to the realised added value described by Di Virgilio et al. (2020).

This downscaling signal is also most pronounced at the highest percentile intervals, as the change in daily precipitation is greatest for extreme events (where, here too, it is dominated by a strong RCM signal). Furthermore, the spatial structure of the P99 change signal appears similar to the one seen in Coppola et al. (2020a, b), and also conforms with the downscaling signal reported by Giorgi et al. (2016). Once again, the higher percentile intervals show a stronger signal, not only because the precipitation change is larger at the extremes but also because GCMs tend to underpredict the tail of the distribution.

4 CORDEX-CORE analysis

4.1 Added value for the present-day validation

We now move to the analysis of the CORDEX-CORE ensemble described in Sect. 2.1. Consistent with the EURO-CORDEX results, the added value of the complete daily precipitation distribution (Figure S9) is mostly positive in all regions, with the most positive values occurring in complex topographical areas. However, a few notable exceptions show a negative added value, such as areas of western North America, Sahara, South Asia, and Australia. This negative added value is attributed to the lower percentile intervals (as explained in Sect. 3.1, and shown in Figs. 3 and S7), since a lower added value in these intervals would carry a greater weight on the overall distribution.

The added value of higher percentile intervals (Figures S10–S13) is consistent with this assessment. Figure 9 focuses on the added value of the 99–100 percentile interval of daily precipitation compared to the four observational datasets (described in Sect. 2.2). This percentile interval was shown to have the most positive added value in the EURO-CORDEX analysis (see Sect. 3.1) and this is confirmed in the CORDEX-CORE ensemble (Figures S10–S13). The added value is strongly positive in the tropics, characterized by the occurrence of more intense precipitation events than in mid-latitudes, which are evidently not captured by the GCMs.

Fig. 9

Added value for the 99–100 percentile fraction of precipitation for the CORDEX-CORE ensemble members compared to CHIRPS, CPC, GCO (regionals), and TRMM, at 0.22°. The data shown for Europe are the added value compared to ECO, as in Fig. 2

The added value with respect to CHIRPS (Fig. 9) shows some areas of strongly negative values over African countries such as the Democratic Republic of Congo and South Africa. This may be at least partly due to data sparsity in these areas (Funk et al. 2015), which would especially dampen the tail end of the distributions and thus favour the GCMs. The rest of the dataset has otherwise been shown to be reliable (Funk et al. 2015). This negative added value is also visible when using the GCO dataset (especially for APHRODITE), and these regions also correspond to areas of low station density (Yatagai et al. 2009). Similarly, lower station densities (Menne et al. 2012) likely contribute to the area of negative added value in western North America. Furthermore, these areas of negative added value coincide with very arid regions. The aridity of these regions (and hence the larger number of dry days; Daly et al. 1994) likely contributes to this low added value. This is somewhat similar to the added value associated with the EOBS (as explained in Sect. 3.1, and shown in Fig. 6), and to a smaller degree with ECO (Fig. 6).

All regional observations show a significant increase in the added value for the higher percentile intervals. The CPC shows a stronger signal where positive added value is found, and more areas with negative added value than the other observation sources. This wide variability in added value (similar to the case of low station density areas in EOBS in Fig. 6) is attributed to the low resolution of the data-set. Of all the data-sets, TRMM yields the most geographically consistent positive added value.

4.2 Added value for the climate change projections

The climate change downscaling signal for the CORDEX-CORE analysis exhibits results similar to the EURO-CORDEX analysis (Sect. 3.2). Figure 10 sets the 99–100 percentile interval climate change downscaling signal against the added value relative to TRMM (which was found to produce the strongest added value). The 99–100 percentile fraction not only shows the highest added value, but also the strongest climate change downscaling signal.

Fig. 10

CORDEX-CORE RCM ensemble means for the 99–100 percentile interval at 0.22°. (Top) Added value compared to TRMM, and (bottom) climate change downscaling signal for the RCP 8.5 far future

While both signals are spatially very similar, they are not identical. A distinct topographical influence is also visible in the climate change downscaling signal, while a very strong RCM signal dominates over the equatorial regions. Once again, this implies that the RCMs are projecting a larger change in events than the GCMs in locations of strong added value with respect to observations.

5 Conclusions

In this paper, a new method for quantifying the added value of RCMs is described and tested using the EURO-CORDEX and CORDEX-CORE ensembles of GCM-driven RCM projections. The method is based on the intercomparison of PDFs for a given variable, here daily precipitation, at the grid point level. It requires the comparison of GCM and RCM PDFs with corresponding observed data-sets at the same horizontal resolution, and can be applied to estimate not only the added value in present-day climate but also the potential added value in future projections. In our study we also tested the robustness of the results to different observation data-sets. An important caveat of our method is that if, in a given bin, the observations contain events and one model (RCM or GCM) simulates events while the other does not, then the former adds value regardless of how many events it simulates. Thus we assume that it is more important for a model to capture events in a given bin where there are observations than to reproduce the exact number of observed events. This situation occurs in particular towards the tail end of the distributions, which is often not captured by the GCMs.
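As a rough illustration of the PDF intercomparison summarized above, the following minimal Python sketch computes a PDF-based added value at a single grid point. It is not the exact index defined earlier in the paper: the distance used here (the sum of absolute differences between normalized histograms), the 1 mm/day bin width, and the function names `pdf_distance` and `added_value` are all assumptions made for illustration. The three input series are assumed to have already been brought to the same horizontal resolution.

```python
import numpy as np

def pdf_distance(model, obs, bin_width=1.0):
    """Distance between model and observed PDFs on shared bins.

    Assumption: sum of absolute differences between normalized
    histograms; the index used in the paper may be defined differently.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Shared bin edges starting at zero and covering both series
    top = max(model.max(), obs.max())
    edges = np.arange(0.0, top + 2 * bin_width, bin_width)
    m_hist, _ = np.histogram(model, bins=edges)
    o_hist, _ = np.histogram(obs, bins=edges)
    # Normalize event counts to relative frequencies
    m_pdf = m_hist / m_hist.sum()
    o_pdf = o_hist / o_hist.sum()
    return float(np.abs(m_pdf - o_pdf).sum())

def added_value(gcm, rcm, obs, bin_width=1.0):
    """Positive when the RCM PDF is closer to the observed PDF than the GCM PDF."""
    return pdf_distance(gcm, obs, bin_width) - pdf_distance(rcm, obs, bin_width)

# Synthetic daily precipitation (mm/day) at one grid point, for illustration only
obs = np.array([0.1, 0.3, 2.5, 8.0, 25.0, 40.0])
gcm = np.array([0.2, 0.5, 1.5, 3.0, 4.0, 6.0])    # no events in the tail
rcm = np.array([0.1, 0.4, 2.2, 8.5, 25.5, 40.5])  # events in the observed tail bins
print(added_value(gcm, rcm, obs))
```

In this synthetic case the RCM series populates the same bins as the observations, so its distance is zero and the added value is positive. Note that this simplified distance penalizes miscounts in every bin, so it may not reproduce the caveat described above, under which a model that simulates any events in an observed bin adds value regardless of their number.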

The RCM added value was found to be predominantly positive for the EURO-CORDEX ensemble mean, and became larger when assessing only the higher percentile intervals of the daily precipitation distribution (despite a higher uncertainty due to the lower frequency of such events). This was also generally true for the CORDEX-CORE regions, where the most positive added value was produced for the 99–100 percentile interval. The contribution of the lowest percentiles of the PDF substantially reduced the added value of the overall distribution due to the higher frequency of these events.

The observation sources used for comparison had a significant influence on the added value obtained. Higher-resolution observations were better suited to identifying added value at fine scales, since these were more comparable to the model resolution and also had a better record of extreme events. Low station density in the station-based gridded observations, which smooths out the tails of the distributions in particular, could potentially produce a ‘false low or negative added value’. Overall, this method supports previous studies (Torma 2015; Fantini et al. 2018) in showing that RCMs provide added value by better representing extreme events.

The method was also used to produce a PDF-based climate change downscaling signal for future simulations, which was also found to increase at higher percentile intervals and in areas characterized by complex topography. The CORDEX-CORE ensemble showed this signal to be strongest in the equatorial regions.

The method described in this study explicitly demonstrates that RCMs provide added value for precipitation in regions of complex topography, coastal areas and islands, as well as in tropical regions, especially for the tail end of the distribution (extremes), as a result of the higher resolution of the downscaling models. Although the method has so far been used only to assess precipitation, it can be used to quantify the added value of any variable, provided reliable high-resolution observation data-sets are available.