1 Introduction

Identifying the most effective strategies to reduce the space/time scale gap between climate simulation results and users’ requirements represents a pivotal challenge in climate research. Such a gap is even more evident since impact scientists require, as input for their models, tailored climate information that is not readily available from current climate simulations (Fowler et al. 2007; Giorgi et al. 2009; Mearns et al. 2015; Reder et al. 2018).

In recent years, different strategies have been developed to reduce this gap (Maraun and Widmann 2018). Enhancing the horizontal resolution of climate models through dynamical downscaling could represent a first significant improvement (Kendon et al. 2012; Chan et al. 2013). Climate models with such an enhanced resolution are the regional climate models (RCMs). RCMs represent a dynamical refinement, over a limited area, of coarser general circulation models (GCMs) or observation-based datasets (reanalyses). Because they cover a limited area, RCMs need to be initialized with initial conditions and driven along their lateral atmospheric boundaries and lower-surface boundaries with time-variable conditions that are explicitly derived from the results of the coarser native model (GCM or reanalysis).

The gains or losses associated with the use of RCM simulations at finer resolution against GCM simulations, reanalysis or RCM simulations at coarser resolutions are acknowledged as added value (Di Luca et al. 2012, 2015; Lucas-Picher et al. 2012; Prein et al. 2013; Ban et al. 2014; Montesarchio et al. 2014; Hackenbruch et al. 2016; Kendon et al. 2017; Berthou et al. 2018; Chan et al. 2018; Fumière et al. 2019). The added value represents a general concept describing the degree of enhancement provided by a spatial refinement of climate models (namely how much the decrease in the model grid spacing can improve the representation of climate features).

The evaluation of the added value is a relevant issue, especially in mountainous areas, where the representation of local orography poses a considerable challenge for RCMs in reproducing mean climate and extremes, in particular for short-duration precipitation related to convective instability. Convective precipitation falls over a localized area with variable intensity, due to the limited horizontal extent of convective clouds (cumulonimbus or cumulus congestus). In mid-latitudes, it is an intermittent event, often related to baroclinic boundaries and to orographic barriers. From a numerical viewpoint, convective processes are hard to simulate as they involve a multitude of processes occurring at a very local scale (< 4 km). For this reason, they are usually parameterized, even if the parameterization itself and the related assumptions could induce systematic errors in the simulation of convective precipitation.

On this topic, Prein et al. (2016) pointed out that RCMs (resolution = 0.11°) capture mean and extreme precipitation in Europe better than coarser ones for almost all regions and seasons, mainly in the Alps. Such an enhancement is due to an improvement in the schematization of orography. Referring again to the Alpine region, Torma et al. (2015) considered the European and Mediterranean branches of the Coordinated Regional Climate Downscaling Experiment (CORDEX) (Drobinski et al. 2014; Jacob et al. 2014; Giorgi and Gutowski 2015) of the World Climate Research Programme (WCRP), referred to as EURO-CORDEX and Med-CORDEX, respectively, to highlight the added value due to the adoption of a higher resolution for the representation of mean and extreme precipitation. The authors state that such an added value is related to the improvement in the schematization of topographic features and, more importantly, that it is associated with physical processes and not with a disaggregation of the large-scale forcing. However, other investigations highlight the inaccuracy of climate simulations, with deep convection parameterization and horizontal resolution in the order of 10 km, in reproducing short-duration precipitation (Hanel and Buishand 2010; Kendon et al. 2014; Berg et al. 2019). In this perspective, some studies have shown that very high-resolution (VHR) simulations (grid spacing below 4 km) could improve the models’ capability to reproduce these phenomena (Coppola et al. 2018), also thanks to the explicit treatment of the convective processes and a better representation of the orography (Ban et al. 2014; Prein et al. 2015; Berthou et al. 2018; Fumière et al. 2019).
In recent years, an increasing number of studies have addressed convection-permitting climate simulations, showing that convection-permitting models do not necessarily better represent daily mean precipitation, but provide significantly improved sub-daily rainfall characteristics, such as the diurnal cycle and intensity of hourly precipitation extremes (e.g., Chan et al. 2013; Ban et al. 2014; Fosser et al. 2015; Pilon et al. 2016; Berthou et al. 2018; Fumière et al. 2019).

Despite the considerable efforts made in recent years, a statistical evaluation of the added value due to the horizontal and temporal high resolution in the representation of climate has not been fully explored yet (Fumière et al. 2019). In this perspective, the climate community is mainly interested in quantifying the advantages in considering time- and cost-expensive simulations for limited area applications (Giorgi et al. 2009; Kendon et al. 2012; Chan et al. 2013), especially for climate impact research.

Within this framework, this study aims at investigating the performances of VHR simulations, evaluating the capability to reproduce daily and sub-daily precipitation dynamics in a complex orographic context such as the Alpine region, often affected by heavy precipitation events which are likely to be significantly impacted in the future. The main goal is to objectively quantify gains and losses related to the modeling of the present climate due to an enhancement in temporal and spatial resolution.

This issue is addressed by comparing precipitation data, yielded from three climate experiments at different spatial scales, with areal and local observational datasets. The Expert Team on Climate Change Detection and Indices (ETCCDI) indicators (http://etccdi.pacificclimate.org/list_27_indices.shtm) and a selection of statistical models are used to assess precipitation distribution and extreme values for different durations of the precipitation events. To objectively evaluate gains and losses in adopting VHR simulations, results are compared by distribution added value (DAV) metric (Soares and Cardoso 2017). The study demonstrates a general gain in moving from the lowest to the highest resolution, especially at higher altitudes, thanks to a better representation of real topography and the possibility of switching off the deep convection parameterization.

First, the study describes (Sect. 2) the climate simulations and the observational datasets considered to evaluate VHR enhancements, as well as the methodology used to objectively quantify gains and losses in moving from lower to higher resolutions. Then, it shows the main results (Sect. 3) at the areal scale quantifying the potential added value of VHR and investigating the role of orography. Finally, the study shows the main results at the point scale to investigate sub-daily precipitation dynamics (Sect. 4) by statistically analyzing both the precipitation distribution and the precipitation extremes.

2 Materials and methods

2.1 Climate experiments

In this work, three regional climate simulations at different spatial scales have been selected (Fig. 1a).

Fig. 1

Climate experiment analysis domains (a) and evaluation domain (including orography of CCLM 2.2 and a zoom on the selected local weather stations) (b)

The first dataset, labeled “EM-EC,” represents the ensemble mean of all EURO-CORDEX simulations available at January 2018 on the platform of the Earth System Grid Federation (ESGF), over 1979–2010 with daily resolution and driven by the ERA-Interim Reanalysis (Dee et al. 2011), with a spatial resolution of 0.11° (~ 12 km). The list of EURO-CORDEX simulations considered for the ensemble mean is reported in Table 1.

Table 1 EURO-CORDEX simulations at the horizontal resolution of 0.11° (about 12 km) covering the full period 1979–2010 with daily resolution, adopted in the present study (available at January 30, 2018; modified from Kotlarski et al. 2014)

The second and third datasets consist of the results of climate simulations performed by the Centro Euro-Mediterraneo sui Cambiamenti Climatici (CMCC), characterized by two different configurations of the regional climate model COSMO-CLM (Rockel et al. 2008). The first configuration, labeled “CCLM 8,” is characterized by a spatial resolution of 0.0715° (~ 8 km) and an output frequency of 6 h, forced by ERA-Interim Reanalysis and covering the whole Italian peninsula and part of the neighboring countries. Its performance has already been widely evaluated over the Italian peninsula (Bucchignani et al. 2016; Zollo et al. 2015), highlighting a good agreement with several observational datasets in terms of mean and extreme values of temperature and precipitation. Furthermore, CCLM 8 has been used as input for several impact applications (Vezzoli et al. 2015; Reder et al. 2016; Rianna et al. 2017; Ciervo et al. 2017; Rianna et al. 2020). The second configuration, labeled “CCLM 2.2,” is a climate simulation characterized by a finer spatial resolution (0.02°, ~ 2.2 km) and an output frequency of 3 h, forced by the CCLM 8 simulation and covering a smaller area, centered over the Alpine space. Both simulations (CCLM 8 and CCLM 2.2) cover the period 1979–2010.

Table 2 summarizes the main features of the two configurations, listing the parameterizations used to account for the sub-grid-scale physical processes.

Table 2 Main characteristics of the two implemented COSMO-CLM configurations

Apart from model resolution, domain and time step, the model setup for the two configurations is the same (Table 2). From a physical point of view, the main difference is the convection representation. Formally, the default COSMO convective parameterization is the Tiedtke mass-flux scheme with moisture-convergence closure (Tiedtke 1989). Such a scheme distinguishes between shallow, deep and midlevel convection. In the convection-resolving setup (i.e., CCLM 2.2), only the shallow convection part of the scheme is active, while for deeper clouds the scheme is turned off.

2.2 Observational datasets

Two different observational datasets are taken into account to evaluate the accuracy of climate experiments:

  • EURO4M (Isotta et al. 2014): It is a daily gridded dataset covering the European Alps and adjacent flatland with a horizontal resolution of 5 km for 1971–2009. It is based on rain-gauge data, with a distance-angular weighting scheme integrating climatological precipitation–topography relationships. The limitations due to the interpolation method are the underestimation (typically 10–20%) of high intensities (smoothing effect) and overestimation at low intensities (moist extension into dry areas), while systematic errors are more substantial for convective rainfall (Ban et al. 2014; Isotta et al. 2014).

  • Local weather stations (LWS): Hourly measures of precipitation provided by 11 local weather stations (Fig. 1b) at different altitudes managed by the Agenzia regionale per la protezione ambientale (ARPA) Lombardia (Italy) and freely available at https://www.arpalombardia.it/; the selected stations are listed in Table 3 together with the ID code, the spatial coordinates and the elevation. LWS data are used for the evaluation of CCLM 8 and CCLM 2.2 at the sub-daily scale. In this perspective, the position of local stations is adopted to select the corresponding grid point from the CCLM 8 and CCLM 2.2 grids using the nearest neighbor interpolation with a specific refinement for the CCLM 2.2 for which the grid point with the smallest altitudinal difference, searched in a 4 km radius around the station, is considered (Kaufmann 2008; Ban et al. 2014).

    Table 3 List of local weather stations (LWS) used for the model evaluation at sub-daily scale
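The grid-point selection described for the LWS can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the haversine distance, and the fallback to the nearest point when no candidate lies within the radius are all assumptions made here.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; inputs in degrees, NumPy-broadcastable."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def select_grid_point(st_lat, st_lon, st_elev,
                      grid_lat, grid_lon, grid_elev, radius_km=None):
    """Return the flat index of the grid point matched to a station.

    radius_km=None -> plain nearest neighbour (as for CCLM 8).
    radius_km=4.0  -> among points within the radius, pick the one with the
                      smallest elevation difference (as for CCLM 2.2).
    """
    dist = haversine_km(st_lat, st_lon, grid_lat, grid_lon).ravel()
    if radius_km is None:
        return int(np.argmin(dist))
    candidates = np.flatnonzero(dist <= radius_km)
    if candidates.size == 0:
        # Assumption: fall back to the nearest point if the radius is empty.
        return int(np.argmin(dist))
    dz = np.abs(np.asarray(grid_elev, float).ravel()[candidates] - st_elev)
    return int(candidates[np.argmin(dz)])
```

Calling `select_grid_point(..., radius_km=None)` reproduces the plain nearest-neighbor choice, while `radius_km=4.0` applies the altitude refinement within the 4 km search radius.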

2.3 Analyzed domain and temporal resolution

The domain of the present study consists of the Alpine region (Fig. 1b). For each dataset (both climate experiments and observations), all the grid points belonging to this domain have been considered without performing remapping operations (1353, 3200, 40,801 and 8127 grid points for EM-EC, CCLM 8, CCLM 2.2 and EURO4M, respectively).

As regards the temporal range, climate experiments are analyzed over 1980–2008 for daily precipitation (areal evaluation with respect to EURO4M) and 1995–2010 for sub-daily precipitation (point-scale evaluation against LWS). Each period is obtained by a time intersection between model results and observed datasets; in the first case, the year 1979 was neglected as it is considered as spin-up for the CCLM simulations.

As concerns the temporal resolution, EM-EC is considered only for investigation at the daily scale, while CCLM 8 and CCLM 2.2 are adopted also for sub-daily analysis with a time step of 6 h (the time resolution shared by both the climate experiments). In this perspective, also the LWS data, collected at a time resolution of 1 h, have been aggregated at the 6-h time resolution.
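The aggregation of hourly LWS records to the 6-h resolution can be illustrated with a short sketch. It assumes that the hourly series starts at the beginning of a 6-h window and that windows are fixed (non-overlapping); the text does not state either choice, and the function name is hypothetical.

```python
import numpy as np


def aggregate_to_window(hourly, window=6):
    """Sum an hourly precipitation series over fixed, non-overlapping windows.

    Trailing hours that do not fill a complete window are dropped.
    A window containing a missing value (NaN) yields NaN.
    """
    hourly = np.asarray(hourly, dtype=float)
    n_full = (hourly.size // window) * window
    blocks = hourly[:n_full].reshape(-1, window)
    return blocks.sum(axis=1)
```

Propagating NaN to the whole window is a conservative choice; a different treatment of gaps may be preferable depending on the completeness criteria adopted.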

2.4 Quantifying the added value: DAV score method

To assess the performance of climate experiments and mainly to objectively quantify the added value in adopting higher-resolution RCMs, the distribution added value (DAV) is adopted (Soares and Cardoso 2017). Such a metric provides an objective and normalized measure of the added value in terms of potential gain in the performance of climate models due to the usage of a higher resolution, comparing the probability density functions (PDFs) of the higher- and coarser-resolution simulations to the observational PDF. In this perspective, DAV accounts for the difference in skill scores (Perkins et al. 2007) between high resolution (subscript hr) and low resolution (subscript lr) assuming the observations (subscript obs) as reference:

$${\text{DAV}} = \frac{{S_{\text{hr}} - S_{\text{lr}} }}{{S_{\text{lr}} }} = \frac{{\mathop \sum \nolimits_{1}^{n} \hbox{min} \left( {Z_{\text{hr}} ,Z_{\text{obs}} } \right) - \mathop \sum \nolimits_{1}^{n} \hbox{min} \left( {Z_{\text{lr}} ,Z_{\text{obs}} } \right)}}{{\mathop \sum \nolimits_{1}^{n} \hbox{min} \left( {Z_{\text{lr}} ,Z_{\text{obs}} } \right)}}$$
(1)

where DAV is the distribution added value; Shr and Slr are the Perkins skill scores for high and low resolution, respectively; n represents the number of bins considered to obtain the PDF; Zhr, Zlr and Zobs are the frequencies of values in a given bin for high resolution, low resolution and observations, respectively.

DAV allows estimating the benefit associated with a higher resolution:

  • DAV = 0 indicates that no gain is found;

  • DAV < 0 points out a loss associated with the usage of a higher resolution;

  • DAV > 0 expresses the beneficial impact of increasing the resolution (i.e., reducing the grid spacing).

In general, DAV represents a tool capable of comparing any kind of gridded information. For this reason, it can be applied to climate model results as well as to other physical variables such as the orography characteristics. It has great potential, as it is versatile and synthetic, but it also has a drawback: it cannot locate over- and underestimations.
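As an illustration of Eq. 1, the following sketch computes the Perkins skill score and the DAV from three samples. The binning (`bin_edges`) is left to the caller because the text does not specify it; the function names are hypothetical.

```python
import numpy as np


def perkins_skill_score(model, obs, bin_edges):
    """Overlap between two binned PDFs: sum over bins of min(Z_model, Z_obs)."""
    z_model, _ = np.histogram(model, bins=bin_edges)
    z_obs, _ = np.histogram(obs, bins=bin_edges)
    # Convert counts to relative frequencies so that a perfect match gives 1.
    z_model = z_model / z_model.sum()
    z_obs = z_obs / z_obs.sum()
    return float(np.minimum(z_model, z_obs).sum())


def dav(high_res, low_res, obs, bin_edges):
    """Distribution added value: (S_hr - S_lr) / S_lr."""
    s_hr = perkins_skill_score(high_res, obs, bin_edges)
    s_lr = perkins_skill_score(low_res, obs, bin_edges)
    return (s_hr - s_lr) / s_lr
```

Values falling outside `bin_edges` are ignored by `np.histogram`, so the edges should span all three samples. The same functions apply to any gridded field, for example model orography against a reference elevation model.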

2.5 Methods and tools for daily analysis

Precipitation (PRCP) data are processed using a selection of ETCCDI indicators and statistical models able to assess mean distributions and extreme values for different durations of the precipitation events.

For daily scale analysis, the following ETCCDI indicators are considered:

  • PRCPTOT: annual total precipitation in wet days (PRCP ≥ 1 mm)

  • R20mm: annual count of days when PRCP ≥ 20 mm

  • CDD: maximum length of dry spell (i.e., maximum number of consecutive days with PRCP < 1 mm)

  • CWD: maximum length of wet spell (i.e., maximum number of consecutive days with PRCP ≥ 1 mm)

All indicators are computed over the entire domain (see Fig. 1b) on a yearly base and then averaged over 1980–2008.
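For a single grid point and a single year, the four indicators listed above can be computed as in the sketch below. It is a simplified illustration: each year is treated independently, so dry or wet spells crossing the year boundary are truncated, and the function names are hypothetical.

```python
import numpy as np

WET_DAY_MM = 1.0  # wet-day threshold used by PRCPTOT, CDD and CWD


def longest_run(mask):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in mask:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def etccdi_indices(daily_prcp):
    """Return PRCPTOT, R20mm, CDD and CWD for one year of daily PRCP (mm)."""
    p = np.asarray(daily_prcp, dtype=float)
    wet = p >= WET_DAY_MM
    return {
        "PRCPTOT": float(p[wet].sum()),   # total precipitation in wet days
        "R20mm": int((p >= 20.0).sum()),  # days with PRCP >= 20 mm
        "CDD": longest_run(~wet),         # longest dry spell (PRCP < 1 mm)
        "CWD": longest_run(wet),          # longest wet spell (PRCP >= 1 mm)
    }
```

Averaging the yearly values over 1980–2008 then gives the climatological indicator at that grid point.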

For sub-daily analysis, precipitation patterns are evaluated by:

  • Interpreting data at time resolution of 1 day and 6 h through the empirical distribution function to analyze the pooled precipitation samples;

  • Fitting data at time resolution of 6 h through the storm index method to analyze the maximum values distribution.

The storm index method (Brath et al. 2003) is considered a common approach to analyze precipitation extremes since it is able to ensure rainfall consistency, preserving the increasing dependence of precipitation depth on both duration and return period (Padulano et al. 2019). According to the storm index method, the rainfall depth of an extreme precipitation event x with return period T and rainfall duration tr is obtained as function of a scale parameter (µ), only depending on duration, and a frequency parameter or “growth factor” (kT), only depending on the return period:

$$x_{T} \left( {t_{r} } \right) = \mu \left[ {x\left( {t_{r} } \right)} \right] \cdot k_{T}$$
(2)

Focusing on the growth factor kT, one of the most common functions for hydrological applications concerning maxima issues is represented by the generalized extreme value (GEV) probability distribution (Hosking et al. 1985), expressed as:

$$\varPhi_{\text{GEV}} \left( {k_{T} } \right) = \exp \left\{ { - \left[ {1 - \kappa \cdot \left( {\frac{{k_{T} - \mu }}{\sigma }} \right)} \right]^{{\frac{1}{\kappa }}} } \right\}$$
(3)

where κ, σ and µ are the shape, scale and location parameters, respectively. For κ = 0, GEV coincides with the Gumbel distribution; for κ < 0, it coincides with the Fréchet distribution; for κ > 0, it corresponds to the Weibull distribution.
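A minimal sketch of how the GEV distribution in the sign convention of Hosking et al. (1985) combines with the storm index relation (Eq. 2) is given below. The function names and the numerical parameter values in the usage note are hypothetical and serve only as an illustration.

```python
import math


def gev_cdf(k, shape, scale, loc):
    """GEV CDF of the growth factor k (Hosking sign convention for shape)."""
    y = (k - loc) / scale
    if abs(shape) < 1e-12:
        return math.exp(-math.exp(-y))  # Gumbel limit
    arg = 1.0 - shape * y
    if arg <= 0.0:
        # Outside the support: above the upper bound (shape > 0)
        # or below the lower bound (shape < 0).
        return 1.0 if shape > 0 else 0.0
    return math.exp(-arg ** (1.0 / shape))


def gev_growth_factor(return_period, shape, scale, loc):
    """Growth factor k_T with non-exceedance probability 1 - 1/T."""
    reduced = -math.log(1.0 - 1.0 / return_period)
    if abs(shape) < 1e-12:
        return loc - scale * math.log(reduced)
    return loc + scale / shape * (1.0 - reduced ** shape)


def design_rainfall(mean_annual_max, return_period, shape, scale, loc):
    """Eq. 2: depth = scale parameter (mean annual maximum) x growth factor."""
    return mean_annual_max * gev_growth_factor(return_period, shape, scale, loc)
```

For instance, with the hypothetical values `shape=-0.1`, `scale=0.3`, `loc=0.8` and a mean annual maximum of 40 mm for a given duration, `design_rainfall(40.0, 50, -0.1, 0.3, 0.8)` returns the 50-year depth for that duration.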

3 Areal evaluation

3.1 Comparison in terms of ETCCDI indicators

The first part of the study consists in comparing the three climate experiments (EM-EC, CCLM 8 and CCLM 2.2) to the EURO4M observational dataset, considering the domain shown in Fig. 1b. Figures 2, 3, 4 and 5 show the results for the PRCPTOT, R20mm, CDD and CWD indices, respectively. It should be emphasized that the data are not remapped to avoid artificial downscaling/upscaling. Avoiding remapping tends to penalize the coarser-resolution models more (since they feature, by construction, a smaller spatial variability); however, such an approach aims at emphasizing the actual added value at the finer scale, which is the scale usually used by impact scientists for their models.

Fig. 2

Comparison between EURO4M (upper left panel), EM-EC (lower left panel), CCLM 8 (upper right panel) and CCLM 2.2 (lower right panel) for PRCPTOT over 1980–2008

Fig. 3

Comparison between EURO4M (upper left panel), EM-EC (lower left panel), CCLM 8 (upper right panel) and CCLM 2.2 (lower right panel) for R20mm over 1980–2008

Fig. 4

Comparison between EURO4M (upper left panel), EM-EC (lower left panel), CCLM 8 (upper right panel) and CCLM 2.2 (lower right panel) for CDD over 1980–2008

Fig. 5

Comparison between EURO4M (upper left panel), EM-EC (lower left panel), CCLM 8 (upper right panel) and CCLM 2.2 (lower right panel) for CWD over 1980–2008

As regards PRCPTOT (Fig. 2), EURO4M returns values between 300 and 1200 mm/year except for the northwestern and southeastern areas where the values range between 1800 and 2700 mm/year. Compared to EURO4M, all the climate experiments overestimate PRCPTOT. In particular, from a graphical viewpoint, the CCLM 2.2 simulation is characterized by the lowest bias.

As regards R20mm (Fig. 3), EURO4M exhibits a pattern similar to the PRCPTOT one, with values generally ranging between 10 and 40 days/year. EM-EC is characterized by a general overestimation, whereas CCLM 8 and CCLM 2.2 better reproduce both values and spatial distribution.

As concerns CDD (Fig. 4), EURO4M highlights lower values over the southeastern part of the domain, with values ranging between 20 and 40 days/year. Compared to EURO4M, EM-EC better reproduces this indicator with respect to CCLM 8; at the same time, CCLM 2.2 shows a lower bias in the central part of the domain, but it tends to be characterized by a strong overestimation on the southwestern part of the domain.

Contrary to CDD, EURO4M returns an increase in CWD (Fig. 5) from bottom to top with values generally between 4 and 12 occurrences/year. EM-EC overestimates such an indicator on the entire domain, whereas CCLM 8 and CCLM 2.2 show a lower bias, especially the CCLM 2.2 simulation, which reveals a reduced overestimation of the highest values in the central part of the domain.

3.2 Added value assessment

In order to quantify gains and losses associated with the use of VHR simulations, in this section the results in terms of DAV score (Eq. 1) are reported (Table 4). Specifically, the DAV is computed for each indicator by first comparing EM-EC, taken as lr, to CCLM 8, taken as hr, and then CCLM 8, assumed as lr, to CCLM 2.2, assumed as hr.

Table 4 DAV score for each ETCCDI indicator

Moving from EM-EC (about 12 km of resolution) to CCLM 8 (about 8 km of resolution), the obtained improvement is evident, especially in terms of PRCPTOT (about 21%) and CWD (about 30%), whereas a worsening is returned in terms of CDD (about −10%). Moving from CCLM 8 to CCLM 2.2, such an improvement is attenuated and, above all, R20mm, assumed as an index of extreme precipitation events, yields a loss of performance. However, such a loss may not be due to the VHR itself, but could be associated with the spatial resolution of the EURO4M dataset (about 5 km) considered as reference for the evaluation, which is intermediate between the two climate experiments. There are indeed many issues related to the observational dataset, such as systematic error in catchment area calculation due to local influences of wind, limitations due to the station density, changing positions and changing instruments, and dependences on surface altitude that can also be found. In addition, although the observational dataset EURO4M lies on a numerical grid with ~ 5 km grid spacing, its “effective resolution” is 10–15 km (Isotta et al. 2014) due to the density of the underlying station network. For instance, the highest station density can be found in Germany, Switzerland, Austria and France with ~ 8 to 14 stations per 1000 km², whereas the station density in Italy is about 6 stations per 1000 km² and even less in Croatia. In this perspective, Sungmi and Foelsche (2018) have recently demonstrated that at least 3 stations per 300 km² (i.e., 10 stations per 1000 km²) are required to keep interpolation errors (normalized RMSE) of heavy (90th percentile) daily precipitation well below 20%. These insights suggest the increasing need for higher-resolution observational datasets, mainly for VHR models, with resolutions up to 1 km, whose results are sometimes difficult to validate (Kendon et al. 2014; Ban et al. 2015; Fosser et al. 2015).

3.3 Effect of orography

This section focuses on the added value quantified this time by clustering the results according to the orography. Such an evaluation is made up of two steps, the former consisting in testing the representation of orography itself, the latter consisting in testing what is the effect provided by the enhancement in the representation of local orography on the climate model results.

To take into account the limitations in the reliability of the EURO4M dataset above 1500 m a.s.l. (Isotta et al. 2014), three altitude classes are considered: 0–750 m a.s.l., 750–1500 m a.s.l. and > 1500 m a.s.l. Table 5 reports the DAV obtained by moving from the coarser resolution (8 km) to the finer one (2.2 km) for the three aforementioned altitude classes. In this case, the digital elevation model over Europe EU-DEM v1.1 (https://land.copernicus.eu/imagery-insitu/eu-dem/eu-dem-v1.1?tab=metadata) at 25 m resolution has been assumed as reference. An improvement in orography refinement has been obtained adopting the finer resolution for the 0–750 m a.s.l. and > 1500 m a.s.l. classes, while CCLM 2.2 returns a slight underestimation for the 750–1500 m a.s.l. class.
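The clustering by altitude can be sketched as below. The treatment of the class boundaries (750 m assigned to the lowest class, 1500 m to the intermediate one) is an assumption suggested by the "> 1500 m" wording; the function names and labels are hypothetical.

```python
import numpy as np

CLASS_EDGES_M = [750.0, 1500.0]
CLASS_LABELS = ["0-750 m", "750-1500 m", ">1500 m"]


def altitude_class(elevation_m):
    """Return 0, 1 or 2 for the three altitude classes (upper edges inclusive)."""
    return np.digitize(elevation_m, CLASS_EDGES_M, right=True)


def group_by_class(values, elevation_m):
    """Split grid-point values into one array per altitude class."""
    values = np.asarray(values, dtype=float)
    classes = altitude_class(np.asarray(elevation_m, dtype=float))
    return {label: values[classes == i] for i, label in enumerate(CLASS_LABELS)}
```

Each per-class array can then be passed to a DAV computation, once for the low-resolution experiment and once for the high-resolution one, against the reference values falling in the same class.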

Table 5 DAV score for orography refinement from CCLM 8 to CCLM 2.2 (EU-DEM v1.1 is assumed as reference)

Figure 6 plots the results obtained for each indicator assuming EURO4M as observation dataset and CCLM 8 and CCLM 2.2 as climate datasets. The results are represented as box-and-whisker plots showing, in addition to the mean value, the 10th and 90th percentiles of the distribution. Table 6 reports the DAV associated with the results of Fig. 6.

Fig. 6

Comparison between EURO4M, CCLM 8 and CCLM 2.2 data clustered on the basis of the altitude for PRCPTOT (a), R20mm (b), CDD (c) and CWD (d) indicators

Table 6 DAV score clustered for altitudes (data consider only the enhancement in resolution from CCLM 8 to CCLM 2.2)

As concerns PRCPTOT, EURO4M shows values of about 1089 mm/year, 1270 mm/year and 1191 mm/year for the three altitude classes, respectively (Fig. 6a). On the other hand, CCLM 8 and CCLM 2.2 slightly underestimate it at low altitude, while overestimating it for the other two classes. Nevertheless, considering average and spread, an overall gain due to the resolution refinement (+ 32% in the range 750–1500 m a.s.l. and + 63% for > 1500 m a.s.l.; Table 6) can be observed.

As regards R20mm, the only indicator related to extreme events, the observations return on average 14, 17 and 15 events with precipitation depth higher than 20 mm per year for the different altitude classes, respectively (Fig. 6b). Both climate experiments underestimate this indicator at the lowest altitudes and slightly overestimate it at the highest altitudes (Table 6) with a slight loss of performance of CCLM 2.2 with respect to CCLM 8 in the first case (− 13%) and a gain in the second one (up to + 7%).

As concerns CDD (Fig. 6c) and CWD (Fig. 6d), both indicators show negative DAV values over the lowest altitude areas, whereas a performance enhancement can be found for the higher altitudes when a finer resolution is adopted. This results in a gain due to the spatial refinement from 8 to 2.2 km (Table 6).

In summary, the spatial resolution refinement generates for the case in hand an added value in the range 750–1500 m and a loss at lower altitudes with some exceptions. For the highest altitudes, the evaluation of the DAV returns a gain for all the indicators. However, small DAV values in Table 6 may not be a reliable estimate of the added value as model performances could also be affected by the limited reliability in using EURO4M at the highest altitudes.

4 Point-scale evaluation

4.1 Improvement in orography representation and data quality analysis

The last section of this work focuses on the characterization of sub-daily precipitation patterns and the evaluation of the potential added value provided by the refinement of spatial resolution at the point scale. In this perspective, the datasets adopted to investigate these issues are LWS as observations and CCLM 8 and CCLM 2.2 as climate experiments, considering the period 1995–2010 for the analysis.

The data provided by LWS have been preliminarily analyzed to verify their quality in terms of completeness. In this perspective, the stations do not present any missing values, with the exception of ID108 and ID133, whose completeness is 86% and 46%, respectively. Assuming the value of 75% as threshold for the completeness analysis (ISPRA 2012; Padulano and Del Giudice 2020), it is decided to exclude ID133 from the investigation.
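The completeness screening can be expressed in a few lines. The sketch assumes that missing records are encoded as NaN and that a station is retained when its completeness is at least equal to the threshold; both are assumptions, as is the function naming.

```python
import numpy as np


def completeness(series):
    """Fraction of non-missing (non-NaN) records in a series."""
    s = np.asarray(series, dtype=float)
    return float(np.count_nonzero(~np.isnan(s)) / s.size)


def keep_station(series, threshold=0.75):
    """True if the station passes the completeness threshold."""
    return completeness(series) >= threshold
```

With this rule, a station with 86% completeness is retained and one with 46% completeness is discarded, consistent with the treatment of ID108 and ID133.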

In order to compare observations and climate simulations, the position of local stations is used to select a corresponding grid point from the CCLM 8 and CCLM 2.2 grids using the nearest neighbor interpolation with a specific refinement for the CCLM 2.2 for which also an altitude constraint is introduced (see Sect. 2.2). Figure 7 compares the elevations of the selected grid points to the local station ones. The criterion adopted for the selection of the CCLM 2.2 grid points returns a significant improvement highlighting the enhancement in the representation of local orography obtained by refining the spatial resolution. The comparison returns a coefficient of determination increasing from about 37% for the coarser resolution to about 95% for the finer one.

Fig. 7

Comparison of elevation between LWS and corresponding grid points for CCLM 8 and CCLM 2.2

4.2 Precipitation distribution

Precipitation distribution at the point scale is analyzed by evaluating the empirical cumulative distribution functions (CDFs) of the precipitation samples for each dataset (LWS observations, CCLM 8 and CCLM 2.2) at daily and 6-h temporal resolutions, over the DJF (December–January–February) and JJA (June–July–August) seasons.
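An empirical CDF of a pooled precipitation sample can be obtained as in the sketch below. The plotting position i/n is an assumption, since the text does not state which one is used, and the function name is hypothetical.

```python
import numpy as np


def empirical_cdf(sample):
    """Return sorted values and their empirical non-exceedance frequencies."""
    x = np.sort(np.asarray(sample, dtype=float))
    x = x[~np.isnan(x)]                    # discard missing records
    f = np.arange(1, x.size + 1) / x.size  # plotting position i/n
    return x, f
```

Applying the function separately to the LWS, CCLM 8 and CCLM 2.2 samples of a given station and season gives the three curves to be compared.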

The choice of focusing on DJF and JJA is mainly related to the type of physical process leading to precipitation during these seasons. Indeed, JJA precipitation is mainly driven by convective processes that generate short-duration, localized events with variable intensity, while DJF precipitation is mainly driven by advective processes that generate events of weak or at most moderate intensity, which can persist for several hours or even for whole days.

Figures 8 and 9 compare the CDFs of each LWS with the CDFs obtained considering the CCLM 8 and CCLM 2.2 climate simulations over JJA at daily and sub-daily resolution, respectively, whereas Table 7 reports the DAV score assessed for both the seasons and both temporal resolutions.

Fig. 8

Empirical cumulative distribution function against daily precipitation for each local station over 1995–2010 JJA considering LWS observations and CCLM 8 and CCLM 2.2 climate experiments

Fig. 9

Empirical cumulative distribution function against 6-h precipitation for each local station over 1995–2010 JJA considering LWS observations and CCLM 8 and CCLM 2.2 climate experiments

Table 7 DAV score for the local weather stations analyzed at daily and 6-h resolution (data consider the enhancement in resolution from CCLM 8 to CCLM 2.2 over DJF (December, January, February) and JJA (June, July, August))

Figures 8 and 9 show that the LWS observations return lower occurrence probability compared to the model data for each rain gauge; this means that the climate experiments overestimate overall precipitation at both the daily and the sub-daily scale. On the other hand, CCLM 8 and CCLM 2.2 highlight a similar behavior, with gains generally increasing with altitude, as also shown in Table 7. This feature is confirmed by plotting the DAV against the LWS altitude (Fig. 10) for the 6-h samples: in summer (JJA), CCLM 2.2 performs better than CCLM 8 at the higher altitudes and worse at the lower ones; in winter (DJF), instead, the gain in using the highest resolution does not depend on the altitude and is substantially null.

Fig. 10

LWS elevation against DAV score for sub-daily precipitation over DJF and JJA

It is worth noting the added value at the 6-h resolution compared to the daily resolution (Table 7). Such an added value is even more evident during JJA. This result is in line with the physical processes regulating precipitation: convection-permitting models (e.g., CCLM 2.2) provide improved sub-daily rainfall characteristics when precipitation is of convective nature (i.e., during summer), while the gain is negligible for precipitation events in which the advection process is predominant (i.e., during winter).

4.3 Extreme values analysis

The yearly maximum precipitation is analyzed in terms of growth factors and mean value as described in Sect. 2.5. As concerns growth factors, for each dataset (LWS, CCLM 8 and CCLM 2.2) the maximum rainfall samples for durations of 6 h, 12 h and 24 h were extracted and normalized by their mean value. In this way, data at different time resolutions can be merged, creating three pooled samples corresponding to observations, CCLM 8 and CCLM 2.2 for each rain-gauge location. These samples are then interpreted by using the GEV function (Eq. 3), with probability-weighted moments as fitting method, to determine the probability distribution of the growth factors. Table 8 lists the GEV parameters (shape, scale and location) obtained by fitting the different pooled samples.
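The pooling and fitting steps can be sketched as follows. The estimator uses the probability-weighted-moment formulas and the shape approximation published by Hosking et al. (1985); whether the study adopted exactly this variant is an assumption, and the function names are hypothetical.

```python
import math
import numpy as np


def pooled_growth_factors(maxima_by_duration):
    """Normalise each duration's annual maxima by its mean, then pool them."""
    return np.concatenate(
        [np.asarray(m, dtype=float) / np.mean(m) for m in maxima_by_duration]
    )


def fit_gev_pwm(sample):
    """Estimate GEV (shape, scale, loc) by probability-weighted moments."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    j = np.arange(1, n + 1)
    # Unbiased sample probability-weighted moments b0, b1, b2.
    b0 = x.mean()
    b1 = np.sum((j - 1) / (n - 1) * x) / n
    b2 = np.sum((j - 1) * (j - 2) / ((n - 1) * (n - 2)) * x) / n
    # Approximate solution for the shape parameter.
    c = (2 * b1 - b0) / (3 * b2 - b0) - math.log(2) / math.log(3)
    shape = 7.8590 * c + 2.9554 * c ** 2
    gamma = math.gamma(1 + shape)
    scale = (2 * b1 - b0) * shape / (gamma * (1 - 2 ** (-shape)))
    loc = b0 + scale * (gamma - 1) / shape
    return shape, scale, loc
```

The formulas are undefined for a shape estimate of exactly zero (Gumbel case) and require at least three values; with samples as short as 16 years per duration, the estimates should be read with the corresponding sampling uncertainty in mind.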

Table 8 GEV parameters for LWS, CCLM 8 and CCLM 2.2 over JJA (June, July, August)

Figure 11 compares the CDFs of LWS observations with those obtained from the CCLM 8 and CCLM 2.2 climate simulations over JJA, while Table 9 reports the DAV score assessed over the same season.

Fig. 11

Cumulative distribution function against growth factor for each local station over 1995–2010 in the JJA season, considering maximum yearly values of LWS observations and of the CCLM 8 and CCLM 2.2 climate experiments

Table 9 DAV score for the local weather stations, computed on the normalized maximum precipitation used to obtain the growth factor probability distribution (data consider the enhancement in resolution from CCLM 8 to CCLM 2.2 over JJA (June, July, August))

The comparison points out that CCLM 8 and CCLM 2.2 behave in line with the previous results (Fig. 8): plotting the DAV against the LWS altitude (Fig. 12) generally reveals a gain increasing with altitude, even if in this case the increase is much more scattered.

Fig. 12

LWS elevation against DAV score for normalized maximum precipitation

As concerns mean values, the aim is to investigate the dependence of mean yearly maximum precipitation on elevation and duration. To do this, for each dataset (LWS, CCLM 8 and CCLM 2.2), the mean values of the maximum rainfall samples are calculated at different temporal aggregations (6 h, 12 h and 24 h) and plotted against elevation (Fig. 13) and duration (Fig. 14).
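The computation of the mean yearly maxima at the three aggregations can be sketched as below. The example assumes that each year is available as a sequence of 6-h accumulations and that the 12-h and 24-h maxima are taken over sliding windows of 2 and 4 consecutive steps; whether sliding or fixed windows were used is not stated here, so this choice, like the function name, is only illustrative.

```python
import numpy as np

def mean_annual_maximum(six_hourly_by_year, steps):
    """Mean over years of the largest sum of `steps` consecutive 6-h totals."""
    window = np.ones(steps)
    maxima = [
        # Sliding-window sums over the year, then the yearly maximum
        np.convolve(np.asarray(year, dtype=float), window, mode="valid").max()
        for year in six_hourly_by_year
    ]
    return float(np.mean(maxima))
```

Calling the function with `steps` equal to 1, 2 and 4 gives the 6-h, 12-h and 24-h values for one station, which can then be plotted against station elevation or against duration.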

Fig. 13

Average of maximum precipitation at 6-h (a), 12-h (b) and 24-h (c) aggregation for each local station considering LWS observations and CCLM 8 and CCLM 2.2 climate experiments

Fig. 14

Mean of maximum precipitation against duration for each local station over 1995–2010 in the JJA season, considering LWS observations and the CCLM 8 and CCLM 2.2 climate experiments

Figures 13 and 14 show that, for the different durations, CCLM 8 reproduces the precipitation maxima more satisfactorily than CCLM 2.2 at lower altitudes, whereas the opposite behavior is found over areas with an altitude higher than 1200 m. However, in some cases, the differences can be considered negligible. On the other hand, the climate experiments fail to capture the maximum precipitation for the ID833 local station at each temporal aggregation; this error will be subject to further investigation.

5 Conclusion

In recent years, the need to reduce the model errors associated with parameterized convection and to obtain a more detailed representation of present and future regional climate has strongly motivated the increase in climate modeling activities at convection-permitting scales (grid spacing below 4 km). The statistical evaluation of the added value due to the horizontal and temporal resolution refinement in the representation of climate has not been fully explored yet (Fumière et al. 2019). Despite being focused on a small area, the results presented make this work an important contribution in this framework, especially for climate impact research purposes.

The main results of the analysis are:

  • The analysis of daily precipitation data returns a general gain in moving from the lowest to the highest resolution (12–8–2.2 km), as shown by the DAV score analysis for the PRCPTOT, R20mm, CDD and CWD indicators, even if the DAV has the disadvantage of being unable to locate such a gain precisely;

  • The effect of local orography is investigated both by clustering spatial data and with point-scale analysis; in both cases, the simulation characterized by the highest resolution better captures local precipitation dynamics at higher altitudes. This is particularly evident from the analysis of sub-daily precipitation distribution and extreme events during summer, when precipitation is mainly driven by convective processes capable of producing short-duration localized events with variable intensity.

This work reinforces and partially confirms the results obtained in similar studies over other European regions (Fosser et al. 2015; Kendon et al. 2017; Berthou et al. 2018; Knist et al. 2018; Fumière et al. 2019; Piazza et al. 2019). In general, there is an added value in the representation of the precipitation dynamics at both the daily and sub-daily scales and in the representation of extreme events when moving from the lower to the higher resolution (12–8–2.2 km), in particular at higher altitudes (over 1200 m/1500 m). However, some results also confirm the idea that the gains or losses in the precipitation representation are not linked only to the high-resolution simulation, but depend on a combination of different factors, such as the increase in resolution, physical parameterizations, meteorological conditions and how the model handles the explicit representation of deep convection (Ducrocq et al. 2002; Vié et al. 2011; Coppola et al. 2018).