1 Introduction

Regional Climate Models (RCMs) allow the spatial and temporal evolution of the climate to be investigated on a regional to continental scale. Current RCMs not only make it possible to simulate atmospheric processes at higher temporal and spatial resolution, but should also add significant skill to simulating the climate on a scale of 5 to 50 km. Information on how the climate might change in the future on the regional scale is of great importance for the planning of adequate adaptation measures.

In recent years considerable efforts have been made to understand both the regional climate predictability and capability of RCMs over Europe in simulating the regional climate within the framework of ENSEMBLES (Hewitt 2005) and PRUDENCE (van der Linden and Mitchell 2009; Christensen et al. 2002, 2007). In a study based on PRUDENCE, models were validated in terms of reproducing long term climate means and inter-annual variability of temperature and precipitation in comparison with the CRU observational data set over Europe (Jacob et al. 2007). Daily precipitation statistics from model simulations of both PRUDENCE and ENSEMBLES over Europe were evaluated by Boberg et al. (2009, 2010) and Kjellström et al. (2010). Nikulin et al. (2011) investigated temperature, precipitation and wind extremes over Europe, as simulated by the Swedish Climate Model RCA3 (Rossby Center Regional Climate Model).

An evaluation of the ability to simulate precipitation in seven different regions over the globe has shown that CCLM cannot be transferred directly to other climate zones, but requires specific adjustments in each case (Rockel and Geyer 2008). The RCM CCLM—which has also been used in this study—has furthermore been under investigation in respect of its model skills for the region of Europe (Bachner et al. 2008; Dobler and Ahrens 2008; Feldmann et al. 2008; Hollweg et al. 2008; Jaeger et al. 2008; Smiatek et al. 2009; Suklitsch et al. 2010; Davin et al. 2011).

Whereas regional climate simulations have been under thorough investigation for larger parts of Europe, only a few studies exist that evaluate RCMs specifically over the European Alpine Region. This region covers nearly 200,000 km² and is located at the centre of the European continent, marking the transition area between the Atlantic, Continental, and Mediterranean climate zones. Large mountain chains, reaching up to 4,000 m above sea level, alternate with deep valleys on a relatively small spatial scale. This complexity induces considerable differences in regional climate, which has led to the definition of distinct climatologic sub-regions (HISTALP regions) even in this rather small-scale area (Auer et al. 2007). Modifications of local climate due to the orography include phenomena like temperature inversions in valleys (e.g. Whiteman 1990), precipitation enhancement or shadowing effect by mountain chains (e.g. Roe 2005), as well as föhn events (e.g. Brinkmann 1971; Seibert 1990).

For the Alpine Region, different RCMs have been evaluated and compared in terms of precipitation on a daily basis for current climate conditions (Schmidli et al. 2007). The authors found substantial variations in model performance between different sub-regions in the Alps as well as higher skills for precipitation occurrence than for intensity. An extensive analysis of high resolution RCM simulations in the Alpine Region revealed considerable uncertainties and weaknesses in the simulation of daily precipitation, depending on the model and/or the specific setup chosen for the simulation (Suklitsch et al. 2008). However, the skill for simulated temperatures was generally higher. The bias was found to differ considerably among the sub-regions under investigation and no systematic dependency could be detected. Cloud resolving climate models still show large biases in complex terrain, but perform better than conventional RCMs in smaller investigation areas (Smiatek et al. 2009). Daily precipitation statistics from different RCM simulations were investigated by Frei et al. (2003) for the European Alps. They found that all models were able to simulate the spatial distribution of seasonal precipitation well, but revealed considerable shortcomings in wet-day frequency and precipitation intensity on the daily resolution at the same time.

All in all, RCMs face challenging demands when meteorological processes, and hence specific characteristics of the regional climate, have to be simulated in complex terrain. Consequently, there is a strong need for further investigation of the skill of RCMs in special regions such as the European Alps, and for deeper insight into their capabilities and limits with a view to further improving the model physics.

The purpose of this paper is to evaluate the ability of the RCM CCLM 4.8 (COSMO model in Climate Mode, version 4.8) to simulate the present climate in the Greater Alpine Region (GAR) at a spatial resolution of 10 km. Hindcast simulations driven by ERA-40 reanalysis data (Uppala et al. 2005) are compared to several observational datasets for temperature and precipitation. Besides the conventional evaluation of climatologic variables for different sub-regions, the authors focused on the model's capabilities with respect to the underlying meteorological processes.

The paper is structured as follows: In the data and methods section, an introduction to the CCLM-setup and the observational datasets is given, along with an assessment of the uncertainties in the observational datasets and a description of the evaluation methods used. In the results section, the CCLM skill concerning temperature and precipitation as well as temperature trends and altitude dependencies of model bias and correlation is shown. In the discussion section, the main findings are discussed and possible reasons for the bias in the model are addressed. The conclusions summarize the most important findings of this study.

2 Data and methods

2.1 RCM setup and data

The CCLM is the climate version of the weather forecast model of the German Weather Service (DWD) (Böhm et al. 2006; Doms et al. 2002). It is a non-hydrostatic RCM, using a regular latitude/longitude grid with a rotated pole and a terrain following height coordinate with 32 vertical layers. The CCLM includes the multilayer soil model TERRA (Schrodin and Heise 2001) with 10 soil layers defined down to a depth of 15 m. The evapotranspiration of plants is parameterized based on the Biosphere-Atmosphere Transfer Scheme (BATS) (Dickinson et al. 1986). For the moist convection the Kain-Fritsch parameterization scheme has been chosen (Kain and Fritsch 1993). The numerical integration follows the Runge-Kutta approach. The model has been used previously for the simulation of present and/or future climate conditions in the EU funded projects PRUDENCE (http://prudence.dmi.dk/) and ENSEMBLES (http://ensembles-eu.org/), and is also involved in the CMIP5 simulations for the 5th Assessment Report of the IPCC. In this study the model version 4.8 has been used, simulating the Alpine climate in a two-step nesting approach. The first nesting step encompasses the European domain at 50 km spatial resolution (104 by 104 grid points, 360 s time step) and the second step covers the Greater Alpine Region (GAR) at 10 km resolution (125 by 110 grid points, 80 s time step) (see Fig. 1).

Fig. 1

Simulation domains and model orography of CCLM in the first nesting step in 50 km resolution (left) and the second nesting step in 10 km resolution (right). The simulation domain is determined by the whole map domain. The analysis is carried out only for the coloured areas skipping the sponge-zone of 10 grid points (white margin) and mostly for climatologic sub-regions, shown on the right hand side as thick black lines (NW Northwest, NE Northeast, SW Southwest, SE Southeast, HI High Altitude)

2.2 Observational data

2.2.1 Gridded observation data from CRU for air temperature and precipitation

For the validation of surface temperature and precipitation, the Climatic Research Unit (CRU) monthly mean global gridded dataset version TS 2.1 has been used (Mitchell and Jones 2005). It has a spatial resolution of 0.5° × 0.5° covering the time period 1901–2002 and is based on in-situ measurements from a large number of stations from different sources. The dataset contains variables such as daily mean, minimum and maximum temperatures, diurnal temperature range, precipitation amount, wet day frequency, frost day frequency, vapour pressure and cloud cover. The interpolation method takes correlations between the stations into account and identifies inhomogeneities.

2.2.2 Gridded observation data from E-OBS for air temperature and precipitation

This data set consists of European land-only daily high-resolution gridded data for precipitation, and minimum, maximum and mean surface temperatures covering the period 1950–2006. These data are provided in four different versions, each with a different spatial resolution to allow the validation of RCM results. Observations from about 250 stations having data over 50 years collected via the European Climate Assessment and Data set (ECA&D), from other research projects like STARDEX (Haylock et al. 2006) and additional time series provided by European National Weather Services have been interpolated. Based on the spatial correlation of the data (altogether 2,316 stations, number varies over time) an estimation of uncertainty is offered to the users (Haylock et al. 2008).

2.2.3 Gridded observation data from GPCC4 for precipitation

The full reanalysis product from the German Weather Service GPCC4 is a global gridded dataset (Schneider et al. 2008) with a horizontal resolution of 0.5° and a monthly temporal resolution covering the time period from 1901 to 2007. The reanalysis is based on quality controlled observation data from up to 67,200 stations (Rudolf et al. 2011).

2.2.4 Gridded observation data from HISTALP for precipitation

Within the framework of the HISTALP activities at the Austrian Weather Service (ZAMG), a high resolution precipitation dataset was created by linear interpolation of the gridded precipitation data set from Efthymiadis et al. (2006). It spans the time period from 1800 to 2003, providing gridded data on a monthly basis. The spatial resolution is 0.08°, covering the Greater Alpine Region from 4°–19° eastern longitude and 43°–49° northern latitude (Chimani et al. 2011).

The observational datasets by CRU, E-OBS and GPCC were interpolated to the rotated CCLM grid using a bilinear approach considering land-/sea-fraction and a constant height gradient for temperature of 0.0065 K/m, whereas the HISTALP dataset was interpolated using a nearest neighbour technique due to the similar spatial resolution.
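The height correction with a constant gradient can be illustrated as follows. This is a minimal sketch and not the authors' regridding code: it assumes that the observed field has already been interpolated horizontally to the CCLM grid and that temperature decreases linearly with height at 0.0065 K/m. The function name `adjust_temperature_to_model_height` and all sample values are hypothetical.

```python
import numpy as np

LAPSE_RATE_K_PER_M = 0.0065  # constant height gradient stated in the text


def adjust_temperature_to_model_height(t_obs, z_obs, z_model,
                                       lapse_rate=LAPSE_RATE_K_PER_M):
    """Shift observed temperatures from observation-grid to model-grid elevation.

    Assumption: temperature decreases linearly with height, so a model grid
    box lying higher than the observation grid box gets a colder value.
    """
    t_obs = np.asarray(t_obs, dtype=float)
    dz = np.asarray(z_model, dtype=float) - np.asarray(z_obs, dtype=float)
    return t_obs - lapse_rate * dz


# Hypothetical example: three grid boxes, elevations in m, temperatures in deg C
t_adjusted = adjust_temperature_to_model_height(
    t_obs=[5.0, 2.0, -1.0],
    z_obs=[500.0, 1200.0, 2000.0],
    z_model=[700.0, 1200.0, 1800.0],
)
```

A model grid box 200 m above the observation grid box is cooled by 1.3 K, one at the same elevation is left unchanged, and one 200 m below is warmed by 1.3 K.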

2.3 Uncertainties in observational datasets

RCM simulations rarely produce ideal results when hindcast runs driven by reanalysis data are compared to observations. However, gridded datasets of observed climate variables are afflicted with uncertainties as well. As Frei et al. (2003) stated, precipitation measurements at high alpine sites are generally afflicted with a considerable amount of uncertainty emerging from a systematic measurement bias. This is mainly caused by wind field deformation and deflection of hydrometeors over the gauge orifice, leading to significant undercatchment of precipitation. Estimates of this error for the Alpine region are largest in winter, when high wind speeds and a high fraction of snowfall generally lead to underestimations of 40 % at altitudes above 1,500 m, compared with 12 % in summer (Frei et al. 2003).

Other potential sources of uncertainty in observational, gridded datasets emerge from the different gridding techniques applied (Hofstra et al. 2008; Ensor and Robeson 2008) or more simply from data quality issues (e.g. Schmidli et al. 2001). Factors like a low station density, an uneven allocation of stations, as well as changes in station density over time potentially influence resultant grid-point average estimates, especially by changes in variance (Hofstra et al. 2009a; Perry and Hollis 2005). As a result, extremes are much more affected than the means, especially as regards variables with large spatio-temporal variability like precipitation in mountainous regions.

All these aspects are of relevance also for temperature, although in general the effects on temperature are less than on precipitation (Hofstra et al. 2009b). In mountainous terrain, however, temperature can show a very complex vertical structure even in narrow space and especially in the winter half-year, leading to an additional uncertainty in the gridded data, dependent on the complexity of the underlying terrain itself (Stahl et al. 2006; Daly 2006).

Various gridded datasets have been used in this study to get an insight into the range of uncertainty amongst these datasets, especially in the critical areas with highly complex orography.

Figure 2 shows the range of seasonal precipitation totals between four different observational datasets used in this study (CRU, E-OBS, HISTALP and GPCC). The ranges in the upper two panels are defined as the difference between the dataset with the maximum value and the dataset with the minimum value of temperature and mean precipitation sum, respectively, at each grid point. The absolute precipitation range shows a similar pattern in each season, with the largest uncertainties over the central Alpine region, corresponding to the areas of highest elevation. The picture is somewhat different for uncertainty in terms of relative values. These have been calculated by dividing the absolute range by the respective seasonal mean of each dataset. Values of more than 100 % indicate that the range is even bigger than the mean seasonal precipitation amounts. The relative ranges as displayed in Fig. 2 (lower panel) show a stronger signal in winter due to the lower absolute precipitation totals in the cold season. Nevertheless, uncertainty seems to be largest over mountainous terrain again, indicating a much higher relative range compared to the lowlands. Especially at some points in the central and western parts of the Alpine Ridge, there is little correspondence between the given datasets, indicating large uncertainties in the observation.
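The range calculation described above can be sketched as follows. The sketch is illustrative only: it assumes that all datasets are already on a common grid and that the relative range is normalised by the mean over all datasets at each grid point, which is one possible reading of the text. The function name `dataset_range` and the sample values are hypothetical.

```python
import numpy as np


def dataset_range(seasonal_means):
    """Absolute and relative range across observational datasets.

    seasonal_means: array of shape (n_datasets, ny, nx) holding the long-term
    seasonal mean of each dataset on a common grid.
    Assumption: the relative range is normalised by the mean over all datasets.
    """
    fields = np.asarray(seasonal_means, dtype=float)
    absolute = fields.max(axis=0) - fields.min(axis=0)
    relative = 100.0 * absolute / fields.mean(axis=0)
    return absolute, relative


# Hypothetical seasonal precipitation totals (mm) of four datasets on a 1 x 2 grid
fields = np.array([
    [[200.0, 100.0]],
    [[250.0, 300.0]],
    [[300.0, 150.0]],
    [[250.0, 250.0]],
])
abs_range, rel_range = dataset_range(fields)
```

In this example the second grid point has an absolute range of 200 mm against a mean of 200 mm, i.e. a relative range of 100 %, the threshold at which the spread between the datasets equals the mean seasonal total.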

Fig. 2

Seasonal range between gridded observational datasets for temperature (CRU, E-OBS, upper panel) and precipitation (CRU, E-OBS, HISTALP and GPCC) as the absolute range of seasonal precipitation totals (middle panel) and relative range of seasonal precipitation totals with respect to the mean (lower panel)

2.4 Evaluation methods

The evaluation in this study has been carried out based on seasons, and separately for the different sub-regions. The seasons are defined as follows: winter—December, January, February (DJF), spring—March, April, May (MAM), summer—June, July, August (JJA) and autumn—September, October, November (SON). Mean values always refer to the time period 1961–2000.

The sub-regions were defined by Auer et al. (2007) by applying a Principal Component based analysis of HISTALP station data to study climate variability in the Alpine region. These sub-regions are: Northwest (NW), Northeast (NE), Southwest (SW) and Southeast (SE), each comprising specific characteristics of temperature, precipitation, pressure, sunshine duration and cloudiness. To investigate the skill of CCLM within the mountainous terrain of the study region, a fifth sub-region has been introduced, the High Altitude (HI) region, comprising those CCLM grid boxes above 1,000 m above sea level in the Alpine region. The domains of the sub-regions are shown in Fig. 1.

To assess the model skill, statistics of mean bias, correlation and variability, as well as trends were calculated. In terms of correlation, the Spearman rank correlation was applied as a robust alternative to the Pearson product-moment correlation (Wilks 2006). To determine linear trends, simple linear regressions on the given time series were used. The magnitude of trends always refers to the whole length of each time period under consideration unless specified otherwise.
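The skill statistics named above can be sketched as follows. This is a simplified illustration, not the evaluation code used in the study: the rank correlation assumes no tied values, and the trend magnitude is taken as the regression slope multiplied by the span between the first and the last year, which is an assumption about how "whole length" is defined. All names and sample series are hypothetical.

```python
import numpy as np


def spearman_rank_correlation(x, y):
    """Spearman correlation as the Pearson correlation of the ranks.

    Assumption: no tied values (reasonable for continuous seasonal means);
    ties would require average ranks instead.
    """
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]


def trend_over_period(years, values):
    """Least-squares slope scaled to the whole period (last year minus first year)."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    slope = np.polyfit(years - years.mean(), values, 1)[0]
    return slope * (years[-1] - years[0])


# Hypothetical series for 1961-2000
years = np.arange(1961, 2001)
obs = 0.05 * (years - 1961)   # synthetic "observed" seasonal means
model = obs ** 2 - 0.5        # monotone transform of obs, so the ranks agree
mean_bias = float(np.mean(model - obs))
rho = spearman_rank_correlation(model, obs)
total_trend = trend_over_period(years, obs)
```

Because the rank correlation only depends on the ordering of the values, the nonlinear but monotone relation between the two synthetic series still yields a correlation of one, which illustrates why it is more robust than the Pearson coefficient.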

Inter-annual variability is expressed in terms of standard deviation, derived from annual totals or means of a certain variable under consideration.

3 Results

3.1 Temperature

Compared to the CRU and E-OBS data, CCLM shows a mean annual bias of −0.6 and −0.7 °C, respectively, averaged over the whole domain. This cold bias is largest in the Alpine areas, as opposed to some regions south of the Alps with a slight warm bias at different locations dependent on the observational dataset (see Fig. 3). The temperature bias in winter (Fig. 4) is between −2 and −3.5 °C with the largest deviation in the HI region, followed by the SW region, but with considerable variability in space, particularly against CRU, where it ranges from 1 to nearly −7 °C in the HI region. The other sub-regions show a range of cold bias spanning from roughly −1.5 to −2.0 °C, whereas the temperature bias in summer is close to zero averaged over all sub-regions. While nearly no bias is apparent in the northern sub-regions, the southern sub-regions show a warm bias with a maximum of more than 1 °C compared to CRU and less than 1 °C to E-OBS in the SW region. The transitional seasons show a similar pattern with almost no bias in the lowland sub-regions. The HI region clearly stands out against the others considering the cold bias, but also in terms of spatial variability against the CRU dataset. This underestimation of temperatures is somewhat stronger in autumn than in spring, especially at higher altitudes, reaching values of more than −2 °C (against E-OBS) in the HI region. In both transitional seasons the bias in the other sub-regions is roughly the same, spanning from 0 to 1 °C.

Fig. 3

Spatial distribution of the mean annual temperature bias, CCLM minus CRU and E-OBS

Fig. 4

Seasonal mean temperature bias in sub-regions, CCLM minus CRU (green squares) and E-OBS (purple circles), the upper and lower bars indicate the spatial variability within a sub-region (5th and 95th percentile of the distribution of all grid points within a sub-region)
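The sub-region statistics shown in this type of figure can be sketched as follows. This is a minimal illustration under stated assumptions: the model and observation fields are long-term seasonal means on the same grid, and the sub-region is given as a boolean mask. The function name `subregion_bias_summary` and the sample fields are hypothetical.

```python
import numpy as np


def subregion_bias_summary(model_mean, obs_mean, region_mask):
    """Mean bias and 5th/95th percentile of grid-point biases in one sub-region."""
    bias = np.asarray(model_mean, dtype=float) - np.asarray(obs_mean, dtype=float)
    values = bias[np.asarray(region_mask, dtype=bool)]
    p05, p95 = np.percentile(values, [5, 95])
    return float(values.mean()), float(p05), float(p95)


# Hypothetical 2 x 3 grid; the sub-region covers the first two columns
obs = np.array([[10.0, 8.0, 6.0], [4.0, 2.0, 0.0]])
bias_field = np.array([[-1.0, -2.0, 0.5], [-3.0, -4.0, 0.5]])
model = obs + bias_field
mask = np.array([[True, True, False], [True, True, False]])

mean_bias, p05, p95 = subregion_bias_summary(model, obs, mask)
```

The mean gives the marker position, while the two percentiles give the lower and upper ends of the bar that indicates spatial variability within the sub-region.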

Similarly to the mean bias, the correlation between modelled and observed seasonal mean temperatures (Fig. 5) shows considerable variations between seasons and regions. Figure 5 shows that correlation is very high in winter in the northern sub-regions, whereas the HI region shows lower correlations of around 0.7 with large spatial variability, which is particularly high against E-OBS. The opposite pattern occurs in summer with higher correlation towards high altitude regions. Correlation in spring is more or less homogeneous over the whole domain, reaching values of 0.8 to 0.9. In autumn correlation reaches 0.8 in the eastern sub-regions, whereas the western sub-regions and the HI sub-region show lower correlation of 0.7, but with higher spatial variability compared to the eastern sub-regions.

Fig. 5

Correlation of seasonal mean temperature in sub-regions between CCLM and CRU (green squares) and CCLM and E-OBS (purple circles), respectively; the upper and lower bars indicate the spatial variability within a sub-region (5th and 95th percentile of the distribution of all grid points within a sub-region)

The main characteristics in the representation of the mean annual cycle of temperature in the CCLM simulations are an underestimation in winter, followed by a decreasing bias towards spring and evolving into an overestimation in late summer and early autumn (cf. Fig. 6). The weakest bias is evident in the northern sub-regions with modest overestimation in the warm season. The cold bias in winter is in a range of −1 to −2 °C, with largest values mostly in December. The temperature bias shows a very pronounced dependency on the seasons in the southern and the HI sub-regions. Substantial differences in the temperature bias occur in the SW sub-region with a bias range of 5 °C spanning from −3 °C in winter to +2 °C in summer. In the HI sub-region a somewhat smaller bias range is evident, but with a shift towards intensified cold bias, particularly compared to E-OBS where no warm bias occurs in summer.

Fig. 6

Mean annual cycle of temperature of CCLM (black circles), CRU (black hollow squares) and E-OBS (black hollow circles) and temperature bias of CCLM against CRU (green squares) and E-OBS (purple circles), averaged over the sub-regions

To evaluate the performance of CCLM in reproducing the year to year variability of 2 m temperature, the standard deviation of detrended time series of seasonal temperatures averaged over the sub-regions is calculated. Figure 7 summarizes these results, showing the highest values of standard deviation in winter, especially in the northern sub-regions, and the lowest in summer. CCLM is not fully able to reproduce this characteristic, which is mainly due to the overestimation of variability in summer and autumn, apparent in every sub-region, with the smallest differences in the HI region. A possible reason for this behaviour can be identified when looking at the time series of summer temperatures (see Fig. 15), where CCLM shows a strong warm bias in the first 10 years of the simulation, resulting in higher variability than observed. The variability in CCLM in winter is somewhat overestimated, particularly in the western sub-regions, whereas deviations are small in the eastern sub-regions and the HI region. The year to year variability is best simulated in spring, when only minor differences between CCLM and observations arise. Overall, the HI region is the best simulated one in terms of year to year variability of temperature.
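The variability measure used here, the standard deviation of a detrended series, can be sketched as follows. The sketch assumes that detrending means removing a least-squares linear fit and that the sample standard deviation (`ddof=1`) is used; neither detail is stated in the text. The function name `detrended_std` and the synthetic series are hypothetical.

```python
import numpy as np


def detrended_std(years, values):
    """Standard deviation of a series after removing its linear trend.

    Assumptions: linear least-squares detrending, sample standard deviation.
    """
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    centred = years - years.mean()  # centring keeps the fit well conditioned
    slope, intercept = np.polyfit(centred, values, 1)
    residuals = values - (slope * centred + intercept)
    return float(residuals.std(ddof=1))


# Hypothetical series: a linear warming plus an alternating year-to-year signal
years = np.arange(1961, 2001)
trend_only = 0.03 * (years - 1961)
noisy = trend_only + np.where(years % 2 == 0, 0.5, -0.5)

std_trend_only = detrended_std(years, trend_only)
std_noisy = detrended_std(years, noisy)
```

Removing the trend first prevents a long-term warming or cooling from being counted as year to year variability, which matters when the model and the observations differ in their trends.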

Fig. 7

Standard deviation of seasonal mean temperatures averaged over sub-regions of CCLM (black circles), CRU (green squares) and E-OBS (purple circles), the upper and lower bars indicate the spatial variability within a sub-region (5th and 95th percentile of the distribution of all grid points within a sub-region)

As stated in Sect. 2.3, the uncertainties in the observational datasets of temperature are largest in the high alpine regions, mainly due to low station density. As a consequence, the results of the bias, correlation and variability analysis should be interpreted with caution, particularly in the HI region because the deviations might be a combination of both model bias and lack of gridded observations in these areas.

3.2 Precipitation

The simulation of annual mean precipitation totals in CCLM is characterized by a mean bias of between +8 and +23 % when averaged over the whole domain, depending on the observational dataset. The spatial patterns thereof are displayed in Fig. 8, showing a pronounced wet bias in the Alps and the northern surroundings, as well as a dry bias mainly occurring south of the Alps. The mean seasonal bias averaged over each sub-region is shown in Fig. 9. The overall bias in winter is mainly generated in the northern sub-regions and over the high elevated parts of the domain. South of the Alps precipitation is simulated quite well with a mean bias between 0 and +15 %. Spring shows a similar pattern but with an increase in wet bias particularly in the northern sub-regions and the HI region. In summer a relatively inhomogeneous pattern with biases from −20 to +20 % in the lowlands is apparent, with wet bias mainly in the HI region. Autumn is the season with the largest underestimation of precipitation, which occurs in the southern sub-regions ranging between −40 and −20 % according to the different observations. Only the NW region shows a positive bias against all observations.

Fig. 8

Mean annual precipitation bias, CCLM minus CRU, E-OBS, HISTALP and GPCC

Fig. 9

Seasonal precipitation bias in sub-regions, CCLM minus CRU (green squares), E-OBS (purple circles), HISTALP (red triangles) and GPCC (yellow diamonds) respectively; the upper and lower bars indicate the spatial variability within a sub-region (5th and 95th percentile of the distribution of all grid points within a sub-region)

The temporal correlations of seasonal precipitation amounts between CCLM and observations show large variations depending on the respective season (Fig. 10). In winter, values are highest reaching 0.8. A clear decrease of the correlation towards the warm season is visible, with values around 0.4 in summer. In autumn the correlation is substantially higher, but with a gradient from high correlation in the West of around 0.7 to lower correlation in the eastern parts of the domain of about 0.5.

Fig. 10

Correlation of seasonal precipitation sums in sub-regions between CCLM and CRU (green squares), E-OBS (purple circles), HISTALP (red triangles) and GPCC (yellow diamonds) respectively; the upper and lower bars indicate the spatial variability within a sub-region (5th and 95th percentile of the distribution of all grid points within a sub-region)

The annual cycle of CCLM precipitation is characterized by an overestimation in spring and an underestimation in autumn, with the magnitude of deviation dependent on the sub-region (cf. Fig. 11). The results for the NW region show a general overestimation over the year with least deviation in September, compared to the other lowland sub-regions. These regions show a positive bias mainly in winter and spring with values up to +40 mm and a negative bias in late summer and autumn. In the HI region a similar pattern is apparent, but with an intensified wet bias in winter and spring and large ranges amongst the datasets of 20 to 40 mm.

Fig. 11

Mean annual cycle of precipitation of CCLM (black circles), CRU (black hollow squares), E-OBS (black hollow circles), HISTALP (black hollow triangles) and GPCC (black hollow diamonds) and precipitation bias of CCLM against CRU (green squares), E-OBS (purple circles), HISTALP (red triangles) and GPCC (yellow diamonds) respectively, averaged over sub-regions

The year to year variability of seasonal precipitation amounts in CCLM and observations is displayed in Fig. 12 in the same way as for temperature (cf. Fig. 7). In winter only small deviations from the observations occur, except for the HI region where CCLM overestimates variability. Spring is characterized by a minor overestimation in each region. In summer these positive deviations are even larger and the spatial variability in CCLM is also considerably higher, particularly in the southern sub-regions. In autumn only small differences between CCLM and observations are apparent; some underestimations occur in the southern sub-regions and minor overestimation is evident in the northern sub-regions and the HI region.

Fig. 12

Standard deviation of seasonal precipitation totals averaged over sub-regions of CCLM (black circles), CRU (green squares), E-OBS (purple circles), HISTALP (red triangles) and GPCC (yellow diamonds); the upper and lower bars indicate the spatial variability within a sub-region (5th and 95th percentile of the distribution of all grid points within a sub-region)

The seasonal biases of CCLM against E-OBS in the number of wet days (number of days with precipitation >1 mm in a given time period, RR1) and in the simple daily intensity index (precipitation sum divided by the number of wet days in a given time period, SDII) were calculated for each sub-region (van Engelen et al. 2008) and are displayed in Fig. 13. In winter an overestimation of wet days can be seen in each sub-region, whereas the SDII shows only a weak bias. This means that the overall bias in winter is mainly caused by an overestimation of the number of rain events, whereas intensity is simulated well. In spring a striking positive bias of RR1 in the HI region contrasts with the rather low positive bias in the SDII, making spring the season with the largest overall wet bias. The summer season is characterized by an underestimation in RR1, except for the HI region. In contrast to the RR1 bias, the SDII bias shows large positive deviations over the entire domain. The model thus simulates fewer rain events with higher intensities, which in turn results in a reasonable simulation of mean summer precipitation amounts (cf. Fig. 9). The overall negative bias in autumn is to a large part caused by an underestimation of wet days, except for the HI region, which shows a positive bias. The SDII shows only a minor bias in autumn, so that the underestimation of wet days results in a dry bias in autumn in the southern sub-regions.
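The two daily precipitation indices can be sketched as follows. This is an illustration only: RR1 uses the threshold given in the text, whereas SDII is computed here as the precipitation sum on wet days divided by the number of wet days, which is the common definition of this index and is treated as an assumption. The function names and the daily series are hypothetical.

```python
import numpy as np

WET_DAY_THRESHOLD_MM = 1.0  # threshold for a wet day as given in the text


def wet_day_indices(daily_precip, threshold=WET_DAY_THRESHOLD_MM):
    """Return (RR1, SDII) for one period of daily precipitation in mm.

    RR1: number of days with precipitation above the threshold.
    SDII: precipitation sum on wet days divided by the number of wet days
    (assumption: the usual wet-day based definition).
    """
    p = np.asarray(daily_precip, dtype=float)
    wet = p > threshold
    rr1 = int(wet.sum())
    sdii = float(p[wet].sum() / rr1) if rr1 > 0 else float("nan")
    return rr1, sdii


def relative_bias(model_value, obs_value):
    """Relative bias of the model against the observation in percent."""
    return 100.0 * (model_value - obs_value) / obs_value


# Hypothetical daily series (mm) with nearly equal totals but different structure
obs_days = [0.0, 2.0, 0.5, 4.0, 6.0, 0.0, 0.0, 12.0]
model_days = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 14.0]

rr1_obs, sdii_obs = wet_day_indices(obs_days)
rr1_mod, sdii_mod = wet_day_indices(model_days)
rr1_bias = relative_bias(rr1_mod, rr1_obs)
sdii_bias = relative_bias(sdii_mod, sdii_obs)
```

The example mirrors the summer behaviour described above: the synthetic model series has half as many wet days but twice the intensity, so the precipitation totals of the two series remain close to each other.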

Fig. 13

Seasonal bias in the number of wet days (RR1) on the left panel in absolute (top) and relative (bottom) values and seasonal bias of the simple daily intensity index (SDII) on the right panel in absolute (top) and relative (bottom) values between CCLM and E-OBS, averaged over sub-regions

Gridded datasets for precipitation are afflicted with a considerable amount of uncertainty, especially regarding the alpine areas and the winter season. As shown in Fig. 2, the range of precipitation between the observational datasets in the HI region in winter is above 60 % relative to the mean for most areas. This has a major impact on the bias of CCLM as shown in Fig. 9. The precipitation bias against CRU is roughly +10 % in winter in the HI region, spanning from −60 to +80 % within the sub-region. In contrast, CCLM vs. GPCC shows a mean bias of +70 %, ranging between 0 and almost +140 %. Considering temporal correlation in winter, the deviations among the observational datasets used for calculation are again substantial, ranging between 0.65 and 0.85.

3.3 Temperature trends

Figure 14 shows the linear trend of seasonal mean 2 m temperatures in every grid box. The upper panel displays the trends derived from CCLM, in the middle and the lower panel trends from the CRU and E-OBS datasets are plotted. The most striking feature in this figure is the difference in the trend signal between the model and the observations. The trends coincide only in winter, with positive trends of 1 to 2 °C in CCLM, CRU and E-OBS, except for some areas in CCLM that show a weaker trend, for example in the central and southern Alpine regions. In the other seasons, there appears to be no dependence of temperature trends on orographic features in the CRU and E-OBS datasets. CCLM, however, shows a shift from positive trends of 0.5 °C in the northwest of the domain to −1 °C in the south eastern part in spring. In summer, generally negative trends occur in CCLM, primarily in the northern sub-regions, but also in some areas of the southern sub-regions. In the Alpine Ridge, however, the trends of summer temperatures in CCLM are sharply standing out against the surrounding regions showing only weak positive trends. In autumn, the CRU and E-OBS data show a patchy spatial pattern of areas with positive and negative trends. This is in contrast to the CCLM simulation where an overall negative trend of −1 °C or even larger occurs.

Fig. 14

Linear trend of seasonal mean temperatures at every grid point simulated by CCLM (upper panel) and the gridded observations by CRU (middle panel) and E-OBS (lower panel)

It should be noted that the linear trend, when used to determine model skill, is highly sensitive to outliers at the beginning or the end of the time series. To get a more robust measure, floating trends are used, each calculated over a time period of 15 years in steps of 2 years from seasonal mean values of temperature over the whole GAR. The results are plotted in the upper panel of Fig. 15, which shows the seasonal 15-year floating trends as points at the location of the 8th year (middle of the period) for CCLM, CRU and E-OBS and also for the ERA40 data and the first nesting step of CCLM with a spatial resolution of 50 km. Overall, model and observations coincide in winter, when the steep, wavelike structure of rising and falling trends is mostly captured by CCLM. Larger differences appear in spring, when CCLM constantly shows stronger negative trends in the first half of the period and weaker positive trends in the second as compared to the observations. Over the entire period, a balanced trend in CCLM is observed. In summer, this behaviour of the model is more pronounced, with trends being strongly negative at the beginning, switching to a positive signal at the end of the 1970s and turning negative again at the end of the whole period. This is in contrast to the observations that show continuous positive trends after 1975. In autumn, the observations show a wavelike progression of the floating trends similar to the winter season, but with smaller amplitude. CCLM is able to simulate this wave feature quite well, but strongly overestimates the negative trends at the beginning and the end of the period, which results in an overall negative trend of a magnitude that cannot be detected in the CRU, E-OBS or ERA40 data.
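The floating trend calculation can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes ordinary least-squares fits in each 15-year window, a window shift of 2 years, and trends expressed per year (multiplying by the window length would give the change over the window). The function name `floating_trends` and the synthetic series are hypothetical.

```python
import numpy as np


def floating_trends(years, values, window=15, step=2):
    """Linear trends over sliding windows, assigned to the middle year of each window.

    Returns the centre years and the slopes in units per year.
    """
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    centres, slopes = [], []
    for start in range(0, len(years) - window + 1, step):
        y = years[start:start + window]
        v = values[start:start + window]
        slopes.append(np.polyfit(y - y.mean(), v, 1)[0])
        centres.append(int(y[window // 2]))  # 8th year of a 15-year window
    return np.array(centres), np.array(slopes)


# Hypothetical series for 1961-2000: cooling until 1980, warming afterwards
years = np.arange(1961, 2001)
values = 0.1 * np.abs(years - 1980)

centres, slopes = floating_trends(years, values)
```

For a 40-year series this yields 13 windows centred on 1968, 1970, and so on up to 1992. The sign change of the slopes around 1980 in the synthetic series shows how such a diagnostic exposes decadal trend reversals that a single linear trend over the full period would average out.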

Fig. 15

Upper panel: seasonal 2-year floating trends (15 year periods) averaged over the GAR from CCLM (black circles), CCLM_50 km (first nesting step, grey hollow circles), ERA40 (grey crosses), CRU (green squares) and E-OBS (purple circles); lower panel: time series averaged over the GAR of seasonal mean temperatures from CCLM (black), CCLM_50 km (grey dashed line), ERA40 (grey chain line), CRU (green) and E-OBS (purple) and the corresponding bias as well as the trend values of the bias

When analysing ERA-40, it becomes evident that the disparity of trends is not an artefact arising from the driving data, because ERA-40 trends match CRU and E-OBS quite well. The disparity may be caused by a shortcoming of the CCLM simulation at the first nesting step (CCLM_50 km), when these trends start to become apparent, with only minor differences in the summer season.

The lower panel of Fig. 15 shows the time series of seasonal mean temperature from which the floating trends were derived, revealing a potential cause of this mismatch in decadal trends. In all seasons except winter, the bias of seasonal temperature develops from positive or neutral values at the beginning to negative values at the end of the period. The trends of the bias in spring, summer and autumn are all significant at the 95 % level according to the non-parametric Mann-Kendall trend test and range between −1 and −1.7 °C/40 years. In winter, the bias also shows negative trends, but these are weaker and significant only against CRU. This suggests that there might be systematic temperature drifts in the simulations.
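For readers unfamiliar with the Mann-Kendall test, the sketch below shows a standard textbook form of it (S statistic, tie-corrected variance, normal approximation with continuity correction). It is an illustration only; the exact variant used in this study is not specified in the text, and the function name is hypothetical. The input would be the yearly bias series (model minus observation) of one season.

```python
import math
from collections import Counter


def mann_kendall(values):
    """Two-sided Mann-Kendall trend test; returns (S, Z, p_value)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    # S counts increasing minus decreasing pairs over all i < j
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis, corrected for tied values
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if var_s == 0:
        return s, 0.0, 1.0
    # Continuity-corrected standard normal statistic
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value
```

A trend would be called significant at the 95 % level when the returned p-value is below 0.05. The test uses only the signs of pairwise differences, which makes it less sensitive to individual outliers than a test on the least-squares slope.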

3.4 Altitude dependencies

Based on the investigation of seasonal mean bias and correlation in the different sub-regions, a strong altitude dependence of the results can be assumed (cf. Sects. 3.1 and 3.2). In this section, the relation between model skill and altitude is investigated in greater detail. In every season except summer, a negative bias that increases with altitude is obvious (Fig. 16). In summer, the bias is positive at altitudes below 500 m.s.l.; above this height, it is negative against E-OBS and positive against CRU.

Fig. 16

Bias (upper panel) and correlation (lower panel) of seasonal mean temperature (CCLM vs. CRU in green and E-OBS in purple) plotted against the corresponding altitude; the thick curves represent the mean in 100 m classes of altitude smoothed with a Gaussian low pass filter, the thin lines represent the smoothed 5th and 95th percentile of the distribution in every altitude class; above 2,400 m.s.l. the curve of the mean value is dotted and the confidence lines are missing because there are too few grid points above this altitude to reasonably calculate percentiles
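The construction of the thick curves in Fig. 16 can be illustrated with a short sketch: grid-point values are averaged in 100 m altitude classes and the class means are then smoothed with a Gaussian kernel. This is a simplified example and not the original plotting code; the function names, the kernel width `sigma` (in units of classes) and the renormalisation of weights at the ends of the profile are assumptions. The percentile curves are omitted.

```python
import math
from collections import defaultdict


def class_means(altitudes, values, width=100.0):
    """Mean value per altitude class.

    Returns a list of (class lower bound, mean, number of grid points),
    sorted by altitude. Empty classes are simply absent from the result.
    """
    groups = defaultdict(list)
    for z, v in zip(altitudes, values):
        groups[int(z // width)].append(v)
    return [(k * width, sum(g) / len(g), len(g)) for k, g in sorted(groups.items())]


def gaussian_smooth(series, sigma=2.0):
    """Gaussian low-pass filter over consecutive entries of `series`.

    The kernel is truncated at 3 sigma and the weights are renormalised
    near the ends, so no padding is needed.
    """
    radius = int(3 * sigma)
    smoothed = []
    for i in range(len(series)):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(len(series), i + radius + 1)):
            w = math.exp(-0.5 * ((j - i) / sigma) ** 2)
            num += w * series[j]
            den += w
        smoothed.append(num / den)
    return smoothed
```

The number of grid points returned for each class makes it easy to flag sparsely populated classes, such as those above 2,400 m.s.l. in the caption, where percentiles cannot be estimated reliably.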

The height dependence of the correlation differs substantially among the seasons. Winter shows high correlation in grid boxes below 1,000 m.s.l., whereas a rapid decrease is visible above this level, with mean correlations dropping to 0.5 at elevations around 2,500 m.s.l. The situation is different in summer, when the correlation is lowest in the plains, ranging between 0.4 and 0.7, and increases continuously with height up to 1,500 m.s.l. Above that, the correlation is fairly stable at a mean value of 0.7. In autumn, the correlation is generally high (0.8), with a slight decrease above 2,000 m.s.l. In spring, the correlation is again high (0.8) at all altitudes.

The bias and correlation of precipitation vary more strongly with height than those of temperature. In winter, a wet bias that increases with altitude is apparent only against the HISTALP and GPCC datasets, whereas the bias of CCLM against CRU and E-OBS shows no clear altitude dependence. Spring is characterized by a similar pattern, but the increase of the wet bias against HISTALP and GPCC is less pronounced. In summer, the bias pattern is comparable among the observational datasets, with the bias increasing from 0 to 1,500 m.s.l. and decreasing above. In autumn, the bias of CCLM against CRU and E-OBS shows no altitude dependence, whereas it increases with height against HISTALP and GPCC.

The correlation of precipitation does not indicate any clear height dependence (cf. Fig. 17). Only autumn shows an increase in correlation with altitude.

In the light of the uncertainties in the observational datasets in high alpine terrain mentioned above, this height dependence is difficult to interpret in terms of model skill. This is especially the case for precipitation, where the bias in winter spreads from 0 to 100 % across the observational datasets (Fig. 17). This spread is reduced in summer, but the bias still ranges from −20 to +30 %. For temperature, the bias against the different observational datasets agrees from 0 to ~800 m.s.l.; beyond that altitude, the spread grows and reaches 1 °C or more at 2,400 m.s.l. (Fig. 16).

Fig. 17

Bias (upper panel) and correlation (lower panel) of seasonal precipitation sums (CCLM vs. CRU in green, E-OBS in purple, HISTALP in red and GPCC in yellow) plotted against the corresponding altitude; the thick curves represent the mean in 100 m classes of altitude smoothed with a Gaussian low pass filter, the thin lines represent the smoothed 5th and 95th percentile of the distribution in every altitude class; above 2,400 m.s.l. the curve of the mean value is dotted and the confidence lines are missing because there are too few grid points above this altitude to reasonably calculate percentiles

As shown in Fig. 14, observed seasonal mean temperature trends are poorly represented by CCLM. In addition, trends in the model depend on altitude, whereas those in the observations do not, especially in winter and summer. Figure 18 shows scatter plots of temperature trend against altitude for winter and summer. In winter (upper panel), the CCLM trends follow a kind of “S” shape with altitude, with weaker overall trends towards higher altitudes. This is in contrast to the observations, where no height dependence is found in the CRU data and slightly larger positive trends appear in the E-OBS data above 1,000 m.s.l. In summer (lower panel), by contrast, the CCLM trends increase with altitude, starting with negative values from 0 to 1,500 m.s.l. and shifting to positive trends at higher altitudes. The observational data show uniform trends of about 1.5 °C/40 years at all height levels. The only difference between the observational datasets is the spread, which in both winter and summer is narrower around the mean in CRU and wider (3 °C/40 years) in E-OBS.

Fig. 18

Temperature trend at every grid point plotted against altitude for winter (upper panel) and summer (lower panel) and for CCLM (black), CRU (green) and E-OBS (purple), with the scatter plot of CCLM plotted in the background of the observation plots; the thick curves represent the mean in 100 m classes of altitude smoothed with a Gaussian low pass filter

4 Discussion

The results show that the CCLM model, with the specific setup used in this study, is basically able to simulate the spatial and temporal characteristics of seasonal climate in the Central European Alpine Region. The mean biases are of similar magnitude to those in other RCM simulations focusing on the Alpine Region (Bucchignani et al. 2011; Davin et al. 2011; Kotlarski et al. 2010; Jaeger et al. 2008; Roesch et al. 2008; Kotlarski et al. 2005). Differences are mainly due to the use of different model versions, model setups and driving data. For example, the evaluation of ERA-40 driven CCLM hindcast simulations by Jaeger et al. (2008) shows a reasonable simulation of winter temperatures except for the mountainous regions, but a strong underestimation of summer temperatures. This is in contrast to our results, which show a pronounced cold bias in winter and reasonably simulated temperatures in summer. Their precipitation bias, however, is comparable to that in our simulations, with an overestimation of rainfall in winter and the highest wet bias at the northern rim of the Alps. The simulation of summer precipitation shows a prominent dipole pattern with a wet bias north of the Alpine Ridge and a dry bias on the southern slopes of the Alps. Overall, the simulations show both similarities and contradictions; an in-depth comparison, however, is not reasonable because of the different model versions, parameterization schemes, surface boundary conditions etc., as mentioned above.

Bearing in mind that our simulations cover the highly complex orography of the European Alps, the height dependence of model skill is an important issue. CCLM shows an intensification of the cold bias with increasing height, together with an increasing wet bias at higher altitudes. This relationship between elevation and the temperature and precipitation biases might originate from an overestimation of rainfall in the high alpine areas, which can in turn lead to an underestimation of mean temperature through snow cover, moisture and evaporation feedbacks. Kotlarski et al. (2010) have shown that their simulation of Alpine climate with the RCM REMO also exhibited an altitude dependence of precipitation bias, with the wet bias increasing mostly between 1,000 and 2,000 m.s.l. and decreasing between 2,000 and 3,000 m. They concluded that this bias could be caused by a missing advection scheme for rain and snow from one grid box to another, leading to an overestimation of precipitation on the windward slopes and an underestimation downwind (Kotlarski et al. 2010). The CCLM used in this study has an advection scheme for drifting rain and snow, so this may not be the main reason for the specific bias patterns in our simulations. Kotlarski et al. also argued that non-smoothed model topography leads to sharp gradients between neighbouring grid boxes and to the generation of small-scale gravity waves (Roe 2005), which can cause a dislocation of zones of up- and downdraft on a coarse model grid. This might be accompanied by premature condensation of water vapour and premature precipitation when humid air masses are lifted over an orographic obstacle. These effects may have contributed to the precipitation bias patterns in our investigation, but probably only in summer and partly in autumn, when there is a pattern of dry to no bias at the highest ridges of the Alps, surrounded by considerable wet bias towards the north and the south. The other seasons show diverse patterns, with the largest wet bias arising mainly over the Alpine crest. An in-depth investigation of the possible reasons and mechanisms leading to this altitude dependence of precipitation bias would go beyond the scope of this study, but there is a strong need for further simulations and sensitivity studies with a special focus on this issue.

Apart from the evidence that CCLM shows some deficiencies in simulating Alpine climate, uncertainties in the observational datasets with which the model results are compared have to be taken into account. The uncertainties are evident in both the precipitation and the temperature datasets, although they are much higher in the gridded rainfall data, particularly in winter and in the alpine regions. The analysis showed that the precipitation bias of CCLM in winter in these areas (+10 to +70 %) is of the same order of magnitude as the uncertainty in the observational data (a range between the datasets of roughly 50 % relative to the mean). This uncertainty, expressed as the spread between the datasets relative to the mean, is lower in the other seasons, mainly because of the higher absolute precipitation rates. The absolute range amongst the datasets is of the same order of magnitude in each season. The bias of CCLM is therefore not just an expression of model shortcomings, but has to be seen as combined information on model performance and dataset performance. As our results show, the skill attributed to CCLM depends strongly on the observational dataset used for comparison. It is therefore suggested that the evaluation of RCM performance should use as many datasets as possible for bias calculation, so that all uncertainties are taken into account.
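The comparison between model bias and observational spread can be made concrete with a small sketch. The numbers and dataset labels in the usage note are invented purely for illustration and are not results of this study; the function names are hypothetical.

```python
def relative_spread(obs):
    """Range between observational datasets in percent of their mean.

    `obs` maps a dataset label to its seasonal precipitation sum.
    """
    vals = list(obs.values())
    mean = sum(vals) / len(vals)
    return 100.0 * (max(vals) - min(vals)) / mean


def relative_biases(model, obs):
    """Relative bias of the model in percent against each dataset."""
    return {name: 100.0 * (model - ref) / ref for name, ref in obs.items()}
```

With four hypothetical reference values of 200, 300, 250 and 250 mm, the spread is 40 % of the mean, and a model value of 330 mm has a bias of +65 % against the driest dataset but only +10 % against the wettest. The attributed skill thus depends as much on the choice of reference as on the model itself.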

Apart from the altitude-dependent performance of the model and the observational datasets, the simulation of observed temperature trends is another important issue in this investigation. The positive trends in CCLM are generally weaker than in the observations and even switch to a negative sign, which contradicts the trends found in the observational datasets. The differences are largest in spring, summer and autumn, when CCLM simulates more or less pronounced negative trends over the whole domain, whereas CRU and E-OBS show generally positive trends in spring and summer and weak trends of varying sign in autumn. The analysis of the seasonal temperature time series revealed a pronounced overestimation of temperature in summer at the beginning of the simulation, followed by a convergence of model temperatures towards observed values at the end of the simulation period, which produced the overall negative trend in summer. In spring and autumn, the situation is somewhat different: simulated temperatures are close to the observed ones at the beginning of the investigation period but drift towards cooler conditions than observed in CRU or E-OBS, which leads to similar negative trends over the whole simulation period. The prominent feature of strongly overestimated summer temperatures at the beginning leads to the assumption that these biases are caused by flaws in the simulated surface energy balance. An investigation of model soil moisture and latent and sensible heat flux (not shown) revealed that the initial state of soil water is far too dry: within the 3 years of the model spin-up (1958–1960), the relative soil wetness doubled compared to its initial value. After the spin-up, a slight positive trend in soil moisture is still apparent, which is negatively correlated with the temperature trend. Given the important role of soil moisture for the surface energy balance (Jaeger and Seneviratne 2010) and related feedbacks, the lack of soil moisture at the beginning might cause the trend reversal in the CCLM simulations. However, this is only a first guess, and further analysis is required to better understand all processes involved. We also conclude that the flawed trend representation is not caused by the ERA-40 driving data, since the trends in the reanalysis are similar to those of CRU and E-OBS.

The analysis of temperature trends also revealed a distinct altitude dependence, especially in summer. The trends are negative in the lowland areas of the northern sub-regions and switch to positive above 1,500 m.s.l. This behaviour can also be observed in other RCM simulations (Ceppi et al. 2010), but with stronger variations in magnitude. In contrast, this altitude dependence cannot be seen in the CRU and E-OBS observations, whose trend signals are, in addition, much more pronounced than those from CCLM. The altitude dependence in CCLM might be a consequence of limited soil moisture in high mountain areas, leading to weaker energy balance feedbacks than in the lowlands and resulting in a slight positive trend signal above 1,500 m.s.l.

5 Conclusions

This paper presents a comprehensive evaluation of climate simulations for the Greater Alpine Region for past conditions from 1961 to 2000, conducted with the CCLM regional climate model driven by ERA-40. Four different observational datasets have been used to assess the model skill of temperature and precipitation by analysing the seasonal mean bias and correlation, representation of the annual cycle and inter-annual variability in five different sub-regions, proportion of wet days and rainfall intensity, as well as temperature trends and altitude dependencies of bias, correlation and trends. The main findings of this study are:

  • The overall temperature bias of −0.7 to −0.8 °C and precipitation bias of +8 to +23 % are similar to results from other evaluation studies carried out for RCM simulations covering Europe and the Greater Alpine Region.

  • A considerable cold bias is apparent in winter (−1.5 to −4 °C), particularly at high elevations above 1,000 m.s.l.

  • The wet bias in CCLM is most pronounced in winter and spring, and in the HI region throughout the year. The bias is largest in the northern sub-regions and at elevations above 1,000 m.s.l., especially in winter, with values of +10 to +70 %, which might be caused by additional uncertainty in the observational datasets.

  • A disproportion of wet days and rainfall intensity is apparent mainly in summer, with an underestimation of wet days accompanied by an overestimation of rainfall intensity in the lowlands.

  • The CCLM simulations show largely negative temperature trends in spring, summer and autumn. This is contrary to the observations, in which mostly positive trends are found.

  • Both temperature and precipitation biases are altitude dependent, showing increasing biases along with height.

  • The summer temperature trends in CCLM also show pronounced altitude dependence, which cannot be seen from observational data.

  • Gridded datasets of observed temperature and precipitation are afflicted with a considerable amount of uncertainty, particularly in areas with complex orography and high elevation, which makes the assessment of model skill difficult.

In summary, the performance of CCLM is similar to that of other ERA-40 driven simulations for Europe or the Greater Alpine Region, but the biases of temperature and precipitation are rather large, even on a seasonal basis. Especially for the needs of climate impact research over complex mountain terrain like the Alps, the CCLM model has to be improved in terms of temperature and precipitation. Additional detailed analysis on a daily basis is required to detect the specific processes responsible for the model shortcomings, in order to improve the model. This study also confirmed that improved gridded observational datasets over alpine terrain are needed to reduce uncertainties in the model skill assessment, and that it is not reasonable to evaluate RCM results against only one observational dataset, because major differences between the observational datasets are apparent, particularly for precipitation over complex terrain.