1 Introduction

Heat waves (periods of extremely high air temperature in summer) are important phenomena of the European climate. Extraordinary heat waves that were observed in the past two decades, mainly the extreme 2003 heat wave in France (Black et al. 2004) and the extraordinarily hot summer of 2010 in Russia (Schneidereit et al. 2012), have attracted much interest in the climatological community. Heat waves have major impacts on terrestrial ecosystems, water resources, forestry, agriculture, the power industry and human health (e.g. De Bono et al. 2004; Beniston et al. 2007; Barriopedro et al. 2011). Due to the expected rise in global mean air temperature (IPCC 2013) and projected strengthening of atmospheric blocking over the Euro-Atlantic region due to Arctic Amplification (Francis and Vavrus 2012), there are concerns that the losses caused by heat waves will be increasing. Meehl and Tebaldi (2004) and Seneviratne et al. (2012) analysed outputs of global climate models (GCMs) to demonstrate that heat waves will become more frequent and intense in a future climate. In addition, Fischer and Schär (2010) emphasized that the most pronounced changes would occur in low-altitude river basins affecting many densely populated urban centres. To verify the credibility of these projections, model outputs for recent climate must be evaluated against observed data.

Many authors have evaluated modelled daily maximum temperature in summer (Tmax) over Europe. Kjellström et al. (2007) examined Tmax from several regional climate models (RCMs) from the PRUDENCE project (Christensen and Christensen 2007). The RCMs (driven by various GCMs) generally tend to underestimate Tmax in Scandinavia and the British Isles while they overestimate Tmax in Southern and Eastern Europe. This bias was larger in the tails of the Tmax distribution. A similar Tmax pattern over Europe was reported by Nikulin et al. (2011), who examined the RCA3 regional climate model (Samuelsson et al. 2011) driven by the ERA-40 reanalysis (Uppala et al. 2005). Christensen et al. (2008) found larger warm biases in extremely warm conditions, with the implication that climate models may not represent future warmer conditions correctly.

Over Central Europe, RCMs tend to slightly underestimate Tmax (Kjellström et al. 2007; Nikulin et al. 2011). Plavcová and Kyselý (2011) evaluated Tmax in RCM simulations from the ENSEMBLES project (van der Linden and Mitchell 2009). Their results were consistent with the majority of previous studies, indicating negative biases of modelled Tmax over Central Europe. It should be noted, however, that Kjellström et al. (2010) reported that the biases in this region are the smallest across the entire ENSEMBLES-RCMs domain.

Central Europe recently experienced exceptionally high temperatures in August 2012, when the new all-time temperature record (40.4 °C) was set in the Czech Republic (Němec 2012). In summer 2013, moreover, Central Europe was affected by a series of heat waves that peaked on 8 August, when the new absolute maximum temperature (40.5 °C) was measured in Austria. The previous all-time temperature record for Austria (39.9 °C) had been set only a few days earlier, on 3 August, 2013 (ZAMG 2013).

Compared to Tmax simulation, a proper reproduction of heat waves is even more challenging. This requires not only a good simulation of the right tail of the Tmax distribution but also of the persistence of extremely high Tmax. The capability of RCMs to simulate heat waves over Europe was evaluated by Vautard et al. (2013). They used a high-resolution (0.11°) ensemble of RCM simulations from the CORDEX project (Giorgi et al. 2009). Due to the absence of observed gridded data at very high resolution, they interpolated model outputs to the ECA&D stations (Klein Tank et al. 2002) using the nearest-neighbour method with elevation adjustment. Modelled Tmax in Central Europe still suffered from biases, which influenced the characteristics of modelled heat waves: these were too persistent and too severe.

Many papers have focused on potential sources of these biases, and especially on atmospheric circulation and land–atmosphere coupling. Although the relationship between atmospheric circulation and surface air temperature in Europe is most significant in winter (e.g. Cattiaux et al. 2012), extreme high summer temperatures are also related to specific circulation patterns (Della-Marta et al. 2007; Kyselý 2008). The capability of RCMs to reproduce circulation indices (flow direction, strength and vorticity) in Central Europe was investigated by Plavcová and Kyselý (2012). Driven by the ERA-40 reanalysis, the utilized RCMs reproduced the circulation indices relatively well. These results are in concordance with Blenkinsop et al. (2009), who evaluated simulated circulation indices over England. Plavcová and Kyselý (2012) also demonstrated that differences between frequency distributions of circulation indices were higher when the model ensemble contained one RCM driven by various GCMs. On the contrary, these differences were smaller when the model ensemble involved various RCMs driven by one GCM. This reflects the fact that atmospheric circulation is primarily given by lateral boundary conditions and is little modified by individual RCMs.

The significant influence of land–atmosphere coupling on high summer air temperatures was shown by Fischer et al. (2007), who performed RCM simulations of Tmax during the most severe European heat waves with coupled and uncoupled soil-moisture schemes. They found major differences between these two types of simulations, thus indicating that improper simulation of soil-moisture content can dramatically alter the reproduction of Tmax and heat waves. These results were confirmed by Jaeger and Seneviratne (2010), who studied RCM simulations of Tmax over Europe under several soil-moisture scenarios. A reduction of soil-moisture content led to an increase in Tmax and a longer mean heat wave length.

Although a lot of work has been done to evaluate summer Tmax in RCM simulations (including an attribution of biases) and a number of studies examined heat wave characteristics for individual grid points as well, little attention has been given to evaluating heat waves as spatial temperature patterns. In this study, we analysed spatial and temporal characteristics of heat waves in an ensemble of RCM simulations from the ENSEMBLES project. These simulated heat waves were evaluated against observed ones delimited from the E-OBS gridded dataset (Haylock et al. 2008). We investigated the capability of RCMs to reproduce their spatial extent, interannual variability, temperature amplitude and length. Furthermore, the capability of RCMs to reproduce the most severe Central European heat wave observed in 1994 (Lhotka and Kyselý 2014) was analysed, and simulation of an “erroneous” heat wave at the turn of July and August, 1967 was examined in detail. Because soil-moisture feedback can significantly alter the heat wave pattern (e.g. Fischer et al. 2007), we focused on this aspect when studying variations among individual RCMs during these events.

2 Data and methods

2.1 Regional climate model simulations

We examined 7 RCM runs driven by the ERA-40 reanalysis from the ENSEMBLES project (Table 1). The simulations cover the 1961–2000 period. The model runs were selected on the basis of their cartographic projection (rotated latitude/longitude grid with South Pole coordinates −39.25 N, 18.00 E and 25 km resolution). This specification corresponds to the E-OBS gridded dataset (0.22° rotated grid version). In addition, all RCMs have metadata available. The number of vertical levels in the RCMs varies from 24 (SMHIRCA) to 40 (KNMI-RACMO2). Orography was adopted from the GTOPO30 dataset (except for METO-HC_Had, which uses the US Navy 10′ dataset). Each RCM utilizes its own land–surface scheme, which distinguishes different types of land cover with specific behaviours and has several soil layers for modelling heat and moisture storage and fluxes. An example of how these processes are described in one of the RCMs is given in Samuelsson et al. (2011). The depth of the model bottom varies across individual RCMs, which creates some difficulties when analysing soil moisture conditions. Further descriptions of individual models are available in metadata files at the ENSEMBLES RT3 data portal (http://ensemblesrt3.dmi.dk/).

Table 1 Examined RCMs driven by the ERA-40 reanalysis

2.2 Area of interest

The analysis was performed over Central Europe as defined by 1,000 grid points (40 × 25) and covering an area of 625,000 km² (Fig. 1). This region is located in the area within approximately 47–53°N and 8–22°E. It includes Germany (excluding northern areas and the Rhineland), northern Austria, the Czech Republic, the south-western part of Poland, Slovakia (excluding its eastern part) and northern Hungary.

Fig. 1 Definition of Central Europe (black dashed polygon), the larger domain (grey solid polygon) and the elevation model (GTOPO30) used in the E-OBS 9.0 gridded dataset

2.3 Datasets utilized

To evaluate spatial and temporal characteristics of Central European heat waves, we examined modelled time series of Tmax that had a length of 3,680 days for each RCM (92 days in each summer over the 1961–2000 period; summer was regarded as the period between 1 June and 31 August). Observed data (E-OBS) for the same period were taken from the E-OBS 9.0 gridded dataset (Haylock et al. 2008) with the same projection and resolution as the modelled data.

For evaluating precipitation rates during and before heat waves, we used modelled daily precipitation data. These data were compared with observed precipitation from the E-OBS gridded dataset (Haylock et al. 2008). Due to the different thickness of soil layers among individual RCMs and the various depths of RCM bottoms, it is impossible to compare simulated soil moisture content directly between individual RCMs. Inasmuch as wet soils yield a high evaporative fraction (e.g. Small and Kurc 2003), soil moisture conditions were estimated on the basis of the evaporative fraction. The evaporative fraction (EF) is defined as the ratio between latent heat flux (Qe) and available energy [sum of latent heat flux and sensible heat flux (Qe + Qh)] and it is related to the Bowen ratio (β):

$$EF = \frac{Q_{e}}{Q_{e} + Q_{h}} = \frac{1}{1 + \beta}$$
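The relation above can be illustrated with a minimal sketch, assuming that latent and sensible heat fluxes are available as arrays of equal shape. The function name `evaporative_fraction` and the sample flux values are hypothetical and are not taken from the RCM output; the Bowen ratio is taken in its usual sense, β = Qh/Qe.

```python
import numpy as np

def evaporative_fraction(q_e, q_h):
    """Return EF = Qe / (Qe + Qh); NaN where the available energy is zero."""
    q_e = np.asarray(q_e, dtype=float)
    q_h = np.asarray(q_h, dtype=float)
    available = q_e + q_h                      # available energy Qe + Qh
    ef = np.full(available.shape, np.nan)      # NaN marks undefined EF
    np.divide(q_e, available, out=ef, where=available != 0)
    return ef

# Hypothetical fluxes in W m-2 (illustration only, not model data)
q_e_demo = np.array([120.0, 60.0])
q_h_demo = np.array([30.0, 60.0])
ef_demo = evaporative_fraction(q_e_demo, q_h_demo)
bowen_demo = q_h_demo / q_e_demo               # Bowen ratio, beta = Qh / Qe
```

In this toy case the wetter point (larger Qe relative to Qh) has the higher EF, and both forms of the equation give the same values.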

2.4 Definition of heat wave

The definition of a heat wave (HW) was based on the persistence of hot days (HDs) and is the same as that in Lhotka and Kyselý (2014), who dealt with HWs in the E-OBS gridded dataset. For each day in summer, daily maximum air temperature (Tmax) in each grid point over Central Europe was transformed into Tmax deviation by subtracting the grid point specific 95 % quantile of summer Tmax distribution (calculated from the 1961–2000 period). Any day was considered a HD when the average of these Tmax deviations over all grid points in Central Europe (Fig. 1) was greater than zero. Thus, a HD can occur only if a substantial part of Central Europe is affected by Tmax above the 95 % quantile.

A HW over Central Europe was defined as a period of at least three consecutive HDs. For this period, the grid maps of positive Tmax deviations (excesses) were summed into a cumulative map. The relatively strict criteria allow identifying only major HWs that are presumed to have considerable impacts on the natural environment and society. This definition was applied for both simulated and observed data. Due to biases in the modelled 95 % quantile of the Tmax distribution and because our intention was to focus on spatial and temporal characteristics of HWs rather than the Tmax bias itself, we utilized respective 95 % quantiles (calculated for each RCM) when delimiting HWs from modelled data. The respective quantiles were applied also by Vautard et al. (2013) who evaluated HWs in CORDEX-RCM simulations over Europe.
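The delimitation of hot days and heat waves described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used in the study: the input is assumed to be a (days × grid points) array of summer Tmax for one dataset, the quantile is estimated with NumPy's default (linear interpolation) method, and runs of hot days are assumed not to continue from 31 August of one year to 1 June of the next. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def hot_days(tmax, q=0.95):
    """Flag hot days in a (n_days, n_grid) array of summer Tmax.

    Returns the Tmax deviations from the grid-point-specific quantile and a
    boolean vector that is True where the regional mean deviation is > 0.
    """
    tmax = np.asarray(tmax, dtype=float)
    threshold = np.quantile(tmax, q, axis=0)   # one quantile per grid point
    deviations = tmax - threshold
    return deviations, deviations.mean(axis=1) > 0

def heat_waves(is_hd, years, min_len=3):
    """Return (start, end) index pairs (end exclusive) of runs of at least
    min_len consecutive hot days that lie within a single year."""
    waves, start = [], None
    n = len(is_hd)
    for i in range(n + 1):
        in_run = i < n and bool(is_hd[i])
        new_year = start is not None and i < n and years[i] != years[start]
        if start is not None and (not in_run or new_year):
            if i - start >= min_len:
                waves.append((start, i))
            start = None
        if in_run and start is None:
            start = i
    return waves

# Synthetic illustration (not real data): 100 days, 4 identical grid points
tmax_demo = np.tile(np.arange(100.0)[:, None], (1, 4))
dev_demo, hd_demo = hot_days(tmax_demo)

# Hand-made hot-day flags spanning a year boundary
flags_demo = np.array([0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1], dtype=bool)
years_demo = np.array([1961] * 10 + [1962] * 2)
waves_demo = heat_waves(flags_demo, years_demo)
```

Because the quantile is computed from the same array that is being classified, applying `hot_days` separately to each RCM and to E-OBS corresponds to the use of respective quantiles described above.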

To describe the severity of individual HWs, we used a heat wave extremity index (Ihw; Lhotka and Kyselý 2014) that is calculated from a cumulative map of positive Tmax deviations. Values of individual grid points (TS′max) are summed up and scaled by the total number of grid points in Central Europe (1,000):

$$I_{hw} = \frac{1}{1000}\sum_{i = 1}^{n} \left( TS_{max}^{\prime} \right)_{i} \; \left[ ^{\circ}\mathrm{C} \right]$$

where n is the number of grid points with a positive Tmax deviation in a cumulative map. This index uses summed deviations over the whole period during which a HW persists, and hence it captures joint effects of temperature magnitude, spatial extent and also length of a HW. Detailed evaluation and discussion of the extremity index is given in Lhotka and Kyselý (2014).
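As a minimal sketch of the index, assuming that the Tmax deviations for the days of one HW are stored as a (days × grid points) array covering all Central European grid points: the function name `extremity_index` and the toy values are hypothetical, and the divisor is taken from the array shape (it would be 1,000 for the domain used here).

```python
import numpy as np

def extremity_index(dev_hw):
    """Heat wave extremity index for one HW.

    dev_hw: (n_hw_days, n_grid) Tmax deviations from the 95 % quantile.
    Positive deviations are summed over the HW days into a cumulative map,
    which is then summed over grid points and scaled by their total number.
    """
    dev_hw = np.asarray(dev_hw, dtype=float)
    cumulative_map = np.clip(dev_hw, 0.0, None).sum(axis=0)   # TS'max per grid point
    return float(cumulative_map.sum() / dev_hw.shape[1])

# Toy deviations in deg C: 3 HW days, 4 grid points (illustration only)
dev_hw_demo = np.array([[1.0, -2.0,  0.5, -1.0],
                        [2.0,  1.0, -0.5, -1.0],
                        [0.0,  3.0,  0.5, -1.0]])
ihw_demo = extremity_index(dev_hw_demo)
```

Grid points that never exceed their quantile contribute zero to the cumulative map, so summing over all grid points is equivalent to summing over the n points with a positive value.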

2.5 Heat wave characteristics

In addition to using Ihw, we evaluated several other characteristics of HWs. The temperature amplitude (Tamp) is the highest daily value of Tmax deviations from the 95 % quantile of summer Tmax distribution during the HW (in any grid point in Central Europe) and represents an anomaly of its peak temperature. The length of a HW (L) is the number of days during which a HW persists (the number of consecutive HDs). The spatial extent (A) is given by an area where the Tmax deviations from the 95 % quantile of summer Tmax distribution were positive for at least 3 days. This is the only characteristic that was calculated over the larger domain (Fig. 1) in order to capture the larger-scale pattern associated with each HW. The larger domain is defined by 10,000 grid points (100 × 100), but we excluded grid points over the sea and used only 7,016 continental grid points in order to allow a comparison with E-OBS. Although HWs were visualised over this larger region, grid points outside Central Europe were not taken into account when calculating Ihw, Tamp and L. The characteristics of HWs are summarized in Table 2.

Table 2 Characteristics of heat waves
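The three additional characteristics can be sketched in the same way. The sketch assumes that "at least 3 days" refers to any 3 days within the HW (not necessarily consecutive) and that every grid cell represents 625 km², a value derived here from the 1,000 grid points covering 625,000 km². The function name and the toy arrays are hypothetical.

```python
import numpy as np

def hw_characteristics(dev_ce, dev_large, cell_area_km2=625.0, min_days=3):
    """Return (Tamp, L, A) for one heat wave.

    dev_ce:    (n_hw_days, n_grid_ce) Tmax deviations over Central Europe
    dev_large: (n_hw_days, n_grid_large) deviations over the larger domain
    """
    dev_ce = np.asarray(dev_ce, dtype=float)
    dev_large = np.asarray(dev_large, dtype=float)
    t_amp = float(dev_ce.max())                 # peak deviation at any grid point
    length = dev_ce.shape[0]                    # number of consecutive hot days
    days_above = (dev_large > 0).sum(axis=0)    # days above the quantile per cell
    extent = float((days_above >= min_days).sum() * cell_area_km2)
    return t_amp, length, extent

# Toy example (illustration only): 3 HW days
dev_ce_demo = np.array([[1.0, -0.5],
                        [2.5,  0.5],
                        [0.5,  1.0]])
dev_large_demo = np.array([[1.0, -0.5, 0.2, -1.0],
                           [2.5,  0.5, 0.1, -1.0],
                           [0.5,  1.0, 0.3, -1.0]])
t_amp_demo, length_demo, extent_demo = hw_characteristics(dev_ce_demo, dev_large_demo)
```

Only the spatial extent uses the larger domain; the amplitude and length are taken from the Central European grid points alone, as in the text.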

The ratio between the total duration of HWs and the total number of HDs indicates whether HDs have a high clustering tendency (a ratio close to 1), or whether HDs tend to occur separately throughout summer (a ratio close to 0). This ratio is hereafter referred to as the clustering index (Icl).
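A minimal sketch of the clustering index, assuming a boolean series of hot-day flags for a single summer (for a multi-year series, runs would additionally have to be split at year boundaries). The function name and the flag sequence are hypothetical.

```python
import numpy as np

def clustering_index(is_hd, min_len=3):
    """Icl = (hot days belonging to heat waves) / (all hot days)."""
    total_hd = int(np.sum(is_hd))
    days_in_hw, run = 0, 0
    for flag in list(is_hd) + [False]:   # trailing False closes the last run
        if flag:
            run += 1
        else:
            if run >= min_len:           # run qualifies as a heat wave
                days_in_hw += run
            run = 0
    return days_in_hw / total_hd if total_hd else float("nan")

# Hand-made flags: runs of 3, 1, 4 and 2 hot days (illustration only)
flags_icl_demo = np.array([1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1], dtype=bool)
icl_demo = clustering_index(flags_icl_demo)
```

In this example 7 of the 10 hot days fall within runs of at least three days, so the index is 0.7.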

2.6 Temporal autocorrelation

Persistence of Tmax over Central Europe was assessed by temporal autocorrelation computed as Pearson product-moment coefficients for lagged data pairs, which is the most commonly used method (Wilks 2011). For each day in summer, Tmax values across 1,000 grid points over Central Europe were averaged into a regionally averaged Tmax, which was used to compute correlation coefficients. Since the Tmax series is not continuous, we computed correlation coefficients individually for each summer and averaged them thereafter.
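The per-summer computation can be sketched as follows, assuming that the regionally averaged Tmax is arranged as a (summers × days) array and that the lag is at least 1 day. The function name and the synthetic series are hypothetical and serve only to show the averaging of per-summer coefficients.

```python
import numpy as np

def lagged_autocorrelation(tmax_by_summer, lag):
    """Mean over summers of the Pearson correlation between Tmax(t) and
    Tmax(t + lag), computed separately within each summer (lag >= 1)."""
    coeffs = [np.corrcoef(series[:-lag], series[lag:])[0, 1]
              for series in np.asarray(tmax_by_summer, dtype=float)]
    return float(np.mean(coeffs))

# Two synthetic 92-day "summers" (illustration only):
# a steadily rising series and a strictly alternating one
rising = np.arange(92.0)
alternating = np.where(np.arange(92) % 2 == 0, 1.0, -1.0)
summers_demo = np.vstack([rising, alternating])
r1_demo = lagged_autocorrelation(summers_demo, lag=1)
r2_demo = lagged_autocorrelation(summers_demo, lag=2)
```

Computing the coefficient within each summer avoids pairing 31 August of one year with 1 June of the next, which is the reason given above for treating the summers individually.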

3 Evaluation of heat wave characteristics and temporal variability in RCMs

During the 1961–2000 period, the RCMs simulated 104.6 hot days on average (Table 3), which is comparable to E-OBS (105). By contrast, the multi-model mean of the total duration of HWs was larger than in E-OBS, which is due to a higher clustering tendency of hot days in the RCM simulations (manifested in greater values of Icl). The temperature amplitude was overestimated in the majority of the RCMs, and especially in METO-HC_Had. This RCM suffered from unrealistically hot isolated grid points that caused the highest average temperature amplitude (14.3 °C), far exceeding the observed value (6.0 °C). The average length of HWs was too short in modelled data, and the only RCM that simulated HWs that were too long on average was KNMI-RACMO2. Although the multi-model mean of simulated spatial extent was similar to E-OBS, large variations were present among individual RCMs. SMHIRCA simulated low average values (796,100 km²), while the spatial extent of HWs was considerably enhanced in KNMI-RACMO2 and METNOHIRHAM (Table 3). Pronounced differences among the RCMs also exist in the average and total extremity index. For example, both average and total Ihw were more than twice as great in KNMI-RACMO2 as in SMHIRCA (Table 3). In the multi-model mean, the average value of Ihw (characteristic of a single HW) was underestimated while the total Ihw was overestimated, which is related to the greater number of HWs in modelled data.

Table 3 Comparison of HW characteristics in RCM simulations and observed data (1961–2000)

Because Icl was overestimated in all RCMs, we evaluated the temporal autocorrelation of regionally averaged Tmax in individual RCMs in comparison to E-OBS (Fig. 2). In general, most RCMs (except for METO-HC_Had) exhibited slightly greater values of autocorrelation coefficients than E-OBS. The lowest values of Icl in C4IRCA3 and METO-HC_Had are linked to relatively low correlation coefficients (but still higher than the observed one in the case of C4IRCA3). The second highest value of Icl in METNOHIRHAM is accompanied by the highest correlation coefficients, particularly for lags of 2 days and more. Although the relationship between Icl and the correlation coefficients is not perfectly expressed, the results suggest that the generally enhanced clustering tendency of hot days in the RCMs is related to an overestimated autocorrelation of Tmax.

Fig. 2 Temporal autocorrelation of regionally averaged Tmax in RCM simulations and E-OBS during the 1961–2000 period

The substantial overestimation of the total Ihw by the RCMs was analysed with respect to precipitation rates during the HWs (Table 4). The observed summer precipitation climatology (238.8 mm) and the average summer daily precipitation (2.6 mm) over Central Europe were simulated reasonably well in all RCMs and they are captured almost perfectly by the multi-model mean. In E-OBS, the average daily precipitation during HWs was 0.9 mm, which is approximately one-third of average summer daily precipitation (35 %). This ratio was considerably underestimated in KNMI-RACMO2 (9 %), which simulated virtually no precipitation during HWs; this is probably related to the generally overestimated severity of heat waves in this RCM (expressed by the greatest value of average Ihw as well as total Ihw). Overestimated values of total Ihw in ETHZ-CLM, METNOHIRHAM and METO-HC_Had are also linked to low average daily precipitation during the HWs, while lower values of total Ihw in the rest of the RCMs are associated with higher average daily precipitation during HWs.

Table 4 Precipitation rates during the HWs in RCM simulations and observed data (1961–2000)

Interannual variability of modelled hot days and HWs in each RCM and E-OBS is shown in Fig. 3. Generally, the RCMs had a tendency to overestimate the number of hot days and severity of HWs in the first decade (1961–1970), especially METNOHIRHAM. In 1967, all RCMs simulated more hot days compared to E-OBS. Moreover, all RCMs simulated at least one HW in 1967 while no HW occurred in E-OBS that year.

Fig. 3 Temporal variability of the number of hot days (HD, red dots) and the annual sum of the heat wave extremity index (Ihw, grey bars) in RCM simulations and observed data during the 1961–2000 period. DIFF represents the difference between the multi-model mean and E-OBS

The 1971–1990 period was characterized by a small number of hot days and low Ihw values in E-OBS. This feature was well depicted by the majority of RCMs; however, KNMI-RACMO2 and METO-HC_Had clearly simulated too many hot days and HWs. By contrast, METNOHIRHAM simulated no HW and very few hot days in this period. It should be noted that the observed 1974, 1976 and 1983 HWs were not captured by most RCMs. In addition, MPI-M-REMO simulated a very high annual sum of Ihw in 1986 while the sum of Ihw in 1986 was equal to 0 in E-OBS.

The last analysed decade (1991–2000) was punctuated by the extreme summers of 1992 and 1994. These severe HWs were reproduced reasonably well only in ETHZ-CLM, KNMI-RACMO2 and METNOHIRHAM, while the rest of the RCMs failed to simulate major HWs in these years. In addition, METNOHIRHAM and MPI-M-REMO simulated very high annual sums of Ihw and severe HWs in 1996, which contradicts observations.

The difference between the multi-model mean and E-OBS showed that the RCMs overestimated the severity of HWs mainly in 1967 and 1996 while the severity of HWs was underestimated in 1992 and especially in 1994. Possible causes of these discrepancies for the summers of 1994 and 1967 are investigated in detail in Sects. 4 and 5, respectively.

4 Reproduction of the 1994 heat wave in RCMs

In this section, we investigated the capability of RCMs to reproduce the most severe Central European HW (according to Ihw) that occurred in 1994 and persisted for 15 days (23 July–6 August). This was a record-breaking HW across Central Europe over the 1950–2012 period for which E-OBS data were available (Lhotka and Kyselý 2014). For the 15-day heat wave period, we summed the grid maps of positive daily Tmax deviations for each RCM into cumulative maps in order to obtain simulated temperature patterns. While all RCMs agreed with E-OBS that Tmax deviations were largest in the area north of Central Europe, we found major differences between the observed 1994 HW pattern and individual RCM simulations (Fig. 4).

Fig. 4 Cumulative maps of positive Tmax deviations between 23 July and 6 August 1994 for each RCM, the multi-model mean and E-OBS

Relatively good reproduction of the major 1994 HW was found in ETHZ-CLM, KNMI-RACMO2 and METNOHIRHAM. In particular, ETHZ-CLM and METNOHIRHAM simulated the spatial distribution of cumulative temperature deviations quite well; however, the Ihw was slightly reduced (Fig. 4). In KNMI-RACMO2, a distinct area of extreme hot anomalies (sum of temperature deviations above the 95 % quantile of summer Tmax distribution >50 °C) was simulated over north-eastern Germany, north-western Poland and southern Sweden. By contrast, the south-eastern part of Central Europe was only slightly affected, thus resulting in a lower Ihw also in this RCM. In the rest of the RCMs, the severity of the 1994 HW was substantially underestimated. C4IRCA3, MPI-M-REMO and SMHIRCA in particular simulated only small temperature anomalies, thus resulting in low values of Ihw (Fig. 4). In 3 RCMs (C4IRCA3, METO-HC_Had, MPI-M-REMO), no HW according to the definition applied was simulated during the period corresponding to the observed HW.

In order to determine the causes of these large differences among the RCMs, we analysed the simulated precipitation and soil-moisture conditions. First, we evaluated the accumulated amount of precipitation in the RCMs averaged over Central Europe during the 1994 HW (Fig. 5).

Fig. 5 Accumulated average precipitation over Central Europe during the 1994 heat wave (23 July–6 August)

In E-OBS, the accumulated average precipitation over Central Europe during the 1994 HW was 7.3 mm, which is considerably below the normal precipitation amount for this period (36.3 mm). We found large differences in this characteristic among the RCM simulations and a clear relationship to the simulated temperature patterns. The reproduction of precipitation during the 1994 HW was most realistic in ETHZ-CLM and METNOHIRHAM (Fig. 5), i.e. in the two RCMs with the most realistic temperature patterns (Fig. 4). In these RCMs, the accumulated average precipitation was close to E-OBS. Almost no precipitation was simulated by KNMI-RACMO2 (the third RCM with a pronounced HW) during the whole period of the 1994 HW. By contrast, the rest of the RCMs substantially overestimated precipitation over Central Europe during the period corresponding to the observed HW, especially MPI-M-REMO and METO-HC_Had. This suggests that the quite realistic reproduction of temperature patterns for the 1994 HW in ETHZ-CLM and METNOHIRHAM was linked to a credible simulation of precipitation rates during this period. The near absence of precipitation in KNMI-RACMO2 was probably related to unrealistically hot anomalies in the northern part of Central Europe, while considerably overestimated precipitation rates (by a factor of 3–6) in C4IRCA3, METO-HC_Had, MPI-M-REMO and SMHIRCA were associated with reduced temperature patterns during the HW period.

We also investigated the precipitation amount over Central Europe in the early-summer period preceding the HW (from 1 June to 22 July). The precipitation deficit was simulated by all RCMs but it was less pronounced than in E-OBS (Table 5). While this might have contributed to the fact that the 1994 HW was simulated as less severe over Central Europe (according to Ihw) in all RCMs, we found no relationship between the precipitation rates in the previous period and the simulated temperature patterns for the 1994 HW in individual RCMs. In fact, the deficits were larger in those RCMs that did not reproduce the HW.

Table 5 Precipitation rates for the period 1 June–22 July

The development of the evaporative fraction (EF) during the 1994 HW is shown in Fig. 6. EF is the ratio between the latent heat flux and the available energy (Sect. 2.3). These variables are not available in E-OBS, so our analysis was limited to an inter-comparison of the RCMs. Above-average EF values (moister conditions) during the 1994 HW period were simulated by MPI-M-REMO (0.82), SMHIRCA (0.81) and C4IRCA3 (0.77), in which weak temperature patterns were found for the 1994 HW (Fig. 4). Below-average values of EF (drier conditions), simulated in the rest of the RCMs [METO-HC_Had (0.59), ETHZ-CLM (0.60), METNOHIRHAM (0.62), KNMI-RACMO2 (0.67)], were related to the more pronounced temperature anomalies (Fig. 4), although this link was less clearly expressed than the link to the precipitation amount during the HW. It is noteworthy that although METO-HC_Had simulated high precipitation rates (Fig. 5), its mean EF was the lowest (Fig. 6). Moreover, KNMI-RACMO2 had moderate EF values (Fig. 6) despite simulating almost no precipitation (Fig. 5). This issue is discussed in more detail in Sect. 6.4.

Fig. 6 Development of evaporative fraction over Central Europe during the 1994 heat wave (23 July–6 August)

5 “Erroneous” 1967 heat wave in RCM simulations

All examined RCMs except ETHZ-CLM simulated a HW between 31 July and 4 August 1967 while only a single hot day was observed during this period in E-OBS. Analogously to Sect. 4, we summed the grid maps of positive Tmax deviations for each RCM into cumulative maps in order to obtain simulated temperature patterns for this period, which were compared against observations (Fig. 7). In E-OBS, only small positive Tmax anomalies were found over Central Europe, resulting in a low value of Ihw (Fig. 7). By contrast, all RCMs considerably enhanced temperature patterns, especially METNOHIRHAM, which simulated a severe HW over Central Europe.

Fig. 7 Cumulative maps of positive Tmax deviations between 31 July and 4 August 1967 for each RCM, the multi-model mean and E-OBS

As in Sect. 4, we evaluated the accumulated amount of precipitation averaged over Central Europe between 31 July and 4 August 1967. The accumulated precipitation in E-OBS for the examined period was 17.3 mm (Fig. 8), which was slightly more than the average precipitation for this period of year (13.2 mm). Precipitation was considerably underestimated in all RCMs, which supported development of the “erroneous” HW. The lowest amount of precipitation (1.3 mm) was simulated by METNOHIRHAM, in which temperature anomalies over Central Europe were most pronounced. A similar amount of precipitation (1.6 mm) in KNMI-RACMO2 was associated with much less pronounced temperature anomalies (Fig. 7), but this RCM has generally very low precipitation amounts during heat waves (Sect. 3). Between one-third and one-half of the observed precipitation was simulated for the HW period in the remaining RCMs.

Fig. 8 Accumulated average precipitation over Central Europe between 31 July and 4 August 1967

The precipitation in the RCMs and E-OBS during the preceding period from 1 June to 30 July is given in Table 6. In observed data, the precipitation amount was only slightly below the 1961–2000 climatology (91 %). In the majority of the RCMs, by contrast, considerable precipitation deficits were simulated, especially in METNOHIRHAM (41 %). The combination of this major precipitation deficit and almost no precipitation during the simulated HW most probably contributed to the greatest temperature anomalies among all the RCMs. The precipitation deficit in June and July and reduced soil moisture in the other RCMs (except for METO-HC_Had) also supported the development of a HW at the turn of July and August in their simulations.

Table 6 Precipitation rates for the period 1 June–30 July

6 Discussion

6.1 Evaluation of spatial and temporal characteristics of simulated heat waves

In evaluating spatial and temporal characteristics of heat waves, substantial differences became apparent among individual RCMs and between the RCMs and observations. Inasmuch as we calculated 95 % quantiles of Tmax distribution for each RCM separately to delimit heat waves, these errors were not caused by simple Tmax biases.

The clustering index of hot days (the ratio between the total duration of heat waves and the total number of hot days) revealed a generally greater clustering tendency for hot days in the RCMs than in the observed data. Higher temporal autocorrelation of summer Tmax found in most RCMs, and particularly for lags of 2 days and more, is likely associated with the enhanced clustering tendency of hot days.

Vautard et al. (2013) showed that RCMs from the CORDEX project (Giorgi et al. 2009) tend to overestimate the amplitude and persistence of heat waves, even when respective percentiles (calculated for each RCM) are used. The greater mean amplitude of heat waves accords with our study, but a comparison of persistence is more complex. Vautard et al. (2013) demonstrated that all RCMs overestimated the number of heat waves that persisted more than a few days at the expense of shorter events, and the overestimation generally increased with duration. Such an attribute was not present in our study, since only KNMI-RACMO2 simulated a longer heat wave than was observed (an 18-day heat wave between 25 July and 11 August 1994), and the average length of heat waves was shorter in most RCMs compared to observations.

These seemingly contrasting results are probably associated with different definitions of events. While our definition of a heat wave was based on at least three consecutive hot days and a larger area being affected, Vautard et al. (2013) also evaluated single-day events and imposed no requirement on their spatial extent. This definition, then, resulted in a substantially greater number of events. Moreover, a higher respective temperature quantile (95 %) was applied in our study. If single-day events were included in our study as well, then the average length of HWs would decrease more in observed than in simulated data (as the clustering index of hot days is smaller in the observed data) and the results would be more consistent. Increasing overestimation for heat waves of longer duration (Vautard et al. 2013) is also consistent with the greater overestimation of autocorrelation of daily Tmax for longer lags, as found for most RCMs in our study.

Substantial biases in precipitation rates during heat waves were detected among the RCMs; these differences were linked to the total extremity index of heat waves and may be related to different convection schemes. The influence of convection schemes on heat wave development in RCM simulations was recently studied by Stegehuis et al. (2014), who demonstrated their crucial role in the simulation of heat waves.

The general overestimation of heat wave severity in the 1960s and its contrasting underestimation in the 1990s is related to underestimated temperature trends in the RCM simulations. Lorenz and Jacob (2010) showed that most RCMs from the ENSEMBLES project driven by the ERA-40 reanalysis failed to simulate the observed temperature trend properly. This feature was consistent in all domains over Europe.

6.2 Reproduction of the 1994 heat wave

Using as an example the most severe Central European heat wave observed between 23 July and 6 August 1994, we demonstrated that the temperature patterns were model-dependent and differed considerably from observations in most RCMs. Since RCMs driven by the ERA-40 reanalysis reproduce persistence of weather regimes (Sanchez-Gomez et al. 2009) and flow indices (Plavcová and Kyselý 2012) over Central Europe relatively well, we focused on land–atmosphere coupling as a possible driver for the errors in simulations of this event.

The importance of realistically simulating soil-moisture conditions in RCMs for the development of heat waves was emphasized by Fischer et al. (2007) and Vautard et al. (2013). We identified a pronounced overestimation of precipitation amount during the period of the 1994 heat wave in all RCM simulations that considerably underestimated the magnitude of the temperature pattern (4 out of 7), while the simulated precipitation was much closer to reality in the 3 RCMs that captured the 1994 heat wave reasonably well. This is also in accordance with model simulations performed by Jaeger and Seneviratne (2010), who concluded that heat waves are strongly affected by the total amount of soil moisture.

Since Fischer et al. (2007) found no precipitation deficit in Central Europe in spring 1994, we analysed the preceding precipitation amount only from 1 June to the onset of the July–August 1994 heat wave. The precipitation deficit was present in all examined RCMs, but it was smaller in each case than that actually observed and was not linked to the magnitude of the temperature pattern of the 1994 heat wave. This suggests that the precipitation deficit during this heat wave was more important for its development than was the preceding precipitation amount. However, the slightly overestimated early-summer precipitation in all RCM simulations might also have contributed to the general underestimation of the July–August 1994 heat wave.

6.3 “Erroneous” 1967 heat wave in RCM simulations

Analogously to the most severe Central European heat wave in 1994, we also investigated the substantial overestimation of Tmax and the “erroneous” heat wave that appeared in 6 out of 7 RCMs between 31 July and 4 August 1967. During this period, all RCMs simulated considerably lower precipitation amounts compared to observed data. Although the precipitation was underestimated by a factor of 2–10, the relationship between the magnitude of the temperature pattern and precipitation during this event was weaker than in the case of the 1994 heat wave, and the underestimated precipitation was probably not the only major source of errors.

The analysis of preceding precipitation rates revealed a major deficit in METNOHIRHAM, which simulated the most pronounced temperature pattern over Central Europe in 1967. This suggests that these errors were caused by the joint effect of precipitation underestimation during this event, amplified by the preceding precipitation deficit and associated drying. The importance of dry soils in driving and/or enhancing heat waves was reported by several previous studies (e.g. Fischer et al. 2007). Errors in the 1994 heat wave simulations, by contrast, were caused primarily by incorrect simulation of precipitation during this event, while the role of the preceding precipitation deficit was relatively small. This is probably also related to the different lengths of the events.

6.4 Evaporative fraction during the 1967 and 1994 heat waves

Since the examined RCMs differ in the thickness of their soil layers and in soil depth, we analysed soil-moisture conditions in the RCMs indirectly through evaporative fraction (Sect. 2.3). Although this micrometeorological characteristic is today measured worldwide within the FLUXNET project (Baldocchi et al. 2001), measurements of evaporative fraction before the beginning of the 21st century were rather sparse. The typical annual values for evaporative fraction range from 0.01 in very dry areas to 0.8 in quite humid regions (Jung et al. 2011). According to Hartmann (1994), the estimated average value of evaporative fraction over Europe is 0.6, and Fischer et al. (2007) pointed out that the evaporative fraction was about 0.1 during the severe 1976 heat wave over the British Isles.
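For orientation, the quantity can be sketched as follows, assuming the commonly used definition of evaporative fraction as the ratio of latent heat flux to the sum of latent and sensible heat fluxes, EF = LE / (LE + H). The function name and the flux values are hypothetical and chosen only so that the results match the magnitudes quoted above; they are not taken from the RCM outputs or from observations.

```python
import numpy as np

def evaporative_fraction(latent, sensible):
    """Evaporative fraction EF = LE / (LE + H) for fluxes in W m^-2.

    Returns NaN where the sum of the two fluxes is not positive,
    because the ratio is not meaningful there.
    """
    latent = np.asarray(latent, dtype=float)
    sensible = np.asarray(sensible, dtype=float)
    total = latent + sensible
    ef = np.full(total.shape, np.nan)
    # Divide only where the denominator is positive; other cells stay NaN.
    np.divide(latent, total, out=ef, where=total > 0)
    return ef

# Hypothetical daily-mean fluxes (W m^-2): a moist surface, a dry surface,
# and a case with no turbulent fluxes at all.
latent = np.array([120.0, 20.0, 0.0])
sensible = np.array([80.0, 180.0, 0.0])

print(evaporative_fraction(latent, sensible))
```

A moist surface that converts most of the available energy into evaporation yields a value near 0.6, whereas a dried-out surface that returns the energy mainly as sensible heat yields a value near 0.1.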

Hence, the simulated evaporative fraction during the 1994 heat wave seems to be overestimated (average values among the RCMs range from 0.6 to 0.8). A systematic overestimation of evapotranspiration, which is closely related to evaporative fraction, was found by Mueller and Seneviratne (2014) in GCMs, and it is possible that a similar bias is also present in the examined RCMs.

We found some signs of suspicious relationships between precipitation and evaporative fraction in our study. Although KNMI-RACMO2 simulated virtually no precipitation over Central Europe during the whole 1994 heat wave, the evaporative fraction in this RCM was relatively high. In addition, KNMI-RACMO2 underestimated the precipitation amount in the period preceding the heat wave, which is inconsistent with the high evaporative fraction. Van Meijgaard et al. (2008) noted that the RACMO 2.1 land surface scheme was modified because the previous model version simulated insufficient soil drying. The suspiciously high evaporative fraction might indicate that this issue persists in the examined KNMI-RACMO2 simulation. A similar feature was found in METNOHIRHAM, which simulated high values of evaporative fraction during the “erroneous” 1967 heat wave (around 0.9, not shown), despite the fact that this RCM simulated a considerable precipitation deficit in the early summer and during the heat wave. By contrast, METO-HC_Had had a low evaporative fraction, despite relatively high precipitation rates during the 1994 heat wave. The low evaporative fraction might enable the development of relatively hot conditions despite substantially overestimated precipitation during this event. These results suggest that realistic reproduction of soil moisture is a critical issue in the RCMs, and this is particularly relevant for heat waves and the credibility of their scenarios in a possible future climate.

6.5 Performance of individual RCMs

The main features of the individual RCMs with respect to the simulation of spatial and temporal heat wave characteristics over Central Europe can be summarized as follows:

C4IRCA3 simulated the average temperature amplitude and spatial extent of heat waves well. By contrast, the 1994 heat wave was poorly reproduced, interannual variability of heat waves was distorted and the total heat wave extremity index was slightly underestimated.

ETHZ-CLM simulated temperature amplitude and spatial extent of the heat waves reasonably well. The 1994 heat wave was also captured. Some insufficiencies were found in the simulated interannual variability of heat waves, and the total heat wave extremity index was overestimated.

KNMI-RACMO2 best reproduced the interannual variability and its simulation of the 1994 heat wave was relatively good too. By contrast, this RCM substantially overestimated the spatial extent and total extremity index of heat waves, which is probably related to large underestimation of precipitation during heat waves.

METNOHIRHAM best reproduced the 1994 heat wave and simulated the average length of heat waves relatively well. However, this RCM suffered most from “erroneous” severe heat waves in 1967 and 1996 when no heat waves were observed. Similarly to KNMI-RACMO2, the spatial extent and total extremity index of heat waves were substantially overestimated.

METO-HC_Had had difficulties in reproducing the characteristics and the interannual variability of heat waves. The largest drawbacks were found in temperature amplitude, which was simulated unrealistically due to isolated hot grid points.

MPI-M-REMO simulated the spatial extent and total extremity index of heat waves relatively well. By contrast, the average length of heat waves was considerably underestimated, the 1994 heat wave was not properly reproduced and the interannual variability of heat waves was distorted.

SMHIRCA substantially underestimated temperature amplitude, spatial extent and total extremity index of heat waves. The reproduction of the 1994 heat wave was poor and the interannual variability of heat waves was simulated insufficiently.

Although there is no single RCM that outperforms others as to the simulation of all heat wave characteristics over Central Europe, ETHZ-CLM performed relatively well in most characteristics. Previous studies (Christensen et al. 2010; Holtanová et al. 2012) indicated that KNMI-RACMO2 performed best among RCMs from the ENSEMBLES project. The drawbacks of KNMI-RACMO2 found in our study suggest important limitations in reproducing temporal and spatial structure of daily temperatures also in this RCM. The RCMs with the worst simulation of heat waves over Central Europe in the examined ensemble were probably METO-HC_Had and SMHIRCA.

7 Conclusions

We investigated the ability of RCMs to reproduce spatial and temporal characteristics of heat waves over Central Europe, using the ensemble of 7 RCM simulations driven by the ERA-40 reanalysis over the 1961–2000 period. We utilized the E-OBS 9.0 gridded dataset as a source of observed data. Since soil-moisture feedback enhances temperature amplitude and prolongs duration of heat waves, we also focused on land–atmosphere coupling in the RCM simulations during the most severe Central European heat wave in 1994 as well as during an “erroneous” heat wave found in the RCM simulations in 1967. The main conclusions are as follows:

  • The RCMs simulated more heat waves than were observed, despite the fact that the number of hot days was almost the same as in the observed data. The overestimation of the clustering tendency of hot days is probably related to enhanced temporal autocorrelation of summer Tmax, particularly for lags of 2 days and more.

  • On average, heat waves tended to be shorter but with a too-pronounced temperature peak in most RCMs. The spatial extent and heat wave extremity index were reproduced reasonably well in the ensemble mean, although considerable differences were found among individual RCMs.

  • Substantial variations in precipitation rates during heat waves in the RCMs were found, and they were related to the total extremity index of heat waves.

  • We found major differences among the RCMs when reproducing interannual and interdecadal variability of heat waves and hot days. In general, the RCMs overestimated the severity of heat waves and the number of hot days in the 1960s and underestimated both during the extreme 1992 and 1994 summers.

  • Focusing on the ability of the RCMs to reproduce the most severe 1994 heat wave, we found considerable differences between observations and the RCMs that were linked to the simulated precipitation during this event. Only those RCMs that reproduced the precipitation deficit captured the 1994 heat wave reasonably well, while the heat wave was weak or entirely missing in all RCMs that substantially overestimated precipitation during this period.

  • Analogously, the overestimated temperature anomalies in 1967 in all RCMs were related to unrealistic precipitation deficits simulated during the “erroneous” heat wave as well as in the preceding early-summer period.

  • The evaporative fraction was suspiciously high in the RCMs during the 1994 heat wave. This suggests a possible contribution of other factors such as cloud cover and associated downward radiation that might strongly affect heat wave development in the RCMs. Moreover, the link between simulated evaporative fraction and precipitation was poorly expressed, indicating deficiencies in land surface schemes among the RCMs.

  • Although there is no single RCM that outperforms others as to the simulation of all heat wave characteristics over Central Europe, ETHZ-CLM performed relatively well in most characteristics. By contrast, METO-HC_Had and SMHIRCA were probably the worst-performing RCMs in the examined ensemble as to the simulation of heat waves over Central Europe.

Regional climate models have become a powerful tool for exploring impacts of global climate change on a regional scale. Further work is needed to determine the relationships between extreme high temperature, atmospheric circulation, soil-moisture conditions, cloud cover and associated incoming shortwave radiation in RCM simulations. Evaluating these driving processes before and during simulated heat waves can provide a better attribution of errors in reproducing such extreme events. It is important to better understand the strengths and weaknesses of RCMs also for assessing the credibility of projected regional changes of heat waves in future climate and, ultimately, for improving the RCMs.