1 Introduction

Heat waves and droughts are climate phenomena that can increase human mortality, cause economic losses, and damage ecosystems, including their carbon and water functions (Ciais et al. 2005; Van der Molen et al. 2011; Vandentorren et al. 2004; WHO 2004). In the last decade, Europe has encountered several such events with major consequences, which fostered research on these phenomena in order to better understand and predict them and to reduce their negative effects. Such research might become even more pressing since climate models predict increasing temperatures and drier conditions (Meehl and Tebaldi 2004; Schär et al. 2004; Fischer and Schär 2010; Seneviratne et al. 2006). Despite progress in seasonal weather forecasts, the predictability of heat waves and droughts remains poor in the mid-latitudes (Koster et al. 2011; van den Hurk et al. 2012). Even though some recent developments of numerical weather prediction models appear to improve the seasonal predictability of European heat waves (Weisheimer et al. 2011), effective skill remains to be demonstrated on such time scales. In a drought and heat wave event, changes in the water and energy fluxes, and their local and regional feedbacks, need to be understood.

Europe is characterized by two contrasting hydroclimatic zones. In Southern Europe (south of approximately 44°N except along the Atlantic coast) a Mediterranean climate dominates, whereas over the rest of Europe the climate is mostly influenced by maritime weather throughout the year. The Mediterranean climate is characterized by a long anticyclonic, warm and dry summer season during which soil moisture is depleted and most plants are dormant. In Central and Northern Europe, soil moisture availability in summer is usually not limiting for ecosystem transpiration and photosynthesis, and potentially problematic hot temperatures, i.e. exceeding 30–35 °C, do not occur frequently. However, climate variability sometimes leads to severe episodes of heat and drought to which ecosystems and society are not adapted. This variability is mostly driven by the configuration and persistence of large-scale flows such as the North Atlantic Oscillation (Hurrel 2000) or other weather regimes (Michelangeli et al. 1995; Reinhold and Pierrehumbert 1982). The variability of these phenomena is driven by internal baroclinic and barotropic instabilities, which make their evolution unpredictable. In westerly flow conditions, cloudy conditions generally prevail, limiting the amount of energy received by the surface and the evolution towards a possible drought through soil moisture depletion. Continental or stagnant flows, on the contrary, enhance soil drying through decreased cloudiness and increased sensible heat fluxes and surface temperatures, causing a positive feedback on soil moisture depletion and elevated temperature.

Heat and drought are also preconditioned by low soil moisture availability due to precipitation deficits in the months preceding summer. Over dry soils, evapotranspiration is driven by soil moisture (Teuling et al. 2009; Seneviratne et al. 2010). A deficit of soil moisture limits evapotranspiration (LE), which results in increased sensible heat fluxes due to the energy conservation constraint. The shift towards higher sensible heat fluxes in turn produces drier and warmer air and increases evaporative demand, which dries the soil further. This positive feedback loop generates fewer clouds and increased surface shortwave radiation, which again causes even more drying. In this “soil-moisture limited regime”, positive feedbacks on drought and elevated temperatures can therefore take place, possibly leading to the amplification or extension of an initial drought and to the possible development of heat waves (Jaeger and Seneviratne 2011). By contrast, such an amplification of initial dryness does not take place over wet soils, where evapotranspiration is controlled essentially by the net radiation available at the surface (Teuling et al. 2009). In such cases even persistent anticyclonic episodes cannot provoke large temperature rises because the surplus of net radiation can be used to evaporate water. The controls on the evapotranspiration regime and its evolution in the course of the spring and summer are therefore critical to the occurrence of summer heat waves and droughts. While the basic principles of these processes are known, there is still uncertainty in their precise progression and development (Teuling et al. 2010; Seneviratne et al. 2010), as well as in the skill of models to reproduce observations. In this context, comparing measurements with model analyses of the evolution of evapotranspiration during the course of the spring and summer can provide useful information about the establishment of hot and dry summers.

Over Southern Europe, modeled LE variability was shown to be correlated with precipitation, while it was correlated with radiation over Northern Europe (Teuling et al. 2009). This illustrates the existence of distinct prevailing regimes, soil-moisture limited and energy limited, in each region of Europe. However, these correlations were computed from annual statistics, whereas the regime controlling LE may change during the development of a summer season. In particular, we expect the limit between the two regimes to move northward in late spring and early summer as soils get progressively drier. After a winter/spring rainfall deficit we also expect a soil moisture limited regime to set in early in southern areas and to favor earlier northward development (Vautard et al. 2007).

These rather theoretical or empirical considerations are consistent with the observation of a rainfall deficit in Southern Europe in seasons preceding the hottest summers of the past 60 years (Vautard et al. 2007). The subsequent mechanism of northward propagation of heat and drought was shown in regional sensitivity simulations (Zampieri et al. 2009). In favorable southerly flows, the southern moisture deficit reduces cloud cover over Central/Northern Europe through advection of warm and dry air, enhancing shortwave downward radiation. Persistence of such flows then helps the exposed areas to switch from energy-limited to soil-moisture limited conditions in the middle of the summer. Thus a moisture deficit in Southern Europe during springtime can help induce dry conditions and heatwaves in summer, even over the northern parts of Europe. This process is also favored by a dynamical feedback of dry soils (Haarsma et al. 2009; Zampieri et al. 2009).

The evolution of observed temperature and precipitation during the inception and establishment of a heatwave has been extensively studied (Hirschi et al. 2011; Haarsma et al. 2009), but the evolution of latent and sensible heat fluxes has not yet received much attention (Teuling et al. 2010). One of the reasons for this is the lack of observations with appropriate spatial and temporal coverage. Latent heat flux can be measured or estimated in several ways: directly with lysimeters, with the eddy-covariance technique at site level (e.g. the FLUXNET global network, Baldocchi et al. 2001), from the large-scale atmospheric or terrestrial water balance, by approximation with land surface models, or by derivation from remote sensing data (Seneviratne et al. 2010). However, all of these methods have disadvantages, which has resulted in the absence of long-term gridded data products. Recently some datasets were developed which provide this information, such as the GLEAM data product, based on remote sensing data (Miralles et al. 2011), and the Model Tree Ensemble (MTE) product of Jung et al. (2010), based on FLUXNET data. The latter is an interpolation of pointwise but temporally continuous flux tower measurements with gridded meteorological fields and remote sensing measurements of the fraction of absorbed photosynthetically active radiation (fAPAR). MTEs (Jung et al. 2009) were constructed to estimate monthly latent and sensible heat fluxes on a 0.5° global grid from 1982 to 2008. Although this dataset is based on site-level flux measurements, the global maps of LE are largely extrapolated in time and space because the network of flux towers has gaps over biomes such as savannas, tropical forests, and to some extent Mediterranean ecosystems. Another problem is the lack of energy balance closure at many flux tower sites, which requires a considerable bias correction to the original flux data (see Jung et al. 2010 SI). Therefore, this global LE data product is very attractive for model evaluation, but it cannot be seen as a direct observation; it is rather an empirical model elaborated from observations.

Besides measurements, information about fluxes can be obtained from models. However, another consequence of the absence of long-term flux measurements is the lack of constraints for models and benchmarks for modeling studies (Miralles et al. 2011). In order to obtain good model performance, all key processes have to be modeled accurately. Without proper benchmarks, large uncertainties remain in present day summer climate simulation, which also results in an increased uncertainty for future projections (e.g. Lenderink 2010; Christensen et al. 2010; Déqué et al. 2011; Boberg and Christensen 2012; Boé and Terray 2008).

The aim of this study is twofold. First, we seek to understand the different evolutions of LE and sensible heat (SH) in warm and cold European summer years by using the evapotranspiration and sensible heat flux data product of Jung et al. (2011) and considering its uncertainty, in order to build increased predictive knowledge on the occurrence of heatwaves and droughts. Second, we investigate the simulated evolution of both LE and SH in 13 model simulations taken from the European ENSEMBLES project, and attempt to assess their skill in reproducing these reconstructed data and re-analyses. Here we focus on the evaluation of the European ENSEMBLES project (Hewitt and Griggs 2004). In the ENSEMBLES project, several regional climate models (RCMs), driven by ERA-40 re-analysis boundary conditions (Uppala et al. 2005), were run over the period 1961–2000. The ENSEMBLES database (http://ensemblesrt3.dmi.dk) furthermore contains the results of RCMs driven by global circulation model (GCM) boundary conditions from 1961 to 2050. While the latter are important for looking at the future, the former runs are necessary to evaluate the RCM performance.

Section 2 gives a description of the data: the reconstructed dataset of Jung et al. (2011), E-OBS data, ECMWF re-analysis data and the ENSEMBLES project, respectively. Section 3 describes the methods, Sect. 4 the results and discussion, followed by the conclusions in Sect. 5.

2 Data description

2.1 MTE monthly LE and SH data product

The dataset created by Jung et al. (2011) derived from FLUXNET observations contains an estimate of monthly latent and sensible heat fluxes with global coverage, a resolution of 0.5°, and a time span of 27 years, from 1982 to 2008. This dataset is hereafter referred to as MTE. The machine learning algorithm Model Tree Ensembles (Jung et al. 2009) was trained to predict monthly fluxes using the global La Thuile FLUXNET data set (http://www.fluxdata.org) based on meteorological, climate, remotely sensed vegetation state (fAPAR), and land cover data. The MTEs are then applied at global scale (Jung et al. 2010, 2011). Their performance was evaluated with a fivefold leave-sites-out cross-validation (Jung et al. 2011), with an ensemble of the Global Soil Wetness Project land surface model results, and with basin scale LE derived from water budgets (Jung et al. 2010).

The upscaling process can be divided into three main steps: (1) processing and quality control of flux tower data, (2) MTE training for both sensible and latent heat flux using site-level explanatory variables, and (3) using gridded data sets of these variables for global upscaling by applying the MTEs (Jung et al. 2011). An exact description of the methods can be found in Jung et al. (2009, 2011).

The data product (Jung et al. 2010) used in this study uses LE corrected for energy balance closure using the method described in Jung et al. (2010 SI). In this default version, only precipitation (P) and temperature (T) are used as meteorological data (i.e. with interannual variability) to train the MTE. To illustrate the uncertainty of the data-driven LE product with respect to energy balance closure, we use several different versions of the MTE product constructed by Jung et al. (2010): (1) default energy balance correction where the energy balance residual is distributed to SH and LE according to the Bowen ratio (LEcor—default version), (2) LE derived as the residual of the energy balance (LEres = Rn−SH−G), and (3) no energy balance correction applied (LE) (Table 1). In comparison to derived catchment water balances it was found that “no correction” of LE results in a systematic underestimation (significant bias) of mean annual LE, while no significant bias is found when LE is either corrected (LEcor) or derived from the energy balance residual (Jung et al. 2010 SI).
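The three treatments of the energy balance can be illustrated for a single flux record. The sketch below is a minimal illustration only and not the processing of Jung et al. (2010 SI): the function names and the sample flux values are hypothetical, and the Bowen-ratio closure is written in its common form, in which LE and SH are scaled by the same factor so that SH/LE is preserved and LE + SH equals Rn − G.

```python
def close_bowen(le, sh, rn, g):
    """Distribute the energy balance residual to LE and SH while
    preserving the Bowen ratio SH/LE. All fluxes in W m-2.
    Returns (LEcor, SHcor)."""
    available = rn - g      # available energy
    turbulent = le + sh     # measured turbulent fluxes
    if turbulent == 0:
        raise ValueError("LE + SH is zero; Bowen-ratio closure is undefined")
    scale = available / turbulent
    return le * scale, sh * scale


def le_residual(sh, rn, g):
    """LE derived as the residual of the energy balance: LEres = Rn - SH - G."""
    return rn - sh - g


# Hypothetical monthly mean fluxes (W m-2), for illustration only
rn, g, le, sh = 120.0, 10.0, 60.0, 30.0

le_cor, sh_cor = close_bowen(le, sh, rn, g)   # version (1), LEcor
le_res = le_residual(sh, rn, g)               # version (2), LEres
le_raw = le                                   # version (3), no correction
```

With an unclosed energy balance (LE + SH < Rn − G) the uncorrected LE is the lowest of the three values, and LEres is the highest because it assigns the whole residual to LE.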

Table 1 Different MTE data products with their meteorological drivers

To account for the uncertainty of the choice of meteorological drivers, we furthermore investigate different versions of the MTE product with the energy balance closure: (1) with additional use of global radiation (Rg) and vapor pressure deficit (VPD) (LEcor_P_T_Rg_V); (2) with additional use of net radiation (Rn) and VPD (LEcor_P_T_Rn_V); and (3) with additional use of net radiation, VPD and wind speed (U) (LEcor_P_T_Rn_V_U) (Table 1; Fig. 1). While in all cases tower measurements of Rg, Rn, V, U were used for training, the data sets of Rg, V, U used for global application are based on Sheffield et al. (2006), while Rn is based on the simulation of the VIC land surface model using the Sheffield forcing data set.

Fig. 1 Average over 26 years of the different latent heat MTE data products. The product used in most of the article is denoted “LEcor_P_T”. Note that products using radiation (Rn) exhibit a few grid cells with higher values (the cause remains unknown)

The reason for choosing the default version with only P and T as meteorological variables was twofold. Firstly, it was found that little information was lost by ignoring radiation and vapor pressure, likely because of their strong covariation. Secondly, P and T are available as gridded observation-based products, while only reanalysis products are available for e.g. radiation and vapor pressure deficit, and these are associated with considerable uncertainty, especially regarding interannual variability.

Cross-validation analysis (LEcor_P_T, SH_P_T) revealed that the spatial variability of mean annual fluxes and the seasonal variability is very well captured (Pearson correlations between 0.87 and 0.94) while monthly anomalies (deviations from mean seasonal cycle) are more uncertain (Jung et al. 2011). Nevertheless, it appears that the magnitude of interannual variability is underestimated as is indicated by the ratio of the variance of predicted anomalies to the variance of observed anomalies (0.45 and 0.36 for LE and SH respectively). This is confirmed by a comparison of the interannual variability of carbon fluxes between process models and MTE (Jung et al. 2011). The reason for the underestimation of the magnitude of the interannual variability is not clear.
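The variance ratio quoted above can be sketched as follows. This is a minimal illustration under stated assumptions, not the cross-validation code of Jung et al. (2011): the function names are hypothetical, anomalies are taken as deviations from the mean seasonal cycle of each series, and the input is a complete monthly series without gaps.

```python
from statistics import mean, pvariance


def monthly_anomalies(series, period=12):
    """Deviations from the mean seasonal cycle of a gap-free monthly series."""
    if len(series) % period:
        raise ValueError("series length must be a multiple of the period")
    climatology = [mean(series[m::period]) for m in range(period)]
    return [v - climatology[i % period] for i, v in enumerate(series)]


def variance_ratio(predicted, observed, period=12):
    """Variance of predicted anomalies divided by variance of observed anomalies.
    Values below 1 indicate underestimated interannual variability."""
    pred_anom = monthly_anomalies(predicted, period)
    obs_anom = monthly_anomalies(observed, period)
    return pvariance(pred_anom) / pvariance(obs_anom)


# Synthetic two-year example: the prediction captures the seasonal cycle
# exactly but only half of the amplitude of the anomalies.
seasonal = [float(10 * m) for m in range(12)]
observed = [s + 2.0 for s in seasonal] + [s - 2.0 for s in seasonal]
predicted = [s + 1.0 for s in seasonal] + [s - 1.0 for s in seasonal]

ratio = variance_ratio(predicted, observed)  # 0.25 for this synthetic case
```

Halving the anomaly amplitude yields a ratio of 0.25, which shows how a product can reproduce the seasonal cycle well and still understate year-to-year variability.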

2.2 ECA&D data

The European Climate Assessment & Dataset Project provides daily gridded observational datasets (E-OBS) for temperature and precipitation in Europe from 1946 to 2010 (Haylock et al. 2008). We use both daily variables at a 0.5° resolution from 1950 to 2010.

2.3 ECMWF re-analyses

The European Centre for Medium-Range Weather Forecasts (ECMWF) provides re-analysis data over a long period. In this study both ERA-40 (Uppala et al. 2005) and ERA-Interim (ERA-I) (Dee et al. 2011) are used to cover the period 1961–2008. From ERA-40, the years 1961–2000 are used, and from ERA-I the years 1983–2008, to cover the same time span as the MTE reconstructed data. Latent and sensible heat flux, 2 m temperature, total precipitation and the land-sea masks are obtained at a resolution of 2.5° for ERA-40 and 0.75° for ERA-I.

2.4 Regional climate model simulations: the ENSEMBLES data set

In order to examine the uncertainty in the simulation of the evolution of LE and SH fluxes from regional climate models that are also used for future climate projections, we used the set of simulations carried out within the FP7 ENSEMBLES project (Hewitt and Griggs 2004; van der Linden and Mitchell 2009). These simulations are described in Kjellström et al. (2010). Since our aim is to evaluate regional processes of land–atmosphere interactions, we first focus on the simulations (RT3) of regional climate models (RCMs) that are driven at the boundaries by current climate, i.e. ERA-40 re-analyses (see below). The variables used are described in Table 2; only the monthly means are considered here. The model set is described in Table 3. The simulation period from 1961 to 2000 is analyzed. The model results are projected onto a common 0.50° × 0.50° grid (see Sect. 3).

Table 2 ENSEMBLES models variables
Table 3 Characteristics RT3 ENSEMBLES models used in this study

In order to evaluate models used for future climate projections, we also used a second set of simulations where RCMs are forced by global climate models (GCMs) at the boundaries. This set (RT2B) also runs from 1961 to 2000 and is further described in Table 4. We expect the RT2B simulations to deviate more from the observed fluxes than the RT3 ones.

Table 4 Characteristics RT2B ENSEMBLES models used in this study

3 Methods

3.1 Domain

The analysis is carried out over Europe (EU) for the domain 37° to 60°N, 15°W to 25°E. Two sub-domains are defined: “Southern Europe” (SEU) denotes latitudes below 46°N and “Northern Europe” (NEU) denotes latitudes above 46°N, chosen to be consistent with the previous studies of Zampieri et al. (2009) and Vautard et al. (2007). Although the southern border was set at 36°N in these studies, we use 37°N here in order to exclude the part of North Africa that is included with a border at 36°N. The Iberian Peninsula (IP) alone is also considered, bounded by latitudes from 37° to 44°N and longitudes from 10°W to 3°E, following Christensen and Christensen (2007). The sensitivity of our results to the size of the EU, SEU and NEU domains was tested by slightly altering their boundaries; in all cases, the results proved robust. All datasets were re-gridded to the MTE grid prior to the analysis with bilinear interpolation, which conserves the global flux. Land-sea masks were used to retain only land pixels; for each model and re-analysis product, its own mask was used. The reconstructed dataset covers land area only.
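Averaging a gridded flux over a latitude band with a land-sea mask can be sketched as below. This is an illustrative sketch only: the function name, the tiny 2 × 2 grid and its values are hypothetical, and the cosine-of-latitude area weighting is an assumption on our part, since the weighting used for the regional averages is not specified here.

```python
import math


def regional_mean(field, lats, land_mask, lat_min, lat_max):
    """Area-weighted mean of `field` over land cells with lat_min <= lat < lat_max.

    field, land_mask: lists of rows, one row per latitude.
    lats: cell-centre latitudes in degrees, one per row.
    Weights are proportional to cos(latitude) (regular lat-lon grid assumed).
    """
    total = 0.0
    weight_sum = 0.0
    for row, mask_row, lat in zip(field, land_mask, lats):
        if not (lat_min <= lat < lat_max):
            continue
        weight = math.cos(math.radians(lat))
        for value, is_land in zip(row, mask_row):
            if is_land:
                total += weight * value
                weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no land cells inside the requested latitude band")
    return total / weight_sum


# Hypothetical 2 x 2 grid of LE values (W m-2): one southern and one northern row
lats = [40.25, 50.25]
le_field = [[50.0, 70.0],
            [80.0, 90.0]]
land = [[True, False],    # second southern cell is sea and is ignored
        [True, True]]

le_seu = regional_mean(le_field, lats, land, 37.0, 46.0)  # southern band
le_neu = regional_mean(le_field, lats, land, 46.0, 60.0)  # northern band
```

Passing the band limits as arguments makes it straightforward to repeat the averaging with slightly shifted boundaries, as in the sensitivity test described above.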

3.2 Selection of warm and cold summers

Because the MTE data product covers a relatively short period (1982–2008), it does not allow reliable statistics of climate extremes to be produced at the seasonal timescale. In order to obtain the most reliable statistics, we instead calculate statistics of the differences between the warm half and the cold half of the summers. Warm (and cold) summers are defined with an index that is the mean European summer (JJA) 2-m E-OBS continental temperature over the period 1950–2010, linearly detrended at each grid point and averaged over Europe. For the analysis of the MTE data, this leaves 13 “cold” and 13 “warm” summers (Table 5). For the ENSEMBLES models (re-analyses), the same method is applied, but warm and cold summer years are calculated from the temperature simulated by each model (re-analysis). For ERA-40 and the ENSEMBLES models, data for the period 1961–2000 are detrended, while for ERA-I only data over 1983–2008 are detrended. The E-OBS temperature data are detrended from 1961 to 2000 to allow a consistent comparison of E-OBS temperature and precipitation data with results from the ENSEMBLES models. Besides defining warm/cold summers using mean detrended summer temperature, we also tested the same index calculated from detrended summer daily maximum temperature. Since the results were found to be very similar, we only show the results of the analysis done with detrended daily mean temperature.
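The selection of warm and cold summers can be sketched as follows. This is a minimal illustration, not the code used for the study: the function names and the six-year sample series are hypothetical, and the sketch detrends an already area-averaged JJA index rather than each grid point (for a least-squares linear detrending of gap-free series with fixed averaging weights the two orders give the same index, because both operations are linear).

```python
from statistics import mean


def detrend(years, values):
    """Remove the least-squares linear trend and return the residuals."""
    year_mean, value_mean = mean(years), mean(values)
    sxx = sum((y - year_mean) ** 2 for y in years)
    sxy = sum((y - year_mean) * (v - value_mean) for y, v in zip(years, values))
    slope = sxy / sxx
    return [v - (value_mean + slope * (y - year_mean))
            for y, v in zip(years, values)]


def split_warm_cold(years, jja_temperature):
    """Split the years into a warm and a cold half using the detrended index.

    Both lists are sorted by decreasing temperature anomaly. With an odd
    number of years the median year is left out (a choice of this sketch).
    """
    if len(years) < 2:
        raise ValueError("need at least two years")
    anomalies = detrend(years, jja_temperature)
    ranked = sorted(zip(anomalies, years), reverse=True)
    half = len(ranked) // 2
    warm = [year for _, year in ranked[:half]]
    cold = [year for _, year in ranked[-half:]]
    return warm, cold


# Hypothetical JJA mean temperatures (deg C): a 0.1 deg C per year trend
# plus alternating cold and warm anomalies
years = [1983, 1984, 1985, 1986, 1987, 1988]
jja_t = [17.0, 19.1, 17.2, 19.3, 17.4, 19.5]

warm_years, cold_years = split_warm_cold(years, jja_t)
```

Detrending first ensures that late years are not classified as warm merely because of the long-term warming trend.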

Table 5 Warm and cold summers defined by detrended 2-m continental JJA temperature from E-OBS. The years are sorted by decreasing temperature anomaly

3.3 Model performance analysis

Analysis of interannual variability of the output of the ENSEMBLES models is done based on the Mean Squared Deviation (MSD) (Eq. 1) (Kobayashi and Salam 2000). In this method the deviation (MSD) of a simulation from an observation is calculated based on a squared bias (SB) (Eq. 2), the squared difference between the standard deviations (SDSD) (Eq. 3) and a misfit of correlation weighted by the standard deviations (LCS) (Eq. 4):

$$ MSD = SB + SDSD + LCS $$
(1)
$$ SB = \left( {\bar{x} - \bar{y}} \right)^{2} $$
(2)
$$ SDSD = \left( {SD_{s} - SD_{m} } \right)^{2} $$
(3)

where \( \bar{x} \) is the mean of the simulated values, \( \bar{y} \) is the mean of the measured values, \( SD_{s} \) is the standard deviation of the simulation and \( SD_{m} \) the standard deviation of the measurement.

$$ LCS = 2SD_{s} SD_{m} \left( {1 - r} \right) $$
(4)

where r is the correlation coefficient between the measurement and the simulation. For a detailed description of the method see Kobayashi and Salam (2000). It should be noted that a possible underestimation of the magnitude of the interannual variability in the MTE products will lead to higher values of MSD through its impact on SDSD and LCS. However, it is unlikely that this will impact the ranking of the individual models. Trends are tested with the Mann–Kendall test.
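The decomposition in Eqs. 1–4 can be sketched in a few lines. The function name and the sample values are hypothetical; the sketch assumes population standard deviations (division by n), in which case SB + SDSD + LCS equals the mean of the squared differences between simulation and measurement.

```python
from math import sqrt
from statistics import mean


def msd_components(sim, obs):
    """Return SB, SDSD, LCS and their sum MSD for paired series (Eqs. 1-4)."""
    n = len(sim)
    if n == 0 or n != len(obs):
        raise ValueError("sim and obs must be non-empty and of equal length")
    x_bar, y_bar = mean(sim), mean(obs)
    # Population standard deviations (divide by n, not n - 1)
    sd_s = sqrt(sum((x - x_bar) ** 2 for x in sim) / n)
    sd_m = sqrt(sum((y - y_bar) ** 2 for y in obs) / n)
    if sd_s == 0 or sd_m == 0:
        raise ValueError("correlation is undefined for a constant series")
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(sim, obs)) / n
    r = cov / (sd_s * sd_m)
    sb = (x_bar - y_bar) ** 2              # squared bias, Eq. 2
    sdsd = (sd_s - sd_m) ** 2              # squared difference of SDs, Eq. 3
    lcs = 2 * sd_s * sd_m * (1 - r)        # lack of correlation, Eq. 4
    return {"SB": sb, "SDSD": sdsd, "LCS": lcs, "MSD": sb + sdsd + lcs}


# Hypothetical summer LE values (W m-2) for four years
simulated = [62.0, 70.0, 58.0, 66.0]
measured = [55.0, 60.0, 52.0, 57.0]

parts = msd_components(simulated, measured)
direct_msd = mean((x - y) ** 2 for x, y in zip(simulated, measured))
```

In this example the squared bias dominates the MSD, which is the situation expected when a model systematically overestimates LE but follows its year-to-year variations.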

4 Results

4.1 Seasonal cycles of fluxes from the reconstructed data and its uncertainty

4.1.1 Seasonal cycle

Both LE and SH exhibit a strong seasonal cycle with highest absolute values during summer, when net radiation is highest (Fig. 2a, default MTE version). LE values from MTE reach about 60 Wm−2 but rapidly decrease in July. SH values from MTE peak in June at about 40 Wm−2 and decrease later and more slowly than LE. This indicates that the Bowen ratio generally increases after June, throughout the summer season. Over SEU, the magnitude of the LE flux is lower than over NEU; the Bowen ratio is higher and increases at a higher rate in summer (Fig. 2b). LE is lowest over the Iberian Peninsula, where its maximum occurs between May and June, earlier than elsewhere, and highest in summer over NEU (figures not shown). The SH flux shows the opposite pattern, with highest absolute values over the Iberian Peninsula and lowest ones over NEU. Further, the Bowen ratio increases early in the SEU average. The LE decrease mirrored by a parallel SH increase from June to July over SEU is typical of a soil moisture limited regime.

Fig. 2 Annual cycle of latent and sensible heat flux (Wm−2) averaged over EU (a) and SEU (b) and the difference between warm and cold European summers for the latent heat MTE data products averaged over EU (c) and SEU (d). Annual cycle of latent heat flux for the different runs of MTE data over EU (e) and SEU (f). We show SEU separately because of its possible effect on northward propagation

4.1.2 Difference between warm and cold summers in the seasonal cycle of LE and SH

When looking at the mean seasonal evolution of the LE/SH difference between warm and cold summers, we find that warm European summers, when compared to cold summers, are preceded by a positive LE anomaly in March in most regions. This anomaly extends into April over Southern Europe (Fig. 3). The positive LE anomaly is accompanied by a negative SH anomaly in March (Fig. 4). The total reconstructed SH + LE anomaly in March is positive before a warm summer, although not significantly, as a possible consequence of higher net radiation. During the following summer months of a warm summer (especially in July), NEU exhibits a positive anomaly of both latent and sensible heat (P < 0.01), but the IP is characterized by a deficit of latent heat. Sensible heat in the IP shows an excess in June and July, but this excess is only significant (P < 0.1) when using a temperature index based on maximum temperature to define a warm summer. The SH excess preceding a warm summer is marked from April to July over the IP. The general anti-correlation of LE and SH in the South during summer, in contrast to their positive correlation in the North, confirms the tendency for a soil-moisture limited regime in SEU and for an energy-limited regime in Northern Europe.

Fig. 3 Monthly evolution of the spatial pattern of the difference in latent heat flux (Wm−2) between warm and cold summer years in the MTE cor_P_T dataset (data averaged from 1983 to 2008)

Fig. 4 As Fig. 3 but for sensible heat flux (Wm−2)

4.1.3 Difference between warm and cold summers in the seasonal cycle of temperature

The temperature data from E-OBS over the same period as the LE and SH data also show an anomalous warming in March preceding a warm summer (Fig. 5). This anomaly, although not significant (P < 0.2), may also contribute to increased evaporative demand and therefore to the anomaly of LE in the same month. The same temperature data, analyzed over a longer time span (1951–2010), also reveal a robust positive March anomaly preceding a warm summer (Fig. 6), which is significant over Southern Europe (P < 0.05). Furthermore, when using the period 1951–2000 instead of 1983–2008, a significant temperature anomaly is also observed in April (P < 0.05). This might indicate that the limited number of years available quantitatively influences our results for LE and SH anomalies preceding a warm summer, since both fluxes are correlated with temperature.

Fig. 5 As Fig. 3 but for temperature (°C) from E-OBS data, over the period of LE and SH data availability, 1983–2008

Fig. 6 As Fig. 5 but for temperature (°C) from E-OBS data over the full period, 1951–2010

4.1.4 Difference between warm and cold summers in the seasonal cycle of precipitation

A winter and spring precipitation deficit was found to be a possible indicator of summer heatwaves (Vautard et al. 2007; Zampieri et al. 2009), and thus might be able to explain some of the LE or SH anomalies. This hypothesis is, however, not confirmed by our analysis with sufficient significance over the period 1983–2008 (Fig. 7). We used precipitation frequency here because it has a more homogeneous spatial distribution than cumulative precipitation (Vautard et al. 2007). Only May shows a deficit in precipitation frequency (number of days with precipitation >0.5 mm) over SEU (P < 0.05) preceding a warm summer. In June, the precipitation deficit is found to be significant only over the IP (P < 0.05), and it moves over NEU in July (P < 0.1) (Fig. 7).

Fig. 7 Spatial pattern of the difference in precipitation frequency between warm and cold summers (1983–2008), averaged over January–May, and for June, July and August

The differences between our results and those of previous studies (Vautard et al. 2007; Zampieri et al. 2009) can be explained by our index, which separates warm and cold summer years instead of selecting extremely hot summer years. Indeed, if we extend the period to 1950–2010 and look at the ten hottest and ten coldest summer years (calculated with the same method as before, but considering the extreme summers instead of the warm and cold halves), we confirm that hot summers are preceded by a Southern Europe rainfall deficit (P < 0.05), as found by Vautard et al. (2007) and Zampieri et al. (2009).

4.1.5 Interpretation

The results obtained from the rather short (26 years) but observation-based, gridded data product of LE and SH only allow us to draw marginally significant conclusions. The variation of LE or SH that on average precedes a warm summer includes a positive anomaly in LE + SH in March, probably due to an excess of net radiation. This translates predominantly into an excess of LE preceding warm summers over most of EU, but into an LE deficit in the IP from April to July. It is tempting to conclude that the LE excess could have dried soils and that this signal could extend into summer, but our data analysis alone is not sufficient to demonstrate this. The LE behavior analyzed from the MTE data products is consistent with the establishment of an early soil-moisture limited regime in the spring preceding a warm year. During the summer, both SH and LE take above-normal values in NEU during warm years, showing no switch from an energy-limited to a soil-moisture limited regime. These results are consistent with the northward drought propagation mechanism described by Zampieri et al. (2009).

4.1.6 Variability among different MTE data products

The uncertainty and variability in the MTE LE data products are shown by the seasonal cycle of the sensitivity-test data products constructed with the same method (see Sect. 2.1), over both the EU and SEU domains (Fig. 2e, f). Between the lowest LE value, uncorrected for lack of energy balance closure, and the highest LE value, from the LEcor_P_T_Rn_V_U data product (Table 1), a range from 50 Wm−2 to 69 Wm−2 is found during summer, i.e. a difference of 38 %. In spring, the MTE LE difference is even higher, at 43 %. Note that the MTE data product used as a reference in this study (LEcor_P_T) is the lowest estimate among all corrected (LEcor) sensitivity tests, indicating a possible general underestimation.

For the difference between warm and cold years, the variability between the different MTE products (Table 1) of LE and SH must be considered, since this represents some of the uncertainties in the gridded data product. The spring excess of LE preceding a warm summer is present in all reconstructed sensitivity-test data products (P < 0.05), but it is smaller for both LEcor_P_T_Rn_V and LEcor_P_T_Rn_V_U (P < 0.1). During summer there is less agreement among the MTE sensitivity-test products; while the corrected LEcor_P_T and LEcor_Rg show a significant difference (Fig. 2c), no anomalies are found for the residual LE or for the products that include net radiation. Spring LE over SEU shows significant differences between the sensitivity-test data products (P < 0.05) (Fig. 2d). Overall, the results are robust for almost all the different LE data products. The additional use of net radiation as a meteorological variable to estimate LE with MTE usually seems to increase the signal between warm and cold summer years, but this is accompanied by an increase in temporal variability. Figure 8 shows the mean and spread of all data products over spring and the summer months. The spread is overall somewhat larger during the summer months, as was also shown in Fig. 2e. To summarize: a significant March LE anomaly is seen in all MTE data products over EU (P < 0.1) and over SEU (P < 0.05). A significant July anomaly is only seen in 3 out of 6 data products over Europe.

Fig. 8 Spatial pattern of latent heat flux (Wm−2), mean (first row) and spread (second row) of the six different MTE-LE data products, over spring and the summer months

4.2 Seasonal cycle of LE and SH in RCM compared to observation based data-products

The seasonal cycle of LE and SH in the ENSEMBLES RT3 simulations shows a large spread for LE (Fig. 9a). This is especially pronounced in the summer months, with a range of a factor of 2, from 54 Wm−2 to 117 Wm−2. Between the two re-analysis data sets, we find a summertime LE difference of 11 Wm−2. A study by Jiménez et al. (2011) of global LE fluxes simulated by GCMs also found a large spread in the annual cycle of LE, although at approximately 25 Wm−2 it was still lower than the one we diagnose over Europe. Besides the spread of LE across models, almost all RT3 models (except GKSS, UCLM and HC3) simulate a higher average latent heat flux than the MTE data products used as a comparison, which take summer values ranging from 55 Wm−2 (LE uncorrected) to 75 Wm−2 (LEcor_P_T_Rn_V_U corrected). This systematic difference could reflect an underestimation by the MTE LE data, especially when not corrected for energy balance closure, a general overestimation by the RT3 models, or both; only independent measurements could resolve this.

Fig. 9

Annual cycle of latent heat flux (a), sensible heat flux (b), 2-m temperature (c) and precipitation (d) over Europe, from 1961 to 2000 for RT3-models, ERA-40 and E-OBS, and from 1983 to 2008 for MTE and ERA-I. Annual cycles of latent (e) and sensible (f) heat flux from 1983 to 2000. Two specific models are colored, as explained in the main text

In contrast to LE, the summer SH flux is lower in most RT3 models than in the MTE data product (Fig. 9b). This low bias of simulated SH in summer is even more pronounced during other seasons. Here too we obtain a large spread between RT3 model results, mainly due to the high SH value of the outlier model HC3. The same model exhibits a distinctive summer LE behavior, with a decrease starting as early as April, indicating a soil moisture limited regime that begins in spring. In this outlier model, feedbacks probably amplify soil drying, precipitation deficit (Fig. 9d) and temperature increase (Fig. 9c). At the other end of the RT3 range of model results, a response in the opposite direction seems to occur for the SMHI model, which exhibits low summer SH fluxes, high precipitation and low temperatures. This is also consistent with the study of Christensen et al. (2010), who found a positive summer temperature bias for HC3 and a negative one for the SMHI model.

Besides uncertainties (bias) in the MTE data product, the ENSEMBLES model results database covers a different time span. To exclude the possibility that the differences are an effect of the different periods, the seasonal cycle over 1983–2000 (the common period between MTE and ENSEMBLES) is also shown for both fluxes in Fig. 9e, f. Although small differences can be noted in the behavior of individual models, our main finding of an overestimation of LE and a general underestimation of SH by RT3 models as compared to MTE data products remains unchanged.

4.3 Interannual variability

4.3.1 Interannual variance and bias of modeled LE and SH compared to data-products

In the above section, the focus was on the mean seasonal cycle. Next, we focus on the interannual variability (IAV) of both LE and SH. Figure 10a shows the IAV of LE in spring. In most of the RCMs, and in the re-analyses as well, the IAV of the LE flux tracks that of the MTE data products rather well. The IAV of spring SH, on the other hand, shows a lower correlation between model results and the MTE data-product (Fig. 10c). In summer the interannual variance of the models is higher than in spring, and much higher than in the MTE data, which are known to have a low IAV (SDSD in Tables 6, 7). This results in a higher RMSE of the interannual anomalies, since the variance of MTE-LE is higher in spring than in summer. The correlation between summer MTE-LE and RT3 models is on average still relatively good (Fig. 10b), but seems to be lower than earlier in the year (Fig. 10a). Summer sensible heat shows better correlation (Fig. 10d), and here too the variance in the models is, as expected, higher than the variance in the MTE dataset.

Fig. 10

Timeseries of spring LE (a) and SH (c) anomalies, and of summer anomalies (b, d). To create the anomalies, the mean value of the flux over each grid point during 1983–2000 in each model (grey), in the ECMWF reanalysis (red) and in the MTE data-product (black) is subtracted over the whole period, to allow a proper comparison on the overlapping period. The data are standardized by dividing each timeseries by its standard deviation
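
The anomaly construction described in this caption can be illustrated with a minimal sketch for a single area-mean time series. The function name, the sample values and the use of the population standard deviation over the full series are assumptions made for illustration; the original processing was applied per grid point and its exact details are not given here.

```python
from statistics import mean, pstdev

def standardized_anomalies(years, values, ref_start=1983, ref_end=2000):
    """Subtract the reference-period mean, then divide by the standard deviation."""
    ref = [v for y, v in zip(years, values) if ref_start <= y <= ref_end]
    if not ref:
        raise ValueError("no data inside the reference period")
    ref_mean = mean(ref)
    # The reference mean is removed from the whole series, not only the overlap.
    anomalies = [v - ref_mean for v in values]
    sd = pstdev(anomalies)  # assumption: population SD of the full series
    if sd == 0:
        raise ValueError("a constant series cannot be standardized")
    return [a / sd for a in anomalies]

# Hypothetical spring LE values (W m-2), not taken from the study.
years = list(range(1983, 1989))
le = [40.0, 42.0, 38.0, 44.0, 41.0, 41.0]
z = standardized_anomalies(years, le, ref_start=1983, ref_end=1988)
```

After this step each series has unit variance, so the panels compare the timing of anomalies rather than their magnitude; differences in magnitude are what the SDSD term of Tables 6 and 7 captures.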

Table 6 Mean squared deviation (MSD) decomposed into a simulation bias (SB), a variance misfit (SDSD) and a correlation misfit (LCS) for latent heat flux; RT3 models versus MTE-LE IAV from 1983 to 2000
Table 7 As Table 6 but for sensible heat flux; RT3 models versus MTE-SH IAV from 1983 to 2000

The IAV of LE in the RT3 models can be further understood by looking at the mean squared deviation (MSD) (Kobayashi and Salam 2000). The MSD can be decomposed into three components: a simulation bias (SB), a magnitude or variance misfit (SDSD), and a misfit of correlation (LCS). We found that the last two components are relatively small, but that the LE bias of the models during both spring and summer is large, except for one model (METNO), where the SDSD is largest (Table 6). The SB is also the largest component of the total MSD for SH, although in summer the magnitude misfit (SDSD) is larger for most models, and sometimes even larger than the SB (GKSS, METNO, OURANOS) (Table 7).
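
The decomposition can be written out explicitly. The sketch below follows the form given by Kobayashi and Salam (2000): SB is the squared difference of the means, SDSD the squared difference of the standard deviations, and LCS equals 2·SDs·SDm·(1 − r), computed here in the equivalent form 2·(SDs·SDm − covariance). The function name and sample values are hypothetical, and population (divide-by-n) statistics are assumed so that the three terms sum exactly to the MSD.

```python
from math import sqrt

def msd_decomposition(sim, obs):
    """Return MSD and its components SB, SDSD and LCS for paired series."""
    n = len(sim)
    if n == 0 or n != len(obs):
        raise ValueError("series must be non-empty and of equal length")
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    # Population standard deviations and covariance (divide by n).
    sd_s = sqrt(sum((x - mean_s) ** 2 for x in sim) / n)
    sd_o = sqrt(sum((y - mean_o) ** 2 for y in obs) / n)
    cov = sum((x - mean_s) * (y - mean_o) for x, y in zip(sim, obs)) / n
    return {
        "MSD": sum((x - y) ** 2 for x, y in zip(sim, obs)) / n,
        "SB": (mean_s - mean_o) ** 2,        # simulation bias
        "SDSD": (sd_s - sd_o) ** 2,          # magnitude (variance) misfit
        "LCS": 2.0 * (sd_s * sd_o - cov),    # lack of correlation
    }

# Hypothetical summer LE values (W m-2): model versus data product.
model = [60.0, 70.0, 80.0, 90.0]
product = [55.0, 60.0, 70.0, 75.0]
parts = msd_decomposition(model, product)
```

A model that is offset from the data product but tracks its year-to-year variations, as in this hypothetical example, has an MSD dominated by SB.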

4.3.2 Trend analysis

Jung et al. (2010) inferred a positive linear trend of global ET from 1982 to the late 1990s, possibly attributable to global brightening, i.e. an increasing trend of solar radiation (Wild et al. 2005). This global trend stalled and even turned into a decrease from 1998 to 2008. Over Europe we find an increasing LE trend as well, but one sustained over the whole period. This is consistent with solar radiation over Europe continuing to increase after 1998, a phenomenon attributed by Wild et al. (2009) to a decrease in cloudiness in addition to reduced atmospheric aerosol concentrations, and reflected in visibility trends (Vautard et al. 2009). The trend analysis further reveals a decreasing trend of sensible heat flux over the whole year except during summer.

Of the RT3 models, only 8 out of 15 show a positive LE trend (P < 0.1) over Europe (Table 8). In spring, this is reduced to 2 models (ETHZ and ICTP), and in summer 2 models even show a negative LE trend (ICTP and MPI). The lack of trend in simulated LE is possibly due to the fact that the models do not include aerosol forcing on climate, so they cannot simulate the effect of brightening directly, but possibly only indirectly through the advection of air masses from outside the model domain. On the other hand, the MTE-LE data product does not use radiation, so the MTE trend of LE cannot be unambiguously attributed to brightening either. However, of the 8 RT3 models with a significantly positive LE trend, only 3 show a positive trend in temperature as well (Table 8), not paired with radiation trends. Sensible heat shows a negative trend for 4 of the 8 models. This might indicate that soil water remains available for increasing ET in EU, which suggests that an energy limited regime dominates. The European trends of LE are mostly due to trends over NEU, where they are paired with a positive precipitation trend, sometimes a positive temperature trend, and a negative trend of SH. Radiation, however, does not show an increase over NEU. But because Northern Europe is mostly energy limited, and its ecosystems are largely temperature limited, so that higher temperatures most likely cause more plant transpiration, an increase in temperature alone can explain an increase in LE. In summer we also find a positive trend in temperature over Southern Europe in 10 out of 15 models, but again this is not correlated with an increase in downward shortwave radiation.
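
To make the notion of a significant trend (P < 0.1) concrete, the following sketch estimates a linear slope and tests for a monotonic trend. The text does not state which trend test was used, so the Mann-Kendall test with a normal approximation is swapped in here purely as an assumption; tied values are not corrected for, and the function names and sample series are hypothetical.

```python
from math import erfc, sqrt

def ols_slope(x, y):
    """Least-squares slope of y against x (flux units per year)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

def mann_kendall(values):
    """Return (S, Z, two-sided p) using the no-ties normal approximation."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p = erfc(abs(z) / sqrt(2.0))  # two-sided p-value from the normal tail
    return s, z, p

# Hypothetical annual LE series (W m-2) over the 26 years 1983-2008.
years = list(range(1983, 2009))
le = [50.0 + 0.5 * (y - 1983) for y in years]
slope = ols_slope(years, le)
s_stat, z_stat, p_value = mann_kendall(le)
significant = p_value < 0.1
```

With only 26 (MTE) or 40 (ENSEMBLES) annual values, a rank-based test of this kind is less sensitive to single extreme years than a test on the regression slope, which is one reason it is often chosen for short climate series.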

Table 8 Trend analysis of LE, SH and their climate driving variables over EU, SEU and NEU. The MTE data-products of LE and SH are analyzed for trends over the period 1983–2008 and the ENSEMBLES models from 1961 to 2000 for temperature (T), precipitation (P), net surface longwave radiation (LR) and net surface shortwave radiation (SR)

4.4 Warm and cold years in RCM simulations

4.4.1 Difference between warm and cold summer years over Europe

In the previous sections we studied the seasonal cycle, IAV and trends of LE and SH in the RT3 models. Here we investigate how the evolution of LE and SH differs between warm and cold years, and compare the results with the MTE data products. Although all models agree on a summer excess of SH (Fig. 11b), there is no such agreement for LE (Fig. 11a): some models exhibit an excess of LE in summer and others a deficit. This indicates that the ET regime is not simulated robustly and that there is a large uncertainty among models. In August most models show an LE deficit and an SH excess, while neither ERA-40, ERA-Interim nor any of the MTE data-products shows this behavior, suggesting a general tendency in the RCMs towards overly pronounced soil moisture depletion, and thus overly strong soil–atmosphere feedbacks.
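
The warm-minus-cold differences discussed here are composite differences. As a minimal sketch, assume that warm and cold years are the n warmest and n coldest summers ranked by mean summer temperature; this selection rule, the function name and all numbers below are illustrative assumptions, not the selection actually used in this study.

```python
def warm_minus_cold(summer_temp, monthly_flux, n_years):
    """Composite difference of monthly fluxes: warm-year mean minus cold-year mean.

    summer_temp:  {year: mean summer temperature}
    monthly_flux: {year: list of monthly flux values, same length for every year}
    """
    ranked = sorted(summer_temp, key=summer_temp.get)  # coldest first
    if n_years < 1 or 2 * n_years > len(ranked):
        raise ValueError("warm and cold groups would overlap or be empty")
    cold = ranked[:n_years]
    warm = ranked[-n_years:]
    n_months = len(monthly_flux[ranked[0]])

    def composite(group):
        # Mean over the years of the group, month by month.
        return [sum(monthly_flux[y][m] for y in group) / len(group)
                for m in range(n_months)]

    return [w - c for w, c in zip(composite(warm), composite(cold))]

# Hypothetical values: summer temperature (deg C) and LE (W m-2)
# for two months only (March, August), to keep the example short.
temp = {1990: 18.0, 1991: 21.0, 1992: 23.0, 1993: 17.0}
le = {1990: [30.0, 80.0], 1991: [34.0, 78.0],
      1992: [36.0, 70.0], 1993: [28.0, 82.0]}
diff = warm_minus_cold(temp, le, n_years=2)
```

In this made-up case the composite shows an LE excess in March and a deficit in August, the pattern most RT3 models produce; a model with a larger soil moisture reservoir would keep the August difference near zero or positive.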

Fig. 11

Difference between warm and cold years of LE (a), SH (b), temperature (c) and precipitation (d) for RT3 regional climate models. Difference between warm and cold summers of LE for RT2B models from 1961 to 2000 (e)

In spring (March) an LE excess is found during warm summer years in most models, as in the MTE data-product, followed by a small upward tendency in late spring and early summer. Some models have a large enough soil moisture reservoir to sustain LE during warm (and generally dry) summers, but a majority seem to dry out the soil, which results in a deficit of LE at the end of the summer. The magnitude of this process, however, differs from one model to another. The HC3 model, for example, shows the largest summer deficit of LE during a warm summer (Fig. 11a), together with a large excess of SH (Fig. 11b), almost the largest deficit of precipitation (Fig. 11d) and an excess of temperature (Fig. 11c). In contrast, the SMHI model does not show this behavior, and precipitation and temperature do not show large differences between warm and cold years in this model.

4.4.2 Difference between warm and cold summer years over Southern Europe

The strong difference in LE between RCM simulations and MTE or re-analyses is even more pronounced over SEU (Fig. 12a, b) when looking at warm versus cold summers in this region. In the models, the LE deficit in SEU starts in early summer and persists throughout the rest of the season. The recovery occurs only in autumn, whereas in the MTE data-product and re-analyses no such large negative LE (and positive SH) anomalies are found in warm summers. Moreover, in the MTE data-product the LE deficit during a warm summer does not extend over the whole of SEU, but remains confined to the IP region. This confirms that feedback mechanisms in the models cause too much soil drying, too little ET, and too high summertime temperatures. This could be one possible source of the nonlinear bias in summertime temperatures of this ensemble found recently by Boberg and Christensen (2012).

Fig. 12

Difference between warm and cold SEU summer years of LE (a) and SH (b) over Southern Europe

4.4.3 Spatial distribution of the difference between warm and cold summer years

Figure 13 shows the evolution of the spatial distribution of the LE flux difference between warm and cold years, averaged over the 15 RT3 models. In April a small deficit of LE is already present in the south-eastern part of IP, where a Mediterranean climate dominates. During the following months, this deficit expands and spreads until even the LE excess over Northern and Central Europe disappears, suggesting a depletion of soil moisture over most of Europe. Sensible heat shows a similar expansion from IP towards NEU (Fig. 14).

Fig. 13

Spatial pattern of the RT3 ensemble mean latent heat flux, warm minus cold years

Fig. 14

Spatial pattern of the RT3 ensemble mean sensible heat flux, warm minus cold years

Precipitation evolves, as found by Vautard et al. (2007), from a deficit in winter–spring months over SEU to a general deficit in summer, particularly in Central/Northern Europe (Fig. 15). At first sight this might seem surprising, but a plausible cause is the persistent lack of summer precipitation in SEU: where precipitation is always small, there can be little difference between warm and cold years. Less precipitation causes a decrease in ET in a moisture limited regime, which is the regime of Southern Europe before warm summers. In the RCMs, however, the northward propagation of the soil-moisture limited regime and of drought during a warm summer year is probably exaggerated, as argued previously.

Fig. 15

Spatial pattern of the RT3 ensemble mean precipitation, warm minus cold years

4.4.4 Regional climate models driven by modeled climate at boundaries: RT2B models

So far we only analyzed simulations from the RT3 RCMs with prescribed climate from ERA-40 at the boundaries of the EU domain. However, the ENSEMBLES project database also includes model output from RT2B RCMs driven by Global Climate Model (GCM) fields at their boundaries. The advantage of the latter is that the same configuration is used for historical and future periods. A drawback is the uncertainty induced by potential biases in the climate of the GCMs used for boundary conditions, which adds to the uncertainty of the RCMs (Jacob et al. 2007). Figure 11e shows the difference between warm and cold years of GCM-driven RT2B RCMs over the period 1961–2000 (same time span as the previous analysis of ERA-40 driven RT3 RCMs). Even though the results exhibit differences, the overall picture shows similarities between the RT2B and the RT3 model results, with some models retaining moisture, and hence latent heat flux, in summer, while others dry the soil too much and end up with a large LE deficit late in the season. The interesting result here is that the RCMs used in both RT2B and RT3 exhibit the same qualitative behavior for the LE difference between warm and cold summers. Therefore, the different evolution of LE between warm and cold summers seems to be mostly due to regional processes and feedbacks within the EU domain, and not to boundary conditions. This can be seen from the similar spatial pattern of the ET difference of warm minus cold summers between the RT2B and the RT3 models (Fig. 16). In spring, however, there is less agreement between RT2B and RT3 model results for the LE difference between warm and cold summers, which suggests that boundary conditions are more important for the models during this season (Fig. 16). In summer, strong land–atmosphere feedbacks cause non-linear changes in both LE and SH fluxes, which make the internal conditions within the EU domain more important than those at the boundary. In spring, land–atmosphere feedbacks are weaker because net radiation is smaller, and boundary conditions thus play a more important role. Information about model performance over the past, a period that can be compared with observations, is important in order to obtain reliable future predictions (Kjellström et al. 2010; Christensen et al. 2010; Déqué et al. 2011; Lenderink 2010; Coppola et al. 2010).

Fig. 16

Correlation between RT2B (boundaries from global models) and RT3 (boundary from ‘observed’ climate reanalysis) model results for the average difference between warm and cold summer years for latent heat flux (red), sensible heat flux (blue) and temperature (yellow). Mean correlation (over all models) of the mean fluxes and temperature over Europe for each year in the period 1961–2000

5 Concluding remarks

In this study, we have carried out an analysis of the surface LE and SH fluxes over Europe in order to identify precursors of the development of summertime temperature. We studied the evolution of LE and SH flux throughout the year, with special emphasis on the anomalies between years associated with warm and cold summers. We used observation-based gridded data for both LE and SH, derived from interpolated eddy-covariance site-level measurements. Furthermore, we looked at the performance of regional climate models driven at their boundaries by either ERA-40 or GCM fields, in order to provide an estimate of the uncertainties underlying regional climate projections.

We find a clear difference between Northern Europe and Southern Europe in the evolution of both the LE and the SH difference between warm and cold summers. In general, positive springtime (March–April) differences of LE are found over Europe preceding a warm summer. Positive sensible heat flux anomalies also tend to develop over Southern Europe early in the season (April and May) and move northwards during the rest of the summer, in particular during July. These anomalies are also more pronounced in SEU. Our results for LE and SH confirm the findings of earlier studies on the northward propagation of drought (e.g. Vautard et al. 2007; Zampieri et al. 2009), even though LE over NEU remains mainly energy limited during the warm years. This might indicate that extremely warm years are necessary to switch to a moisture limited evapotranspiration regime in NEU. This hypothesis can be tested in a future modeling study.

The RT3 and RT2B model results show that both latent and sensible heat flux, and thus land–atmosphere feedbacks, differ strongly between models. A spread of a factor of two is found in the mean seasonal cycle of LE, for instance, and the spread is even larger for the LE difference between warm and cold years. It is most likely attributable to the representation of soil, land cover and soil–atmosphere exchange parameterizations, which are currently weakly constrained by sparse observations. It may also result in part from spread in radiation (Lenderink et al. 2007). Most models tend to dry too much in early summer, which results in a collapse of LE, turning all incoming energy into SH rather than into a mix of both fluxes. This behavior is not found in the observation-based MTE data products, suggesting that the representation of land surface processes in RCMs can be improved. This overestimation of the LE decrease by the RCMs is coupled with both temperature and precipitation, which show larger differences between warm and cold summers than the observational data do. In SEU there is more convergence between the models. This might be caused by the smaller amount of soil moisture present in this region, so that feedbacks push the system into moisture limitation in all RCMs.

Furthermore, the models show on average better skill in simulating SH than LE in both spring and summer, and spring is better simulated for both fluxes. Stronger land–atmosphere feedbacks in summer may result in a higher simulation bias and standard deviation of the LE misfit. The LE misfit of correlation is lower in summer, confirming that the difficulty lies in simulating the magnitude of the feedback and not the feedback process itself.

In this study we averaged the results from all the ENSEMBLES regional climate models, a method used to reduce the uncertainties of single models. In recent studies (e.g. Christensen et al. 2010; Coppola et al. 2010), however, a weighted average is proposed to favor models that perform better, which is especially interesting when looking at future projections. The analysis of future projections is left for a future study.
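
The difference between the plain ensemble average used here and a performance-weighted one can be sketched as follows. The weighting rule (weights inversely proportional to an error score such as the MSD) is an assumption chosen for illustration and is not the scheme of the cited studies; function names and numbers are hypothetical.

```python
def inverse_error_weights(errors):
    """Normalized weights inversely proportional to each model's error score."""
    if any(e <= 0 for e in errors):
        raise ValueError("error scores must be positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]

def ensemble_mean(values, weights=None):
    """Unweighted mean by default; weighted mean when weights are given."""
    if not values:
        raise ValueError("no model values given")
    if weights is None:
        return sum(values) / len(values)
    if len(weights) != len(values):
        raise ValueError("one weight per model is required")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical summer LE (W m-2) from two models and their error scores.
model_le = [60.0, 90.0]
model_error = [1.0, 3.0]  # lower score = closer to the reference data
plain = ensemble_mean(model_le)
weighted = ensemble_mean(model_le, inverse_error_weights(model_error))
```

The weighted mean moves towards the better-scoring model, which illustrates why the choice of reference data and error metric matters when models are ranked for future projections.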

We conclude that, although 26 years of observation-based latent and sensible heat flux data-products might not be enough to evaluate models for differences between warm and cold summers, these new data-products provide interesting new information. We have to keep in mind, however, that the MTE datasets cannot be considered directly observed energy fluxes, although the existence of different MTE sensitivity tests provides products that can be associated with an uncertainty. Uncertainty still remains, also between different observational datasets (Mueller et al. 2011). We further show that data from model simulations can help to overcome the limited record length, but that there are still uncertainties in the model simulations.

Furthermore, we conclude that the seasonal predictability of summertime drought and heat waves based on LE and SH fluxes as precursor signals remains limited, mainly because of the monthly time step and the small number (26) of years. Further research and more detailed observation-based data-products are necessary to understand the processes that cause the ET regime to switch, as well as the conditions early in the year that favor such a switch. While a European LE excess in March was found to precede warm summers (statistically significantly), this indicator does not yet provide sufficiently robust information to forecast the occurrence of a heat wave. Further research may provide such an early warning signal, so that better precautions can be taken to reduce the negative effects of heat waves on society and ecosystems.