1 Introduction

There are two main techniques in regional modelling: one is based on variable resolution general circulation models (GCMs), in which horizontal resolution is maximum over the region of interest (Gibelin and Déqué 2003; Fox-Rabinovitz et al. 2008). The other technique used in regional climate modelling consists of a limited-area model driven at its lateral boundaries by data generated by a GCM or by reanalyses (see Giorgi and Mearns 1999 for a review). Contrary to the limited-area model technique, the first one allows feedback between the high resolution area and the global scale dynamics. However, due to their reduced spatial domain, limited-area models provide an attractive approach allowing high spatial resolution climate simulations at an affordable computational cost. In addition, a common constraint can be prescribed to different regional models at their lateral boundaries. The latter technique has been extensively used to provide regional climate change projections to the impact community (IPCC 2007).

As for GCMs, uncertainties in projected climate change associated with RCMs arise from several sources:

  • the amplitude of the anthropogenic emissions and the resulting greenhouse gas (GHG) concentrations

  • the formulation and accuracy of the model

  • the choice of parameterisations used to mimic the unresolved scales

  • the chaotic nature of the climate system.

For the RCMs, the lateral forcing by the GCMs or reanalysis introduces an additional source of uncertainty. Even when forced by the same lateral boundary conditions (LBCs), RCMs may simulate different atmospheric circulation patterns within the domain (Giorgi and Bi 2001; Weisse et al. 2000; Christensen et al. 2001; Caya and Biner 2004; Rinke et al. 2004; Vannitsem and Chomé 2005; Alexandru et al. 2007; Lucas-Picher et al. 2008a). This variability is often called the internal variability (IV) of RCMs (von Storch 2005) and can be determined from the spread among the members of an ensemble of simulations driven by identical LBCs with the same RCM (de Elia et al. 2007). The impact of the IV on the long-term average climate is almost negligible, but it has important consequences for the day-to-day variability and may thus be detrimental to process studies (Laprise et al. 2008).

Studies of the internal variability of RCMs have emerged only in the past few years. Their main findings show that the IV depends on several factors: the geographical location of the domain, the size of the domain, the season and the synoptic conditions within the limited-area domain. Concerning the location of the domain, RCMs over high-latitude regions such as the Arctic present a stronger IV than those over mid-latitude areas such as North America and Europe, due to a weaker inflow through the boundaries (Rinke et al. 2004). While a lot of research has been carried out on the IV for RCMs centred on North America, there are few studies focusing on the European sector (Christensen et al. 2001). As far as the domain size is concerned, results show that the IV increases with larger domains (Christensen et al. 2001; Vannitsem and Chomé 2005; Lucas-Picher et al. 2004; Lucas-Picher et al. 2008a). The IV also presents distinct seasonal behaviours: many studies have reported that the IV is stronger in summer, due to stronger local processes (such as convection) combined with weaker control exerted by the LBCs (Caya and Biner 2004; van Ulden et al. 2007). There are not yet any studies reporting the influence of synoptic conditions on the IV.

One technique that aims at reducing the IV in RCMs is the spectral nudging approach (von Storch et al. 2000). It consists of prescribing the large scales to the RCM inside the entire domain, not just at the lateral boundaries. The model is expected to develop realistic detailed regional features consistent with the prescribed large scales. The spectral nudging approach has been successfully applied to RCMs over North American and European domains (Miguez-Macho et al. 2004; Radu et al. 2008). Although the IV is expected to decrease with the spectral nudging technique, there is still no agreement on whether it is appropriate to use it systematically in RCMs. As pointed out by Alexandru et al. (2007), it is still controversial whether the IV generated by the RCM should be considered necessary or should be avoided. The presence of spread in an ensemble of RCM experiments forced by the same LBCs has major consequences for process studies or forecasting, since the day-to-day variability can change from one member to another. It has been shown (de Elia et al. 2007) that the IV is reduced when fields are averaged in time to define climate values. Nevertheless, the effects of the IV remain non-negligible in simulations averaged over long periods, indicating that the presence of IV should be taken into account in climate applications.

The main goal of this work is to explore one of the major concerns within the regional climate modeling community: to what extent are RCMs able to reproduce the large-scale atmospheric circulation of the driving model? This is important because the RCMs can generate finer-scale features that are absent in the driving field, and these smaller scales can in turn affect the large-scale flow supplied by the LBCs. As pointed out by Laprise et al. (2008), the aim of the RCM simulations is to feed statistical downscaling algorithms, and they are supposed to maintain the large-scale circulation used to drive the model. However, in their study they show that the large-scale features can be modified in the nested model. Moreover, the errors contained in the LBCs may be corrected or magnified by the regional model.

One way to characterize the large-scale circulation is the weather regime approach. Weather regimes (Vautard 1990; Cheng and Wallace 1993) can be defined as the preferential states of the large-scale atmospheric circulation. They are recurrent and spatially well-defined patterns that are usually obtained by cluster analysis (Michelangeli et al. 1995; Cassou et al. 2004). In this study we assess the ability of the RCMs to reproduce the weather regime features of the driving field by using a multi-model approach. The database comes from the EU-FP6 ENSEMBLES project (Hewitt and Griggs 2004) and consists of an ensemble of experiments performed by 13 different RCMs on European domains (Fig. 1). All RCMs have been driven at their lateral boundaries by the ERA40 reanalysis for the period 1961–2000. We evaluate the ability of the regional models to simulate the North Atlantic weather regimes obtained from the geopotential height field at 500 hPa (Z500) of the ERA40 reanalysis. Although the weather regimes extend over the entire North Atlantic basin, the ENSEMBLES domain is large enough to allow a good representation of the large-scale atmospheric circulation.

Fig. 1

Spatial domains for each RCM within the ensemble (dot dashed lines). The minimum common area used for the weather regime analyses is indicated by the solid line

The inter-model spread in simulating the weather regimes is assessed to interpret our results. Because the ensemble has been generated with different RCMs, we cannot formally use the term “internal variability” of an RCM. The spread of a set of simulations generated by different models with the same LBCs differs from that of an ensemble built with a single RCM. Since the IV is present in both cases, we can expect the inter-model spread to be larger than the single-model spread, due to the different formulations of each model. In the present work we therefore use the term “inter-model spread” (IMS) rather than IV. We show that the IMS of the ensemble behaves similarly to the IV studied in previous works (Alexandru et al. 2007; Lucas-Picher et al. 2008a), which used ensembles generated by a single RCM.

In addition, our RCM ensemble allows us to explore the impact of the spectral nudging technique on the simulation of weather regimes, because one of the RCMs within the ensemble has been nudged to follow the large-scale part of ERA40 reanalysis. Moreover, we are also able to investigate the impact of horizontal resolution on the reproducibility of weather regimes because the simulations have been produced with both a 50 and 25 km grid-mesh.

The outline of the paper is as follows. In Sect. 2 we give a brief description of the dataset and methodology. The results concerning the ability of the RCMs to reproduce the ERA40 weather regimes are provided in three blocks: the mean behaviour, the inter-annual variability and the day-to-day chronology. These are presented in Sect. 3, together with the study of the impact of spectral nudging and horizontal resolution. The IMS of the ensemble is investigated in Sect. 4. Finally, we close the paper with a discussion section.

2 Data and methodology

A summary of the main characteristics of the RCMs used in this work is presented in Table 1. More details about each individual model can be found on the ENSEMBLES project website: http://ensemblesrt3.dmi.dk. To produce this multi-model ensemble, all RCM experiments have been performed for the time period 1961–2000 using six-hourly lateral boundary conditions provided by the ERA40 reanalysis (Uppala et al. 2004) at 1.125° horizontal resolution. The sea surface temperature (SST) and sea-ice concentration are also taken from the ERA40 dataset. All models are required to cover the ENSEMBLES minimum area shown in Fig. 1. The RCMs used their own model setup and grid specifications, such as rotation and number of vertical levels, but a similar horizontal resolution. The ENSEMBLES project has produced two sets of RCM experiments, with horizontal resolutions of 50 and 25 km, over the same geographical area. In a first stage we investigate the ability of the RCMs to reproduce the ERA40 weather regimes in the 50 km grid-mesh experiments. In a second stage, we study the impact of increasing the horizontal resolution. The impact of the spectral nudging technique is investigated by using the GKSS model, whose setup is comparable to that of the ETHZ model except for the spectral nudging applied to the wind field above 850 hPa (von Storch et al. 2000).

Table 1 Summary of the main features of the regional climate models participating in the FP6 ENSEMBLES project

The atmospheric variable used here to characterize the “observed” weather regimes is the Z500 from the ERA40 reanalysis, which is also the dataset used to drive the RCMs. The daily values have been computed from the six-hourly data for the time period 1961–2000. In a first step, the spatial domain used to compute the weather regimes is the North Atlantic basin: from 90°W to 30°E, and from 20 to 80°N. A Principal Component Analysis is performed on the Z500 anomalies to reduce the number of degrees of freedom before applying the k-means cluster algorithm (see Michelangeli et al. 1995 for a detailed description). This step is necessary because the clustering process has a high computational cost and is weakly robust when the number of degrees of freedom is of the same order as the sample size. We keep the first 15 principal components, which explain about 90% of the total variance. With the k-means algorithm the number of groups is a priori unknown, and a Monte Carlo test is needed to determine the optimal number k of clusters. Using this approach, we obtain k = 4 patterns corresponding to the well-known North Atlantic weather regimes (Vautard 1990; Sanchez-Gomez and Terray 2005). Note that the ERA40 clusters have been determined for the entire North Atlantic basin rather than for the RCM domain. A larger domain provides statistically more robust results, since the Monte Carlo test found no optimal number of clusters for the RCM spatial domain. In a second step we adapted the ERA40 weather regimes to the RCM common area (Fig. 1). This is done by projecting the ERA40 data onto the weather regime centroids, both previously restricted to the minimum common area. Each daily map is attributed to the centroid that minimizes a distance criterion; here we have used the Euclidean distance between the cluster centroid and the daily Z500 map.
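The attribution step described above can be illustrated with a minimal sketch. It assumes that each daily map and each centroid has already been reduced to a short vector (for example its leading principal components); the function names and the two-component toy vectors are hypothetical and are not taken from the study.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_regimes(daily_maps, centroids):
    """For each daily map, return the index of the nearest centroid."""
    labels = []
    for day in daily_maps:
        distances = [euclidean_distance(day, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical toy data: four centroids and four daily maps,
# each described by only two components for readability.
centroids = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
daily_maps = [[0.9, 0.1], [-0.8, -0.2], [0.1, 1.2], [0.2, -0.7]]
labels = assign_regimes(daily_maps, centroids)
print(labels)
```

In this toy case each map lies closest to a different centroid, so the four days are classified into four different regimes.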

Weather regimes are represented as composites of Z500 anomalies, obtained by averaging over all the days belonging to the same weather regime (Fig. 2). The Blocking regime (BL) displays a strong blocking cell over Scandinavia. The Zonal regime (ZO), also considered as the positive phase of the North Atlantic Oscillation (NAO) (Hurrell et al. 2001), is characterized by an enhanced zonal flow crossing the North Atlantic basin. The Atlantic Ridge (AR) regime presents a positive anomaly over the North Atlantic basin. Finally, the Greenland Anticyclone (GA) regime exhibits a strong positive anomaly centred over the west of Greenland. This regime has frequently been identified as the negative phase of the NAO.

Fig. 2

Composites of the North Atlantic weather regimes in winter for ERA40 (top) and CNRM model (bottom). The isolines are the Z500 anomaly composite (solid lines are positive and dot dashed are negative values). Contour interval is 30 gpm

To obtain the weather regimes in the RCM simulations, we do not apply the k-means algorithm separately to each model's data: if the clustering of the models had yielded different values of k, no comparison (the goal of this work) would have been possible. We therefore implicitly assume that the number and structure of the weather regimes do not vary between the models and the reanalysis. A straightforward way to proceed is then to project the daily Z500 anomalies from the RCMs onto the cluster centroids recalculated for the RCM domain, as explained above and detailed in Sanchez-Gomez et al. (2008). Before the projection, the model data, once interpolated to the ERA40 grid, are represented in the space spanned by the first 15 principal components from ERA40.

3 Weather regimes analysis

3.1 Evaluation of the RCMs mean behavior

The study has been carried out for the winter (December–March) and summer (June–September) seasons. Figure 2 shows the winter weather regime composites over the RCM domain, represented by the Z500 anomalies for ERA40 and the CNRM model. The spatial structure of the weather regimes is very well reproduced in this RCM experiment. This is confirmed by the values of the spatial correlation between the ERA40 and the model composites in Fig. 2: we obtain 0.97 for the Zonal regime, 0.99 for Blocking, 0.95 for Atlantic Ridge and 0.99 for Greenland Anticyclone. The same behaviour is found for the remaining RCMs, and the values of the spatial correlation are always higher than 0.95. For summer, the weather regime composites are very similar to those illustrated in Fig. 2 for both the ERA40 data and the RCMs, although the Z500 anomalies are weaker than in winter.

The mean values of the frequency of occurrence (Fig. 3) are also well captured by the RCMs. We have assessed the statistical significance of the frequency of occurrence with a Monte Carlo technique. A daily weather regime sequence taking the values 1, 2, 3, 4 can be created from the original classification. We then generated random series, from which we built the probability density function of the frequency of occurrence of each weather regime and determined the confidence levels. Note that two models slightly overestimate the mean frequency of occurrence of the AR weather regime in winter. These models, METOHC and UCLM, are the RCMs with the largest spatial domains, which penalizes their ability to reproduce the weather regimes of the driving field. For larger domains, the control exerted by the large-scale flow of the driving field over the minimum common area is weaker than in a model with a smaller domain.

Fig. 3

Mean frequency of occurrence of each weather regime computed as the average over all winters (a) and summers (b) within the time period 1961–2000. Slim bars correspond to the RCM weather regimes and the block bar corresponds to the ERA40 reanalysis. The 95% confidence limits, obtained by building surrogate weather regime classifications, are indicated by the dotted lines

The capacity of the RCMs to simulate the mean persistence of the weather regime episodes is illustrated in Fig. 4 for the winter period. The persistence values are in good agreement with the reference field, although for the AR and GA regimes the two models with the largest spatial area (10 and 13 in the figure) present the strongest bias. Model 3 (CNRM) exhibits an important bias in the mean persistence of the BL regime. With respect to the winter period, the persistence of the ERA40 weather regimes in summer (not shown) decreases to 6.2 days for the BL and ZO regimes, stays at 5.2 days for the AR regime and decreases to 5.8 days for the GA regime. For summertime, the RCM bias with respect to ERA40 is almost negligible. In general, the mean behaviour of the weather regimes of the driving field (composites, mean frequency of occurrence and mean persistence) is quite well reproduced in the RCMs. This also indicates a good performance of our projection procedure. The next question is whether the year-to-year and day-to-day evolution of the ERA40 weather regimes is well represented in the RCMs.
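The two mean statistics discussed above, frequency of occurrence and persistence, can both be derived from a daily sequence of regime labels. The sketch below is only illustrative: the function names and the toy sequence are hypothetical, persistence is taken as the mean length of consecutive runs of a regime, and no minimum episode duration or season boundary is applied, which may differ from the processing used in the study.

```python
from itertools import groupby

def frequency_of_occurrence(labels, regime):
    """Percentage of days classified in the given regime."""
    return 100.0 * sum(1 for r in labels if r == regime) / len(labels)

def mean_persistence(labels, regime):
    """Mean length (in days) of consecutive runs of the given regime."""
    # groupby splits the sequence into runs of identical consecutive labels.
    lengths = [len(list(run)) for key, run in groupby(labels) if key == regime]
    return sum(lengths) / len(lengths) if lengths else 0.0

# Hypothetical ten-day sequence of regime labels.
sequence = ["BL", "BL", "BL", "ZO", "ZO", "AR", "BL", "GA", "GA", "GA"]
for regime in ("BL", "ZO", "AR", "GA"):
    print(regime,
          frequency_of_occurrence(sequence, regime),
          mean_persistence(sequence, regime))
```

In the toy sequence, BL occurs on 4 of 10 days in two separate runs of 3 days and 1 day, giving a frequency of 40% and a mean persistence of 2 days.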

Fig. 4

Mean persistence values (in days) of the four weather regimes in the winter period for ERA40 (big dot) and the RCMs (stars). The numbers on the x-axis are: ERA40 (1), CHMI (2), CNRM (3), DMI (4), ETHZ (5), GKSS (6), ICTP (7), KNMI (8), METNO (9), METOHC (10), MPI (11), SMHI (12), UCLM (13), OURANOS (14)

3.2 Temporal chronology of weather regimes

We have obtained annual time series by computing the frequency of occurrence of the weather regimes for each winter (or summer) within the whole time period. Figure 5 shows the temporal evolution of the BL and ZO weather regimes. It is clear from this figure that the RCMs correctly reproduce the temporal chronology of the seasonal frequency of occurrence of the ERA40 weather regimes. Nevertheless, there is a non-negligible spread among the models. The spread has been calculated as the standard deviation among the members of the ensemble, excluding the GKSS model. This spread is larger in the summer season, which can be explained by the conclusions of previous works for a North American domain (Caya and Biner 2004; Lucas-Picher et al. 2008b): the large-scale flow is weaker in summer and the LBCs exert less control on the RCM solution.

Fig. 5

Annual time series of the frequency of occurrence of the Zonal (ZO) and Blocking (BL) weather regimes for winter (top) and summer (bottom) periods. Red lines represent the ERA40 values. The blue line is the mean frequency averaged over the ensemble of RCMs. The shading indicates the spread among the RCMs, calculated as the standard deviation excluding the GKSS model

To assess the performance of the RCMs in reproducing the inter-annual variability of the ERA40 weather regimes, we have built Taylor diagrams for the frequency of occurrence time series (Taylor 2001). In a Taylor diagram, the pertinent statistics to quantify the similarity between the model and the reference field are their correlation, their root mean squared (RMS) difference and their standard deviations. We have normalized both the RMS difference and the standard deviation of the model by the standard deviation of the reference field (ERA40). In this case the reference point (REF) is plotted on the x-axis at unit distance from the origin. Figure 6 shows the Taylor diagrams for the AR regime in winter and summer, which is the weather regime showing the worst correspondence with ERA40 in terms of inter-annual chronology. In the Taylor diagram, the nearer a model is to the REF point, the better its performance. The spread among the RCMs in reproducing the annual frequency of occurrence of the weather regimes is also evident in the Taylor diagrams. For the AR case in winter, the correlation values between the RCMs and ERA40 are high and similar, around 0.99, the RMS differences with respect to the reference field (circles centred at the REF point) are equivalent, and there are as many models on the left side (variability underestimation) as on the right side (variability overestimation) of the REF point. However, two models perform less well: they present a lower correlation (0.95), an overestimation of the ERA40 variance (more than 1.50) and a larger RMS difference. These are the two models with the largest spatial domains. As pointed out in Sect. 3.1, considering a common spatial domain for all RCMs in the ensemble penalizes their ability to simulate the weather regimes. For the other weather regimes, the Taylor diagrams in winter (not shown) indicate a very good agreement between the models and ERA40. The AR situations in summer show a stronger spread than in winter. Model performance in reproducing the large-scale atmospheric circulation of ERA40 is generally weaker in summer. The correlation between the RCMs and ERA40 decreases in summer, except for the GKSS model which, as expected from the spectral nudging technique, maintains almost the same ability to simulate the large scales of ERA40 in both seasons.
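The three statistics summarized in a Taylor diagram can be computed as in the following sketch. It assumes the centred RMS difference of Taylor (2001), population (divide-by-n) standard deviations and normalization by the reference standard deviation; the function name and the two short time series are hypothetical and serve only as an illustration.

```python
import math

def taylor_statistics(model, reference):
    """Correlation, normalized standard deviation and normalized centred
    RMS difference of a model series with respect to a reference series."""
    n = len(reference)
    mean_m = sum(model) / n
    mean_r = sum(reference) / n
    anom_m = [x - mean_m for x in model]
    anom_r = [x - mean_r for x in reference]
    std_m = math.sqrt(sum(a * a for a in anom_m) / n)
    std_r = math.sqrt(sum(a * a for a in anom_r) / n)
    corr = sum(a * b for a, b in zip(anom_m, anom_r)) / (n * std_m * std_r)
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(anom_m, anom_r)) / n)
    return {
        "correlation": corr,
        "normalized_std": std_m / std_r,   # REF sits at 1 on the x-axis
        "normalized_rms": crmsd / std_r,   # distance to the REF point
    }

# Hypothetical seasonal frequencies of occurrence (in %).
era40 = [20.0, 25.0, 30.0, 22.0, 28.0]
rcm = [21.0, 24.0, 33.0, 20.0, 27.0]
stats = taylor_statistics(rcm, era40)
print(stats)
```

With this normalization the three quantities are linked by the law of cosines, normalized_rms² = 1 + normalized_std² − 2 · normalized_std · correlation, which is what allows them to be displayed on a single two-dimensional diagram.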

Fig. 6

Taylor diagrams for the Atlantic Ridge (AR) regime in winter and summer. The data used to build the Taylor diagrams are the frequencies of occurrence of the RCM weather regimes, and the reference field is the ERA40 reanalysis. For more details about Taylor diagrams see Taylor (2001)

We now focus on the model behaviour at the daily time scale. To validate the day-to-day correspondence between the models and ERA40, we count for each season the number of days on which an RCM does not simulate the same weather regime as ERA40, and then normalize by the total number of days in each regime. Figure 7 shows the relative percentages of “wrong” days for each RCM and each weather regime, together with the multi-model ensemble excluding the GKSS model. The percentage of “wrong” days is not only due to the internal variability of the RCMs. Other error sources specific to each model, such as the procedure for applying the lateral boundary conditions, the dynamics and the physical parameterisations, also contribute to the inter-model spread.
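The counting procedure described above can be sketched as follows. The function name and the two ten-day label sequences are hypothetical; the per-regime percentages are normalized by the number of ERA40 days in each regime, and a total percentage normalized by the total number of days is also returned.

```python
def wrong_day_percentages(reference, model, regimes):
    """Percentage of days on which the model regime differs from the
    reference regime, per reference regime and in total."""
    result = {}
    for regime in regimes:
        # Days classified in this regime by the reference (driving) field.
        days = [i for i, r in enumerate(reference) if r == regime]
        wrong = sum(1 for i in days if model[i] != regime)
        result[regime] = 100.0 * wrong / len(days) if days else float("nan")
    total_wrong = sum(1 for r, m in zip(reference, model) if r != m)
    result["total"] = 100.0 * total_wrong / len(reference)
    return result

# Hypothetical daily regime labels for the driving field and one RCM.
era40_regimes = ["BL", "BL", "ZO", "ZO", "ZO", "AR", "GA", "GA", "GA", "GA"]
rcm_regimes = ["BL", "ZO", "ZO", "ZO", "ZO", "AR", "GA", "AR", "GA", "GA"]
percentages = wrong_day_percentages(
    era40_regimes, rcm_regimes, ["BL", "ZO", "AR", "GA"]
)
print(percentages)
```

In the toy case the RCM misses one of the two BL days and one of the four GA days, so the relative error is 50% for BL, 25% for GA and 20% over all days.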

Fig. 7

Percentage of “wrong” days in which the RCMs are not in the same weather regime as ERA40 (the driving field) for winter (a) and summer (b) periods. ALL represents the ensemble mean, excluding the GKSS model

In winter, the CNRM model exhibits the poorest performance in reproducing the ERA40 weather regimes, whereas the ETHZ model is the best of the models without spectral nudging. Note that the GKSS model presents very small errors compared with the other models, confirming the efficiency of the spectral nudging approach. In summer, the percentage of “wrong” days increases for all RCMs, even for the GKSS model. This is consistent with the idea that in summer the control exerted by the LBCs is weaker and the internal variability of the RCMs is stronger than in the winter months.

Regarding the multi-model mean error for winter (ALL in Fig. 7), the GA situations present the highest relative percentage of “wrong” days (14% of the GA days), followed by the AR regime (12.4%), the BL regime (11%) and the ZO regime (8%). In summer, the relative error for each weather regime increases to 18% for GA, 20% for AR, 15% for BL and 13% for ZO. Over all regimes, the average percentage of “wrong” days is 10.2% in winter and 17% in summer. This percentage is obtained by dividing the total number of “wrong” days by the total number of days.

The information contained in Fig. 7 does not contradict Figs. 3 and 4, since the latter represent the behaviour of the RCMs irrespective of the chronology, whereas Fig. 7 concentrates on the day-to-day correspondence between the models and ERA40.

Taking into account the number of “wrong” days for each season, we built Table 2, which lists the five winter and summer seasons in which the weather regimes are best and worst reproduced by the ensemble of RCMs. We believe it may be useful for the regional modelling community to perform case studies, especially to investigate whether specific atmospheric circulation situations support or hinder the ability of the RCMs to reproduce the large scales of ERA40.

Table 2 Five best and worst winters and summers simulated by the ensemble of RCMs

The model errors may depend on the large-scale atmospheric conditions of the driving field, which condition the information flux through the boundaries. If the large-scale flow is strong and persistent, the control exerted by ERA40 is stronger and the models are more constrained by the LBCs. However, if the atmospheric flow is more variable and less persistent, the RCMs are more likely to make errors in reproducing the large-scale conditions introduced at their boundaries. For a given winter or summer, a broad way of estimating the persistence of the flow is to determine the number of days of weather regime transitions per season. The more transition days in a given season, the less persistent and more unstable the large-scale atmospheric flow is likely to be. To investigate the link between the number of transition days and the percentage of “wrong” days within a season, we have built Fig. 8. We determine the transition days as described in Sanchez-Gomez and Terray (2005) for the four North Atlantic weather regimes. Briefly, we define a weather regime episode as one in which the weather regime lasts at least 3 days. The correlation between the percentage of transition days and the percentage of “wrong” days is 0.43 in winter and 0.33 in summer. These values, though weak, are significant at the 95% confidence level according to a classical t test on the correlation. It seems that a relationship exists, more clearly in winter than in summer, between the number of weather regime transitions and the performance of the models in reproducing the large scales of the driving model.
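As a rough check of the significance statement above, the classical t statistic for a correlation coefficient can be evaluated directly. The sketch assumes that the 40 seasons are independent samples (n = 40, hence 38 degrees of freedom) and uses the tabulated two-sided 5% critical value of Student's t for 38 degrees of freedom, about 2.024; the function name and both assumptions are introduced here for illustration and are not stated in the text.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing a zero correlation with n independent pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

N_SEASONS = 40      # one value per season over 1961-2000 (assumed independent)
T_CRITICAL = 2.024  # tabulated two-sided 5% value for 38 degrees of freedom

for season, r in (("winter", 0.43), ("summer", 0.33)):
    t = correlation_t_statistic(r, N_SEASONS)
    print(season, round(t, 2), "significant" if t > T_CRITICAL else "not significant")
```

Under these assumptions the statistic is about 2.94 for winter and 2.15 for summer, both above the critical value, which is consistent with the significance reported for the two correlations.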

Fig. 8

Scatter plots showing the percentage of “wrong” days per winter (left) and summer (right) within 40 years (40 dots) versus the number of days of weather regimes transitions (in percentage)

3.3 Impact of spectral nudging

The results confirm that when the spectral nudging technique is applied to an RCM, as in the case of the GKSS model, the model solution is closer to the large-scale features of the driving field. In terms of the ability to reproduce the ERA40 weather regimes, the GKSS model performs best. Nevertheless, as shown in Figs. 6 and 7, a small, almost negligible percentage of error remains after the spectral nudging (2.5% for winter and 3.8% for summer). This may suggest that the model does not completely lose its capability of generating some internal variability.

3.4 Effects of the horizontal resolution

In order to investigate the impact of increasing the horizontal resolution in the RCMs, we have performed the same weather regimes analyses for the 25 km grid-mesh experiments for winter and summer times. The domain size is the same for both 25 and 50 km grid-mesh. The results show that the ability of RCMs to reproduce the weather regimes characteristics of ERA40 remains the same for some models or slightly degrades for other models. Figure 9 shows the annual time series of the frequency of occurrence of the GA regime for the CNRM model for both horizontal resolutions. The temporal correlation between ERA40 and the model remains the same in winter (0.95) and slightly decreases from 0.87 for 50 km to 0.85 for 25 km in summer. This may suggest that the RCMs ability to reproduce the large-scale circulation of the driving field rather depends on the domain size in km than on the number of grid points within the domain.

Fig. 9

Annual time series of the frequency of occurrence of the Greenland Anticyclone (GA) weather regime for the CNRM model for winter and summer periods. Solid lines represent the ERA40 values, dotted lines the 50 km grid-mesh experiment and dash-dot lines the 25 km grid-mesh simulations

4 Link between the weather regimes and the inter-model spread

In this section we investigate the spread among the RCMs when simulating the weather regimes of the driving field. The inter-model spread (IMS) depends on: (1) The characteristics of the atmospheric circulation and thus the information flow through the boundaries of the RCMs as well as the synoptic situations within the RCM domain; (2) the internal variability associated with the RCMs; and (3) the fact that in the ensemble, we are dealing with different models. In the present work we focus on factors 1 and 2, since factor 3 would require a deeper knowledge of each individual model.

To begin, we investigate the link between the spread and the error of the RCMs at the inter-annual time scale. The correlation between the annual time series of the model error, taken as the number of “wrong” days per winter or summer, and the model spread, computed as the standard deviation among the members, is 0.69 for winter and 0.63 for summer. This indicates that, in general, the spread is larger in years in which the ability of the models to simulate the weather regimes is weaker. There are, however, some years with large model errors and almost no spread, suggesting that there are certain large-scale atmospheric situations in which most of the models tend to err in the same way.

We have estimated the IMS of the ensemble of RCMs for the 50 km grid-mesh experiment in the same way as previous works estimated the internal variability of a single RCM (Alexandru et al. 2007; Lucas-Picher et al. 2008a). We compute the standard deviation among the 12 members (excluding the GKSS model) for each grid point and each day, and then determine the mean spatial distribution of the inter-model spread as the time average over all winter and summer days within the time period. Because the atmospheric variability is stronger in winter than in summer, we normalize the inter-model spread by an estimate of the transient-eddy variability (Caya and Biner 2004; Lucas-Picher et al. 2008a). The transient-eddy variability is, on average, the natural variability of a GCM and is mainly due to the activity of weather systems along the storm track. In an ensemble of RCMs driven by the same LBCs, the inter-model spread is expected to be smaller than the natural variability (Laprise et al. 2008). In studies of the IV with one RCM, it is assumed that the members of the ensemble are unbiased, uncorrelated and share the same variance. In these cases, the transient-eddy variance is estimated as the temporal variance of one of the members of the ensemble. In our case, since the ensemble is generated with different RCMs, we cannot make these assumptions. Therefore, since all models have been forced at their boundaries by the same field, we estimate the transient-eddy variance as the mean value of the 12 temporal variances of the ensemble.

The ratio between the inter-model spread and the transient-eddy variance, that is the relative inter-model spread (RIMS hereinafter), is represented in Fig. 10. A ratio close to one means that the inter-model spread in the ensemble is nearly equal to the natural variability in a GCM. The spatial distribution of the RIMS is similar for both seasons, although the values are larger in summer. On the west side of the domain the RIMS magnitude is weaker, indicating that the RCMs are dominated by the ERA40 forcing, in terms of the westerly flow entering from the western boundary. As we move eastward, the control exerted by the LBCs decreases and the chaotic variability becomes stronger. At the eastern boundary the inter-model spread is larger than on the western boundary, showing that the North Atlantic zonal flow is stronger than the continental influence coming from central Europe. The RIMS spatial pattern presents the largest values over the Balkan Peninsula, reaching more than 0.5 in summer. The maximum amplitude of RIMS over the Balkan Peninsula in summer is likely connected to the “summer-drought” (Jacob et al. 2007). This feature was a systematic problem observed in the RCMs participating in the EU PRUDENCE project, characterized by important warm and dry bias over the Balkan Peninsula in summer.
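The normalization described above can be sketched for a small ensemble as follows. This is only one possible reading of the procedure: the spread is taken as the time mean of the daily standard deviation among members, and it is divided by the square root of the member-mean temporal variance so that the ratio is dimensionless. The function names, the population (divide-by-n) estimators and the toy data are assumptions made for illustration.

```python
import math

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def relative_inter_model_spread(z):
    """z[member][day][point] -> list with one relative spread per grid point."""
    n_members, n_days, n_points = len(z), len(z[0]), len(z[0][0])
    rims = []
    for p in range(n_points):
        # Time mean of the daily standard deviation among the members.
        spread = sum(
            math.sqrt(variance([z[m][d][p] for m in range(n_members)]))
            for d in range(n_days)
        ) / n_days
        # Transient-eddy variance: mean of the members' temporal variances.
        tev = sum(
            variance([z[m][d][p] for d in range(n_days)])
            for m in range(n_members)
        ) / n_members
        rims.append(spread / math.sqrt(tev))
    return rims

# Hypothetical ensemble: 2 members, 3 days, 2 grid points.
# Point 0: members differ by a constant offset; point 1: members are identical.
ensemble = [
    [[0.0, 1.0], [2.0, 2.0], [4.0, 3.0]],
    [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0]],
]
rims = relative_inter_model_spread(ensemble)
print(rims)
```

At the second grid point the two members are identical, so the relative spread vanishes there, while the constant offset at the first grid point yields a non-zero ratio.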

Fig. 10

Relative inter-model spread (RIMS) defined as the ratio between the inter-model spread and the transient-eddy variance of the ensemble of the 13 RCMs for winter (left) and summer (right)

The seasonal variability of the RIMS displayed in Fig. 10 supports the idea that the models’ ability to reproduce the large-scale conditions of ERA40 is weaker in summer (Fig. 7), since the model errors in reproducing the day-to-day weather regimes are higher in summer. This points to a link between the inter-model spread of RCMs and the error in simulating the atmospheric conditions of the driving field.

The spatial distribution of the RIMS for the European domain is quite similar to the relative internal variability pattern of the sea level pressure field obtained by Lucas-Picher et al. (2008a) using an ensemble generated by only one RCM in the North American domain. The RIMS is negligible in the west of the domain, where the driving field forcing is strongest, then increases to the east. In their case, the maximum amplitude of the relative internal variability reaches 0.8 and is located over Newfoundland. The lower values of the RIMS found here might be due to the fact that we have used a non-surface variable such as Z500 and/or a smaller domain size.

The inter-model spread in the ensemble depends on the synoptic situations within the domain. In the present work we want to investigate whether the spatial distribution of the spread of the RCM ensemble depends on the weather regimes. For this purpose, we have computed the inter-model spread composites for each weather regime by averaging in time the spread over the days belonging to the same weather regime. To compute the average we consider the days with a certain circulation regime in ERA40, then we select these days in the RCM data for averaging the spread. We have normalized the inter-model spread composites by the transient variability associated with each weather regime. Figure 11 shows the RIMS composites for the winter and summer periods. The spatial distribution of the RIMS is very similar for the four weather regimes, with the largest values in the centre-east of the domain. Nevertheless, there are some differences: in particular, for the BL regime the RIMS maximum extends further towards the north, whereas for the ZO, AR and GA regimes the RIMS is confined to the Balkan Peninsula. Figure 11 indicates that the spatial pattern and the maximum value of the RIMS depend on the weather regime and thus on the associated synoptic situations. If we compute the spatial average of the RIMS over all the grid points for each weather regime we obtain 0.31 for BL, 0.27 for ZO and AR and 0.28 for GA in winter; and approximately 0.39 for BL, ZO and AR and 0.35 for GA in summer. During winter, these values may also indicate that the inter-model spread and the model error are somewhat related, since the BL episodes present the highest values for the error (Sect. 3) and the RIMS.
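The regime composites described above can be sketched in the same spirit. This is an illustration under stated assumptions rather than the procedure actually coded for this study: the function name `rims_by_regime`, the array layout `(member, day, lat, lon)`, the way the regime-dependent transient variability is estimated (member-mean temporal variance over the days of the regime, square-rooted), and the unweighted spatial mean are all hypothetical choices.

```python
import numpy as np

def rims_by_regime(z500, era40_regime, regimes=("BL", "ZO", "AR", "GA")):
    """Composite the normalized inter-model spread by ERA40 weather regime.

    z500: array (n_members, n_days, n_lat, n_lon) (hypothetical layout).
    era40_regime: sequence of n_days regime labels diagnosed from ERA40.
    """
    labels = np.asarray(era40_regime)
    daily_spread = z500.std(axis=0, ddof=1)      # (n_days, n_lat, n_lon)
    composites = {}
    for name in regimes:
        days = labels == name                    # days in this ERA40 regime
        if days.sum() < 2:                       # too few days for a variance
            continue
        spread = daily_spread[days].mean(axis=0)
        # Assumption: transient variability of the regime estimated as the
        # member-mean temporal variance over the days of that regime.
        transient = z500[:, days].var(axis=1, ddof=1).mean(axis=0)
        composites[name] = spread / np.sqrt(transient)
    return composites

# Synthetic example: 12 members, 100 days cycling through the four regimes.
rng = np.random.default_rng(0)
data = rng.normal(size=(1, 100, 3, 4)) + 0.3 * rng.normal(size=(12, 100, 3, 4))
labels = ["BL", "ZO", "AR", "GA"] * 25
composites = rims_by_regime(data, labels)
# Unweighted mean over all grid points, one value per regime.
spatial_means = {name: float(field.mean()) for name, field in composites.items()}
print(spatial_means)
```

The days are selected from the ERA40 classification and then applied to the RCM fields, which mirrors the selection described in the text. An area-weighted mean could replace the plain grid-point mean if the grid cells differ much in size.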

Fig. 11

Relative inter-model spread associated with each weather regime computed from the days belonging to the same weather regime for winter and summer

We have also compared the inter-model spread values for the 50 and 25 km horizontal resolution experiments. We find (not shown) that in winter the spread for the 25 km grid-mesh is slightly stronger in most of the domain, whereas in summer the values of spread are equivalent. This suggests that the number of grid points does not have a significant impact on the model spread and hence on the internal variability of RCMs.

5 Summary and discussion

The goal of this work is to investigate the ability of an ensemble of regional climate models to reproduce the large-scale circulation of the driving field. The motivation of our work arises from the idea that the added value of the RCMs is to provide finer scale details not present in the coarse resolution driving field, while maintaining the large-scale features provided by the LBC. A number of studies have shown a degradation of the large scales within the regional domain (Castro et al. 2005; Separovic et al. 2008). It has also been shown that an ensemble generated by only one RCM may provide different solutions despite being controlled at its boundaries by the same large-scale atmospheric flow. This feature is named the internal variability of regional models.

In the present study we assess the reproducibility of the large-scale flow through the weather regimes approach. The weather regime concept constitutes an attractive approach to describe the large-scale atmospheric dynamics. We use the four well-known North Atlantic weather regimes: the Blocking pattern, the Zonal, the Atlantic Ridge and the Greenland Anticyclone. Our data consists of an ensemble of experiments carried out by 13 different RCMs for the European domain, driven by the same LBCs provided by 40 years of the ERA40 reanalysis. To evaluate the model ability to reproduce the large-scales, we have considered three features: the mean behaviour (composite pattern, frequency of occurrence and mean persistence), the inter-annual variability and the day-to-day variability. These features are summarized in the following paragraphs:

  1.

    Results show that all RCMs reproduce very well the composite pattern, the mean frequency of occurrence of weather regimes as well as the mean persistence values. As weather regimes have been computed within the domain common to all RCMs, the models with the largest domains are penalized relative to the models with smaller domains. This is because, for larger domains, the control exerted by the large-scale flow in the minimum common area is weaker than in a model with a smaller domain.

  2.

    Concerning the inter-annual evolution of the frequency of occurrence, the models capture reasonably well the long-term trends and the inter-annual chronology of the four weather regimes. Nevertheless, we observe that there is some spread among the RCMs, which is stronger in summer than in winter. This is related to the fact that in summer, as the strength of the large-scale atmospheric flow decreases, the control exerted by the LBC is weaker and the nested models are more free to deviate from the large-scale of the driving field (Alexandru et al. 2007).

  3.

    Regarding the day-to-day correspondence between the weather regimes in ERA40 and in the RCMs, the discrepancies among the models are more evident. We have computed the percentage of “wrong” days, which corresponds to the days where the large-scale conditions of ERA40 are not well simulated. We found that in the ensemble there are nested models which significantly degrade the large-scales of ERA40, whereas for other models the percentage of “wrong” days is smaller. In summer all models degrade significantly. The weather regimes associated with the largest error are the GA for winter and AR for summer. In both seasons the ZO regime exhibits the smallest errors. The percentage of “wrong” days is somewhat related to the number of transition days between the weather regimes for a given season. The correlation values are weak but significant for both summer and winter, though stronger in winter. This is consistent with the idea that an atmospheric flow persisting for several days may provide a stronger control on the RCM through the LBC.
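As an illustration of the day-to-day diagnostic in the third point, the short sketch below counts “wrong” days and transition days for one season. It assumes that a “wrong” day is one on which the regime diagnosed in the RCM differs from the ERA40 regime, and that a transition day is one on which the ERA40 regime differs from that of the previous day. Both definitions, and the function names, are assumptions made for the example.

```python
import numpy as np

def wrong_day_percentage(era40_regime, rcm_regime):
    """Percentage of days on which the RCM regime differs from ERA40."""
    era40 = np.asarray(era40_regime)
    rcm = np.asarray(rcm_regime)
    return 100.0 * float(np.mean(era40 != rcm))

def transition_days(regime):
    """Number of days on which the regime differs from the previous day."""
    regime = np.asarray(regime)
    return int(np.sum(regime[1:] != regime[:-1]))

# Toy season of eight days (labels are illustrative only).
era40 = ["ZO", "ZO", "BL", "BL", "BL", "AR", "GA", "GA"]
rcm = ["ZO", "ZO", "ZO", "BL", "BL", "AR", "AR", "GA"]
print(wrong_day_percentage(era40, rcm))  # 25.0
print(transition_days(era40))            # 3
```

Computing both quantities for every season of the period gives two series that can then be correlated, for instance with `np.corrcoef`.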

Our conclusion for this part of the study is that the RCMs can reproduce the long-term means of the ERA40 large-scales. However, at day-to-day time-scales the models clearly degrade the large-scales. At the inter-annual time-scale the discrepancies among the RCMs are already evident. The degradation of the large-scales can be due to the model formulation (physics and dynamics), the nesting technique and the internal variability inherent to the RCMs. So far, the RCM modelling community has not provided a satisfactory explanation for this degradation. Following these results, one should be cautious with a statistical downscaling scheme that uses the large-scale solution generated by an RCM as a daily predictor field. Case studies with a particular RCM and based on a given year should also be interpreted with caution. However, we think that the RCMs are reliable for studies dealing with climate time-scales such as present climate studies (mean behaviour, trends, inter-annual variability) or climate change scenarios.

We have investigated in more detail the spread among the RCMs by estimating the relative inter-model spread (RIMS) of Z500. The spatial structure of RIMS patterns is similar in winter and summer, though the values are larger in summer. The maxima of the RIMS are located over the Balkan Peninsula, which is the region where the strongest bias of the RCM surface fields has been reported (Jacob et al. 2007). The spatial pattern of RIMS and the seasonal dependence show that the inter-model spread behaves in a similar way to the internal variability associated with only one RCM (Alexandru et al. 2007; Lucas-Picher et al. 2008a; Laprise et al. 2008). Note that the internal variability is included in the inter-model spread. The other contributing factor is each model's own performance, which depends on its physics, dynamics and experimental set-up.

One way both to avoid the RCMs’ degradation of the large-scales and to reduce the IV is to nudge the model towards the large-scales of the driving field. In this study, we have compared a nudged experiment to non-nudged ones. The spectral nudging technique has been applied to one of the RCMs within the ensemble. We showed that the model's ability to reproduce the large-scale conditions of ERA40 improves significantly with spectral nudging. The percentage of “wrong” days is much smaller in the spectrally nudged simulation. In the regional modelling community it is still controversial whether the nudging of the large-scales will become the norm for the RCMs. Some studies have shown that nudging can reduce the differences between the large-scales of the driving field and the RCMs. They have also suggested that adding the nudging approach to the traditional nesting technique can suppress the internal variability of RCMs. We think that the existence of internal variability in the RCMs has to be taken into account in order to draw robust conclusions when dealing with non-nudged simulations.

Another point is the impact of increasing the horizontal resolution on the inter-model spread. In this work, we showed that in our ensemble of RCMs the horizontal resolution does not significantly change the models’ ability to reproduce the weather regimes of ERA40. Nevertheless, the spread is slightly larger in winter as the number of grid points increases. Even though we cannot draw robust conclusions about the impact of domain size on the models’ performance, it seems that larger domains lead to more model discrepancies, as previously shown for Europe and North America (Vannitsem and Chomé 2005; Lucas-Picher et al. 2004).

To complete this work, we explore whether the inter-model spread is more pronounced for some synoptic situations (weather regimes) than for others. For that, we have computed the relative inter-model spread (RIMS) for the four weather regimes. The results show that for both winter and summer, the Blocking (BL) regime exhibits the most pronounced spread. While for the other weather regimes the RIMS maximum is confined to the Balkan Peninsula, in the BL regime it extends northward to the Scandinavian Peninsula. The BL regime presents the strongest spread probably because the main action centre of this regime is located in the centre of the domain (over the Scandinavian Peninsula) and thus can develop with little constraint from the LBCs. This is not the case for the ZO, AR and GA weather regimes, which are more constrained by the ERA40 forcing because they have action centres located closer to the western boundaries of the domain. Moreover, blocking cells are more persistent and thus associated with longer residence times of the air parcels within the domain, therefore increasing the internal variability of the nested models (Lucas-Picher et al. 2008b).

Within the context of the EU ENSEMBLES project, many analyses with other atmospheric variables are being carried out to validate the models’ performance by different approaches. The goal is to establish a weighting scheme by assigning weights to individual RCMs. The ability of RCMs to reproduce the large-scale flow of ERA40 will be part of this weighting system, as this feature is a key characteristic of RCM performance.