1 Introduction

The variability of climate in the mid-latitudes, whose chaotic nature drives the daily weather, originates from the baroclinic instability created by strong equator-to-pole temperature gradients. It is particularly strong in winter, for both inter-annual and intra-seasonal fluctuations. This variability has been described, in several large regions and at a hemispheric scale, as the alternation of the atmosphere between preferred states of the North-Atlantic atmospheric dynamics, or weather regimes (Vautard 1990), which result from the existence of multiple stationary solutions in the dynamics equations (Legras and Ghil 1985; Charney and DeVore 1979). In Europe, temperature extremes are often associated with an exceptional persistence of one particular weather regime. For example, the heat waves of summers 1976 and 2003 were characterized by persistent blocking anticyclones (Schär and Jendritzky 2004; Cassou et al. 2005), and the cold episode of winter 2009/2010 by a persistent negative phase of the North Atlantic Oscillation (NAO) (Cattiaux et al. 2010; Seager et al. 2010). The correct representation of such regimes, their spatial patterns and persistence properties is therefore essential for a global climate model (GCM) to properly simulate climate variability and its long-term changes.

In recent years, GCMs have been incorporating increasingly complex components at increasingly high resolutions. In particular, since its first version, the IPSL model has incorporated more comprehensive atmospheric (LMDZ, Hourdin et al. (2006)) and oceanic (NEMO) models, the latter including both ocean (NEMO-OPA, Madec et al. (1997)) and sea-ice (NEMO-LIM, Fichefet and Maqueda (1999)) components, and has included a surface model (ORCHIDEE, Krinner et al. (2005)). All components are synchronized using the OASIS coupler (Valcke 2006). In addition, the resolution of its atmospheric model has been increasing in both horizontal and vertical directions. In particular, simulations designed for the Third Phase of the Coupled Model Intercomparison Project (CMIP3) were based on a 96 × 71 horizontal grid and 19 vertical levels (IPSL-CM4 version, Marti et al. (2005)), while simulations designed for CMIP5 (IPSL-CM5A version, Dufresne et al. (2011)) use two finer grids (96 × 96 and 144 × 142) and 39 vertical levels. These changes require careful investigations and diagnostics. In this paper we investigate the sensitivity of the North-Atlantic dynamics and the European temperature extremes to both horizontal and vertical atmospheric resolution, using the IPSL-CM5A version and an intermediate version, IPSL-CM4v2, described in Sect. 2.

The typical horizontal size of North-Atlantic circulation patterns is of the order of 1,000 km, so that their representation is likely to be sensitive to GCM resolutions (typically of the order of 100 km at mid-latitudes in current models). For instance, some studies suggest that horizontal resolution is responsible for the underestimation of blocking episodes (D’Andrea et al. 1998; Matsueda et al. 2009), because coarse resolutions fail to reproduce the small-scale eddies necessary for the maintenance of larger-scale blockings. In particular, Doblas-Reyes et al. (1998) show that both North-Atlantic wintertime storm tracks and blockings were significantly improved when increasing the resolution from T63 to T106. Here we use a set of experiments with resolutions encompassing this interval, so that we can expect differences in both North-Atlantic weather regimes and associated European temperatures.

In addition to the horizontal grid, the vertical resolution has been shown to substantially affect the simulation of the tropospheric dynamics in climate models (e.g., Roeckner et al. 2006), which mostly results from a better representation of the stratosphere-troposphere coupling. From observations, stratospheric processes are indeed suspected to play a significant role in the intra-seasonal variability of the extratropical tropospheric climate (Baldwin and Dunkerton 1999), including weather regimes (Baldwin and Dunkerton 2001). For instance, Ouzeau et al. (2011) showed that the simulation of the winter 2009/2010 negative NAO was significantly improved when nudging the stratospheric dynamics towards reanalyses.

This paper is structured as follows. Section 2 provides details on the model experiments and on the reanalysis and observational datasets used for comparison. The sensitivity of North-Atlantic dynamics to model resolution is discussed in Sect. 3 through the analysis of both the jet stream and weather regimes. Section 4 focuses on the representation of summertime heat waves and wintertime cold spells over Europe, while a discussion and some conclusions are provided in Sects. 5 and 6.

2 Model details and set of simulations

We use two different versions of the IPSL coupled model. The first one is IPSL-CM4v2, an intermediate version between IPSL-CM4 and IPSL-CM5A, respectively used for CMIP3 and CMIP5. The second one is IPSL-CM5A. Both versions are composed of the LMDZ atmospheric model, the NEMO ocean model at 2° resolution, and the ORCHIDEE land-surface model.

For the CM4v2 version, the experiments presented in this paper only differ in their atmospheric component: the dynamical core of LMDZ uses finite-difference schemes on a latitude-longitude grid, for which five different horizontal resolutions are used (Table 1). The lowest resolution has 96 points in longitude and 71 in latitude (experiment C4-96×71) and was used in IPSL-CM4 for CMIP3. The four other resolutions are respectively 96 × 96, 144 × 96, 144 × 142 and 192 × 142. This set-up allows a discrimination between longitudinal and latitudinal sensitivities: only longitudinal (latitudinal) resolution is increasing from C4-96×96 to C4-144×96 and C4-144×142 to C4-192×142 (C4-96×71 to C4-96×96 and C4-144×96 to C4-144×142). The time step for the dynamics is changed between the experiments following the resolution in longitude, in order to respect numerical stability criteria. The timescale for the horizontal diffusion at the lowest resolved scale is also lowered for the two experiments with the highest resolutions. All other parameters are unchanged between these five experiments.

Table 1 Description of the set of experiments

The lowest-resolution simulation in this set-up (C4-96×71) is colder at the surface than the others by an average of 1.5 °C, which could influence the simulation of some climatic features. In order to test this influence, a second sensitivity experiment (C4-96×71p) was performed at this resolution by changing the surface albedo, which is a sensitive parameter for global temperature. For this experiment the parameter p-magic is deliberately tuned from 0.02 to 0.01, in order to bring the global-mean temperature to the same level as the other resolutions.

For each experiment, we use a control simulation run over 1860–1959 with greenhouse gas concentrations fixed at 1960 level (in particular CO2 at 348 ppm). No historical runs were available. As a 50-year period is sufficient for our study, we extract daily outputs of both geopotential height at 500 hPa (Z500) and daily mean temperature over 1910–1959. This choice is arbitrary but not crucial since chronology is unimportant in such control simulations.

In addition to CM4v2 control experiments, this study uses outputs of historical runs performed with the CM5A version (Table 1), at the two resolutions retained for the climate simulations of CMIP5: C5-96×96 (conventionally named IPSL-CM5A-LR) and C5-144×142 (IPSL-CM5A-MR). The essential difference between the CM4v2 and CM5A versions is the vertical discretization in the atmosphere: CM4v2 uses 19 hybrid sigma-pressure levels on the vertical, as in CM4, whereas the CM5A version uses 39 levels and has a better resolved stratosphere. Thus, even if some physical parameterizations and/or settings have been changed between CM4v2 and CM5A, one can consider that comparing C4-96×96 versus C5-96×96 and C4-144×142 versus C5-144×142 discriminates the influence of the vertical resolution. In order to ease such comparison, we decided to use the period 1910–1959 of CM5A historical runs, which provides a relatively constant CO2 level (∼320 ppm), close to the 348 ppm of the CM4v2 control runs.

Reference fields are the geopotential height at 500 hPa (Z500) dataset provided by the NCEP/NCAR reanalysis (hereafter NCEP, Kistler et al. (2001)) and the daily mean temperature of the ECA&D in-situ measurements (Klein-Tank et al. 2002), interpolated on a regular 0.5° × 0.5° grid (E-OBS dataset, Haylock et al. (2008)). For consistency with model runs, a 50-year subset is considered and both NCEP Z500 and E-OBS temperatures are extracted over 1960–2009. The sensitivity of observed North-Atlantic weather regimes to the choice of the reference period is tested by comparing the 1960–2009 NCEP daily Z500 to the 1910–1959 Z500 provided by the twentieth century reanalysis V2 (20CR, Compo et al. (2011)). Although recent and not yet widely used, the 20CR reanalysis has been shown to be in fair agreement with NCEP (among other former reanalyses) for representing the wintertime North-Atlantic circulations over their period of overlap (1948–2006) (Ouzeau et al. 2011). Since assimilated data over this region do not change much over the whole 20CR period (1871–2006, see Compo et al. (2011)), the 20CR Z500 over 1910–1959 can be considered as reliable.

3 North-Atlantic weather regimes

3.1 Methodology

Weather regimes (Vautard 1990) are generally obtained by applying clustering algorithms to a circulation variable such as Z500 (Michelangeli et al. 1995), and the analysis of their occurrence frequency and/or persistence provides a synthetic and discrete description of the complex atmospheric dynamics. This description assumes an underlying multi-modality of the probability density function (PDF) of the atmospheric circulation, or at least areas in phase space where atmospheric trajectories “like to stay”. Indications for such behavior in the North-Atlantic sector have been provided by Michelangeli et al. (1995) and Woollings et al. (2010), among others, although this has recently been debated (e.g., Christiansen 2007).

Given the annual cycle of North-Atlantic atmospheric circulations, we separate here summertime (May to September) from wintertime (November to March) weather regimes. We restrict Z500 fields to the North-Atlantic domain, defined as 90°W–30°E/20–80°N. For each season, the computation of weather regimes of either a reanalyzed or modeled 50-year Z500 field comprises two major steps:

  1. n centroids are obtained by applying a clustering algorithm on the k first Empirical Orthogonal Functions (EOFs, von Storch and Zwiers (2001)) of daily Z500 anomalies. In our case we use the k-means algorithm (Michelangeli et al. 1995) with n = 4 classes, after selecting k = 14 EOFs which carry at least 80 % of variance. Anomalies are obtained by removing the 50-year climatology of the raw Z500 field.

  2. each day is placed in the class whose centroid is the closest to the day’s Z500 anomaly in terms of minimal Euclidean distance. In the end, each class contains a distribution of daily Z500 anomalies, that can be described at first order by its mean (hereafter “class center”). Class centers and centroids generally differ, because centroids are computed on the first 14 principal components while centers are obtained in the full N-dimensional field.

This methodology has been used in a couple of recent studies, including Cassou et al. (2005) and Cassou (2008) from which the names of both summertime and wintertime weather regimes are picked for our study (see Sect. 3.3).
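The second step of the regime computation (nearest-centroid classification, then class centers as within-class means) can be illustrated with a minimal sketch. It assumes that daily anomalies and centroids are already expressed in the same space (for instance the retained principal components) and stored as plain lists of floats; the function names `classify_days` and `class_centers` and the toy values are hypothetical and are not taken from the study.

```python
import math

def classify_days(anomalies, centroids):
    """Attribute each daily anomaly vector to its closest centroid (Euclidean distance)."""
    labels = []
    for day in anomalies:
        distances = [math.dist(day, centroid) for centroid in centroids]
        labels.append(distances.index(min(distances)))
    return labels

def class_centers(anomalies, labels, n_classes):
    """Mean anomaly vector of the days attributed to each class (None for an empty class)."""
    n_points = len(anomalies[0])
    centers = []
    for k in range(n_classes):
        members = [day for day, lab in zip(anomalies, labels) if lab == k]
        if not members:
            centers.append(None)
            continue
        centers.append([sum(m[p] for m in members) / len(members)
                        for p in range(n_points)])
    return centers

# Toy example: four "days" described by two values each, and two centroids.
toy_anomalies = [[1.0, 0.0], [0.8, 0.2], [-1.0, 0.1], [-0.9, -0.1]]
toy_centroids = [[1.0, 0.0], [-1.0, 0.0]]
toy_labels = classify_days(toy_anomalies, toy_centroids)
toy_centers = class_centers(toy_anomalies, toy_labels, n_classes=2)
```

In this toy case the class centers differ slightly from the centroids used for the classification, which mirrors the distinction made above between centroids and class centers.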

Our approach to compare weather regimes from IPSL experiments with reanalysis can be decomposed as follows:

  1. we compute centroids for each experiment and reanalysis, and thus obtain n = 4 centroids for each experiment and each season;

  2. we test whether reanalysis centroids can be identified with those obtained from IPSL experiments;

  3. we classify each experiment and reanalysis among reanalysis centroids taken as a common reference, which is justified if the previous condition is verified;

  4. we compare the main features of each regime between IPSL experiments and reanalysis, which is relevant when using common centroids.

The performance of this approach is presented in Sects. 3.3 and 3.4. We first investigate the multi-modality issue in the atmospheric circulation by analyzing the PDF of the position of the jet stream in Sect. 3.2, based on the diagnostics performed in Woollings et al. (2010).

3.2 Preferred positions of the jet stream

We compute the latitudinal position of the jet in the North Atlantic by first zonally averaging the 850-hPa zonal wind between 75°W and 15°E for each day. The latitude of the jet is then taken as the center of the latitude band where the wind speed is greater than the maximum speed minus 1 m s−1. The 850-hPa level was chosen as it is representative of the eddy-driven jet and not influenced by the subtropical jet.
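As an illustration of this diagnostic, the sketch below computes the jet latitude for one day from a zonally averaged wind profile. It assumes that the zonal averaging has already been done, and that the "latitude band" is the contiguous band around the wind maximum (an assumption, since the text does not state how disjoint bands would be handled); the function name `jet_latitude` and the toy profile are hypothetical.

```python
def jet_latitude(latitudes, zonal_mean_wind, tolerance=1.0):
    """Center of the contiguous latitude band where the wind exceeds (max - tolerance).

    latitudes: latitudes in degrees, sorted in ascending order.
    zonal_mean_wind: zonally averaged 850-hPa zonal wind (m/s) at each latitude.
    tolerance: width of the speed interval below the maximum (1 m/s in the text).
    """
    i_max = max(range(len(zonal_mean_wind)), key=zonal_mean_wind.__getitem__)
    threshold = zonal_mean_wind[i_max] - tolerance
    # Extend the band southward and northward while the wind stays above the threshold.
    south = i_max
    while south > 0 and zonal_mean_wind[south - 1] > threshold:
        south -= 1
    north = i_max
    while north < len(latitudes) - 1 and zonal_mean_wind[north + 1] > threshold:
        north += 1
    return 0.5 * (latitudes[south] + latitudes[north])

# Toy daily profile: maximum of 8 m/s at 45N, band above 7 m/s spans 40N to 50N.
toy_lats = [30.0, 35.0, 40.0, 45.0, 50.0, 55.0, 60.0]
toy_wind = [2.0, 4.0, 7.5, 8.0, 7.2, 5.0, 3.0]
toy_jet = jet_latitude(toy_lats, toy_wind)
```

Applying such a function to every day of a season would give the sample of daily jet latitudes from which a PDF can be estimated.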

PDFs of daily jet latitudes in winter and summer months are shown in Fig. 1 for both reanalyses and IPSL simulations, together with the 95 %-confidence interval for NCEP reanalysis, computed by a bootstrap procedure with 1,000 realizations (gray shadings). In winter, the observed PDF displays a trimodal structure, as shown by Woollings et al. (2010). The observed poleward peak, located between 55 and 60°N, is absent in all simulations. The other two peaks are present in the model, albeit with overestimated frequencies (by compensation of the poleward deficiency). The increase in horizontal resolution tends to reduce (enhance) the equatorward (middle) peak, but has little effect on the poleward deficit. The CM5A version seems to shift the equatorward peak towards lower latitudes, which improves the fit with the observed PDF.

Fig. 1 Left Winter and right summer PDFs of the daily latitude of the 850-hPa jet (see text for details), for both reanalyses (black) and IPSL experiments (colors). For NCEP, 95%-confidence intervals obtained from bootstrap procedures are indicated (gray shadings)

In summer, the observed PDF presents a fairly flat distribution with only a single maximum near 50°N. The IPSL model exhibits a spurious peak around 35–40°N at the lowest horizontal resolutions, which progressively disappears as both horizontal and vertical resolutions increase. In contrast to winter, this low-latitude decrease is compensated by increased occurrences of the jet at both mid- (around 45°N) and high (above 60°N) latitudes. The simulation of the summer PDF thus generally improves with (1) horizontal resolution and (2) transition between CM4v2 and CM5A, except in the 45–50°N latitude band where it becomes overestimated.

3.3 Centroids

Centroids obtained from NCEP daily Z500 anomalies over 1960–2009 are presented in the top panels of Figs. 2 (winter) and 3 (summer). In winter we find the regimes recently used e.g. in Cattiaux et al. (2010) and described in Cassou (2008): the Blocking (BL), characterized by a strong anticyclone over Scandinavia, the two phases of the NAO (NAO− and NAO+), and the Atlantic Ridge (AR) bringing cold air masses from the Arctic over Europe. In summer, the atmospheric dynamics is less intense and only the negative phase of the NAO can be clearly identified from the clustering algorithm. We thus find the regimes described in Cassou et al. (2005): the Atlantic Low (AL) characterized by a deep trough over the ocean, the Blocking (BL), which can be considered as a reminiscence of the summer NAO+, the NAO− and the Atlantic Ridge (AR). In the following, the centroid of the $i$-th NCEP regime ($i \in \{1, \dots, 4\}$) will be denoted by $C^0_i$ (“0” standing for “reference”).

Fig. 2 Top Winter centroids obtained from NCEP Z500 clustering: Blocking (BL), NAO−, NAO+ and Atlantic Ridge (AR). Middle Same for the C4-96×71 experiment. Bottom Same for the C5-144×142 experiment. For both IPSL experiments, centroids are sorted relative to NCEP ones (see details in text), and E values with NCEP centroids are indicated

The same procedure of centroid computation is performed for each of the IPSL-CM4v2 and IPSL-CM5A experiments over 1910–1959. For each model $m$, the centroid of the $j$-th regime is denoted by $C^m_j$ ($j \in \{1, \dots, 4\}$). We now test if all observed centroids $C^0_i$ can be clearly identified among IPSL centroids $C^m_j$.

In order to answer this question, NCEP centroids are bilinearly interpolated to each of the IPSL resolutions, and the similarity between all $C^0_i$ and $C^m_j$ is tested from their coefficient of explained spatial variance E (hereafter E value), which is given for two vectors x and y of same size by

$$ E(x,y)=1-\frac{\sigma^2(y-x)}{\sigma^2(x)} \tag{1} $$

where $\sigma^2$ stands for variance. The E value tests both positions and amplitudes of the principal centers of action, which is necessary and sufficient for characterizing large-scale circulation (van Ulden and van Oldenborgh 2006). Here, this metric is a compromise between a spatial correlation, which only tests their position, and a Euclidean distance, which depends on vector size (i.e. horizontal resolution). We nevertheless verified that replacing the E value with spatial correlation or Euclidean distance in the following procedure does not change our results (not shown).

Computing E values of all pairs of $C^0_i$ and $C^m_j$ gives a 4 × 4 matrix (noted $E^m$) where

$$ E^m_{ij}=1 - \frac{\sigma^2(C^m_j-C^0_i)}{\sigma^2(C^0_i)} \tag{2} $$

Each $C^0_i$ is identified with one of the four $C^m_j$ by looking at the maximum value of the $i$-th row of $E^m$. If all these per-row maxima occur in distinct columns, then each $C^0_i$ corresponds to one $C^m_j$ and we say that NCEP centroids are well represented by the experiment. Otherwise, NCEP centroids are said to be misrepresented by the experiment. The significance of this procedure was tested by generating 1,000 realizations of two independent random processes (A and B) with spatio-temporal characteristics similar to Z500 maps, and applying the clustering algorithm to each of A and B. Even though A and B are randomly generated, we found that ∼10 % of A centroids could be identified among B ones. This roughly means that our identification procedure is significant at the 10 %-level.
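The identification procedure based on Eqs. (1) and (2) can be sketched as follows. This is only an illustration under simplifying assumptions: centroids are flattened to plain lists of floats on a common grid (the bilinear interpolation is not shown), no area weighting is applied, and the function names `e_value`, `e_matrix` and `identify` as well as the toy vectors are hypothetical.

```python
from statistics import pvariance

def e_value(x, y):
    """Explained spatial variance E(x, y) = 1 - var(y - x) / var(x), x being the reference."""
    difference = [b - a for a, b in zip(x, y)]
    return 1.0 - pvariance(difference) / pvariance(x)

def e_matrix(reference_centroids, model_centroids):
    """Matrix whose element (i, j) is the E value between reference i and model centroid j."""
    return [[e_value(c0, cm) for cm in model_centroids] for c0 in reference_centroids]

def identify(matrix):
    """Column index of each per-row maximum, and whether all these columns are distinct."""
    best = [row.index(max(row)) for row in matrix]
    return best, len(set(best)) == len(best)

# Toy example: two reference centroids and two model centroids given in swapped order.
toy_reference = [[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]
toy_model = [[0.0, 0.0, 0.9, -1.1], [0.8, -1.0, 0.1, 0.0]]
toy_matrix = e_matrix(toy_reference, toy_model)
toy_best, toy_well_represented = identify(toy_matrix)
```

Because E is a ratio of two variances, using population or sample variances gives the same value as long as the same estimator is used in the numerator and the denominator.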

All $E^m$ matrices are gathered in Table 2. According to our identification procedure, centroids are found to be well represented for all experiments in winter. This is illustrated in Fig. 2 for two of the experiments used in CMIPs, i.e. C4-96×71 (CMIP3, middle row) and C5-144×142 (CMIP5, bottom row). Despite slight differences in structures and/or amplitudes of patterns, all centroids can be visually and numerically identified with NCEP ones.

Table 2 $E^m$ matrices of cross E values between model and NCEP centroids, for both winter and summer seasons, for the 20CR reanalysis and all IPSL experiments

In summer, centroids are well represented for all experiments except C5-144×142 (Table 2), which may seem surprising since this experiment is one of the closest to observations in simulating the PDF of the jet position (Sect. 3.2). This misrepresentation may be due to an unstable choice between BL and NAO−, which leads here to a combination or mixture of the two. More precisely, while both AL and AR can be identified among C5-144×142 centroids, the two other centroids resemble (1) the opposite of NCEP BL and (2) the mean between NCEP BL and NAO− (Fig. 3, bottom). Whether this occurs for physical reasons or is a mathematical artifact of the clustering algorithm would require further investigation. For all other experiments, simulated centroids can be visually and numerically identified with NCEP ones, as illustrated for C4-96×71 in Fig. 3 (middle).

Fig. 3 Top Summer centroids obtained from NCEP Z500 clustering. Middle Same for the C4-96×71 experiment. Bottom Same for the C5-144×142 experiment. For both IPSL experiments, centroids are sorted relative to NCEP ones (see details in text), and E values with NCEP centroids are indicated. Centroids are respectively: Atlantic Low (AL), Blocking (BL), NAO− and Atlantic Ridge (AR)

The procedure is also applied to 20CR centroids (Table 2). NCEP centroids can be identified among 20CR centroids, indicating the robustness of the weather regimes method to changes of reanalysis dataset and/or computational periods. In addition, even if $E^m_{ii}$ values (i.e. on the diagonal of $E^m$ matrices) are generally higher for 20CR than for IPSL, the weak difference between IPSL and 20CR E values suggests that the departure between IPSL and NCEP centroids is of the same order as the multi-decadal variability of observed centroids.

3.4 Between- and within-class main features

3.4.1 Class centers

In order to compare weather regimes between all experiments and reanalyses in terms of frequency of occurrence and persistence only, daily classifications (the second step of the weather regime computation, Sect. 3.1) need to be performed relative to common centroids. The conclusions drawn in Sect. 3.3 justify the use of NCEP centroids. A similar choice is made in this issue by Cattiaux et al. (2012).

However, with this procedure, within-class distributions of Z500 anomalies, i.e. the four subsets of the total distribution of Z500 anomalies conditional on each regime, may differ from one experiment to another (Rust et al. 2010). At first order, this can be tested by comparing the means of within-class Z500 distributions, hereafter referred to as “class centers”. Similarly to Fig. 2 (3), Fig. 4 (5) illustrates the winter (summer) class centers obtained for NCEP (top panels), C4-96×71 (middle panels) and C5-144×142 (bottom panels). By construction, NCEP class centers have patterns very similar to NCEP centroids, albeit more pronounced since class centers consider the whole Z500 distribution while centroids are computed in the reduced EOF phase space (∼80 % of the total Z500 variance). Also by construction, IPSL class centers are closer to NCEP ones than centroids are, although some differences in spatial patterns can be observed. For instance, the IPSL low (high) pressure system of the winter NAO− does not extend as much over Scandinavia (Greenland) as for NCEP. In addition, IPSL class centers are generally less pronounced than NCEP ones, especially in summer (Fig. 5).

Fig. 4 Winter class centers obtained after classifying daily Z500 wrt. NCEP centroids (see details in text). Top NCEP, middle C4-96×71 and bottom C5-144×142. For both IPSL experiments, E values with NCEP class centers are indicated

Fig. 5 Summer class centers obtained after classifying daily Z500 wrt. NCEP centroids (see details in text). Top NCEP, middle C4-96×71 and bottom C5-144×142. For both IPSL experiments, E values with NCEP class centers are indicated

In order to better quantify such differences, Fig. 6a, c gathers all E values computed between class centers obtained for each experiment and NCEP ones, respectively for winter and summer. 20CR class centers have the highest E value with NCEP (E = 0.98 on average for both seasons), which suggests that no major change occurs in within-class daily circulations between 1910–1959 and 1960–2009. In order to estimate the uncertainty in E values due to the observed variability, a bootstrap procedure is applied over the days used to compute class centers of NCEP and 20CR. This provides a 95 %-confidence interval for the distribution of the E value between two class centers derived from observations (gray shadings in Fig. 6a, c), and we consider that two Z500 distributions whose class-center E values fall into this interval are equivalent. In addition, in order to visually illustrate differences in class centers, we project them onto the orthogonal basis defined by the first two EOFs of the NCEP Z500 distribution (Fig. 6b, d). In particular, this verifies that NAO regimes mainly project onto the first EOF (x-axis) of the total Z500 distribution. In this reduced 2D-space, 20CR class centers appear very close to NCEP ones, which is consistent with their high E values.

Fig. 6 a E values between all winter class centers and NCEP ones, with mean values added in the rightmost panel. b Projections of all winter class centers in the space defined by the first two EOFs of NCEP daily winter Z500 anomalies. c, d Same as a, b for summer

In winter, C4-96×96 (C4-96×71p) presents the highest (lowest) E value of IPSL-CM4v2 experiments with an average of E = 0.91 (E = 0.83, Fig. 6a). The poorest agreement between IPSL-CM4v2 and NCEP occurs for the NAO− class center, especially for experiments at lowest resolution (C4-96×71 and C4-96×71p). This is particularly illustrated in Fig. 6b, where all experiments significantly depart from the NCEP NAO− class center. IPSL-CM5A shows a substantial improvement in the representation of winter class centers compared to IPSL-CM4v2 at same horizontal resolutions, especially for the NAO− regime. As the main improvement between both versions lies in the increased vertical resolution (from 19 to 39 levels), this could suggest a crucial role of a better-resolved troposphere and/or stratosphere in the simulation of the wintertime NAO. The latter is consistent with findings of previous studies (Baldwin and Dunkerton 2001; Roeckner et al. 2006).

Similarities between simulated and NCEP class centers are greater in summer. Again, both experiments at the lowest horizontal resolution present the poorest scores (E = 0.89 and E = 0.90). All other experiments with both IPSL-CM4v2 and IPSL-CM5A have very close E values, even reaching the 95 %-confidence band of observations. Overall, the best representation is found for C4-144×96 (E = 0.96). No major change arises from the transition between CM4v2 and CM5A versions.

3.4.2 Conditional position of the jet

Another way to check the simulation of different regimes is to look at the jet stream position. PDFs of the jet latitude are displayed in Fig. 7 conditionally to the days falling into each of the different regimes. Differences with NCEP are shown to emphasize the structure of errors for each regime; the total NCEP distribution with its 95 %-confidence interval is added for reference.

Fig. 7 PDFs of the daily latitude of the North-Atlantic jet, for days falling in the top winter and bottom summer regimes. NCEP reanalyses are in absolute values (solid black), and 20CR (dashed black) and IPSL experiments (colors, see legend) are represented as differences relative to NCEP. Gray shadings encompass non-significant differences at a 5% level. All PDFs have been normalized to an integral of 1 for each different regime

In winter, the PDF during the NAO− regime clearly occupies the equatorward peak of the total observed PDF (Fig. 1). This feature remains true in every simulation, albeit with a generalized poleward shift of the peak, which appears as a dipolar anomaly of the distributions in Fig. 7. Such a finding suggests an underestimation of the NAO− amplitude by the IPSL model, which is consistent with the weak projection of the NAO− onto the first EOF of NCEP (Fig. 6b). For the other regimes, the model misses the contribution of the poleward peak and overestimates the jet occurrence in the mid-latitudes, especially for the AR regime. The global structure of geopotential anomalies is well simulated in that case (Fig. 4), but despite positive wind anomalies at high latitudes, the IPSL wind maximum remains south of the observed one.

In summer, the general shape of the PDF is improved with resolution for all regimes. In the NAO− and AL regimes, the spurious subtropical peak (35–40°N) progressively disappears. The frequency of jet positions at latitudes higher than 50°N also increases with resolution; it remains significantly underestimated in the BL and, to a lesser extent, AL regimes, but is close to the 95 %-confidence interval of observed frequency in the other two. The only feature that deteriorates with resolution is the excess frequency of the jet around 45°N, which appears in all regimes but is most prominent in AL.

3.4.3 Occurrence

Figure 8 compares seasonal frequencies of occurrence of each weather regime, defined as the percentage of days attributed to each class per season. For each experiment or reanalysis, a bootstrap procedure is applied over years in order to estimate 95%-confidence intervals for mean frequencies. The 1980s–1990s were characterized by anomalously high (low) occurrences of NAO+ (NAO−) in winter (e.g., Scaife et al. 2007), which could induce a bias if comparing NCEP with IPSL-CM4v2 control experiments or IPSL-CM5A 1910–1959 historical runs. The 20CR reanalysis used over the period 1910–1959 gives an estimation of such multi-decadal variations in observed frequencies of weather regimes. Overall we consider that the uncertainty of the observed occurrences is encompassed by both NCEP and 20CR confidence intervals (see gray shadings in Fig. 8).
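A bootstrap over years of the kind described here can be sketched as below. The resampling details are assumptions (percentile interval, resampling of seasonal frequencies with replacement, 1,000 realizations), since the text does not specify them; the function names `seasonal_frequencies` and `bootstrap_ci` and the toy labels are hypothetical.

```python
import random

def seasonal_frequencies(labels_by_year, regime):
    """Percentage of days attributed to `regime` in each season (one list of labels per year)."""
    return [100.0 * sum(1 for lab in year if lab == regime) / len(year)
            for year in labels_by_year]

def bootstrap_ci(values, n_boot=1000, level=0.95, seed=0):
    """Percentile confidence interval of the mean, resampling years with replacement."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lower = means[int(round((1.0 - level) / 2.0 * n_boot))]
    upper = means[int(round((1.0 + level) / 2.0 * n_boot)) - 1]
    return lower, upper

# Toy example: three "seasons" of four classified days each.
toy_labels_by_year = [[0, 0, 1, 2], [0, 1, 1, 3], [0, 0, 0, 2]]
toy_frequencies = seasonal_frequencies(toy_labels_by_year, regime=0)
toy_interval = bootstrap_ci(toy_frequencies, n_boot=200, seed=1)
```

Resampling whole years rather than individual days keeps the within-season persistence of regimes intact in each bootstrap sample.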

Fig. 8 a Winter and b summer mean seasonal frequencies of occurrence of each weather regime, in %. Reference levels are taken from NCEP, and 95%-confidence intervals obtained from bootstrap procedures are indicated. Mean errors, defined as the mean of absolute differences relative to NCEP over the four regimes, are indicated in rightmost panels

In winter, all IPSL-CM4v2 experiments over- (under-) estimate the mean occurrences of NAO− and AR (BL and NAO+). However, statistically significant departures are only found for C4-96×71 and C4-96×71p in the BL and NAO− regimes. Such departures could be related to a shift of NAO− class centers towards BL ones (Fig. 6b). On average over all regimes, the four highest-resolved experiments have equivalent occurrences, which are closer to 20CR than those of C4-96×71 and C4-96×71p. In summer, all IPSL-CM4v2 experiments simulate the regime occurrences fairly well, despite a generalized slight (and non-significant) over- (under-) estimation for NAO− (AR). Again, both C4-96×71 and C4-96×71p exhibit the highest mean error, while all other horizontal resolutions are very close to NCEP/20CR. For both seasons, no major difference is observed between versions CM4v2 and CM5A at the same resolutions.

3.4.4 Persistence

Figure 9 compares persistences of each weather regime, defined as the mean number of consecutive days attributed to a given class for all episodes in that class. Again, a bootstrap procedure is applied in order to estimate uncertainties of mean values, and we consider that both NCEP and 20CR confidence intervals encompass the uncertainty for observed persistences.
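The persistence defined above amounts to the mean length of uninterrupted runs of a given regime in the sequence of daily labels. A minimal sketch, assuming the labels of one season are given as a list (so that episodes do not span two seasons) and using the hypothetical function name `mean_persistence`:

```python
from itertools import groupby

def mean_persistence(labels, regime):
    """Mean number of consecutive days in `regime`, over all episodes of that regime."""
    runs = [sum(1 for _ in group) for lab, group in groupby(labels) if lab == regime]
    return sum(runs) / len(runs) if runs else 0.0

# Toy sequence: regime 0 has episodes of 3, 1 and 2 days; regime 2 has one 4-day episode.
toy_sequence = [0, 0, 0, 1, 1, 0, 2, 2, 2, 2, 0, 0]
toy_persistence_0 = mean_persistence(toy_sequence, regime=0)
toy_persistence_2 = mean_persistence(toy_sequence, regime=2)
```

For a multi-year dataset, the run lengths would be collected season by season before averaging.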

Fig. 9 a Winter and b summer mean persistences of each weather regime, in number of days. Reference levels are taken from NCEP, and 95%-confidence intervals obtained from bootstrap procedures are indicated. Mean errors, defined as the mean of absolute differences relative to NCEP over the four regimes, are indicated in rightmost panels

In winter, IPSL biases on persistence are related to biases on occurrence (Fig. 8a): excess (deficit) in NAO− and AR (BL and NAO+) persistence. In particular, C4-96×71 and C4-96×71p simulate on average 2-day longer NAO− episodes than NCEP/20CR, which is statistically significant. Interestingly, the increase in horizontal resolution tends to reduce the persistence of the winter NAO+, so that the three highest-resolved experiments significantly underestimate the observed average. On average over all regimes, the lowest two resolutions exhibit the strongest mean error, while the four other experiments have relatively equivalent biases. In summer, all experiments tend to produce more persistent episodes than observed, especially for the NAO− and BL regimes. Again, C4-96×71 and C4-96×71p have the largest departures from reference, in particular with more than two days above average for BL. C4-144×96, C4-144×142 and C4-192×142 present the closest persistences to 20CR. As for occurrences, the CM5A version shows no significant improvement in regime persistences.

4 European temperature extremes

In this section, the ability of each IPSL-CM4v2 and IPSL-CM5A (low resolution) experiment to simulate cold/heat waves over continental Europe is assessed through the computation of several statistical indices of frequency, duration and intensity. As evidenced in this issue, all IPSL experiments used here exhibit a cold bias over Europe when compared to E-OBS observations (Menut et al. 2011; Hourdin et al. 2011). If extreme events are defined in relation to fixed thresholds, this would lead to a systematic overestimation (underestimation) of cold (heat) waves. In order to explore the model variability in its own climate, we thus choose to define extremes as departures in standard deviations (σ) from the mean climatology of the considered experiment.

For each experiment and each grid point, we define a cold (heat) wave as a wintertime (summertime) event of two or more consecutive days during which the local daily mean surface air temperature is at least 2σ below (above) the mean of the corresponding wintertime (summertime) temperature distribution. Such a definition is also used by Vavrus et al. (2006) to select persistent extreme events (2 days and 2σ-filter) and coherent values in terms of socio-economic impacts. The temperature thresholds in Kelvin, obtained from the 2σ-departures from the mean, are shown in Fig. 10 for E-OBS and two of the IPSL experiments used in CMIP3 and CMIP5. Due to the generalized cold bias of the model simulations, the temperature thresholds are on spatial average colder than E-OBS ones for both seasons, especially for the lowest resolution (96 × 71).
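As an illustration of this definition, the following sketch detects waves in a single seasonal series at one grid point. It assumes one mean and one standard deviation per season (computed here with the population formula) and treats the series as contiguous; both points, and all names, are assumptions made for the example rather than details taken from the study.

```python
from statistics import mean, pstdev

def standardized_anomalies(temps):
    """Daily temperatures expressed as departures from the seasonal mean, in sigma."""
    mu, sigma = mean(temps), pstdev(temps)
    return [(t - mu) / sigma for t in temps]

def find_waves(temps, kind="cold", n_sigma=2.0, min_days=2):
    """Return (start_index, length) for each run of at least `min_days` extreme days."""
    sign = -1.0 if kind == "cold" else 1.0
    z = standardized_anomalies(temps)
    waves, start = [], None
    for i, value in enumerate(z + [0.0]):  # sentinel closes a run ending on the last day
        extreme = sign * value >= n_sigma
        if extreme and start is None:
            start = i
        elif not extreme and start is not None:
            if i - start >= min_days:
                waves.append((start, i - start))
            start = None
    return waves

# Toy series: a two-day cold event and an isolated cold day (illustrative values).
temps = [0.0] * 20 + [-10.0, -10.0] + [0.0] * 20 + [-10.0] + [0.0] * 20
print(find_waves(temps, "cold"))  # the isolated day is filtered out by min_days
```

Because the threshold is relative to each experiment's own climatology, a uniform cold bias shifts the mean but leaves the detected events unchanged, which is the motivation given above for using σ-departures.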

Fig. 10

Top Wintertime and bottom summertime 2σ-threshold temperature values for both C4-96×71 and C5-144×142 experiments and E-OBS observations. Units: °C. Spatial averages of biases relative to E-OBS are indicated

Various cold- and heat-wave indices, inspired by Beniston (2007), were selected and calculated for each grid point to describe the behavior of temperature variability. The following indices describe cold/heat-wave characteristics:

  • CWN/HWN (Cold/Heat-Wave Number): total number of cold/heat-wave events over the whole period (i.e. 1960–2009 for E-OBS, 1910–1959 for IPSL simulations).

  • CWF/HWF (Cold/Heat-Wave Frequency): total number of days in cold/heat-wave events.

  • CWD/HWD (Cold/Heat-Wave Duration): duration in days of the longest cold/heat wave over the whole period.

  • CWI/HWI (Cold/Heat-Wave Intensity): sum of σ-levels exceeding the 2σ-threshold over all days involved in cold/heat-wave events.
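The four indices listed above can be sketched for one grid point as follows. The input is assumed to be a series of standardized temperature anomalies (in σ). The intensity is computed here as the summed exceedance above the 2σ-threshold, which is one possible reading of the definition; the function name and dictionary keys are hypothetical.

```python
def wave_indices(z, kind="cold", n_sigma=2.0, min_days=2):
    """Number, frequency, duration and intensity indices from standardized anomalies."""
    sign = -1.0 if kind == "cold" else 1.0
    runs, current = [], []
    for value in list(z) + [0.0]:  # sentinel flushes a run ending on the last day
        if sign * value >= n_sigma:
            current.append(sign * value)
        else:
            if len(current) >= min_days:
                runs.append(current)
            current = []
    return {
        "number": len(runs),                                 # CWN / HWN
        "frequency": sum(len(r) for r in runs),              # CWF / HWF, in days
        "duration": max((len(r) for r in runs), default=0),  # CWD / HWD, in days
        # CWI / HWI, assumed here to be the exceedance above the threshold, in sigma
        "intensity": sum(v - n_sigma for r in runs for v in r),
    }

# Toy anomalies: a 2-day wave, an isolated extreme day and a 3-day wave.
z = [0.0, -2.5, -3.0, 0.0, -2.1, 0.0, -2.0, -2.5, -4.0, 1.0]
print(wave_indices(z, "cold"))
```

If the intensity is instead meant as the sum of the σ-levels themselves, the last entry would simply sum `v` rather than `v - n_sigma`.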

To compare these indices adequately across model simulations of different resolutions, we bilinearly interpolate both E-OBS observations and model outputs onto a regular intermediate 2.5° × 2.5° grid. All index computations are weighted by both grid-point surface and time-series length (IPSL-CM4v2 is based on a 360-day calendar and IPSL-CM5A on a 365-day calendar). Figure 11 summarizes these calculations for cold- and heat-wave indices, indicating relative errors, i.e. departures in % from E-OBS values (with 95%-confidence intervals), as well as spatial correlations with E-OBS patterns. Confidence intervals are estimated with a 1000-realization bootstrap procedure on years, except for the maximum-duration index (CWD/HWD), which is by definition a single value. Spatial correlations are obtained with a Pearson’s product-moment correlation test at a 5 % significance level. In general, for a given resolution, cold-wave characteristics are better simulated than heat-wave ones, for both correlation coefficient and relative error. Moreover, IPSL experiments overestimate the number of wintertime cold waves CWN (+13 % on average) and largely underestimate that of summertime heat waves HWN (−55 % on average) compared to E-OBS (see also Figs. 12 and 13). Note that HWN even reaches zero at some grid points in Central Europe. Fair results are obtained in the simulation of CWF, notably for the three simulations with the two highest resolutions (C4-192×142, C4-144×142 and C5-144×142), for which the highest correlation coefficients among indices (0.73, 0.79 and 0.77 respectively) are obtained, together with a low relative error except for the CM5A experiment (+4, +8 and +21 % respectively). However, poor results are obtained for HWF: spatial correlations are lower than 0.5 and the relative error, averaged over all experiments, is −52 %. All experiments overestimate the total intensity of cold waves CWI (+27 % on average) and underestimate that of heat waves HWI (−55 % on average). Finally, the maximal duration of heat waves is underestimated by about 33 %, while this index seems well simulated for cold waves.
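The summary statistics described in this paragraph can be sketched as follows. The sketch assumes that cell area on the regular grid is proportional to the cosine of latitude, that the spatial correlation is unweighted, and that the bootstrap resamples the model's yearly totals while keeping the observed total fixed. None of these details is specified in the text, and the function names are hypothetical. The significance test on the correlation and the calendar-length weighting are not covered.

```python
import math
import random

def area_weights(lats_deg):
    """cos(latitude) weights, proportional to cell area on a regular lat-lon grid."""
    return [math.cos(math.radians(lat)) for lat in lats_deg]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def relative_error(model, obs, weights):
    """Departure in % of the area-weighted model mean from the observed one."""
    m, o = weighted_mean(model, weights), weighted_mean(obs, weights)
    return 100.0 * (m - o) / o

def pearson_r(x, y):
    """Unweighted Pearson product-moment correlation between two index maps."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def bootstrap_relative_error(yearly_model, obs_total, n_boot=1000, level=0.95, seed=0):
    """Percentile interval of the relative error, resampling model years with replacement."""
    rng = random.Random(seed)
    n_years = len(yearly_model)
    errors = sorted(
        100.0 * (sum(rng.choices(yearly_model, k=n_years)) - obs_total) / obs_total
        for _ in range(n_boot)
    )
    lo = errors[int((1 - level) / 2 * n_boot)]
    hi = errors[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi
```

Index maps are passed as flat lists of grid-point values, with one latitude per grid point for the weights. A single-valued index such as CWD/HWD has no yearly totals to resample, which is consistent with the absence of confidence intervals for it.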

Fig. 11

a Wintertime cold-wave and b summertime heat-wave indices, averaged over Europe and represented as departures in % from E-OBS values. Indices are total number of waves (CWN/HWN), total number of extreme days (CWF/HWF), maximum duration (CWD/HWD) and total intensity (CWI/HWI). Red (blue) stands for positive (negative) departures, and 95%-confidence intervals calculated with bootstrap are reported on each bar (except for CWD/HWD, which are single maximum values). E-OBS raw values are explicitly indicated. The intensity of the filling color represents the spatial correlation value between respective IPSL and E-OBS index maps, see color bar in bottom-right corner. Hatches indicate non-significant correlations (p value > 0.05)

Fig. 12

Wintertime cold-wave number (CWN) in all IPSL experiments and E-OBS observations. Spatial averages are indicated in top-left corners

Fig. 13

Summertime heat-wave number (HWN) in all IPSL experiments and E-OBS observations. Spatial averages are indicated in top-left corners

Nevertheless, two groups of model resolution can be distinguished: “middle-to-high” resolutions (C5-144×142, C4-144×142, C4-192×142), which give fair results characterized by significant correlation coefficients and low relative errors, especially for cold waves, as opposed to “low-to-middle” resolutions (C5-96×96, C4-96×71, C4-96×71p, C4-96×96), which show both low spatial correlation coefficients (even non-significant for heat waves) and high relative errors. The transition from the CM4v2 to the CM5A version gives mixed results. For the 96 × 96 horizontal grid, CM5A significantly improves the simulation of cold-wave characteristics but degrades results for heat waves (Fig. 11). For the higher 144 × 142 resolution, CM5A tends to increase the overestimation of cold waves, but gives heat-wave indices similar to those of CM4v2. Finally, it is worth noting that C4-96×71 and C4-96×71p give very similar results for such extreme indices, despite a strong reduction of mean biases in C4-96×71p (tuning of the ocean albedo parameter). Overall, this lowest resolution (used in CMIP3) exhibits the poorest agreement with E-OBS, especially for spatial correlations.

In order to investigate the spatial distribution of cold/heat-wave characteristics, patterns of the number of cold waves (CWN) are shown for each experiment and E-OBS in Fig. 12. The focus on CWN is motivated by the fact that this index reflects a concrete societal issue, in both present-day and climate-change contexts. Significant spatial correlation coefficients (0.44–0.69) are obtained with resolutions C4-192×142, C4-144×142 and C5-144×142. The lowest relative error is obtained with the highest resolution (C4-192×142). However, inadequate patterns are found with coarser resolutions (C4-96×96, C4-96×71p, C4-96×71, C5-96×96), with much higher values than E-OBS in Central and Southeastern Europe. In particular, the position of the maximum number of cold waves, located in Northwestern Russia and along the Baltic Sea borders in E-OBS and already evidenced by Vavrus et al. (2006), is captured by the highest three resolutions but not by the others.

Figure 13 is similar to Fig. 12 but for the total number of summertime heat waves (HWN). As previously mentioned, IPSL model simulations strongly underestimate the number of heat waves, although slightly better correlation coefficients and relative errors are obtained with higher resolutions. In general, patterns of heat waves are inadequately simulated at all resolutions, as confirmed by the poor statistical results in Fig. 11.

5 Discussion

In this section we discuss the link between large-scale dynamics (Sect. 3) and temperature extremes (Sect. 4), in order to understand why increasing horizontal and/or vertical resolution causes improvements in the representation of some variables but not necessarily in others. This link is made in Fig. 14: for each grid point, we find the wintertime/summertime regime with the highest relative frequency of occurrence over days of cold/heat waves, defined as the frequency over cold/hot days only divided by the frequency over all days, as done by Yiou and Nogaj (2004). This provides spatial information about the regimes that matter most for the representation of temperature extremes.
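The relative frequency used for Fig. 14 can be sketched for one grid point as follows, assuming a daily sequence of regime labels and a matching sequence of flags marking the days that belong to a cold or heat wave. The function names are hypothetical, and a grid point without any extreme day is given no predominant regime.

```python
from collections import Counter

def relative_frequencies(regimes, is_extreme):
    """Regime frequency over extreme days divided by its frequency over all days."""
    all_days = Counter(regimes)
    extreme_days = Counter(r for r, flag in zip(regimes, is_extreme) if flag)
    n_all, n_ext = len(regimes), sum(extreme_days.values())
    if n_ext == 0:
        return {}  # no extreme day at this grid point
    return {r: (extreme_days[r] / n_ext) / (all_days[r] / n_all) for r in all_days}

def predominant_regime(regimes, is_extreme):
    """Regime with the highest relative frequency, or None without extreme days."""
    ratios = relative_frequencies(regimes, is_extreme)
    return max(ratios, key=ratios.get) if ratios else None

# Toy example: 10 days, 4 of them extreme (illustrative values only).
regimes = ["NAO-"] * 4 + ["BL"] * 4 + ["AR"] * 2
is_extreme = [True, True, True, False, True, False, False, False, False, False]
print(predominant_regime(regimes, is_extreme))  # NAO-
```

A ratio above 1 means that the regime is over-represented during extreme days relative to its climatological frequency. With very few extreme days the ratios rest on small counts, so the resulting map can be noisy.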

Fig. 14

Top Predominant wintertime weather regime associated with cold waves, defined at each grid point as the regime with the highest relative frequency over days of cold waves, for both C4-96×71 and C5-144×142 experiments and E-OBS observations. Bottom Same for summertime regimes and heat waves

In the observations, we find that wintertime cold waves are mostly associated with (1) NAO− over Northern Europe (above 45°N), (2) AR over Spain, and (3) BL over Central to South-Western Europe (Fig. 14, top), which is consistent with Cattiaux et al. (2012). Only the two IPSL experiments used for CMIP3 and CMIP5 are shown for convenience, but for all resolutions the IPSL model tends to overestimate the influence of the AR regime, which shifts the area of NAO− influence above 50°N. The influence of BL is also underestimated, depending on the experiment, but no clear signal arises with increasing resolution. The general overestimation of cold waves by the IPSL model (Fig. 11), resulting from an overestimated wintertime intra-seasonal variability, can therefore be linked to the overestimation of both winter NAO− and AR frequencies (Fig. 8). The better simulation of both structure and frequency of the winter NAO− by high horizontal resolutions (Figs. 6 and 8) directly improves the representation of cold waves over Scandinavia (Fig. 12). Below 50°N, since the influence of the resolution on the AR regime is less pronounced, the reduction of the cold-wave overestimation is somewhat limited. Overall, this can explain why spatial features of cold waves are improved with increasing horizontal resolution, while such an improvement is less clear in spatially-averaged indices. However, the representation of weather regimes hardly explains the changes in wintertime cold spells between the CM4v2 and CM5A versions of the IPSL model. These changes are therefore suspected to arise from non-dynamical processes, and their understanding is left for future studies.

The link between large-scale dynamics and temperature extremes is generally weaker in summer. We find that heat waves are mostly associated with both AL and BL over most of Europe, AR in Spain and AR and NAO− in South-Western Europe. For the IPSL model at low resolutions, due to the small number of hot days (even zero in Central Europe), maps of the preferred regime for heat waves are somewhat noisy (Fig. 14, bottom). Higher resolutions tend to increase the number of heat waves and improve the fit with observations in Fig. 14, even if the AR influence is overestimated over Central Europe (as in winter). Thus, contrary to wintertime, the better representation of summertime temperature extremes by the highest resolutions (both spatial features and spatially-averaged indices, Figs. 11 and 13) seems linked to an improvement of the association with the large-scale dynamics rather than to an improvement of the large-scale features themselves. In addition, the generalized strong underestimation of heat waves seems hardly attributable to biases in weather regimes. It rather denotes a systematic underestimation of the summertime variability by the IPSL model, which is likely to be caused by soil processes or radiative feedbacks (clouds, aerosols).

6 Conclusions

In this paper we have investigated the influence of the atmospheric resolution on the simulation of North-Atlantic dynamics and European temperature extremes by the IPSL model. For that purpose we have used both IPSL-CM4v2 experiments with five different horizontal resolutions and 19 vertical levels and two IPSL-CM5A experiments with 39 vertical levels.

The first part was dedicated to the evaluation of the North-Atlantic dynamics in the IPSL model by using a weather-regime approach. Since this approach relies on the multi-modality of the North-Atlantic jet stream, we first tested such multi-modality in IPSL experiments. We have found that the IPSL jet stream is on average located southward of the observed one for both seasons. In winter the IPSL model misses the 58°N peak of the observed trimodal distribution of the jet stream position, and the balance between the 38°N and 46°N peaks is improved by the resolution increase. In summer the resolution increase tends to improve the fit between simulated and observed jet stream positions. Despite these biases in jet stream positions, the observed weather regimes can be identified in IPSL circulation clusters, which indicates that IPSL experiments fairly reproduce the alternation between the preferred states of the atmospheric dynamics. However, some differences are observed in intra-regime distributions, especially at the lowest resolution, where both occurrence and persistence of the winter NAO− regime are overestimated but its mean intensity is underestimated. Overall, a threshold effect appears between the 96 × 71 resolution and all others. The CM5A version tends to improve the representation of winter regimes, suggesting a crucial role of a highly resolved stratosphere for representing wintertime dynamics. These results indicate that IPSL simulations designed for CMIP5 will show higher skill for North-Atlantic dynamics than the CMIP3 runs.

The second part addresses the issue of the representation of European temperature extremes by the IPSL model. We find that indices of winter cold spells (summer heat waves), such as frequency or intensity, are overestimated (underestimated) for all experiments. In winter the overestimation of cold extremes may be linked to the overestimation of NAO− and AR occurrence. In summer the large underestimation of warm extremes seems hardly attributable to errors in large-scale circulation, since weather regimes are generally well captured by the model, but should rather result from systematic biases in the representation of local processes, including soil moisture or cloud feedbacks. For both seasons, the increase in horizontal resolution usually contributes to a better representation of extreme statistics. In particular, the better-represented orography at higher resolutions significantly improves spatial patterns of all indices. Interestingly, the horizontal resolution seems to have a stronger influence on temperature extremes than the correction of the mean temperature bias, so that the horizontal grid used in CMIP3 generally leads to the poorest agreement with observed indices. However, the transition between the CM4v2 and CM5A versions does not necessarily improve extreme temperature statistics, which may seem particularly surprising for the winter season since (1) weather regimes appear better simulated and (2) the influence of the large-scale circulation on European temperatures is stronger. The latter suggests that regional processes that modulate temperature extremes have been modified between the CM4v2 and CM5A versions. Further investigation, including sensitivity experiments to land parameters, or statistical breakdown methodologies such as presented in Cattiaux et al. (2012), could help discriminate the respective contributions of large-scale circulation and regional physical mechanisms to biases in temperature extremes.