1 Introduction

Numerous studies have examined the ability of General Circulation Models (GCMs) to reproduce many aspects of the general circulation. Very often, applications have focused on the long-term average behaviour of the most important large-scale and hemispheric modes of atmospheric variability (Latif et al. 2001; Cohen et al. 2005; Lucarini et al. 2007). However, the mid-latitude atmospheric circulation is also influenced by transient synoptic-scale systems and persistent disturbances locked in geographically recurrent regions. The analysis of these small-to-large scale phenomena on a daily basis constitutes a more stringent test of GCMs, since they reflect the day-to-day behaviour of the model and result from interaction processes covering a wide range of scales, some of them at the limits of the model resolution. Thus, the diagnosis of weather systems in GCMs provides a powerful tool for: (1) model validation and improvement (e.g. D’Andrea et al. 1998; Ulbrich et al. 2009); (2) investigating the dynamics of the diagnosed process or its sensitivity to different specifications of the model (horizontal resolution, Sea Surface Temperatures, SSTs, e.g. Tibaldi et al. 1997) and, if the representation of the diagnosed phenomena in the GCMs is reasonable, (3) examining the response to future (e.g. Sausen et al. 1995; Ulbrich et al. 2009) and/or (4) past changes in external forcing (e.g. Fischer-Bruns et al. 2005; Raible et al. 2007).

Due to the large size of data sets and the need for objective detection of these weather systems, automatic routines have become a common tool in data analysis. Among these systems, atmospheric blocking has been a recurrent topic in both numerical weather prediction models (WPM, Tibaldi and Molteni 1990; Anderson 1993; Tibaldi et al. 1994) and GCMs (Sausen et al. 1995; Tibaldi et al. 1997; D’Andrea et al. 1998). A common finding is a generalized underestimation of the observed blocking frequency owing to inherent problems in simulating blocking onset (i.e. the transition from a zonal to a blocked flow) and persistence (Tibaldi and Molteni 1990; Anderson 1993; Tibaldi et al. 1994, 1997; Nutter et al. 1998). In recent years, ensemble realizations that take into account some uncertainties in initial conditions and in model formulation have improved the simulation of atmospheric blocking in the context of medium-range weather forecasting (Watson and Colucci 2002; Pelly and Hoskins 2003a), whereas GCM simulations have shown only modest improvements (Randall et al. 2007).

Low spatial resolution or problems in the model formulation of certain physical parametrizations (usually related to small-scale processes) have been widely acknowledged as important limiting factors for a proper simulation of important extratropical weather systems (e.g. Tibaldi et al. 1997; Bengtsson et al. 2006), including blocking. Nevertheless, an additional error source may also arise when automatic algorithms are applied to GCMs without an adequate adaptation of the detection scheme to the climate of the model. Thus, thresholds that have been adjusted from observational evidence can be highly inappropriate when applied directly to GCM output (e.g. Doblas-Reyes et al. 2002; Walsh et al. 2007). Additional drawbacks appear when the method relies on a priori or subjective criteria that work in the real world but may not apply straightforwardly to the specific climate of GCMs. These caveats are clearly evident in the case of blocking. The relatively low number of studies addressing blocking in GCMs is explained by the complexity of its diagnosis, the lack of agreement among researchers on a unified blocking definition and the fact that not all blocking indices can be directly applied to modelled climates, since their automatic computation relies on the specification of critical parameters that cannot be extrapolated to the climate of GCMs (see the companion Paper I for further details).

In Paper I, a novel automatic method to diagnose atmospheric blocking was designed and applied to the Northern Hemisphere (NH) using reanalysis data for the 1950–1989 period. The main assets of the novel blocking detection method are its “blended” approach, which reconciles the two most widely used blocking indices, and its applicability to GCMs. This companion paper addresses the application of the automatic blocking method described in Paper I to the specific climate simulated by the ECHO-G Atmosphere–Ocean General Circulation Model (AOGCM, Legutke and Voss 1999) and its comparison with the NCEP/NCAR reanalysis (Kalnay et al. 1996). The ultimate goal is to evaluate the blocking behaviour under different specifications of external forcing. However, for the projections of future climate to be credible, it is important that the main observed characteristics of the phenomena under analysis (spatial pattern, frequency distribution, persistence, etc.) be simulated with a reasonable level of accuracy under present-day conditions.

The objectives of this paper (Paper II hereafter) are: (1) to examine the ability of the ECHO-G model to reproduce the observational features of NH blocking as a case example in which observation-based parameters are adapted to a different climate; (2) to use the automatic method as a tool for model validation by comparing the model behaviour with observational results and identifying possible causes of model failure; (3) to estimate the magnitude of the model error attributable to different choices of thresholds (such as those that would result from observational parameters) in order to stress the need to adapt automatic methods to the climate of the GCMs if a truly objective diagnosis is to be performed.

This paper is organized as follows: the following section presents data sources and summarizes the blocking detection method, fully described in Paper I. Section 3 focuses on the model performance. Critical parameters and blocking climatologies are first compared with observations. Secondly, some possible sources of model failure are addressed and compared to those resulting from inappropriate adaptations of the automatic method. Finally, the last section provides some concluding remarks.

2 Data and methodology

Daily geopotential height (Z500) and monthly zonal wind fields at 500 hPa (U500) for the whole NH are employed in this study. Data from the NCEP-NCAR reanalysis for the 1950–1989 period are used as reference observations (OBS), bearing in mind that they represent a consistent model-based assimilation of meteorological information rather than real observations. For the sake of comparison, reanalysis data (at 2.5° × 2.5°) have been interpolated to the same resolution as the model (ca. 3.75° × 3.75°), although the blocking algorithm is applicable to data of different resolutions (see Paper I). Modelled data have been extracted from the ECHO-G AOGCM. The atmospheric component is the ECHAM4 (Roeckner et al. 1996) used with a T30 horizontal resolution (ca. 3.75°) and 19 hybrid sigma-pressure vertical levels, five of them located above 200 hPa and the highest being at 10 hPa. A land surface scheme comprises a soil model, hydrology, snow cover physics and vegetation effects. The ocean model component (HOPE-G) includes a Gaussian T42 grid (ca. 2.8°) with a gradual meridional refinement reaching 0.5° at the Equator (Wolff et al. 1997). A dynamic-thermodynamic sea-ice model is also included in the ocean code. Both models are coupled with the OASIS (Terray et al. 1998) software by exchanging mean atmospheric momentum, heat and freshwater fluxes as well as surface conditions (SSTs, sea-ice concentration and thickness, snow depth). In order to avoid climate drift, heat and freshwater flux adjustments are applied to the ocean. The flux adjustments are constant in time through the integration and their global contribution is zero.

Three model experiments are used in this study: a 1,000-year-long control simulation (hereafter CTL, Zorita et al. 2003) with external forcings fixed to the present climate values for the three main greenhouse gases, CO2, CH4, and N2O (353 ppmv, 1,720 ppbv and 310 ppbv, respectively) and two forced simulations (FOR1 and FOR2) for the period 1000 to 1990. These forced simulations differ only in their initial conditions and are driven with estimates of external forcing factors such as atmospheric greenhouse gas concentrations (GHG), solar irradiance and volcanic activity (imposed as significant reductions in the solar constant). Sulphate aerosols or vegetation changes are not included in these simulations. The original source for the forcing specifications used in these simulations is Crowley (2000). A more in-depth description of the forcing and the simulations, as well as information about previous analyses made on them, can be found in González-Rouco et al. (2009).

As this paper addresses blocking features derived from the objective application of the blocking detection method described in Paper I to the specific climate simulated by ECHO-G and its comparison with reanalysis, the work described herein will focus on the last 40-year period (1950–1989) of both forced simulations. That time interval is the most comparable to the reanalysis data from the point of view of the external forcing factors used in those simulations. Since the two forced simulations can be viewed as two realizations of the same climate state with different initial conditions, results herein will be provided as the average of the statistics derived from the two forced runs (labelled as FOR). Conversely, the 1,000-year control run has been sliced into 25 consecutive periods, each 40 years long. Since the forcing is constant in time in the CTL simulation, and considering the short time scales of atmospheric dependence on the initial state (i.e. a few months), these 25 parts can be treated as an ensemble in which each temporal slice can be thought of as a different control run with different initial conditions, and therefore independent of the other members of the ensemble. This allows establishing a blocking climatology that describes its spatial and temporal variability only as a function of the internal variability in the model and that serves as a reference for the results obtained from both forced runs.
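The slicing of the control run into an ensemble can be illustrated with a minimal sketch. It assumes a one-value-per-year series and uses the spread of the slice means as the measure of internal variability. The function names and the synthetic input are hypothetical and are not taken from the paper or from ECHO-G output.

```python
from statistics import mean, stdev


def slice_control_run(annual_series, slice_len=40):
    """Split a long annual series into consecutive, non-overlapping slices."""
    n_slices = len(annual_series) // slice_len
    return [annual_series[k * slice_len:(k + 1) * slice_len]
            for k in range(n_slices)]


def ensemble_envelope(slices, n_sigma=2.0):
    """Return (centre, lower, upper) from the mean and spread of slice means."""
    slice_means = [mean(s) for s in slices]
    centre = mean(slice_means)
    spread = stdev(slice_means)
    return centre, centre - n_sigma * spread, centre + n_sigma * spread


# Synthetic placeholder for 1,000 annual values (NOT model output).
ctl = [10.0 + ((7 * year) % 13 - 6) * 0.5 for year in range(1000)]
members = slice_control_run(ctl)            # 25 slices of 40 years each
centre, low, high = ensemble_envelope(members)
```

A statistic from a forced run that falls outside the `low` to `high` range would then be a candidate for a response beyond internal variability, under the independence assumption described above.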

The identification of blocking is fully described in Paper I. It is based on a combined approach of absolute and anomaly daily Z500 fields that provides a complementary perspective of blocking by merging the traditional blocking indices of Tibaldi and Molteni (1990, hereafter TM) and Dole and Gordon (1983, hereafter DG). The anomaly field is computed by removing a running annual mean and the seasonal cycle as in Sausen et al. (1995, hereafter SKS) but taking into account long-term changes that may occur in the seasonal cycle. Daily blocks are identified as contiguous 2-D spatial signatures with anomalies above a given threshold ($z_a$) associated with meridional Z500 gradient reversals (i.e. easterly winds) around a reference latitude (ϕc) representative of the westerly jet stream. Thresholds are classified as critical or secondary depending on the sensitivity of the method to changing cut-off values. The anomaly threshold and the reference latitude for blocking occurrence are considered critical parameters and they are climatology-dependent, i.e. their values are calibrated to the specific characteristics of the observational or simulated climate under study. The reference latitude is identified for each longitude as the latitude with maximum variance in 5-day high-pass Z500 filter outputs. The anomaly threshold is derived from the one-standard deviation level of the daily Z500 anomalies for those grid points lying north of the reference latitude. Both parameters are allowed to vary from month to month to accommodate the seasonal cycle. The reference latitude also accounts for long-term shifts that may occur in the location of the jet stream. Additional secondary criteria such as a minimum 2-D extension, some fraction of overlapping between successive daily blocks and persistence are also required to account for the typical scales of the block and its spatio-temporal evolution. Cut-off values are set to 2 × 10⁶ km², 50% and 4 days, respectively, for both observations and model simulations, since the specific setting of these thresholds is not critical for the method (see Paper I for further details).
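The two critical parameters can be sketched for a single longitude as follows. This is an illustration under simplifying assumptions, not the implementation of Paper I: the input is assumed to be already high-pass filtered (for the reference latitude) or already converted to anomalies (for the threshold), "north of" is taken as strictly greater latitude, and the one-standard-deviation level is computed about the pooled mean. The function names and the toy data are hypothetical.

```python
from statistics import pstdev, pvariance


def reference_latitude(highpass_by_lat):
    """Latitude whose high-pass-filtered Z500 series has the largest variance.

    highpass_by_lat maps latitude (degrees N) to a list of filtered values
    at one longitude.
    """
    return max(highpass_by_lat, key=lambda lat: pvariance(highpass_by_lat[lat]))


def anomaly_threshold(anomalies_by_lat, ref_lat):
    """One-standard-deviation level of the anomalies north of ref_lat."""
    pooled = [value
              for lat, series in anomalies_by_lat.items() if lat > ref_lat
              for value in series]
    return pstdev(pooled)


# Toy values in gpm, chosen only to make the arithmetic easy to follow.
highpass = {40.0: [1, -1, 1, -1], 50.0: [5, -5, 5, -5], 60.0: [2, -2, 2, -2]}
anomalies = {50.0: [30, -30, 30, -30],
             60.0: [100, -100, 100, -100],
             70.0: [50, -50, 50, -50]}

phi_c = reference_latitude(highpass)       # latitude of maximum variance
z_a = anomaly_threshold(anomalies, phi_c)  # threshold from points north of phi_c
```

Because both quantities are computed from the data set itself, the same two functions would yield different values for reanalysis and for model output, which is the sense in which the parameters are climatology-dependent.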

3 Model performance

In this section, the critical parameters needed for blocking detection are derived for the ECHO-G model experiments and compared with the observational ones. Then, the model performance is assessed through the examination of blocking activity from both a grid-point description and an event-based approach. Because of the generally high resemblance in model performance between the CTL and FOR experiments, the following sections will mainly describe results derived for FOR, unless stated otherwise.

3.1 Parameters

Figure 1 compares the critical parameters obtained for OBS and FOR over the common period 1950–1989. The climatological reference latitudes (averaged over the whole period) reveal similar spatio-temporal variability, with locations further north over the Atlantic than over the Pacific and a northward shift in summer (Fig. 1a, b). The model realistically simulates interseasonal variability and captures the regions with maximum deviations in the jet stream position, in spite of a southward shift from the eastern Atlantic to Eurasia and a general underestimation of variability in Eurasia and North America. Figure 1c shows the 1950–1989 daily mean of Z500 anomaly distributions for OBS and FOR. They share the Gaussian shape, although model anomalies are less widely spread than those in observations. This model behaviour is found throughout the whole year. As a consequence, monthly anomaly thresholds employed for blocking detection (i.e. the corresponding standard deviation levels from the monthly distributions) are lower in the model than in the reanalysis.

Fig. 1

Critical parameters. a OBS and b FOR longitudinal distribution of the 1950–1989 annual averaged reference latitude, ϕc (solid line). Dashed (dotted) lines represent the corresponding reference latitudes for July–August–September (January–February–March). Light (dark) shaded areas indicate the ±2σ level of the monthly (annual mean) time series. Grey solid line in b indicates the ensemble mean of the annual reference latitudes for CTL. Highlighted areas at the bottom of panel b show longitudes where the monthly series in FOR deviate more than ±2σ from the OBS mean distribution; c frequency distribution histogram of daily Z500 anomalies for the period 1950–1989 and for all grid points north of the reference latitude ϕc. Solid/dashed line corresponds to OBS/FOR. The solid/dashed vertical line indicates the annual mean anomaly threshold $z_a$ in OBS/FOR estimated from the 1σ level of the total distribution. Shading around FOR values represents the corresponding values for the 25 realizations of the CTL simulation

As these objective parameters are specific to the data set employed, they take into account possible model biases instead of assuming that the simulations convey the same climate as the reanalysis. Therefore, the differences obtained between observed and simulated parameters can be regarded as a validation test of the model, being useful indicators of the model performance. For example, a comparative analysis of latitudinal reference distributions shows a poor model performance in the location of the Eurasian jet stream, which is shifted south as compared to observations, especially over the European continent. This should have an effect on blocking features over this sector. A comparison of anomaly thresholds also allows for testing the skill of the model. Differences between FOR and OBS monthly anomaly thresholds peak in winter (November to January) and summer (June to August) months, pointing to maximum model biases in these seasons (not shown). Focusing on the range of anomalies above the adopted limits for blocking occurrence, the strongest underestimation in the modelled Gaussian distribution occurs in the range of 150–250 gpm (Fig. 1c). Blocks with anomalies of that magnitude are more frequent during winter and spring, particularly in the Euro-Atlantic sector (e.g. Diao et al. 2006). Therefore, a model underestimation of the frequency and/or persistence of this subset of blocking episodes could be expected.

Finally, the generally good agreement between critical parameters derived from FOR and CTL should be stressed. Only small differences between the reference latitudes over the Euro-Atlantic and western Pacific sectors are worth mentioning (Fig. 1b). These results suggest that the essentially different critical parameters in FOR and OBS are related to the model performance rather than to changes in the forcing.

3.2 Blocking distribution

The 2-D geographical distribution of annual blocking frequency (in percentage of days) is displayed in Fig. 2a for OBS (solid lines) and FOR (shaded areas). Preferred regions for blocking occurrence (i.e. western Pacific and Euro-Atlantic sectors) are well captured by the model. The model also does a reasonable job in reproducing the amplitude of the Pacific maximum, but there is a considerable underestimation over the Euro-Atlantic sector (statistically significant at p < 0.01 after a Student’s t test applied to the 40-year mean annual series). Note that the 2-D distribution of blocking includes all the grid points embedded in the anomaly pattern and hence the same block is counted at different grid points. Thus, the model failure may arise from different errors, namely its inability to simulate: (1) meridional reversals in the absolute flow, (2) blocking persistence, (3) blocking extension or (4) most probably, a combination of these.

Fig. 2

2-D blocking distribution. Climatological (1950–1989) annual mean blocking frequency (in percentage of annual days) as derived from: a the proposed blocking index; b a modified version of the DG blocking index. Solid lines (shaded areas) indicate the frequency in OBS (FOR). Thick solid lines in the OBS field indicate the minimum contour with all embedded grid points showing significant differences between OBS and FOR at p < 0.01 after a two-tailed Student’s t test

In order to address model performance in reproducing blocks with the right amplitude and/or extension, a comparative climatological analysis has been performed by identifying blocks from an anomaly-only blocking index. The aim is to assess how much of the model failure can be attributed to the anomaly field after removing detection criteria based on the total flow. For this purpose, a modified version of the DG blocking index has been applied by running the same code but without demanding a meridional height reversal in the total flow. Thus, blocks are only identified as 2-D persistent anomalies, regardless of the absolute flow. The analysis (Fig. 2b) reveals that, in this case, the frequency of DG blocks is better simulated by the model, as evidenced by the lower underestimations (around 10% of blocking reduction) found in DG than those derived from our index (about 30%). The Pacific maximum is fairly realistic in the model, whereas Euro-Atlantic occurrence shows a modest (only locally significant) underestimation as compared to observations. Two conclusions can be drawn: (1) most of the model underestimation of Euro-Atlantic blocking activity with our method derives from the model's inability to simulate height reversals over that area; (2) a blocking index based exclusively on anomaly fields, like the DG index, does not realistically capture the lack of blocks in the model resulting from our method.

To further quantify the model skill in simulating zonal disruptions, Fig. 3a compares the 1-D (zonal) frequency of blocks in OBS and FOR, computed as the number of days of the year (in percentage) when a given longitude was blocked (i.e. a meridional height reversal was detected at that longitude together with a 2-D blocking event anomaly). The internal variability in the model (superimposed in shaded grey) has also been estimated as plus and minus two standard deviations relative to CTL. The model captures the main sectors of blocking activity, but the observed frequency deviates more than 2σ above the simulated one over a sector spanning from the eastern Atlantic to eastern Europe, confirming the model's inherent inability to simulate the observed frequency and/or persistence of blocks. In order to better confine the error signal, the longitude-time monthly difference of blocking frequency between OBS and FOR is displayed in the bottom plot of Fig. 3a. Most of the blocking underestimation spreads along the eastern Atlantic and most of Europe in cold seasons and, to a lesser extent, over central Eurasia in warmer months.

Fig. 3

1-D blocking distribution. Annual mean frequency of blocked days (in percentage to the total days) as a function of the longitude. a Black solid (grey dashed) lines represent the blocking distribution in OBS (FOR). Grey shaded areas indicate the ±2σ level from the ensemble of 25 members of CTL. Highlighted areas at the bottom of panel a show longitudes where the annual mean series in OBS deviate more than ±2σ from the FOR mean distribution; b as a but for OBS (solid grey line), FOR (solid black line), FOR using the OBS reference latitude (JET) and FOR after correcting by the time-mean model bias (TMB). The bottom graphic displays the longitude-time Hovmöller diagram with the monthly blocking frequency difference between: a OBS-FOR, b FOR-JET (vertical lines) and FOR-TMB (horizontal lines). Only positive differences are shown

Regarding the performance of the different experiments, the FOR simulation deviates up to (and even beyond) the limit of the estimated range of internal variability in the model (CTL) in several regions of the NH, including the western Pacific and Europe. This suggests that, in spite of the similarity of blocking critical parameters obtained for FOR and CTL (Fig. 1), blocking activity may respond to changes in the external forcing within the analysed period. These results are encouraging for future analysis of blocking under different specifications of external forcing. Figure 3b describes the results of applying various methodological variants to the same analysis as in Fig. 3a. These will be discussed in Sect. 4.2, within the context of methodological errors.

3.3 Annual cycle

In what follows, our attention will turn to a blocking event description in order to assess the model ability to simulate the annual cycle of event-related parameters. The annual cycle has been obtained by computing the monthly time series of blocked days (i.e. the number of days in which a blocking event was detected anywhere), blocking events and mean event durations (Fig. 4). In order to strengthen the seasonal cycle and avoid sampling problems, monthly variability has been removed by applying a 3-month running average (a given month actually representing the seasonal mean centred in that month).
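The 3-month smoothing can be sketched as a centred running mean over a 12-value climatological cycle. The wrap-around from December to January is an assumption made here for a long-term annual cycle; the function name and the input values are hypothetical.

```python
def centred_three_month_mean(monthly):
    """Circular 3-month running mean: each month becomes the mean of itself
    and its two neighbours, with December and January treated as adjacent."""
    n = len(monthly)
    return [(monthly[(m - 1) % n] + monthly[m] + monthly[(m + 1) % n]) / 3.0
            for m in range(n)]


# Placeholder values for January to December (not results from the paper).
raw_cycle = [float(v) for v in range(1, 13)]
smooth_cycle = centred_three_month_mean(raw_cycle)
```

With this convention the value assigned to January is the December–January–February mean, which matches the description of each month representing the seasonal mean centred on it.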

Fig. 4

Annual cycle of blocking event parameters. Long-term annual cycle of: a blocked days; b blocking events; c blocking duration (in days). Monthly values are actually computed as the 3-month average centred in that month. Bar (solid line) refers to OBS (FOR). Shaded areas indicate the ±2σ level from the ensemble of 25 members of CTL. Bottom panels show the OBS–FOR difference (solid line) in percentage and the corresponding error from the ±2σ level of the mean ensemble of CTL (shading areas). Black dots indicate significant differences between OBS and FOR at p < 0.01 after a two-tailed t test. Two annual cycles are shown for better visualization

Focusing on the frequency of blocked days (Fig. 4a), the model provides a realistic simulation of the shape of the annual cycle, as well as a satisfactory amplitude, despite a general frequency underestimation that becomes larger in late spring-early summer and early-to-mid winter (see error values in the lower panel). An appreciable shift in time is seen in FOR, which peaks later in the year than in OBS. This phase shift is more evident in CTL. The annual cycle of blocking events (Fig. 4b) reflects similar behaviour, with an earlier (later) phase location of the annual maximum (minimum) that results in an unlocked seasonal cycle. The model error peaks in late spring and early winter, coinciding with the results of Fig. 4a. The magnitude of the error is also similar to that of Fig. 4a, ranging from 15 to 30% in all months. Finally, Fig. 4c shows the annual cycle of average blocking event persistence. The model has a recognisable seasonal cycle in good correspondence with observations, but simulates shorter blocking episodes at almost any time. The modelled phase of the annual cycle also agrees with observations, although CTL again tends to peak later in the year. The most remarkable difference is an amplitude reduction of the seasonal cycle in the model as a result of a poor reproduction of the observed blocking persistence at the annual peaks (maximum and minimum).

Some caution is required here since time series of blocking events and durations are highly prone to suffer from sampling problems due to the low number of blocking episodes involved in the analysis. Results from the CTL simulation indicate that the phase shifts of the annual cycles in FOR are within the internal variability of the model, and hence, they may be attributed to sampling. However, the CTL simulation still supports the occurrence of two periods with maximum model underestimation in early winter and summer. These periods and the blocking sectors contributing to the model error will be further discussed in the following sections.

3.4 Assessment of errors

Results from the previous section suggest that the model underestimation of blocked days in cold and warm months may arise from its inability to simulate blocking events with the observed persistence. Figure 5 compares duration frequency distributions in OBS and the model. Events with durations between 1 and 3 days, as derived from the detection method, are also included, although they do not represent blocking episodes in a strict sense. The shape of the modelled distribution resembles the observed one (Fig. 5a), except for a general underestimation above ~7 days. As long-lasting blocks are more likely to occur in winter and over Europe, they again arise as the strongest candidates to show the poorest realism in the model.

Fig. 5

Duration criterion. a Normalized distribution of events with durations equal to or higher than the given bin in OBS (dark grey) and FOR (light grey). The number of blocking episodes of a given bin is normalized by the total number of events. Dashed lines indicate an exponential fit. b as a but in a semi-logarithmic scale plot. The slopes of the linear regressions (dashed lines) for the OBS blocking event distribution above and below the duration criterion are denoted as $t_0$ and $t_1$, respectively. The corresponding values in FOR are shown in brackets with an estimated range of error inferred from the ensemble of CTL simulations

The lower persistence of modelled blocks is better seen in Fig. 5b. When attention focuses on the typical range of blocking durations (above 4 days), both distributions reveal the well-known temporal scale of about 4 days. However, the underestimation of medium-to-long lasting blocking episodes in the model causes a steeper slope. This suggests that, in spite of their similar time scales, modelled blocks have a less persistent nature (i.e. faster decaying time scales) than in observations. The corresponding fit for episodes lasting less than 4 days is also shown. Their characteristic time scale is about half that for blocking events in both OBS and FOR. The change in the slope of the duration distribution means that a higher proportion of events last at least another day once they have reached ~4 days. This behaviour has been interpreted as a distinctive feature of blocking persistence (Pelly and Hoskins 2003b) and it is well reproduced by the model.
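The kind of fit shown in the semi-logarithmic plot can be sketched as an ordinary least-squares regression of the logarithm of the exceedance fraction on duration, with the characteristic time scale taken as minus the inverse of the slope. This is an assumed reading of the fit, not the authors' code. The function names are hypothetical, and the synthetic distribution is built with time scales of 2 and 4 days purely to mimic the two regimes described above.

```python
import math


def exceedance_fraction(durations, max_days):
    """Fraction of events lasting at least d days, for d = 1 .. max_days."""
    total = len(durations)
    return {d: sum(1 for x in durations if x >= d) / total
            for d in range(1, max_days + 1)}


def efolding_time(fraction_by_day, first_day, last_day):
    """Least-squares slope of ln(fraction) against duration, returned as -1/slope."""
    days = [d for d in range(first_day, last_day + 1) if fraction_by_day[d] > 0]
    logs = [math.log(fraction_by_day[d]) for d in days]
    mean_d = sum(days) / len(days)
    mean_l = sum(logs) / len(logs)
    covariance = sum((d - mean_d) * (l - mean_l) for d, l in zip(days, logs))
    variance = sum((d - mean_d) ** 2 for d in days)
    return -variance / covariance


# Synthetic two-regime distribution (NOT data from the paper).
synthetic = {d: math.exp(-d / 2.0) if d <= 4
             else math.exp(-2.0) * math.exp(-(d - 4) / 4.0)
             for d in range(1, 13)}
short_scale = efolding_time(synthetic, 1, 4)   # regime below the 4-day criterion
long_scale = efolding_time(synthetic, 4, 12)   # regime above the 4-day criterion
```

A steeper slope corresponds to a smaller value of the returned time scale, so a less persistent model would show a smaller `long_scale` than the observations when both are fitted over the same range.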

A more detailed analysis is conducted in Fig. 6 to quantify the relative contribution of model errors in blocking events and durations to the total number of blocked days. The approach is based on the fact that $N = \sum_i n_i d_i - N_s$, where $N$ is the number of blocked days, $n_i$ is the number of blocking events of a given duration $d_i$ and $N_s$ is the number of days with simultaneous blocks. Although the simultaneous occurrence of blocking is an observed fact (e.g. Lupo 1997; Woollings and Hoskins 2008), it can be assumed that $N_s \ll \sum_i n_i d_i$ (Lejenäs and Økland 1983; Diao et al. 2006; Tyrlis and Hoskins 2008), thus simplifying the relationship to $N = \sum_i n_i d_i$.

Fig. 6

Time-duration distribution of characteristic blocking event parameters. Frequency histogram (in percentage) of blocking parameters against month and duration (in days) for: a blocking event frequency in OBS; b blocking event contribution to the number of blocked days in OBS (i.e. the product of event frequency by its duration); c as a but in FOR; d as b but in FOR; e the difference between a and c; f the difference between b and d. For better visualization negative differences in e and f are omitted and the ordinate axis is reversed as compared to that in panels from a to d. Upper plot in e and f shows the OBS-FOR difference in 2-D, with shading indicating positive differences (increasing from light to dark). Monthly frequencies actually represent the 3-month mean centred in that month. The frequency of each bin of duration in a and b has been normalized by the total frequency of blocking events in the given month. Units are percentage relative to that month

Figure 6a shows the frequency distribution of blocking events in OBS as a function of their duration and month. Monthly blocking frequencies are computed for a 3-month period centred on that month. Each duration bin has been normalized by the total number of blocking events in the given month and expressed as a percentage for easier interpretation. The exponential shape of the distribution is appreciable throughout the whole year, but it is dominated by short-lived blocks (usually shorter than 15 days) in warm months and by a wider range of durations in cold months. The relative contribution of blocking events with a given duration to the total number of blocked days in a given month (i.e. the ratio $100\,n_i d_i/N$) is shown in Fig. 6b. The exponential shape in the blocked days domain is more spread out than that of the blocking events, since the former is weighted by the duration $d_i$. Thus, for example, 5-day blocking events represent 25% of episodes in summer, but they contribute only about 15% of the number of blocked days. Conversely, long-lasting blocks, typically observed in winter, contribute almost as much to the number of blocked days as short-lived blocks (note that one blocking event of 15 days is equivalent to 3 blocking events of 5 days in terms of blocked days). From this analysis it is concluded that a proper simulation of short-to-medium-lived blocks is critical to realistically reproduce the distribution of blocking activity during the warm season. The situation is different in colder seasons where, due to a wider dispersion of blocking events through the whole spectrum of durations, a realistic representation of both short and long blocks is required to minimize model errors.
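The weighting by duration can be made concrete with a small sketch of the blocked-day budget, using the relation between blocked days, event counts and durations stated earlier and neglecting simultaneous blocks by default. The function names and the event counts are hypothetical; the toy case simply reproduces the remark that one 15-day event equals three 5-day events in terms of blocked days.

```python
def blocked_days(events_by_duration, simultaneous_days=0):
    """Blocked days N as the sum of n_i * d_i minus N_s.

    events_by_duration maps duration d_i (days) to the event count n_i.
    """
    total = sum(n * d for d, n in events_by_duration.items())
    return total - simultaneous_days


def event_and_day_shares(events_by_duration):
    """Per duration: (percentage of events, percentage of blocked days)."""
    total_events = sum(events_by_duration.values())
    total_days = blocked_days(events_by_duration)
    return {d: (100.0 * n / total_events, 100.0 * n * d / total_days)
            for d, n in events_by_duration.items()}


# Toy month with three 5-day events and one 15-day event (illustrative only).
toy_month = {5: 3, 15: 1}
n_days = blocked_days(toy_month)
shares = event_and_day_shares(toy_month)
```

In this toy case the single long event accounts for a quarter of the events but half of the blocked days, which is why errors in long-lasting blocks weigh heavily on the blocked-day statistics.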

The corresponding figures in FOR are shown in Fig. 6c and d, respectively. The exponential shape of both distributions is more evident than in OBS, confirming that the model fails to reproduce long-lasting blocking episodes with the observed frequency. The OBS minus FOR difference of the blocking event distribution and its relative contribution to the number of blocked days are shown in Fig. 6e and f, respectively. For simplicity, negative values are omitted (white boxes). In warm months, the poorest model skill lies in medium-age blocking events, a range of blocking durations with appreciable contributions to the number of blocked days and events; hence, its underestimation in the model brings a simultaneous reduction in blocked days and blocking persistence (Fig. 4a, c). In cold months, the model bias in blocking events is caused by a smaller but more sustained underestimation of blocking events with durations above ~7 days. Within this range, longer-lasting blocking events contribute more to the number of blocked days than shorter blocks; hence, the suppression of the longest-lasting blocking events is mainly responsible for the model underestimation of winter blocked days. Thus, in terms of blocked days, blocking underestimation in the model peaks in winter for very persistent blocks and in summer for medium-age blocks (Fig. 6f). Both periods reflect the two local maxima of model error in the annual cycle of blocked days (see Fig. 4a).

Summarizing, the underestimation of blocked days through the whole year seems to be due to a sustained underestimation of blocking persistence. Local departures from this general reduction peak in late spring–early summer and in early winter, the former arising from a model failure in the simulation of medium-age blocking events and the latter being due to fewer persistent blocking episodes. This model inability is expected to impinge on regions where persistent blocking events are more prone to occur. These regions depend on the season, namely the Euro-Atlantic sector in winter and the Eurasian continent in summer (e.g. Barriopedro et al. 2006). This is supported by Fig. 3a, which reveals that most of the blocking underestimation is confined to the Euro-Atlantic sector in cold months and to the Eurasian sector in warm months.

4 Source of model errors

These preliminary results reveal a reasonable performance of the model in terms of blocking location, annual cycle and duration distribution, but with reduced blocking frequencies and a general tendency towards shorter blocks. This is a common finding in most previous blocking diagnostic studies with GCM simulations. However, in addition to the decrease in blocking frequencies and in the average duration of blocks, some of these studies also reveal deficiencies in reproducing the longitudinal and/or seasonal variability of blocking (see D’Andrea et al. 1998 for a full comparison of 15 GCMs), which points to some improvement in the latest generations of GCMs.

Attributing model biases to specific error sources is not an easy task since a large variety of underlying processes may be responsible for model failure and they cannot be assessed without numerical experimentation. However, two main error sources in blocking simulation have been identified: (1) physical parametrizations, especially related to sources and sinks of momentum fluxes (e.g. Mullen 1994) but also to diabatic heat fluxes, which are key processes to realistically reproduce blocking interactions with other forcing factors such as tropical-extratropical SSTs and snow cover (Ferranti et al. 1994; García-Herrera and Barriopedro 2006); (2) model constraints (i.e. spatial resolution, uncertainties in initial conditions, etc., e.g. Tibaldi et al. 1997). Among these, spatial resolution has been recognised as a common cause of model failure in blocking simulations. The resulting model error has been ascribed to the lack of eddy activity in lower resolution models, which is considered an important process for blocking occurrence and maintenance via a feedback between large-scale flow and synoptic eddies that decelerate the westerlies and help to maintain the blocking flow (Shutts 1983; Hoskins et al. 1983; Colucci and Alberta 1996; Chen and Juang 1992; Lupo and Smith 1998). The effect of model resolution has also been observed in WPMs but with different sensitivity over the Atlantic and the Pacific oceans (Tibaldi et al. 1997). The different blocking response in both oceans has also been supported by observational studies, suggesting that regional blocking may be the result of different dynamical processes (D’Andrea et al. 1998; Nakamura et al. 1997).

Most of the aforementioned candidates usually produce systematic errors that lead to biased signatures in the basic flow and hence in blocking simulations (Miyakoda and Sirutis 1990; Kaas and Branstator 1993; D’Andrea et al. 1998). As the time-mean state (waves and zonal flow) and intraseasonal variability are two of the features that best characterize the atmospheric background for climatological blocking development, both will be used here to understand blocking differences between the reanalysis and the model. From now on, the 5–30 days band-pass filtered standard deviation will be referred to as intraseasonal low-frequency (ILF) variability, bearing in mind that this band actually represents the high-frequency component and not the whole spectrum of frequencies involved in the intraseasonal variance.
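As an illustration of how a band-pass filtered standard deviation of this kind can be obtained, the following sketch uses an idealized spectral filter on a daily series. The filter design is an assumption (the text does not specify the filter used), and the function name and synthetic series are hypothetical. The variance retained in the band is computed from the Fourier coefficients through Parseval's relation, so no inverse transform is needed.

```python
import cmath
import math

def bandpass_std(series, min_period=5.0, max_period=30.0):
    """Std of the part of a daily series with periods in [min_period, max_period] days."""
    n = len(series)
    variance = 0.0
    for k in range(1, (n - 1) // 2 + 1):      # positive frequencies below Nyquist
        period = n / k                        # period in days for daily sampling
        if min_period <= period <= max_period:
            c = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series)) / n
            variance += 2.0 * abs(c) ** 2     # +k and -k contribute equally
    return math.sqrt(variance)

# Synthetic Z500-like anomalies (gpm) with periods of 10, 60 and 3 days.
n_days = 120
series = [50.0 * math.sin(2 * math.pi * t / 10)
          + 100.0 * math.sin(2 * math.pi * t / 60)
          + 20.0 * math.sin(2 * math.pi * t / 3)
          for t in range(n_days)]
ilf_std = bandpass_std(series)                # only the 10-day wave is retained
print(round(ilf_std, 2))                      # 35.36, i.e. 50 / sqrt(2)
```

An operational analysis would more likely apply a tapered digital filter to limit spectral leakage; the sharp cut-off here is only meant to make the definition of the band explicit.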

4.1 Time-mean bias

In this section, the analysis will focus on the 3-month periods that have shown the worst model performance in blocking activity, namely November–December–January (NDJ) and May–June–July (MJJ), which, by extension, will be referred to as the cold and the warm season, respectively. To explore the ability of the model to capture the amplitude and phase of the planetary waves, a Fourier decomposition of the Z500 field in the zonal direction is performed for each day, and these fields are then averaged for the corresponding seasons. Only planetary waves with zonal wave numbers 1 to 5 are considered. Figure 7a and b show the composites of the wave components with the poorest model performance for the cold and the warm season, which correspond to wave numbers 2 and 3, respectively. The wave number 2 pattern in winter exhibits climatological ridges over Europe and the eastern Pacific. In the warm season, the positive loading centres for wave number 3 are placed over Europe, North America and eastern Asia. The model bias is computed as the difference between FOR and OBS (lines), and its significance is assessed with a two-tailed Student’s t test. The model reproduces the phase of the waves with reasonable realism, whereas wave amplitudes are significantly underestimated. Model errors peak over some of the main areas of blocking occurrence, particularly the Euro-Atlantic sector, which is affected by both wave numbers. This is supported by Fig. 3a, where the model errors in regional blocking frequency mirror those of the wave decomposition analysis. Wave components of higher order tend to be zonally out of phase in the model (not shown).
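The zonal Fourier decomposition can be sketched for a single latitude circle as follows. This is a minimal illustration under the assumption of equally spaced longitudes; the function name and the synthetic field are hypothetical, and the daily wave fields would still have to be averaged over each season as described above.

```python
import cmath
import math

def zonal_wave_component(z_lat_circle, k):
    """Amplitude and reconstructed field of zonal wave number k (0 < k < n/2)."""
    n = len(z_lat_circle)
    # Complex Fourier coefficient of wave number k along the latitude circle.
    c = sum(z * cmath.exp(-2j * math.pi * k * j / n)
            for j, z in enumerate(z_lat_circle)) / n
    amplitude = 2.0 * abs(c)
    # Real-valued wave-k contribution at each longitude.
    wave = [2.0 * (c * cmath.exp(2j * math.pi * k * j / n)).real
            for j in range(n)]
    return amplitude, wave

# Synthetic Z500 (gpm) on 72 longitudes: zonal mean plus wave numbers 2 and 3.
n_lon = 72
field = [5500.0
         + 80.0 * math.cos(2 * 2 * math.pi * j / n_lon)
         + 30.0 * math.cos(3 * 2 * math.pi * j / n_lon - 1.0)
         for j in range(n_lon)]
amp2, wave2 = zonal_wave_component(field, 2)
amp3, wave3 = zonal_wave_component(field, 3)
```

With this synthetic input the recovered amplitudes are those prescribed for wave numbers 2 and 3, and the phase of each wave follows from the argument of the complex coefficient.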

Fig. 7

Time-mean model bias. a Cold season Z500 wave number 2 composite (in gpm); b as a but for the warm season Z500 wave number 3; c U500 (in m s−1) in cold season; d as c but for warm season. Shaded areas (lines) refer to OBS (FOR–OBS difference). Only differences significant at the p < 0.1 level are shown

These results suggest that modelled waves are of either lower frequency or amplitude than in observations over key regions for blocking occurrence. This is a major cause of blocking underestimation in models, as supported by some dynamical theories that consider blocking as the result of either the resonant interaction of planetary waves or stationary and travelling waves being in phase (Egger 1978; Austin 1980; Nigam and Lindzen 1989; Lejenäs and Madden 1992). In particular, wave numbers 1, 2 and 3 are usually involved in blocking processes. Austin (1980) demonstrated that these waves tend to show near normal phases but enhanced amplitudes during blocking episodes. As a consequence of the model underestimation of the wave amplitude, the interaction between distinct waves may not be well represented and blocking would be systematically underestimated (or vice versa, i.e. the reduction of blocking activity in the model partially accounts for reductions in the wave amplitudes).

The reduction of wave activity in the model is expected to lead to excessive zonality in the flow. On the other hand, the error in wave number 1 (not shown), which is characterised by a moderate reduction in wave amplitude and an appreciable southward shift, is suggestive of a southward displacement of the jet stream. To support this argument, the systematic error in seasonal U500 is displayed in Figs. 7c and d (lines). There is a moderate increase of westerlies over the central-east Pacific as compared to observations (shaded areas) during the cold season. However, as the core of maximum wind anomalies occurs relatively far west and south of the blocking action centre (located over the Alaskan Peninsula, Fig. 2a), the effect is only perceived as an eastern confinement of blocks (Fig. 3a). During the warm season the Pacific error becomes less evident and, as blocking activity is not remarkable there, the impact on blocking is negligible.

Over the Euro-Atlantic sector the relationship between model systematic error and blocking is particularly dramatic. Meridionally oriented dipoles of zonal wind anomalies emerge clearly in both seasons, with the positive centres located between the observed polar and subtropical jet streams, thus revealing a poor simulation of the double jet structure and an underestimation of the diffluence pattern. As a consequence, westerlies are southward shifted. During the cold season, the overestimation of westerly winds in the model over Europe is much stronger than the underestimation obtained over the contiguous polar region. Thus, in addition to a relative southward shift of the jet stream, excessive zonal winds are also expected in the model.

A pattern like the one represented in Fig. 7c can be attributed to a mean zonal flow configuration that systematically replaces meridional reversals by westerly winds and, hence, inhibits blocking occurrence. Kaas and Branstator (1993) suggested that the stationary wave amplitude and transient variability associated with blocking are affected by the zonal wind forcing, which can be estimated as the first empirical orthogonal function (EOF) of the zonal mean zonal wind for the NH. The bias of the zonal mean zonal wind over the Euro-Atlantic sector (not shown) strongly resembles the phase of the zonal wind forcing associated with blocking suppression as described by Kaas and Branstator (1993). These results suggest that the main model error consists of excessive zonal winds (but moderately shifted southwards) over the Euro-Atlantic sector in the cold season. On the other hand, during the warm season, the model is characterised by modestly increased zonal winds that are strongly shifted southwards over Eurasia. As a consequence, the overall effect is a local suppression of winter Euro-Atlantic blocking due to the underestimation of the diffluence over the eastern Atlantic and a southward shift of summer Eurasian blocks.

4.2 Variability bias

As previously stated, intraseasonal variability provides a measure of the degree of blocking activity. Since blocking extracts part of its energy from that band of variability, an underestimation of ILF variance in the model should have an impact on blocking frequency, persistence and/or amplitude. From an alternative perspective, blocks can be viewed as important contributors to intraseasonal variability (e.g. Swanson 2002), although not all of the ILF variance is due to positive anomalies and not all positive anomalies are blocks. As such, a blocking reduction is expected to impinge on the variability of the frequency band where blocking plays a major role. Figure 8a shows the longitudinal monthly evolution of the ILF standard deviation averaged over typical blocking latitudes. The maxima in the OBS distribution (solid lines) broadly reflect some preferred regions for blocking occurrence. The model does a reasonable job in simulating the main features of the spatial distribution and its evolution but underestimates the standard deviation throughout the year, as seen from the presence of negative FOR–OBS differences (shaded areas). The overall reduction in ILF standard deviation is, at least qualitatively, in agreement with that of blocking frequency (Fig. 3a). The largest departures, of the order of 15%, occur in cold (warm) months over the Euro-Atlantic (Eurasian) sector. Note that the DG index yields underestimations in blocking frequency (~10%; Fig. 2b) similar to those of the ILF standard deviation, whereas our index reveals stronger reductions in blocking activity (~30%; Fig. 2a), since it is also affected by the bias in the time-mean.

Fig. 8

Variance model bias. a longitude-time Hovmöller diagram with the monthly standard deviation of the band-pass filtered (5–30 days) Z500 daily series for the 1950–1989 period and a latitudinal belt 30° north of the reference latitude. A set of weights proportional to the squared cosine of the latitude is applied to the standard deviation; b latitude-time diagram of the zonal mean monthly standard deviation of the high-pass filtered (<5 days) Z500 daily series for the 1950–1989 period. Black solid lines (shaded areas) refer to OBS (FOR–OBS difference). Thick white lines indicate significant differences at p < 0.1 after a Fisher’s F test of variances. Units are gpm

A similar analysis in the high-frequency (<5 days) spectrum also reveals a generalized underestimation in mid-high latitudes and moderate positive departures along mid-subtropical latitudes of Europe and central-eastern Pacific (not shown). Although the relative contribution of both errors varies through the year they are in respective agreement with excessive westerlies and southward shifts of the jet stream in the model. Figure 8b displays the time-latitude distribution of the zonal mean high-frequency standard deviation in OBS (solid line) and the model error estimated as FOR–OBS (shaded areas). The model simulates the corresponding maximum of eddy activity further south of the observed one and hence, it is responsible for the major deviations detected in the reference latitudes. In fall and winter the error pattern arises as a north–south dipole of anomalies, suggesting a moderate southward shift of the jet stream (in agreement with Fig. 7c). In warmer seasons, the reduction of eddy activity in the model is more noticeable and extends further north without a corresponding compensation at southern latitudes, thus shifting the reference latitudes accordingly.

As shown in Paper I, the reference latitudes used in the detection scheme are useful proxies for the westerlies and they can be used to estimate the skill of the model in reproducing the intensity of the jet stream. Figure 9 shows the climatological locations of the reference latitudes employed in OBS, FOR and CTL. The spatial distribution of the intensity of the jet stream has been estimated as the U500 mean around the reference latitude. The FOR–OBS differences in latitude and speed are plotted below, with the thickness and shading being proportional to the magnitude and sign of the speed difference, respectively. During the cold season the eastern Pacific ridge, which is a proxy signature of blocking location, is placed further east in the model. The model also simulates excessive wind speeds in the central Pacific, but over a region that is too small to significantly affect Pacific blocking occurrence. Thus, the simulation in the blocking area produces just a modest underestimation over its western margin. In the Atlantic Ocean, the simulated and observed crests are in phase over the Greenwich meridian and hence, longitudinal block locations are fairly well reproduced. However, the jet stream is too strong, resulting in a significant reduction of blocks over this sector (Fig. 3a). The situation is critical over Europe and western Eurasia, where both jets diverge with simultaneous wind strength departures. By comparing the cold and the warm season, it is confirmed that the southward shift in the model jet peaks in warm months, while wind differences reach their maximum in cold months (in agreement with Fig. 8b).
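The jet intensity estimate at one longitude can be sketched as a belt average of U500 around the reference latitude. The 10° belt width follows the caption of Fig. 9; the unweighted mean, the function name and the sample wind profile are assumptions made only for illustration.

```python
def belt_mean(values_by_lat, ref_lat, half_width=5.0):
    """Unweighted mean of values at latitudes within half_width of ref_lat."""
    selected = [v for lat, v in values_by_lat.items()
                if abs(lat - ref_lat) <= half_width]
    return sum(selected) / len(selected)

# Hypothetical U500 profile (m/s) at one longitude, keyed by latitude (deg N).
u500 = {40.0: 10.0, 45.0: 14.0, 50.0: 18.0, 55.0: 16.0, 60.0: 9.0}
jet_speed = belt_mean(u500, ref_lat=50.0)   # mean over 45, 50 and 55 deg N
print(jet_speed)                            # 16.0
```

Repeating the calculation at every longitude, with the OBS and FOR reference latitudes in turn, would give the speed differences that Fig. 9 represents through line thickness and shading.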

Fig. 9

Jet stream model bias. Longitudinal distribution of the long-term mean reference latitude in OBS (black line), FOR (dark grey line) and the ensemble of 25 members of CTL (light grey line) for the: a cold season; b warm season. The line thickness is proportional to the wind speed (in ms−1) averaged over a 10° latitudinal belt centred in the reference latitude. The bottom plot indicates the FOR–OBS difference in latitude. The shading (thickness) of the line is scaled to the magnitude of the FOR–OBS difference (|FOR–OBS| absolute difference) in wind speed

The seasonal dependence of these biases is better seen in Fig. 10, where the OBS and FOR reference latitudes are plotted together with an estimate of the westerly winds and the standard deviation of the high-pass-filtered Z500 field for the cold and the warm season. There are two superimposed model errors that coincide in time with the strongest underestimations in blocking activity (Figs. 3a, 4). The one related to the jet stream location may be responsible for much of the underestimation of warm-season blocks, while that due to the bias in the zonal wind intensity would dominate the blocking reduction during the cold season.

Fig. 10

Reference latitude model bias. Long-term (1950–1989) mean reference latitude for: a December; b June in OBS (dark grey thick line) and FOR (light grey thick line). Thickness is proportional to U500 averaged in a 10° latitudinal belt centred in the reference latitude. Shaded areas (solid lines) show the long-term standard deviation of the high-pass (<5 days) Z500 field for the 3-month period centred in that month in OBS (FOR). Units are gpm

4.3 Methodological bias

It could be asked how much of the blocking underestimation in the model can be attributed to specific methodological approaches in blocking detection, and what the climatology would look like if observational parameters were applied directly to the model simulations, with no adaptation of any kind. Such an exercise could be interpreted as a prospective test aiming to estimate the errors ascribed to inadequate detection schemes in the presence of model bias.

Several modifications have been applied to the original method by removing separately the bias in the two critical parameters related to the main systematic model errors, namely, the reference latitude and the excessive zonality of the flow (Fig. 3b). For comparison purposes, the OBS (solid grey line) and FOR (solid black line) distributions with their own parameters are also plotted. The correction of the reference latitude location has been performed by applying the observational reference latitudes to the model simulations (experiment JET). The resulting climatology provided blocks in better correspondence with observations, the improvement being especially prominent over Eurasia in spring and early summer (vertical lines, bottom plot), as expected from the largest departures between OBS and FOR jets (Fig. 9). On the other hand, much of the blocking underestimation can also be attributed to the presence of excessive zonal winds (time-mean bias), as evidenced by the corresponding simulations (experiment TMB) obtained by subtracting the climatological monthly Z500 error from every daily field falling in the given month. In that case, the model performance improves significantly over the Euro-Atlantic sector in cold months (Fig. 3b, horizontal lines, bottom plot). The removal of the time-mean bias also corrects for some of the blocking underestimation due to the error in the reference latitude (especially over western Europe, where both jet streams do not drift apart too much), but the model still shows poor improvement over Eurasia. The remaining error may be accounted for by additional biases in other statistics, for example the reduction of eddy activity, which tends to weaken the feedback on the amplification of stationary waves and thus leads to excessive time-mean zonal winds.
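The time-mean bias removal of experiment TMB can be sketched for a single grid point as follows. This is a minimal illustration, assuming the monthly error is the FOR minus OBS difference of climatological monthly means; the function name and all numerical values are hypothetical.

```python
def remove_time_mean_bias(model_days, model_clim, obs_clim):
    """Subtract the monthly climatological error from each daily model value.

    model_days: list of (month, z500) pairs for one grid point.
    model_clim, obs_clim: dicts mapping month to climatological mean Z500.
    """
    # Monthly systematic error, taken here as model minus observations.
    bias = {m: model_clim[m] - obs_clim[m] for m in model_clim}
    return [(m, z - bias[m]) for m, z in model_days]

# Hypothetical climatologies (gpm) for January and July at one grid point.
model_clim = {1: 5400.0, 7: 5700.0}
obs_clim = {1: 5450.0, 7: 5690.0}
days = [(1, 5380.0), (1, 5420.0), (7, 5710.0)]
corrected = remove_time_mean_bias(days, model_clim, obs_clim)
print(corrected)  # [(1, 5430.0), (1, 5470.0), (7, 5700.0)]
```

The correction shifts the mean state of each month towards the observed one while leaving the day-to-day variability of the model unchanged, which is why biases in the variance are not addressed by this experiment.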

The question that immediately arises is whether these corrections (TMB and/or JET) can be imposed on the base fields in order to raise the applicability and reliability of model simulations in past and future climates. Different studies have shown that the dynamical correction of the time-mean systematic error can effectively suppress the TM blocking frequency bias (Doblas-Reyes et al. 2002), although the reverse is not true, i.e. the removal of the bias in TM blocking activity does not substantially reduce the systematic error (D’Andrea et al. 1998). Unfortunately, similar reasoning does not hold for the model bias in the location of the reference latitude. If the observed estimate of the reference latitude is used in the model, the method no longer looks for features that block the jet stream in the model. As our particular model suffers from excessive zonal winds, more realistic values are systematically obtained after moving the reference latitude away from the westerly maximum (experiment JET). However, had the model suffered from weakened zonal winds (i.e. excessive blocking), the resulting increase of blocking activity after correcting the latitude would in fact have increased the error. The model improvement is, therefore, fictitious, since the increase of blocking frequency after removing this model bias is achieved at the expense of a weaker efficiency in the detection.

To illustrate this, a comparative analysis of modelled blocking episodes missed in FOR and detected by its corresponding JET experiment has been conducted over the region with maximum jet bias [Eurasia, EUR, (0°, 60°E)] and for the two seasons with the strongest blocking underestimation (Fig. 11). In both seasons, blocking composites resemble the standard blocking signatures, confirming that the method succeeds in identifying meridional height reversals associated with positive departures. Blocking features lie over the observed jet stream (dark grey thick line), implying a substantial perturbation in its zonality. Nevertheless, no remarkable impact on the circulation at the modelled jet latitudes is observed (light grey thick line). In fact, the core of the blocking anomalies is placed north of the Scandinavian Peninsula, which is too far north of the jet stream in FOR to block the passage of storm-tracks. As a consequence, weather systems associated with these specific signatures would be catalogued as blocks in observations, but not in the climate of the model. These results indicate that the same thresholds that work for the observations can bring undesirable results in the presence of model biases.

Fig. 11

Regional blocking signatures. Composites of Z500 (solid lines) and Z500 anomalies (shaded areas) for EUR blocked days detected by the experiment JET and missed by FOR during the: a cold season; b warm season. Shaded areas indicate positive anomalies with contour interval of 25 gpm starting at 50 gpm. The thick dark (light) grey line indicates the corresponding climatological reference latitude for JET (FOR)

From the previous discussion, different responses of the blocking index to systematic errors can be identified. Anomaly-based blocking indices such as DG and SKS are particularly sensitive to the model skill in reproducing the variability at the typical frequencies where blocking operates, since they depend exclusively on the anomaly threshold. Alternatively, the TM index is more susceptible to the time-mean state and its seasonal cycle because it depends on the set of latitudes chosen and the meridional height gradient of the absolute flow (e.g. Doblas-Reyes et al. 2002). In other words, the blocking-related bias in the DG index can be efficiently suppressed by choosing model-specific parameters in the anomaly threshold, while the TM index requires the removal of the time-mean error from the raw data. Given that ILF variability is relatively well captured by the ECHO-G model, the DG index provides a fairly realistic simulation of blocking frequency (as shown by blocking frequency underestimations much lower than those derived from our index). However, this result is not in full agreement with the presence of excessive zonality in the model. In that sense, our method is a more stringent test of model performance, since both fields (total flow and ILF variance) are required to be in reasonable agreement with observations in order to reproduce realistic blocking climatologies in the model.

5 Concluding remarks

In this paper, an objective data-dependent automatic method is applied to 40 years of observations and present-day forced simulations of the ECHO-G model in order to assess the model performance in reproducing the main characteristics of NH atmospheric blocking. Comparisons against the reanalysis reference using grid- and event-based blocking activity parameters show an overall model underestimation of blocking frequency. The model failure is related to a considerable underestimation of blocking activity in the Euro-Atlantic sector and a relative southward shift of blocks in the Eurasian sector (particularly acute in the warm season). When attention is focused on blocking event parameters, it becomes evident that most of the blocking reduction arises from the model inability to generate persistent blocking episodes with the observed frequency, which directly impinges on the frequency of blocked days.

A comprehensive analysis conducted in terms of systematic errors in the model demonstrates that blocking underestimation is in agreement with a reduction in the intraseasonal variance of the model. However, most of the model failure results from the superposition of two model biases in the absolute flow: (1) its inability to generate amplified waves due to the presence of excessive zonal winds that prevent flow reversals and (2) a model tendency to place the jet stream further south than observed. Although both inadequacies are observed throughout the year, the former is more prominent during the cold season, while the latter dominates in the warm season. The first kind of error is responsible for much of the blocking underestimation over the Euro-Atlantic sector and is in agreement with a specific zonal wind forcing pattern that has been associated with blocking suppression. The underestimation of warm-season Eurasian blocks is mostly attributed to the model deficiency in capturing the correct locations for blocking occurrence, which appear too far south in Eurasia. Such a bias is in agreement with a concurrent southward shift of synoptic eddy activity downstream of the exit zone of the jet stream and a strong underestimation of synoptic perturbations at typical latitudes of blocking. These biases are probably inhibiting the proper relationship between synoptic eddies and the large-scale flow to produce the observed feedback involved in wave amplification.

Further analyses are performed to estimate the error associated with inappropriate choices of critical parameters in the detection method. Model failures are partially missed by other methodological approaches or when thresholds are calibrated with an observed reference. The magnitude of these methodological errors can be almost half of that of the model bias, suggesting that, in addition to climate change signals and natural variability, significant departures can arise from inappropriate adjustments of parameters and thresholds in objective detection methods. As a consequence, from the perspective of adapting automatic methods to simulated climates, it is important to consider the basic state (mean and variance) and its temporal evolution in order to partially account for the lack of realism in the model. Such an adaptation can be viewed as a tool to assess the skill of the model in reproducing certain features of the current climate and as an estimator of how much confidence can be placed in its response to sensitivity experiments within the context of climate change scenarios.

Despite the aforementioned errors, it is possible to recognise some model success. The most important features of blocking activity are captured by the ECHO-G model with fairly realistic accuracy, including (1) the preferred location in the eastern margins of both oceans, (2) the relative activity of action centres, (3) the seasonal cycle variability and (4) the exponentially decaying distribution of blocking lifetimes. Since, from a climate change perspective, it is important to address relative changes in frequency, location or persistence of specific atmospheric regimes rather than absolute changes, some credit can be given to the model. On the other hand, the presence of modest blocking deviations in the forced simulations as compared to the control one also suggests that blocking activity in the model may respond beyond its internal variability to variations in the external forcing in the context of past or future climate change scenarios. These and other issues will be addressed in a future paper.