1 Introduction

The quasi-biennial oscillation (QBO) in the zonal wind is a marked and even dominant feature of the equatorial stratosphere, as described in the review paper by Baldwin et al. (2001). The review describes the robust connection of the QBO with the tropical and extratropical stratosphere, but gives only modest attention to connections with the extratropical troposphere. Connections between the QBO and the extratropical troposphere are of interest not only because of the dynamics that may connect these regions but also for the practical reason of seasonal prediction. Seasonal forecasting is difficult in the extratropics since whatever predictable signal exists is often swamped by unpredictable natural variability noise. The result is that seasonal prediction skill is modest in the extratropics for temperature, virtually absent for precipitation, and lower over land than over ocean. The skill that exists is typically attributed to the El Niño/La Niña phenomenon, where long timescale tropical Pacific sea surface temperature anomalies (SSTAs) provide a forcing which directly affects the tropics but which may also propagate to and affect the extratropics.

Two-tier seasonal prediction methods use the forcing provided by SSTAs, which are provided as bottom boundary conditions to atmospheric general circulation models (AGCMs). The AGCMs respond to the SSTAs to produce a global pattern of seasonal climate anomalies. The long time-scales of SST permit simple but skilful predictions of SSTAs for the first season via persistence or simple statistical methods.

Seasonal forecasts made with numerical models will include a QBO-related signal if the models are initialized with realistic winds in the tropical stratosphere, if they realistically forecast the evolution of the QBO and if they represent the dynamics of stratosphere–troposphere coupling. Most AGCMs do not generate a spontaneous QBO, however, but instead simulate a tropical stratosphere with weak and almost steady mean easterly winds. When the models are initialized with realistic initial conditions in the tropical stratosphere (or an idealized strong jet), the zonal winds generally relax towards the model climatology with a timescale of a month or even less (Hamilton and Yuan 1992). A few AGCMs have been shown to simulate a spontaneous “QBO-like” oscillation in tropical mean zonal winds (e.g. Takahashi 1996; Horinouchi and Yoden 1998; Hamilton et al. 1999, 2001; Scaife et al. 2000; Giorgetta et al. 2002), although even in these models the “QBO” usually exhibits some unrealistic aspects. Experience suggests that AGCMs will simulate a QBO-like oscillation only if the vertical resolution in the stratosphere is quite fine (i.e. 500 m or less vertical grid level spacing), the model is run with comparatively weak horizontal subgrid-scale diffusion (which usually means running at fairly fine horizontal resolution as well), and either the model includes a very active moist convection parameterization scheme (such as moist convective adjustment) which will generate a strong flux of upward-propagating gravity waves in the tropics, or the model includes a parameterization of the effects of a broad spectrum of non-stationary subgrid-scale vertically propagating gravity waves.

Few of the current models applied to seasonal forecasting have either the spatial resolution or the appropriate parameterizations to spontaneously generate, or maintain, a QBO in the tropical stratosphere. This situation may be beginning to change and Saha et al. (2006) note that the NCEP forecast system appears to show a slower decay of QBO initial anomalies than other models. When integrated from realistic initial conditions, however, most forecast models are expected to lose QBO information relatively quickly and with it the possibility of representing extratropical tropospheric QBO effects and associated forecast skill.

Although most model-based seasonal forecasts as currently produced are not expected to include the effects of the QBO, it may, nevertheless, be possible to use the long-timescale information inherent in the QBO to improve forecast skill. This is investigated using results from the second Historical Forecasting Project (HFP2), a multi-model retrospective seasonal forecasting project. HFP2 is a successor to HFP1 (Derome et al. 2001) and consists of 35 years of retrospective four-model, two-tier seasonal forecasts from 1969 to 2003.

2 The QBO

Soon after the basic features of the QBO in the tropical stratosphere were discovered (Reed et al. 1961; Veryard and Ebdon 1961) attempts were made to relate equatorial stratospheric QBO variability to the circulation in other regions of the atmosphere (e.g. Angell and Korshover 1968). One of the most robust of these connections is between the tropical QBO and the circulation in the extratropical Northern Hemisphere (NH) stratosphere. Holton and Tan (1980) found that the westerly vortex at 50 hPa in the winter stratosphere was, on average, weaker during the easterly phase of the equatorial QBO (as measured by the sign of the 50 hPa equatorial wind) than during the westerly phase. In addition to a QBO signal in the strength of the zonal-mean component of the vortex, Holton and Tan also found that the stationary wave 1 pattern in 50 hPa geopotential height at high latitudes was generally stronger in the easterly phase of the equatorial QBO than in the westerly phase.

The standard view is that the tropical QBO is largely forced by the interaction of the stratospheric mean flow with vertically propagating equatorial and gravity waves excited in the tropical troposphere. The QBO effects on the extratropical stratosphere are attributed to mean-flow modulation of the propagation characteristics of quasi-stationary planetary waves (QSPWs) generated in the extratropical NH winter troposphere. When there are mean easterlies in the tropical stratosphere the QSPWs are prevented from propagating into the equatorial region and are more confined to higher latitudes. When the QSPWs are dissipated, a net westward forcing on the mean flow results and for the waves dissipated in the mid to high latitude stratosphere the effect is a weakened polar vortex. The presence of mean QBO easterlies should foster a weaker and more disturbed vortex at high latitudes, consistent with the observations of Holton and Tan (1980). This basic view of the dynamics of QBO–extratropical interaction is supported by mechanistic model studies (e.g. Holton and Austin 1991), as well as by comprehensive GCM studies (e.g. Hamilton 1998; Hamilton et al. 2004) in which the QBO mean wind variations are imposed in the tropical stratosphere.

Various attempts have been made to detect the effect of the stratospheric QBO on the tropospheric circulation. Speculative studies have proposed that the QBO affects tropical rainfall and even the occurrence of hurricanes (see Baldwin et al. 2001, for a review). More substantial evidence exists for a QBO modulation of the large-scale tropospheric circulation in the extratropical northern winter troposphere somewhat similar to that seen in the winter stratosphere. Ebdon (1975), Holton and Tan (1980), Baldwin et al. (2001) and Thompson et al. (2002) all find that for NH winters with westerly (easterly) QBO phase, the surface pressure (or 1,000 hPa height) is anomalously low (high) near the pole, and high (low) in a midlatitude belt. This zonal-mean signal is similar to that seen in the stratosphere, i.e. the tropospheric mean westerly vortex strengthens during the westerly QBO phase. The anomalous meridional gradient of pressure associated with the QBO appears to be enhanced over the North Atlantic sector—a feature, which appears consistently in the results of Ebdon (1975), Holton and Tan (1980), Baldwin et al. (2001) and Thompson et al. (2002), despite the different observational periods employed. The QBO-related tropospheric pattern resembles the North Atlantic Oscillation and Northern Annular Mode (NAM) patterns. Coughlin and Tung (2001) show that there is an observed QBO component of the NAM variation that is coherent with the tropical stratospheric QBO.

2.1 Representation of the QBO

Observational studies of the effects of the tropical QBO, such as those described above, employ an objective measure of the phase of the QBO. Generally this is based on the time series of monthly-mean zonal winds from radiosonde observations at near-equatorial stations (Canton Island, 2.8S, through August 1967; Gan, 0.7S, until December 1975, and Singapore, 1.4N, since 1976) compiled at the Free University of Berlin (Naujokat 1986). A simple approach (adopted e.g. by Holton and Tan 1980) is to characterize the QBO phase by the value of the zonal wind at a single level, usually 40 or 50 hPa. Wang et al. (1995) fit linear EOFs to the height-time section (from 70 to 10 hPa) of the Free University of Berlin data set and define a phase in terms of the ratio of the time series coefficients of the first two EOFs. This has the advantage of incorporating information on the vertical structure of the winds in the phase definition, but the simplified representation of the QBO in terms of two linear EOFs has some significant deficiencies. In particular, the transitions in the EOF-reconstructed fields are much more gradual than in the real data and the downward phase progression is more regular in the EOF representation than in the real data.

Hamilton and Hsieh (2002) use neural-network fitting techniques to fit a single periodic nonlinear principal component (NLPC) to the time series of the 70–10 hPa level winds. They show that this single nonlinear principal component represents the raw data more faithfully than do two linear EOFs. The fitting procedure produces a principal component structure U(p,θ) which is periodic in phase θ, and a time series θ(t) of the phase. Hamilton and Hsieh analyse 45 years of wind data (1956–2000) at 7 pressure levels from 70 to 10 hPa. The analysis is extended here to the 51 year period (1956–2006) and the resulting U(p,θ) and θ(t) are shown in Fig. 1. When the zonal wind is expressed as

$$ u {\left( {p,\;t} \right)} = U{\left( {p,\theta {\left( t \right)}} \right)} + u^{'} $$
(1)

the single principal component QBO representation U(p,θ) captures 94.8% of the variance of the time series of monthly mean winds.

Fig. 1

The structure U(p, θ) of the non-linear representation of the QBO (upper panel) updated from Hamilton and Hsieh (2002) with additional data to 2006. The bottom panel shows the time series of monthly θ values. The contour interval is 5 m s−1 and dashed contours denote easterly winds

3 Seasonal forecasting

Seasonal forecasting is difficult in the extratropics because the predictable signal is small compared to unpredictable natural variability noise and there is only a short data record with which to develop, verify, and improve seasonal forecasting methods. Progress has, nevertheless, been made and further progress is anticipated (see Barnston et al. 1994; Goddard et al. 2001 for reviews). One source of a predictable signal is known to be tropical sea surface temperature anomalies (SSTAs) which themselves may be forecast by empirical/statistical and/or dynamical methods (see Mason and Mimmack 2002 for a review). SSTA information may then be connected to anomalies of temperature and precipitation (or other variables) at remote locations either statistically or by using the predicted SSTAs as boundary conditions for an AGCM, which translates them into seasonal anomalies of model variables such as temperature, geopotential height and precipitation over the globe.

Ensembles of forecasts are typically produced with the idea that the ensemble average will average out much of the unpredictable natural variability noise and so give a better representation of the signal. Multi-model ensembles of forecasts are becoming increasingly common. Benefits potentially include larger ensembles and the possibility that different models may show skill in different regions so that a combined result is more skilful overall (e.g. Demeter 2005). In this study we use a four-model ensemble of two-tier forecasts spanning the 35-year period from 1969 to 2003. Reanalysis data from ERA40 (Uppala et al. 2005) and the NCEP reanalysis (Kalnay et al. 1996) are available for verification.

4 HFP2 retrospective seasonal forecasts

The first Historical Forecasting Project (HFP1), described in Derome et al. (2001), is an attempt to develop and verify a seasonal forecasting methodology in a purely objective way. Retroactive model-based forecasts which have no information from the future are produced and verified. HFP1 is based on results from two atmospheric models, GCM2 and SEF. GCM2 (McFarlane et al. 1992) is the atmospheric component of the then current Canadian Centre for Climate Modelling and Analysis (CCCma) coupled climate models, CGCM1 and CGCM2 (Flato et al. 2000; Flato and Boer 2001) used for climate change simulations (Boer et al. 2000a, b). SEF is a global spectral weather forecasting model developed at Recherche en Prévision Numérique (RPN) for data assimilation and medium range forecasting (Ritchie 1991; Ritchie and Beaudoin 1994). The models are spectral with resolutions of T32L10 and T63L23, respectively and incorporate the usual range of physical parameterizations.

HFP1 depends on atmospheric initial conditions and surface boundary conditions to provide the predictable signal that is sought. The surface boundary conditions consist of SSTAs, which are persisted from the month preceding the beginning of the forecast, and observed sea-ice cover, which is relaxed back to climatological values over the first month. Snow cover is initialized from observations and subsequently evolves. Despite the simplicity of the SSTA “forecast” it displays considerable skill for the first season because of the long SST timescales involved. In HFP1, an ensemble of six integrations is made with each model using initial conditions from the reanalyses immediately preceding the beginning of the forecast period.

The second Historical Forecasting Project is similar to HFP1 but is a considerable extension to it. A suite of four models is used. Both GCM2 and a slightly modified version of SEF are again used. The atmospheric component of the new version of the CCCma climate model, referred to as GCM3, is the third model used. GCM3 incorporates higher resolution (T63L31) and a variety of improved physical parameterizations including the CLASS land surface scheme (Verseghy et al. 1993), surface exchanges of heat moisture and momentum following Abdella and McFarlane (1996), the cumulus parameterization of Zhang and McFarlane (1995), improved solar radiative calculations and cloud treatment and the hybrid moisture variable proposed by Boer (1995). The fourth model in HFP2 is the climate version of the Global Environmental Multi-scale (GEM) weather prediction model of RPN (Côté et al. 1998a, b). It is a finite element model and, as the name implies, may be operated over a range of scales. For HFP2 the horizontal resolution used is 96 × 192. HFP2 results comprise ensembles of 10 forecasts with each model for each season for the 35 years from 1969 to 2003. We concentrate on the traditional 500 hPa height field (Z500), the 850 hPa temperature field (T850) and the December–February (DJF) season for the investigations reported here.

Some results from HFP2 are also available as part of the international Seasonal forecast Model Intercomparison Project SMIP2 (http://www.grads.iges.org/ellfb/SMIP2/smip2.top.html) and have also been supplied to the Asian-Pacific Climate Network (APCN), now the Asian-Pacific Climate Centre (APCC), as part of that organization’s attempt to develop an international multi-model prediction system (http://www2.apcc21.net/index.php) by combining ensembles of seasonal forecasts produced by members of the Asian-Pacific Economic Community (APEC).

5 Signal, noise, error and skill

The deviation of a variable from its long-term climatological average is represented as

$$ X = X_{f} + x = X_{s} + X_{q} + x $$
(2)

where \( X_{f} \) symbolizes a deterministic/forced component and x unpredictable natural variability noise. A deterministic/forced signal may arise from a number of long timescale processes such as surface forcing (associated with anomalous SSTs and perhaps other features such as the soil moisture, sea ice and the like), the state of the QBO and/or other long timescale processes of various kinds. The possible existence of several deterministic/forced processes is symbolized in (2) by writing \( X_{f} = X_{s} + X_{q} \).

Presuming the processes in (2) are independent, their variances are

$$ \overline{X^{2}} = \sigma ^{2}_{X} = \sigma ^{2}_{{X_{f} }} + \sigma ^{2}_{x} = \sigma ^{2}_{{X_{s} }} + \sigma ^{2}_{{X_{q} }} + \sigma ^{2}_{x} $$
(3)

where the overbar is an average over the observations for many years for a particular season. The corresponding predictions, which are deviations from the model forecast climate, are represented as

$$ \begin{aligned}{} & Y = Y_{f} + y = Y_{s} + Y_{q} + y \\ & \sigma ^{2}_{Y} = \sigma ^{2}_{{Y_{f} }} + \sigma ^{2}_{y} = \sigma ^{2}_{{Y_{s} }} + \sigma ^{2}_{{Y_{q} }} + \sigma ^{2}_{y} \\ \end{aligned} $$
(4)

The mean square forecast error follows as

$$ \overline{e^{2}} = \overline{(Y - X)^{2}} = \overline{e_{f}^{2}} + \overline{e^{'2}} = {\left( {\sigma ^{2}_{{X_{f} }} + \sigma ^{2}_{{Y_{f} }} } \right)}{\left( {1 - \beta R} \right)} + {\left( {\sigma ^{2}_{x} + \sigma ^{2}_{y} } \right)} $$
(5)

where R is the correlation coefficient between the deterministic/forced components \( X_{f} \) and \( Y_{f} \) and \( \beta = (2\sigma _{{X_{f} }} \sigma _{{Y_{f} }} )/(\sigma ^{2}_{{X_{f} }} + \sigma ^{2}_{{Y_{f} }} ) \) is a factor that is 1 if the variances are equal and less than 1 if not. The corresponding correlation between the natural variability components x and y is set to zero since they are generally not predictable on seasonal timescales. Thus (5) indicates that even if the prediction of the deterministic/forced component were perfect (βR = 1) the natural variability error \( \overline{e^{'2}} = \sigma ^{2}_{x} + \sigma ^{2}_{y} \) would remain.
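The intermediate step behind (5) can be spelled out as follows. This is a sketch that assumes, as a reading of the independence presumption above rather than something stated explicitly, that all cross terms between forced and noise components vanish on averaging and that \( \overline{xy} = 0 \):

$$ \begin{aligned} \overline{(Y - X)^{2}} & = \overline{(Y_{f} - X_{f})^{2}} + \overline{(y - x)^{2}} \\ & = \sigma ^{2}_{X_{f}} + \sigma ^{2}_{Y_{f}} - 2R\,\sigma _{X_{f}} \sigma _{Y_{f}} + \sigma ^{2}_{x} + \sigma ^{2}_{y} \\ & = \left( \sigma ^{2}_{X_{f}} + \sigma ^{2}_{Y_{f}} \right)\left( 1 - \beta R \right) + \left( \sigma ^{2}_{x} + \sigma ^{2}_{y} \right) \end{aligned} $$

The last line uses \( 2\sigma _{X_{f}} \sigma _{Y_{f}} = \beta \left( \sigma ^{2}_{X_{f}} + \sigma ^{2}_{Y_{f}} \right) \), which is simply the definition of β rearranged.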

Fractional variances follow from (3) as

$$ 1 = \frac{{\sigma ^{2}_{{X_{f} }} }} {{\sigma ^{2}_{X} }} + \frac{{\sigma ^{2}_{x} }} {{\sigma ^{2}_{X} }} = P + N = P_{s} + P_{q} + N $$
(6)

where \( P = \sigma ^{2}_{{X_{f} }} /\sigma ^{2}_{X} \) is the potentially predictable variance fraction (ppvf) attainable only with a perfect seasonal forecast model and \( N = \sigma ^{2}_{x} /\sigma ^{2}_{X} = 1 - P \) is the “always unpredictable” noise fraction. If the forecast and observed variances are the same, (5) becomes

$$ \overline{e^{2}} = 2\sigma ^{2}_{X} {\left[ {P(1 - R) + N} \right]} = 2\sigma ^{2}_{X} (1 - PR). $$
(7)

The usual skill score

$$ S = 1 - \overline{e^{2}} / \overline{e_{0}^{2}} $$
(8)

compares the mse of the forecast to that of a “no-skill” reference forecast \( \overline{e_{0}^{2}} \). The maximum score is 1 for a perfect forecast, 0 for a forecast that is no better than the reference forecast, and negative for a forecast which is worse than the reference forecast.

For seasonal prediction the reference forecast is taken to be a climatological forecast, that is, a forecast of no deviation from climatology (Y = 0) whence \( \overline{e_{0}^{2}} = \overline{X^{2}} = \sigma ^{2}_{X} \) and in the case where (7) holds

$$ S = 1 - 2{\left[ {P{\left( {1 - R} \right)} + N} \right]} = 2PR - 1 $$
(9)

which indicates that the skill depends on both the potentially predictable fraction of the variance P and the skill with which it is predicted as measured by the correlation R, all limited by the unpredictable variance fraction N = 1 − P. Thus even a good forecast has low skill where the ppvf P is small, and a skilful forecast with S > 0 demands that R > 1/(2P). This is a severe limit on the skill attainable in the extratropics where P is comparatively small and N comparatively large in general. This is also a reason why it is important to consider all sources of predictable variance which, if not incorporated in the forecast, are assigned to N rather than P and thus are doubly harmful for seasonal forecasting.
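As an illustration of how restrictive this limit is, the following minimal Python sketch evaluates (9) for a single forecast. The function names and the numerical values of P and R are hypothetical, chosen only for illustration; the formulas assume, as in (7), that forecast and observed variances are equal.

```python
def single_forecast_skill(P, R):
    """Skill S of a single forecast from Eq. (9), assuming equal
    forecast and observed variances as in Eq. (7)."""
    N = 1.0 - P                               # unpredictable noise fraction
    mse_over_var = 2.0 * (P * (1.0 - R) + N)  # Eq. (7) divided by sigma_X^2
    return 1.0 - mse_over_var                 # equals 2*P*R - 1


def min_correlation_for_skill(P):
    """Correlation R that must be exceeded for S > 0, i.e. 1/(2P)."""
    return 1.0 / (2.0 * P)


# Hypothetical values: a perfectly predicted forced component (R = 1)
# with 60% potentially predictable variance.
skill_example = single_forecast_skill(0.6, 1.0)
threshold_example = min_correlation_for_skill(0.4)
```

Because R cannot exceed 1, the threshold 1/(2P) implies that a single forecast of this kind cannot beat climatology in the mean square sense wherever P < 1/2, however well the forced component is predicted.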

In the case of an ensemble of forecasts the presumption is that the deterministic/forced component is essentially the same in all forecasts and survives ensemble averaging (indicated by curly brackets {Y}) while the natural variability noise is reasonably independent in each case and is reduced by ensemble averaging over the n ensemble members so that

$$ \begin{aligned}{} & {\left\{ Y \right\}} = Y_{f} + {\left\{ {Y^{'} } \right\}} \\ & \sigma ^{2}_{{{\left\{ Y \right\}}}} = \sigma ^{2}_{{Y_{f} }} + \sigma ^{2}_{y} /n \\ \end{aligned} $$
(10)

Thus the mse can be reduced and the skill increased by using the ensemble mean forecast where, in the limit that \( \sigma ^{2}_{y} /n \) can be neglected, (7) and (9) become

$$ \begin{aligned}{} \overline{e^{2}} & = 2\sigma _{X} ^{2} {\left[ {P{\left( {1 - R} \right)} + N/2} \right]} = \sigma _{X} ^{2} {\left[ {1 - P{\left( {2R - 1} \right)}} \right]} \\ S & = 1 - 2{\left[ {P{\left( {1 - R} \right)} + N/2} \right]} = P{\left( {2R - 1} \right)} \\ \end{aligned} $$
(11)

and skill is enhanced. In what follows we concentrate on the skill of the ensemble mean forecasts obtained by averaging over all ensemble members produced by the four models for a total ensemble size of 40.
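The effect of ensemble size can be sketched numerically. The finite-n expression below is not given in the text: it is obtained here by combining (5) and (10) under the assumptions that the forced variances of forecast and observation are equal and that each ensemble member has the same noise variance as the observations. It reduces to (9) for n = 1 and to (11) as n becomes large. The function name and the values of P and R are hypothetical.

```python
def ensemble_mean_skill(P, R, n):
    """Skill of an n-member ensemble mean forecast.

    Assumes sigma_Yf^2 = sigma_Xf^2 and sigma_y^2 = sigma_x^2, so that
    mse / sigma_X^2 = 2*P*(1 - R) + N*(1 + 1/n).
    """
    N = 1.0 - P                       # unpredictable noise fraction
    return 1.0 - 2.0 * P * (1.0 - R) - N * (1.0 + 1.0 / n)


# Hypothetical P and R; n = 40 matches the total HFP2 ensemble size.
P, R = 0.6, 1.0
skill_single = ensemble_mean_skill(P, R, 1)      # Eq. (9): 2PR - 1
skill_forty = ensemble_mean_skill(P, R, 40)
skill_limit = P * (2.0 * R - 1.0)                # Eq. (11)
```

Under these assumptions most of the gain is already realized at moderate ensemble sizes, since the noise term falls off as 1/n.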

6 QBO effects in the extratropical troposphere

Although the QBO clearly affects the extratropical stratosphere, the existence of QBO-related effects in the extratropical troposphere is less obvious. This is partially due to the fact that natural variability, and hence N, is relatively large there. While there are plausible dynamical reasons why QBO effects can exist in the northern extratropical troposphere in winter as discussed in Sect. 2, the magnitude of the effects may nevertheless be modest and perhaps discounted by the characterization of the QBO. If a QBO index contaminated by noise is used it may compromise the connection to extratropical variability.

Here the QBO is represented in terms of an index based on the statistically non-linear characterization of the QBO developed in Hamilton and Hsieh (2002) and discussed in Sect. 2. The QBO index is used to relate the state of the QBO to extratropical variability in a straightforward way. Finally, and importantly for this study, we test the hypothesis that seasonal forecasts made with current AGCMs which are not able to represent the QBO will lack a QBO-related predictable signal which could potentially supply additional extratropical forecast skill.

6.1 A QBO index

The QBO is characterized in terms of the NLPC structure of Fig. 1 and the QBO index q is defined as its weighted average over the 30–50 hPa layer,

$$ q{\left( t \right)} = \frac{1} {{p_{2} - p_{1} }}{\left( {{\int\limits_{p_{1} }^{p_{2} } {U{\left( {p,\theta (t)} \right)}dp} }} \right)} $$
(12)

The black curve in Fig. 2 plots q(t) and the red curve shows the average of the raw zonal wind values u. Using the NLPC-based q has the advantage of characterizing the state of the QBO throughout the 70–10 hPa range and reducing contamination of the signal by sampling noise in the raw time series of single station monthly means of twice-daily samples. Note also that the 51 year mean of the raw values in Fig. 2 is −3.4 m s−1, while the long-term mean has been removed from the q index. A comparison of q in (12) with other indices based on raw wind values and/or single level values indicates that the results are not strongly dependent on a particular choice. We prefer q in (12) for its smooth representation of the QBO evolution and its incorporation of information on the vertical structure.
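A discrete version of the layer average in (12) can be sketched as below. This is only one plausible discretization (a trapezoidal rule in pressure); the text does not state how the integral is evaluated. The seven levels are assumed here to be 70, 50, 40, 30, 20, 15 and 10 hPa, since the text gives only their number and range, and the wind profile is invented for illustration. Removal of the long-term mean is not included.

```python
def layer_mean_wind(pressures_hpa, winds, p_top=30.0, p_bottom=50.0):
    """Trapezoidal estimate of (1/(p2 - p1)) * integral of U dp, Eq. (12)."""
    # Keep only levels inside the layer, ordered by increasing pressure.
    layer = sorted((p, u) for p, u in zip(pressures_hpa, winds)
                   if p_top <= p <= p_bottom)
    if len(layer) < 2 or layer[0][0] != p_top or layer[-1][0] != p_bottom:
        raise ValueError("layer bounds must coincide with data levels")
    integral = 0.0
    for (p_a, u_a), (p_b, u_b) in zip(layer, layer[1:]):
        integral += 0.5 * (u_a + u_b) * (p_b - p_a)  # one trapezoid
    return integral / (p_bottom - p_top)


# Hypothetical levels (hPa) and a made-up one-month profile of U (m/s).
levels = [70.0, 50.0, 40.0, 30.0, 20.0, 15.0, 10.0]
profile = [-5.0, 10.0, 12.0, 8.0, -3.0, -10.0, -15.0]
q_month = layer_mean_wind(levels, profile)
```

Applying the same function to the raw monthly-mean winds u instead of the reconstruction U would give the kind of raw layer average that the index is compared against.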

Fig. 2

The black curve plots the QBO index \( q{\left( t \right)} = \frac{1} {{p_{2} - p_{1} }}{\left( {{\int_{p_{1} }^{p_{2} } {U{\left( {p,\theta (t)} \right)}dp} }} \right)} \) where the average is over the 30–50 hPa layer, the red curve is the corresponding average of the raw monthly-mean wind values u. The blue dots are the statistically predicted values of q for the December–February season from the November value

The circles in Fig. 2 show the results of a statistical forecast of the December, January and February mean q values starting from a knowledge of the November-mean winds. Specifically the forecast is made using

$$ q{\left( {t + \Delta t} \right)} = \frac{1} {{p_{2} - p_{1} }}{\left( {{\int\limits_{p_{1} }^{p_{2} } {U{\left( {p,\theta (t) + C(\theta )\Delta t} \right)}dp} }} \right)} $$
(13)

where C(θ) is the average progression rate for θ at a particular phase of the NLPC at this season, as determined from the entire record. The results indicate that, because of the long timescales of the QBO and especially of the filtered index q, very good predictions of the value of the index for DJF are possible.

6.2 Composite patterns

Holton and Tan (1980) investigate the surface signature of the QBO in terms of the difference in composites of 1,000 hPa geopotential height for positive and negative phases of the QBO, defined as positive and negative values of the equatorial stratospheric wind at 50 hPa. Baldwin et al. (2001) show an updated version of the Holton and Tan result in their Fig. 31 based on December–February NCEP data for 1964–1996.

Following the theory outlined in Sect. 2, we expect to see extratropical tropospheric QBO effects, if any, in Northern Hemisphere winter but not in the Southern Hemisphere or in summer. We maintain the historical connection by repeating the Holton and Tan analysis here using NCEP reanalysis data and the QBO index q for the 49 DJF periods during 1956–2005. The composite pattern is the difference in 1,000 hPa geopotential height averaged over all DJF seasons with positive (westerly) q minus the average over the corresponding seasons with negative (easterly) q, with the phase of each season given in Table 1. The corresponding DJF 500 hPa geopotential height composites as well as the surface temperature and 850 hPa temperature composites are also shown for reference. The 1,000 hPa pattern resembles that of the earlier 1964–1996 composite published by Baldwin et al. (2001) in general, although the high latitude negative anomalies in Fig. 3 are weaker. Neither Holton and Tan nor Baldwin et al. (2001) provide a significance test of their results. A t test indicates that the composite difference in a region around the positive maximum in the North Atlantic is statistically different from zero at the 5% level for all of the variables in Fig. 3. The size of the area of significance increases away from the surface.
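The composite difference and its significance at a single grid point can be sketched as follows. The text says only that "a t test" is used; the unequal-variance (Welch) form of the statistic below is an assumption, as is the use of the sign of a single seasonal q value to assign phase (Table 1 requires the same sign in each of the three months). The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def composite_difference(values, q_index):
    """Westerly-minus-easterly composite and a Welch t statistic.

    values  : seasonal-mean anomalies at one grid point, one per winter
    q_index : seasonal QBO index for the same winters (sign gives phase)
    """
    west = [v for v, q in zip(values, q_index) if q > 0]
    east = [v for v, q in zip(values, q_index) if q < 0]
    diff = mean(west) - mean(east)
    # Standard error of the difference of two means, unequal variances.
    std_err = sqrt(variance(west) / len(west) + variance(east) / len(east))
    return diff, diff / std_err


# Made-up anomalies for six winters, three in each phase.
anomalies = [4.0, 6.0, 8.0, 1.0, 2.0, 3.0]
q_values = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
diff, t_stat = composite_difference(anomalies, q_values)
```

Repeating the calculation at every grid point and comparing the statistic with the appropriate critical value would outline regions of significance of the kind described for Fig. 3.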

Table 1 Phase of the QBO defined as westerly or easterly if the index q is westerly or easterly for each of the three months in the DJF season
Fig. 3

Westerly minus easterly 1,000 and 500 hPa geopotential height composites (left panels) and the surface and 850 hPa temperature composites (right panels) for the DJF season. Composites are defined as westerly or easterly if the QBO index q is westerly or easterly for each of the three months in the season

The results resemble the AO/NAO related patterns in these or closely related variables (i.e. 500–1,000 hPa thickness is very nearly 850 hPa temperature) used in a range of studies such as Wallace and Gutzler (1981), Hurrell et al. (1995), Thompson and Wallace (1998) and Thompson et al. (2002). The two-tier forecasts being investigated here are produced using a very simple forecast of SST, namely the persistence of the SST anomaly. This is both straightforward and reasonably skilful but has the disadvantage for the study of QBO effects of strongly controlling the surface temperature and less strongly the 1,000 hPa geopotential height in the forecasts. In order to avoid the direct influence of (errors in) the forecast SSTs and concentrate on the possible dynamical and thermodynamical effect of the QBO, the skill of forecasts of 500 hPa geopotential height and 850 hPa temperature is considered. The 500 hPa height is a basic tropospheric flow variable that is standard in many studies and resembles the flow pattern at other levels in the atmosphere while the 850 hPa temperature captures the thermodynamic structure of the lower troposphere.

7 Seasonal prediction and the QBO

As noted above, the kinds of AGCMs used in seasonal prediction typically do not support a QBO. If the QBO affects the extratropical troposphere, its lack in models will be seen as additional error and will reduce the already modest skill attainable. Although the initial conditions contain QBO information, the models typically lose this information as they adjust toward their climatological flow state, which is generally weak easterly. This is shown explicitly in the lower panel of Fig. 4 for the behaviour of the GEM model where the 1969–1971 monthly-mean 50 hPa observed zonal wind in the Free University of Berlin dataset is plotted together with the nonlinear principal component U(50 hPa, θ) reconstruction described in Sect. 2. The black circles emphasize the monthly wind observations for two December–March periods. The blue crosses show the equatorial zonal-mean zonal wind on December 1 as represented in the NCEP reanalyses data set, which was used to initialize the HFP2 forecasts. The red circles show the monthly-mean equatorial zonal-mean zonal wind at 50 hPa in the GEM forecasts that were initialized on (or near) December 1. Even in the dataset used for initialization, the zonal wind is weaker than that indicated in the raw radiosonde data, a flaw in the NCEP reanalyses that has been noted for example by Huesmann and Hitchman (2001). The forecast zonal wind then diverges further from the actual wind evolution, with a relaxation towards a state with weak mean easterlies as seen in the upper panel of Fig. 4. This overall behaviour is expected on the basis of earlier GCM studies (e.g. Hamilton and Yuan 1992), and is apparent in all the forecasts in the HFP2 project.

Fig. 4

Results for GEM model forecasts for 1969–1971. The monthly-mean 50 hPa observed zonal-mean zonal wind u is plotted together with the nonlinear principal component U(50 hPa, θ) reconstruction described in Sect. 2. The black circles emphasize the monthly mean winds from observations for two December–March periods. The blue crosses show the equatorial zonal-mean u on December 1 as represented in the NCEP reanalyses data set used to initialize the HFP2 forecasts. The red circles show the ensemble monthly mean equatorial zonal-mean u at 50 hPa in the GEM forecasts that are initialized on (or near) December 1

The retrospective seasonal forecasts comprising HFP2 consist of 35 years of 1-season 2-tier forecasts using 4 AGCMs which are initialized with reanalysis data and which use persisted SSTAs as boundary conditions over the oceans. These are real forecasts in that no information from the future enters the forecast process itself. These forecasts, and seasonal forecasts in general, are skilful at low latitudes as a consequence of the direct local response of the tropical atmosphere to SSTAs. At extratropical latitudes the skill is much more modest and is largely a consequence of teleconnections from the tropics associated with the El Niño/La Niña phenomenon. We analyse the collection of forecasts in HFP2 and ask to what extent they capture the extratropical SSTA forced signal and if knowledge of the QBO could add additional skill.

7.1 Variance and skill components

We consider the 4-model HFP2 ensemble mean DJF forecast anomaly Y and allow a single parameter statistical adjustment with

$$ \begin{aligned}{} & X = aY + W \\ & a = \overline{{XY}} /\sigma ^{2}_{Y} = r_{{XY}} (\sigma _{X} /\sigma _{Y} ) \\ \end{aligned} $$
(14)

The fraction of \( \sigma ^{2}_{X} \) accounted for in this regression approach is \( P_{{{\text{HFP}}}} = r^{2}_{{XY}} = a^{2} (\sigma ^{2}_{Y} /\sigma ^{2}_{X} ) \) where \( r_{XY} \) is the correlation coefficient and \( \sigma ^{2}_{W} \) is the variance not accounted for by the scaled forecasts.

We then ask if knowledge of the QBO, as incorporated in the QBO index q in (12), can account for some part of the remaining variance using

$$ \begin{aligned}{} & W = (X - aY) = bq + e \\ & b = \overline{{Wq}} /\sigma ^{2}_{q} = r_{{Wq}} (\sigma _{W} /\sigma _{q} ) \\ \end{aligned} $$
(15)

where \( P_{{{\text{QBO}}}} = r^{2}_{{Wq}} = b^{2} (\sigma ^{2}_{q} /\sigma ^{2}_{W} ) \) is the additional QBO-related variance accounted for.

The components of the variance of X associated with the forecasts and the QBO in this context are

$$ 1 = \frac{{a^{2} \sigma ^{2}_{Y} }} {{\sigma ^{2}_{X} }} + \frac{{b^{2} \sigma ^{2}_{q} }} {{\sigma ^{2}_{X} }} + \frac{{\sigma ^{2}_{e} }} {{\sigma ^{2}_{X} }} = P_{{{\text{HFP}}}} + P_{{{\text{QBO}}}} + N $$
(16)

where the connection with (6) is reflected in the notation. \( P_{{{\text{HFP}}}} \) is the fraction of the variance of X accounted for by the optimally scaled HFP2 forecasts, \( P_{{{\text{QBO}}}} \) the additional fraction that can be accounted for by knowledge of the QBO, and N the remaining unpredicted fraction.
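The intermediate step behind (16) can be sketched as follows, assuming that X, Y and q are anomalies with zero time mean and that a and b are the least-squares coefficients of (14) and (15). Under these assumptions the residual W is uncorrelated with Y and the residual e is uncorrelated with q, so the variances add:

$$ \begin{aligned} & \sigma ^{2}_{X} = a^{2} \sigma ^{2}_{Y} + \sigma ^{2}_{W} \\ & \sigma ^{2}_{W} = b^{2} \sigma ^{2}_{q} + \sigma ^{2}_{e} \\ \end{aligned} $$

Substituting the second relation into the first and dividing by \( \sigma ^{2}_{X} \) gives the three terms of (16). Expressed as a fraction of the variance of X, the QBO term is

$$ \frac{{b^{2} \sigma ^{2}_{q} }} {{\sigma ^{2}_{X} }} = \frac{{\sigma ^{2}_{W} }} {{\sigma ^{2}_{X} }}r^{2}_{{Wq}} = (1 - r^{2}_{{XY}} )\,r^{2}_{{Wq}} $$

which is the same quantity that appears as the QBO contribution to the skill in (18).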

The relationships in (15) can be recast in terms of the skill measures (7) and (9). The error and skill associated with the scaled forecasts are

$$ \begin{aligned}{} & \overline{{e^{2}_{Y} }} = \sigma ^{2}_{W} = \overline {(aY - X)^2 } \\ & S_{{\rm HFP}} = 1 - \overline{{e^{2}_{Y} }} /\sigma ^{2}_{X} = r^{2}_{{XY}} \\ \end{aligned} $$
(17)

and the error and skill associated with both the scaled forecast and the QBO index q are

$$ \begin{aligned}{} & \overline{{e^{2}_{{Yq}} }} = \sigma ^{2}_{e} = \overline{{{\left( {bq - W} \right)}^{2} }} = \sigma ^{2}_{W} - b^{2} \sigma ^{2}_{q} = \sigma ^{2}_{X} {\left\{ {(1 - r^{2}_{{XY}} ) - \frac{{\sigma ^{2}_{W} }} {{\sigma ^{2}_{X} }}r^{2}_{{Wq}} } \right\}} \\ & S_{{Yq}} = r^{2}_{{XY}} + \frac{{\sigma ^{2}_{W} }} {{\sigma ^{2}_{X} }}r^{2}_{{Wq}} = S_{{{\text{HFP}}}} + S_{{{\text{QBO}}}} \\ \end{aligned} $$
(18)

Under these particular circumstances, the skill measures \( S_{{{\text{HFP}}}} \) and \( S_{{{\text{QBO}}}} \) also measure the fractions of the variance \( P_{{{\text{HFP}}}} \) and \( P_{{{\text{QBO}}}} \) accounted for by the HFP2 forecasts and the QBO. We may calculate and compare the skill \( S_{{{\text{HFP}}}} \) implicit in the scaled forecast with the additional skill \( S_{{{\text{QBO}}}} \) that may potentially be obtained with knowledge of some index, such as q, which contains information about a long timescale process not treated by the forecast system itself.
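The two-step regression of (14)–(18) can be illustrated with a minimal sketch at a single grid point. The function names and the synthetic series below are illustrative assumptions and are not the HFP2 data or the authors' code. The sketch only shows that the three fractions in (16) sum to one when a and b are fitted by least squares on anomalies.

```python
import random
from statistics import fmean


def anomalies(series):
    """Remove the time mean so that variances are simple mean squares."""
    mean = fmean(series)
    return [value - mean for value in series]


def mean_product(u, v):
    """Overbar operator: time mean of the product of two anomaly series."""
    return fmean([ui * vi for ui, vi in zip(u, v)])


def skill_components(x_obs, y_fcst, q_index):
    """Return (S_HFP, S_QBO, N) for one grid point, following (14)-(18)."""
    x = anomalies(x_obs)
    y = anomalies(y_fcst)
    q = anomalies(q_index)
    var_x = mean_product(x, x)
    var_y = mean_product(y, y)
    var_q = mean_product(q, q)

    # Step 1, eq. (14): scale the forecast, X = aY + W.
    a = mean_product(x, y) / var_y
    w = [xi - a * yi for xi, yi in zip(x, y)]

    # Step 2, eq. (15): regress the residual on the index, W = bq + e.
    b = mean_product(w, q) / var_q
    e = [wi - b * qi for wi, qi in zip(w, q)]

    # Fractions of the variance of X, eq. (16).
    s_hfp = a * a * var_y / var_x
    s_qbo = b * b * var_q / var_x
    noise = mean_product(e, e) / var_x
    return s_hfp, s_qbo, noise


# Hypothetical synthetic example: 35 "seasons" with made-up coefficients.
rng = random.Random(0)
n_years = 35
y_fcst = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
q_index = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
x_obs = [0.8 * y + 0.3 * q + rng.gauss(0.0, 1.0)
         for y, q in zip(y_fcst, q_index)]

s_hfp, s_qbo, noise = skill_components(x_obs, y_fcst, q_index)
print(f"S_HFP={s_hfp:.3f}  S_QBO={s_qbo:.3f}  N={noise:.3f}")
```

Because the index is fitted to the residual W rather than jointly with Y, any part of q that is correlated with the forecast is credited to the forecast first, so the second term measures only the additional skill.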

The statistical significance of the results may be judged from the correlation coefficients assuming the DJF seasons are independent from year to year over the 35 years of the HFP2. The correlation coefficient is deemed to be significantly different from zero at the 5% significance level for values in excess of 0.34 (0.12 for the squared correlation coefficient).
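As a rough check on this threshold, the critical correlation for 35 independent seasons can be estimated with the Fisher z-transformation. This is an assumed approximation for illustration and is not necessarily the test used by the authors. The function name `critical_correlation` is hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def critical_correlation(n_years, alpha=0.05):
    """Two-sided critical |r| for n independent samples (Fisher z approximation)."""
    # Under the null hypothesis, atanh(r) is approximately normal with
    # standard deviation 1 / sqrt(n - 3).
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return tanh(z_crit / sqrt(n_years - 3))


r_crit = critical_correlation(35)
print(f"critical r = {r_crit:.3f}, critical r^2 = {r_crit ** 2:.3f}")
```

For 35 seasons the approximation gives a threshold close to the value quoted above. Serial correlation between seasons would reduce the effective sample size and raise the threshold.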

7.2 HFP2 forecast skill

Figure 5 displays the correlation between the observations and the ensemble mean forecasts of DJF 500 hPa geopotential height and 850 hPa temperature (upper panels) based on the suite of retrospective ensemble mean HFP2 forecasts for the period 1969–2000. The skill \( S_{{{\text{HFP}}}} \) of the scaled HFP2 forecasts is also shown (lower panels) and exhibits the expected distribution, with a large fraction of the variance correctly captured by the forecasts at low latitudes where both the real and modelled atmospheres respond directly to the SSTAs. \( S_{{{\text{HFP}}}} \) decreases rather rapidly in the extratropics, however, where the interaction is less direct and where natural variability is larger. Regions of higher skill extend into the extratropics and reflect the representation of the teleconnections forced by the SSTAs in the forecast models.

Fig. 5

The correlation patterns between the observations and the multi-model ensemble mean DJF forecast for 500 hPa geopotential height and 850 hPa temperature based on the 34 retrospective HFP2 forecasts for the years 1969–2003 (upper panels). The corresponding skill measures \( S_{{{\text{HFP}}}} \) for the scaled forecasts (lower panels). The colour bar gives the values as percentages

7.3 NINO3.4

The hypothesis is that the HFP2 forecast system largely captures the signal due to long timescale SSTAs, which are boundary conditions for the AGCMs. Of course, error in the forecast system might contaminate the results beyond that which can be adjusted for by a simple scaling of the anomalies. It is therefore of interest to determine how much additional skill could potentially be obtained using an index that captures the state of the tropical central Pacific and hence the presence and strength of the El Niño/La Niña phenomenon.

We use the NINO3.4 index n, which is the average SSTA in the region (5°S–5°N; 170°W–120°W). The index comes from NCEP (http://www.cpc.ncep.noaa.gov/data/indices) and is available from 1950 to the present. The NINO3.4 region exhibits SST variability on El Niño/La Niña time scales that is connected to shifts and changes in the large region of rainfall typically located in the far western Pacific. The changes in the location and magnitude of latent heat release associated with these shifts in precipitation patterns are important drivers of teleconnections to the extratropics.

The upper panels of Fig. 6 display the correlation of the ENSO index n with that part of the 500 hPa geopotential height and 850 hPa temperature that is not accounted for by the scaled HFP2 multi-model ensemble mean DJF forecast. The lower panels of Fig. 6 give the additional skill \( S_{n} \) that may potentially be obtained with knowledge of the ENSO index (as in Sect. 7.1 but with n replacing q). Here regions of both positive and negative correlation may contribute to additional skill after suitable (positive or negative) scaling. This is a linear approach and it may well be possible for improved models to capture additional non-linear skill. It is apparent from the figure that there is essentially no additional extratropical skill to be had from a knowledge of the ENSO index using this approach. This is indirect evidence that the AGCMs capture much of the available extratropical skill from the forcing associated with SSTAs. Since the SSTA boundary conditions for the AGCMs are forecast values, albeit forecast in a rather simple way, while the ENSO index used is contemporary, it is striking that the HFP2 forecasts apparently capture much of the available SSTA-induced extratropical variability. According to Fig. 6 there does remain some available improvement in skill in the tropical region from knowledge of the (contemporary) ENSO index, possibly reflecting the error in the persistence forecast of the SST anomalies.

Fig. 6

The correlation patterns (upper panels) between the ENSO index n and the remaining variance not accounted for by the scaled HFP2 multi-model ensemble mean DJF forecasts of 500 hPa geopotential height and 850 hPa temperature in Fig. 5. The corresponding measures of additional skill \( S_{n} \) (lower panels) that may potentially be obtained with knowledge of the ENSO index. The colour bar gives the values as percentages

7.4 QBO skill

Figure 7 parallels Fig. 6 but for the QBO index q and indicates that there is the possibility of obtaining some modest additional extratropical skill by including QBO information in the forecasts. As noted in Sect. 7.1, the skill measure \( S_{{{\text{QBO}}}} \) measures the additional fraction of the variance that may be statistically accounted for from a knowledge of q. The conclusion is that some part of the variability that cannot be accounted for by the HFP2 forecasts, and which is not accounted for by the state of ENSO, can be explained by the state of the QBO. This extra skill is largest, not surprisingly, where the connection between the QBO index and other variables is strongest according to the composites of Fig. 3. Although the additional skill is small, the result supports the contention that typical AGCM-based seasonal forecasts such as those of HFP2, which lack a QBO, also lack the effect of the QBO on the extratropical troposphere.

Fig. 7

The correlation patterns (upper panels) between the QBO index q and the remaining variance not accounted for by the scaled HFP2 multi-model ensemble mean DJF forecasts of 500 hPa geopotential height and 850 hPa temperature in Fig. 5. The corresponding measures of additional skill \( S_{{{\text{QBO}}}} \) (lower panels) that may potentially be obtained with knowledge of the QBO index. The colour bar gives the values as percentages

These results suggest that taking account of QBO effects directly, using models that adequately represent the physical processes involved, could modestly improve extratropical seasonal forecasts for DJF in the North American and North Atlantic regions. Alternatively, since we have demonstrated that q itself can be forecast for DJF with considerable accuracy (Fig. 2), the kind of a posteriori statistical correction described here could be used to add a modest amount of additional QBO-related seasonal forecast skill to standard numerical ensemble forecasts.

8 Summary

Seasonal forecasts are reasonably skilful at low latitudes as a consequence of the close link with long timescale tropical sea surface temperatures. Seasonal forecasts are less skilful at higher latitudes and the modest skill that exists is thought to be largely a consequence of teleconnections from the tropics and especially from ENSO. The comparatively long timescales in the climate system that provide extratropical seasonal forecast skill are usually associated with the ocean, and to a lesser extent with land surface processes, rather than with the atmosphere. However, the QBO in the tropical stratosphere is a notable long timescale atmospheric process, which may be linked to the extratropical troposphere through planetary wave behaviour.

The atmospheric general circulation models currently used in seasonal forecasting typically do not represent the processes that produce the QBO and soon lose the QBO information that is available in their initial conditions. This is the case for the models used in the HFP2, which consists of 35 years of retrospective 2-tier 1-season 10-member ensemble forecasts made with each of four models. In the absence of a QBO in the forecast models, knowledge of the state of the QBO can, however, provide a modest additional source of extratropical seasonal forecasting skill beyond that contained in the HFP2 results.

The available extratropical skill for the basic seasonal forecast variables of 500 hPa geopotential height and 850 hPa temperature is investigated for the HFP2. Knowledge of the state of the equatorial central Pacific and the El Niño/La Niña through the NINO3.4 index does not provide additional extratropical skill to HFP2 forecasts. By contrast, knowledge of the state of the QBO does add extratropical skill centred in the region of the North Atlantic. Although the additional skill is modest, the result supports the contention that taking account of the QBO, either statistically after the fact or directly by using models that can represent the physical processes of the QBO, could modestly improve seasonal forecasts in the region.