1 Introduction

The thermohaline circulation (THC) is an important part of the climate system, responsible for transporting large amounts of heat northwards to high latitudes in the Atlantic sector (Broecker 1991). The temperate climate of the eastern North Atlantic sector depends to a significant extent on the heat supplied by the THC. However, we have known since the pioneering work of Stommel (1961) that the circulation can display multiple equilibria, and that it may be vulnerable to interruption or collapse in the future if warming and freshening of North Atlantic waters in response to greenhouse gas (GHG) forcing leads to a cessation of deep convection or a large reduction in the Atlantic meridional density gradient (Manabe and Stouffer 1988, 1994). Currently, most climate models suggest that the circulation is likely to weaken, but not that it will collapse in the near future (Murphy and Mitchell 1995; Gordon et al. 1997; Wood et al. 1999; Boer et al. 2000). These same models nevertheless tend to disagree both about the likely strength of any transient weakening (Kattenberg et al. 1996; Latif et al. 2000) and about the mechanisms which might drive it (Mikolajewicz and Voss 2000; Dixon and Lanzante 1999), prompting Houghton et al. (2001) to state that a collapse of the circulation could not be ruled out, particularly beyond 2100.

In the light of this uncertainty and the significant climatic impacts associated with it (Vellinga and Wood 2002), it is important that we understand why the models disagree, what mechanisms are responsible for this disagreement, and how credibly they are represented in the various models so that we can develop the process of quantifying and reducing the uncertainties involved. This paper takes a step down this road by using the methodology of Thorpe et al. (2001) to compare and evaluate the responses of the Hadley Centre coupled climate models HadCM2 (Johns et al. 1997) and HadCM3 (Gordon et al. 2000).

In this study, both HadCM2 and HadCM3 are subjected to an idealised forcing scenario in which CO2 is increased fourfold from its preindustrial level at 2% compound per annum over a period of 70 years, and is then stabilised at this level for a further four centuries (experiment 4×CO2, an extension of 2PC in Wood et al. 1999). The different THC responses are analysed in terms of changes to the meridional steric gradient as in Thorpe et al. (2001), and differences in important mechanisms isolated and discussed.
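The arithmetic behind this forcing scenario can be checked with a short sketch. It assumes only what the text states (2% compound growth per year for 70 years, then a constant level for a further 400 years); the function name `co2_ramp` and the normalisation of the preindustrial level to 1.0 are illustrative choices, not part of the model set-up.

```python
def co2_ramp(initial=1.0, rate=0.02, ramp_years=70, hold_years=400):
    """Yearly CO2 levels: compound growth for ramp_years, then held constant.

    `initial` is normalised so that 1.0 means the preindustrial level
    (an illustrative convention, not a model parameter).
    """
    levels = []
    for year in range(ramp_years + hold_years + 1):
        # Growth stops once the ramp period is over.
        growth_years = min(year, ramp_years)
        levels.append(initial * (1.0 + rate) ** growth_years)
    return levels


levels = co2_ramp()
print(f"CO2 multiple after 70 years:  {levels[70]:.3f}")
print(f"CO2 multiple after 470 years: {levels[-1]:.3f}")
```

Compounding at 2% per year for 70 years gives a factor very close to four, which is why the ramp length and growth rate together define the 4×CO2 level.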

2 Model description

HadCM2 (Johns et al. 1997) uses a horizontal resolution of 2.5°×3.75° for atmosphere and ocean components, with 19 vertical levels in the atmosphere, and 20 in the ocean. The atmosphere component (HadAM2) includes parameterisations of radiation, cloud, convection, large-scale precipitation, the boundary layer (Smith 1993), and the land surface. The radiative transfer scheme includes the effects of clouds, water vapour, ozone, and carbon dioxide. The ocean component (HadOM2) has been developed from Cox (1984), and includes an explicit mixed layer parameterisation, isopycnal diffusion, a simple parameterisation of Mediterranean outflow (Thorpe and Bigg 2000) and a sea ice model that allows for thermodynamic processes and simple free-ice drift. A constant ‘flux adjustment’ of heat and freshwater is applied at the ocean surface to ensure that sea surface temperatures and salinities remain close to specified climatologies and do not drift away from a realistic state (Sausen et al. 1988; Manabe et al. 1991; Johns et al. 1997). The equilibrium sensitivity of the model to a doubling of atmospheric CO2, estimated from a long coupled simulation of HadCM2, is 3.8°C (Senior and Mitchell 2000).

HadCM3 (Gordon et al. 2000) uses the same atmospheric resolution as HadCM2, but includes a new radiative transfer scheme that represents the radiative effects of aerosols, oxygen, and minor trace gases (Edwards and Slingo 1996). A new land surface scheme (the Met Office surface exchange scheme or MOSES) includes the freezing and thawing of soil moisture, and the effect of CO2 on stomatal resistance to evapotranspiration (Cox et al. 1999). The convective parameterisation also includes convection of momentum (Gregory et al. 1997), with the cloud prediction scheme modified by reducing the critical relative humidity above which cloud is assumed to form from 85% to 70% above model level 3 (around 925 hPa). The boundary layer scheme is modified in HadCM3, with the nonlocal mixing effects (Smith 1993) being removed. Pope et al. (2000) give a more detailed account of the changes in the atmospheric component of HadCM3 (HadAM3) and the impact these have on ‘atmosphere only’ simulations. The ocean component of HadCM3 (HadOM3) has an increased horizontal resolution of 1.25°×1.25°, allowing the use of much lower ocean viscosities, and improving the advection of heat and salinity. Other changes include the tracer mixing scheme, which follows Gent and McWilliams (1990) as modified by Visbeck et al. (1997) to enhance mixing in regions of strong mesoscale eddy activity, and an explicit parameterisation of the flow over the Greenland–Iceland sill (Thorpe et al. 2004). The improvement in heat transports, especially in the ocean, means that HadCM3 is able to maintain a stable surface climate without flux adjustments, although regional systematic errors remain. An assessment of its performance is given by Gordon et al. (2000).

3 The response of the THC to greenhouse gas forcing in HadCM2 and HadCM3

HadCM3 (Gordon et al. 2000; Pope et al. 2000) simulates a stable THC of realistic strength without the need for flux adjustment, and with separate sites of deep water formation in the Nordic and Labrador Seas. This circulation weakens by around 25% in experiment 4×CO2 and convection in the Labrador Sea is interrupted (Wood et al. 1999), but the circulation does not collapse. The weakening is driven by surface fluxes of heat and freshwater, and is stabilised by increased export of freshwater out of the Atlantic basin and northward advection of the increasingly saline water that remains (Thorpe et al. 2001).

The model response is qualitatively different from that of HadCM2 (Johns et al. 1997), where the circulation recovered to near or even above its original strength in various stabilisation scenarios (Mitchell et al. 2000). Figure 1 shows a comparison of the Atlantic THC responses in experiment 4×CO2. The transient behaviour in HadCM2 and HadCM3 is qualitatively similar, with a reduction in overturning strength on a timescale of around 70 years, though this is smaller in HadCM2 at around 25% than it is in HadCM3, where the reduction is 30–40%. The long-term response in the two models is very different. In HadCM3 there is a partial recovery and the circulation stabilises at around 75% of its original strength, whereas in HadCM2 the recovery is much stronger, and the circulation stabilises at around 110% of its original strength.

Fig. 1

Atlantic thermohaline overturning in experiment 4×CO2 for a HadCM2 and b HadCM3

4 Dependence of THC strength on the meridional gradient of steric height in HadCM2 and HadCM3

Similar relationships between the Atlantic meridional steric gradient and the strength of the THC exist in both HadCM2 and HadCM3 (Fig. 2), enabling us to extend the methodology of Thorpe et al. (2001) to the older model, and use it to understand why the meridional overturning circulation behaves in a different way in response to the same external forcing in the two models and hence, to evaluate their relative credibility.

Fig. 2

Relationship between the meridional steric height gradient relative to 3,000 m depth for the western Atlantic basin and the maximum strength of the meridional overturning in the Atlantic (THC) for HadCM2 and HadCM3
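To make the diagnostic concrete, the following is a minimal sketch of how a steric height relative to 3,000 m, and a meridional difference between two columns, might be computed from density profiles. It uses one common definition of steric height (the depth integral of the fractional density deficit relative to a reference density); the exact formulation used here follows Thorpe et al. (2001) and may differ. The function names, the reference density, the 10,000 km separation, and all profile values are hypothetical and chosen only for illustration.

```python
def steric_height(depths_m, densities, rho_ref=1027.0, ref_depth_m=3000.0):
    """Steric height (m) of the sea surface relative to ref_depth_m.

    Integrates (rho_ref - rho) / rho_ref from the surface downwards with the
    trapezoidal rule. depths_m must increase downwards; levels deeper than
    ref_depth_m are ignored. rho_ref is an assumed reference density (kg/m^3).
    """
    total = 0.0
    for i in range(1, len(depths_m)):
        if depths_m[i] > ref_depth_m:
            break  # stop at the reference depth
        upper = (rho_ref - densities[i - 1]) / rho_ref
        lower = (rho_ref - densities[i]) / rho_ref
        total += 0.5 * (upper + lower) * (depths_m[i] - depths_m[i - 1])
    return total


def meridional_gradient(height_south_m, height_north_m, distance_m):
    """South-minus-north steric height difference per unit distance."""
    return (height_south_m - height_north_m) / distance_m


# Hypothetical density profiles (kg/m^3), not model output.
depths = [0.0, 500.0, 1000.0, 2000.0, 3000.0]
southern = [1024.0, 1026.5, 1027.2, 1027.6, 1027.8]  # lighter upper ocean
northern = [1026.8, 1027.4, 1027.6, 1027.7, 1027.8]  # denser upper ocean

h_s = steric_height(depths, southern)
h_n = steric_height(depths, northern)
gradient = meridional_gradient(h_s, h_n, distance_m=1.0e7)  # assumed separation
print(h_s, h_n, gradient)
```

With this sign convention, a lighter southern column and a denser northern column give a positive gradient; warming or freshening of the northern column would reduce it.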

The time evolution of the meridional steric gradient (Fig. 3; green line) is different in the two cases, and consistent with the transient weakening but equilibrium strengthening of the THC in HadCM2, and with the transient weakening and stabilisation in HadCM3. Several factors can be seen to contribute to the different behaviour:

(a) As in HadCM3, the initial reduction in steric gradient, and hence transient weakening of the THC in HadCM2 (first 50 years or so) is due to the warming and freshening of the North Atlantic region driven by surface flux forcing. The transient response is less in HadCM2 because the surface flux forcing is less, though in both cases, heat is more important than freshwater.

(b) In the longer term, the THC is stronger in HadCM2 than HadCM3 because the surface flux forcing (heating and freshening of the North Atlantic) is less effective at reducing the steric gradient in HadCM2 than HadCM3. This is particularly the case with heat flux forcing, which reverses its impact from around year 100, and by year 200 is the single most important strengthening factor in HadCM2.

(c) The diffusive terms are larger in HadCM2 than in HadCM3, consistent with the six times lower areal resolution of the former.

(d) There is a similar leading order balance between surface fluxes and the advective terms, if we assume that diffusion is acting as a surrogate for advection in HadCM2 (a reasonable assumption given the lower resolution and increased viscosity of the ocean in HadCM2).

(e) Though heat flux forcing is eventually the dominant driver of the stronger THC in HadCM2, heat advection and salinity diffusion are the initial triggers of the circulation recovery through their tendency to increase the meridional density gradient.
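The factor of six quoted in (c) follows directly from the ocean grid spacings given in the model description. A minimal check, treating grid cells simply as latitude by longitude rectangles in degrees (the helper name `cell_area_deg2` is illustrative):

```python
# Ocean grid spacing in degrees (latitude, longitude), as quoted in Sect. 2.
HADCM2_OCEAN_GRID = (2.5, 3.75)
HADCM3_OCEAN_GRID = (1.25, 1.25)


def cell_area_deg2(grid):
    """Area of one grid cell in square degrees (ignores convergence of meridians)."""
    lat_spacing, lon_spacing = grid
    return lat_spacing * lon_spacing


ratio = cell_area_deg2(HADCM2_OCEAN_GRID) / cell_area_deg2(HADCM3_OCEAN_GRID)
print(f"HadCM2 cells are {ratio:g} times larger in area than HadCM3 cells")
```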

Fig. 3

Time evolution of the relative contributions made by surface forcing, advection, and other terms to the rate of change of the steric height gradient in the Atlantic basin between 30°S and 60°N during the first 250 years for a HadCM2, b HadCM3, and c HadCM2–HadCM3. Positive terms are associated with an acceleration of the meridional overturning circulation, negative terms with a deceleration of the meridional overturning circulation. Note that the scale in a is half that of b and c. The response beyond year 250 (not shown) becomes increasingly dominated by the near balance between large heat flux forcing and advection terms (along with heat diffusion in HadCM2)

5 Changes in heat flux forcing and the atmospheric circulation response

As is clear from Fig. 3, the more robust meridional circulation in HadCM2 after the first few decades is primarily due to the different impacts of surface heating on the Atlantic meridional density gradient in the two models, although differences in freshwater flux forcing (more of a brake on the circulation in HadCM3 than in HadCM2) also play a role. Figure 4 shows the surface heat fluxes in the Atlantic for years 150–200. This is around the time when heat flux forcing takes over from salinity diffusion as the major driver of THC strengthening in HadCM2. There are significant differences between the two models, with a large area of substantial heat loss in the western North Atlantic in HadCM2. This in turn appears to be driven by differences in the Atlantic mean sea level pressure (MSLP) response in the two models (Fig. 5). This is primarily because the changes in MSLP equate to an increase in the component of northerly winds over the western North Atlantic in HadCM2, whereas in HadCM3 it is the westerly component which increases. The increased importance of winds from a cold direction in the North Atlantic in HadCM2 leads to significant additional heat loss relative to the control case, whereas in HadCM3 there is no such response. The reduced warming of the atmosphere overlying the North Atlantic in HadCM2 is also associated with the formation of low cloud, which acts to reflect solar radiation. This offsets the increased longwave forcing due to enhanced CO2 (Fig. 6), and is associated with the minimum in atmospheric temperature response seen south of Greenland. It is likely that this change is a response to the large-scale atmospheric circulation change rather than being the trigger of such change, but the resulting radiative cooling acts to reinforce the dynamically-driven cooling in the region.
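The link between an MSLP change and a change in wind direction can be illustrated with the standard geostrophic approximation, in which the wind blows along isobars with low pressure to its left in the Northern Hemisphere. The sketch below is not the analysis used in this study; the air density, the latitude, and the 2 hPa per 1,000 km pressure gradient are assumed values chosen purely for illustration.

```python
import math

OMEGA = 7.2921e-5   # Earth's rotation rate (rad/s)
AIR_DENSITY = 1.2   # near-surface air density (kg/m^3), assumed


def coriolis_parameter(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude)."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def geostrophic_wind(dp_dx, dp_dy, lat_deg, rho=AIR_DENSITY):
    """Geostrophic wind (u eastward, v northward) in m/s.

    dp_dx and dp_dy are pressure gradients in Pa/m towards the east and north.
    """
    f = coriolis_parameter(lat_deg)
    u = -dp_dy / (rho * f)
    v = dp_dx / (rho * f)
    return u, v


# Hypothetical anomaly: pressure falls by 2 hPa (200 Pa) per 1,000 km eastwards at 50N.
u_anom, v_anom = geostrophic_wind(dp_dx=-200.0 / 1.0e6, dp_dy=0.0, lat_deg=50.0)
print(f"u = {u_anom:.2f} m/s, v = {v_anom:.2f} m/s")
```

A negative `v_anom` denotes flow towards the south, that is, a northerly wind anomaly: higher pressure to the west and lower pressure to the east of a region favours northerly flow across it.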

Fig. 4

Average change of surface heat fluxes relative to the control in the Atlantic region for years 150–200 of the 4×CO2 experiment for a HadCM2, and b HadCM3. Positive values denote an increase in ocean heat uptake

Fig. 5

Average change in mean sea level pressure relative to the control for years 150–200 of the 4×CO2 experiment for a HadCM2, and b HadCM3

Fig. 6

Change in net top of the atmosphere (TOA) outgoing a total, b shortwave, and c longwave cloud radiative fluxes in the Labrador Sea region, showing that there is increased reflection of shortwave radiation in the region due to the formation of the extra low cloud in HadCM2, and that this effect is absent in HadCM3

The brake on high-latitude warming applied by the northerly advection and associated changes in cloud forcing causes the meridional density gradient to weaken less than it otherwise would, and eventually even to strengthen as the waters of the South Atlantic begin increasingly to warm. As a consequence, the THC initially weakens less than in HadCM3, and then in the long term recovers until it is stronger than in the control. The enhanced poleward heat transport offsets this atmospheric forcing to an extent, but not enough to prevent the THC from strengthening over time (Fig. 1). This is consistent with the notion that it is the atmospheric forcing that is the primary driver of the THC response, a suggestion that was made earlier in connection with the steric gradient analysis (Fig. 3).

6 Determining the changes in physics responsible for the different atmospheric responses in the North Atlantic sector

The role of the atmospheric forcing in determining the nature of the response was investigated further by comparing the atmospheric response patterns of the coupled models HadCM2 and HadCM3 with those of the corresponding mixed layer (slab) ocean models (HadSM2 and HadSM3), thereby isolating the atmospherically forced response from that driven dynamically by the ocean.

The strong similarity in Atlantic MSLP response in the coupled (Fig. 5) and mixed layer or slab (Fig. 7) experiments clearly demonstrates that ocean dynamics are not important in setting the large-scale MSLP response patterns in this case, and implies that the atmospheric model differences are primarily responsible for the different THC responses seen in HadCM2 and HadCM3 (at least in the long term), rather than any changes in the oceanic components of the coupled models, a finding consistent with Williams et al. (2001).

Fig. 7

Average change in mean sea level pressure relative to the control for the last 20 years of the 4×CO2 experiment for a HadSM2, and b HadSM3. The strong similarity between HadSM2 and HadCM2, and between HadSM3 and HadCM3, demonstrates that ocean dynamics are not important in setting the large-scale pattern of atmospheric response

By removing individual physics changes from HadAM3, and considering the impact of doubling CO2 in each case, we were able to evaluate the effect of these changes on North Atlantic MSLP, and ascertain which changes were responsible for altering the northerly component of the atmospheric circulation response in HadAM2/HadCM2. Williams et al. (2001) demonstrated that changes in the boundary layer and cloud parameterisations, though having important effects elsewhere, were not responsible for the atmospheric circulation changes in the North Atlantic (their Fig. 11), so we focussed on the treatment of convection and the MOSES land surface scheme. Figure 8 shows the impact of removing the parameterisation of convection of momentum (8a) and the MOSES land surface scheme (8b). Although neither of the changes alone accounts for the different MSLP responses of HadAM2/HadCM2 and HadAM3/HadCM3, the development of an increased north-easterly component to the circulation in the Northwest Atlantic upon the removal of the parameterisation of convection of momentum is the key. When combined with the changes resulting from MOSES (Fig. 8b), a northerly response pattern consistent with the heat flux forcing change in HadCM2 develops in the region to the south of Greenland. So, whilst we do not fully understand in detail the physical mechanisms driving the changed MSLP response in the Atlantic sector, we suggest that the addition of these two schemes to the atmospheric model can account for the qualitative differences in North Atlantic MSLP atmospheric response in HadCM2 and HadCM3, and hence differences in heat flux forcing and long-term THC response in the two coupled models.

Fig. 8

Average change in mean sea level pressure relative to the control for the last 20 years of the 4×CO2 experiment for a HadSM3 minus parameterised convective momentum, b HadSM3 minus the MOSES land surface scheme, and c the combined differences in a and b. Removal of parameterised convective momentum (Fig. 8a) is the key, resulting in a north-easterly anomaly, which when combined with the westerly signal of removing the MOSES scheme (Fig. 8b) suggests an approximately northerly pattern consistent with the heat flux response in HadCM2 (Fig. 8c)
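The way a north-easterly and a westerly anomaly combine into an approximately northerly one can be seen from simple vector addition. The sketch below assumes, purely for illustration, that the two anomalies have equal magnitude; the actual relative strengths in Fig. 8a and b are not quantified here, and the helper names are hypothetical.

```python
import math


def wind_components(speed, from_deg):
    """Eastward (u) and northward (v) components of a wind blowing FROM from_deg.

    Uses the meteorological convention: degrees clockwise from north.
    """
    theta = math.radians(from_deg)
    return -speed * math.sin(theta), -speed * math.cos(theta)


def wind_from_direction(u, v):
    """Direction (degrees clockwise from north) that the wind blows from."""
    return math.degrees(math.atan2(-u, -v)) % 360.0


# Equal-magnitude anomalies (an assumption made only for this illustration).
u_ne, v_ne = wind_components(1.0, 45.0)    # north-easterly anomaly
u_w, v_w = wind_components(1.0, 270.0)     # westerly anomaly
u_sum, v_sum = u_ne + u_w, v_ne + v_w
combined_from = wind_from_direction(u_sum, v_sum)
print(f"Combined anomaly blows from {combined_from:.1f} degrees")
```

With equal magnitudes the combined anomaly blows from the north-north-west, with a southward component as large as that of the north-easterly anomaly alone; a weaker westerly contribution would rotate it closer to due north.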

7 Summary and discussion

In this paper, we have undertaken a comparative study of the mechanisms which drive the THC response to a fourfold increase in CO2 over 70 years with stabilisation thereafter in HadCM2 and HadCM3. In both models, the THC changes are driven by surface flux forcing, with advection (and diffusion in HadCM2) acting in the opposite sense to limit the circulation change. In both cases, heat fluxes are more important than those of freshwater. We find that different patterns of heat flux forcing in HadCM2 and HadCM3 are the prime determinant of the differing response in the two models. In HadCM2, the increased northerly component of the near-surface winds (associated with an increase in reflective low-level cloud) leads to enhanced heat loss in the west-central North Atlantic, which in turn tends to steepen the steric gradient as the warming in the North Atlantic begins to lag behind that of the South. By contrast, in HadCM3 the winds become more westerly rather than northerly, there is no dynamically-forced enhancement of surface heat loss, and the heat flux in the North Atlantic continues to be strongly positive relative to the control, leading to a reduction in the meridional steric gradient and a weaker overturning circulation.

The strong similarity of the Atlantic MSLP response in coupled and slab experiments demonstrates that ocean dynamics are not important in setting the large-scale MSLP response patterns, and that the atmospheric model differences are primarily responsible for the different THC responses seen in HadCM2 and HadCM3 (at least in the long term), rather than any changes in the oceanic components of the coupled models.

Our results suggest that the response in HadCM2 is less credible than that in HadCM3. This is primarily because the long-term THC changes appear to be driven by changes in circulation brought about by the addition of convection of momentum and land surface schemes that represent genuine physical improvements in HadCM3 relative to HadCM2. This suggests that modification of the atmosphere or land surface components of coupled models can influence the long-term evolution of the THC and may have implications for the use of reduced complexity models in the study of THC behaviour and stability. Loss of credibility due to the lower oceanic resolution in HadCM2 (and associated increase in the magnitude of unphysical diffusive terms) appears to be of secondary importance since the differences in response are essentially driven by large-scale atmospheric forcing.

If it transpires that a similar relationship between the oceanic density structure and meridional overturning strength applies to other models as it appears to in HadCM2 and HadCM3, then generalising this mechanistic approach to a wider spread of models will allow us to make progress in reducing future uncertainties by pinning down which mechanisms are important in producing such a spread of predictions, how credible the individual members are in a relative sense, and how we should go about improving the predictions generated by the next generation of models.