1 Introduction

The Atlantic Meridional Overturning Circulation (AMOC) is an important component of the climate system, transporting heat northwards in the Atlantic. Since changes in the AMOC have significant impacts on climate (Zhang et al. 2019; Bellomo et al. 2021), it is of considerable interest to understand how the AMOC might evolve in the future. Climate and ocean models can provide valuable information about AMOC behaviour and future evolution; however, they can also suffer from biases and inadequate representation of some processes. Biases in the mean climate have been shown to affect AMOC variability (Menary et al. 2015) and anthropogenic weakening (Jackson et al. 2020; Sgubin et al. 2017; Weijer et al. 2020), and many processes that are believed to be related to the AMOC are not well represented in models, particularly in climate models in which resolution is limited (Fox-Kemper et al. 2019). In particular, the representation of the AMOC might be affected by inadequate representation of: overflows (Yeager and Danabasoglu 2012; Zhang et al. 2011); eddies and their mixing (Bruggemann and Katsman 2019; Tagklis et al. 2020); narrow boundary currents and their transports of heat and freshwater (Talandier et al. 2014); convection (Danabasoglu et al. 2014; Heuzé 2017; Koenigk et al. 2021); sinking (Katsman et al. 2018); and the pathway of the Gulf Stream and North Atlantic Current (Jackson et al. 2020). Given the potential issues with representing these processes, detailed assessments of AMOC representation in climate and ocean models are necessary.

Recent observational results have shown that our understanding of processes in the subpolar North Atlantic is incomplete (Lozier et al. 2019). In the previous paradigm of ocean variability, buoyancy fluxes associated with the North Atlantic Oscillation (NAO) over the Labrador Sea (LS) drive AMOC variability (Robson et al. 2012; Yeager and Danabasoglu 2014; Kim et al. 2020), with strong statistical relationships found between the AMOC and LS properties such as mixed layer depth (a proxy for deep convection), deep densities, and the formation of Labrador Sea water (Ortega et al. 2021; Danabasoglu et al. 2016; Roberts et al. 2013). However, observations from the OSNAP campaign (Lozier et al. 2019), which measures the overturning from Newfoundland to Greenland, and Greenland to Scotland (blue and cyan lines, Fig. 1), have shown a much stronger overturning across the east section of OSNAP (OSE) than the west section (OSW) in both depth and density space. This implies that northwest of OSW (which is most of the LS) there is little densification or sinking, casting doubt on the climate and ocean models that are largely responsible for the previous paradigm that buoyancy fluxes over the LS drive AMOC variability.

The observations from OSNAP have been supported by other estimates with a variety of observational methods. These studies support the findings by Lozier et al. (2019) that the overturning across OSW is small, with values of 1.5–3.4 Sv (Pickart and Spall 2007; Chafik and Rossby 2019). Further studies have shown that at least half of the stronger overturning across OSE originates in the Iceland and Irminger Seas (IIS) (between OSE and the sills along the Greenland–Iceland–Scotland ridge, green line in Fig. 1), rather than further north in the GIN seas (Petit et al. 2020; Desbruyères et al. 2019; Chafik and Rossby 2019).

These various observational results have driven more analysis of the subpolar overturning in models. In addition to prompting comparisons of the overturning in density space across OSNAP sections (Li et al. 2019; Menary et al. 2020; Jackson et al. 2020), working in density space has made analysis of water mass transformation valuable (Langehaug et al. 2012; Sidorenko et al. 2020, 2021; Oldenburg et al. 2021; Menary et al. 2020; Megann et al. 2021). Water mass transformation (WMT) is the transformation of water from one density class to another. For the AMOC in density coordinates, the circulation of lighter waters transported northwards and denser water southwards must be closed by the transformation from lighter to denser density classes. Hence, the AMOC in density coordinates can be reconstructed from density transformations, assuming that circulation and transformations are in balance (Groeskamp et al. 2019; Marsh 2000). At short timescales, in particular seasonally, they are not in balance because of the transit time between the transformation at the surface and the propagation of the newly dense water southward (Kostov et al. 2019; Petit et al. 2020; Le Bras et al. 2020); however, studies have shown good agreement on decadal timescales and longer (Grist et al. 2009, 2012). Most of the WMT occurs at the surface from surface buoyancy fluxes. Hence, a reconstruction of the AMOC from the WMT from surface fluxes alone has been found to represent the mean and decadal changes of the AMOC well (Jackson et al. 2020; Megann et al. 2021; Langehaug et al. 2012). There may be a lag between surface flux changes and overturning changes (Josey et al. 2009). This paradigm allows a simple way of relating the AMOC to surface fluxes, and aids analysis. Several studies have shown coupled models agreeing with observations that most overturning and WMT from surface fluxes (SFWMT) occurs to the east of Greenland (Sidorenko et al. 2020; Oldenburg et al. 2021; Menary et al. 2020; Yeager et al. 2021), though one coupled model and several forced ocean models have been found to have large overturning in the LS (Oldenburg et al. 2021; Xu et al. 2018; Li et al. 2019). However, even though the east subpolar Atlantic might dominate the mean overturning, the west could still be important for decadal variability. Modelling studies have found a variety of results for the relationships between the overturning and LS properties. These include: the decadal variability is still driven by surface fluxes in the LS, despite it having a weaker mean strength (Yeager et al. 2021; Oldenburg et al. 2021; Sidorenko et al. 2021); the variability is driven by fluxes in the Iceland and Irminger Seas (IIS), with density anomalies propagating into the LS and affecting densities and mixed layer depths there (Menary et al. 2020); surface fluxes are covarying over the LS and IIS (Megann et al. 2021; Yeager et al. 2021).

In this study, we use a subset of CMIP6 climate models to address the questions of how the time mean and multidecadal variability of the SFWMT relate to the overturning in different regions, and whether the SFWMT can be used as a proxy. We also investigate how well the models compare to observations and what controls differences in the overturning in the LS. Section 2 describes the models and methods used. Section 3 examines the mean state of the overturning and SFWMT, firstly in more detail in two resolutions of the CMIP6 model HadGEM3-GC3.1, and then in a selection of CMIP6 models. Section 4 analyses the same models, but for multidecadal variability, and then conclusions are presented in the final section.

2 Models and methods

2.1 HadGEM3-GC3-1LL/MM

Much of the analysis focuses on the coupled climate models HadGEM3-GC3-1LL and HadGEM3-GC3-1MM (LL and MM), both of which contributed to CMIP6. These are two different resolutions of a global, coupled climate model with atmosphere (UM), ocean (NEMO), sea ice (CICE) and land (JULES) components, with details described in Kuhlbrodt et al. (2018) and Williams et al. (2018). HadGEM3-GC3-1LL has an atmospheric resolution of approximately 135 km and an ocean resolution of 1\(^{\circ }\); HadGEM3-GC3-1MM has an atmospheric resolution of approximately 60 km and an ocean resolution of 0.25\(^{\circ }\). Both models have the same vertical resolution. Differences in parameters and parameterizations are described in Kuhlbrodt et al. (2018), and include a parameterization for eddy-induced transports in LL, but not in MM.

The experiments analysed are 500-year-long preindustrial controls.

2.2 CMIP6 models

We use preindustrial controls for a set of CMIP6 models in addition to HadGEM3-GC3-1LL and MM, selected from those models which had the required data available (temperature, salinity, surface heat and freshwater fluxes and AMOC), and also for diversity in institution and ocean model. Consideration was also given to AMOC mean strength, so as to include several models with strengths at 26.5\(^{\circ }\) N that agree with observational estimates, while ensuring that models with overly strong and overly weak AMOC strengths are also included (Weijer et al. 2020). The models used are: ACCESS-CM2 (Dix et al. 2019), CanESM5 (Swart et al. 2019), CNRM-CM6-1 (Voldoire 2018), EC-Earth3-Veg (EC-Earth Consortium (EC-Earth) 2019), IPSL-CM6A-LR (Boucher et al. 2018), MPI-ESM1-2-LR (Wieners et al. 2019), MRI-ESM2-0 (Yukimoto et al. 2019) and NorESM2-MM (Bentsen et al. 2019).

2.3 SFWMT from an atmospheric reanalysis

The water mass transformation is estimated from observational datasets for comparison with the models. We estimate the heat and freshwater fluxes from the National Centers for Environmental Prediction (NCEP)/National Center for Atmospheric Research (NCAR) atmospheric reanalysis (Kalnay et al. 1996). To estimate density at the surface, we use a combination of sea surface temperature from NCEP/NCAR and subsurface salinity at 5 m depth from EN4.2.1 (Good et al. 2013). These fields are sub-sampled onto a common grid of 30 km. The reanalysis provides monthly estimates of the variables from 1980 to 2018, which allows us to average the water mass transformation over 39 years.

2.4 Observations

Our estimates of SFWMT and overturning are compared with numerous observational estimates. Previous studies estimated SFWMT over areas close to our definition in Fig. 1 with various atmospheric reanalyses. Desbruyères et al. (2019) estimated transformation of 5.4 ± 0.4 Sv over GIN and 15.4 ± 1.8 Sv over the entire subpolar gyre from three atmospheric reanalyses (NCEP2, ERA-I, and CERES). Marsh (2000) also estimated a transformation of 15.5 Sv north of 45\(^{\circ }\) N by using COADS1a fluxes. More recently, Petit et al. (2020) estimated SFWMT of 7 ± 2.5 Sv over the IIS, 1.5 ± 0.7 Sv over the LS and 4.7 ± 1.5 Sv over GIN from the atmospheric reanalyses NCEP and ERA5.

The overturning across OSW and OSE has also been estimated using different approaches. These include direct observations at the AR7W hydrographic line near OSW (2 Sv by Pickart and Spall 2007), direct observations from the OSNAP mooring array (2.6 ± 0.3 Sv at OSW and 16.8 ± 0.6 Sv at OSE by Li et al. (2021)), and estimates derived from a composite of direct measurement of currents and moored current meters at the Greenland–Scotland Ridge (5.7 ± 0.7 Sv by Osterhus et al. (2019)). Other observations from a regional thermohaline inverse method (Mackay et al. 2020) that suggest large values for the LS overturning (6–9 Sv) are not comparable because they identify Labrador Sea waters by temperature and salinity characteristics, rather than geographical location. We also consider estimates of the overturning convergence in different regions from volume budgets that combine direct measurement of currents, hydrography from profiling Argo floats and satellite altimetry data (9.6 ± 3.4 Sv over IIS and 8.8 ± 0.8 Sv over GIN by Chafik and Rossby (2019); 10.2 ± 1.7 Sv over IIS and 6.3 ± 1 Sv over GIN by Sarafanov et al. (2012)). Finally, an overturning of 14.3 ± 1.4 Sv was derived at 45\(^{\circ }\) N by combining geostrophic thermal-wind currents with altimetry-derived sea-surface geostrophic velocities (Desbruyères et al. 2019).

For comparison with CMIP6 models we also calculate observational values of certain metrics. For LS surface salinity we use salinity from EN4.2.1 (Good et al. 2013), taking one estimate from recent years (2000–2014), when there are more observations, and one from an earlier period (1900–1950), which is more comparable to the preindustrial period used in the models. We also calculate LS surface salinity for 2000–2014 from the CORA dataset (Cabanes et al. 2013). For March ice extent we use sea ice concentrations from HadISST (Rayner et al. 2003), and again use both an earlier estimate (1900–1950) and a present-day estimate (2000–2022). For March MLD we use the March climatology of de Boyer Montégut et al. (2004) with a density criterion of 0.03 kg/m\(^3\). Given that this uses a relatively coarse ocean grid (2\(^{\circ }\)) compared to the models, we might expect the maximum over the area to be somewhat lower than in the models.

2.5 Methods

2.5.1 Overturning

We calculate the overturning for LL and MM in density space across various sections. The overturning profiles show the cumulative (in density space) volume transport across the sections in the same way as an overturning streamfunction, but defined across sections. The difference in the overturning profile between two density classes then gives the total volume transport between those density classes. The sections are shown in Fig. 1a: these are OSNAP west (OSW), OSNAP east (OSE), the Greenland to Scotland sills (Sills), the Fram strait (Fram) and across the Atlantic at 45\(^{\circ }\) N (45 N). For each of these sections a line is defined along vorticity points of the Arakawa C grid (Madec 2008) that are as close as possible to the observed sections (Lozier et al. 2019). We use this line to extract volume fluxes on their natural grid points and preserve the model transports. These transports are regridded into density space and the overturning is calculated by summing the transports along the line, and then cumulatively summing in density space (see Menary et al. (2020); Jackson et al. (2020)). Since there can be net transport through the section, we set the overturning to be zero at the ocean floor so the overturning profile is equal to the net transport at the surface. This means that we can focus on comparing overturning in the denser levels between models and with observations, with little impact from the net transport (Zou et al. 2020). The overturning across each section is denoted as \(M_{OSW}\), \(M_{OSE}\), \(M_{Sills}\), \(M_{Fram}\) and \(M_{45N}\). We use density referenced to the surface (sigma0) so that the overturning is directly comparable to the implied overturning from SFWMT (see next section), and for comparison with OSNAP observations which also use sigma0. However, it should be noted that sigma0 can be non-monotonic in the deeper ocean, and comparisons with density referenced to 2000 m (not shown) show a slightly stronger overturning across OSE and OSW. Calculations of overturning with HadGEM3-GC3-1LL do not include parameterised eddy transports; however, these are found to be small across these sections.
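The binning and cumulative summation described above can be illustrated with a minimal sketch. This is not the code used for the analysis: the function name, the synthetic transports, the bin edges and the sign convention (southward transport of dense water counted as positive) are all assumptions made for illustration.

```python
def overturning_profile(transports, densities, bin_edges):
    """Overturning profile in density space across one section.

    transports: northward volume transport (Sv) of each cell along the section
    densities:  sigma0 (kg/m^3) of each cell, same length as transports
    bin_edges:  increasing density bin edges (kg/m^3)
    Returns (transport per density bin, profile at each bin edge).
    """
    nbins = len(bin_edges) - 1
    binned = [0.0] * nbins
    # Regrid into density space and sum along the section.
    # Cells outside the bin range are ignored in this sketch.
    for transport, rho in zip(transports, densities):
        for k in range(nbins):
            if bin_edges[k] <= rho < bin_edges[k + 1]:
                binned[k] += transport
                break
    # Accumulate from the densest edge (the "ocean floor"), where the profile
    # is set to zero. Assumed sign: southward dense transport is positive.
    profile = [0.0] * (nbins + 1)
    for k in range(nbins - 1, -1, -1):
        profile[k] = profile[k + 1] - binned[k]
    return binned, profile


# Synthetic section (illustrative values only): light water flows north,
# dense water flows south, with a net southward transport of 1 Sv.
edges = [1026.5, 1027.0, 1027.5, 1027.7, 1028.0]
binned, profile = overturning_profile(
    transports=[5.0, 3.0, -2.0, -7.0],
    densities=[1026.9, 1027.3, 1027.6, 1027.8],
    bin_edges=edges,
)
print(profile)  # value at the lightest edge equals the net transport
```

With the profile pinned to zero at the densest edge, any net transport through the section appears only at the lightest edge, so the maximum at denser levels is unaffected by it.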

We also define the convergence of the overturning in regions bordered by these sections (Fig. 1b). Hence, the convergence in the Labrador Sea is \(M_{LS}=M_{OSW}\) (excluding the small transport through the Davis Strait); the convergence in the Greenland–Iceland–Norway (GIN) Seas is \(M_{GIN}=M_{Sills}-M_{Fram}\) (excluding the small transport through the North Sea between Britain and mainland Europe); the convergence in the Iceland-Irminger Seas is \(M_{IIS}= M_{OSE}-M_{Sills}\); the convergence in the subpolar gyre is \(M_{SPG}=M_{45N}-M_{OSE}-M_{OSW}\). Then we can note that

$$\begin{aligned} M_{45N}-M_{Fram}=M_{SPG}+M_{IIS}+ M_{GIN}+M_{LS}. \end{aligned} \tag{1}$$

Since \(M_{Fram}\) is small, we can regard the transport across 45 \(^{\circ }\)N as being the sum of the convergences in the SPG, IIS, GIN and LS regions.
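As a quick check on Eq. (1), the regional convergences can be written as a small function of the section overturnings. This is a sketch only: the function name and the input values are hypothetical and are not taken from any model or observation. The same arithmetic would apply level by level to full density profiles.

```python
def regional_convergences(m_osw, m_ose, m_sills, m_fram, m_45n):
    """Overturning convergence (Sv) in each region from section overturnings."""
    return {
        "LS": m_osw,                    # Davis Strait transport neglected
        "GIN": m_sills - m_fram,        # North Sea transport neglected
        "IIS": m_ose - m_sills,
        "SPG": m_45n - m_ose - m_osw,
    }


# Hypothetical section values in Sv, chosen only to illustrate the identity.
sections = dict(m_osw=2.5, m_ose=14.0, m_sills=5.0, m_fram=0.5, m_45n=17.0)
conv = regional_convergences(**sections)

# Eq. (1): the regional convergences sum to M_45N - M_Fram.
total = sum(conv.values())
residual = total - (sections["m_45n"] - sections["m_fram"])
print(conv, total, residual)
```

The section terms cancel in pairs when the four regions are summed, which is why the identity holds exactly for any input values.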

2.5.2 Water mass transformation

It has previously been shown (Marsh 2000; Josey et al. 2009) that if you have lighter waters flowing into a region and denser waters being exported, then you can relate the overturning to the rate of transformation of water from lighter to denser density classes. This assumes that the region is in a steady state so that water masses created are exported, rather than stored. The main component of the transformation is from surface fluxes (both heat and freshwater fluxes) although mixing (Sidorenko et al. 2021; Xu et al. 2018), cabbeling and thermobaricity (McDougall 1987) can also play roles. Hence, we can estimate the water mass transformation (WMT) from surface fluxes alone (Josey et al. 2009; Desbruyères et al. 2019; Langehaug et al. 2012; Jackson et al. 2020; Megann et al. 2021).

To calculate the surface flux water mass transformation (SFWMT), we first calculate the surface buoyancy flux (see also Marsh (2000); Groeskamp et al. (2019)) using

$$\begin{aligned} B= -\alpha \frac{Q}{C_p} - \beta \frac{\rho s W}{1-s} \end{aligned}$$

where \(Q\) is the surface heat flux, \(C_p\) the specific heat capacity of water, \(\rho \) the surface density, \(s\) the non-dimensional surface salinity and \(W\) the surface fresh water flux (from precipitation, evaporation, runoff and ice processes). We also use the thermal (\(\alpha \)) and haline (\(\beta \)) expansion coefficients which are calculated at each grid point from the gradient of surface density with respect to temperature and salinity.
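The expression for \(B\) translates directly into code. The sketch below makes several assumptions that the text does not state: \(Q\) and \(W\) are positive into the ocean, \(W\) is a volume flux in m/s, \(\alpha \) and \(\beta \) are positive fractional expansion coefficients, and \(C_p\) takes a typical seawater value. The function name and all numerical values are illustrative.

```python
def surface_density_flux(q, w, s, rho, alpha, beta, cp=3990.0):
    """Surface flux B = -alpha*Q/Cp - beta*rho*s*W/(1 - s), in kg m^-2 s^-1.

    q:     surface heat flux (W/m^2), assumed positive into the ocean
    w:     surface freshwater flux (m/s), assumed positive into the ocean
    s:     non-dimensional surface salinity (e.g. 0.035)
    rho:   surface density (kg/m^3)
    alpha: thermal expansion coefficient (1/K), assumed positive
    beta:  haline contraction coefficient (per unit salinity), assumed positive
    cp:    specific heat capacity (J/kg/K); 3990 is an assumed typical value
    """
    heat_term = -alpha * q / cp
    freshwater_term = -beta * rho * s * w / (1.0 - s)
    return heat_term + freshwater_term


# Illustrative winter conditions: 100 W/m^2 heat loss, no freshwater flux.
b_cooling = surface_density_flux(q=-100.0, w=0.0, s=0.035, rho=1027.5,
                                 alpha=1.0e-4, beta=7.6e-4)
# Illustrative freshwater input with no heat flux.
b_freshening = surface_density_flux(q=0.0, w=1.0e-8, s=0.035, rho=1027.5,
                                    alpha=1.0e-4, beta=7.6e-4)
print(b_cooling, b_freshening)
```

Under these sign assumptions, heat loss gives a positive (densifying) flux and freshwater input gives a negative (lightening) flux.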

We then calculate the area integrated surface buoyancy flux \(B_A(\rho )\) over the area north of where the isopycnal \(\rho \) outcrops and within each region A. The SFWMT is then

$$\begin{aligned} F_A(\rho )=\frac{\partial B_A(\rho )}{\partial \rho } \end{aligned}$$

which gives the overturning implied from transformation by surface fluxes alone.
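In discrete form, the derivative with respect to density is commonly approximated by summing the area-weighted flux over cells whose surface density falls within a narrow window around each density level, then dividing by the window width. The sketch below follows that approach. It is an illustration under assumptions: the function name, the window width, the sign convention (densification positive) and the single-cell example are not from the text.

```python
def sfwmt(density_flux, surface_density, cell_area, rho_levels, drho=0.1):
    """Surface-flux water mass transformation (Sv) at each density level.

    density_flux:    B in each surface cell (kg m^-2 s^-1)
    surface_density: surface density in each cell (kg/m^3)
    cell_area:       cell areas (m^2)
    rho_levels:      density levels at which to evaluate F (kg/m^3)
    drho:            width of the outcrop window (kg/m^3), an assumed value
    """
    result = []
    for rho in rho_levels:
        lower, upper = rho - 0.5 * drho, rho + 0.5 * drho
        total = 0.0
        for b, r, area in zip(density_flux, surface_density, cell_area):
            if lower <= r < upper:
                total += b * area          # kg/s through the outcrop window
        result.append(total / drho / 1.0e6)  # m^3/s converted to Sv
    return result


# One illustrative cell: B = 1e-6 kg m^-2 s^-1 over 1e11 m^2 at 1027.5 kg/m^3.
f = sfwmt(density_flux=[1.0e-6], surface_density=[1027.5],
          cell_area=[1.0e11], rho_levels=[1027.5, 1027.8])
print(f)
```

In practice the calculation would be repeated for each monthly field and each region before averaging, as described later in this section.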

While water mass transformation can be related to the overturning, water mass formation (WMF) instead shows where transports of water of given density classes are created and destroyed. Water mass formation is given by \(\Delta F_A(\rho )\), where we use a bin size of 0.1 kg/m\(^3\) for the differences.
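The step from transformation to formation is a simple difference across density bins. The following sketch uses hypothetical transformation values on levels spaced 0.1 kg/m\(^3\) apart, and assumes the sign convention that formation in a bin is the transformation entering at its lighter edge minus that leaving at its denser edge.

```python
def formation(transformation):
    """Water mass formation (Sv) in each bin between consecutive levels.

    transformation: F at density levels ordered from light to dense (Sv).
    Assumed sign: positive means water of that density class is formed,
    negative means it is destroyed.
    """
    return [transformation[k] - transformation[k + 1]
            for k in range(len(transformation) - 1)]


# Hypothetical transformation profile at levels 0.1 kg/m^3 apart.
f_levels = [0.0, 4.0, 10.0, 3.0, 0.0]
wmf = formation(f_levels)
print(wmf)  # light classes destroyed (negative), dense classes formed (positive)
```

Because the transformation vanishes at both ends of this profile, the formation sums to zero: water destroyed in the lighter classes reappears in the denser classes.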

Although there is an assumption that the overturning is in balance with surface fluxes, this may not hold on shorter time scales (Petit et al. 2021; Kostov et al. 2019). Previous studies (Grist et al. 2009, 2012) showed that there was reasonable agreement between the variability of the overturning and SFWMT on decadal timescales and longer, though there may be lags of a few years between the two (Josey et al. 2009; Desbruyères et al. 2019). Hence, we limit our analysis to using decadal means. However, all calculations of WMT and overturning are done using monthly mean fields to account for the impact of the seasonal cycle of density and surface fluxes on the SFWMT, with results shown as decadal means.
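The order of operations matters here: the diagnostic is computed from each monthly field first and only then averaged. A minimal sketch of the averaging step is given below. It assumes non-overlapping decades and drops any incomplete final decade. The text does not say whether the decadal means are non-overlapping or running means, so this choice is an assumption, as are the function name and the synthetic series.

```python
def decadal_means(monthly_values, months_per_decade=120):
    """Non-overlapping decadal means of a monthly time series.

    Any incomplete final decade is dropped (an assumption of this sketch).
    """
    n_decades = len(monthly_values) // months_per_decade
    means = []
    for d in range(n_decades):
        chunk = monthly_values[d * months_per_decade:(d + 1) * months_per_decade]
        means.append(sum(chunk) / months_per_decade)
    return means


# Synthetic monthly series: two full decades plus ten trailing months.
series = [1.0] * 120 + [3.0] * 120 + [9.0] * 10
print(decadal_means(series))
```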

3 Mean state

3.1 HadGEM3-GC3-1 overturning

The Atlantic overturning streamfunction in density space in both LL and MM shows a typical AMOC overturning cell, with surface waters becoming denser as they move northwards in the North Atlantic, and then dense water flowing southwards (Fig. 2). Much of the densification occurs south of 67\(^{\circ }\) N, but there is some water which flows into the GIN seas (north of 67\(^{\circ }\) N), becoming very dense there. However, this very dense signal is lost as the water returns south, because as the dense water passes over the sills between Greenland and Scotland it mixes with lighter waters in overflows (Legg et al. 2008).

The overturning across the sections (Fig. 1) is shown in Fig. 3a, b. Observations show overturning transports across OSE and OSW are 16.8 ± 0.6 and 2.6 ± 0.3 Sv, respectively (Li et al. 2021), and Menary et al. (2020) and Jackson et al. (2020) have previously shown that the OSNAP sections in these models compare well with observations, both in the mean state and monthly variability. In both models, the magnitude and density of the maximum overturning across 45\(^{\circ }\) N is similar to that across OSE, suggesting little modification of deep transports between the OSNAP line and 45 \(^{\circ }\) N, though transports in the upper limb become denser in the SPG in MM. Transports across the Sills section account for some of the transport across OSE (44\(\%\) in LL and 27\(\%\) in MM). The transports across the Sills at the densest levels do not reach OSE (resulting in a negative contribution from IIS, Fig. 3c, d), likely because diapycnal mixing in the overflows shifts transports to lighter density classes. There is some very dense water that passes through the Fram Strait from the Arctic. These sections suggest that this might continue to the Sills section.

Since the sum of the overturning convergences is approximately equal to \(M_{45N}\) (because \(M_{Fram}\) is relatively small; see Eq. 1), we can investigate which region has the largest contribution to the overturning across 45\(^{\circ }\) N (Fig. 3c, d). Results show contributions from SPG at around 1026.5–1027.5 kg/m\(^3\) (though this is small in LL), contributions from IIS at around 1027–1027.8 kg/m\(^3\), small contributions from LS at around 1027.5–1027.8 kg/m\(^3\) and contributions from GIN at around 1027.3–1028.2 kg/m\(^3\). In particular, we note that the region with the largest contribution to the peak overturning at around \(\rho = 1027.6\) kg/m\(^3\) is IIS in both models, though in LL there is a similar contribution from the GIN seas.

There are some differences between the two models. MM has a stronger overturning at 45\(^{\circ }\) N (12.8 and 17.4 Sv for LL and MM, respectively), which can be attributed to a stronger contribution from IIS. MM also has a slightly greater overturning from the LS and weaker overturning from the GIN seas. Jackson et al. (2020) attribute this difference to a stronger subpolar gyre and a more westerly position of the North Atlantic Current in MM, resulting in greater transport of warm, saline subtropical waters into the western subpolar North Atlantic, rather than the GIN seas, and hence more heat loss and WMT in the LS. Another difference is that the upper branch of the overturning across 45\(^{\circ }\) N is lighter in MM than LL, with greater transformation to denser levels in the SPG. This can be related to temperature biases in the models: LL has a large cold bias across the subpolar North Atlantic, and hence less heat loss and SFWMT there (Jackson et al. 2020).

3.2 HadGEM3-GC3-1 surface flux water mass transformation

To understand how much of the overturning in density space can be attributed to surface fluxes, we calculate the implied overturning convergence from SFWMT. The SFWMT (Fig. 3e, f) has many similarities with the overturning convergences (Fig. 3c, d). In particular, the SFWMT is of similar magnitude to the overturning in all regions. Differences between the overturning and SFWMT are likely to be caused by diapycnal mixing, with the time-dependent storage and release unlikely to have a role in the long-term average.

A greater physical understanding can be gained by examining water mass formation as well as transformation from surface fluxes. Since formation is calculated as the difference of SFWMT across a density bin, we compare this to the actual transport in that density bin (rather than the overturning which is the depth-integrated transport). The horizontal convergences of transports and WMF in each region are shown in Fig. 4.

In the SPG, the upper panels of Fig. 4 show import of waters < 1027.1 kg/m\(^3\) and export of waters of 1027.2–1027.5 kg/m\(^3\), with the bottom panels showing the destruction and formation of those respective water masses by surface fluxes. The density class exported from the SPG (1027.2–1027.5 kg/m\(^3\)) enters the IIS and GIN seas, where it is transformed by surface buoyancy fluxes to denser classes of water. In the IIS, waters of density 1027.3–1027.7 kg/m\(^3\) (slightly denser in MM) are formed by surface fluxes; however, the water exported is denser, suggesting that mixing with denser waters within the IIS is important in setting the waters exported from the IIS (and across 45\(^{\circ }\) N). In the GIN seas dense waters (1027.85–1028.05 kg/m\(^3\)) are formed, with some mixing modifying the dense waters exported from the GIN seas. Most of these dense waters are imported into the IIS (although there is some exchange across the Fram strait); however, these dense waters are not exported across OSE (Fig. 3a, b). They are likely destroyed in the IIS by mixing to lighter density classes, contributing to the large export of waters at around 1027.8 kg/m\(^3\), and the densification of the waters formed within the IIS. However, we note that the total export of dense waters in IIS (Fig. 3c, d) has a similar magnitude to that implied by the WMT, so the mixing shifts the transports to different density classes, but does not change the total transport in the lower limb of the overturning. In the LS there is formation of dense waters at 1027.7–1027.85 kg/m\(^3\) (slightly denser in MM). This peak, taken together with the peak in the SPG at similar densities (likely because the OSW line dividing LS and SPG does not capture all the WMF in the LS region), explains the peak in total SFWMT in both models. The water exported is modified by mixing. In particular, in MM the resulting export and overturning have a double peak, which is similar to that found in the observations (Lozier et al. 2019). We hypothesise that this is a result of mixing of water formed in the LS with different water masses.

Although the LS (and dense contribution from the SPG) dominates the peak in water mass formation, this only occurs over a small density bin. Since the overturning is related to the transformation (the cumulative sum of the formation), the transformation in the IIS, which occurs over a larger density range, is larger than that in the LS.

We find there is a clear role for mixing in modifying water masses after formation; however, we note that the SFWMT is a reasonable predictor of the overturning from each region, even in the IIS and LS where mixing is found to be important. This is likely because, in many cases, the mixing modifies the densities of transports within the region, resulting in the overturning profile shifting to different density classes, rather than changing the maxima.

3.3 CMIP6

We have shown that in LL and MM the overturning profiles implied by SFWMT are a reasonable approximation for the actual overturning profiles. Previous studies have found that SFWMT is also a reasonable approximation for the overturning in other models (Megann et al. 2021; Langehaug et al. 2012; Grist et al. 2012), though mixing might have a more important role in some models (Oldenburg et al. 2021; Yeager et al. 2021). We make use of an ensemble of CMIP6 models with a range of AMOC strengths (Fig. 5). We find that there is a good agreement between the strengths of the SFWMT north of 45\(^{\circ }\) N and the AMOC overturning in density space across 45\(^{\circ }\) N, where that diagnostic is available, and also a significant correlation between the strength of the SFWMT north of 45\(^{\circ }\) N and the overturning in depth space at 26.5\(^{\circ }\) N.

The SFWMT profiles are shown in Fig. 6. These show qualitatively the same behaviour as in the HadGEM3-GC3-1 models, with the overturning peak in SPG being at a lighter level than that in IIS, and with the peak in GIN being at the densest level. At the density of largest total SFWMT (where the total strength is measured), the IIS SFWMT has an important contribution to the total for all models; however, SPG and GIN also have large contributions. The SFWMT contribution to the overturning across OSE is stronger than that across OSW in all models. The overturning in the LS has a large range of magnitudes: in most models this is small (1–5 Sv); however, in three models (ACCESS-CM2, EC-Earth3-Veg, CanESM5) there is no dense SFWMT in the LS, and in one model (NorESM2-MM) there is overly strong SFWMT in the LS.

Figure 7 compares the SFWMT in the CMIP6 models with various observational estimates. Black lines show SFWMT estimated from observational products from 39 years of data, while symbols show reported estimates from observations of the overturning itself and of the SFWMT from previous studies. In general there is a good agreement between the models and observations, particularly in the GIN and IIS regions. In the SPG there is good agreement of most models, though there is only the one observational estimate (black line). The SPG SFWMT is very weak in two models, CanESM5 and HadGEM3-GC3-1LL, with the latter having a known cold bias in the SPG which reduces heat loss and SFWMT (Jackson et al. 2020). In the LS observations have a range of 1.2–3.4 Sv. Most models agree with a small LS overturning, though NorESM2-MM has a strong SFWMT and three models have very little SFWMT. For overturning across sections rather than in regions, overturning across OSW is the same as in the LS by definition. For OSE there is a large range of observational values, though this is not seen in the SFWMT of individual regions feeding into OSE (IIS and GIN). The total transports across 45\(^{\circ }\) N are often stronger in models than the observations, however this is not clearly the case in any individual region. We note that observations can differ because of different methodologies and different time periods. This leads to uncertainties about the values of long term mean strengths. Although some differences could be caused by neglecting mixing when calculating SFWMT in the models and in some observations, there is no clear difference in observed values of SFWMT compared to velocity-based estimates.

There are many processes in the LS and wider western subpolar North Atlantic (SPNA) that can affect the water mass transformation there, and hence the overturning. Heat loss causes WMT, so the greater the transport into the region of warm, saline subtropical waters, the greater the potential for heat loss and WMT (Jackson et al. 2020). Transport of cold, fresh polar waters via the east and west Greenland boundary currents, and the mixing of boundary and interior waters (Tagklis et al. 2020), can also affect the surface densities and hence the stratification and heat loss. Sea ice could also have an important role in restricting heat exchange between the ocean and atmosphere in winter, and through freshwater fluxes from freezing and melting that have a local effect on the stratification and thus on SFWMT (Langehaug et al. 2012; Kostov et al. 2019; Wu et al. 2021). Subsurface properties could also affect SFWMT through changing stratification, and hence deep convection. This paper does not aim to fully understand the controls on the SFWMT, but it can provide some information on the relationships.

Jackson et al. (2020) suggested that the amount of subtropical waters reaching the western SPNA affects the SFWMT occurring there. Salinity is a better indicator of this water mass, since heat loss to the atmosphere modifies the surface temperature. We see correlations across the models of maximum SFWMT in the LS and IIS with LS (50–60\(^{\circ }\) N, 45–55\(^{\circ }\) W) salinity (Fig. 8a,b) and temperature (not shown). Since the LS SFWMT is also correlated with the salinity in the IIS (upstream of the LS, not shown), this suggests that the relationship is not caused by local effects on salinity (such as convection) in the LS. Those models with warm, salty waters in the IIS and LS have stronger SFWMT there and those with cold, fresh waters have weak SFWMT (with the freshest models having no SFWMT in the LS).

Figure 8c–f show relationships between the SFWMT in the LS and IIS and both March sea ice extent and March mixed layer depth (MLD; a proxy for deep convection). The correlations are only significant for the SFWMT in the IIS, since NorESM2-MM is an outlier in both for the LS. This suggests that ice extent and MLD are not directly influencing the SFWMT in the LS. They may influence the SFWMT in combination with other processes, or may simply be responding to other factors, for example the differences in temperature and salinity. These results, in combination with the strong correlation between LS SFWMT and IIS salinity (upstream of the LS), suggest that the drivers of differences in LS SFWMT are not processes local to the LS.

Observational constraints on SFWMT, salinity and sea ice suggest that those models with moderate to strong LS and IIS SFWMT agree best with observations. However, March MLD is overestimated in nearly all the models. This shows that models can agree well with observations of SFWMT, salinity and ice extent, yet have a mixed layer that is much too deep.

4 Decadal AMOC variability

4.1 HadGEM3-GC3-1 overturning

Although the overturning strength is often measured as the maximum overturning in density (or depth) space, it should be noted that the density class where the overturning is strongest differs substantially between sections and regions (Fig. 3). One method for measuring contributions to the variability of the AMOC at 45\(^{\circ }\) N is to use a fixed density level for all regions (chosen to be the density where the AMOC at 45\(^{\circ }\) N is maximum). In LL the maximum of the mean overturning is at 1027.58 kg/m\(^3\) and in MM at 1027.63 kg/m\(^3\), with the density of maximum overturning varying little between decades (up to 0.04 kg/m\(^3\)). The AMOC strength at 45\(^{\circ }\) N and at this density is defined as m45, with both models showing multidecadal variability (Fig. 2c). One advantage of using a fixed density level is that we can make use of Eq. (1) to quantify the contributions of different regions to the AMOC timeseries at 45\(^{\circ }\) N. Figure 9 shows regressions (bar lengths) and correlations (numbers) of m45 with the timeseries at the various sections and regions in MM and LL. Note that since \(M_{Fram}\) is not included, the regressions do not quite sum to one. In both models the strongest correlations and regressions are with the transport across OSE and the convergence in IIS. For LL there are also significant contributions to decadal variability from LS, and in MM there are significant contributions from the SPG.
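The reason the regressions should sum to nearly one can be illustrated with a short sketch. It assumes, as a simplification, that Eq. (1) expresses m45 as a sum of contributions at the fixed density level; the component names and the synthetic timeseries below are hypothetical and are not model output. When each component is regressed on the total, the slopes sum to exactly one, and omitting a small term (here labelled Fram) leaves the sum slightly different from one.

```python
import math

def regress_on(target, series):
    """Least-squares slope of `series` regressed on `target`."""
    n = len(target)
    mean_t = sum(target) / n
    mean_s = sum(series) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(series, target)) / n
    var = sum((t - mean_t) ** 2 for t in target) / n
    return cov / var

# Hypothetical decadal-mean anomalies (Sv), for illustration only.
n_decades = 50
components = {
    "GIN":  [0.2 * math.sin(0.4 * k) for k in range(n_decades)],
    "IIS":  [0.8 * math.sin(0.4 * k + 0.5) for k in range(n_decades)],
    "LS":   [0.3 * math.cos(0.9 * k) for k in range(n_decades)],
    "SPG":  [0.4 * math.sin(1.3 * k + 1.0) for k in range(n_decades)],
    "Fram": [0.05 * math.cos(0.2 * k) for k in range(n_decades)],
}

# Assumed form of the decomposition: the total is the sum of all components.
m45 = [sum(vals) for vals in zip(*components.values())]

slopes = {name: regress_on(m45, series) for name, series in components.items()}
sum_all = sum(slopes.values())
sum_without_fram = sum(v for name, v in slopes.items() if name != "Fram")

print(slopes)
print(sum_all, sum_without_fram)
```

Because covariance is linear, the slopes of the components add up to the slope of the total regressed on itself, which is one. The shortfall when a component is left out equals that component's own regression coefficient.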

Although using a fixed density level helps us to quantify contributions from different components, mixing could shift the density class of a signal between different regions. Hence, a greater understanding is achieved by looking at correlations and regressions of overturning profiles with m45. These are shown in the upper two rows of Fig. 10. At the density of maximum overturning (dashed grey lines), the regressions have the same values as shown in Fig. 9, with the strongest regressions in the IIS. At denser levels we see significant relationships of m45 with the overturning in other regions: in LL there is a significant relationship (though regression coefficients are relatively small) with the convergence in the GIN seas; in MM there are strong correlations and regressions with the overturning in the LS.

4.2 HadGEM3-GC3-1 SFWMT

Understanding the roles of surface flux driven transformation in overturning variability is useful for understanding mechanisms. We may also be able to understand better whether the SFWMT is a reliable indicator of actual overturning variability. Table 1 shows regressions of decadal timeseries of overturning convergences within each region against timeseries of the implied overturning from SFWMT. Timeseries are calculated using the maximum in density space, to allow for potential shifts of the profile in density space from mixing, and correlations are strongest at zero lag. For most regions the implied overturning from SFWMT is a good indicator of actual overturning variability on decadal timescales, with significant correlations and regression coefficients near 1. In most of these regions the regression coefficient is slightly smaller than 1 implying that the magnitude of overturning variability is smaller than that of SFWMT. The exception to this result is the GIN seas, where the overturning variability is half that of the SFWMT in LL, while in MM they are not significantly correlated. This could be because the formation and export of water masses in the GIN seas are not in balance on decadal timescales (leading to the storage of density anomalies in the GIN seas), because some water masses formed in the GIN seas are exported northwards into the Arctic, or because mixing has a large role in modifying variability in the GIN seas. The weaker relationship between overturning and SFWMT in GIN affects that in the sum of the regions (TOT), with higher regression coefficients found when excluding the GIN region (TOT-GIN).
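The link between a regression coefficient below 1 and the relative size of the variability can be made explicit. As a sketch, write \(M\) for the timeseries of overturning convergence and \(F\) for the implied overturning from SFWMT (symbols introduced here for illustration only), and assume an ordinary least-squares regression of \(M\) on \(F\). The regression coefficient \(b\) is then

\[ b = \frac{\mathrm{cov}(M,F)}{\mathrm{var}(F)} = r\,\frac{\sigma _M}{\sigma _F}, \]

where \(r\) is the correlation and \(\sigma _M\), \(\sigma _F\) are the standard deviations of the two timeseries. Hence \(\sigma _M/\sigma _F = b/r\), so a coefficient below 1 implies smaller overturning variability only when \(b < r\), which is the case where the correlation is close to 1.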

As well as examining the relationships between SFWMT and overturning convergences in each region, we can also examine how the SFWMT in each region is related to the total overturning across 45\(^{\circ }\) N. Figure 10e, f shows regressions of the SFWMT with m45. There are many similarities with the regressions with the overturning convergences (Fig. 10c, d), but also some differences. There is good qualitative agreement around and above the density of the maximum AMOC (around 1027.6 kg/m\(^3\)). At denser levels (around 1027.75 kg/m\(^3\)), the total SFWMT is much stronger in LL than the actual overturning, suggesting that variability from the WMT by surface fluxes is damped, possibly by mixing. This peak in total SFWMT has contributions from the SPG (particularly in LL), which likely occur near the Labrador Sea, but south-east of OSW, since there is a similar signal in the SFWMT in the LS. The strong relationship with the SFWMT in the SPG at this density is not seen in the actual overturning, suggesting that it is obscured by mixing, or possibly by residence times longer than a decade. In the LS there are also differences in MM, with the actual convergence showing a double peak in the regression coefficient, whereas the SFWMT has only one peak. Again we hypothesise that the upper peak is driven by mixing.

In the GIN seas there is a strong relationship between SFWMT and m45 at densities higher than 1027.8 kg/m\(^3\), resulting in regression coefficients of 0.5–0.6 (Fig. 10e, f). This implies that for every 1 Sv of variability in m45 there is 0.5–0.6 Sv of variability of the SFWMT in the GIN seas. However, this only translates into 0.1–0.2 Sv of overturning across the Sills (Fig. 10a, b). In MM there is also little actual convergence (Fig. 10d), so the SFWMT variability is dissipated by mixing or by the residence time in the GIN seas. The small regression values for transports over the Sills suggest that variability of GIN seas overturning cannot have a substantial impact on the overturning at 45\(^{\circ }\) N. It is possible that the correlations are caused by co-varying surface fluxes, or that overturning variability south of the Sills affects the transport of lighter, warmer waters into the GIN seas, and that this affects the transformation there.

4.3 CMIP6

The CMIP6 models exhibit variability of various timescales and magnitudes (Fig. 5b). Since previous studies (Grist et al. 2009, 2012; Megann et al. 2021), and the previous analysis of LL and MM, have shown good agreement between total SFWMT and AMOC timeseries on decadal timescales and longer, we limit our analysis to the variability of decadal mean SFWMT which will inform us about multidecadal variability. For those CMIP6 models where the AMOC in density space is available (Fig. 5), we find significant correlations in all models between decadal means of the AMOC in density space at 45\(^{\circ }\) N and the total SFWMT north of 45\(^{\circ }\) N, either including or not including the GIN seas region (since SFWMT formed here may not be exported across the sills).

The standard deviations of decadal mean SFWMT are shown in Fig. 11, and show large variability (> 1 Sv) in all models in SPG, IIS and GIN. However, there are large intermodel differences in the magnitude of variability in the LS, with some models showing large variability and others showing very little. The standard deviation is correlated with the mean LS SFWMT (not shown), with models with weak mean SFWMT having very little variability and models with strong mean SFWMT having larger variability. If variability in each region were independent and uncorrelated then the sum of variability (black dashed line; calculated as the square root of the sum of individual variances) would be the same as the total (black line). For some models and density classes the sum is larger than the total, implying negative correlations (compensation) between the components, and in some it is smaller, implying positive correlations.
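The comparison between the quadrature sum and the total follows from the identity Var(total) = sum of variances + 2 × sum of covariances. The sketch below uses small hypothetical anomaly series (not model output) to show how the sign of the covariance between regions moves the standard deviation of the total above or below the square root of the summed variances; the function names are illustrative.

```python
import math

def variance(x):
    """Population variance of a sequence."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def quadrature_sum(regions):
    """Square root of the sum of the individual regional variances."""
    return math.sqrt(sum(variance(r) for r in regions))

def total_std(regions):
    """Standard deviation of the sum over regions at each time."""
    total = [sum(vals) for vals in zip(*regions)]
    return math.sqrt(variance(total))

# Hypothetical decadal anomalies (Sv), for illustration only.
a = [1.0, -1.0, 2.0, -2.0, 0.5, -0.5]
in_phase = [a, [0.5 * v for v in a]]            # positively correlated regions
out_of_phase = [a, [-0.5 * v for v in a]]       # negatively correlated regions
uncorrelated = [[1.0, -1.0, 1.0, -1.0],
                [1.0, 1.0, -1.0, -1.0]]         # zero covariance

for label, regions in [("in phase", in_phase),
                       ("out of phase", out_of_phase),
                       ("uncorrelated", uncorrelated)]:
    print(label, quadrature_sum(regions), total_std(regions))
```

With positively correlated regions the total exceeds the quadrature sum, with compensating regions it falls below it, and with zero covariance the two coincide.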

Since we only have the actual overturning in density space from a few models, we cannot calculate regressions of SFWMT with m45, as done for HadGEM3-GC3-1LL and MM in Fig. 10. Instead we calculate regressions of SFWMT with the AMOC at 26\(^{\circ }\) N in depth space (m26z; Fig. 12). We note that regressions with m26z, with the AMOC at 45\(^{\circ }\) N in depth space (m45z) and with m45 (where available) mostly show the same relationships. The exceptions are MRI-ESM2-0 and ACCESS-CM2, where the differences in responses are nonetheless within the range of the ensemble (not shown).

All models show significant regressions with SFWMT in the GIN seas (purple lines for GIN are overlain by black lines for TOT in many cases); however, we note that in LL and MM the resulting transport across the Sills associated with m45 (measured by the regression coefficient) is small. Although we do not have the overturning across the Sills section for all the models, we do have the overturning in density space across 67\(^{\circ }\) N (which is close to the Denmark Strait) for three other CMIP6 models. In ACCESS-CM2 there is a significant correlation with m45, with a regression coefficient of 0.4 (40\(\%\) of the regression coefficient for SFWMT); in NorESM2-MM the correlation is significant, but with a small regression coefficient of 0.1; and in MRI-ESM2-0 the correlation is not significant (not shown). Hence, the GIN seas might have a larger role in some models: for instance, in ACCESS-CM2 a 1 Sv change in m45 is associated with a 1 Sv change in GIN SFWMT and a 0.4 Sv change in the overturning across the Denmark Strait. However, in all models the variability of transports across the sills associated with m45 is less than half of, and in some cases much smaller than, the variability of GIN SFWMT.

If this is true for the remaining models, then the variability of SFWMT in the GIN seas would not contribute to the AMOC variability further south. All models show significant correlations of m26z with SFWMT in lighter waters of the SPG, and most models show significant correlations with SFWMT in IIS and/or LS in denser water classes. Although most of the relationships are the same or less significant if considering m26z lagging by 10 years, in two models (MPI-ESM1-2-LR and MRI-ESM2-0), there is a significant correlation of m26z with the SFWMT in the LS in the previous decade, rather than instantaneously (Fig. 13).
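A lagged regression of this kind can be sketched as follows. The example assumes non-overlapping 10-year block means and an ordinary least-squares regression of SFWMT on the AMOC index; the function names and the synthetic series are hypothetical, chosen so that the AMOC responds one decade after the SFWMT, and are not taken from any model.

```python
import math

def decadal_means(annual):
    """Average consecutive, non-overlapping 10-year blocks; drop a partial final block."""
    n_blocks = len(annual) // 10
    return [sum(annual[10 * i:10 * i + 10]) / 10 for i in range(n_blocks)]

def _cov(x, y):
    """Population covariance of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def lagged_stats(sfwmt, amoc, lag):
    """Pair sfwmt[k] with amoc[k + lag], i.e. the AMOC lags by `lag` decades.

    Returns (slope of SFWMT regressed on AMOC, correlation).
    """
    s = sfwmt[:len(sfwmt) - lag]
    a = amoc[lag:]
    slope = _cov(s, a) / _cov(a, a)
    corr = _cov(s, a) / math.sqrt(_cov(s, s) * _cov(a, a))
    return slope, corr

# Hypothetical annual LS SFWMT anomalies (Sv), for illustration only.
annual_sfwmt = [math.sin(0.07 * yr) + 0.3 * math.cos(0.31 * yr) for yr in range(300)]
sfwmt = decadal_means(annual_sfwmt)

# Hypothetical AMOC index that follows the SFWMT one decade later at half the amplitude.
amoc = [0.0] + [0.5 * v for v in sfwmt[:-1]]

slope0, corr0 = lagged_stats(sfwmt, amoc, lag=0)
slope1, corr1 = lagged_stats(sfwmt, amoc, lag=1)
print("instantaneous:", slope0, corr0)
print("AMOC lagging by one decade:", slope1, corr1)
```

In this constructed case the correlation is higher when the AMOC lags by one decade than at zero lag, which is the signature described above for the LS in MPI-ESM1-2-LR and MRI-ESM2-0.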

The regressions of LS SFWMT with m26z vary considerably between models. In the three models with weak mean SFWMT in the LS (ACCESS-CM2, EC-Earth3-Veg, CanESM5), there is no correlation with denser LS SFWMT because there is little variability. If we order the models from weakest to strongest LS SFWMT (Fig. 13) we can see this is part of a pattern: models with a stronger mean LS SFWMT have stronger regressions of LS SFWMT against m26z, and the largest regressions generally occur at denser levels. Those models with the best agreement with observations of mean LS overturning (IPSL-CM6A-LR, HadGEM3-GC3-1LL, MPI-ESM1-2-LR, CNRM-CM6-1) suggest changes of \(\sim \) 0.5 Sv in the LS overturning per 1 Sv change in m26z. However, these relationships are mostly at denser levels than the maximum of the overturning and it is unclear how much they are driving variability of the AMOC at 45 or 26\(^{\circ }\) N.

Although there are relationships between the mean state and variability of overturning in the LS, there are no clear relationships in other regions. Details of regression patterns vary a lot between models (Fig. 12), possibly because variability in these models differs in terms of the location of the drivers and/or the importance of mixing.

5 Conclusions

This study has examined which regions contribute to the time mean and multi-decadal variability of the AMOC, and how much of the overturning is related to water mass changes driven by surface fluxes. In analysis of two models (HadGEM3-GC3-1LL and HadGEM3-GC3-1MM) it is found that the overturning reconstructed from surface flux driven water mass transformation (SFWMT) is a good indicator of the mean strength of the actual overturning. Mixing modifies densities and can shift the overturning profiles, but does not have significant impact on the maximum overturning strength.

For multidecadal variability, SFWMT is a good indicator of overturning variability (significantly correlated with regression coefficients similar to 1) in all regions except GIN. However, some details, such as the double peak in LS profiles, are not captured by SFWMT, suggesting mixing may play a role. In the GIN seas, although there is strong variability of SFWMT associated with the AMOC, the associated variability in the waters exported across the Sills is found to be much smaller than suggested by the SFWMT. This suggests that the water masses formed are not in balance with those exported south on decadal timescales, so anomalies are either modified by mixing within the GIN seas, or remain in the GIN seas.

In all the models examined here the mean overturning across OSE is stronger than that across OSW, in agreement with observations. These results also agree with observational findings that the IIS is a major contributor to the mean overturning, although SPG and GIN also have large contributions in some models. The overturning in the mean state in the LS is mostly found to be small. Despite many similarities between the mean states of models, relationships of multidecadal variability in SFWMT in different regions and the AMOC at 26\(^{\circ }\) N are very diverse.

Although the mean overturning in the LS is mostly found to be small, strong relationships are found across models, with those models with the freshest LS having the weakest LS overturning and the smallest variability. Those models with a more saline LS have stronger LS SFWMT and larger regression coefficients between the LS SFWMT and the AMOC further south at 26.5\(^{\circ }\) N, possibly indicating stronger causal relationships between variability of the LS SFWMT and the AMOC at 26.5\(^{\circ }\) N.

These results suggest that many of the models examined compare well to observations of overturning, despite previous arguments that many ocean and climate models have too strong an emphasis on the Labrador Sea. In fact, we find here that only one model has an overly strong LS overturning while three have too weak an overturning. However, although this may provide some reassurance as to the validity of these models, there are still issues with the representation of processes such as mixing in overflows, eddy mixing and restratification that could have a detrimental impact on the representation of the AMOC (Fox-Kemper et al. 2019). In particular, it should be noted that none of these climate models have sufficient horizontal resolution to resolve eddies at subpolar latitudes or to resolve narrow boundary currents, which could impact their abilities to represent water mass transformation. Also it is possible that different models (for example with different mixing parameterisations) might have stronger contributions to the overturning from mixing, and hence might have less strong relationships between overturning and SFWMT.

The relationships found here between the overturning in the LS and the salinity there have implications for model development, providing motivations for the reduction of biases. These results also suggest that locations driving variability, and potentially the mechanisms involved, could also be affected by the model mean state. Hence, to understand mechanisms of variability, biases in the mean state should be considered.

Table 1 Regression coefficients (correlations in brackets) for regressions of decadal timeseries of maximum overturning convergences with implied timeseries from SFWMT in models LL and MM
Fig. 1

Locations of sections (top) and regions (bottom). Colours indicate the different sections and regions (see legends)

Fig. 2

Time mean overturning in density space in LL (top left) and MM (top right). Bottom panel shows timeseries of decadal mean m45 (maximum in density space of the AMOC at 45\(^{\circ }\) N). Overturning is calculated using monthly mean fields

Fig. 3

Overturning across sections (top panels), overturning convergences in regions between sections (middle panels) and SFWMT in regions (bottom panels). Shown are results for LL (left) and MM (right)

Fig. 4

Volume transport convergences (top panels) and water mass formation (bottom panels) in regions for LL (left) and MM (right). All are totals in density bins of size 0.04 kg/m\(^3\). Positive (negative) values show southwards (northwards) transports in the upper panels, and formation (destruction) of water masses in the bottom panels

Fig. 5

AMOC in CMIP6 models. a Time-mean profiles of AMOC at 26\(^{\circ }\) N in depth space. b Maximum of decadal mean AMOC at 26\(^{\circ }\) N in depth space. c Scatter plot of AMOC at 26\(^{\circ }\) N in depth space against the SFWMT north of 45\(^{\circ }\) N (F45). d As c but for the AMOC at 45\(^{\circ }\) N in density space. The black line is y = x

Fig. 6

SFWMT for CMIP6 models. Regions are indicated by the colours (see legend) and panels show different models

Fig. 7

Comparison of CMIP6 profiles (coloured lines, top legend) with SFWMT calculated from observed surface fluxes and densities (black lines, see section 2.3). Coloured circles show the maxima of the profiles. Symbols show magnitudes of overturning from previous literature with estimates of overturning from velocities in grey and estimates from SFWMT in black (bottom legend). Uncertainty (where given) is shown with horizontal lines, and the vertical positioning of the symbols is arbitrary

Fig. 8

Scatter plots comparing the mean SFWMT in a, c, e the LS and b, d, f the IIS with a, b sea surface salinity in the Labrador Sea region (50–60\(^{\circ }\) N, 45–55\(^{\circ }\) W), c, d March sea ice extent (area over 50–65\(^{\circ }\) N, 10–60\(^{\circ }\) W), and e, f March mixed layer depth (maximum over 50–65\(^{\circ }\) N, 10–60\(^{\circ }\) W). Symbols show values from CMIP6 models (see legends). Grey horizontal bars show observational estimates of SFWMT, based on observations shown in Fig. 7, not including uncertainties in individual estimates. Vertical dotted lines show observational estimates of LS surface salinity, March sea ice extent and March MLD (see section 2.4)

Fig. 9

Correlations (numbers) and regressions (bar lengths) of the m45 timeseries (AMOC at 45\(^{\circ }\) N and 1027.6 kg/m\(^3\) density) with the overturning across sections, or convergence of overturning in regions, measured at 1027.6 kg/m\(^3\) density. Left bars are for LL and right bars are for MM

Fig. 10

Regressions of the m45 timeseries (AMOC at 45\(^{\circ }\) N and 1027.6 kg/m\(^3\) density) with the overturning across sections at different densities (upper panels), the convergence of overturning in regions (middle panels), and the SFWMT in regions (lower panels). LL is shown in the left panels and MM in the right panels. Dotted lines indicate where the regressions are deemed not significant (at the P < 0.05 level), and the horizontal grey dashed lines show the density of the AMOC maximum at 45\(^{\circ }\) N

Fig. 11

Standard deviations of decadal mean SFWMT in different regions and different models. Black dashed line shows the square root of the sum of the variances of the SFWMT in the GIN, LS, IIS and SPG regions. If the variability in each region were independent of that in the others then this would be the same as the standard deviation of the whole (black line). In all panels the TOT line (black) overlays the GIN line (purple) at the densest levels

Fig. 12

Regressions of m26z timeseries with the SFWMT in different regions for different models. Dotted lines indicate where the regressions are deemed not significant (at the P < 0.05 level). In all panels the TOT line (black) overlays the GIN line (purple) at the densest levels

Fig. 13

Regressions of m26z timeseries with the SFWMT in LS. Black lines show instantaneous regressions and blue lines show regressions where m26z lags SFWMT by 10 years. Dotted lines indicate where the regressions are deemed not significant (at the P < 0.05 level). Panels are ordered going from models with the weakest mean LS SFWMT (top left) to models with the strongest (bottom right)