1 Introduction

Anthropogenic emissions of carbon dioxide continue to rise, as does global mean surface air temperature (Hartmann et al. 2013). Climate models project a continuation of this upwards trend in surface air temperature (Kirtman et al. 2013; Collins et al. 2013). Despite the global mean temperature trend, in the twenty first century these climate projections predict a region of relatively little warming in the North Atlantic, often termed the warming hole (WH, Collins et al. 2013), which is projected to show the smallest temperature anomaly anywhere on the planet (see Figure 12.10 of Collins et al. 2013). This occurs in the North Atlantic subpolar gyre (NA SPG), a region that has been shown to be important in mechanisms of decadal/multi-decadal variability simulated by climate models (Frankcombe et al. 2010; Liu 2012; Menary et al. 2015a) and crucial for the initialisation of skilful decadal forecasts (Dunstone et al. 2011). The NA SPG is also involved in a large-scale pattern of sea surface temperature (SST) variability in the Atlantic termed the Atlantic Multidecadal Oscillation (AMO, Delworth and Mann 2000) with a wide range of potential climate impacts (Folland et al. 1986; Goldenberg et al. 2001; Sutton and Hodson 2005).

The existence of a warming hole has been linked to projected changes in the large scale circulation of the North Atlantic, such as the Atlantic meridional overturning circulation (AMOC, Drijfhout et al. 2012). However, the disentanglement of cause and effect is made difficult by the dominance of common (i.e. degenerate) linear trends in both projected global mean surface air temperature and local changes in the North Atlantic (such as the AMOC), though the associated patterns and precise magnitudes can be model dependent (Kim and An 2013). In addition, it is not clear that multi-decadal variability in observations or historical/pre-industrial simulations, which often link the AMOC (circulation) with the AMO (SSTs), necessarily arises via the same ocean (or atmospheric) processes as those responsible for projected future trends (e.g. Knight et al. 2005; McCarthy et al. 2015). For example, recent fingerprinting studies using either historical or control simulations show that the expected response of the NA SPG to increasing external forcing is a cooling, after removing the AMO signal (Ting et al. 2009; DelSole et al. 2011), which, if both the cooling and the AMO-related warming are mediated via the AMOC, highlights the degeneracy of the various processes/phenomena involved.

A significant role for the AMOC in a projected WH is suggested in idealised studies (Winton et al. 2013; Marshall et al. 2015) and in a detailed study of two fully coupled models (Rugenstein et al. 2013). In a recent analysis of the MPI-ESM-LR model, Rahmstorf et al. (2015) find a link between the AMOC and sea surface temperatures in the NA SPG, although this relationship is not necessarily consistent across models or forcing scenarios (Roberts et al. 2013). In a multi-model framework, Drijfhout et al. (2012) investigated the role of the AMOC by separating local SST changes into those due to the global mean and those due to the AMOC via a multiple linear regression. They highlighted the slightly different fingerprints in the historical and future scenario periods, suggesting different processes may dominate in these different times. However, they noted that in the historical period the separation between the radiatively forced component and the AMOC component may still have been incomplete. In addition to this, in the scenario period, although the loading patterns appear more well separated, the variates themselves are likely to have been dominated by similar temporal variability, i.e. a linear trend. Given these difficulties, it remains unclear to what extent the projected WH is a signature of AMOC decline or, for example, changes in local ocean mixing, and whether this is consistent across all available coupled climate models. In this study, we build upon the work of Drijfhout et al. (2012) to further investigate these questions with an increased focus on the mechanisms at work.

Finally, we note it has previously been shown that climate models can show a diverse range of oceanic mean states (Wang et al. 2014; Menary et al. 2015b) and differing strengths of responses to external forcing (Booth et al. 2012; Forster et al. 2013). In addition, key North Atlantic processes, such as the location and strength of deep water formation (DWF), can vary significantly between models (Ba et al. 2014). All of these differences can plausibly lead to fundamentally different mechanisms and timescales of WH onset, magnitude, and spatial structure. As such, it does not necessarily follow that metrics/regressions based on the multi-model mean (MMM) of indices averaged over broad geographic locations reveal the most important locations—and, by extension, timescales—involved. This is especially true if the analysis is limited to only those models that have all the required diagnostics readily available.

In this study we investigate the multi-model archive in greater detail with two primary goals in mind: (1) To test the potential competing mechanisms responsible for the WH, such as changes in DWF, the AMOC, or surface heat fluxes, and determine whether the WHs are consistent with a significant role for the AMOC, despite considerable differences in model simulations of AMOC-related variability (Huang et al. 2014); (2) To test how representative the MMM WH is of the processes occurring in individual models, which may be different in both spatial structure and temporal evolution. We attempt to maximise the power of our analysis by: (1) using all models for which at least the most basic diagnostics are available, (2) applying the same analysis methods to both individual models and the MMM, and (3) using inter-model differences as an additional tool to analyse the ocean processes at work.

2 Data and methods

We use data from 40 models with data archived to the Fifth Coupled Model Intercomparison Project (CMIP5) archive. Three dimensional ocean temperature and salinity (standard name ‘thetao’ and ‘so’) exist on the archive for all the models we analyse, but ocean mass streamfunctions (‘msftmyz’ or ‘msftyyz’), ocean mixing depth estimates (‘mlotst’ or ‘omlmax’), ocean heat transport (‘hfy’), and surface heat flux (‘hfds’), are only uploaded for subsets of the models. We use the historical and future RCP8.5 (Representative Concentration Pathway; intended to yield a net radiative forcing of 8.5 W/m2 at the year 2100, Moss et al. 2010) experiments. In addition to the CMIP5 models, we use the historical and RCP8.5 simulations from the recently developed HadGEM3-GC2 model (Williams et al. 2015), which runs with a significantly higher ocean and atmosphere resolution than the CMIP5 models. For consistency and simplicity, all models are regridded to a 1 × 1° grid for the subsequent analysis and visualisation except where using ocean heat transport (OHT) diagnostics, which must be calculated on the native model grid. For analysis involving OHT, only models for which the relevant ‘hfy’ diagnostics are available are included. We use annual mean data unless otherwise stated. The models and available diagnostics are summarised in Table 1.

Table 1 A summary of the 41 models used in this analysis and the data that were obtained from the CMIP5 archive. Variables used are: ‘thetao’ (ocean potential temperature), ‘so’ (ocean salinity), ‘mlotst’ (ocean mixed layer depth defined by Sigma T), ‘omlmax’ (ocean mixed layer depth defined by model’s mixed layer scheme), ‘msftmyz’ (ocean mass streamfunction in meridional direction), ‘msftyyz’ (ocean mass streamfunction in y-direction), ‘hfy’ (ocean heat transport in y-direction), ‘hfds’ (net heat flux into ocean at the surface). For further details of the CMIP5 models and institutions the reader is referred to Table 9.A.1 of Flato et al. (2013) and references therein. \(^{1}\)For the GISS-E2-R model, ocean mixed layer depths were estimated offline using monthly temperature and salinity data following the method of Kara et al. (2000). \(^{2}\)For CESM1-WACCM, only data from 1955 onwards were available.

3 Results

3.1 Multi-model structure of warming hole

We begin by defining a ‘warming hole’ (WH) at any location as the difference both from the global mean ocean temperature and between a particular period of time and a previous reference period (Eq. 1):

$$\begin{aligned} WH_{i,j,0:500 \, \mathrm{m}} ={}& \left( T_{i,j,0:500 \, \mathrm{m}}^{2070:2100} - T_{i,j,0:500 \, \mathrm{m}}^{1850:2000} \right) \\ &- \left( T_{global,0:500 \, \mathrm{m}}^{2070:2100} - T_{global,0:500 \, \mathrm{m}}^{1850:2000} \right) \end{aligned} \tag{1}$$

Here, we use the difference between the end of the twenty first century (2070–2100) under RCP8.5 and the historical period (1850–2000). If we do not remove the global mean (i.e. the second term on the right hand side of Eq. 1) then the pattern of the WH is unchanged but the magnitude is reduced, although some models (that have the largest relative WHs) still show an absolute cooling (not shown). In addition, we use depth averaged temperatures over the top 500 m of the ocean to both highlight the depth extent of the WH and to reduce noise. In Fig. 1 this is shown for the 41 models as well as the multi-model mean (MMM).
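As an illustration of Eq. 1, the following minimal sketch computes the WH field from annual mean, depth-averaged temperatures. The function name `warming_hole`, the array layout (time, latitude, longitude), and the treatment of both period end years as inclusive are assumptions made for this example, not details taken from the analysis itself.

```python
import numpy as np

def warming_hole(t500, years, global_t500,
                 future=(2070, 2100), reference=(1850, 2000)):
    """Eq. 1: local minus global change in top 500 m mean temperature.

    t500        : array (time, lat, lon), depth-averaged temperature
    years       : array (time,), calendar year of each annual mean
    global_t500 : array (time,), global mean depth-averaged temperature
    Period end years are treated as inclusive (an assumption).
    """
    years = np.asarray(years)
    fut = (years >= future[0]) & (years <= future[1])
    ref = (years >= reference[0]) & (years <= reference[1])
    # First bracket of Eq. 1: local change between the two periods
    local_change = t500[fut].mean(axis=0) - t500[ref].mean(axis=0)
    # Second bracket of Eq. 1: global mean change between the two periods
    global_change = global_t500[fut].mean() - global_t500[ref].mean()
    return local_change - global_change

# Synthetic example: one cell warms at the global rate, one at half that rate
years = np.arange(1850, 2101)
warming = np.clip(years - 2000, 0, None) * 0.02      # K, zero before 2000
t500 = warming[:, None, None] * np.array([[1.0], [0.5]])
wh = warming_hole(t500, years, warming)
print(wh)
```

In this synthetic case the cell that follows the global mean has no warming hole, while the slower-warming cell has a negative value, i.e. a relative cooling.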

Fig. 1

a Maps of the top 500 m depth-average temperature (T500, Kelvin) difference between the local T500 and the global mean T500 and between the future period (years 2070–2100) and historical mean (1850–2000) in CMIP5 models (see Table 1) and the multi-model mean. See Eq. 1. Also shown are 20\(^{\circ }\) longitude by 15\(^{\circ }\) latitude boxes that encompass the region of largest relative cooling, defined as the ‘warming hole’

Similar to the spatial patterns derived by Drijfhout et al. (2012, their Figure 1), the MMM WH exists in the central NA SPG, with a region of relatively strong warming to the south. Averaged across all models, this region is cooler by around 1.5 K (averaged over the top 500 m) than would be expected if the ocean were to warm at the same rate in all locations. However, there is clearly a large range of responses, with individual models showing differences in both the location and magnitude of any WH, if one can even be said to exist (e.g. IPSL-CM5B-LR).

The spread in WH expressions is quantified in Fig. 2, in which all models are compared against the MMM. Although one would expect the MMM to show reduced spatial variability compared to any given model, it nonetheless provides a baseline against which to compare individual models. Note, however, that we are not quantifying which models are more or less correct, merely their spread around the MMM. There are 5 models that correlate with the MMM with r > 0.75 (EC-EARTH, MPI-ESM-MR, CCSM4, CESM1-BGC, CESM1-CAM5) and two that correlate with r < 0.2 (bcc-csm1-1, bcc-csm1-1-m), where the spatial correlation is computed over the entire top 500 m depth-averaged NA SPG domain pictured in Fig. 1. In addition, one model (FIO-ESM) has a spatial standard deviation twice that of the MMM and a large root mean square error (RMSE) compared to the MMM, which may be related to the collapsed AMOC in this model (Sgubin et al. 2017). If we define ‘closeness’ to the MMM as models in the range 0.8 < sd < 1.25, r > 0.5, RMSE < 1.2, we are left with 5 models: CMCC-CESM, inmcm4, IPSL-CM5A-LR, NorESM1-M, NorESM1-ME. Despite similar statistics, visual inspection of the WH patterns in these models (Fig. 1) still highlights important differences in the location, spatial extent, and magnitude of the WH.
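The three statistics summarised by a Taylor diagram can be sketched as follows. This is a minimal example under stated assumptions: the standard deviation and RMSE are taken to be normalised by the MMM standard deviation, the RMSE is the centred (mean-removed) difference, and the optional area weights and the function names `taylor_stats` and `is_close_to_mmm` are hypothetical.

```python
import numpy as np

def taylor_stats(model, mmm, weights=None):
    """Return (sd ratio, pattern correlation, centred RMSE / sd of MMM).

    model, mmm : arrays of the same shape holding the two WH patterns
    weights    : optional grid cell area weights (uniform if omitted)
    """
    model = np.ravel(np.asarray(model, dtype=float))
    mmm = np.ravel(np.asarray(mmm, dtype=float))
    w = np.ones_like(model) if weights is None else np.ravel(weights).astype(float)
    w = w / w.sum()
    # Remove the (weighted) spatial mean of each pattern
    m_anom = model - np.sum(w * model)
    r_anom = mmm - np.sum(w * mmm)
    sd_m = np.sqrt(np.sum(w * m_anom ** 2))
    sd_r = np.sqrt(np.sum(w * r_anom ** 2))
    corr = np.sum(w * m_anom * r_anom) / (sd_m * sd_r)
    rmse = np.sqrt(np.sum(w * (m_anom - r_anom) ** 2))
    return sd_m / sd_r, corr, rmse / sd_r

def is_close_to_mmm(sd_ratio, corr, rmse):
    """Thresholds quoted in the text: 0.8 < sd < 1.25, r > 0.5, RMSE < 1.2."""
    return (0.8 < sd_ratio < 1.25) and (corr > 0.5) and (rmse < 1.2)

# A pattern with twice the MMM amplitude correlates perfectly but is not 'close'
mmm_pattern = np.array([0.0, 1.0, 2.0, 3.0])
stats = taylor_stats(2.0 * mmm_pattern + 5.0, mmm_pattern)
print(stats, is_close_to_mmm(*stats))
```

The example shows why correlation alone is not sufficient: a model can reproduce the MMM pattern exactly while failing the standard deviation criterion.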

Fig. 2

A Taylor Diagram of the top 500 m depth-average temperature (T500) warming hole (WH) patterns compared against the multi-model mean (MMM) WH for the domains shown in Fig. 1. Dotted lines represent curves of constant root mean square difference from the MMM

Given particularities in the specification and treatment of external forcings, and model biases/differences in the representation of important ocean processes (such as the location of deep water formation and the strength and route of the AMOC/North Atlantic Current, both of which have been suggested as contributing to the WH), it is perhaps not surprising that there is such a large range of locations/magnitudes of WHs. To create model-specific definitions of the WH, we allow the WH location to vary from model to model subject to two rules: (1) the resulting location must be within the NA SPG, and (2) the size of the region used to define the WH must be 20\(^{\circ }\) longitude by 15\(^{\circ }\) latitude. The first rule reflects our focus on the WH in the NA SPG. The second rule is designed to balance the use of bespoke regions for each model with a simple and consistent treatment across models. The limits are chosen after inspection of the size of the MMM WH in Fig. 1. The resulting model-specific WH locations are highlighted on Fig. 1. (Note that boxes of a fixed size in degrees may have different spatial areas, but where area averages are computed these use the actual grid cell areas.) The imposition of a fixed size box acts to downweight the magnitude of WHs in models for which the WH is either spatially extensive but weak (e.g. inmcm4) or localised but strong (e.g. ACCESS1-0).
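One way to implement such a model-specific box search is sketched below. It is an assumption of this example, not a statement of the exact procedure used, that the box is chosen to minimise the area-weighted mean WH, that the input field is already restricted to the NA SPG on the regular 1\(^{\circ }\) grid with no missing values, and that cosine-latitude weights stand in for the actual grid cell areas. The function name `find_wh_box` is hypothetical.

```python
import numpy as np

def find_wh_box(wh, lats, lons, box_lat=15, box_lon=20):
    """Slide a fixed-size box over the field and return the coldest one.

    wh   : array (lat, lon) of WH values on a regular 1 degree grid
    lats : array (lat,) of cell-centre latitudes in degrees
    lons : array (lon,) of cell-centre longitudes in degrees
    Returns (area-weighted mean WH, southern-most lat, western-most lon).
    """
    # Cosine-latitude weights approximate the cell areas of a regular grid
    weights = np.cos(np.deg2rad(lats))[:, None] * np.ones_like(wh)
    best = None
    for i in range(len(lats) - box_lat + 1):
        for j in range(len(lons) - box_lon + 1):
            window = (slice(i, i + box_lat), slice(j, j + box_lon))
            mean = np.average(wh[window], weights=weights[window])
            if best is None or mean < best[0]:
                best = (mean, lats[i], lons[j])
    return best

# Synthetic field: a uniform cold patch exactly the size of the search box
lats = 40.5 + np.arange(25)
lons = -69.5 + np.arange(40)
field = np.zeros((25, 40))
field[3:18, 5:25] = -1.0
print(find_wh_box(field, lats, lons))
```

Because the box size is fixed, a cold patch smaller than the box is averaged with its warmer surroundings, which is the downweighting effect noted above for localised but strong WHs.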

Given the wide range of WH patterns in the horizontal, we now look at the vertical structure of the WHs, using our model-specific definitions of the WH (horizontal) location (Fig. 3). The temperature profiles are presented as horizontal area averages but using each model’s native vertical grid (converted from sigma coordinates where appropriate). The MMM profile highlights relative cooling over the top 1000 m of the water column, with the value over the top 500 m now closer to 2.5 K as a result of allowing the WH horizontal region to vary from model to model. Though all models exhibit a WH (by construction), there is still considerable inter-model diversity. For example, some models show a large relative cooling around 500 m (e.g. GISS-E2-H-CC), whereas others show a very shallow and weakly cool layer (e.g. bcc-csm1-1-m). In terms of the surface spatial patterns, the expression of the WH is very muted in MRI-CGCM3 and MRI-ESM1 but somewhat amplified in CSIRO-Mk3-6-0 (not shown). The MMM suggests relative warming below 1000 m, but for individual models there can be either continued cooling (e.g. GFDL-CM3) or, for models with stronger near-surface cooling, even stronger deep warming (e.g. GISS-E2-R-CC).

Fig. 3

Profiles of the temperature difference [Kelvin, K] between warming hole (WH) region and the global mean (at each depth level) and between the future (2070–2100) and historical periods (1850–2000) for CMIP5 models and the multi-model mean (MMM), i.e. the black boxed regions in Fig. 1. See Eq. 1. Also shown are the historical time mean depths of the mixed layer as defined by ‘omlmax’ (dashed) and ‘mlotst’ (dotted) and the depth of Atlantic meridional overturning circulation (AMOC) upper limb (blue)

3.2 Wintertime mixing and the warming hole

One possible cause of a future NA SPG warming hole is a change in deep, wintertime, vertical mixing in the ocean, whereby a reduction in mixing would act to cool the surface ocean as wintertime mixing acts to bring heat up from greater depths. To investigate whether this could explain the WHs seen in the CMIP5 models, we also plot the historical (1850–2000) mean of the March mixed layer depth (MLD) in the model-specific WH regions (Fig. 3, dotted and dashed lines), as defined by the diagnostics ‘mlotst’ and ‘omlmax’ on the CMIP5 archive (see Table 1). We use both diagnostics to maximise the subset of models we can compare as for some models only one or other of the diagnostics is archived. Nevertheless, it can be seen that both ‘mlotst’ and ‘omlmax’ give generally very similar results for the models where both diagnostics could be analysed. In addition, we manually compute the MLD in the GISS-E2-R model in order to sample a model in which the WH is primarily in the Labrador Sea region.

The simple hypothesis that local mixing can explain the local WH should be rejected, at least in the MMM, because mixing extends to only a relatively shallow depth. Nonetheless, for some models, such as CanESM2, ACCESS1-0, HadGEM2-CC, and particularly GISS-E2-R, mixing seems able to explain the WH, raising the possibility that the WH could arise for different reasons in different models. We note that, with the exception of GISS-E2-R, these models exhibit WHs where relative cooling fills less than half of the WH box.

It is possible that vertical mixing that is not co-located with the WH could interact with the local circulation and result in an offset between the location of mixing and the location of the WH, although this would also require the local cooling induced by mixing changes to be subsequently masked by some additional process. Nevertheless, to investigate this hypothesis we also show spatial maps of the historical time mean March-time MLDs as diagnosed by all of the available ‘mlotst’ and ‘omlmax’ diagnostics (contours on Fig. 4). In addition, the location of the maximum WH is shown, as also shown in Fig. 1. The magnitudes and spatial patterns of the mixed layers as measured by ‘omlmax’ or ‘mlotst’ agree well in the MMM, with deep mixed layers in the Labrador Sea, Irminger Current and eastern SPG. As with the WHs, there is considerable variation between the models over both the location and depth of vertical mixing, with some models diagnosing almost no mixing (e.g. IPSL-CM5B-LR), relatively little mixing (e.g. HadGEM2-CC) or plenty of mixing (e.g. MPI-ESM-MR). Note though that a shallow mixed layer in the (March) time mean can be due to either a systematically shallow mixed layer or interannually episodic deep convection, as is the case in HadGEM2-ES and HadGEM2-CC (not shown).

Fig. 4

Maps of the March-time mean mixed layer depth as defined by the ‘omlmax’ and ‘mlotst’ diagnostics (contours, at 500 m intervals) for those models for which the data was available on the CMIP5 archive, averaged over the historical period (1850–2000). The multi-model mean (MMM) is shown separately for ‘omlmax’ and ‘mlotst’. Panel title colours alternate between models for clarity. Shading represents the difference between the future (2070–2100) and historical periods (shallowing in blue, deepening in red). Red boxes highlight model-specific warming hole regions, as defined in Fig. 1

The predominant mixing location can vary between the east of the NA SPG (e.g. MRI-ESM1), the centre of the NA SPG (e.g. CanESM2), and the west/Labrador Sea (e.g. CESM1-BGC). However, a common feature of almost all of the models is that, despite allowing the WH locations and deep mixing locations to be model specific, the region of deepest mixing is generally downstream of the WH (if one considers the mean surface currents or barotropic circulation). In addition, the locations of the largest future changes in vertical mixing are co-located with the regions of deepest mixing (shading on Fig. 4). This would make it difficult for changes in vertical mixing to directly drive the WH, at least on short (annual) timescales, without invoking further processes, such as changes to the large scale circulation in the region (specific questions involving changes in vertical mixing and the AMOC are addressed in Sects. 3.5 and 3.6). A notable exception to this paradigm is the GISS-E2-R model, in which the WH and both the mean MLD and largest changes in MLD are co-located. As such, for this model, both the horizontal pattern and vertical structure of the WH (Fig. 3) are consistent with an important role for changes in vertical mixing in producing the WH. The GISS-E2-R model is discussed further in Sect. 3.5.

In summary, despite significant differences in WH location from model to model, for most of the individual models and MMM it seems unlikely that the WH is due to reductions in vertical mixing, either in-situ or upstream. Having investigated the structure of the WH across the CMIP5 models, we next investigate the formation of the WH by constructing a simple heat budget using the available diagnostics.

3.3 Heat budget of warming hole region

The full depth anomalous heat budgets for the NA SPG (45–65\(^{\circ }\)N) in the 15 models for which there were surface heat flux (‘hfds’) and northward ocean heat transport (‘hfy’) diagnostics are shown in Fig. 5, along with the ocean heat content anomaly for the same region (time integrated net heat flux). The heat budget is shown as the anomaly from the arbitrary historical period 1850–2000 to highlight the changes in the twenty first century:

$$\begin{aligned} NHF' = hfds' + hfy_{S}' + hfy_{N}' + \epsilon ' \end{aligned} \tag{2}$$

Fig. 5

Time series of anomalous (relative to the period 1850–2000) heat budget components of the North Atlantic subpolar gyre (basinwide, full depth, 45–65\(^{\circ }\)N) for those models for which the data were available on the CMIP5 archive, as well as the multi-model mean (MMM). Also plotted (right hand axis) are the top 500 m ocean heat content anomalies (integrated net heat flux) for the Atlantic (black)

Here, NHF is the net heat flux into the ocean volume, hfds is the net surface heat flux into the ocean, \(hfy_{S}\) and \(hfy_{N}\) are the northward ocean heat transport at the southern and northern boundaries respectively, and \(\epsilon\) is the contribution from missing diagnostics or the effects of regridding. Primes indicate anomalies with respect to the 1850–2000 time mean. The ocean heat transport and net surface heat flux diagnostics should sum to the rate of change of ocean heat content (i.e. dOHC/dt) but, presumably due to missing diagnostics, this is not the case for some of the models, as can be seen in Fig. 5 (e.g. HadGEM2-ES). Nonetheless, even when non-zero, the residual terms (\(\epsilon\)) are generally very small compared to the surface or ocean heat transport terms.
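The residual term of Eq. 2 can be illustrated with a short sketch. It assumes that each boundary transport has already been expressed as a heat flux into the volume, so that all terms add as written in Eq. 2, and that all series are annual means in the same units. The function names `anomaly` and `budget_residual` are hypothetical.

```python
import numpy as np

def anomaly(series, years, ref=(1850, 2000)):
    """Anomaly relative to the time mean over the reference period."""
    series = np.asarray(series, dtype=float)
    years = np.asarray(years)
    mask = (years >= ref[0]) & (years <= ref[1])
    return series - series[mask].mean()

def budget_residual(nhf, hfds, hfy_s_in, hfy_n_in, years):
    """epsilon' in Eq. 2: the part of NHF' not explained by the diagnostics.

    hfy_s_in and hfy_n_in are heat fluxes INTO the volume through the
    southern and northern boundaries (a sign convention assumed here).
    """
    nhf_a, hfds_a, south_a, north_a = (
        anomaly(x, years) for x in (nhf, hfds, hfy_s_in, hfy_n_in)
    )
    return nhf_a - (hfds_a + south_a + north_a)

# Synthetic example with a deliberately unexplained trend in NHF
years = np.arange(1850, 2101)
hfds = 0.001 * (years - 1850)
hfy_s = -0.002 * (years - 1850)
hfy_n = np.zeros(years.size)
nhf = hfds + hfy_s + hfy_n + 0.01 * (years - 1850)
eps = budget_residual(nhf, hfds, hfy_s, hfy_n, years)
print(eps[[0, -1]])
```

A constant offset between NHF and the sum of the diagnostics disappears when anomalies are taken, so only time-varying mismatches appear in the residual.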

The MMM anomalous heat budget highlights that there is little coherent variability across the models in heat fluxes into the NA SPG prior to around the year 2000 (i.e. anomalies generally cancel) despite individual models suggesting large variability in both surface and heat transport fluxes. In the twenty first century, the MMM highlights the warming in the NA SPG, which is due to an increase in surface heat fluxes (actually a reduction in surface heat loss) that overwhelms a reduction in ocean heat transport. However, for some models these changes begin far earlier in the mid twentieth century (e.g. GFDL-ESM2G). The ocean heat transport can be broken down into fluxes from the south (\(hfy_{S}\)) and from the north (\(hfy_{N}\)) and highlights that it is the south that generally dominates the projected future changes in individual models and the MMM. In some models fluxes from the north act to slightly oppose the changes at the southern boundary, whereas in others they act to reinforce these changes (e.g. GISS-E2-R), and in two models they dominate the southern changes (MRI-CGCM3 and MRI-ESM1).

Although ocean heat transport is acting to anomalously cool the NA SPG, and surface heat fluxes to anomalously warm it, it does not immediately follow that transport fluxes are responsible for the subsequent WH compared to the global mean. To estimate their relative contribution to the WH, we compute the heat budget for the Pacific Ocean at the same latitudes and investigate the difference between Atlantic and Pacific heat budget changes (where the Pacific is normalised to the same volume as the Atlantic). However, over the latitude range 45–65\(^{\circ }\)N, either over the full depth or the top 500 m, the Atlantic warms faster than the Pacific, due in part to subsurface warming as well as the inclusion of near surface warming in the North Atlantic Current region (cf. Fig. 1). There is far more variability in the Atlantic than in the Pacific heat budget (whether normalised by volume or not) on both annual and decadal timescales for this latitude range, resulting in very little change to the fluxes described by Fig. 5 (and hence not shown). This is consistent with the far larger role for ocean heat transport variability (related to the AMOC) and the associated counter variability in surface fluxes in the Atlantic subpolar gyre. In the Pacific, much of the heat loss has already occurred south of 45\(^{\circ }\)N.

Finally, across all available models, we note that there is a weak inverse correlation between the magnitude of the WH and the ocean-to-atmosphere heat flux, such that the larger the WH, the weaker the surface heat loss from ocean to atmosphere (or, equivalently, the larger the heat flux anomaly into the ocean). To first order, and ignoring potentially important lagged relationships (discussed further in Sect. 3.6), this is consistent with the magnitude of the WH controlling the surface heat fluxes but not the surface heat fluxes controlling the magnitude of the WH. This is consistent with recent work investigating the role of the WH in driving an atmospheric response via surface heat fluxes, in which models with a stronger WH also exhibit a stronger sea level pressure response (Haarsma et al. 2015).

The heat budget analysis has so far made use of full depth, basinwide diagnostics but, as can be seen in Fig. 1, the WH is a more localised feature and is heavily model dependent. Investigating non-basinwide features is made substantially more tractable by redefining the ocean heat transport to be the residual of the full depth net heat flux and surface heat flux terms. To test this approximation, Fig. 6a shows the MMM anomalous heat budget (difference between Atlantic and volume-normalised Pacific) using instead ocean heat transport fluxes estimated as the residual. This gives qualitatively similar results to the non-residual case (not shown), suggesting the use of heat transport fluxes calculated this way is justifiable, at least in the MMM sense. As also noted above, the full depth Atlantic warms more than the full depth Pacific, after accounting for their different volumes in the latitude range 45–65\(^{\circ }\)N, although the majority of this relative warming occurs at depth (comparison of black and green lines in Fig. 6a).
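Estimating the ocean heat transport as a residual can be sketched as follows. The sketch assumes annual mean ocean heat content in joules and an area-integrated surface heat flux in watts, uses centred finite differences for the rate of change, and assumes a 365-day year (model calendars differ). The function name `oht_residual` is hypothetical.

```python
import numpy as np

SECONDS_PER_YEAR = 365 * 86400  # assumed calendar; adjust per model

def oht_residual(ohc, hfds_total):
    """Ocean heat transport convergence estimated as NHF minus surface flux.

    ohc        : array (time,), annual mean ocean heat content of the region (J)
    hfds_total : array (time,), area-integrated net surface heat flux (W)
    """
    # Net heat flux into the volume is the rate of change of heat content
    nhf = np.gradient(np.asarray(ohc, dtype=float)) / SECONDS_PER_YEAR
    return nhf - np.asarray(hfds_total, dtype=float)

# Synthetic example: heat content rises at 2 W while the surface supplies 5 W,
# so the ocean circulation must be removing 3 W from the region
ohc = 2.0 * SECONDS_PER_YEAR * np.arange(10)
transport = oht_residual(ohc, np.full(10, 5.0))
print(transport)
```

The appeal of this approach is that it needs only the heat content and the surface flux over the chosen region, so it can be applied to arbitrary sub-basin boxes for which no transport diagnostic is archived.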

Fig. 6

Time series of anomalous (relative to the period 1850–2000) heat budget components for the difference between Atlantic and Pacific (normalised by Atlantic volume) subpolar gyres (basinwide, full depth, 45–65\(^{\circ }\)N) for the multi-model mean (MMM) where: a ocean heat transport is estimated as the residual flux, b as a but the Atlantic region is instead the warming hole region as shown in Fig. 1 with normalisation altered accordingly. Also plotted (right hand axis) are the full depth ocean heat content anomaly (integrated net heat flux, black) and the top 500 m ocean heat content anomaly (green) for the Atlantic minus Pacific in each of the two cases

Given that estimating ocean heat transport as the residual heat flux is successful for the full NA SPG, we now apply the same method to the model-specific WH locations (Fig. 6b). Using the smaller WH regions results in far smaller area-total surface and heat transport fluxes but a much larger estimate of the magnitude of the WH. Indeed, the WH exists for some models in the Atlantic-only heat budget without subtracting the Pacific estimates (not shown), again highlighting that for some models the future WH is not just a ‘relative’ hole but an absolute one. This is not the case for the MMM, as can also be seen in Drijfhout et al. (2012).

Although the CMIP5 archive provides only full depth-integrated heat budget diagnostics we have conducted additional sensitivity tests using the model HadGEM3-GC2, for which more granular heat budget diagnostics were available (Supp. Mat.). Investigating the top 500 m heat budget of the WH region in HadGEM3-GC2 yields a similarly large role for ocean heat transport by the horizontal circulation, with vertical advection and ocean mixing acting to oppose the horizontal circulation. There is again a large role for heat transport by the horizontal circulation when investigating the top 300 m, which represents the mixed layer in the WH region in HadGEM3-GC2. However, here vertical advection and ocean mixing act to reinforce these changes during the middle of the twenty first century. As such, at least with HadGEM3-GC2, a more detailed heat budget supports the conclusions of the broader multi-model heat budget.

Returning to the MMM, the WH begins to form around the turn of the twenty first century (green line in Fig. 6b) where an initial decline in ocean heat transport is not balanced by the surface fluxes. After some multi-decadal adjustment time, it appears that the surface fluxes are able to counteract the deceleration (accelerating decline) of the heat transport and so the full depth ocean heat content begins to recover while the top 500 m heat content continues to decrease. The disparity between deep and near surface heat content changes is consistent with a weakening AMOC that brings less heat northwards in the near surface and additionally removes less heat southwards at depth. Nonetheless, it is also consistent with a reduction in deep mixing for models such as GISS-E2-R in which the MLD is initially very deep in the WH region.

In summary, in individual models and the MMM, the WH is consistent with reductions in ocean heat transport from the south that are not fully balanced by changes in surface fluxes and that do not occur to the same extent in the Pacific Ocean at comparable latitudes. As a further independent method of analysis, in the next section we investigate the WH in the context of simultaneous freshwater changes in the North Atlantic.

3.4 The co-existence of a salinity hole

Although our focus has been on the projected WH in the NA SPG because of its potential impacts on nearby climate, we here briefly investigate the existence of a parallel salinity hole in order to further understand the origins of the WH. The analysis we present in this section is somewhat more qualitative but has the benefit of using all the models (as we use ‘thetao’ and ‘so’) to indirectly investigate the potentially advective processes driving the WH.

Fig. 7

a Time series of multi-model mean (MMM) warming hole (WH, black) and salinity hole (red), using regions as defined in Fig. 1. WH has been normalised by its standard deviation over the slightly earlier historical period (1850–1900, see text) and the salinity hole by the same (temperature) scaling factor as well as the thermal expansion and haline contraction coefficients to give a comparable density change. Whiskers show the inter-model standard deviation at each year. b The regression slope between the salinity and temperature anomalies across all models at each year (blue) and the slope expected for a perfectly density compensated change at these temperatures and salinities (red)

Figure 7a shows time series of the WH anomaly and similarly defined salinity hole anomaly, where the WH anomaly is relative to the period 1850–1900 and is normalised by the standard deviation of the same period 1850–1900. The salinity hole anomaly is also relative to the period 1850–1900 but is normalised by the standard deviation of the WH scaled by the thermal expansion/haline contraction coefficients to give a similar density change. As such, the relative values of the normalised WH and normalised salinity hole are comparable (in terms of the magnitudes of their effect on density) and it can be seen that the WH and salinity hole have similar magnitudes at the end of the twenty first century. The inter-model spread is larger (whiskers on Fig. 7) for the salinity hole, perhaps unsurprisingly as the regional freshwater budget is less well constrained by global radiative forcing than the regional heat budget. Note in this panel that the WH and salinity hole are close to being density compensating when averaged over all models, but that any given model may or may not show density compensating anomalies.
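The normalisation described above can be sketched as follows. The values of the thermal expansion and haline contraction coefficients (`ALPHA`, `BETA`) are illustrative constants assumed for this example only; in practice they depend on the local temperature, salinity, and pressure. The function name `normalise_holes` is hypothetical.

```python
import numpy as np

ALPHA = 1.0e-4  # thermal expansion coefficient, 1/K (illustrative assumption)
BETA = 7.6e-4   # haline contraction coefficient, 1/psu (illustrative assumption)

def normalise_holes(wh, sh, years, base=(1850, 1900)):
    """Normalise WH and salinity hole so equal values imply equal density effect.

    wh, sh : arrays (time,), warming hole (K) and salinity hole (psu) series
    years  : array (time,), calendar years
    """
    wh = np.asarray(wh, dtype=float)
    sh = np.asarray(sh, dtype=float)
    years = np.asarray(years)
    mask = (years >= base[0]) & (years <= base[1])
    wh_anom = wh - wh[mask].mean()
    sh_anom = sh - sh[mask].mean()
    sd_t = wh_anom[mask].std()
    # A salinity change of (ALPHA / BETA) psu offsets the density change of 1 K
    return wh_anom / sd_t, sh_anom / (sd_t * ALPHA / BETA)

# Synthetic example: a perfectly density-compensated salinity series
years = np.arange(1850, 2101)
rng = np.random.default_rng(0)
wh_series = rng.normal(size=years.size)
sh_series = (ALPHA / BETA) * wh_series
wh_norm, sh_norm = normalise_holes(wh_series, sh_series, years)
print(np.allclose(wh_norm, sh_norm))
```

With this scaling, a fully density-compensated pair of anomalies produces identical normalised curves, so any separation between the two curves indicates a net density change.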

The AMOC brings warm/salty water to the North Atlantic from the subtropics, and as such AMOC-related anomalies (specifically due to circulation) in annual or decadal mean NA SPG temperature generally co-vary with anomalies in salinity such that they are either warm/salty or cool/fresh, making them to some extent density compensating. Thus, as a further, alternative strategy to investigate whether a slowdown of the AMOC may be driving the projected WH, we compare the magnitude of WHs and salinity holes across the ensemble for each year and whether this relationship changes through time (Fig. 7b). It is worth noting that this is an example of using the inter-model spread (specifically the covariability in the spread of temperature and salinity anomalies) to investigate the processes behind the multi-model mean (MMM) response. As can be seen, beginning in the mid twentieth century, the relationship between salinity and temperature anomalies strengthens, with the positive regression slope indicating that warm and salty anomalies co-vary, and vice versa. That is, anomalies in temperature in the models are becoming increasingly related to compensating anomalies in salinity.

This is an earlier initiation than estimated by merely investigating the MMM WH response, as is seen in Fig. 7a where the WH does not begin to grow until the beginning of the twenty first century. Here, using the full CMIP5 archive, it can be seen that temperature and salinity anomalies were beginning to become more coherently arranged towards the end of the twentieth century. This is consistent with an increasing role for the AMOC or any other processes (such as potentially changes in vertical mixing) that result in co-varying temperature and salinity anomalies. However, crucially, this is not consistent with the driver being surface fluxes, for which the relationship between temperature and salinity anomalies need not be of the same sign. For example, surface driven freshening does not necessarily correlate with simultaneous density-compensating surface driven cooling. In addition, in Sect. 3.2 we argued against a direct role for changes in vertical mixing based on the shallow mixing depth over the WH regions and that the model-dependent mixing location generally exists downstream of the WH.

Around the turn of the century the salinity anomaly and temperature anomaly become almost density compensating. The regression slope remains high until the middle of the twenty first century, when it begins to decline, although it remains higher than in the early part of the twentieth century. The increasingly important role for the AMOC/large scale circulation is investigated in the next section.
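The cross-model regression underlying Fig. 7b can be illustrated with a minimal sketch. It assumes that salinity anomalies are regressed on temperature anomalies by ordinary least squares across the ensemble at each year, which is one reading of the caption rather than a confirmed detail of the analysis. The function names, the array layout, and the constant coefficients `alpha` and `beta` are hypothetical.

```python
import numpy as np


def cross_model_slope(temp_anom, salt_anom):
    """Least-squares slope of salinity on temperature across models, per year.

    Both inputs have shape (n_models, n_years); the result has shape (n_years,).
    The regression direction (S on T) is an assumption of this sketch.
    """
    temp_anom = np.asarray(temp_anom, dtype=float)
    salt_anom = np.asarray(salt_anom, dtype=float)
    # Remove the multi-model mean at each year so only the spread is regressed.
    t = temp_anom - temp_anom.mean(axis=0)
    s = salt_anom - salt_anom.mean(axis=0)
    return (t * s).sum(axis=0) / (t * t).sum(axis=0)


def compensated_slope(alpha, beta):
    """Slope dS/dT at which alpha*dT and beta*dS cancel in density."""
    return alpha / beta
```

Comparing the output of `cross_model_slope` with `compensated_slope` year by year would correspond, under these assumptions, to comparing the blue and red curves in Fig. 7b.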

3.5 Projected changes in the AMOC streamfunction

The historical mean overturning streamfunctions for the models for which the ‘msftyyz’ or ‘msftmyz’ diagnostics were available are shown in Fig. 8. There is considerable inter-model variability in the maximum strength of the overturning, the latitude at which this occurs, and the depth of the upper limb. Nonetheless, our goal is not to assess the quality of the historical mean AMOC streamfunction but to ascertain whether the AMOC drives the projected WH in the NA SPG. To this end, overplotted are contours showing the projected changes in streamfunction strength. For all the models for which there are AMOC data stored for the RCP8.5 scenario the AMOC upper limb both weakens and shallows. Notably, a historical mean AMOC somewhat different to the MMM, such as the very shallow upper limb in HadGEM3-GC2, does not preclude a weakening remarkably similar to the MMM. Similarly, a historical mean similar to the MMM (e.g. MRI-ESM1) does not necessarily imply a weakening that is also similar in shape to the MMM.

Fig. 8

Time mean Atlantic overturning streamfunctions for those models for which the data were available on the CMIP5 archive, averaged over the historical period (1850–2000). Also plotted are contours of the difference between the future (2070–2100) and historical periods for those models for which the scenario data were available (negative values in black, positive values in white, contoured every 2 Sv, zero line dashed)

The depth of the AMOC maximum in the NA SPG (45–65°N) during the historical period is also shown on Fig. 3 for all models with available diagnostics, along with the MMM value. Note that the depth of the AMOC maximum actually shallows throughout the projections, which is discussed further below. As can be seen, the AMOC depth is more consistent with the MMM depth of the WH although once again there is considerable model diversity. For example, the ACCESS1-0 model is provided with both mixing and AMOC diagnostics, for which the mixing depth provides a more likely explanation of that model’s WH based on analysis of the depth structure alone. However, this is not the case for the similar ACCESS1-3 model and may partially reflect our arbitrary choice of a fixed WH box size as well as potential internal variability. A similar contrast arises for HadGEM2-CC and HadGEM2-ES. All four of these models have relatively small WHs located in the north east of the NA SPG, compared to other models (e.g. IPSL-CM5A-LR) or the MMM. Despite using somewhat different ocean submodels, these four models all use a similar atmosphere model (the Unified Model, Martin et al. 2011), raising the possibility that, despite the WHs appearing to be ocean-driven, their spatial location may be sensitive to details of the atmospheric formulation.
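As an illustration of how a depth of the AMOC maximum in the 45–65°N band might be extracted from a streamfunction field, the sketch below takes the depth of the single largest value within the band. Whether the original analysis uses this definition or, for example, a band-averaged profile is not stated, so this choice, the function name, and the array layout are all assumptions.

```python
import numpy as np


def amoc_max_depth(psi, lat, depth, lat_min=45.0, lat_max=65.0):
    """Depth of the streamfunction maximum within a latitude band.

    psi has shape (n_depth, n_lat), e.g. in Sv; lat and depth are 1-D coordinates.
    """
    psi = np.asarray(psi, dtype=float)
    lat = np.asarray(lat, dtype=float)
    depth = np.asarray(depth, dtype=float)
    in_band = (lat >= lat_min) & (lat <= lat_max)
    band = psi[:, in_band]
    # nanargmax skips land or masked points stored as NaN.
    k, _ = np.unravel_index(np.nanargmax(band), band.shape)
    return depth[k]
```

Applying the same function to successive time means would give a time series of this depth, from which a shallowing could be read off.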

In Sect. 3.2 we noted that the GISS-E2-R model had a large MLD (time mean as well as change into the future) in the WH region, which provided one potential driver of the WH. From Fig. 8 it can be seen that the AMOC change into the future is also large, and indeed the rate of change of the AMOC measured at 45°N in the twenty first century projections is larger than in any of the other models (not shown). As such, neither the AMOC nor changes in MLD can be ruled out as driving the WH in this model without conducting a more rigorous heat budget analysis of the WH, which is unfortunately precluded by the insufficient granularity of the diagnostics available on the CMIP5 archive. It is plausible that the WH simulated by GISS-E2-R may be different in nature to those simulated by the other CMIP5 models. It is also worth noting that the GISS-E2-R model contains a bug in the isopycnal mixing scheme resulting in too much isopycnal mixing (Schmidt et al. 2014), which could result in too much restratification in the mixing regions. This could lead to an over-sensitivity of the model to events that cap vertical mixing and lead to “convection collapse”, as noted by Sgubin et al. (2017, their Figure 3b), even under moderate forcing, which may have a large impact on the local heat budget.

The consistent shallowing of the AMOC upper limb allows us a final way to examine whether the AMOC is responsible for the projected WH in the NA SPG. Figure 9 shows annual mean snapshots of the MMM temperature anomaly profiles (relative to the period 1850–2000) for just the subset of models that also provided AMOC data. The same plot for the entire ensemble is very similar (not shown). The global mean temperature profile (Fig. 9, top row) shows clear surface driven warming that is not evident in the WH location (Fig. 9, middle row). Here, there is a subsurface layer of almost no warming in the top 100 metres, which may be related to reductions in local vertical mixing that inhibit the mixing down of the surface warm anomaly, but that cannot explain the full heat budget of the WH (Sect. 3.3). Below this there is considerable warming but less than is experienced by the global mean. This results in the relative WH in the near surface ocean, as shown in the lower panels of Fig. 9.

Fig. 9

Eleven-year mean multi-model mean (MMM, for the subset of models that provided AMOC data) temperature anomaly (relative to the period 1850–2000) profile snapshots of (top row) the global mean, (middle row) the subpolar gyre warming hole (WH) region, and (bottom row) the difference (defined as the WH). Snapshots are centred at the years 2000 (left column), 2040 (middle column), and 2080 (right column). Dashed lines indicate the one standard deviation range across the subset of models. Also shown is the depth of the upper limb of the Atlantic meridional overturning circulation over the same period for the same subset of models. The global profiles (top row) are repeated over the WH region profiles (middle row) for comparison (green)

Figure 9 also includes the time mean estimate of the AMOC upper limb depth across all available models for the eleven year window centred at the highlighted year. If the AMOC is responsible for the WH then the relative cooling in the near surface ocean should be contrasted by relative warming at greater depths, with the turnover/zero anomaly centred at the depth of the AMOC upper limb. As already shown in Fig. 3, this is indeed the case for the MMM, but in addition, as shown here, the time-evolution of the depth structure of the WH is also explained by changes in the AMOC. That is, as the near surface WH grows, its depth extent actually decreases, which is preceded by a shallowing of the AMOC (Fig. 9, bottom row, left to right). An animated ‘gif’ version of this figure, which further highlights the timing of AMOC and WH depth profile changes on annual to decadal timescales, can be seen in the Supp. Mat. At least in the MMM this provides further evidence of the link between changes in the AMOC and the WH. We note this depth structure and link to the AMOC is consistent with a recent model/observation comparison (Robson et al. 2016). To conclude our analysis, in the next section we investigate the timing of the onset of the WH in more detail and across all the models.
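The comparison between the turnover depth of the relative anomaly and the AMOC upper limb depth can be sketched as follows. This is an illustrative helper only: it assumes 1-D profiles ordered from the surface downwards and locates the shallowest sign change of the WH-region profile minus the global mean profile by linear interpolation. The function name and inputs are hypothetical and are not taken from the original analysis.

```python
import numpy as np


def turnover_depth(wh_profile, global_profile, depth):
    """Shallowest depth where (WH region minus global mean) changes sign.

    Profiles and depth are 1-D, ordered from the surface downwards.
    Returns None if the relative anomaly never changes sign.
    """
    rel = np.asarray(wh_profile, dtype=float) - np.asarray(global_profile, dtype=float)
    depth = np.asarray(depth, dtype=float)
    crossings = np.where(rel[:-1] * rel[1:] < 0)[0]
    if crossings.size == 0:
        return None
    k = crossings[0]
    # Linear interpolation between the two levels bracketing the sign change.
    frac = rel[k] / (rel[k] - rel[k + 1])
    return float(depth[k] + frac * (depth[k + 1] - depth[k]))
```

Evaluating this for each eleven-year window and setting it against the corresponding AMOC upper limb depth would be one way to quantify the co-evolution that Fig. 9 shows qualitatively.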

3.6 Timing of the onset of the warming hole

Finally, we investigate the timing of the onset of the AMOC weakening along with the emergence of the WH and changes in ocean mixing. In Sect. 3.3 we noted that the time series of the MMM AMOC maximum can be characterised as an initially stable phase followed by an approximately linear decline, which can be seen in Fig. 10b. To aid our analysis we define a new metric as the mean mixed layer depth in the deep water formation region (DWFMLD), which we estimate for each model independently in a similar manner to the WH (cf. Fig. 4). As noted in Sect. 3.2, these DWFMLD sites are generally not co-located with the WH.

Fig. 10

Time series of a the warming hole index averaged over model-specific regions as defined in Fig. 1, b Atlantic meridional overturning circulation (AMOC) index (streamfunction value at 45°N and 1000 m depth), c mixed layer depths averaged over the model-specific region of deep water formation (DWFMLD) using just the ‘mlotst’ diagnostic. Colours in panel (a) indicate those models that have neither MLD nor AMOC data (grey), just MLD data (red), just AMOC data (blue), or both MLD and AMOC data (purple). The multi-model mean (MMM) is displayed in black in each case. The vertical lines denote the time of WH onset (1990) and a window of full-width of 50 years around this time. d Lagged regressions between DWFMLD, AMOC, and WH indices over the 50 year window for each model (thin) and the MMM (thick). The lag with maximum correlation in the MMM is shown in brackets

Broadly similar temporal evolutions can be seen in both the DWFMLDs (Fig. 10c) and in the size of the WH (Fig. 10a), but with apparently different times of initial onset. There is more inter-model diversity in WH and DWFMLD behaviour than AMOC behaviour, but as noted in Table 1, different subsets of models provided mixing and AMOC data. The relatively small subset of models that provided both MLD and AMOC data (all models provided WH data by definition) is highlighted and spans the full range of WH time series characteristics, at least as determined by these specific indices.

To analyse the lagged relationships around the WH onset, we estimate the approximate time of the initial development of the MMM WH (1990) and then define a window, centred on this time, with a full width of 50 years. For each model, all combinations of WH, DWFMLD, and AMOC indices are lag regressed against one another, subject to the provision of the required data. This is repeated for the MMM indices (Fig. 10d). Note that, although the correlations shown are high, it can be seen that there is a large amount of autocorrelation in the data.
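A minimal sketch of a windowed lag analysis of this kind is given below. It assumes that one index is held within the 50 year window centred on 1990 while the other is shifted by the lag, and that the lag of interest is the one maximising the Pearson correlation. These conventions, the `max_lag` value, and the function names are assumptions, and the sketch makes no correction for the autocorrelation noted above.

```python
import numpy as np


def lagged_correlation(x, y, years, centre=1990, width=50, max_lag=15):
    """Correlate x(t) with y(t + lag) for t inside the window.

    A positive lag at the correlation maximum means x leads y.
    Returns (lags, correlations) as 1-D arrays.
    """
    x, y, years = (np.asarray(a, dtype=float) for a in (x, y, years))
    half = width // 2
    idx = np.where((years >= centre - half) & (years < centre + half))[0]
    lags = np.arange(-max_lag, max_lag + 1)
    corrs = np.full(lags.shape, np.nan)
    for n, lag in enumerate(lags):
        j = idx + lag
        ok = (j >= 0) & (j < years.size)  # drop pairs falling off the record
        if ok.sum() >= 3:
            corrs[n] = np.corrcoef(x[idx[ok]], y[j[ok]])[0, 1]
    return lags, corrs


def lag_of_max_correlation(lags, corrs):
    """Lag (in years) at which the correlation is largest, ignoring NaNs."""
    return int(lags[np.nanargmax(corrs)])
```

In this convention, calling `lagged_correlation` with a DWFMLD index as `x` and an AMOC index as `y` would return a positive lag if mixed layer changes lead the AMOC.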

In general, reductions in DWFMLD during the onset phase lead weakenings of the AMOC that lead growth in the WH. The MMM response yields an average lag between the DWFMLD and AMOC of 5 years, and a subsequent lag between the AMOC and WH of 5 years, broadly consistent with the multi-model AMOC fingerprinting study of Roberts et al. (2013). The total timescale of 10 years compares favourably with the direct lag between DWFMLD and WH of 10 years, suggesting there is little short-circuiting of the route from DWFMLD to AMOC to WH, while the similar correlations involving the intermediate AMOC step further suggest that the AMOC does play a role. These relationships use the previously defined model-specific WH regions as well as model-specific MLD regions estimated in a similar way (i.e. the regions of largest MLD in Fig. 4). Using broader definitions based on the MMM yields a similar total timescale of 11 years between changes in MLD and WH but a rearrangement of the partitioning; the lag between MLD and AMOC is reduced to 3 years and the lag between the AMOC and WH is increased to 8 years while the correlations with the WH are reduced (not shown). The similarity between the two methods of estimating the MMM total timescale, but differences in the lags related to the AMOC, suggests that, even when using the MMM, allowance does need to be made for the differing spatial patterns of changes from model to model if one is interested in the timings of the processes (e.g. the AMOC) involved.

In summary, although changes in vertical mixing cannot explain the heat budget of the WH region (Sect. 3.3), they are still of crucial importance in determining the reduced ocean heat transport driven by a long term AMOC decline initiated towards the beginning of the twenty first century. This is distinct from the results of Sgubin et al. (2017), who found an important role for vertical mixing in driving a WH in some CMIP5 models that was much more rapid and occurred during the middle of the twenty first century.

4 Discussion

We have investigated the structure of the WH projected by CMIP5 models and found it to vary considerably from one model to another. Similarly, the location of DWF (as diagnosed from ocean MLDs) and the shape of the overturning streamfunction are also model dependent. Nonetheless, there exists a broadly consistent chain of events of a DWFMLD reduction that leads to an AMOC weakening that leads to an increase in the magnitude of the WH over the twenty first century. We have not investigated the origins of the reduction in DWFMLD but can speculate that this arises through some combination of surface ocean warming and freshening (due to increased runoff and ice melt). Note that, although the WH shows a relative cooling compared to the global mean during the scenario period, the absolute change is a warming compared to the historical period in most models, particularly in the regions of deep mixing that are generally not co-located with the WH. This is an important distinction from recent work analysing rapid cooling events (Sgubin et al. 2017).

We also noted that the changes in DWFMLD, AMOC, and WH exhibited a somewhat secular shift from little or no trend to a roughly constant trend. Given that it is DWFMLDs that change first, and that DWF is a somewhat nonlinear process in which small changes at the surface can cap an otherwise nearly unstable water column, it seems plausible that this secular shift originates with the DWFMLDs. Climate models suggest that different DWF regions may be more or less susceptible to climate change (Wood et al. 1999).

In recent work, Sgubin et al. (2017) show that some CMIP5 models can show a rapid regional cooling in the North Atlantic in the mid twenty first century related to rapid changes in MLDs. For these rapid changes, the cooling occurs over the region of largest MLDs, unlike for the slower changes we have investigated here (which would be denoted “non-abrupt” in their analysis). They also find that this phenomenon is more likely to occur in climate models with an improved representation of mixed layers in the present day. As such, further work investigating the sensitivity of MLDs to future climate change (Heuzé and Wåhlin 2017), and the existence of possible tipping points (e.g. Lenton 2011), seems warranted.

Previously, multi-model analysis of preindustrial control simulations within the CMIP5 archive found a link between mean state biases and the drivers of density variability in the NA SPG (Menary et al. 2015b), using broadly the same models as used in this analysis. This might be expected to feed in to the manifestation of decadal or secular (i.e. the onset of a linear trend) variability within the region. However, we find no link between the magnitude of a model’s innate temperature/salinity biases in the near surface NA SPG and the subsequent onset or magnitude of the projected WH, suggesting that the externally forced signal may overwhelm any more subtle cross-model links between mean states and variability. Nonetheless, there are still clearly significant differences in the manifestation of a future WH across the CMIP5 models, and it remains possible that these are systematically related to specific features of the models.

5 Conclusions

We have investigated the nature of a projected future warming hole (WH) in CMIP5 models. We have identified the AMOC as a primary driver of the projected WH in most models as opposed to changes in vertical mixing or surface heat fluxes. Specifically, we find that:

  • The horizontal and vertical structure of the WH varies considerably from model to model.

  • The locations of both the historical mean DWF and its future changes are not co-located with the WH in most models.

  • A heat budget analysis is consistent with a significant role for ocean heat transport in the projected WH.

  • The depth structure of the WH suggests local MLDs are generally not deep enough to directly drive the heat budget of the WH.

  • An analysis of the simultaneous salinity hole is also consistent with a significant role for the AMOC in forming the WH.

  • The AMOC upper limb is projected to weaken and shallow into the future, despite considerable model diversity in the mean state structure.

  • The temporal evolution of the AMOC upper limb depth and depth structure of the WH are further consistent with a significant role for the AMOC.

  • Lagged regression analysis suggests changes in DWF lead AMOC changes by 5 years. The AMOC then leads changes in the heat budget of the WH by a further 5 years.

Here, we have attempted to assess in greater detail, and across all available models, the existence, structure, and origins of a projected North Atlantic warming hole. The WH is manifest somewhat differently in different models (as are the locations of greatest vertical mixing and the character of the overturning streamfunction) but allowing ‘model-specific’ definitions of the WH yields many consistent relationships across the models. Although a role for the AMOC is one general commonality, differences in the pattern and evolution of the projected warming hole are likely to lead to somewhat different associated climate impacts (cf. Haarsma et al. 2015), which should be the focus of future study.