1 Introduction

Adjoint methods provide a mathematically rigorous approach to calculating the impact of changes in all model inputs on a single model output. Adjoint methods express the response of a model as a single, scalar output or cost function that is calculated from the model state. A system of adjoint equations, developed from a differentiated form of the original model equations, then allows the calculation of the linear sensitivity of the cost function to all model “inputs” (initial conditions, surface boundary fields, equation coefficients and parameters) in one calculation. In contrast, traditional perturbation approaches give the sensitivity of all the model outputs to a single model input in one calculation. Note that the linearization takes place about the fully nonlinear model trajectory.

In this paper, we use a simple configuration of a state-of-the-art ocean general circulation model (Marshall et al. 1997a,b) as the basis for an adjoint sensitivity analysis of the strength of the meridional overturning circulation (MOC). This paper has two objectives:

  1. The work of van Oldenborgh et al. (1999) and Galanti and Tziperman (2003) on El Niño–Southern Oscillation (ENSO) sensitivities and of Marotzke et al. (1999) on the Atlantic's heat transport has established that adjoint solutions are robust for 1–2 years. This article shows that adjoint solutions can be derived and are reliable over time scales of several hundred years in coarse-resolution models.

  2. This article focuses on the physical processes that communicate information to the MOC and the time scales over which this information is communicated. The adjoint model solutions in this simplified idealized geometry allow us to observe clearly how perturbations in boundary forcing impact the overturning's strength. While previous studies have used adjoint waves to explain their results, they did not focus on the basic mechanics of these waves.

The prognostic model and its adjoint are described in Sections 2 and 3. The accuracy of the results presented in Section 4 is tested and validated using a Green's function approach at the beginning of that section. The description of the time-dependent mechanisms in the adjoint space makes up the rest of that section. Section 5 includes a brief summary, discussion, and relevance of the results to climate research.

2 Description of the forward model

The ocean model used in this project is the Massachusetts Institute of Technology (MIT) general circulation model; it is described in detail in Marshall et al. (1997a,b). The model is designed to study ocean processes ranging from the nonhydrostatic scale of convection to the hydrostatic global scale. For this study, the model was configured to operate in hydrostatic form on a highly idealized single-basin representation of the Atlantic. The basin is 64° wide in the zonal direction, and extends from 66°S to 70°N. The layout of the basin can be viewed in Fig. 4. A continuous channel was added in southern latitudes to represent the Antarctic circumpolar current (ACC). At 80° wide, this ACC is four grid points wider than the rest of the basin; it extends from 66°S to 46°S. The model resolution is 4°×4° with 15 layers in the vertical, having thicknesses ranging from 50 m near the surface to 690 m at the bottom. The depth of the basin is uniformly 4,200 m, except over the 16° of added longitude of the ACC where a 2,500-m-deep sill is added. The time-stepping procedure is asynchronous (Bryan 1984); the momentum equations have a time step of 40 min, while tracers (T, S) are advected with a time step of 1 day. The Redi isopycnal diffusion scheme is used in all model runs to project the diffusion of tracers along three-dimensional isopycnal surfaces rather than along grid coordinate directions (Redi 1982). The Gent–McWilliams scheme parameterizes the role played by geostrophic eddies in transporting tracers with a “bolus velocity” (Gent and McWilliams 1990; Danabasoglu and McWilliams 1995), which is added to the Eulerian mean velocity. Table 1 summarizes the value of key model parameters. Idealizing the domain geometry and restricting its size simplifies the pathways used to communicate perturbations in the basin. A nonlinear formulation for the equations of state is used, so that the temperature term T represents potential temperature (Marshall et al. 1997a,b).

Table 1 Model parameters

Forcing fields are derived from the zonal, annual average of wind stress, heat flux, and net precipitation fields observed over the Atlantic basin (Schmitt et al. 1989; Trenberth and Solomon 1994; Trenberth et al. 1989; Jiang et al. 1999). The heat flux and freshwater forcing fields are adjusted by adding everywhere a constant term to ensure zero net input of energy or mass into the system. This model has no seasonal cycle; this was beyond the scope of this initial study.

2.1 Circulation patterns

The steady-state meridional streamfunction ψ

$$\psi {\left( {\phi ,z} \right)} = r \cdot \cos {\left( \phi \right)} \cdot {\int_{\lambda _{1} }^{\lambda _{2} } {{\int_z^{z_{B} } {v\,dz\,d\lambda } }} }$$
(1)

is shown in Fig. 1; r is the radius of the Earth, ϕ is the latitude, λ is the longitude, z is depth, and \(z_B\) is the depth of the basin. The meridional overturning peaks in high latitudes at a value of 11.5 Sv, and the heat transported northward in the basin amounts to 0.7 PW, nearly all of it transported by the MOC. As is often the case in similar models (Colin De Verdière 1988; Marotzke and Scott 1999), the actual downward mass transport signature of the MOC takes place along the eastern wall from 40° to 68°N, with peak downwelling velocity in the northeast corner. “Convection” takes place in the northwest corner below the surface's coldest spot. More details about the density adjustment process and its differentiation are provided in Section 4. The separation of the western boundary current from the coast near 50°N plays an important role in isolating the northwest corner and allowing it to cool while warming the northeast corner.
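As an illustration of how Eq. 1 can be evaluated on a regular latitude–longitude grid, the following is a minimal sketch, not the model's own diagnostic code. The function name, the array layout (depth, latitude, longitude), the value of the Earth's radius, and the sign convention (transport accumulated from each level down to the bottom) are all assumptions made for this example.

```python
import numpy as np

EARTH_RADIUS = 6.371e6  # m, assumed value for r in Eq. 1


def meridional_streamfunction(v, lon_deg, lat_deg, dz):
    """Discretize Eq. 1 on a regular grid (illustrative sketch).

    v       : meridional velocity in m/s, shape (nz, nlat, nlon),
              index 0 at the surface, nz-1 at the bottom
    lon_deg : regularly spaced longitudes of the cell centres (degrees)
    lat_deg : latitudes of the cell centres (degrees)
    dz      : layer thicknesses in m, shape (nz,)
    Returns psi in sverdrups, shape (nz, nlat); psi[k] is the transport
    integrated from the top of layer k down to the bottom.
    """
    dlam = np.deg2rad(lon_deg[1] - lon_deg[0])
    # zonal integral of v * dz for every layer and latitude
    layer_transport = (v * dz[:, None, None]).sum(axis=2) * dlam
    # integrate from each level down to the bottom: reversed cumulative sum
    below = np.cumsum(layer_transport[::-1], axis=0)[::-1]
    psi = EARTH_RADIUS * np.cos(np.deg2rad(lat_deg))[None, :] * below
    return psi / 1.0e6  # m^3/s -> Sv


# Small synthetic example: uniform northward flow of 1 cm/s
dz = np.array([100.0, 200.0, 300.0])
lat = np.array([0.0, 60.0])
lon = np.array([2.0, 6.0, 10.0, 14.0])      # 4-degree spacing
v = np.full((dz.size, lat.size, lon.size), 0.01)
psi = meridional_streamfunction(v, lon, lat, dz)
print(psi)
```

With a uniform velocity, the surface value at 60° is half the value at the equator because of the cos(ϕ) factor, which is a quick sanity check on the geometry.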

Fig. 1 Meridional overturning streamfunction (in sverdrups) in the single-basin model when forced with mixed boundary conditions. The rectangular box encloses the area over which the cost function is evaluated

3 Adjoint model

The output of an adjoint model calculation gives the sensitivity of the cost function to the initial conditions as well as to the model's physical parameters. The cost function can be any scalar function of the model output, as long as it remains differentiable.

The reader is referred to Morse and Feshbach (1953), Lanczos (1961) and Hall et al. (1982) for a derivation of the general principles of adjoint construction, both in the continuous and discrete time domains.

In summary, the evolution of the state vector \(\text{y}_n\) between time steps n and n+1 can be represented by the mapping C of the state at one time step onto the next,

$$\text{y}_{n+1} = \text{C}_{n+1}\left( \text{y}_n \right) = \text{C}_{n+1} \circ \text{C}_{n} \circ \cdots \circ \text{C}_{1}\left( \text{y}_0 \right).$$
(2)

In the ocean model used for this analysis, \(\text{y}_n\) is composed of the prognostic variables, namely the tracers temperature (T) and salinity (S) and the three-dimensional velocity vector (\(\vec{u}\)), of the diagnostic variables pressure (p) and density (ρ), and of the model's physical parameters (α). The variables are represented by three-dimensional arrays of the dimensions of the model grid.

The evolution of a perturbation \(\delta\text{y}_n\) of the state vector from time step n to time step n+1 can be represented as:

$$\delta \text{y}_{n+1} = \text{C}'_{n+1}\,\delta \text{y}_{n}.$$
(3)

\(\text{C}'_{n+1}\), the Jacobian of the mapping \(\text{C}_{n+1}\) evaluated at \(\text{y}_n\), is often called the linear propagator. The gradient of the cost function, J, which is a response functional of the output state vector J(\(\text{y}_N\)) at final time t=N, with respect to the model inputs at time t=0 obeys the following equation:

$$\nabla_{y_0} J = \text{C}^{*}_{1}\,\text{C}^{*}_{2} \ldots \text{C}^{*}_{N}\,\nabla_{y_N} J = \text{C}^{*}\,\nabla_{y_N} J$$
(4)

where the superscript * indicates the adjoint of the matrix. This follows from the properties of the inner product in the \(L_2\) norm. In discrete models and the \(L_2\) norm, the adjoint is simply the transpose of the Jacobian matrix. The adjoint model is integrated backwards in time from the last time step to the first: \(\nabla_{y_N} J\) is calculated first before being multiplied successively by \(\text{C}^{*}_{N}\), \(\text{C}^{*}_{N-1}\), and so on back to \(\text{C}^{*}_{1}\).

The adjoint concept can be generalized to any time step; \(\nabla_{y_n} J\) gives the sensitivity of the cost function at time step N to a perturbation of the state variables or parameters applied at time step n.
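The backward sweep of Eq. 4 can be sketched on a toy problem. The three-variable model below, its coefficients, and all function names are hypothetical and chosen only for illustration; they are unrelated to the ocean model. The sketch stores the nonlinear trajectory, multiplies the gradient of J by the transposed Jacobians from the last step to the first, and compares the result with centred finite differences, which need one pair of forward runs per input.

```python
import numpy as np

# Hypothetical three-variable toy model (not the ocean model).
A = np.array([[-0.5, 1.0, 0.0],
              [-1.0, -0.5, 0.3],
              [0.2, 0.0, -0.8]])
DT = 0.05
NSTEPS = 100


def step(y):
    """One forward step C(y): explicit Euler on dy/dt = A y - y**3."""
    return y + DT * (A @ y - y**3)


def step_jacobian(y):
    """Jacobian C'(y) of `step`, i.e. the linear propagator about y."""
    return np.eye(y.size) + DT * (A - np.diag(3.0 * y**2))


def cost(y_final):
    """Scalar cost function J: first component of the final state."""
    return y_final[0]


def forward(y0):
    """Run the nonlinear model and store the trajectory y_0 .. y_N."""
    traj = [y0.copy()]
    for _ in range(NSTEPS):
        traj.append(step(traj[-1]))
    return traj


def adjoint_gradient(traj):
    """Backward sweep: apply transposed Jacobians from step N to step 1."""
    lam = np.zeros_like(traj[-1])
    lam[0] = 1.0                      # gradient of J with respect to y_N
    for y in reversed(traj[:-1]):     # y_{N-1}, ..., y_0
        lam = step_jacobian(y).T @ lam
    return lam                        # gradient of J with respect to y_0


def finite_difference_gradient(y0, eps=1.0e-6):
    """Reference gradient: one pair of forward runs per input component."""
    grad = np.zeros_like(y0)
    for i in range(y0.size):
        dy = np.zeros_like(y0)
        dy[i] = eps
        j_plus = cost(forward(y0 + dy)[-1])
        j_minus = cost(forward(y0 - dy)[-1])
        grad[i] = (j_plus - j_minus) / (2.0 * eps)
    return grad


y0 = np.array([0.5, -0.2, 0.1])
grad_adj = adjoint_gradient(forward(y0))
grad_fd = finite_difference_gradient(y0)
print("adjoint gradient:          ", grad_adj)
print("finite-difference gradient:", grad_fd)
```

The adjoint route costs one forward and one backward sweep regardless of the number of inputs, whereas the finite-difference route scales with the size of the input vector.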

The derivation of the adjoint of the MIT general circulation model is performed automatically by the tangent linear and adjoint model compiler (TAMC). The approach used by TAMC is to derive the adjoint of the numerical code itself [as opposed to the finite difference of adjoint approach (Sirkes and Tziperman 1997), which derives the adjoint equations to the linearized continuous model and subsequently finds a suitable finite difference formulation]. A detailed description of the construction of the adjoint code and an example of its implementation can be found in Marotzke et al. (1999). The procedure used by TAMC is to perform first an integration of the forward model while storing the variables required by the adjoint model to calculate the derivative terms. For 400- or 800-year-long integrations, the storage required during the forward run rapidly exceeds the capacity of any storage device. A procedure called checkpointing (Griewank and Corliss 1991; Heimbach et al. 2002), which consists of splitting the model’s time-stepping into multiple embedded loops, is used to reduce the storage requirements. The outer integrations store the model state at intermediate checkpoints. The inner integration then restarts from these intermediate checkpoints and stores the model states, which are used by the adjoint run. Three-level checkpointing is required to achieve integrations on climatological time scales.
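The storage trade-off behind checkpointing can be sketched with two nested loops on a toy problem. This is only an illustration of the idea, not the TAMC implementation; the two-variable model and every name below are hypothetical. The outer loop keeps one state per segment, and the backward sweep recomputes the states inside each segment before applying the transposed Jacobians.

```python
import numpy as np

A = np.array([[-0.5, 1.0], [-1.0, -0.5]])   # hypothetical toy dynamics
DT = 0.05


def step(y):
    """One forward step of the toy model."""
    return y + DT * (A @ y - y**3)


def adjoint_step(y, lam):
    """Apply the transposed Jacobian of `step`, evaluated at y, to lam."""
    jac = np.eye(y.size) + DT * (A - np.diag(3.0 * y**2))
    return jac.T @ lam


def gradient_full_storage(y0, nsteps):
    """Store every state of the forward run, then sweep backwards."""
    traj = [y0]
    for _ in range(nsteps):
        traj.append(step(traj[-1]))
    lam = np.array([1.0, 0.0])               # gradient of J = y_N[0]
    for y in reversed(traj[:-1]):
        lam = adjoint_step(y, lam)
    return lam, len(traj)


def gradient_checkpointed(y0, n_outer, n_inner):
    """Two-level checkpointing over n_outer * n_inner time steps."""
    # Outer sweep: keep only the state at the start of each segment.
    checkpoints = []
    y = y0
    for _ in range(n_outer):
        checkpoints.append(y)
        for _ in range(n_inner):
            y = step(y)
    lam = np.array([1.0, 0.0])
    max_stored = len(checkpoints)
    # Backward sweep over the segments, last segment first.
    for y_start in reversed(checkpoints):
        segment = [y_start]                  # recompute the inner states
        for _ in range(n_inner - 1):
            segment.append(step(segment[-1]))
        max_stored = max(max_stored, len(checkpoints) + len(segment))
        for y_seg in reversed(segment):
            lam = adjoint_step(y_seg, lam)
    return lam, max_stored


y0 = np.array([0.5, -0.2])
grad_full, stored_full = gradient_full_storage(y0, 400)
grad_ckpt, stored_ckpt = gradient_checkpointed(y0, 20, 20)
print(grad_full, stored_full)
print(grad_ckpt, stored_ckpt)
```

In this sketch the gradient is unchanged while the peak number of stored states drops from the number of time steps to the sum of the two loop lengths, at the price of one extra forward integration. Adding a third nested loop applies the same idea once more.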

4 Cost function

In this study, we define a cost function, J, that is a measure of the MOC’s strength. The cost function we use is the average value of the meridional overturning streamfunction defined in Eq. 1, calculated between ϕ = 60–64°N and 1,055–1,785 m, where it peaks:

$$J = \psi _{{MAX}} = \overline{\psi } {\left( {60 - 64^{ \circ } N;1055 - 1785\;m} \right)}.$$
(5)

The region from which \(\psi_{MAX}\) is evaluated is indicated by the rectangular box in Fig. 1. The adjoint model can then be used to calculate the sensitivities of \(\psi_{MAX}\) with respect to perturbations in wind forcing (\(\frac{\partial \psi_{MAX}}{\partial \tau_x}\) and \(\frac{\partial \psi_{MAX}}{\partial \tau_y}\)), heat and freshwater fluxes (\(\frac{\partial \psi_{MAX}}{\partial Q}\) and \(\frac{\partial \psi_{MAX}}{\partial (E - P_{obs})}\)), or any other model parameter. Examining the patterns of spatial and temporal variability will allow us to develop a qualitative picture of the dynamical mechanisms that affect the MOC.
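A box average such as Eq. 5 reduces to a masked mean over a latitude–depth array. The sketch below assumes an unweighted mean over the grid points that fall inside the box; the text does not say whether the model's average is weighted by cell size, and the function name and the sample coordinates are hypothetical.

```python
import numpy as np


def box_mean(psi, lat_deg, depth_m,
             lat_range=(60.0, 64.0), depth_range=(1055.0, 1785.0)):
    """Unweighted mean of psi (shape: ndepth x nlat) inside a lat-depth box."""
    lat_mask = (lat_deg >= lat_range[0]) & (lat_deg <= lat_range[1])
    depth_mask = (depth_m >= depth_range[0]) & (depth_m <= depth_range[1])
    return psi[np.ix_(depth_mask, lat_mask)].mean()


# Synthetic field: psi[k, j] = 10 * k + j, on made-up coordinates
lat = np.array([56.0, 60.0, 64.0, 68.0])
depth = np.array([500.0, 1055.0, 1400.0, 1785.0, 2500.0])
psi = 10.0 * np.arange(depth.size)[:, None] + np.arange(lat.size)[None, :]
cost_value = box_mean(psi, lat, depth)
print(cost_value)
```

Because the cost is a linear function of ψ, its gradient with respect to the final-time streamfunction is constant inside the box and zero outside, which is what seeds the adjoint integration.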

5 Adjustment mechanisms

The sensitivity terms are a function of both space and time (for example, \(\frac{\partial \psi_{MAX}}{\partial \tau_x} = \frac{\partial \psi_{MAX}}{\partial \tau_x}\left( \lambda ,\phi ,t \right)\) since \(\tau_x\) is a two-dimensional forcing field). The temporal evolution of the sensitivities allows us to see how \(\psi_{MAX}\) responds to perturbations to reach a new equilibrium state. Equilibration can follow two basic paths. For parameters that are time-invariant such as the zonal wind stress, \(\tau_x\), sensitivities asymptote to a constant value over long time scales. A change in the value of these parameters will permanently alter \(\psi_{MAX}\). For time-dependent variables such as the model's sea surface temperature, T, the sensitivity to a perturbation in T at an initial time asymptotes to zero with time. Over long-enough time scales, \(\psi_{MAX}\) will “forget” about that perturbation and return to its original value.
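These two paths can be illustrated with a scalar relaxation model, a hypothetical example that is not part of the ocean model: a state y damped at a rate k and driven by a constant parameter p, starting from the initial value \(y_0\).

$$\frac{dy}{dt} = -k\,y + p, \qquad y(t) = y_0\,e^{-kt} + \frac{p}{k}\left( 1 - e^{-kt} \right),$$

$$\frac{\partial y(t)}{\partial y_0} = e^{-kt} \to 0, \qquad \frac{\partial y(t)}{\partial p} = \frac{1 - e^{-kt}}{k} \to \frac{1}{k} \quad \text{as } t \to \infty .$$

In this sketch the sensitivity to the initial condition decays on the time scale 1/k, while the sensitivity to the time-invariant parameter saturates at a constant value on the same time scale.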

Figure 2 shows the sensitivity to a time-invariant parameter, \(\tau_x\) (right), and the initial value sensitivity to a perturbation in T (left) for sample grid points. These figures show time decreasing to the left and are to be interpreted as the influence of a perturbation applied n years ago on the final model state at t=0. For example, the sensitivity to a zonal wind stress perturbation imposed at 18°N and maintained for 800 years (top right panel) is approximately 0.8 Sv (N m⁻²)⁻¹. The sensitivity to an initial perturbation in temperature, T, imposed at year t=−800 is very close to zero (left two panels). From these figures, it is clear that in a model of this dimension, most steady-state information will be obtained after approximately 400 years.

Fig. 2 Time series of the sensitivity of the meridional overturning's maximum streamfunction \(\psi_{MAX}\) to the initial surface temperature distribution, \(\frac{\partial \psi_{MAX}}{\partial T}\) in Sv °C⁻¹ (left), and to the zonal wind stress field, \(\frac{\partial \psi_{MAX}}{\partial \tau_x}\) in Sv (N m⁻²)⁻¹ (right), for sample points at different latitudes in the model; the sample points were at equal distance from the eastern and western boundaries of the model

5.1 Accuracy of the adjoint sensitivities

Diagnostic analysis was performed by choosing the points that exhibited the largest or most interesting sensitivities, perturbing the relevant parameter at those points, and running the forward model with these perturbations. The use of this traditional Green's function approach has two advantages:

  1. It allows the observation of the response of all model variables to the effect of a perturbation.

  2. Provided that the direct perturbation is small, it allows a verification of the accuracy of the sensitivities derived with the adjoint method.

The accuracy of the adjoint model was estimated by comparing adjoint and finite difference sensitivities. For a 400-year integration of the single-basin model and 22 point perturbations of relevant model variables (the perturbation strength was either 1 or 10% of the parameter value), the difference between the two methods was on average 2.6% and never worse than 8.2%.
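The kind of comparison described above can be sketched on a scalar stand-in. Here a square-root function plays the role of the forward model and its analytic derivative plays the role of the adjoint sensitivity; both are hypothetical choices for illustration, and the numbers they produce say nothing about the ocean model. The sketch shows how the mismatch between a finite-difference estimate and the linear sensitivity depends on the size of the perturbation.

```python
import math


def relative_mismatch(func, dfunc, p, fraction):
    """Relative difference between a one-sided finite difference and dfunc.

    The perturbation is `fraction` times the parameter value p, mirroring
    perturbations expressed as a percentage of the parameter.
    """
    dp = fraction * p
    finite_difference = (func(p + dp) - func(p)) / dp
    linear = dfunc(p)
    return abs(finite_difference - linear) / abs(linear)


def toy_model(p):
    """Hypothetical nonlinear response to a parameter p."""
    return math.sqrt(p)


def toy_sensitivity(p):
    """Exact derivative of toy_model, standing in for the adjoint value."""
    return 0.5 / math.sqrt(p)


mismatch_1pct = relative_mismatch(toy_model, toy_sensitivity, 2.0, 0.01)
mismatch_10pct = relative_mismatch(toy_model, toy_sensitivity, 2.0, 0.10)
print(f"1% perturbation:  {100 * mismatch_1pct:.2f}% mismatch")
print(f"10% perturbation: {100 * mismatch_10pct:.2f}% mismatch")
```

In this toy case the larger perturbation gives the larger mismatch, because the finite difference then samples more of the curvature of the response, which is why such checks require the direct perturbation to be small.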

Because of its coarse resolution, this model does not contain any of the obviously nonlinear processes such as eddy formation, shedding, and reabsorption, which would limit the time scale over which the adjoint sensitivity remains accurate. All remaining nonlinear processes, notably advection and convective overturning, appear easily linearized in this particular framework. The ocean model uses the Adams–Bashforth time-stepping scheme, which has no computational mode. This avoids the stability problems mentioned by Sirkes and Tziperman (1997) with the Leapfrog scheme, which has a computational mode that requires damping in both the forward and adjoint models.

5.2 Adjoint Kelvin and Rossby waves

Both Kelvin wave and Rossby wave dynamics are present in the adjoint model. This has already been observed by van Oldenborgh et al. (1999) and Galanti and Tziperman (2003) in their studies of ENSO with adjoint models. These studies focused on time scales of 1–2 years, much shorter than the time scales of interest to the MOC.

Kelvin waves are one of the fundamental processes that carry perturbations around an ocean basin (Kawase 1987). This has been noted in relation to the adjustment of the thermocline in response to changes in deepwater formation (Huang et al. 2000), as well as to explain how changes in wind stress forcing over the Southern Ocean relate to changes in overturning strength (McDermott 1996). In both studies, coastal and equatorial Kelvin waves transport the perturbation signal rapidly along the boundaries of the basin and along the equator. The interior flow adjusts through the radiation of Rossby waves from the boundaries. The adjoint model gives a new visual perspective on the propagation of these waves.

Equatorial Kelvin waves in a shallow-water framework obey the following equations:

$$\begin{array}{*{20}l} {{\frac{{\partial u}}{{\partial t}} - g\frac{{\partial \eta }}{{\partial x}} = 0} \hfill} \\ {{\frac{{\partial \eta }}{{\partial t}} + H\frac{{\partial u}}{{\partial x}} = 0} \hfill} \\ \end{array} ,$$
(6)

where u is the zonal velocity, η is the perturbation to the sea surface height, and H is the mean depth of the layer. This linear Kelvin wave operator can be written succinctly as:

$$\mathcal{A}\xi = 0,$$
(7)

with \(\xi = {\left[ {\begin{array}{*{20}c} {\eta } \\ {u} \\ \end{array} } \right]}\) and

$$\mathcal{A} = \left[ \begin{array}{ll} \frac{\partial}{\partial t} & H\frac{\partial}{\partial x} \\ -g\frac{\partial}{\partial x} & \frac{\partial}{\partial t} \end{array} \right].$$
(8)

Green's identity states that:

$$\iint \left( \xi_1 \mathcal{A}\xi_2 - \xi_2 \widetilde{\mathcal{A}}\xi_1 \right) dx\,dt = \text{boundary terms},$$
(9)

where \(\xi_1\) and \(\xi_2\) are arbitrary vectors, and \(\widetilde{\mathcal{A}}\) is the adjoint operator. If Eq. 9 is satisfied, then the adjoint variable \(\widetilde{\xi}\) satisfies the adjoint equation:

$$\widetilde{\mathcal{A}}\,\widetilde{\xi} = 0.$$
(10)

The adjoint operator is derived by integrating the first part of Eq. 9 by parts. It is:

$$\begin{array}{l} \iint \xi_1 \mathcal{A}\xi_2 \,dx\,dt = \iint \left[ \eta_1 \left( \frac{\partial \eta_2}{\partial t} + H\frac{\partial u_2}{\partial x} \right) + u_1 \left( -g\frac{\partial \eta_2}{\partial x} + \frac{\partial u_2}{\partial t} \right) \right] dx\,dt = \\ \iint \left[ \eta_2 \left( -\frac{\partial \eta_1}{\partial t} + g\frac{\partial u_1}{\partial x} \right) - u_2 \left( H\frac{\partial \eta_1}{\partial x} + \frac{\partial u_1}{\partial t} \right) \right] dx\,dt \; + \\ \int \left[ \eta_1 \eta_2 + u_1 u_2 \right]_{t_a}^{t_b} dx + \int \left[ H u_2 \eta_1 - g u_1 \eta_2 \right]_{x_a}^{x_b} dt . \end{array}$$
(11)

The operator which satisfies Green's identity is:

$$\widetilde{\mathcal{A}} = \left[ \begin{array}{ll} -\frac{\partial}{\partial t} & g\frac{\partial}{\partial x} \\ -H\frac{\partial}{\partial x} & -\frac{\partial}{\partial t} \end{array} \right].$$
(12)

\(\widetilde{\xi }\) therefore satisfies the Kelvin wave equation with time and space reversed:

$$\widetilde{\xi }{\left( {x,t} \right)} = \xi {\left( { - x, - t} \right)}.$$
(13)
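The statement that the discrete adjoint is a transpose can be checked numerically for the spatial part of the operators in Eqs. 8 and 12. The sketch below assumes a periodic grid with centred differences, so that the boundary terms of Green's identity vanish; the grid size and the values of g and H are arbitrary, and the time derivative is left out.

```python
import numpy as np


def periodic_ddx(n, dx):
    """Centred first-derivative matrix on a periodic grid of n points."""
    d = np.zeros((n, n))
    for i in range(n):
        d[i, (i + 1) % n] = 1.0 / (2.0 * dx)
        d[i, (i - 1) % n] = -1.0 / (2.0 * dx)
    return d


n, dx = 16, 1.0
g, H = 9.81, 100.0            # arbitrary illustrative values
Dx = periodic_ddx(n, dx)
Z = np.zeros((n, n))

# Spatial part of the operator in Eq. 8, acting on xi = [eta, u]
A_space = np.block([[Z, H * Dx], [-g * Dx, Z]])
# Spatial part of the adjoint operator in Eq. 12
A_tilde_space = np.block([[Z, g * Dx], [-H * Dx, Z]])

rng = np.random.default_rng(0)
xi1 = rng.standard_normal(2 * n)
xi2 = rng.standard_normal(2 * n)
lhs = xi1 @ (A_space @ xi2)          # <xi1, A xi2>
rhs = xi2 @ (A_tilde_space @ xi1)    # <xi2, A~ xi1>
print(lhs, rhs)
print(np.allclose(A_space.T, A_tilde_space))
```

The centred-difference matrix is antisymmetric, which is the discrete counterpart of the sign change produced by integrating a first derivative by parts.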

The zonal velocity of the Kelvin wave is furthermore in balance with the meridional pressure gradient:

$$\beta yu = - g\frac{{\partial \eta }}{{\partial y}}.$$
(14)

The adjoint operator implies that a reversal of space in the meridional direction is sufficient to satisfy the equation \(\widetilde{\xi }{\left( y \right)} = \xi {\left( { - y} \right)}.\)

The general solution to the classic shallow-water equatorial Kelvin wave problem is a wave of the form:

$$\eta\left( x,y,t \right) = e^{y^{2}/4a^{2}} F\left( x + ct \right) + e^{-y^{2}/4a^{2}} G\left( x - ct \right),$$
(15)

where \(a = \sqrt{c/(2\beta)}\) is the equatorial radius of deformation, and \(c = \sqrt{gH}\) is the phase speed of the wave. The solution, which decays exponentially away from the equator in the meridional direction, is an eastward-propagating wave:

$$\eta\left( x,y,t \right) = e^{-y^{2}/4a^{2}} G\left( x - ct \right).$$
(16)

The adjoint Kelvin wave thus has the general form:

$$\widetilde{\eta}\left( x,y,t \right) = \eta\left( -x,-y,-t \right) = e^{y^{2}/4a^{2}} F\left( -x - ct \right) + e^{-y^{2}/4a^{2}} G\left( -x + ct \right).$$
(17)

However, it makes little sense to analyze these adjoint waves with time going in the normal “forward” direction since this would be tantamount to observing a wave that is moving toward the impulse that causes it, and not away from it. It would furthermore represent a wave moving towards an impulse that will take place later in time, as shown on the left-hand side in Fig. 3. The model results are therefore more intuitive when analyzed with time reversed [η(−x, −y, t)]. The wave then propagates away from the impulse that causes it, as shown on the right-hand side of Fig. 3. In that case, the equatorially trapped adjoint Kelvin wave can be expected to propagate from east to west:

$$\widetilde{\eta}\left( x,y,-t \right) = \eta\left( -x,-y,t \right) = e^{-y^{2}/4a^{2}} G\left( -x - ct \right).$$
(18)

Fig. 3 Illustration of the adjoint model's response to a perturbation source. Time runs in the normal “forward” direction and the wave propagates towards the source that causes it (left). With time reversed, the wave propagates away from the source (right)

A similar analysis of the mathematics of adjoint coastal Kelvin waves indicates that, when viewed with time going backwards, they can be expected to propagate with the boundary on their left in the Northern hemisphere and on their right in the Southern hemisphere.

The mathematics of the simplest adjoint Rossby waves can be analyzed with the same method. In its simplest form, the baroclinic Rossby wave can be written as:

$$\partial _{{txx}} p + \partial _{{tyy}} p + \frac{{f^{2} }}{{N^{2} }}\partial _{{tzz}} p + \beta \partial _{x} p = 0.$$
(19)

For constant f, β, and \(N^2\), the adjoint operator is:

$$\widetilde{\mathcal{A}} = -\partial_{txx} - \partial_{tyy} - \frac{f^{2}}{N^{2}}\partial_{tzz} - \beta\partial_{x}.$$
(20)

If p(x, y, z, t) is a solution of the baroclinic Rossby wave equation, \(\widetilde{p}\left( x,y,z,t \right) = p\left( -x,y,z,-t \right)\) will satisfy the adjoint baroclinic Rossby wave equation. It is interesting to point out that Rossby waves are invariant to a change in meridional direction but not to changes in the zonal direction and time. For reasons already mentioned, the evolution of these waves is easier to interpret when observed with time reversed.

Figure 4 shows the time evolution of \(\frac{\partial \psi_{MAX}}{\partial \tau_x}\). This term captures the equilibration of \(\psi_{MAX}\) to a perturbation in the zonal wind stress forcing, \(\tau_x\). Each grid point on each plot shows the accumulated effect on \(\psi_{MAX}\) of a perturbation in \(\tau_x\) applied at that location, and maintained for 6 months, 1.5, 2, 3, 5, and 100 years, respectively. The sequential passage of wave crests and troughs erases any wavelength signature in the adjoint Kelvin and Rossby wave signals. Such a wavelength signature would, for example, be present in the sensitivity pattern to an initial perturbation in the velocity field which is maintained during a single time step. Unfortunately, these sensitivity patterns to initial value perturbations decay rapidly.

Fig. 4 Sensitivity of the meridional overturning streamfunction maximum \(\psi_{MAX}\) to the zonal wind stress forcing: \(\frac{\partial \psi_{MAX}}{\partial \tau_x}\) in sverdrups per newton per square meter for 6 months, 1.5, 2, 3, 5, and 100 years. All plots are on the same scale, −0.6 to 0.6 Sv (N m⁻²)⁻¹, except for the last one, which is on a scale four times larger

In the first 6 months, a band of high sensitivity is located directly over the latitudes where \(\psi_{MAX}\) is calculated. A hypothetical positive (westerly) wind perturbation, \(\Delta\tau_x > 0\), would induce a southward Ekman transport near the surface, which would be compensated at depth by a northward geostrophic return flow. On a 6-month time scale, this circulation would weaken the overturning, \(\Delta\psi_{MAX} < 0\), leading to a negative sensitivity, \(\frac{\psi_{MAX}\left( \tau_x + \Delta\tau_x \right) - \psi_{MAX}\left( \tau_x \right)}{\Delta\tau_x} = \frac{\Delta\psi_{MAX}}{\Delta\tau_x} < 0\). Similarly, a hypothetical easterly wind perturbation (\(\Delta\tau_x < 0\)) would strengthen the overturning, again leading to a negative sensitivity. Elsewhere, sensitivities are close to zero, indicating that zonal wind stress perturbations further away from the cost function location take more than 6 months to have any impact on \(\psi_{MAX}\).

At around 1 year, an adjoint boundary Kelvin wave, propagating north to south along the eastern boundary, reaches the equator. The adjoint wave then crosses the basin, propagating from east to west, in less than 1 year. Given that the basin is 60° wide, the wave phase speed can be approximated at 0.2 m s⁻¹, which is slower than the first baroclinic mode in the world's oceans, estimated to be 2.4–2.8 m s⁻¹ (Johnson and McPhaden 1993; Gill 1982). The propagation speed of equatorial Kelvin waves is distorted by a factor \(\frac{1}{\sqrt{\alpha}}\) because of the asynchronous time-stepping used to accelerate the model's convergence (Bryan 1984). α is the model's “time-stretching” factor, or the ratio between the tracer and momentum time steps: \(\alpha = \frac{1\;\text{day}}{40\;\text{min}} = 36\). The effective velocity of the Kelvin wave in an equivalent model with synchronous time-stepping should therefore be six times larger, or approximately 1.2 m s⁻¹, still slower than observed. Huang and Pedlosky (2002) have noted distortions in Rossby wave phase speed and structure in ocean models using asynchronous time-stepping. Repeating this analysis in a model with synchronous time-stepping would provide an accurate estimate of the phase speed of the model's Kelvin wave. Computational limitations would, however, restrict the total integration period. Given the absence of a seasonal cycle in the forcing fields, the asynchronous time-stepping should not, however, affect the equilibrium solutions (Danabasoglu et al. 1996; Danabasoglu 2004).
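The phase-speed estimate above can be reproduced with a few lines of arithmetic. The sketch assumes a crossing time of exactly 1 year, which the text gives only as an upper bound, so the resulting speed is a lower bound; the Earth's radius and the length of the year are assumed values, and the variable names are for illustration only.

```python
import math

EARTH_RADIUS = 6.371e6                  # m (assumed)
SECONDS_PER_YEAR = 365.0 * 86400.0      # assumed calendar

basin_width_deg = 60.0                  # basin width quoted in the text
crossing_time_years = 1.0               # upper bound quoted in the text

# Distance along the equator and the implied phase speed in the model
distance_m = math.radians(basin_width_deg) * EARTH_RADIUS
c_model = distance_m / (crossing_time_years * SECONDS_PER_YEAR)

# Time-stretching factor: tracer time step over momentum time step
alpha = (24 * 60) / 40
c_synchronous = c_model * math.sqrt(alpha)

print(f"model phase speed:      {c_model:.2f} m/s")
print(f"alpha:                  {alpha:.0f}")
print(f"synchronous equivalent: {c_synchronous:.2f} m/s")
```

Both numbers remain below the 2.4–2.8 m s⁻¹ quoted for the first baroclinic mode, in line with the statement in the text.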

The adjoint equatorial Kelvin wave reaches the western equatorial boundary after about 2 years. After impinging on the western boundary, it is reflected and refracted in three directions: generating a pair of adjoint boundary Kelvin waves that propagate toward the poles in both hemispheres and an adjoint equatorial Rossby wave. The speed of propagation of the first baroclinic mode of the equatorially trapped Rossby wave should be a third of the speed of the equivalent mode of the Kelvin wave (Gill 1982). It takes the adjoint Rossby wave 3–4 years to cross the basin, as seen at year 5.

As suggested by McDermott (1996), the Kelvin waves, which advance by depressing isopycnals ahead of their route, most likely get halted in their journey around the basin at the neutrally buoyant column in the northwest corner where the isopycnals are near vertical. At year 5, a perturbation can be observed in the northwest corner as the wave locks on to the site of “convection” and perturbs it. Meanwhile, the boundary waves have communicated information as far south as the ACC.

Beyond 5 years, the sensitivity pattern begins to show added complications as the adjoint equatorial Rossby wave is again reflected and refracted, and the circulation in the interior is established by the radiation of adjoint Rossby waves eastward from the boundaries (Huang et al. 2000; McDermott 1996). It is also over those time scales that adjoint advection begins to play an important role.

The patterns observed in steady state begin to emerge after 50–100 years: a band of high equatorial sensitivity surrounded by two bands of opposite polarity in the tropics. This time scale suggests that, beyond wave motions, advection also plays an important role in determining the final sensitivity pattern. It is also after approximately 100 years that peak sensitivities are reached.

Beyond 100 years, diffusion and advection both play important roles. The final pattern (not shown) is set after about 400 years.

5.3 Adjoint advection

The advection operator can only be treated as linear for passive tracers in an incompressible medium. In that case only, advection and adjoint advection operators are:

$$\mathcal{A} = \frac{\partial}{\partial t} + u\frac{\partial}{\partial x} + v\frac{\partial}{\partial y} + w\frac{\partial}{\partial z}$$
(21)

$$\widetilde{\mathcal{A}} = -\frac{\partial}{\partial t} - u\frac{\partial}{\partial x} - v\frac{\partial}{\partial y} - w\frac{\partial}{\partial z}.$$
(22)

The adjoint advection of a tracer, for example temperature, with no temperature sources or sinks (\(\mathcal{A}T = 0\)), is \(\widetilde{\mathcal{A}}\widetilde{T} = 0\). \(\widetilde{T}\) satisfies the tracer advection equation with time and space reversed: \(\widetilde{T}\left( x,y,z,t \right) = T\left( -x,-y,-z,-t \right)\). In a Lagrangian sense, information is advected backwards in time. Note that this interpretation does not extend to nonlinear advective effects such as those present in the advection of the velocity field.
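A one-dimensional discrete analogue makes the direction of adjoint advection concrete. The sketch below assumes a first-order upwind scheme on a periodic grid with a constant positive velocity; it is not the model's advection scheme, and the names are hypothetical. With a Courant number of 1 the scheme is an exact shift by one cell per step, which keeps the example easy to follow.

```python
import numpy as np


def upwind_matrix(n, courant):
    """First-order upwind advection on a periodic grid, velocity u > 0.

    One step reads T_new[i] = (1 - courant) * T[i] + courant * T[i - 1].
    """
    m = np.zeros((n, n))
    for i in range(n):
        m[i, i] = 1.0 - courant
        m[i, (i - 1) % n] = courant
    return m


n, nsteps = 12, 3
M = upwind_matrix(n, courant=1.0)

# Forward model: a tracer pulse released in cell 5 is carried downstream.
T = np.zeros(n)
T[5] = 1.0
for _ in range(nsteps):
    T = M @ T
forward_peak = int(np.argmax(T))

# Adjoint model: the cost is the final tracer value in cell 8; applying the
# transposed matrix traces the sensitivity back to where the tracer started.
lam = np.zeros(n)
lam[8] = 1.0
for _ in range(nsteps):
    lam = M.T @ lam
adjoint_peak = int(np.argmax(lam))

print("forward pulse ends in cell", forward_peak)
print("sensitivity originates in cell", adjoint_peak)
```

The transposed operator moves the sensitivity upstream, against the flow, which is the discrete counterpart of information being advected backwards in time. With a Courant number below 1 the same picture holds, blurred by numerical diffusion.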

The patterns of sensitivity of the maximum value of the meridional streamfunction to the wind stress forcing provide an interesting picture of the ways in which mechanically induced perturbations travel around the basin. The advective pathways are best observed when a perturbation in the buoyancy field is applied. The example shown in Fig. 5 is a time series of the sensitivity of the streamfunction maximum to the heat flux forcing field for a perturbation maintained during 1, 5, 10, 15, 20, and 100 years. The model is forced at the surface with constant heat and freshwater fluxes. Other buoyancy perturbations such as changes in the net precipitation field have a similar impact.

Fig. 5 Sensitivity of the streamfunction maximum \(\psi_{MAX}\) to the imposed heat flux under flux boundary conditions: \(\frac{\partial \psi_{MAX}}{\partial Q}\) in sverdrups per watt per square meter. The figures show the sensitivity after 1, 5, 10, 15, 20, and 100 years. All figures are plotted on the same scale: −0.012 to 0.012 Sv (W m⁻²)⁻¹

On short time scales, the mechanisms that can influence the overturning are “convection,” rapid vertical advection, and the response at depth to a surface perturbation. “Convection” is kept in quotes because the model ceases to be statically unstable at any grid point during the initial model spin-up. The projection of the Gent–McWilliams isopycnal diffusion onto the vertical in regions of steeply sloping isopycnals is sufficient to homogenize a kilometer-thick layer over a few weeks. This efficient adiabatic diffusion substitutes itself for convective mixing associated with static instability during the model spin-up, but it has the advantage of being differentiable for the purposes of adjoint code construction. “Convection” conveys information downwards on the western boundary. Along the eastern boundary, the vertical advection is sufficiently strong to communicate a perturbation to 1,000 m in just about 1 year. Elsewhere, the sensitivity is most likely related to the effect of a surface thermal, or equivalently, vorticity perturbation. For a perturbation of 400 km radius and an average stratification N of approximately 10⁻³ s⁻¹, the Rossby depth or scale over which the perturbation will decay is close to 4 km. This means that a surface perturbation can influence the circulation at the depth at which the cost function is diagnosed. This explains the generally positive, yet weak, sensitivities observed over time scales of less than a year between 60 and 68°N everywhere but at the eastern wall.

During the first decade, the source of the positive sensitivity spreads eastward along the northern wall and southward along the western boundary. This is, in both cases, due to adjoint advection by the sum of Eulerian and bolus velocities. By year 10, the signal reaches the northern extent of the subtropical gyre and is rapidly advected southward by the adjoint western boundary current. The pattern also shows the role played by the recirculation of water along the northern boundary of the subtropical gyre: along the eastern boundary, the negative sensitivities extend to 35°N and westward from there, again in agreement with advection by the flow field. By year 20, the northern boundary of the subtropical gyre in the northern hemisphere and the equatorial boundary are clearly outlined. One can also notice that signals from the southern hemisphere are first communicated northward through an eastern equatorial passage. After catching on to the western boundary current, the signal travels around the subtropical gyre down to the equator in approximately 10 years, which corresponds to the observed average surface flow velocity of 3–4 cm s⁻¹. The zero contour separates the water that is advected towards the northwestern part of the basin from that which reaches the eastern boundary directly.

The signal begins to spread in the southern hemisphere after approximately 20 years; by year 100, it has filled the basin. Most of the features of the steady-state pattern of sensitivity are present after 100 years.

6 Summary and discussion

The adjoint sensitivity fields provide a remarkably clear picture of the pathways followed by wave and advective motions to influence the MOC.

Adjoint Kelvin waves take approximately 3 years to go around the basin, clockwise in the northern hemisphere, or to reach the ACC from the western boundary in the southern hemisphere. This phase velocity is at least six times slower than what would be obtained in a model with synchronous time-stepping. Adjoint Rossby waves are slower by a factor of 3 or 4; equilibrating the interior portion of the circulation therefore takes decades to centuries. The sequence of adjustments is very similar to the response to a new source of deep water described by Huang et al. (2000) in a simplified model, albeit viewed in adjoint space. Adjoint advection is slower still: it takes 20 years for a perturbation in the subtropical gyre to be advected to the northern part of the basin and over half a century for the southern hemisphere to have any influence. Adjoint diffusion is the slowest process and is difficult to isolate.

The steady-state patterns take approximately 400 years to equilibrate and are the product of all the physical mechanisms described above.

One objective of this analysis was to show that the adjoint model behaves reasonably over a range of time scales, in particular climatological ones. The obviously nonlinear processes, notably advection, do not degrade the accuracy of the adjoint calculations, even after 400 years. Note that this conclusion does not extend to highly nonlinear eddy-resolving ocean models (Lea et al. 2000) or to models coupled to three-dimensional atmospheres. The issue of the differentiability of convection is avoided by a highly efficient diffusion process, which takes the place of convection during the model spin-up.

The second objective was to use a simple diagnostic to show how information is communicated and interpreted in an adjoint framework. The adjoint maps show quite clearly which paths perturbations follow to affect the diagnostic function, something that is often difficult to observe with traditional forward perturbation methods. By displaying sensitivities as two-dimensional maps, this approach takes much of the guesswork out of identifying which geographical areas are the most important in maintaining or changing the cost function. Over time scales of a few years, dynamical perturbations applied along the boundaries or the equator can have an impact on the high-latitude diagnostic, but the overturning is insensitive to perturbations to wind stress in the interior of the basin. The sensitivity of the MOC to buoyancy perturbations provides a clear picture of basin separation. Buoyancy perturbations applied in the subtropical gyre or the southern hemisphere that eventually flow through the western boundary current influence the diagnostic through “convection.” Perturbations applied north of the subtropical gyre and east of the western boundary current affect the MOC through downwelling at the eastern boundary.

The ability to run adjoint simulations on centennial time scales provides a new and efficient way of approaching issues related to climate sensitivity, with a number of advantages:

  1. Climate models typically have dozens of parameters and forcing fields, often with poorly known values. The adjoint approach gives a complete picture of how the climate sensitivity depends on the geographical distribution of these parameters.

  2. Within the small perturbation constraint of the linearization, the adjoint solutions also allow a quantification of the impact of perturbations in the model parameters and forcing fields on the model diagnostic.

  3. The time dependence of the adjoint solutions creates a set of time scales over which various mechanisms can influence a diagnostic. The sensitivity of the meridional overturning strength to perturbations in buoyancy forcing peaks on time scales of a century before declining (see Fig. 2), while its sensitivity to perturbations in mixing parameters takes longer to reach full strength.
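
The first two points can be illustrated with a toy example. The Python sketch below is a hand-written adjoint of a hypothetical one-variable forced model, not the ocean model used in this study; the model equation, the parameter values, and the function names are all assumptions chosen for illustration. One backward sweep returns the sensitivity of the scalar cost to the initial condition and to the forcing at every time step, and the resulting gradient gives a first-order estimate of the response to a small forcing perturbation.

```python
def forward(x0, forcing, dt=0.1, k=0.5, b=0.2):
    """Integrate x_{n+1} = x_n + dt*(f_n - k*x_n - b*x_n**2); return [x_0, ..., x_N]."""
    xs = [x0]
    for f in forcing:
        x = xs[-1]
        xs.append(x + dt * (f - k * x - b * x * x))
    return xs


def cost(x0, forcing):
    """Scalar diagnostic J: the final model state x_N (default parameters)."""
    return forward(x0, forcing)[-1]


def adjoint(x0, forcing, dt=0.1, k=0.5, b=0.2):
    """Return (dJ/dx0, [dJ/df_0, ..., dJ/df_{N-1}]) from a single backward sweep."""
    xs = forward(x0, forcing, dt, k, b)  # nonlinear trajectory to linearize about
    lam = 1.0                            # adjoint variable at the final time: dJ/dx_N
    grad_f = [0.0] * len(forcing)
    for n in reversed(range(len(forcing))):
        grad_f[n] = lam * dt             # f_n enters x_{n+1} with weight dt
        lam *= 1.0 - dt * (k + 2.0 * b * xs[n])  # transpose of the tangent-linear step
    return lam, grad_f


if __name__ == "__main__":
    forcing = [1.0] * 50
    dJ_dx0, dJ_df = adjoint(0.0, forcing)
    # First-order estimate of the change in J for a small uniform forcing perturbation,
    # compared with the change obtained by rerunning the nonlinear model.
    delta = 1e-3
    estimate = sum(g * delta for g in dJ_df)
    actual = cost(0.0, [f + delta for f in forcing]) - cost(0.0, forcing)
    print("linear estimate:", estimate, "nonlinear change:", actual)
```

A forward perturbation experiment would need one additional model run per forcing value to recover the same gradient, whereas the backward sweep delivers all of them at once, linearized about the nonlinear trajectory.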