1 Introduction

Many natural and experimental turbulent flows display bistable behavior, in which one observes rare and abrupt dynamical transitions between two attractors that correspond to very different subregions of the phase space. The most prominent natural examples are the Earth magnetic field reversals (over geological timescales), or the Dansgaard-Oeschger events that have affected the Earth’s climate during the last glacial period, and are probably due to several attractors of the turbulent ocean dynamics [60]. Experimental studies include examples in two-dimensional turbulence [15, 28, 47, 67], rotating tank experiments [66, 74] related to the quasi-geostrophic dynamics of oceans (Kuroshio current bistability [59, 66]) and atmospheres (weather regime blockings), three dimensional turbulent flows in a Von Kármán geometry [61], the magnetic field reversal in MHD experiments [4, 28] and Rayleigh-Bénard convection cells [17, 19, 55, 68].

The theoretical understanding of these transitions is an extremely difficult problem due to the large number of degrees of freedom, the broad spectrum of timescales and the non-equilibrium nature of these flows. Up to now, there have been very few theoretical results, the analysis being mostly limited to analogies with models with few degrees of freedom. One interesting phenomenological approach relies on the clever use of symmetry arguments in order to describe effectively the largest scales of MHD experiments [57]. This strategy has been fruitful in several examples in regimes close to deterministic bifurcations, where the hypothesis of describing the turbulent flow by a few dominant modes, even if based up to now only on empirical arguments, is likely to be relevant. In fact it has led to the prediction of non-trivial qualitative features of rare transitions.

The main problem is how to develop a general theory for these phenomena. When a complex turbulent flow switches at random from one subregion of the phase space to another, the first theoretical aim is to characterize and predict the observed attractors. This is already a non-trivial task as no picture based on a potential landscape is available, and it is especially tricky when the transition is not related to any symmetry breaking. An additional theoretical challenge is to compute the transition rates between attractors. It is also often the case that most transition paths from one attractor to another concentrate close to a single path; a natural objective is therefore to compute this most probable transition path. In order to achieve these goals, it is convenient to work within the framework of large deviation theory, in order to describe either the stationary distribution of the system or the transition probabilities of the stochastic process. In principle, we could argue that from a path integral representation of the transition probabilities [76], and the study of its semi-classical limit in an asymptotic expansion with a well chosen small parameter, we could derive a large deviation rate function that would characterize the attractors and various other properties of the system. When this semi-classical approach is relevant, one expects a large deviation result similar to the one obtained through the Freidlin-Wentzell theory [26]. If this picture is correct, it would explain why these rare transitions share many analogies with phase transitions in statistical mechanics and stochastic dynamics with few degrees of freedom. The theoretical issues to be addressed in order to assess the validity of such a broad approach are however numerous: what is the natural asymptotic large deviation parameter? Why and when should the finite dimensional picture be valid? How does one actually compute the large deviation rate function and characterize its minima? Should one expect the dynamics of the rare transition to be well described by few degrees of freedom? And so on. Up to now, these questions have no clear or precise answers for any meaningful turbulence problems. The aim of this paper is to make small steps in this direction.

We will study the class of models that describe two-dimensional and quasi-geostrophic dynamics. These are arguably the simplest category of turbulence models for which phase transitions and bistability phenomena exist. For simplicity, we will consider forces which are stochastic, white in time, Gaussian noises. In previous papers, we have given partial answers to the theoretical challenges discussed above. For instance, for the two-dimensional stochastic Navier-Stokes equations, we have argued [15] that in the inertial limit (weak noise and dissipation), one should expect the invariant measure to be concentrated close to the attractors of the inertial dynamics (the two-dimensional Euler equations). This partially answers the issue of characterizing the attractors, and helps us to empirically find the bistable regimes, based on bifurcation diagrams for the inertial dynamics. Indeed, numerical simulations showed that the Navier-Stokes dynamics actually concentrates close to the set of attractors of the two-dimensional Euler equations [15], and displays bistable behavior in some parameter range. In order to develop the theoretical understanding further, we have used stochastic averaging techniques to describe the long time dynamics of the barotropic quasi-geostrophic model in a regime where the main attractors are simple parallel flows (zonal jets) [14]. This model also displays multiple attractors [14], which can be studied using large deviation theory. We have also developed a similar theoretical approach for the stochastic Vlasov equations, where bistability was also discussed [51, 52]. These works only partially address the theoretical questions, mainly by predicting the set of attractors and determining the phase transitions and bistable regimes. Up to now, it has not been possible to explicitly compute the transition rates and transition probabilities for these systems.

For turbulent dynamics, in the inertial limit, the attractors are expected to be subclasses of the attractors of the inertial dynamics, as we discussed above. The natural attractors of the inertial dynamics are those derived from the microcanonical measures, namely the macroscopic equilibria of the Miller-Robert-Sommeria theory [50, 63–65]. There have been many recent contributions to the application of this theory [11, 23, 34–36, 53, 54, 58, 70, 71, 73, 75]. In essence, these microcanonical measures are characterized by an entropy functional that is actually a large deviation rate functional (see for instance [8, 49]). As explained in [9], the related entropy maximization is closely related to energy-Casimir variational problems. This link highlights the possibility that energy-Casimir functionals are natural potentials for the effective description of the largest scales in these turbulent flows. We address this point further in the conclusion.

The goal of this paper is to define a class of Langevin dynamics associated with energy-Casimir potentials and to investigate the related stochastic process. We show that this stochastic process is an equilibrium one, in the sense that it verifies either detailed balance or a generalization of the detailed balance property. In the latter case, the time reversed stochastic process is not simply the initial process but belongs to the same class of physical models (as, for instance, in Langevin dynamics of particles in magnetic fields). From this time reversal symmetry, identified at the level of the action, we can show that the quasi-potential related to the action minimization can be explicitly computed, and is actually the energy-Casimir functional. Moreover, we can also explain why fluctuation trajectories (the most probable paths to get a rare fluctuation) are time reversed relaxation trajectories of the dual dynamics, as in classical Langevin dynamics. In situations with bistability (when the quasi-potential has two or more local minima), we recover the classical picture: an Arrhenius law for the transition rate and a typical transition trajectory that follows an instanton trajectory (the time reversed trajectory of the relaxation path of the dual dynamics from the lowest saddle point). All these properties are derived from the orthogonality of the Hamiltonian part of the dynamics to the potential part, which is a consequence of the fact that the potential is conserved under the Hamiltonian dynamics.

We discuss a specific example where the energy-Casimir functional leads to bistable regimes, and describe a bifurcation diagram that includes a tricritical point (a bifurcation from a first order phase transition to a second order phase transition). Close to the critical point, the turbulent dynamics can be reduced to the effective dynamics involving only a few degrees of freedom related to the null space of the potential at the transition point, by analogy with the phenomenology of bifurcations in deterministic systems. However, far away from the tricritical point such a reduction does not seem to be relevant.

These Langevin dynamics are very interesting examples of turbulent dynamics that fit within the classical framework of equilibrium stochastic thermodynamics. All the recent results related to stochastic thermodynamics (Gallavotti-Cohen fluctuation relations, relations between the entropy production and the probability of paths, and so on) could be easily generalized for these Langevin dynamics. Together with genuine turbulence dynamics, they also display fascinating dynamical behavior including phase transitions. The relevance of these dynamics for real physical phenomena should however be questioned. As discussed in the paper and in the conclusion, several examples of these Langevin dynamics actually relate to physical microscopic dissipation mechanisms (linear friction and/or viscosity), but this is not true in general. When this analogy is incorrect, these dynamics should be understood, at best, as effective models for the largest scales of the flows. All these aspects, and the resulting limitations and benefits of these models for real flows, are further discussed in the conclusion.

This Langevin dynamics approach also opens up a new set of very interesting theoretical and mathematical issues. For instance, for dynamics that involve white in space noises, or colored noises but with vanishing related frictions, under which conditions are the stochastic dynamics well-posed? Will dynamics with regularized noises lead to qualitatively similar behavior? What are the necessary and sufficient conditions for the formal computations performed in this work to be mathematically rigorous? Some of these questions are related to recent advances in the mathematics of stochastic partial differential equations [6, 7, 29, 39, 40, 44, 45]. Again, these aspects are further discussed in the conclusion.

In Sect. 2 we discuss a general framework for Langevin dynamics. Starting from a few hypotheses (Liouville theorem, transversality condition, and the relation between friction and noise amplitude), we derive the time reversal symmetry properties of the stochastic process. Section 3 applies this framework to two-dimensional and quasi-geostrophic turbulence models. Section 4 discusses a specific case in which bistability occurs in the vicinity of a tricritical point, and finally Sect. 5 concludes by emphasizing the interest and limitations of these Langevin models and outlining the perspectives.

2 Langevin Dynamics and Equilibrium Instantons

The aim of this section is to describe the general framework for Langevin dynamics. We first define Langevin dynamics in Sect. 2.1, as stochastic, ordinary or partial differential equations, for which the deterministic part is composed of a vector field with a Liouville property (conservation of phase space volume, Eq. 5) plus a potential force with potential \(\mathcal {G}\). The conservative part of the dynamics is assumed to be transverse to the gradient of the potential (Eq. 6). The stochastic force is defined as the derivative of a Brownian process, with a correlation function identical to the kernel of the potential force.

We derive the main properties of Langevin dynamics: its invariant measure is a Gibbs measure with potential \(\mathcal {G}\). As Langevin dynamics is a Markov process, we can define the time reversed Markov process. A classical proof that the time reversal of a finite dimensional diffusion is also a diffusion can be found in [33]. It is also a classical result that, for a Langevin dynamics, the time reversed process is another Langevin dynamics, which is usually simply related to the original dynamics. We call this process the reversed, or dual, Langevin dynamics. We study this time-reversal symmetry through the symmetry of the action, describing transition path probabilities. Based on this symmetry, we describe the relation between relaxation paths (most probable paths for a relaxation from any initial state to an attractor of the system) and fluctuation paths (most probable paths to observe a fluctuation starting from an attractor and ending at any point of the system). As we explain, for Langevin dynamics, fluctuation paths are time reversed trajectories of relaxation paths of the dual dynamics.

These properties, for instance the relation between fluctuation and relaxation paths, can be considered as a generalization of Onsager reciprocal relations. However, they are valid for fluctuations arbitrarily far from the main attractor, and for relaxation dynamics that do not necessarily need to be linear. Such a symmetry between the fluctuation and relaxation paths is somehow a classical remark in statistical mechanics. For instance, the relation between the action symmetry and detailed balance can be found in [41], discussion of these properties can also be found in [46], and the general idea may be traced back to Onsager and Machlup [56]. We also note an interesting discussion of this symmetry in [69]. Moreover, this symmetry is also clearly related to the Gallavotti-Cohen fluctuation relations [24, 27].

The fact that large deviation functionals can be computed explicitly when the dynamics can be decomposed into the sum of a gradient and a transverse part is explained in the book of Freidlin and Wentzell [26]. In our problem, this transversality comes from the Hamiltonian structure and the fact that the potential is a conserved quantity of the Hamiltonian dynamics. As explained very clearly in [5], for non-equilibrium systems, the deterministic vector field can also be decomposed into the sum of the gradient of the quasi-potential plus a transverse part, the transversality condition being equivalent to the Hamilton-Jacobi equation. Similar ideas can also be found in the works of Graham in the 1980s and 1990s (see for instance [30]).

While many ideas discussed in the following section are classical (time reversal of Langevin dynamics, the relation between a transverse decomposition and the time reversed process, the Lyapunov properties of the quasi-potential), they are discussed independently by authors in different communities. We do not know of any references where the general structure of Langevin dynamics is discussed bringing together these sets of ideas. It is thus useful to have a self contained discussion. Moreover, the general relationship between the transverse decomposition and the conserved quantities of Hamiltonian systems, and the applicability of this framework to the two-dimensional Euler and quasi-geostrophic equations, are new.

2.1 Langevin Dynamics with Potential \(\mathcal {G}\)

We call the Langevin dynamics for the potential \(\mathcal {G}\) the stochastic dynamics given by

$$\begin{aligned} \frac{\partial q}{\partial t}=\mathcal {F}\left[ q\right] (\mathbf{r})-\alpha \int _{\mathcal {D}}\, C(\mathbf{r},\mathbf{r}')\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}')}\left[ q\right] \,\mathrm{d}\mathbf{r}'+\sqrt{2\alpha \gamma }\, \eta , \end{aligned}$$
(1)

where \(\mathcal {F}\) satisfies a Liouville property (defined below, Eq. 5), \(\mathcal {G}\) is a conserved quantity of the dynamics defined by \(\mathcal {F}\) (see Eq. 6), and the stochastic force \(\eta \) is a Gaussian process, white in time, with correlation function \(\mathbb {E}\left[ \eta (\mathbf{r},t)\eta (\mathbf{r}',t')\right] =C(\mathbf{r},\mathbf{r}')\delta (t-t')\). As it is a correlation function, \(C\) is a symmetric positive function, i.e. for any function \(\phi \) over \(\mathcal {D}\)

$$\begin{aligned} \int _{\mathcal {D}}\int _{\mathcal {D}}\,\phi \left( \mathbf{r}\right) \, C(\mathbf{r},\mathbf{r}')\, \phi \left( \mathbf{r}'\right) \, \mathrm{d}\mathbf{r}\, \mathrm{d}\mathbf{r}' \ge 0 , \end{aligned}$$
(2)

and \(C(\mathbf{r},\mathbf{r}')=C(\mathbf{r}',\mathbf{r})\). For simplicity, we assume in the following that \(C\) is positive definite and has an inverse \(C^{-1}\) such that

$$\begin{aligned} \int _{\mathcal {D}}\, C(\mathbf{r},\mathbf{r}_{1})\, C^{-1}(\mathbf{r}_{1},\mathbf{r}')\, \mathrm{d}\mathbf{r}_{1}=\delta \left( \mathbf{r}-\mathbf{r}'\right) . \end{aligned}$$

The variable \(q\) is either finite dimensional (for instance \(q\in \mathbb {R}^{N}\)), or a field (e.g. a two-dimensional field for solution of the two-dimensional Euler equations). If \(q\in \mathbb {R}^{N}\), we assume that the deterministic dynamical system

$$\begin{aligned} \frac{\partial q}{\partial t}=\mathcal {F}\left[ q\right] , \end{aligned}$$
(3)

conserves the Lebesgue measure \(\prod _{i=1}^{N}\, \mathrm{d}q_{i}\), or equivalently that the divergence of the vector field \(\mathcal {F}\) is zero:

$$\begin{aligned} \nabla \cdot \mathcal {F}\equiv \sum _{i=1}^{N}\frac{\partial \mathcal {F}_{i}}{\partial q_{i}}=0. \end{aligned}$$
(4)

We call this property a Liouville property. If \(q\) is a field (for instance a two-dimensional vorticity or potential vorticity field, for the two-dimensional Euler or quasi-geostrophic equations) defined over a domain \(\mathcal {D}\), we assume that a Liouville property holds, in the sense that the formal generalization of the finite dimensional Liouville property,

$$\begin{aligned} \nabla \cdot \mathcal {F}\equiv \int _{\mathcal {D}}\,\frac{\delta \mathcal {F}}{\delta q(\mathbf{r})}\,\mathrm{d}\mathbf{r}=0, \end{aligned}$$
(5)

is verified. We further assume that the deterministic dynamical system (3) has \(\mathcal {G}\) as a conserved quantity. Then for any \(q\):

$$\begin{aligned} \int _{\mathcal {D}}\,\mathcal {F}\left[ q\right] (\mathbf {r})\frac{\delta \mathcal {G}}{\delta q(\mathbf{r})}\left[ q\right] \,\mathrm{d}\mathbf{r}=0. \end{aligned}$$
(6)

This equation is a transversality property between the vector field \(\mathcal {F}\) and the gradient of the potential \(\mathcal {G}\).

These two hypotheses, Liouville (5) and the conservation of the potential (6), are verified if the dynamical system is Hamiltonian:

$$\begin{aligned} \mathcal {F}[q]=\left\{ q,\mathcal {H}\right\} , \end{aligned}$$
(7)

with \(\mathcal {G}\) being one of its conserved quantities, for instance \(\mathcal {G}=\mathcal {H}\). We stress however that \(\mathcal {G}\) does not need to be \(\mathcal {H}\) in general.
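
As an illustration of hypotheses (5) and (6), the following sketch checks them numerically for a finite dimensional Hamiltonian system. The Hamiltonian \(\mathcal {H}=p^{2}/2+x^{4}/4-x^{2}/2\), the choice \(\mathcal {G}=\mathcal {H}+\mathcal {H}^{2}/2\) (a conserved quantity different from \(\mathcal {H}\)), and all function names are assumptions made for this example only; derivatives are approximated by central finite differences.

```python
import numpy as np

def H(q):
    """Hypothetical one-particle Hamiltonian: H = p^2/2 + x^4/4 - x^2/2."""
    x, p = q
    return 0.5 * p**2 + 0.25 * x**4 - 0.5 * x**2

def G(q):
    """A conserved potential that differs from H: G = H + H^2/2."""
    h = H(q)
    return h + 0.5 * h**2

def grad(f, q, eps=1e-5):
    """Central finite-difference gradient of a scalar function f at q."""
    q = np.asarray(q, dtype=float)
    g = np.zeros_like(q)
    for i in range(q.size):
        dq = np.zeros_like(q)
        dq[i] = eps
        g[i] = (f(q + dq) - f(q - dq)) / (2.0 * eps)
    return g

J = np.array([[0.0, 1.0], [-1.0, 0.0]])  # canonical symplectic matrix

def F(q):
    """Hamiltonian vector field F = J grad H, i.e. (dx/dt, dp/dt) = (dH/dp, -dH/dx)."""
    return J @ grad(H, q)

def divergence(vec, q, eps=1e-4):
    """Central finite-difference divergence of a vector field at q."""
    q = np.asarray(q, dtype=float)
    total = 0.0
    for i in range(q.size):
        dq = np.zeros_like(q)
        dq[i] = eps
        total += (vec(q + dq)[i] - vec(q - dq)[i]) / (2.0 * eps)
    return total

rng = np.random.default_rng(0)
samples = rng.uniform(-2.0, 2.0, size=(20, 2))
max_div = max(abs(divergence(F, q)) for q in samples)         # Liouville property (4)
max_transverse = max(abs(F(q) @ grad(G, q)) for q in samples)  # transversality (6)
print(max_div, max_transverse)
```

Both printed numbers should vanish up to finite-difference error, which reflects the statement that any function of \(\mathcal {H}\) is an admissible potential for this toy system.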

The major property of Langevin dynamics is that the stationary probability density functional is known a priori and is given by

$$\begin{aligned} P_{s}[q]=\frac{1}{Z}\exp \left( -\frac{\mathcal {G}[q]}{\gamma }\right) , \end{aligned}$$
(8)

where \(Z\) is a normalization constant. At a formal level, this can be easily checked by writing the Fokker-Planck equation for the evolution of the probability functional. Then the property that \(P_{s}\) is stationary readily follows from the Liouville property and the fact that \(\mathcal {G}\) is a conserved quantity for the deterministic dynamics.
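
The stationarity of (8) can also be illustrated numerically. The sketch below integrates a toy instance of (1) with an Euler-Maruyama scheme, under the following assumptions that are not taken from the text: \(q\in \mathbb {R}^{2}\), \(\mathcal {G}=|q|^{2}/2\), \(C=\mathrm{Id}\), and \(\mathcal {F}\) a rotation with angular frequency `omega` (divergence-free, and conserving \(\mathcal {G}\)). For this choice the Gibbs measure (8) is Gaussian with variance \(\gamma \) in each component, independently of `omega` and `alpha`.

```python
import numpy as np

def simulate_langevin(n_particles=5000, n_steps=2000, dt=0.01,
                      alpha=1.0, gamma=0.5, omega=1.0, seed=0):
    """Euler-Maruyama integration of dq = (F - alpha grad G) dt + sqrt(2 alpha gamma) dW.

    Assumed toy model: q in R^2, G(q) = |q|^2 / 2, C = identity, and
    F(q) = omega * J q (a rotation, which is divergence-free and conserves G).
    """
    rng = np.random.default_rng(seed)
    J = np.array([[0.0, 1.0], [-1.0, 0.0]])
    q = np.zeros((n_particles, 2))          # independent realizations, one per row
    noise_amp = np.sqrt(2.0 * alpha * gamma * dt)
    for _ in range(n_steps):
        drift = omega * q @ J.T - alpha * q  # F[q] - alpha * grad G[q]
        q = q + drift * dt + noise_amp * rng.standard_normal(q.shape)
    return q

gamma = 0.5
q_final = simulate_langevin(gamma=gamma)
empirical_var = q_final.var(axis=0)  # to be compared with gamma for each component
print(empirical_var)
```

The empirical variances are expected to be close to \(\gamma \), up to sampling noise and a small time-discretization bias; changing `omega` should not modify them, since the conservative part of the dynamics does not affect the invariant measure.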

2.2 Reversed Langevin Dynamics

Let \(I\) be a linear involution on the space of fields \(q\) (\(I^{2}=\mathrm{Id}\)). We define the reversed Langevin dynamics, with respect to \(I\), as

$$\begin{aligned} \frac{\partial q}{\partial t}=\mathcal {F}_{r}\left[ q\right] (\mathbf {r})-\alpha \int _{\mathcal {D}}\, C_{r}(\mathbf{r},\mathbf{r}')\frac{\delta \mathcal {G}_{r}}{\delta q(\mathbf{r}')}\left[ q\right] \,\mathrm{d}\mathbf{r}'+\sqrt{2\alpha \gamma }\eta , \end{aligned}$$
(9)

where

$$\begin{aligned} \mathcal {F}_{r}&= -I\circ \mathcal {F}\circ I,\end{aligned}$$
(10)
$$\begin{aligned} C_{r}&= I^{+}CI, \end{aligned}$$
(11)

here \(I^{+}\) is the adjoint of \(I\) for the \(L^{2}\) scalar product, and

$$\begin{aligned} \mathcal {G}_{r}\left[ q\right] =\mathcal {G}\left[ I\left[ q\right] \right] . \end{aligned}$$
(12)

From the properties of \(\mathcal {F}\), \(C\) and \(\mathcal {G}\), one can demonstrate that a Liouville property holds for \(\mathcal {F}_{r}\), that \(C_{r}\) is positive definite, and that \(\mathcal {G}_{r}\) is a conserved quantity for the dynamics \(\frac{\partial q}{\partial t}=\mathcal {F}_{r}\left[ q\right] \) for any \(q\):

$$\begin{aligned} \int _{\mathcal {D}}\,\mathcal {F}_{r}\left[ q\right] (\mathbf {r})\frac{\delta \mathcal {G}_{r}}{\delta q(\mathbf{r})}\left[ q\right] \,\mathrm{d}\mathbf{r}=0. \end{aligned}$$
(13)

As a consequence, the reversed Langevin dynamics (9) is also Langevin.
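
A simple finite dimensional case where the reversed dynamics differs from the original one is a charged particle in a magnetic field. The sketch below is an illustration under assumed choices (unit mass and charge, field strength `B`, potential \(V=|x|^{4}/4-|x|^{2}/2\), involution \(I(x,p)=(x,-p)\)); it checks that \(\mathcal {F}_{r}\) defined by (10) is the same vector field with \(B\) replaced by \(-B\), and that the transversality condition (13) holds.

```python
import numpy as np

B = 0.7  # hypothetical magnetic field strength
J2 = np.array([[0.0, 1.0], [-1.0, 0.0]])
I_MAT = np.diag([1.0, 1.0, -1.0, -1.0])  # involution: positions kept, momenta reversed

def grad_V(x):
    """Gradient of the assumed potential V = |x|^4/4 - |x|^2/2."""
    return x * (x @ x - 1.0)

def force_field(q, b):
    """(dx/dt, dp/dt) = (p, -grad V + b J p) for q = (x1, x2, p1, p2)."""
    x, p = q[:2], q[2:]
    return np.concatenate([p, -grad_V(x) + b * (J2 @ p)])

def F(q):
    return force_field(q, B)

def F_r(q):
    """Reversed vector field, Eq. (10): F_r = -I o F o I."""
    return -I_MAT @ F(I_MAT @ q)

def grad_G(q):
    """Gradient of G = |p|^2/2 + V(x); G is even in p, so G_r = G."""
    x, p = q[:2], q[2:]
    return np.concatenate([grad_V(x), p])

rng = np.random.default_rng(2)
points = rng.normal(size=(10, 4))
same_as_minus_B = all(np.allclose(F_r(q), force_field(q, -B)) for q in points)
differs_from_F = any(not np.allclose(F_r(q), F(q)) for q in points)
max_transverse = max(abs(F_r(q) @ grad_G(q)) for q in points)
print(same_as_minus_B, differs_from_F, max_transverse)
```

In this example the reversed process is not the initial one, but it belongs to the same class of models, with the sign of the magnetic field reversed.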

A very interesting case is when the deterministic dynamics is symmetric with respect to time reversal. Then there exists a linear involution \(I\) such that

$$\begin{aligned} \mathcal {F}=\mathcal {F}_{r}=-I\circ \mathcal {F}\circ I. \end{aligned}$$
(14)

Moreover, if \(C\) and \(\mathcal {G}\) are symmetric with respect to the involution: \(C_{r}=C\) and

$$\begin{aligned} \mathcal {G}_{r}=\mathcal {G}, \end{aligned}$$
(15)

then the reversed Langevin dynamics are nothing else than the original Langevin dynamics. In this case, we say that the Langevin dynamics are time-reversible. Simple examples of time-reversible Langevin dynamics are the overdamped processes:

$$\begin{aligned} \dot{q}=-\int _{\mathcal {D}}\, C(\mathbf{r},\mathbf{r}')\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}')}\left[ q\right] \, \mathrm{d}\mathbf{r}'+\sqrt{2\gamma }\eta , \end{aligned}$$
(16)

which can be proved to be time-reversible with the involution \(I=\mathrm{Id}\), the canonical Langevin dynamics

$$\begin{aligned} \left\{ {\begin{array}{lc} \dot{x} = p,\\ \dot{p} = -\frac{\mathrm{d}V}{\mathrm{d}x}-\alpha p+\sqrt{2\alpha k_{B}T}\eta ,\\ \end{array} } \right. \end{aligned}$$

with \(I\left( x,p\right) ^T=\left( x,-p\right) ^T\), or the two-dimensional stochastic Euler equations:

$$\begin{aligned} \frac{\partial \omega }{\partial t}+\mathbf{v}\cdot \nabla \omega =-\alpha \int _{\mathcal {D}}\, C(\mathbf{r},\mathbf{r}')\frac{\delta \mathcal {G}}{\delta \omega (\mathbf{r}')}\, \mathrm{d}\mathbf{r}'+\sqrt{2\alpha \gamma }\eta ,\quad \mathrm{with}\quad \mathbf{v}=\mathbf {e}_{z}\times \nabla \psi , \end{aligned}$$

under the assumption that \(\mathcal {G}\) is conserved by the Euler dynamics, and is an even functional (\(\mathcal {G}\left[ -\omega \right] =\mathcal {G}\left[ \omega \right] \)). For the two-dimensional Euler equations, the natural involution corresponding to time-reversal symmetry is \(I\left[ \omega \right] =-\omega \). In the following, we will also consider cases when the Langevin dynamics are not time-reversible, for instance the two-dimensional stochastic Euler equations when \(\mathcal {G}\) is not even, or the quasi-geostrophic equations with topography \(h(y)\ne 0\).
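
For the canonical Langevin dynamics above, the three conditions \(\mathcal {F}_{r}=\mathcal {F}\), \(C_{r}=C\) and \(\mathcal {G}_{r}=\mathcal {G}\) can be checked directly. The following sketch does so for an assumed double-well potential \(V=x^{4}/4-x^{2}/2\), with \(\mathcal {G}=p^{2}/2+V(x)\) and a noise acting on the momentum only (so that, in this example, \(C\) is the matrix `diag(0, 1)`); the function names are illustrative.

```python
import numpy as np

def dV(x):
    """Derivative of the assumed double-well potential V = x^4/4 - x^2/2."""
    return x**3 - x

def F(q):
    """Conservative part of the canonical Langevin dynamics."""
    x, p = q
    return np.array([p, -dV(x)])

def G(q):
    x, p = q
    return 0.5 * p**2 + 0.25 * x**4 - 0.5 * x**2

def involution(q):
    """I(x, p) = (x, -p)."""
    x, p = q
    return np.array([x, -p])

def F_reversed(q):
    """Eq. (10): F_r = -I o F o I."""
    return -involution(F(involution(q)))

I_MAT = np.diag([1.0, -1.0])
C = np.diag([0.0, 1.0])          # noise and friction act on p only
C_r = I_MAT.T @ C @ I_MAT        # Eq. (11)

rng = np.random.default_rng(1)
points = rng.normal(size=(10, 2))
reversible = (
    all(np.allclose(F(q), F_reversed(q)) for q in points)
    and all(np.isclose(G(q), G(involution(q))) for q in points)
    and np.allclose(C, C_r)
)
print(reversible)
```

Adding to \(\mathcal {G}\) a term that is odd in \(p\) would make the second check fail, which is the finite dimensional analogue of a potential \(\mathcal {G}\left[ \omega \right] \) that is not even.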

2.3 Path Integrals, Action, and Time-Reversal Symmetry

The Lagrangian \(\mathcal {L}\) associated to the Langevin dynamics (1) is defined as

$$\begin{aligned} \mathcal {L}\left[ q,\frac{\partial q}{\partial t}\right]&= \frac{1}{4\alpha }\int _{\mathcal {D}}\int _{\mathcal {D}}\,\left( \frac{\partial q}{\partial t}-\mathcal {F}\left[ q\right] (\mathbf {r})+\alpha \int _{\mathcal {D}}C(\mathbf{r},\mathbf{r}_{1})\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}_{1})}\left[ q\right] \text{ d }\mathbf{r}_{1}\right) \nonumber \\&\times \,C^{-1}(\mathbf{r},\mathbf{r}')\left( \frac{\partial q}{\partial t}-\mathcal {F}\left[ q\right] (\mathbf {r}')+\alpha \int _{\mathcal {D}}\, C(\mathbf{r}',\mathbf{r}_{2})\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}_{2})}\left[ q\right] \text{ d }\mathbf{r}_{2}\right) \,\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}',\nonumber \\ \end{aligned}$$
(17)

and the action functional as

$$\begin{aligned} \mathcal {A}_{(0,T)}\left[ q\right] =\int _{0}^{T}\mathcal {L}\left[ q(t),\frac{\partial q}{\partial t}(t)\right] \,\mathrm{d}t. \end{aligned}$$
(18)

Similarly, the Lagrangian of the reversed process is defined as

$$\begin{aligned} \mathcal {L}_{r}\left[ q,\frac{\partial q}{\partial t}\right]&= \frac{1}{4\alpha }\int _{\mathcal {D}}\int _{\mathcal {D}}\,\left( \frac{\partial q}{\partial t}-\mathcal {F}_{r}\left[ q\right] (\mathbf {r})+\alpha \int _{\mathcal {D}}\, C_{r}(\mathbf{r},\mathbf{r}_{1})\frac{\delta \mathcal {G}_{r}}{\delta q(\mathbf{r}_{1})}\left[ q\right] \text{ d }\mathbf{r}_{1}\right) \nonumber \\&\times \,C_{r}^{-1}(\mathbf{r},\mathbf{r}')\left( \frac{\partial q}{\partial t}-\mathcal {F}_{r}\left[ q\right] (\mathbf {r}')+\alpha \int _{\mathcal {D}}\, C_{r}(\mathbf{r}',\mathbf{r}_{2})\frac{\delta \mathcal {G}_{r}}{\delta q(\mathbf{r}_{2})}\left[ q\right] \text{ d }\mathbf{r}_{2}\right) \,\mathrm{d}\mathbf{r}\, \mathrm{d}\mathbf{r}',\nonumber \\ \end{aligned}$$
(19)

with the time-reversed action functional \(\mathcal {A}_{r}\) defined accordingly.

Using the Onsager-Machlup formalism, we know that \(P\left[ q_{T},T;q_{0},0\right] \), the transition probability to go from the state \(q_{0}\) at time \(0\) to the state \(q_{T}\) at time \(T\), can be expressed as

$$\begin{aligned} P\left[ q_{T},T;q_{0},0\right] =\int ^{q(T)=q_T}_{q(0)=q_0}\,\mathcal {D}\left[ q\right] \,{\exp }\left( -\frac{\mathcal {A}}{\gamma }\right) , \end{aligned}$$
(20)

where we have used the fact that the Jacobian

$$\begin{aligned} J\left[ q\right] =\left| \det \left[ \frac{\delta }{\delta q(\mathbf{r}')}\left( \dot{q}-\mathcal {F}[q]+\alpha \int _{\mathcal {D}}\, C(\mathbf{r},\mathbf{r}_{1})\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}_{1})}\left[ q\right] \, \mathrm{d}\mathbf{r}_{1}\right) \right] \right| , \end{aligned}$$

is formally equal to a \(q\)-independent constant when we interpret our stochastic partial differential equation using Ito’s convention [76], and can be included in the definition of the functional integration measure.

For a given path \(\left\{ q(t)\right\} _{0\le t\le T}\), we define the reversed path by \(q_{r}(t)=I\left[ q(T-t)\right] \). The main interest of the reversed process stems from the study of temporal symmetries of the stochastic process and the remark that

$$\begin{aligned} \mathcal {A}\left[ q_{r},T\right] =\mathcal {A}_{r}\left[ q,T\right] -\left( \mathcal {G}_{r}\left[ q(T)\right] -\mathcal {G}_{r}\left[ q(0)\right] \right) , \end{aligned}$$
(21)

or equivalently, using (12),

$$\begin{aligned} \mathcal {A}\left[ q,T\right] =\mathcal {A}_{r}\left[ q_{r},T\right] +\left( \mathcal {G}\left[ q(T)\right] -\mathcal {G}\left[ q(0)\right] \right) . \end{aligned}$$
(22)

Let us derive this equality. Using the definition of \(\mathcal {F}_{r}\), \(\mathcal {G}_{r}\) and \(C_{r}\) (Eqs. 10–12), and using that

$$\begin{aligned} \frac{\delta \mathcal {G}_{r}}{\delta q(\mathbf {r})}\left[ q\right] =I\frac{\delta \mathcal {G}}{\delta q(\mathbf {r})}\left[ I\left[ q\right] \right] ,\nonumber \end{aligned}$$

with \(I^{2}=\mathrm{Id}\), we have

$$\begin{aligned} \mathcal {L}\left[ I\left[ q\right] ,-\frac{\partial }{\partial t}I\left[ q\right] \right]&= \frac{1}{4\alpha }\int _{\mathcal {D}}\int _{\mathcal {D}}\,\left( \frac{\partial q}{\partial t}-\mathcal {F}_{r}\left[ q\right] (\mathbf {r})-\alpha \int _{\mathcal {D}}\, C_{r}(\mathbf{r},\mathbf{r}_{1})\frac{\delta \mathcal {G}_{r}}{\delta q(\mathbf{r}_{1})}\left[ q\right] \text{ d }\mathbf{r}_{1}\right) \nonumber \\&\times \,C_{r}^{-1}(\mathbf{r},\mathbf{r}')\left( \frac{\partial q}{\partial t}-\mathcal {F}_{r}\left[ q\right] (\mathbf {r}')-\alpha \int _{\mathcal {D}}\, C_{r}(\mathbf{r}',\mathbf{r}_{2})\frac{\delta \mathcal {G}_{r}}{\delta q(\mathbf{r}_{2})}\left[ q\right] \mathrm{d}\mathbf{r}_{2}\right) \,\mathrm{d}\mathbf{r}\, \mathrm{d}\mathbf{r}'.\nonumber \end{aligned}$$

Then, by expanding and using the conservation of \(\mathcal {G}_{r}\) (Eq. 13), we arrive at

$$\begin{aligned} \mathcal {L}\left[ I\left[ q\right] ,-\frac{\partial I\left[ q\right] }{\partial t}\right] =\mathcal {L}_{r}\left[ q,\frac{\partial q}{\partial t}\right] -\int _{\mathcal {D}}\,\frac{\partial q}{\partial t}\frac{\delta \mathcal {G}_{r}}{\delta q(\mathbf{r})}\,\mathrm{d}\mathbf{r}, \end{aligned}$$

or equivalently,

$$\begin{aligned} \mathcal {L}\left[ I\left[ q\right] ,-\frac{\partial I\left[ q\right] }{\partial t}\right] =\mathcal {L}_{r}\left[ q,\frac{\partial q}{\partial t}\right] -\frac{\mathrm{d}}{\mathrm{d}t}\mathcal {G}_{r}\left[ q\right] . \end{aligned}$$

Using the above formula and (18) in order to compute \(\mathcal {A}\left[ q_{r},T\right] \), we obtain (21).
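
The action duality can be checked numerically on a discretized path. The sketch below is an illustration under assumptions that are not part of the text: a two-dimensional time-reversible example with \(q=(x,p)\), \(\mathcal {F}=(p,-V'(x))\), \(\mathcal {G}=p^{2}/2+x^{4}/4-x^{2}/2\), \(C=\mathrm{Id}\) (noise on both components), \(I(x,p)=(x,-p)\), and a midpoint discretization of the action (18). In this reversible case \(\mathcal {A}_{r}=\mathcal {A}\), and (22) reads \(\mathcal {A}\left[ q,T\right] =\mathcal {A}\left[ q_{r},T\right] +\mathcal {G}\left[ q(T)\right] -\mathcal {G}\left[ q(0)\right] \).

```python
import numpy as np

ALPHA = 0.5  # friction coefficient (assumed value)

def G(q):
    """Potential G = p^2/2 + x^4/4 - x^2/2 (hypothetical double well)."""
    x, p = q[..., 0], q[..., 1]
    return 0.5 * p**2 + 0.25 * x**4 - 0.5 * x**2

def grad_G(q):
    x, p = q[..., 0], q[..., 1]
    return np.stack([x**3 - x, p], axis=-1)

def F(q):
    """Hamiltonian part (dx/dt, dp/dt) = (p, -dV/dx); it conserves G."""
    x, p = q[..., 0], q[..., 1]
    return np.stack([p, -(x**3 - x)], axis=-1)

def involution(q):
    """I(x, p) = (x, -p)."""
    return q * np.array([1.0, -1.0])

def action(path, T):
    """Midpoint discretization of the action (18) with C = identity."""
    dt = T / (len(path) - 1)
    mid = 0.5 * (path[1:] + path[:-1])
    vel = (path[1:] - path[:-1]) / dt
    resid = vel - F(mid) + ALPHA * grad_G(mid)
    return np.sum(resid**2) * dt / (4.0 * ALPHA)

T, n = 2.0, 4000
t = np.linspace(0.0, T, n + 1)
path = np.stack([np.cos(t), 0.3 * t * np.sin(2.0 * t)], axis=-1)  # arbitrary smooth path
reversed_path = involution(path[::-1])                             # q_r(t) = I[q(T - t)]

lhs = action(path, T)
rhs = action(reversed_path, T) + G(path[-1]) - G(path[0])
print(lhs, rhs)
```

The two printed values should agree up to the discretization error of the midpoint rule, for any smooth path and not only for solutions of the dynamics.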

Performing the change of variable \(q_{r}(t)=I\left[ q(T-t)\right] \) in the path integral representation (20), and using the action duality formula (21), we obtain

$$\begin{aligned} P\left[ q_{T},T;q_{0},0\right] \exp \left( -\frac{\mathcal {G}\left[ q_{0}\right] }{\gamma }\right) =P_{r}\left[ I\left[ q_{0}\right] ,T;I\left[ q_{T}\right] ,0\right] \exp \left( -\frac{\mathcal {G}_{r}\left[ I\left[ q_{T}\right] \right] }{\gamma }\right) , \end{aligned}$$
(23)

where \(P_{r}\) is a transition probability for the reversed process. We have thus obtained a relation between the transition probability of the direct, forward, process and that of the reversed one.

2.4 Detailed Balance for Reversible Processes

If we assume that the Langevin equation is time-reversible, then the direct and the reverse processes are the same, and the duality relation for the transition probability implies

$$\begin{aligned} P\left[ q_{T},T;q_{0},0\right] \exp \left( -\frac{\mathcal {G}\left[ q_{0}\right] }{\gamma }\right) =P\left[ I\left[ q_{0}\right] ,T;I\left[ q_{T}\right] ,0\right] \exp \left( -\frac{\mathcal {G}\left[ I\left[ q_{T}\right] \right] }{\gamma }\right) , \end{aligned}$$

where it is also true that \(\exp \left( -\mathcal {G}\left[ I\left[ q_{T}\right] \right] /\gamma \right) =\exp \left( -\mathcal {G}\left[ q_{T}\right] /\gamma \right) \). This result is the detailed balance property for the stochastic process. When the reverse process is different from the direct process, detailed balance is in general not expected to hold.

2.5 Steady States of the Deterministic Dynamics, Critical Points of \(\mathcal {G}\), and Relaxation Paths

2.5.1 Steady States and Critical Points of the Potential \(\mathcal {G}\)

Let us prove that any non-degenerate critical point of the potential is also a steady state of the deterministic dynamics. This is a classical result in mechanics, i.e. any non-degenerate critical point of the energy is a steady state.

Extrema of the stationary PDF are critical points of the potential \(\mathcal {G}\). Such a critical point \(q_{c}\) verifies

$$\begin{aligned} \frac{\delta \mathcal {G}}{\delta q(\mathbf{r})}\left[ q_{c}\right] =0. \end{aligned}$$

We assume that the critical point is non-degenerate, i.e. that the second variation of \(\mathcal {G}\) has no null eigenvalue. More explicitly, the relation

$$\begin{aligned} \int _{\mathcal {D}}\,\frac{\delta ^{2}\mathcal {G}}{\delta q(\mathbf{r})\delta q(\mathbf{r}')}\left[ q_{c}\right] \,\phi (\mathbf {r}')\,\mathrm{d}\mathbf{r}'=0, \end{aligned}$$

implies that \(\phi =0\). Now, we can show that \(q_{c}\) is also a steady state of the Hamiltonian dynamics.

We use the property that \(\mathcal {G}\) is conserved. By taking the variational derivative \(\delta /\delta q(\mathbf{r})\) of Eq. (6) we obtain that for any \(q\)

$$\begin{aligned} \int _{\mathcal {D}}\,\frac{\delta ^{2}\mathcal {G}}{\delta q(\mathbf{r}_{2})\delta q(\mathbf{r})}\left[ q\right] \mathcal {F}\left[ q\right] (\mathbf {r}_{2})\,\mathrm{d}\mathbf{r_{2}}+\int _{\mathcal {D}}\,\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}_{2})}\left[ q\right] \frac{\delta \mathcal {F}}{\delta q(\mathbf{r})}\left[ q\right] (\mathbf {r}_{2})\,\mathrm{d}\mathbf{r_{2}}=0. \end{aligned}$$
(24)

If we apply this formula at the critical point \(q_{c}\), where \(\delta \mathcal {G}/\delta q\) vanishes so that the second term drops out, we can conclude that

$$\begin{aligned} \int _{\mathcal {D}}\,\frac{\delta ^{2}\mathcal {G}}{\delta q(\mathbf{r}_{2})\delta q(\mathbf{r})}\left[ q_{c}\right] \,\mathcal {F}\left[ q_{c}\right] (\mathbf {r}_{2})\, \mathrm{d}\mathbf{r_{2}}=0. \end{aligned}$$

Moreover, using that the critical point \(q_{c}\) is non-degenerate, we observe that for all \(\mathbf {r}\)

$$\begin{aligned} \mathcal {F}\left[ q_{c}\right] (\mathbf {r})=0, \end{aligned}$$

and thus \(q_{c}\) is a steady state of the deterministic dynamics.

The remark that non-degenerate critical points of a conserved quantity are steady states also extends to their stability properties. Any stable and non-degenerate minimum or maximum of a conserved quantity is a stable fixed point of the deterministic dynamics (again, think of the energy or angular momentum in classical mechanics). These results are probably about as old as classical mechanics. For infinite-dimensional problems, like the two-dimensional Euler equations or any other fluid mechanical problem, the issue may be more subtle. Indeed, one should be careful of possible norm inequivalence (an infinite number of small scales can do a lot). But proofs of the stability of critical points of conserved quantities can still be obtained on a case by case basis. For instance, we refer to the two Arnold stability theorems for the two-dimensional Euler equations [1, 1], or their generalization to many other fluid mechanical problems [38].

Another important point is that from relations (10) and (12), it is clear that if \(q_{s}\) is a steady state of the deterministic dynamics, then \(I\left[ q_{s}\right] \) is a steady state of the reversed dynamics, and vice-versa. Moreover, if \(q_{c}\) is a critical point of the potential \(\mathcal {G}\), then \(I\left[ q_{c}\right] \) will be a critical point of \(\mathcal {G}_{r}\). The stability properties (minima, global minima, local minima, number of unstable directions, and so on) of \(q_{c}\), with respect to the minimization of \(\mathcal {G}\), will agree with the stability properties of \(I\left[ q_{c}\right] \) with respect to the minimization of \(\mathcal {G}_{r}\).

2.5.2 Relaxation Dynamics and Lyapunov Functionals

We define a relaxation path to be a solution of the relaxation dynamics:

$$\begin{aligned} \frac{\partial q}{\partial t}=\mathcal {F}\left[ q\right] (\mathbf {r})-\alpha \int _{\mathcal {D}}\, C(\mathbf{r},\mathbf{r}')\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}')}\left[ q\right] \,\mathrm{d}\mathbf{r}'. \end{aligned}$$
(25)

For any relaxation path \(q(t)\), using the property that \(\mathcal {G}\) is conserved by the inertial dynamics we can derive that

$$\begin{aligned} \frac{{\mathrm {d}}}{{\mathrm {d}}t}{\mathcal {G}}\left[ q(t)\right] =-\alpha \int _{\mathcal {D}} \, C(\mathbf{{r}},\mathbf{{r}}')\frac{\delta {\mathcal {G}}}{\delta q(\mathbf{r}')}\frac{\delta {\mathcal {G}}}{\delta q(\mathbf{{r}})}\,{\mathrm {d}}\mathbf{{r}}\,{\mathrm {d}}\mathbf{{r}}'\le 0, \end{aligned}$$

where we have used the positive definiteness of \(C\) for establishing the inequality. Thus, we can conclude that \(\mathcal {G}\) is a Lyapunov functional for the relaxation dynamics.

From this, we conclude that any minimum of the potential is stable for the relaxation dynamics.
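The Lyapunov property can be illustrated on a finite-dimensional analogue. The sketch below is not taken from the model studied here: the two-variable potential `G`, the antisymmetric matrix `J` (so that the transverse drift \(J\nabla G\) conserves \(G\) and is divergence-free), the matrix `C` and the value of `ALPHA` are all assumptions chosen for illustration. It integrates the analogue of Eq. (25) and records \(G\) along the path.

```python
import numpy as np

ALPHA = 0.5
C = np.array([[2.0, 0.5], [0.5, 1.0]])    # assumed positive definite correlation matrix
J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # antisymmetric: J grad G is orthogonal to grad G

def G(q):
    """Assumed double-well potential with minima at (+1, 0) and (-1, 0)."""
    return 0.25 * (q[0]**2 - 1.0)**2 + 0.5 * q[1]**2

def grad_G(q):
    return np.array([q[0] * (q[0]**2 - 1.0), q[1]])

def relaxation_rhs(q):
    """Finite-dimensional analogue of Eq. (25): F - alpha C grad G, with F = J grad G."""
    g = grad_G(q)
    return J @ g - ALPHA * (C @ g)

def rk4_path(q0, dt, n_steps):
    """Integrate the relaxation dynamics with a classical fourth-order Runge-Kutta scheme."""
    path = np.empty((n_steps + 1, 2))
    path[0] = q0
    for i in range(n_steps):
        q = path[i]
        k1 = relaxation_rhs(q)
        k2 = relaxation_rhs(q + 0.5 * dt * k1)
        k3 = relaxation_rhs(q + 0.5 * dt * k2)
        k4 = relaxation_rhs(q + dt * k3)
        path[i + 1] = q + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return path

path = rk4_path(np.array([0.3, 1.5]), dt=1e-2, n_steps=3000)
G_values = np.array([G(q) for q in path])
```

Along this path `G_values` should be non-increasing up to round-off, and the end point should sit at one of the two minima of the assumed potential.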

2.6 Action Minima, Relaxation Paths of the Dual Dynamics and Instantons

We consider action minima, subjected to fixed boundary conditions

$$\begin{aligned} A_{(0,T)}\left[ q_{0},q_{T}\right] =\min _{\left\{ q\,\left| \,q(0)=q_{0},\ q(T)=q_{T}\right. \right\} }\mathcal {A}_{(0,T)}\left[ q\right] . \end{aligned}$$
(26)

This variational problem is important for many questions. For instance, it describes the most probable path to go from state \(q_{0}\) to state \(q_{T}\). Moreover, as will be discussed in the next section, it will be useful in order to describe large deviation results.

From the definition of the action (17–18), and as \(C\) is positive definite, it is clear that

$$\begin{aligned} A_{(0,T)}\left[ q_{0},q_{T}\right] \ge 0. \end{aligned}$$

Furthermore, using the action duality relation given by Eq. (22), we also conclude that

$$\begin{aligned} A_{(0,T)}\left[ q_{0},q_{T}\right] \ge \mathcal {G}\left[ q_{T}\right] -\mathcal {G}\left[ q_{0}\right] . \end{aligned}$$
(27)

It is self-evident from the definition of the relaxation paths (25), and from the structure of the action functional (17–18), that a relaxation path has zero action. This should be physically intuitive, as no noise is needed for the system to follow such a path. Then, if there exists a relaxation path between \(q_{0}\) and \(q_{T}\) taking time \(T\), (\(\left\{ q(t)\right\} _{0\le t\le T}\) such that \(q(0)=q_{0}\) and \(q(T)=q_{T}\)), we deduce that

$$\begin{aligned} A_{(0,T)}\left[ q_{0},q_{T}\right] =0. \end{aligned}$$

Similarly, using the duality relation (22), if there exists a relaxation path for the reversed dynamics between \(I\left[ q_{T}\right] \) and \(I\left[ q_{0}\right] \), we deduce that

$$\begin{aligned} A_{(0,T)}\left[ q_{0},q_{T}\right] =\mathcal {G}\left[ q_{T}\right] -\mathcal {G}\left[ q_{0}\right] . \end{aligned}$$

This is an important statement. Indeed, the reversed dynamics has properties very similar to those of the original dynamics (it has the same fixed points, the same attractors, and the same saddles up to the application of the involution \(I\)), but in the argument above, we see that the initial and final points of the relaxation paths have been exchanged from \(q_{0}\) and \(q_{T}\) to \(I\left[ q_{T}\right] \) and \(I\left[ q_{0}\right] \) respectively. This will be especially useful when the starting point is one of the local minima of the potential \(\mathcal {G}\), and thus one of the attractors of the reversed dynamics.
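Both statements about action minima can be checked numerically on a finite-dimensional analogue. The sketch below rests on several assumptions that are not part of the text: a two-variable potential `G`, a drift \(F=J\nabla G\), the trivial involution \(I=\mathrm{Id}\) (so that the reversed drift is \(-F\) and \(\mathcal {G}_{r}=\mathcal {G}\)), and an action of the standard form \(\frac{1}{4\alpha }\int (\dot{q}-b)^{T}C^{-1}(\dot{q}-b)\,\mathrm{d}t\) with \(b=F-\alpha C\nabla G\), which is assumed to correspond to Eqs. (17–18).

```python
import numpy as np

ALPHA = 0.5
C = np.array([[2.0, 0.5], [0.5, 1.0]])    # assumed positive definite correlation matrix
C_INV = np.linalg.inv(C)
J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # antisymmetric, so F = J grad G conserves G

def G(q):
    return 0.25 * (q[0]**2 - 1.0)**2 + 0.5 * q[1]**2

def grad_G(q):
    return np.array([q[0] * (q[0]**2 - 1.0), q[1]])

def drift(q, reverse=False):
    """Relaxation drift F - alpha C grad G; reverse=True uses F_r = -F (I = identity)."""
    sign = -1.0 if reverse else 1.0
    g = grad_G(q)
    return sign * (J @ g) - ALPHA * (C @ g)

def rk4_path(q0, dt, n_steps, reverse=False):
    path = np.empty((n_steps + 1, 2))
    path[0] = q0
    for i in range(n_steps):
        q = path[i]
        k1 = drift(q, reverse)
        k2 = drift(q + 0.5 * dt * k1, reverse)
        k3 = drift(q + 0.5 * dt * k2, reverse)
        k4 = drift(q + dt * k3, reverse)
        path[i + 1] = q + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return path

def action(path, dt):
    """Midpoint discretization of the assumed action, evaluated for the ORIGINAL dynamics."""
    q_dot = (path[1:] - path[:-1]) / dt
    q_mid = 0.5 * (path[1:] + path[:-1])
    resid = q_dot - np.array([drift(q) for q in q_mid])
    lagrangian = np.einsum("ti,ij,tj->t", resid, C_INV, resid) / (4.0 * ALPHA)
    return lagrangian.sum() * dt

DT, N_STEPS = 1e-3, 10000
q_start = np.array([0.3, 1.5])

relaxation = rk4_path(q_start, DT, N_STEPS)                     # original dynamics
reversed_relaxation = rk4_path(q_start, DT, N_STEPS, reverse=True)
fluctuation = reversed_relaxation[::-1]                         # time-reversed path

action_relaxation = action(relaxation, DT)
action_fluctuation = action(fluctuation, DT)
potential_gap = G(fluctuation[-1]) - G(fluctuation[0])
```

Up to discretization error, `action_relaxation` should vanish, while `action_fluctuation` should agree with `potential_gap`, the difference of the potential between the two end points of the fluctuation path.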

Consider now the case when \(q_{0}\) is a local minimum of \(\mathcal {G}\). Then as it is also an attractor of the relaxation dynamics, no non-trivial relaxation path will start at \(q_{0}\). But, for all \(q_{T}\) inside the basin of attraction of \(q_{0}\), there exists a relaxation path from \(q_{T}\) to \(q_{0}\). Generically, this path will take an infinite amount of time \(T=\infty \), e.g. if there is an exponential relaxation. Consequently, there will also be a relaxation path for the dual dynamics from \(I\left[ q_{T}\right] \) to \(I\left[ q_{0}\right] \) taking infinite time.

Therefore, for the relaxation dynamics, we have that for all \(q_{T}\) in the basin of attraction of a local minimum \(q_{0}\)

$$\begin{aligned} A_{(-\infty ,0)}\left[ q_{0},q_{T}\right] =\mathcal {G}\left[ q_{T}\right] -\mathcal {G}\left[ q_{0}\right] . \end{aligned}$$

For many problems, e.g. when one considers the stationary distribution, the action minimum \(A_{(-\infty ,0)}\left[ q_{0},q_{T}\right] \) is an essential quantity.

If \(q_{T}\) is in the basin of attraction of \(q_{1}\ne q_{0}\), then as there exists a relaxation path from \(q_{1}\) to \(q_{T}\), we can infer that

$$\begin{aligned} A_{(-\infty ,0)}\left[ q_{0},q_{T}\right] =A_{(-\infty ,0)}\left[ q_{0},q_{1}\right] . \end{aligned}$$

Moreover, it is easily understood that the action minimum will correspond to the relaxation trajectory, in the reversed dynamics, from the saddle \(q_{s}(q_{0},q_{1})\) that belongs to the closure of the basins of attraction of both \(q_{0}\) and \(q_{1}\), with the smallest possible value of the potential \(\mathcal {G}\left[ q_{s}(q_{0},q_{1})\right] \). Hence, if \(q_{T}\) is within the basin of attraction of \(q_{1}\) we have

$$\begin{aligned} A_{(-\infty ,0)}\left[ q_{0},q_{T}\right] =A_{(-\infty ,0)}\left[ q_{0},q_{1}\right] =A_{(-\infty ,\infty )}\left[ q_{0},q_{s}(q_{0},q_{1})\right] =\mathcal {G}\left[ q_{s}(q_{0},q_{1})\right] -\mathcal {G}\left[ q_{0}\right] . \end{aligned}$$

The infinite-time minimizers of the action between local minima of the potential and saddles are of central importance. These trajectories are called instantons. As should be clear from the previous discussion, instantons for Langevin dynamics are the time-reversed trajectories of relaxation paths of the reversed dynamics. Instantons are thus fluctuation paths for the Langevin dynamics. More explicitly, if \(\left\{ q_{r}(t)\right\} _{-\infty \le t\le \infty }\) is a relaxation path for the reversed dynamics between a saddle \(I\left[ q_{s}\right] \) and the attractor \(I\left[ q_{0}\right] \), then the instanton between \(q_{0}\) and \(q_{s}\) is given by \(\left\{ I\left[ q_{r}(-t)\right] \right\} _{-\infty \le t\le \infty }\). As instantons are the most probable fluctuation paths between attractors and saddles, they require an infinite amount of time to leave the attractor and an infinite amount of time to converge to the saddle. Moreover, they are degenerate, in the sense that if \(\left\{ q_{r}(t)\right\} _{-\infty \le t\le \infty }\) is an instanton, then for any \(\tau \), \(\left\{ q_{r}(t+\tau )\right\} _{-\infty \le t\le \infty }\) is also an instanton.

2.7 Large Deviations, Freidlin–Wentzell Theory and Entropic Effects

Up to now, we have discussed only the symmetry properties of the action functional (18) and of the action minima (26). In the limit of small noise, \(\gamma \rightarrow 0\) (see Sect. 2.1), one directly observes, from the path integral representation of the transition probability (20) that the minima of the action will play a crucial role. Indeed, the path integral will then be seen as a Laplace integral, and a Laplace principle will be used in order to derive a large deviation result

$$\begin{aligned} P\left[ q_{T},T;q_{0},0\right] \underset{\gamma \rightarrow 0}{=}\exp \left( -\frac{A_{(0,T)}\left[ q_{0},q_{T}\right] }{\gamma }+\mathrm{o}\left( \frac{1}{\gamma }\right) \right) , \end{aligned}$$
(28)

where \(A_{(0,T)}\left[ q_{0},q_{T}\right] =\min _{\left\{ q\,\left| \,q(0)=q_{0},\ q(T)=q_{T}\right. \right\} }\mathcal {A}_{(0,T)}\left[ q\right] \), and where \(\mathrm{o}\left( 1/\gamma \right) \) are subdominant contributions. Physicists, through explicit computations, have discussed many examples where this Laplace principle may or may not be correct for small \(\gamma \). Related examples in quantum mechanics are the evaluation of path integrals in the limit of small \(\hbar \) and the WKB approximation, both of which involve the evaluation of path integrals through a saddle point approximation. On the mathematical side, the study of sufficient hypotheses in order to rigorously prove such large deviation results (28) is one of the main aspects of Freidlin–Wentzell theory [26]. Roughly speaking, Freidlin and Wentzell proved that for finite dimensional stochastic dynamics, under generic hypotheses, a large deviation result actually holds.
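The Laplace principle invoked here can be illustrated on an ordinary one-dimensional integral, which is of course far simpler than the path integral (20). In the following sketch, the function `f` is an arbitrary tilted double well chosen for illustration (an assumption, not a quantity from this paper); the code evaluates \(-\gamma \log \int \exp (-f(x)/\gamma )\,\mathrm{d}x\) on a grid and compares it with \(\min f\) as \(\gamma \) decreases.

```python
import numpy as np

def f(x):
    """Assumed test function: a tilted double well, global minimum near x = -1."""
    return (x**2 - 1.0)**2 + 0.3 * x + 0.5

x = np.linspace(-3.0, 3.0, 200001)
dx = x[1] - x[0]
f_min = f(x).min()

def laplace_estimate(gamma):
    """Return -gamma * log( integral of exp(-f/gamma) ), computed on the grid.

    The minimum of f is factored out before exponentiating to avoid underflow.
    """
    weights = np.exp(-(f(x) - f_min) / gamma)
    return f_min - gamma * np.log(weights.sum() * dx)

gammas = [1e-1, 1e-2, 1e-3]
errors = [abs(laplace_estimate(g) - f_min) for g in gammas]
```

The entries of `errors` should decrease with \(\gamma \): the difference between the estimate and \(\min f\) comes from the Gaussian prefactor, which is the finite-dimensional counterpart of the subdominant \(\mathrm{o}(1/\gamma )\) contribution in (28).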

However, we draw the attention of the reader to the fact that for infinite dimensional field equations, e.g. turbulence models, a large deviation result (28) is far from obvious in the limit of small \(\gamma \). It may be expected to be true if, for instance, the degrees of freedom at the smallest scales can be proven to have a negligible effect upon the dynamics, such that it is qualitatively similar to an effective finite dimensional system. For the turbulence model we present here, such a property is not obvious at all. Studying this issue in general is an extremely difficult task. The path integral taken over Gaussian fluctuations around the critical point is given by the determinant of the second variation of the action functional, and this determinant is typically infinite for infinitely many degrees of freedom. It therefore requires a regularization, which can lead either to a renormalization of constants in (28) or to a completely different answer. This problem goes beyond the scope of this paper. However, we will return to this discussion for a specific case in the conclusion.

3 The Two-Dimensional Euler and Quasi-Geostrophic Equilibrium Dynamics

In this section, we apply the formalism outlined previously to turbulence models. We explain why the two main hypotheses of Langevin dynamics (Liouville property and conservation of the potential related to the transversality condition) are verified. We assume that the kernel in front of the gradient part and the noise autocorrelation are identical. Then all of the time-reversal properties and the Lyapunov properties discussed in the previous section apply to these turbulence models.

An interesting aspect, explained below, is that depending on the properties of the potential \(\mathcal {G}\) (even or not), and of the model (with or without topography), the Langevin dynamics can be either symmetric under time reversal or not.

We consider the Langevin dynamics associated to the quasi-geostrophic equations in a periodic domain \(\mathcal {D}=[0,2\pi l_{x})\times [0,2\pi )\) with aspect ratio \(l_{x}\) to be given as

$$\begin{aligned} \frac{\partial q}{\partial t}+\mathbf {v}\left[ q-h\right] \cdot \mathbf {\nabla }q&= -\alpha \int _{\mathcal {D}}\, C(\mathbf{r},\mathbf{r}')\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}')}\,\mathrm{d}\mathbf{r}'+\sqrt{2\alpha \gamma }\eta ,\end{aligned}$$
(29)
$$\begin{aligned} \mathbf {v}=\mathbf {e}_{z}\times \mathbf {\nabla }\psi ,\quad \omega&= \Delta \psi , \quad q=\omega +h(\mathbf {r}), \end{aligned}$$
(30)

with potential \(\mathcal {G}.\) The stochastic force \(\eta \) is a Gaussian process, white in time, with correlation function \(\mathbb {E}\left[ \eta (\mathbf {r},t)\eta (\mathbf {r}',t')\right] =C(\mathbf {r},\mathbf {r}')\delta (t-t')\). The potential \(\mathcal {G}\) and the assumption of Langevin dynamics are discussed in Sect. 3.1. Moreover, the topography \(h(\mathbf {r})\) is such that \(\int _{\mathcal {D}}\, h\left( \mathbf {r}\right) \, \mathrm{d}\mathbf{r}=0\). We consider \(G\) to be the Green’s function of the Laplacian operator (\(G=\Delta ^{-1}\)) for doubly periodic functions with zero averages. Then, the equations relating the potential vorticity \(q\), the stream function \(\psi \), and the velocity are inverted as

$$\begin{aligned} \psi (\mathbf {r})=\int _{\mathcal {D}}\, G\left( \mathbf {r},\mathbf{r}'\right) \left[ q(\mathbf{r}')-h(\mathbf{r}')\right] \, \mathrm{d}\mathbf{r}', \end{aligned}$$

and

$$\begin{aligned} \mathbf {v}\left[ \omega \right] (\mathbf {r})=\int _{\mathcal {D}}\,\mathbf {e}_{z}\times \nabla _{\mathbf {r}'}G\left( \mathbf {r},\mathbf{r}'\right) \omega (\mathbf {r}')\, \mathrm{d}\mathbf{r}', \end{aligned}$$
(31)

respectively. Here, \(\mathbf {v}\left[ \omega \right] \) is the operator that allows us to compute the velocity from the vorticity. When \(h=0\), these dynamics correspond to the two-dimensional Euler equilibrium dynamics.
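On a doubly periodic domain, the inversion from \(q\) to \(\psi \) and \(\mathbf {v}\) is conveniently done in Fourier space, where \(G=\Delta ^{-1}\) is a division by \(-|\mathbf {k}|^{2}\). The sketch below is one possible discretization, not the method used in this paper; the function name `invert_potential_vorticity`, the grid size and the test fields are assumptions made for illustration.

```python
import numpy as np

def invert_potential_vorticity(q, h, lx):
    """Return (psi, u, v) for q and h sampled on [0, 2*pi*lx) x [0, 2*pi).

    Arrays are indexed [iy, ix]; q - h is assumed to have zero mean.
    """
    ny, nx = q.shape
    kx = np.fft.fftfreq(nx, d=1.0 / nx) / lx   # x-wavenumbers n / lx
    ky = np.fft.fftfreq(ny, d=1.0 / ny)        # y-wavenumbers m
    KX, KY = np.meshgrid(kx, ky)
    k2 = KX**2 + KY**2
    k2[0, 0] = 1.0                             # avoid 0/0; the mean mode is reset below
    psi_hat = -np.fft.fft2(q - h) / k2         # solves Laplacian(psi) = q - h
    psi_hat[0, 0] = 0.0
    psi = np.real(np.fft.ifft2(psi_hat))
    u = np.real(np.fft.ifft2(-1j * KY * psi_hat))   # u = -d(psi)/dy
    v = np.real(np.fft.ifft2(1j * KX * psi_hat))    # v = +d(psi)/dx
    return psi, u, v

# Usage on an assumed test field with a known stream function.
LX, N, H = 0.5, 32, 0.4
x = 2.0 * np.pi * LX * np.arange(N) / N
y = 2.0 * np.pi * np.arange(N) / N
X, Y = np.meshgrid(x, y)
h = H * np.cos(2.0 * Y)
psi_true = np.cos(Y) + 0.3 * np.sin(X / LX)
q = -np.cos(Y) - (0.3 / LX**2) * np.sin(X / LX) + h   # q = Laplacian(psi_true) + h
psi, u, v = invert_potential_vorticity(q, h, LX)
```

For this test field the routine should return `psi` equal to `psi_true`, with `u` equal to \(\sin y\) and `v` equal to \((0.3/l_{x})\cos (x/l_{x})\), in agreement with \(\mathbf {v}=\mathbf {e}_{z}\times \nabla \psi \).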

3.1 Conserved Quantity and Liouville Property

From the velocity-vorticity relationship, it is easily checked that the kinetic energy can be expressed as

$$\begin{aligned} \mathcal {E}=-\frac{1}{2}\int _{\mathcal {D}}\,\left[ q-h\left( \mathbf {r}\right) \right] \psi \, \mathrm{d}\mathbf {r}=\frac{1}{2}\int _{\mathcal {D}}\,\left( \nabla \psi \right) ^{2}\, \mathrm{d}\mathbf {r}, \end{aligned}$$
(32)

and, for any sufficiently smooth real function \(s\), the Casimir functionals are defined as

$$\begin{aligned} \mathcal {C}_{s}=\int _{\mathcal {D}}\, s(q)\, \mathrm{d}\mathbf {r}, \end{aligned}$$

which are all conserved quantities of the deterministic quasi-geostrophic dynamics (Eqs. (29) for \(\alpha =0\)). For any \(s\), and any \(\beta \) the functional

$$\begin{aligned} \mathcal {G}=\mathcal {C}_{s}+\beta \mathcal {E}, \end{aligned}$$

is a conserved quantity of the deterministic dynamics, and can therefore serve as the potential of a Langevin dynamics.

Moreover, as the deterministic equations (Eq. (29) for \(\alpha =0\)) essentially correspond to a transport equation by a divergenceless velocity field, the Liouville property (5) is formally verified

$$\begin{aligned} \nabla \cdot \mathcal {F}\equiv -\int _{\mathcal {D}}\,\mathbf {v}\left[ q-h\right] \cdot \mathbf {\nabla }q\,\mathrm{d}\mathbf {r}=-\int _{\mathcal {D}}\,\nabla \cdot \left( \mathbf {v}\left[ q-h\right] q\right) \,\mathrm{d}\mathbf {r}=0. \end{aligned}$$

Then the formalism of Sect. 2 applies with \(\mathcal {F}\left[ q\right] =-\mathbf {v}\left[ q-h\right] \cdot \nabla q\).
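The conservation of the energy and of the Casimir functionals by \(\mathcal {F}\left[ q\right] =-\mathbf {v}\left[ q-h\right] \cdot \nabla q\) can be verified numerically on a band-limited field. The sketch below is an illustration under stated assumptions: the spectral discretization, the grid size, the test stream function and the two Casimirs \(s(q)=q^{2}/2\) and \(s(q)=q^{4}/4\) are choices made here, not taken from the text. It evaluates the instantaneous rates of change \(\mathrm{d}\mathcal {E}/\mathrm{d}t=\int \psi \,\mathbf {v}\cdot \nabla q\,\mathrm{d}\mathbf {r}\) and \(\mathrm{d}\mathcal {C}_{s}/\mathrm{d}t=-\int s'(q)\,\mathbf {v}\cdot \nabla q\,\mathrm{d}\mathbf {r}\).

```python
import numpy as np

LX, N = 0.5, 32
x = 2.0 * np.pi * LX * np.arange(N) / N
y = 2.0 * np.pi * np.arange(N) / N
X, Y = np.meshgrid(x, y)                      # arrays indexed [iy, ix]
k = np.fft.fftfreq(N, d=1.0 / N)
KX, KY = np.meshgrid(k / LX, k)

def ddx(f):
    return np.real(np.fft.ifft2(1j * KX * np.fft.fft2(f)))

def ddy(f):
    return np.real(np.fft.ifft2(1j * KY * np.fft.fft2(f)))

def laplacian(f):
    return np.real(np.fft.ifft2(-(KX**2 + KY**2) * np.fft.fft2(f)))

# Assumed band-limited test fields (low wavenumbers only, so grid sums are exact).
psi = np.cos(Y) + 0.3 * np.sin(X / LX) * np.cos(2.0 * Y) + 0.2 * np.cos(2.0 * X / LX + Y)
h = 0.4 * np.cos(2.0 * Y)
q = laplacian(psi) + h

# v[q - h] . grad q with v = (-d(psi)/dy, d(psi)/dx)
advection = -ddy(psi) * ddx(q) + ddx(psi) * ddy(q)

cell = (2.0 * np.pi * LX / N) * (2.0 * np.pi / N)
dE_dt = np.sum(psi * advection) * cell         # rate of change of the energy
dC2_dt = -np.sum(q * advection) * cell         # Casimir with s(q) = q^2 / 2
dC4_dt = -np.sum(q**3 * advection) * cell      # Casimir with s(q) = q^4 / 4
```

All three rates should vanish to round-off, so that any combination \(\mathcal {C}_{s}+\beta \mathcal {E}\) built from these functionals is conserved by the discretized transport term for this field.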

3.2 Reversed Dynamics and Detailed Balance

For the two-dimensional Euler or quasi-geostrophic equations, the relevant involution corresponding to a time reversal is

$$\begin{aligned} I\left[ q\right] =-q. \end{aligned}$$

Using (10–12) we conclude that

$$\begin{aligned} \mathcal {F}_{r}\left[ q\right] =-\mathbf {v}\left[ q+h\right] \cdot \nabla q, \end{aligned}$$

\(C_{r}=C\) and

$$\begin{aligned} \mathcal {G}_{r}\left[ q\right] =\mathcal {G}\left[ -q\right] . \end{aligned}$$

From these equations, we observe that for the two-dimensional Euler equations (\(h=0\)), \(\mathcal {F}_{r}=\mathcal {F}\), and thus we conclude that the dynamics are time-reversible (see Eq. (10)). The time reversibility condition on \(\mathcal {G}\) (see Eq. (15)) imposes that the potential \(\mathcal {G}\) must be even. We therefore have two cases:

  1. For the two-dimensional Euler equations with an even potential \(\mathcal {G}\), the Langevin dynamics are time-reversible and detailed balance is verified.

  2. When either \(h\ne 0\) (quasi-geostrophic) or when \(\mathcal {G}\) is not even, then the Langevin dynamics are not time-reversible. The original dynamics are conjugated to another Langevin dynamics where \(h\) has to be replaced by \(-h\) and \(\mathcal {G}\) by \(\mathcal {G}_{r}\left[ q\right] =\mathcal {G}\left[ -q\right] \). In this case, detailed balance is not verified.
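The two cases can be checked on the transport term alone. The sketch below assumes that the reversed drift is defined by \(\mathcal {F}_{r}\left[ q\right] =-I\left[ \mathcal {F}\left[ I\left[ q\right] \right] \right] \) (the relations (10) and (12) are not reproduced in this section, so this form is an assumption), and uses a spectral discretization and test fields chosen for illustration. It compares \(\mathcal {F}_{r}\) with the drift of the dynamics in which \(h\) is replaced by \(-h\).

```python
import numpy as np

LX, N = 0.5, 32
x = 2.0 * np.pi * LX * np.arange(N) / N
y = 2.0 * np.pi * np.arange(N) / N
X, Y = np.meshgrid(x, y)                      # arrays indexed [iy, ix]
k = np.fft.fftfreq(N, d=1.0 / N)
KX, KY = np.meshgrid(k / LX, k)
K2 = KX**2 + KY**2
K2_SAFE = np.where(K2 == 0.0, 1.0, K2)        # avoid division by zero for the mean mode

def deriv(f, K):
    return np.real(np.fft.ifft2(1j * K * np.fft.fft2(f)))

def drift(q, h):
    """F[q] = -v[q - h] . grad q, with psi the inverse Laplacian of q - h."""
    psi_hat = -np.fft.fft2(q - h) / K2_SAFE
    psi_hat[0, 0] = 0.0
    psi = np.real(np.fft.ifft2(psi_hat))
    u, v = -deriv(psi, KY), deriv(psi, KX)
    return -(u * deriv(q, KX) + v * deriv(q, KY))

def involution(q):
    """Time-reversal involution I[q] = -q."""
    return -q

def reversed_drift(q, h):
    """Assumed definition F_r[q] = -I[F[I[q]]]."""
    return -involution(drift(involution(q), h))

# Assumed zero-mean test fields.
q = np.cos(Y) + 0.5 * np.sin(X / LX) * np.cos(2.0 * Y)
h = 0.4 * np.cos(2.0 * Y)
no_topography = np.zeros_like(q)

same_as_flipped_topography = np.allclose(reversed_drift(q, h), drift(q, -h))
reversible_without_topography = np.allclose(reversed_drift(q, no_topography),
                                            drift(q, no_topography))
reversible_with_topography = np.allclose(reversed_drift(q, h), drift(q, h))
```

Under these assumptions, the reversed drift coincides with the drift for \(-h\), it coincides with the original drift when \(h=0\), and it differs from the original drift when \(h\ne 0\).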

3.3 Instanton Equation

As discussed in Sect. 2, the instantons from one attractor to a saddle are given by the reverse of the relaxation paths of the corresponding reversed dynamics. From (25) applied to the case where \(\mathcal {F}_{r}\left[ q\right] =-\mathbf {v}\left[ q+h\right] \cdot \nabla q\), and \(\mathcal {G}_{r}\left[ q\right] =\mathcal {G}\left[ -q\right] \), we determine that the equation of these relaxation paths is

$$\begin{aligned} \frac{\partial q}{\partial t}+\mathbf {v}\left[ q+h\right] \cdot \nabla q=-\alpha \int _{\mathcal {D}}\, C(\mathbf{r},\mathbf{r}')\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}')}\left[ -q\right] \mathrm{d}\mathbf{r}'. \end{aligned}$$
(33)

3.4 Energy, Enstrophy, and Energy-Enstrophy Ensembles and Physical Dissipation

In this subsection, we consider the special case when the potential is given in the following form

$$\begin{aligned} \mathcal {G}=\int _{\mathcal {D}}\,\frac{q^{2}}{2}\,\mathrm{d}\mathbf {r}+\beta \mathcal {E}. \end{aligned}$$
(34)

This structure is referred to as the potential enstrophy ensemble (when \(\beta =0\)), the enstrophy ensemble (when \(\beta =0\) and \(h=0\)), or generally as the energy-enstrophy ensemble. The properties of the corresponding invariant measures have been discussed on a number of occasions, starting with the works of Kraichnan [43] in the case of Galerkin truncations of the dynamics, and for some cases without discretization, see for instance [11] and references therein.

For specific choices of the potential \(\mathcal {G}\) and of the kernel \(C\), the friction term can also be identified with a classical physical dissipation mechanism. For instance, if \(C(\mathbf{r},\mathbf{r}')=-\Delta \delta (\mathbf {r}-\mathbf {r}')\), which is positive definite, and the potential takes the form of (34) with \(h=0\), then the dissipative term on the right hand side of (29) is

$$\begin{aligned} -\alpha \int _{\mathcal {D}}\, C(\mathbf{r},\mathbf{r}')\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}')}\left[ q\right] \,\mathrm{d}\mathbf{r}'=\alpha \Delta q-\alpha \beta q, \end{aligned}$$

which leads to a diffusion type dissipation with viscosity \(\alpha \) and a linear friction with friction parameter \(\alpha \beta \). Such a linear friction can model the effects of three-dimensional boundary layers on the quasi two-dimensional bulk vorticity, that appear in experiments with a very large aspect ratio, rotating tank experiments, or soap film experiments.
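As a hedged sketch of the computation behind this identification, assume that the Green's function \(G\) is symmetric, so that Eq. (32) gives \(\delta \mathcal {E}/\delta q=-\psi \), and take the kernel with the sign that makes it positive definite, \(C(\mathbf{r},\mathbf{r}')=-\Delta \delta (\mathbf {r}-\mathbf {r}')\). Both points are assumptions of this illustration. Then, for the potential (34),

$$\begin{aligned} \frac{\delta \mathcal {G}}{\delta q}&= q-\beta \psi ,\\ -\alpha \int _{\mathcal {D}}\, C(\mathbf{r},\mathbf{r}')\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}')}\left[ q\right] \,\mathrm{d}\mathbf{r}'&= \alpha \Delta \left( q-\beta \psi \right) =\alpha \Delta q-\alpha \beta \omega , \end{aligned}$$

where \(\omega =\Delta \psi =q-h\). For \(h=0\) this reduces to \(\alpha \Delta q-\alpha \beta q\), the sum of a viscous term and of a linear friction term.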

Because the quasi-potential for the enstrophy ensemble is simply the enstrophy, the relaxation and fluctuation paths can be computed explicitly in many scenarios, as is discussed in [12].

For the majority of the other cases, the dissipative term on the right hand side of (29) cannot be identified as a microscopic dissipation mechanism nor as a physical mechanism. There is however another possible interpretation of this kind of friction term. As explained in [9], entropy maxima subjected to constraints related to the conservation of energy and the distribution of vorticity, are also extrema of energy-Casimir functionals. By analogy with the Allen-Cahn equation in statistical mechanics, that uses the free energy as a potential, it seems reasonable to describe the largest scales of turbulent flows as evolving through a gradient term of the energy-Casimir functional. Such models have been considered in the past (see, for example [20, 21] and references therein). At this stage, this should be considered as a phenomenological approach, as no clear theoretical results exist to support this view.

4 Phase Transition and Instantons Between Zonal Flows in the Barotropic Quasi-Geostrophic Equations

In order to fully determine the quasi-geostrophic Langevin dynamics (29), we need to specify the topography function and the potential \(\mathcal {G}\). Given the infinite number of conserved quantities for the quasi-geostrophic dynamics, there are many possible choices. We are interested in the description of the phenomenology of phase transitions and instanton theory in situations of first order transitions. Therefore, we will illustrate such a phenomenology through two examples.

For the first example, we choose a topography given by \(h\left( \mathbf {r}\right) =H\cos \left( 2y\right) \), such that

$$\begin{aligned} q=\Delta \psi +H\cos \left( 2y\right) , \end{aligned}$$

and consider the potential

$$\begin{aligned} \mathcal {G}=\mathcal {C}+\beta \mathcal {E}, \end{aligned}$$
(35)

with energy (32), \(\beta \) the inverse temperature, and where \(\mathcal {C}\) is the Casimir functional

$$\begin{aligned} \mathcal {C}=\int _{\mathcal {D}}\,\frac{q^{2}}{2}-a_{4}\frac{q^{4}}{4}+a_{6}\frac{q^{6}}{6}\, \mathrm{d}\mathbf {r}, \end{aligned}$$
(36)

where we assume that \(a_{6}>0\).

4.1 Zonal Phase Transitions

We first consider the structure of the minima of the potential \(\mathcal {G}\) (35), and then their bifurcations when the parameters \(\epsilon \) and \(a_{4}\) are changed, where \(\epsilon \) is defined by

$$\begin{aligned} \beta =-1+\epsilon . \end{aligned}$$

At low positive temperature (\(\beta \rightarrow \infty \)), we expect to observe energy minima, which correspond to \(\psi =0\) and \(q=H\cos \left( 2y\right) \). As the energy is convex, for positive \(\beta \) and small enough \(a_{4}\), both \(\mathcal {C}\) and \(\beta \mathcal {E}\) will also be convex. Hence, we expect that \(\mathcal {G}\) will have a unique global minimum and no local minima. For large enough \(\beta \), this equilibrium state will be dominated by the topographic effect. For small negative \(\beta \), the change of convexity of \(\beta \mathcal {E}\) from convex to concave will not change this picture. However, for smaller \(\beta \) (more negative, with larger absolute value), we expect a phase transition to occur as the potential \(\mathcal {G}\) will become locally concave. If \(a_{4}>0\), with sufficiently large values, this will be a first order phase transition. If \(a_{4}<0\) with sufficiently large values, this will be a second order phase transition.

When \(H=0\), a bifurcation occurs for \(\beta =-1\) (\(\epsilon =0\)) and \(a_{4}=0\), as can be easily checked (see [23]). This bifurcation is due to the vanishing of the Hessian at \(\beta =-1\) (\(\epsilon =0\)) and \(a_{4}=0\). As discussed in many papers [16, 22, 23, 72], for the quadratic Casimir functional \(\mathcal {C}_{2}=\int _{\mathcal {D}}\,q^{2}/2\, \mathrm{d}\mathbf{r}\), the first bifurcation involves the eigenfunction of \(-\Delta \) with the lowest eigenvalue. If we assume that the aspect ratio \(l_{x}\) (defined just before Eq. 29) satisfies \(l_{x}<1\), then the smallest eigenvalue is the one corresponding to the zonal mode proportional to \(\cos \left( y\right) \). Because we are interested in transitions between two zonal states, we assume from now on that \(l_{x}<1\).

For non-zero, but sufficiently small, \(H\) there will still be a bifurcation for \(\epsilon \) and \(a_{4}\) close to zero. This is the regime that we wish to consider. The null space of the Hessian is spanned by the eigenfunctions \(\cos \left( y\right) \) and \(\sin \left( y\right) \). As a consequence, for small enough \(\epsilon \), \(a_{4}\) and \(H\), we expect that the bifurcation can be described by a normal form involving only the projection of the field \(q\) onto the null space. Hence, we decompose the fields into a contribution arising through their projection onto this null space and its orthogonal complement:

$$\begin{aligned} \psi =A\cos \left( y\right) +B\sin \left( y\right) +\psi ' \end{aligned}$$
(37)

where \(\int _{\mathcal {D}}\exp \left( iy\right) \, \psi '(\mathbf {r})\, \mathrm{d}\mathbf{r} =0.\) Then

$$\begin{aligned} q=-A\cos \left( y\right) -B\sin \left( y\right) +q', \end{aligned}$$
(38)

with \(\int _{\mathcal {D}}\,\exp \left( iy\right) \, q'(\mathbf {r}) \,\mathrm{d}\mathbf{r}=0\). The fact that the bifurcation can be described by a normal form over the null space of the Hessian can be expected on a general basis. It can actually be justified by using Lyapunov-Schmidt reduction, as performed and explained in [23] for a number of examples for the two-dimensional Euler and quasi-geostrophic equations. Then all other degrees of freedom describing the minima \(q_{c}\) of \(\mathcal {G}\) are slaved to \(A\) and \(B\), in the sense that they can be simply expressed as functions of \(A\) and \(B\) themselves. Although the following example is not treated in [23], carrying out the reduction for it would not be difficult. We therefore omit the details of the Lyapunov-Schmidt reduction here for simplicity, and propose a more heuristic discussion instead.

Our strategy is to treat the problem perturbatively by assuming that \(\epsilon \ll 1\), \(a_6 H^2 \ll a_4\), and \(a_{4}H^{2}\ll \epsilon \) (note that this implies \(a_{6}H^{4}\ll \epsilon \)). We make these assumptions in order to get an explicit description of the phase transition. However, it is important to understand that the theory that predicts the transition rates and the instantons does not depend on these assumptions, and that the same phenomenology will remain valid beyond the perturbative regime. We will assume that \(\psi '\) and \(q'\) are first order corrections in all of the three perturbation parameters. By rewriting the potential \(\mathcal {G}\), taking into account only the leading order contributions, and using Eqs. (32) and (36–38), we get after some straightforward computations that

$$\begin{aligned} \mathcal {E}=\pi ^{2}l_{x}\left( A^{2}+B^{2}\right) +\frac{1}{2}\int _{\mathcal {D}}\,\left[ H\cos (2y)-q'\right] \psi ' \, \mathrm{d}\mathbf {r}, \end{aligned}$$

and

$$\begin{aligned} \mathcal {G}=\pi ^{2}l_{x}\mathcal {G}_{0}(A,B)+\mathcal {G}_{1}(A,B)\left[ q'\right] +\mathrm{lower~order~terms}, \end{aligned}$$

with

$$\begin{aligned} \mathcal {G}_{0}(A,B)=\epsilon \left( A^{2}+B^{2}\right) -\frac{3a_{4}}{8}\left( A^{2}+B^{2}\right) ^{2}+\frac{5a_{6}}{24}\left( A^{2}+B^{2}\right) ^{3}, \end{aligned}$$

and

$$\begin{aligned} \mathcal {G}_{1}(A,B)\left[ q'\right]&= \frac{\epsilon -1}{2}\int _{\mathcal {D}}\,\left[ H\cos \left( 2y\right) -q'\right] \psi '\,\mathrm{d}\mathbf {r}\nonumber \\&+\,\frac{1}{2}\int _{\mathcal {D}}\, q'^2\left\{ 1-3a_{4}\left[ A\cos (y)+B\sin (y)\right] ^{2}\right. \nonumber \\&\left. +\,5a_{6}\left[ A\cos (y)+B\sin (y)\right] ^{4}\right\} \,\mathrm{d}\mathbf {r}. \end{aligned}$$
(39)
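The coefficients of \(\mathcal {G}_{0}\) can be checked by direct quadrature. The following sketch evaluates \(\mathcal {G}=\mathcal {C}+\beta \mathcal {E}\) for \(\psi =A\cos (y)+B\sin (y)\) with \(H=0\) and \(\psi '=0\), and compares \(\mathcal {G}/(\pi ^{2}l_{x})\) with the polynomial \(\mathcal {G}_{0}(A,B)\). The function names, the grid size and the parameter values are assumptions made for this illustration.

```python
import numpy as np

def potential_on_null_space(A, B, eps, a4, a6, lx=0.5, n=64):
    """G / (pi^2 lx) for psi = A cos y + B sin y, with H = 0 and psi' = 0."""
    y = 2.0 * np.pi * np.arange(n) / n
    dy = 2.0 * np.pi / n
    x_length = 2.0 * np.pi * lx                 # fields do not depend on x
    psi = A * np.cos(y) + B * np.sin(y)
    q = -psi                                    # q = Laplacian(psi) when H = 0
    beta = -1.0 + eps
    casimir = x_length * dy * np.sum(q**2 / 2.0 - a4 * q**4 / 4.0 + a6 * q**6 / 6.0)
    energy = -0.5 * x_length * dy * np.sum(q * psi)   # Eq. (32) with h = 0
    return (casimir + beta * energy) / (np.pi**2 * lx)

def G0(A, B, eps, a4, a6):
    """Polynomial G_0(A, B) as written in the text."""
    r2 = A**2 + B**2
    return eps * r2 - 3.0 * a4 / 8.0 * r2**2 + 5.0 * a6 / 24.0 * r2**3

params = dict(A=0.7, B=-1.3, eps=0.05, a4=0.2, a6=0.1)   # assumed test values
quadrature_value = potential_on_null_space(**params)
polynomial_value = G0(**params)
```

Because the integrands are trigonometric polynomials of low degree, the grid sums are exact up to round-off, and the two values should coincide.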

We further assume that \(a_{4}A^{2} \ll \epsilon \), \(a_6 A^2\ll a_4\) and \(\epsilon \ll 1\). Then, the leading order terms are obtained from the minimization of the first integral and

$$\begin{aligned} \psi '=\left[ \frac{H}{3}\cos (2y)\right] \left[ 1+\mathcal {O}\left( \epsilon \right) +\mathcal {O}\left( a_{4}A^{2}\right) +\mathcal {O}\left( a_{6}A^{4}\right) \right] , \end{aligned}$$

or equivalently

$$\begin{aligned} q'=-\frac{H}{3}\cos (2y)\left[ 1+\mathcal {O}\left( \epsilon \right) +\mathcal {O}\left( a_{4}A^{2}\right) +\mathcal {O}\left( a_{6}A^{4}\right) \right] . \end{aligned}$$

We use this expression in order to compute the leading order contributions to \(G_{1}(A,B)=\min _{q'}\mathcal {\, G}_{1}(A,B)\left[ q'\right] \). After lengthy but straightforward computations, we get the leading order contribution to be

$$\begin{aligned} G_{1}&= \min _{q'}\mathcal {G}_{1}=-\frac{\pi ^{2}l_{x}H^{2}}{3}-\frac{\pi ^{2}l_{x}a_{4}H^{2}}{6}\left( A^{2}+B^{2}\right) \\&+\,\frac{5\pi ^{2}l_{x}a_{6}H^{2}}{144}\left[ 5\left( A^{2}+B^{2}\right) ^{2}+2\left( A^{2}-B^{2}\right) ^{2}\right] , \end{aligned}$$

and subsequently we obtain

$$\begin{aligned} \min _{q}\mathcal {G}=\min _{(A,B)}\,\pi ^{2}l_{x}G(A,B) \end{aligned}$$
(40)

with \(G\) given at leading order by

$$\begin{aligned} G(A,B)&= -\frac{{H^{2}}}{3}+\left( \epsilon -\frac{a_{4}H^{2}}{6}+\frac{5a_{6}H^{4}}{216}\right) \left( A^{2}+B^{2}\right) \nonumber \\&+\left( -\frac{3a_{4}}{8}+\frac{25a_{6}H^{2}}{144}\right) \left( A^{2}+B^{2}\right) ^{2}+\frac{5a_{6}}{24}\left( A^{2}+B^{2}\right) ^{3}\nonumber \\&+\,\,\frac{5a_{6}H^{2}}{72}\left( A^{2}-B^{2}\right) ^{2}. \end{aligned}$$
(41)

\(G(A,B)\) is the normal form that describes the phase transition in the limit \(a_4 A^2 \ll \epsilon \), and \(a_6 A^2 \ll a_4\) and \(\epsilon \ll 1\).

The fact that \(G\) is a normal form for small enough \(a_{4}\), \(a_{6}\), and \(H\), implies that the gradients of \(\mathcal {G}\) in the directions transverse to \(q=A\cos \left( y\right) +B\sin \left( y\right) \) are much steeper than the gradient of \(G\). A more complete derivation could easily be performed along the lines discussed in [23].

We observe that the term proportional to \(\left( A^{2}-B^{2}\right) ^{2}\) breaks the symmetry between \(A\) and \(B\). Its minimization imposes that \(A^{2}=B^{2}\). Then either \(A=B\), or \(A=-B\). If we take into account that minimizing with respect to \(A^{2}+B^{2}\) will give only the absolute value of \(A\), we can surmise that we will have four equivalent non-trivial solutions:

$$\begin{aligned} q_{i}=-\frac{H}{3}\cos \left( 2y\right) +\sqrt{2}\left| A\right| (\epsilon ,a_{4},a_{6})\cos (y+\phi _{i}), \end{aligned}$$

with \(\phi _{i}\) taking one of the four values \(\left\{ -\frac{3\pi }{4},-\frac{\pi }{4},\frac{\pi }{4},\frac{3\pi }{4}\right\} \), with \(\left| A\right| \) minimizing

$$\begin{aligned} \tilde{G}(\left| A\right| )=-\frac{H^{2}}{3}+2\left( \epsilon -\frac{{a_{4}H^{2}}}{6}+\frac{5a_{6}H^{4}}{216}\right) \left| A\right| ^{2}+4\left( -\frac{3a_{4}}{8}+\frac{25a_{6}H^{2}}{144}\right) \left| A\right| ^{4}+\frac{5a_{6}}{3}\left| A\right| ^{6}. \end{aligned}$$
(42)

The reduced potential \(G\) is plotted in Fig. 1 for the case \(\epsilon >0\) and \(a_{4}>0\). The structure has four non-trivial attractors due to a breaking of the symmetry imposed by the topography \(h(y)=H\cos \left( 2y\right) \). For \(\epsilon <0\), the minima of \(G\) have the symmetries of \(h\) (the potential vorticity profiles have a reflection symmetry with respect to both \(y=0\) and \(y=\pi \) and an anti-reflection symmetry with respect to both \(y=\pi /2\) and \(y=3\pi /2\)). For \(\epsilon >0\) this symmetry is broken, leading to four different attractors. In Fig. 2, we show the potential vorticity of two of the attractors, the corresponding saddle and the topography.

Fig. 1

Contour plot (left) and surface plot (right) of the reduced potential surface \(G(A,B)\) (see Eq. 41) for parameters: \(\epsilon =1.6\times 10^{-2}\), \(H=4\), \(a_{4}=6\times 10^{-4}\), \(a_{6}=3.6\times 10^{-6}\). For these parameters, \(G\) has four global minima with \(\left| A\right| =\left| B\right| \) and one local minimum at \(A=B=0\). This structure with four non-trivial attractors is due to symmetry breaking imposed by the topography \(h(y)=H\cos \left( 2y\right) \)

Fig. 2

The plot depicts the topography (\(h(y)=H\cos \left( 2y\right) \), symmetric red curve) and two non-trivial attractors of the potential vorticity \(q\) (black solid lines) corresponding to two minima of the effective potential \(G\) (see Eq. 41; Fig. 1) for parameter values \(\epsilon >0\) and \(a_{4}>0\). Additionally, we show the saddle between the two attractors of the effective potential \(G\) (dashed black curve) (Color figure online)

Considering the reduced potential \(\tilde{G}\) (Eq. 42), we recognize that the structure contains a tricritical point: a point at which a first order transition line switches to a second order transition line. Figure 3 shows a normal form for a tricritical point. The reduced potential \(\tilde{G}\) (Eq. 42) has the same normal form structure with \(a=\frac{2}{5a_{6}}\left( \epsilon -\frac{a_{4}H^{2}}{6}+\frac{5a_{6}H^{4}}{216}\right) \) and \(b=\frac{8}{5a_{6}}\left( -\frac{3a_{4}}{8}+\frac{25a_{6}H^{2}}{144}\right) \).

Fig. 3

We show the phase diagram for a tricritical point corresponding to the maximization of the normal form \(s(m)=-m^{6}-\frac{3b}{2}m^{4}-3am^{2}\) (taken from [10]). The insets show the qualitative shape of the potential \(s\) when the parameters \(a\) and \(b\) are changed. The black solid line corresponds to a line of first order (discontinuous) phase transition. The black dashed line is a second order phase transition line. At the tricritical point (\(a=b=0\)), the first order phase transition changes to a second order phase transition

From this last equation, we can conclude that for \(a_{4}<25a_{6}H^{2}/54\) (\(a_{4}<0\) at leading order), we have a continuous phase transition for \(\epsilon ={35a_{6}H^{4}}/{648}\) (zero at leading order). For \(a_{4}={25a_{6}H^{2}}/{54}\) (\(a_{4}=0\) at leading order), we have a tricritical point. Therefore, the transition is between a state given at leading order by

$$\begin{aligned} q=-\frac{H}{3}\cos \left( 2y\right) \end{aligned}$$

to one of the four states given by

$$\begin{aligned} q_{i}=-\frac{H}{3}\cos \left( 2y\right) +\sqrt{2}\left| A\right| (\epsilon ,a_{4},a_{6})\cos \left( y+\phi _{i}\right) , \end{aligned}$$
(43)

where \(\phi _{i}\in \left\{ -\frac{3\pi }{4},-\frac{\pi }{4},\frac{\pi }{4},\frac{3\pi }{4}\right\} \), and \(\left| A\right| (\epsilon ,a_{4},a_{6})\) is the non-zero minimizer of (42). For \(a_{4}>0\) and \(\epsilon \) close to zero, both of these states coexist, and thus the transition when \(\epsilon \) is increased is of first order. For \(a_{4}<0\) and \(\epsilon \) close to zero, the transition when \(\epsilon \) is increased is a second order (continuous) transition.
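The phase diagram of the tricritical normal form \(s(m)=-m^{6}-\frac{3b}{2}m^{4}-3am^{2}\) can be explored with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and the first order line \(a=3b^{2}/16\) (for \(b<0\)) is obtained here by the elementary condition that the non-trivial minimum of \(-s\) has the same value as the state \(m=0\); it is not quoted from the text. The last function checks, in exact rational arithmetic, that setting the quadratic coefficient of (42) to zero at \(a_{4}=25a_{6}H^{2}/54\) gives \(\epsilon =35a_{6}H^{4}/648\).

```python
import math
from fractions import Fraction


def minus_s(m2, a, b):
    """-s(m) written in terms of m2 = m**2: m^6 + (3b/2) m^4 + 3a m^2."""
    return m2**3 + 1.5 * b * m2**2 + 3.0 * a * m2


def global_minimum_m2(a, b):
    """Return m**2 at the global minimum of -s(m); 0.0 is the symmetric state."""
    disc = b * b - 4.0 * a
    if disc < 0.0:
        return 0.0  # no non-trivial stationary point
    # Non-trivial stationary points solve m2**2 + b*m2 + a = 0; the larger
    # root is the only candidate for a non-trivial minimum.
    m2 = 0.5 * (-b + math.sqrt(disc))
    if m2 <= 0.0:
        return 0.0
    return m2 if minus_s(m2, a, b) < 0.0 else 0.0


def first_order_line(b):
    """Value of a on the first order line, meaningful for b < 0."""
    return 3.0 * b * b / 16.0


def epsilon_at_tricritical():
    """epsilon / (a6 H^4) when a4 = 25 a6 H^2 / 54 and the quadratic
    coefficient (epsilon - a4 H^2/6 + 5 a6 H^4/216) vanishes."""
    a4_over_a6H2 = Fraction(25, 54)
    return a4_over_a6H2 / 6 - Fraction(5, 216)
```

For \(b>0\) the minimizer leaves \(m=0\) continuously as \(a\) becomes negative (second order line \(a=0\)), whereas for \(b<0\) it jumps to a finite value when \(a\) crosses `first_order_line(b)`.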

4.2 Instantons for the Topography Phase Transition

To summarize, we know how to describe and compute the instantons corresponding to the phase transitions between zonal flows. In Sect. 2 we have derived the general theory for Langevin dynamics for field problems with potential \(\mathcal {G}\), and have concluded in Sect. 2.6 that instantons are the time reversed trajectories of relaxation paths for the reversed dynamics. The corresponding equation of motion for the relaxation paths for the reversed dynamics for the quasi-geostrophic dynamics has then been derived in Sect. 3.3.

The general theory and Eq. (33) show that for the quasi-geostrophic dynamics, the reversed dynamics is simply the quasi-geostrophic dynamics where \(h\) has been replaced by \(-h\) and \(\mathcal {G}\) by \(\mathcal {G}_{r}\), with \(\mathcal {G}_{r}\left[ q\right] =\mathcal {G}\left[ -q\right] \). In the example discussed here, \(\mathcal {G}\) is even (see Eq. 36), so that \(\mathcal {G}_{r}=\mathcal {G}\). We remark that over the set of zonal flows \(\mathbf {v}=U(y)\mathbf {e}_{x}\), the nonlinear term of the quasi-geostrophic equation vanishes: \(\mathbf {v}\left[ q+h\right] \cdot \nabla q=0\). As a consequence, when the instanton remains a zonal flow, the fact that \(h\) has to be replaced by \(-h\) has no consequence. Let us now argue that the instanton is actually generically a zonal flow.

We assume for simplicity that the stochastic forces are homogeneous (invariant by translation in both directions). Then \(C\left( \mathbf {r},\mathbf {r'}\right) =C\left( \mathbf {r}-\mathbf {r'}\right) =C_{z}(y-y')+C_{m}(y-y',x-x')\) where

$$\begin{aligned} C_{z}(y)=\frac{1}{2\pi l_{x}}\int _{0}^{2\pi l_{x}}\, C(x,y)\,\mathrm{d}x \end{aligned}$$

is the zonal part of the correlation function, and \(C_{m}=C-C_{z}\) the non-zonal or meridional part.
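The decomposition \(C=C_{z}+C_{m}\) is straightforward to evaluate numerically. The following Python sketch approximates the zonal average by a rectangle rule on a periodic grid; the function names and the example correlation function are hypothetical choices made here for illustration, not taken from the text.

```python
import math


def zonal_part(C, lx, y, nx=256):
    """Approximate C_z(y) = (1 / (2 pi lx)) * int_0^{2 pi lx} C(x, y) dx
    by the rectangle rule on nx equispaced points of the periodic x-domain."""
    Lx = 2.0 * math.pi * lx
    return sum(C(Lx * i / nx, y) for i in range(nx)) / nx


def make_example_C(lx):
    """Hypothetical homogeneous correlation function: a zonal piece cos(y)
    plus a non-zonal piece that averages to zero over x."""
    def C(x, y):
        return math.cos(y) + 0.5 * math.cos(x / lx) * math.cos(2.0 * y)
    return C


def meridional_part(C, lx, x, y, nx=256):
    """Non-zonal part C_m = C - C_z evaluated at (x, y)."""
    return C(x, y) - zonal_part(C, lx, y, nx)
```

For the example above, `zonal_part` returns \(\cos (y)\) up to round-off, and `meridional_part` returns the remaining \(x\)-dependent term.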

As the nonlinear term of the quasi-geostrophic equation vanishes identically over the set of zonal flows, the relaxation dynamics has solutions within this set. If \(C_{z}\) is non-degenerate (positive definite as a correlation function), then relaxation paths exist and follow the gradient dynamics

$$\begin{aligned} \frac{\partial q}{\partial t}=-2\pi \alpha l_{x}\int _{0}^{2\pi }\, C_{z}(y-y')\frac{\delta \mathcal {G}}{\delta q(y')}\, \mathrm{d}y', \end{aligned}$$
(44)

where \(q=q(y)\) is the zonal potential vorticity field.

Moreover, as argued in Sect. 4.1, the fact that \(G\) (41) is a normal form for small enough \(a_{4}\), \(a_{6}\), and \(H\) implies that the gradient of \(\mathcal {G}\) in directions transverse to \(q=A\cos \left( y\right) +B\sin \left( y\right) \) is much steeper than the gradient of \(G\). As a consequence, at leading order the relaxation paths will be given by the relaxation paths for the effective two-degrees-of-freedom potential \(G\). Then, from (40), (41), and (44) we obtain that, at leading order, for the relaxation path given by (37–38), the dynamics of \(A\) and \(B\) are given by

$$\begin{aligned} \frac{\mathrm{d}A}{\mathrm{d}t}=-c\frac{\partial G}{\partial A}\quad \mathrm{and}\quad \frac{\mathrm{d}B}{\mathrm{d}t}=-c\frac{\partial G}{\partial B}, \end{aligned}$$

with \(c=\alpha l_{x}\int _{0}^{2\pi }\, C_{z}(y)\cos \left( y\right) \, \mathrm{d} y\), which is positive for a positive definite \(C_{z}\), so that \(G\) decreases along these paths. We recall that \(G\) is given by Eq. (41).

From this result the relaxation paths are easily computed. Using the fact that fluctuation paths are time reversed trajectories of relaxation paths, instantons are also easily obtained. One of the resulting relaxation paths (blue curve) and one of the instantons (red curve) are depicted in Fig. 4, overlaid on the contours of the potential \(G\) in the \((A,B)\)-plane. The two corresponding attractors, together with the saddle point and two examples of intermediate states, are shown in Fig. 5.
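The construction of a relaxation path and of the corresponding instanton can be sketched as follows. Since the explicit form of \(G\) (Eq. 41) is not reproduced in this section, the Python example uses a toy potential \(V(A,B)=\frac{1}{4}(A^{2}-1)^{2}+\frac{1}{4}(B^{2}-1)^{2}\), chosen here only because it also has four minima with \(\left| A\right| =\left| B\right| \) separated by saddles; it is an assumption for illustration and not the potential of the paper. The gradient dynamics is integrated with an explicit Euler scheme, and the instanton is obtained by reversing a relaxation path in time.

```python
def V(A, B):
    """Toy potential (an assumption, not Eq. 41): minima at (+-1, +-1)."""
    return 0.25 * (A * A - 1.0) ** 2 + 0.25 * (B * B - 1.0) ** 2


def grad_V(A, B):
    """Gradient (dV/dA, dV/dB) of the toy potential."""
    return A * (A * A - 1.0), B * (B * B - 1.0)


def relaxation_path(A0, B0, c=1.0, dt=1e-2, n_steps=5000):
    """Explicit Euler integration of dA/dt = -c dV/dA, dB/dt = -c dV/dB."""
    path = [(A0, B0)]
    A, B = A0, B0
    for _ in range(n_steps):
        gA, gB = grad_V(A, B)
        A, B = A - c * dt * gA, B - c * dt * gB
        path.append((A, B))
    return path


# The point (1, 0) is a saddle between the minima (1, -1) and (1, 1).
# Paths are started slightly off the saddle, on either side.
delta = 1e-3
relax = relaxation_path(1.0, delta)              # saddle -> attractor (1, 1)
instanton = relaxation_path(1.0, -delta)[::-1]   # attractor (1, -1) -> saddle
```

Concatenating `instanton` and `relax` gives a transition path from one attractor to the other through the saddle: the potential increases along the first part and decreases along the second.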

Fig. 4

Contour plot of the reduced potential surface \(G(A,B)\) (same as Fig. 1) with the superimposed transition path between two attractors (filled circles) via a saddle (filled square). The instanton (most probable fluctuation path from one attractor to the saddle) is shown by the solid red line, while the corresponding relaxation path from the saddle to the second attractor is given by the solid blue line. In this case, the instanton and the relaxation paths are actually the reverse of one another (Color figure online)

Fig. 5

The potential vorticity \(q(y)\) for two of the non-trivial attractors (solid black curves), the corresponding saddle between the attractors (dashed black curve), and two intermediate profiles along the instanton path (solid red curve) and the relaxation path (solid blue curve) (Color figure online)

4.3 Dimensional Analysis

In this section, we briefly discuss dimensional analysis for the dynamics (29), with topography \(h=H\cos \left( 2y\right) \), and potential \(\mathcal {G}\) given by Eqs. (35–36). We recall these equations for clarity:

$$\begin{aligned} \frac{\partial q}{\partial t}+\mathbf {v}\left[ q-H\cos \left( 2y\right) \right] \cdot \nabla q&= -\alpha \int _{\mathcal {D}}\, C(\mathbf{r}-\mathbf{r}')\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}')}\,\mathrm{d}\mathbf{r}'+\sqrt{2\alpha \gamma }\eta ,\end{aligned}$$
(45)
$$\begin{aligned} \mathbf {v}=\mathbf {e}_{z}\times \nabla \psi ,\quad \omega&= \Delta \psi , \quad q=\omega +H\cos \left( 2y\right) , \end{aligned}$$
(46)

with

$$\begin{aligned} \mathcal {G}=\int _{\mathcal {D}}\frac{q^{2}}{2}-a_{4}\frac{q^{4}}{4}+a_{6}\frac{q^{6}}{6}\, \mathrm{d}\mathbf{r}-\left( 1-\epsilon \right) \mathcal {E}. \end{aligned}$$
(47)

First, let us discuss a set of convenient non-dimensional units for our problem. We express lengths in units of the domain size. The dynamics involve the following parameters: \(\alpha \) (\(s^{-1}\)), \(\gamma \) (\(s^{2}\)), \(H\) (\(s^{-1}\)), \(a_{4}\) (\(s^{2}\)), \(a_{6}\) (\(s^{4}\)), \(\beta \) or \(\epsilon \) (no dimension), the aspect ratio \(l_{x}\) (no dimension), the force spectrum \(C\) (no dimension), the energy \(\mathcal {E}\) (\(s^{-2}\)), and the Casimirs \(\mathcal {C}\) (\(s^{-2}\)). We are mainly interested in the range of parameters for which the dynamics is bistable. Moreover, it will be especially useful to consider the perturbative regime close to the bifurcation described in Sect. 4.1. As a consequence, we choose \(\epsilon \ll 1\), \(a_{4}>0\) and \(a_{4}\) sufficiently small (as discussed below), and \(H\) sufficiently small (\(a_{4}H^{2}\ll \epsilon \) and \(a_{6}H^{2}\ll a_{4}\)) such that the phase transition is close to the one occurring for \(H=0\). These assumptions are made in order to get an explicit description of the phase transitions. However, the theory that predicts the transition rates and the instantons does not depend on these assumptions, and the same phenomenology remains valid beyond this perturbative regime.

As discussed in Sect. 4.1, with these hypotheses, the lower values of \(\mathcal {G}\) are approximated by the normal form \(G\) (41). From (41), we conclude that if we assume \(a_6 H^2 \ll a_4\), then the order of magnitude of \(A\), the amplitude of the large scale mode, is \(\left( \epsilon /a_{4}\right) ^{1/2}\). As we have chosen \(a_{4}H^{2}\ll \epsilon \), the correction due to the topography is of sub-leading order (see Eq. 43). The kinetic energy of the largest scale mode is then of the order \(\epsilon /a_{4}\). Accordingly, we choose \(\left( a_{4}/\epsilon \right) ^{1/2}\) as the time unit. We denote by \(H'=\left( a_{4}/\epsilon \right) ^{1/2}H\), \(\gamma '=\left( a_{4}/\epsilon \right) ^{3/2}\gamma \), \(\alpha '=\left( a_{4}/\epsilon \right) ^{1/2}\alpha \), \(a'_{6}=(\epsilon /a_{4})^{2}a_{6}\), and \(q'=\left( {a_{4}/\epsilon }\right) ^{1/2}q\) the dimensionless variables in this time unit. Dropping the primes, the non-dimensional equations read

$$\begin{aligned} \frac{\partial q}{\partial t}+\mathbf {v}\left[ q-H\cos \left( 2y\right) \right] \cdot \nabla q&= -\alpha \int _{\mathcal {D}}\, C(\mathbf{r}-\mathbf{r}')\frac{\delta \mathcal {G}}{\delta q(\mathbf{r}')}\,\mathrm{d}\mathbf{r}'+\sqrt{2\alpha \gamma }\eta ,\end{aligned}$$
(48)
$$\begin{aligned} \mathbf {v}=\mathbf {e}_{z}\times \nabla \psi ,\quad \omega&= \Delta \psi , \quad q=\omega +H\cos \left( 2y\right) , \end{aligned}$$
(49)

with

$$\begin{aligned} \mathcal {G}=\int _{\mathcal {D}}\,\frac{q^{2}}{2}-\epsilon \frac{q^{4}}{4}+a_{6}\frac{q^{6}}{6}\, \mathrm{d}\mathbf{r}-\left( 1-\epsilon \right) \mathcal {E}. \end{aligned}$$
(50)

Within these non-dimensional variables, \(\epsilon \) controls the distance to the bifurcation. The approximation of the large scale dynamics by a small number of modes will then be valid for \(\epsilon \ll 1\), and the approximation that the topography is a second order effect is controlled by \(H^{2}\ll \epsilon \), \(H^2 \ll 1\), and \(a_6 H^2 \ll \epsilon \) (this also implies \(a_{6}H^{4}\ll \epsilon \)).
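The change of variables described above is easy to mechanize. The Python sketch below applies the rescalings quoted in the text, with time unit \(\left( a_{4}/\epsilon \right) ^{1/2}\); the function name, the dictionary keys, and the values used for \(\alpha \) and \(\gamma \) in the usage example are hypothetical. The rescaled quartic coefficient, \(a_{4}\) divided by the square of the time unit, equals \(\epsilon \), which is why \(\epsilon \) multiplies \(q^{4}/4\) in (50).

```python
import math


def nondimensionalize(eps, a4, a6, H, alpha, gamma):
    """Rescale the parameters with the time unit T = (a4 / eps)**0.5."""
    T = math.sqrt(a4 / eps)
    return {
        "T": T,
        "H": T * H,              # H' = (a4/eps)^(1/2) H
        "gamma": T**3 * gamma,   # gamma' = (a4/eps)^(3/2) gamma
        "alpha": T * alpha,      # alpha' = (a4/eps)^(1/2) alpha
        "a6": a6 / T**4,         # a6' = (eps/a4)^2 a6
        "a4": a4 / T**2,         # equals eps, as in Eq. (50)
    }


# Usage with the parameter values of Fig. 1; alpha and gamma are set to 1.0
# only as placeholders, since no values are quoted for them.
nd = nondimensionalize(eps=1.6e-2, a4=6e-4, a6=3.6e-6, H=4.0,
                       alpha=1.0, gamma=1.0)
```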

We now give a qualitative picture of the dynamics. Recall that the stationary distribution of the stochastic process is given by \(P_{s}=Z^{-1}\exp \left( -\mathcal {G}/\gamma \right) \). The gradient of \(\mathcal {G}\) in the directions which are transverse with respect to the modes \(A\cos \left( y\right) +B\sin \left( y\right) \) is of order one, whereas the stochastic force is multiplied by \(\gamma ^{1/2}\). As a consequence, typical values of fluctuations for the stationary measure in these transverse directions are of order \(\gamma ^{1/2}\). Finally, the non-dimensional parameter \(\alpha \) controls the relative order of magnitude of the inertial (or Hamiltonian) part of the dynamics, compared to the dissipative gradient terms in (48).
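The scaling \(\gamma ^{1/2}\) can be made explicit with a Gaussian estimate. As a sketch, assume that along a single transverse coordinate \(x\) (a hypothetical scalar coordinate introduced here for illustration) the potential is locally quadratic around an attractor, with a curvature \(\kappa \) of order one. Then

$$\begin{aligned} \mathcal {G}\simeq \mathcal {G}_{0}+\frac{\kappa }{2}x^{2}\quad \Rightarrow \quad P_{s}(x)\propto \exp \left( -\frac{\kappa x^{2}}{2\gamma }\right) ,\quad \left\langle x^{2}\right\rangle =\frac{\gamma }{\kappa }, \end{aligned}$$

so that the root mean square fluctuation is \(\left( \gamma /\kappa \right) ^{1/2}\), of order \(\gamma ^{1/2}\) when \(\kappa \) is of order one.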

5 Conclusions and Perspectives

We have defined Langevin dynamics for two-dimensional and quasi-geostrophic turbulent flows. These dynamics have an energy-Casimir invariant measure. The dissipative part of the dynamics derives from a potential that is transverse to the Hamiltonian part of the dynamics. Moreover, the noise autocorrelation function is the same as the kernel defining the dissipative part. Under these hypotheses, the action is modified in a simple manner under time reversal. It is either symmetric, leading to detailed balance, or leads to a dual action which describes dynamics that belong to the same family of physical models. These symmetries put these Langevin dynamics in the framework of classical Langevin dynamics. For instance, fluctuation paths are time reversed trajectories of relaxation paths of the dual dynamics. This gives a very simple characterization of fluctuation paths, of large deviations, and of large deviation paths, when they exist.

We have proposed and analyzed cases with phase transitions, both continuous and discontinuous, and with a tricritical point. This opens the study of a rich phenomenology of processes, including bistable situations. These Langevin dynamics with exact theoretical predictions will be very useful benchmarks for future tests of numerical algorithms aimed at computing large deviations in turbulence problems [13, 37, 62].

Several interesting concepts could be developed in the future. These Langevin dynamics give examples of turbulence problems for which the recent results of stochastic thermodynamics could be extended, e.g. it would be very interesting to study Gallavotti-Cohen fluctuation relations [27], or entropy production [3, 40] in this setup. The temporal response of the system to external driving or change of parameters could also be studied in relation to recently studied non-equilibrium linear response for Markovian dynamics [2, 48].

Let us come back to two important and related issues not discussed in this paper. Firstly, is it possible to give a clear mathematical meaning to the Langevin dynamics (29), given that it may involve very rough forces through the noise term, or smooth noise combined with very weak friction? Secondly, for the dynamics (29), will large deviation results (28) be valid? Similar questions have been addressed in the past in the context of the Allen-Cahn or stochastic Ginzburg-Landau equations, related to stochastic quantization [25, 42], with very appealing new results in larger dimensions [31, 32]. In order to discuss these two questions in a fluid mechanical context, let us consider a special case of Langevin dynamics (29), with \(C(\mathbf{r},\mathbf{r}')=\Delta \delta (\mathbf {r}-\mathbf {r}')\), corresponding to the enstrophy ensemble (see Sect. 3.4). From a physical point of view, it has long been recognized that these dynamics cannot be given a simple physical interpretation. Indeed, for the enstrophy measure, the expectations of both the energy and enstrophy are infinite. Even the expectation for the velocity field is not defined, and most of the realizations do not lead to a physical velocity field. This is related to some of the mathematical results in [6]. These remarks give a negative answer to the first question. Still, it has been observed [12] that, at a formal level, the minimization of the action can be computed explicitly and leads to a quasi-potential which is indeed the enstrophy, as may have been expected. A natural physical question is then to understand what happens if the noise is regularized at a scale \(\delta \), much smaller than the domain size. Recently, we have become aware of the work [18], which actually considers this problem. Their mathematical result is that for any finite \(\delta \) the dynamics are well defined. Moreover, for any finite \(\delta \), a large deviation principle for exit times from a bounded domain holds when the noise amplitude goes to zero (when \(\gamma \) goes to zero in our notation, see Eq. 29). These large deviations are actually described by the minimization of the action functional (17–18), with a kernel \(C_{\delta }\) taking into account the noise regularization. When \(\delta \) goes to zero, the large deviation functional and the minimizers of the action actually converge to the ones corresponding to the enstrophy ensemble [18]. These results justify the formal computation in [12], and equivalent results would justify the formal computations presented in this work. However, we stress that for these results to hold, the order of the limits (first \(\gamma \rightarrow 0\), and \(\delta \rightarrow 0\) afterwards) is crucial.

As discussed above, for the enstrophy ensemble, it is necessary to regularize the noise first in order to obtain meaningful dynamics. However, it is not yet clear for which kernels \(C\) or potentials \(\mathcal {G}\) such a regularization is necessary. For instance, when \(a_{4}<0\) or \(a_{6}>0\) (see Eq. 36), or more generally with a potential controlling the extremal values of the vorticity field, such a regularization may be unnecessary. This question could be the subject of further studies. The dynamics could also be regularized at the level of the dissipation, for instance by adding small scale dissipation in the form of hyperviscosity with a small coefficient.

To conclude, we stress once more that for applications it would be desirable to go beyond the Langevin dynamics considered in this paper. A first step could be the derivation of the slow dynamics of zonal jets in quasi-geostrophic models [14], followed by large deviation computations. We will consider progress in this direction, and in others, in future work.