Abstract
Sharp large deviation estimates for stochastic differential equations with small noise, based on minimizing the Freidlin–Wentzell action functional under appropriate boundary conditions, can be obtained by integrating certain matrix Riccati differential equations along the large deviation minimizers or instantons, either forward or backward in time. Previous works in this direction often rely on the existence of isolated minimizers with positive definite second variation. By adopting techniques from field theory and explicitly evaluating the large deviation prefactors as functional determinant ratios using Forman’s theorem, we extend the approach to general systems where degenerate submanifolds of minimizers exist. The key technique for this is a boundary-type regularization of the second variation operator. This extension is particularly relevant if the system possesses continuous symmetries that are broken by the instantons. We find that removing the vanishing eigenvalues associated with the zero modes is possible within the Riccati formulation and amounts to modifying the initial or final conditions and evaluation of the Riccati matrices. We apply our results in multiple examples including a dynamical phase transition for the average surface height in short-time large deviations of the one-dimensional Kardar–Parisi–Zhang equation with flat initial profile.
1 Introduction
In its classical formulation, large deviation theory (LDT) is often used to gain access to the limiting behavior of probabilities or expectations at an approximate, i.e. exponential scale, which is the content of notions such as large deviation principles in general or Varadhan’s lemma (see e.g. [1]). However, in any practical application where quantitative estimates are required, it is desirable to refine such an analysis to get absolute and asymptotically correctly normalized results instead of mere scaling for the probabilities of rare events, effectively supplementing the exponential LDT estimate by a sub-exponential prefactor. Such precise Laplace asymptotics, which are the subject of this paper for the specific scenario of stochastic differential equations (SDEs) subject to small Gaussian noise, have a long history [2].
In the past decades, sample path LDT or Freidlin–Wentzell theory [3] and the related notion of instanton calculus in theoretical physics [4, 5] have been widely applied as a tool to study rare event probabilities in stochastic dynamical systems, either numerically, e.g. in [6,7,8,9,10], or through analytical treatment of the corresponding minimization problems, e.g. in [11,12,13,14,15]. Reviews of the theory, highlighting connections of large deviation theory to field-theoretic methods and optimal fluctuations or instantons in theoretical physics, are given by [16,17,18]. In the metastable setup for reversible systems, prefactor corrections are classical [19, 20], and recent generalizations and rigorous progress have been made [21,22,23,24,25]. With some notable exceptions such as [26,27,28], however, most of the work for general irreversible systems and extreme events has focused only on exponential asymptotics using the large deviation minimizers themselves, which are solutions to a deterministic optimization problem. As an additional, concrete motivation to go beyond such rough estimates in practical applications, it has been pointed out very recently that for assessing the relative importance of different instantonic transition paths, knowledge of the LDT prefactor at leading order may be vital even at comparably small noise strengths [29].
In the last year, there has been a lot of activity to provide generic numerical tools that also allow for the computation of the leading order term of the large deviation prefactor for the statistics of final time observables of small noise ordinary SDEs using symmetric Riccati matrix differential equations, either forward or backward in time [30,31,32,33]. In an abstract setting, expressions for prefactors in this context, even at arbitrarily high order, have already been known rigorously since the 1980s [2, 34,35,36] and are, not surprisingly, related to a certain operator determinant at the leading order. The Riccati formalism then allows one to compute such determinants in a closed form through the solution of an initial value problem instead of eigenvalue computations (see [37] for a recent work in the latter direction, as well as [38]), much in the spirit of the classical Gel’fand-Yaglom technique in quantum mechanics [39] or its later generalization via Forman’s theorem [40]. This is advantageous if either, from a numerical point of view, the spatial dimension of the system is not too large, with the Riccati matrix being of size \(n \times n\) for an n-dimensional SDE, or if an analytical analysis of the resulting equations is desired. Our first contribution in this paper is to make the connection to functional determinants more precise and to add a fourth derivation, which makes explicit use of Forman’s theorem, to the existing derivations of the Riccati equations using (i) a WKB analysis of the Kolmogorov backward equation [31], (ii) a discretization approach of the path integral [30], or (iii) the use of the Feynman-Kac formula for Gaussian fluctuations [30, 33]. Furthermore, in contrast to previous derivations, we also include the case of Itô SDEs with multiplicative noise here.
In general, we stress the technical advantage of working with the moment-generating function (MGF) as the principal quantity of interest here, only later transforming onto probabilities or probability density functions (PDFs).
This groundwork then opens the way to treat a new class of problems using Riccati equations compared to the previous works. Notably, all of the cited previous works on this approach have been limited to unique or at least isolated large deviation minimizers with positive definite second variation of the associated functional at the minimizers. In contrast to this, we extend the Riccati approach to cases where compact submanifolds of minimizers exist. There, the application of the infinite-dimensional Laplace method requires the removal of the zero eigenvalues of the corresponding second variation operator, as discussed in a general setting in [34] already. The eigenfunctions corresponding to these zero eigenvalues are usually called zero modes. In the context of mean transition times in the small noise limit, a paper that deals with related problems is [41]. Carrying out the procedure described above through a boundary-value type regularization that builds on the work of [42] among others, we obtain Riccati equations with suitably regularized initial or final conditions in this paper that implicitly remove the divergences that would otherwise be encountered in the solution of the Riccati equations.
Situations where degenerate families of instantons exist are in fact far from pathological. Importantly, many stochastic dynamical systems, in particular stochastic partial differential equations (SPDEs) motivated from physics, possess certain symmetries, such that the equations of motion are invariant e.g. under translations, rotations, Galilei transformations and so forth. If, in addition to the SDE itself, the observable whose statistics are computed has the same symmetries, then it is possible to search for unique minimizers or instantons of the large deviation minimization problem obeying the same symmetry. Generically, however, the global minimum will not be attained this way, but instead the true minimizer will break the symmetry and hence be comprised of a family of equivalent possible solutions related by the symmetry group of the system. Of particular interest is the case of a dynamical phase transition, where this symmetry breaking happens spontaneously with the extremeness of the rare event under consideration as the control parameter. Relevant examples of this phenomenon in the context of sample path LDT include the one-dimensional Kardar–Parisi–Zhang (KPZ) equation [43,44,45,46] for the surface height at one point in space and with two-sided Brownian motion initial condition (leading to discrete mirror symmetry breaking), the two-dimensional [47] and three-dimensional [48] incompressible Navier-Stokes equations and a Lagrangian turbulence model [49] (all with rotational symmetry breaking). In all of these cases, due to the underlying symmetries, it turns out that it suffices to integrate a single Riccati equation, corresponding to a single reduced functional determinant evaluation, which thereby allows for a generalization of earlier results [30,31,32,33] without increasing the computational costs. 
In addition to the examples listed above, further systems where the methods and results of this paper could be applied are those within the scope of the macroscopic fluctuation theory [50], e.g. the Kipnis-Marchioro-Presutti model on a ring where a dynamical phase transition for the current due to translational symmetry breaking is known to occur [51, 52].
Regarding limitations of this paper, we consider only systems where the drift term of the SDE has a unique, stable fixed point. Further, we do not explicitly discuss the extension to infinite time intervals which could be done through an appropriate geometric parameterization [53] that could be incorporated similar to [31]. We formulate our general results only for ordinary stochastic differential equations in \(\mathbb {R}^n\), and leave the (at least on a purely formal level) simple extension towards stochastic partial differential equations to the reader, treating this extension only by means of an example in this paper. The presentation throughout, which is based on stochastic path integrals, is not rigorous in favor of intuition and brevity, while still using a structure in terms of propositions, lemmas and derivations for clarity.
This paper is organized as follows: In Sect. 2, we start with the rederivation of known Riccati matrix results for unique large deviation minimizers with positive definite second variation. We introduce the general setup in Subsect. 2.1 and give the main results for prefactors of MGFs in Subsect. 2.2. The transformation onto PDF prefactors is carried out in Subsect. 2.3. Afterwards, Sect. 3 follows the same structure for the zero mode case. In Subsect. 3.1, we briefly motivate degenerate Laplace asymptotics in finitely many dimensions and then derive analogous results to Subsects. 2.2 and 2.3 in Subsect. 3.2 and 3.3. Afterwards, we consider four specific examples with degenerate instantons in Sect. 4 and compare the result of our leading order degenerate Laplace expansion to known theoretical results or direct sampling of the SDEs at hand. In addition to three finite-dimensional systems, we also deal with a dynamical phase transition in an irreversible one-dimensional stochastic partial differential equation (SPDE) in this section, namely the KPZ equation where we investigate the probability distribution of the average surface height at short times with flat initial condition. We conclude the paper with a discussion of the results and comments on future extensions in Sect. 5. Appendix A contains the general statement of Forman’s theorem for second order ordinary differential operators as well as general Lagrangian and Hamiltonian formulations of the theorem for second variation operators. Appendix B states a general expression for the MGF prefactor in the non-degenerate case for an arbitrary continuous time Markov process satisfying a large deviation principle as a reference. Finally, appendix C deals with an analytical computation for the LDT prefactor in the KPZ equation when expanding around the spatially homogeneous instantons of Subsect. 4.4.
2 Prefactor in the Nondegenerate Case
2.1 Freidlin–Wentzell Theory Setup
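For \(n \in \mathbb {N}\) and \(\varepsilon > 0\), we consider the Itô SDE
$$\begin{aligned} \textrm{d}X^\varepsilon _t = b\left( X^\varepsilon _t\right) \textrm{d}t + \sqrt{\varepsilon } \, \sigma \left( X^\varepsilon _t\right) \textrm{d}B_t\,, \quad X^\varepsilon _0 = x\,, \end{aligned}$$ (1)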
on the finite time interval [0, T], \(T>0\), with multiplicative Gaussian noise. We assume that the process starts deterministically at \(x \in \mathbb {R}^n\). The drift \(b : \mathbb {R}^n \rightarrow \mathbb {R}^n\) is not necessarily gradient. We assume it to be sufficiently smooth and to possess only a single fixed point \(x_* \in \mathbb {R}^n\), which is stable. The process \(B = \left( B_t \right) _{t \in [0,T]}\) is a standard n-dimensional Brownian motion, and the diffusion matrix \(a := \sigma \sigma ^\top :\mathbb {R}^n \rightarrow \mathbb {R}^{n \times n}\), also assumed to be sufficiently smooth as well as nonvanishing at x, is not necessarily diagonal or invertible (see Footnote 1).
We are interested in obtaining precise estimates, as the noise strength \(\varepsilon \) tends to zero, for the PDF \(\rho _f^\varepsilon :\mathbb {R}\rightarrow [0, \infty )\) of a random variable \(f(X^\varepsilon _T)\) where \(f:\mathbb {R}^{n} \rightarrow \mathbb {R}\) is a possibly nonlinear observable of the process \(X^\varepsilon \) at final time T. Typically, we are interested in situations where n is large, as in the (semi-)discretization of an SPDE, and f corresponds to the observation of a real-valued physical quantity that is characteristic for a process described by an SPDE, either at a single point in space or averaged over the spatial volume. In the limit \(\varepsilon \downarrow 0\), it is intuitive that trajectories \(\left( X^\varepsilon _t\right) _{t \in [0,T]}\) concentrate around the deterministic trajectory \(\phi _0\) solving
$$\begin{aligned} {\dot{\phi }}_0 = b(\phi _0)\,, \quad \phi _0(0) = x\,. \end{aligned}$$ (2)
LDT tells us that this concentration happens exponentially fast in \(\varepsilon \), and deviations from this deterministic behavior correspond to rare events.
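The Freidlin–Wentzell rate (or action) functional that governs the concentration of the path measure on \(\phi _0\) is given by [3]
$$\begin{aligned} S[\phi ] = {\left\{ \begin{array}{ll} \int _0^T L(\phi , {\dot{\phi }}) \, \textrm{d}t\,, &amp; \phi \in AC\left( [0,T], \mathbb {R}^n \right) \,, \\ +\infty \,, &amp; \text {otherwise}\,, \end{array}\right. } \quad L(\phi , {\dot{\phi }}) = \frac{1}{2} \left\langle {\dot{\phi }} - b(\phi ), a^{-1}(\phi ) \left[ {\dot{\phi }} - b(\phi ) \right] \right\rangle _{n}\,, \end{aligned}$$ (3)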
where \(a^{-1}\) is the Moore-Penrose inverse of a, \(\left\langle \cdot , \cdot \right\rangle _{n}\) is the standard Euclidean inner product on \(\mathbb {R}^n\) and \(AC\left( [0,T], \mathbb {R}^n \right) \) is the space of absolutely continuous paths \(\phi :[0,T] \rightarrow \mathbb {R}^n\). Note that we will treat a as invertible below, but no final result will contain any inverse of a, and all results remain valid if the limit to singular diffusion matrices is considered carefully. The asymptotic LDT estimate for the PDF \(\rho _f^\varepsilon \) as \(\varepsilon \downarrow 0\) reads
We call \(I_f\) the rate function of the observable. The minimizer \(\phi _z\), also termed the instanton, is a solution to the constrained minimization problem (4), and thus satisfies the first order necessary conditions in Hamiltonian form (cf. the derivation of Proposition 2.2.1)
where \(\theta _z = \partial L(\phi _z, {{\dot{\phi }}}_z) / \partial {\dot{\phi }}\) is the conjugate momentum of the instanton \(\phi _z\), and \(\lambda _z \in \mathbb {R}\) is a Lagrange multiplier, suitably chosen to enforce the final time constraint \(f\left( \phi _z(T)\right) = z\). Comparing (5) to the SDE (1) indicates that \(\eta _z = \sigma ^\top (\phi _z) \theta _z\) can be interpreted as the optimal (in the sense of most likely) forcing realization that drives the system towards the outcome \(f(X^\varepsilon _T) = z\).
The mere exponential scaling estimate from Freidlin–Wentzell theory, as given in (4), can be refined to next order to obtain a prefactor estimate in the small noise limit. These refinements rely on the fact that a sample path large deviation estimate formally corresponds to an infinite dimensional application of Laplace’s method, and higher order estimates can then be obtained by integrating the Gaussian integral of the second variation around the minimizer to obtain a ratio of determinants as prefactor. In this section, we will rederive the results of [30, 31] following this strategy, including the explicit evaluation of the appearing functional determinants using Forman’s theorem. Importantly, we only consider the case of unique instantons and positive definite second variations in this section.
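The role of the sub-exponential prefactor can already be seen in one dimension. The following minimal sketch compares the leading-order Laplace approximation \(\sqrt{2 \pi \varepsilon / S''(x_*)} \, e^{-S(x_*)/\varepsilon }\) of \(\int e^{-S(x)/\varepsilon } \textrm{d}x\) with direct quadrature. The test action \(S(x) = \cosh (x) - 1\) and all function names are illustrative assumptions and do not appear elsewhere in this paper.

```python
import numpy as np

def laplace_estimate(s_min, s_hess, eps):
    """Leading-order Laplace approximation of int exp(-S(x)/eps) dx in 1D."""
    return np.sqrt(2.0 * np.pi * eps / s_hess) * np.exp(-s_min / eps)

def quadrature(action, eps, half_width=5.0, num=200001):
    """Reference value from the trapezoidal rule on a fine uniform grid."""
    x = np.linspace(-half_width, half_width, num)
    y = np.exp(-action(x) / eps)
    dx = x[1] - x[0]
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))

def action(x):
    # Hypothetical test action: unique minimum at x = 0, S(0) = 0, S''(0) = 1.
    return np.cosh(x) - 1.0

# Ratio of the quadrature value to the Laplace estimate for decreasing noise.
ratios = {eps: quadrature(action, eps) / laplace_estimate(0.0, 1.0, eps)
          for eps in (0.1, 0.01, 0.001)}
```

The ratios approach 1 as \(\varepsilon \) decreases, which is the finite-dimensional analogue of the statement that the determinant prefactor renders the estimate asymptotically sharp rather than merely correct on the exponential scale.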
In Sect. 3, we will then demonstrate that the approach can be generalized to SDEs and observables with degenerate instantons which are rendered non-unique due to an underlying symmetry of the system. While an extension towards multiple isolated global minimizers of the action functional is trivially achieved by simply summing over the contributions of each individual minimizer, we here consider the case of a degenerate family of instantons that define an r-dimensional submanifold \({{\mathcal {M}}}^r_z\) with \(r \in \left\{ 1, \dots , n \right\} \) in the space of all permitted paths \(\phi :[0,T] \rightarrow \mathbb {R}^n\) that fulfill the boundary conditions \(\phi (0) = x\), \(f(\phi (T)) = z\), such that the action functional S is globally minimized and constant on \({{\mathcal {M}}}^r_z\). In order to formally derive an analogous procedure in this case, we will rely on well-known tools from field theory, where the spontaneous symmetry breaking of instantons is known to generate zero modes or Nambu-Goldstone modes that need to be explicitly integrated out. The small noise expansion for sample path large deviations then necessitates removing zero eigenvalues from the second variation of the action at the instanton.
2.2 Moment-Generating Function Prefactor Estimates for Freidlin–Wentzell Theory with Unique Instantons
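We define the moment-generating function (MGF) of the real-valued random variable \(f(X^\varepsilon _T)\) as
$$\begin{aligned} A_f^\varepsilon (\lambda ) = \mathbb {E}\left[ \exp \left\{ \frac{\lambda }{\varepsilon } f(X^\varepsilon _T) \right\} \right] \end{aligned}$$ (6)
and assume in the remainder of this paper that the scaled cumulant-generating function
$$\begin{aligned} \lim _{\varepsilon \downarrow 0} \varepsilon \log A_f^\varepsilon (\lambda ) \end{aligned}$$ (7)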
exists in \(\mathbb {R}\) for all \(\lambda \in \mathbb {R}\). For systems and observables where this assumption is not fulfilled, a convexification of the rate function \(I_f\) through a reparameterization of the observable as in [54] makes our results applicable.
We will proceed to derive precise large deviation results for \(A_f^\varepsilon \), which is simpler on a technical level than directly computing the PDF, and only afterwards perform an inverse Laplace transform onto the PDF, which can again be evaluated by a saddlepoint approximation as \(\varepsilon \downarrow 0\).
Remark 2.2.2
We set \((\nabla b)_{ij} = \partial _j b_i\) and use the short-hand notations \(\left[ \left\langle \nabla ^2 b(\phi ), \theta \right\rangle _{n} \right] _{ij} := \sum _{k=1}^n\partial _i \partial _j b_k(\phi ) \theta _{k}\) as well as \(\left[ \nabla a(\phi ) \theta \right] _{ij} = \sum _{k=1}^n \partial _j a_{ik}(\phi ) \theta _k\) and \(\left[ \left\langle \theta , \nabla ^2 a(\phi ) \theta \right\rangle _n \right] _{ij} =\sum _{k=1}^n\sum _{l=1}^n \partial _i \partial _j a_{kl}(\phi ) \theta _k \theta _l\). The precise meaning of the ratio of functional determinants in (14) will be explained below, where we will also rederive efficient computational methods in order to evaluate it. Throughout this paper, we denote functional determinants by \({{\,\textrm{Det}\,}}\) with the boundary conditions under which the determinant is computed as a subscript, whereas ordinary matrix determinants are written as \(\det \) with the dimension of the respective matrix as a subscript. The operator a in the functional determinants in (14) is to be understood as pointwise multiplication with \(a(\phi (t))\) for all \(t \in [0,T]\).
Remark 2.2.3
The exponent
in (13) is (minus) the Legendre-Fenchel transform of the rate function \(I_f\) evaluated at \(\lambda \), which yields the scaled cumulant-generating function and is finite by assumption.
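As an illustration of this duality, the following sketch evaluates the Legendre-Fenchel transform \(\sup _z \left[ \lambda z - I(z) \right] \) numerically on a grid. It assumes a hypothetical Gaussian rate function \(I(z) = z^2 / (2v)\), for which the transform is \(v \lambda ^2 / 2\); the parameter v and the function names are assumptions made only for this example.

```python
import numpy as np

def legendre_fenchel(rate, z_grid, lam):
    """Numerical sup_z [lam * z - I(z)] over a fixed grid of z values."""
    return np.max(lam * z_grid - rate(z_grid))

v = 0.5  # hypothetical variance parameter of a Gaussian observable

def rate(z):
    # Hypothetical quadratic (Gaussian) rate function I(z) = z^2 / (2 v).
    return z**2 / (2.0 * v)

z_grid = np.linspace(-10.0, 10.0, 200001)
lams = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])

scgf_numeric = np.array([legendre_fenchel(rate, z_grid, lam) for lam in lams])
scgf_exact = 0.5 * v * lams**2  # closed-form transform for the Gaussian case
```

For a strictly convex rate function the supremum is attained at the z with \(I'(z) = \lambda \), here \(z = v \lambda \), which lies well inside the chosen grid.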
Derivation of Proposition 2.2.1
We express the MGF \(A_f^\varepsilon \) at \(\lambda \in \mathbb {R}\) as a Wiener path integral over all realizations of the increments \(\eta = \textrm{d}B / \textrm{d}t\) of the Brownian motion B on [0, T]
where \(X^\varepsilon _T[\eta ]\) indicates that \(X^{\varepsilon }_T\) is a functional of the realization \(\eta \) of the noise, and we divide by the “free” path integral \(\int {{\mathcal {D}}} \eta \; \exp \left\{ -\frac{1}{2} \int _0^T \left\langle \eta , \eta \right\rangle _n \textrm{d}t \right\} \) to ensure correct normalization
of the path measure. We now perform a change of variables \(\eta \rightarrow X^\varepsilon \) in the path integrals, which necessitates including the correction terms
for a midpoint discretization of the path integral (see [55, 56] and in particular [57] for a detailed discussion), so that the rules of standard calculus apply in the subsequent expansion around the instanton. We obtain
where S is the Freidlin–Wentzell action functional (3). Both path integrals have a free right boundary and hence consider all paths that start at x, regardless of their final position at \(t = T\). The only difference is the final time boundary term in the numerator, which imposes different boundary conditions for the first and second variation of the action functional. We apply an infinite-dimensional version of Laplace’s method to both path integrals in the small noise limit \(\varepsilon \downarrow 0\), which leads to the computation of a ratio of functional determinants for the pre-exponential factor. Note that the additional terms in the exponent originating from C are irrelevant for the determination and expansion around the minimum as \(\varepsilon \downarrow 0\), and will just be evaluated at the expansion point.
For the denominator of (19), the first variation of the action around a fixed path \(\phi \) becomes
where \(\theta \) is the conjugate momentum of \(\phi \). Since \(\phi (0) = x\) due to the only boundary condition of the path integral, we have \(\gamma (0) = 0\) for all variations. Demanding that the first variation around \(\phi \) should vanish hence imposes the natural boundary condition \(\theta (T) = 0\) for a stationary path. We conclude that the deterministic trajectory \(\phi _0\) with vanishing momentum \(\theta _0(t) \equiv 0\) is the unique stationary point of the action functional in the denominator of (19) with \(S[\phi _0] = 0\). Expanding S around \(\phi _0\) to second order as in appendix A, we see that in addition to \(\gamma (0) = 0\), the variations need to satisfy \(\zeta (T)=0\) for the boundary term \(\tfrac{1}{2} \left\langle \gamma , \zeta \right\rangle _n |_0^T\) to vanish in the path integral [58], i.e. we obtain the boundary conditions (11) for \(\lambda = 0\). Hence
where we used the expansion
Note that, for any discretization \(0 = t_0< t_1< \dots < t_K = T\) of the time interval [0, T] with spacing \(\Delta t = T / K\), the Jacobian of this transformation cancels the divergent normalization constants of the discrete path measure
and also leads to a second order coefficient of the second variation operator of \(-1\) in the determinant
For the expansion of the numerator of (19), we first need to determine the instanton \(\phi _\lambda \) (with conjugate momentum \(\theta _\lambda \)) which minimizes S under the given boundary conditions. Additionally expanding the term \(-\lambda f(\phi (T)) \) around \(\phi _\lambda \) results in the first order necessary conditions (5) for a stationary path \(\phi _\lambda \). The boundary conditions of the fluctuations \(\gamma \) are given by \(\gamma (0) = 0\), and, taking into account the additional boundary term \(-\tfrac{\lambda }{2} \left\langle \gamma (T), \nabla ^2 f(\phi _\lambda (T)) \gamma (T) \right\rangle _n\) as well as the boundary term \(\tfrac{1}{2} \left\langle \gamma , \zeta \right\rangle _n |_0^T\) from the general expansion in appendix A,
i.e. the boundary conditions (11) (cf. [59, 60] for examples of path integrals with similar boundary conditions). Proceeding with the application of Laplace’s method to the numerator in (19) with these boundary conditions for the fluctuations, we conclude that
The functional determinants in Proposition 2.2.1 can either be defined as the (divergent) product of all eigenvalues of the differential operator under the boundary conditions in question when suitable ratios of operator determinants are considered, or individually via zeta function regularization [61]; see e.g. [62] for a short introduction. Since the top order coefficient of both operators in Proposition 2.2.1 is identical (and equal to -1), the spectra of the two operators should agree for asymptotically large eigenvalues and we can expect their determinant ratio to be finite. This idea is made precise for example by using Forman’s theorem [40], which is a generalization of the initial work of Montroll [63], Gel’fand and Yaglom [39] and others on ratios of functional determinants of Schrödinger operators in quantum mechanics. While the results of [40] are valid for the general case of elliptic differential operators on Riemannian manifolds, we only need the special case of second order ordinary differential operators on finite time intervals as stated in appendix A. In a Hamiltonian formulation in terms of fluctuations and momentum fluctuations, applying the general proposition A.2 to the Freidlin–Wentzell action (3) directly yields the following proposition in order to evaluate the ratio of functional determinants in (14):
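The idea that a determinant ratio can be obtained from an initial value problem instead of the spectrum can be checked on the textbook case of the scalar operators \(-\textrm{d}^2/\textrm{d}t^2 + \omega ^2\) and \(-\textrm{d}^2/\textrm{d}t^2\) with Dirichlet boundary conditions on [0, T], where the Gel’fand-Yaglom ratio is \(\sinh (\omega T) / (\omega T)\). The sketch below is not the Freidlin–Wentzell operator of this paper; the operator, the parameter values and the function names are assumptions chosen for illustration.

```python
import numpy as np

def gelfand_yaglom(omega, T, steps=2000):
    """Integrate y'' = omega^2 y, y(0) = 0, y'(0) = 1 with RK4 and return y(T)."""
    A = np.array([[0.0, 1.0], [omega**2, 0.0]])
    y = np.array([0.0, 1.0])
    h = T / steps
    for _ in range(steps):
        k1 = A @ y
        k2 = A @ (y + 0.5 * h * k1)
        k3 = A @ (y + 0.5 * h * k2)
        k4 = A @ (y + h * k3)
        y = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y[0]

def eigenvalue_product(omega, T, n_modes=200000):
    """Truncated product of the Dirichlet eigenvalue ratios 1 + (omega T / (k pi))^2."""
    k = np.arange(1, n_modes + 1)
    return np.exp(np.sum(np.log1p((omega * T / (np.pi * k)) ** 2)))

omega, T = 1.3, 2.0
ratio_ivp = gelfand_yaglom(omega, T) / gelfand_yaglom(0.0, T)   # initial value problem
ratio_spec = eigenvalue_product(omega, T)                        # slow spectral route
ratio_exact = np.sinh(omega * T) / (omega * T)
```

The initial value problem reproduces the closed form to near machine precision with a few thousand time steps, whereas the truncated eigenvalue product converges only algebraically in the number of modes.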
Remark 2.2.5
We call (27) the (first order) Jacobi equation for the Freidlin–Wentzell action functional (3). Expressing it in terms of \(\gamma \) and \({\dot{\gamma }}\), i.e. from a Lagrangian instead of a Hamiltonian perspective, the Jacobi equation can equivalently be stated as a second order ordinary differential equation
with the Freidlin–Wentzell Jacobi operator \(\Omega \) defined in (10). This transformation is carried out explicitly for a general action functional in appendix A.
Remark 2.2.6
A particularly convenient aspect of proposition 2.2.4 is the fact that it makes the dependence of the functional determinants on the boundary conditions very transparent and easy to calculate. We just need any fundamental system of solutions \(\Upsilon \) for each of the operators \(\Omega \), which is entirely independent of the imposed boundary conditions, and then, for given boundary condition matrices M, N, we can immediately evaluate the right-hand side of (29) from our knowledge of the \(\Upsilon \)’s. The separation of the fundamental system of solutions and boundary condition dependence is the crucial feature that allows for the treatment of zero eigenvalues via boundary perturbations later.
Remark 2.2.7
Since \(\Gamma [\phi ]\) is traceless, \(\det \Upsilon _\lambda (t)\) and \(\det \Upsilon _0(t)\) are constant for all \(t \in [0,T]\).
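This is a consequence of Liouville's formula \(\tfrac{\textrm{d}}{\textrm{d}t} \det \Upsilon = \textrm{tr}(\Gamma ) \det \Upsilon \). A minimal numerical check, assuming a hypothetical time-dependent coefficient matrix of Hamiltonian block form \(\left( {\begin{matrix} B &amp; A \\ C &amp; -B^\top \end{matrix}} \right) \) (the specific entries below are made up for illustration), could look as follows:

```python
import numpy as np

def gamma(t):
    """Hypothetical traceless 4x4 coefficient matrix of block form [[B, A], [C, -B^T]]."""
    B = np.array([[-1.0, 0.3 * np.sin(t)], [0.2, -0.5]])
    A = np.array([[2.0, 0.0], [0.0, 1.0]])
    C = np.array([[0.1 * np.cos(t), 0.0], [0.0, -0.2]])
    return np.block([[B, A], [C, -B.T]])

def fundamental_solution(T, steps=4000):
    """RK4 integration of dY/dt = gamma(t) Y with Y(0) equal to the identity."""
    Y = np.eye(4)
    h = T / steps
    for i in range(steps):
        t = i * h
        k1 = gamma(t) @ Y
        k2 = gamma(t + 0.5 * h) @ (Y + 0.5 * h * k1)
        k3 = gamma(t + 0.5 * h) @ (Y + 0.5 * h * k2)
        k4 = gamma(t + h) @ (Y + h * k3)
        Y = Y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return Y

Y_T = fundamental_solution(T=1.0)
det_T = np.linalg.det(Y_T)  # stays at det Y(0) = 1 up to integration error
```

The integrator itself does not preserve the determinant exactly, so the deviation of `det_T` from 1 reflects only the discretization error of the time stepping.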
Remark 2.2.8
Some examples, treated in [42], for typical boundary conditions encountered in physics and their representations in terms of matrices \(M,N \in \mathbb {R}^{2n \times 2n}\) (which are unique up to \(\text {GL}(2n)\) transformations) are
- (i) Dirichlet boundary conditions \(\gamma (0) = \gamma (T) = 0\):
$$\begin{aligned} M_{\text {Dirichlet}} = \left( \begin{array}{c|c} 1_{n \times n} & 0_{n \times n}\\ \hline 0_{n \times n} & 0_{n \times n} \end{array} \right) \,, \quad N_{\text {Dirichlet}} = \left( \begin{array}{c|c} 0_{n \times n} & 0_{n \times n}\\ \hline 1_{n \times n} & 0_{n \times n} \end{array} \right) \,. \end{aligned}$$ (31)
In quantum mechanics, functional determinants of operators with Dirichlet boundary conditions typically appear in the computation of semi-classical propagators.
- (ii) Periodic (antiperiodic) boundary conditions \(\gamma (0) = p \cdot \gamma (T)\), \(\zeta (0) = p \cdot \zeta (T)\) with \(p = 1\) (\(p = -1\)):
$$\begin{aligned} M_p = \left( \begin{array}{c|c} 1_{n \times n} & 0_{n \times n}\\ \hline 0_{n \times n} & 1_{n \times n} \end{array} \right) \,, \quad N_p = \left( \begin{array}{c|c} -p \cdot 1_{n \times n} & 0_{n \times n}\\ \hline 0_{n \times n} & -p \cdot 1_{n \times n} \end{array} \right) \,. \end{aligned}$$ (32)
Functional determinants with periodic (antiperiodic) boundary conditions need to be evaluated for the calculation of partition functions and other thermal averages of bosons (fermions) in quantum statistical physics and field theory.
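To see how such boundary matrices enter in practice, the following sketch uses the Dirichlet pair (31) for \(n = 1\) and the scalar operator \(-\textrm{d}^2/\textrm{d}t^2 + \omega ^2\), written as a first order system for \((\gamma , {\dot{\gamma }})\). It assumes the commonly quoted form of the determinant ratio, \(\det (M + N \Upsilon (T)) / \det (M + N \Upsilon _0(T))\), for fundamental solutions normalized to the identity at \(t = 0\); this normalization, the test operator and all names are assumptions of the sketch.

```python
import numpy as np

def fundamental_matrix(omega, T):
    """Closed-form solution of d/dt (y, y') = [[0, 1], [omega^2, 0]] (y, y'), identity at t = 0."""
    if omega == 0.0:
        return np.array([[1.0, T], [0.0, 1.0]])
    c, s = np.cosh(omega * T), np.sinh(omega * T)
    return np.array([[c, s / omega], [omega * s, c]])

# Dirichlet boundary matrices from (31), specialized to n = 1.
M = np.array([[1.0, 0.0], [0.0, 0.0]])
N = np.array([[0.0, 0.0], [1.0, 0.0]])

def boundary_determinant(omega, T):
    """Finite-dimensional determinant det(M + N Y(T)) entering the ratio."""
    return np.linalg.det(M + N @ fundamental_matrix(omega, T))

omega, T = 1.3, 2.0
ratio = boundary_determinant(omega, T) / boundary_determinant(0.0, T)
ratio_exact = np.sinh(omega * T) / (omega * T)
```

Only the matrices M and N would have to be exchanged to treat other boundary conditions, since the fundamental matrix is computed independently of them.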
For the boundary conditions (11), possible choices for M, N are
Using proposition 2.2.4 and choosing \(\Upsilon _\lambda (0) = \Upsilon _0(0) = 1_{2n \times 2n}\) the prefactor \(R_\lambda \) in (14) simplifies to
with \((\gamma , \zeta ): [0,T] \rightarrow \mathbb {R}^{2n \times n}\) solving the Jacobi equation with boundary conditions
As remarked in [30, 31], considering the example of an Ornstein-Uhlenbeck process with \(b(x) = - \beta x\) for \(\beta > 0\) and \(\sigma (x) \equiv \sqrt{2}\) shows that the equation for \(\zeta \) in (35) should naturally be integrated backwards in time due to the appearance of \(-\nabla b(\phi _z)^\top \) on the right-hand side, in contrast to the formulation above in terms of an initial value problem. For large T, we consequently expect that the determinant in (34) will diverge to \(+ \infty \), whereas the exponential term will tend to 0. The following transformation onto a symmetric matrix Riccati differential equation mitigates this problem and is hence in particular well suited for numerical calculations of the prefactor \(R_\lambda \):
This result quantifies the impact of the Gaussian fluctuations around the instanton in a numerically convenient way. These fluctuations satisfy the linear SDE
and from a probabilistic point of view, proposition 2.2.9 effectively computes the expectation
Computationally, the inefficient approach of estimating \(A_f^\varepsilon (\lambda )\) for small \(\varepsilon \) by Monte Carlo simulation is thus replaced by the (\(\varepsilon \)-independent) problem of minimizing the action functional S, subject to the final time boundary condition \(\theta _\lambda (T) = \lambda \nabla f(\phi _\lambda (T))\), plus the numerical integration of an initial value problem for \(Q_\lambda \). For moderate dimensions n (e.g. if the SDE at hand stems from the semi-discretization of a one-dimensional SPDE), the direct numerical integration of \(Q_\lambda \) poses no problems.
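For the Ornstein-Uhlenbeck process with \(b(x) = -\beta x\) and \(\sigma \equiv \sqrt{2}\) started at \(x = 0\), this can be checked in closed form. The sketch below assumes that, for linear drift and additive noise, the matrix Q reduces to the covariance of the Gaussian fluctuations and solves the Lyapunov equation \({\dot{Q}} = a + \nabla b \, Q + Q \nabla b^\top \), \(Q(0) = 0\), and that for the hypothetical quadratic observable \(f(x) = x^2/2\) the Gaussian expectation equals \(\det (1 - \lambda \nabla ^2 f \, Q(T))^{-1/2}\). The observable, the parameter values and the function names are assumptions for illustration.

```python
import numpy as np

def lyapunov_forward(B, a, T, steps=2000):
    """RK4 integration of dQ/dt = a + B Q + Q B^T with Q(0) = 0."""
    def rhs(Q):
        return a + B @ Q + Q @ B.T
    Q = np.zeros_like(a)
    h = T / steps
    for _ in range(steps):
        k1 = rhs(Q)
        k2 = rhs(Q + 0.5 * h * k1)
        k3 = rhs(Q + 0.5 * h * k2)
        k4 = rhs(Q + h * k3)
        Q = Q + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return Q

beta, T, lam = 1.0, 1.0, 0.4
B = np.array([[-beta]])   # gradient of the drift b(x) = -beta * x
a = np.array([[2.0]])     # a = sigma sigma^T with sigma = sqrt(2)
H = np.array([[1.0]])     # Hessian of the hypothetical observable f(x) = x^2 / 2

Q_T = lyapunov_forward(B, a, T)
prefactor = np.linalg.det(np.eye(1) - lam * H @ Q_T) ** -0.5

# Closed-form check: Var(X_T) / eps = (1 - exp(-2 beta T)) / beta for X_0 = 0.
v_exact = (1.0 - np.exp(-2.0 * beta * T)) / beta
prefactor_exact = (1.0 - lam * v_exact) ** -0.5
```

No sampling and no dependence on \(\varepsilon \) enters this computation; the value of \(\lambda \) has to be small enough for \(1 - \lambda Q(T)\) to remain positive, which holds for the parameters chosen here.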
Derivation of Proposition 2.2.9
The transformation of the Jacobi equation (35) to the solution \(Q = \gamma \zeta ^{-1}\) of the forward Riccati equation (37) is explained for a general action functional in appendix A. Hence, the proposition is obtained by factoring out \(\zeta (T)\) in (34) and using \(\det = \exp \textrm{tr}\log \) for
It is also straightforward to derive a representation of the prefactor \(R_\lambda \) in terms of a backward Riccati differential equation from Proposition 2.2.4:
Derivation of Proposition 2.2.10
The general transformation of the Jacobi equation (35) to the solution \(W = \zeta \gamma ^{-1}\) of the backward Riccati equation (41) can also be found in appendix A. Instead of the initial condition \(\Upsilon _\lambda (0) = 1_{2n \times 2n}\), we now pick (assuming for simplicity that \( \nabla ^2 f(\phi _\lambda (T))\) has full rank)
as final condition of the fundamental system of solutions. Hence \({\det }_{2n} \Upsilon _\lambda (T) = {\det }_{n} \left( -\lambda \nabla ^2 f(\phi _\lambda (T)) \right) \) and
where \(\gamma \) is composed of the upper left block of the fundamental system of solutions. Again computing
completes the derivation.
2.3 Probability Density Function Prefactor Estimates for Freidlin–Wentzell Theory with Unique Instantons
Assuming, as usual, strict convexity of the rate function \(z \mapsto I_f(z)\):
Remark 2.3.2
By Legendre duality, we have \(\lambda _z = I_f'(z)\) for the observable rate function \(I_f(z) = S[\phi _{\lambda _z}]\), so the additional term in the PDF prefactor in Proposition 2.3.1 compared to the MGF case of the previous section can be written as
where the second derivative of \(I_f\) is positive by our assumption of strict convexity.
Derivation of Proposition 2.3.1
Since the scaled MGF is a two-sided Laplace transform \({{\mathcal {L}}}\) of the PDF
it can be inverted by contour integration (with a suitable shift \(\alpha \in \mathbb {R}\) for the contour):
where we applied a saddlepoint approximation in the last line. At stationary points of the Lagrange function \({\tilde{S}}_z\), we demand that the first derivative
vanishes, and hence \(f(\phi _{\lambda _z}(T)) = z\) at the unique minimum. Furthermore, we see that \({\tilde{S}}_z''(\lambda ) = -\frac{\textrm{d}}{\textrm{d}\lambda } f(\phi _\lambda (T))\), thereby concluding the derivation.
Remark 2.3.3
Via partial integration, as detailed in [64], it is also straightforward to derive an asymptotic expression for tail probabilities \(\mathbb {P}\left[ f(X_T^\varepsilon ) > z \right] \) from Proposition 2.3.1: For any \(z \in \mathbb {R}\) such that \(S \left[ \phi _\cdot \right] \) increases monotonically on \([z,\infty )\) with \(\textrm{d}S[\phi _z] / \textrm{d}z > 0\) (where \(\phi _z := \phi _{\lambda _z}\)), we have
with \(\lambda _z\) uniquely determined by \(f(\phi _{\lambda _z}(T)) = z\).
Expressing the derivative of \(f(\phi _\lambda (T))\) with respect to \(\lambda \) at \(\lambda _z\) in terms of the forward Riccati matrix \(Q_z = Q_{\lambda _z}\) (similarly \(\phi _z = \phi _{\lambda _z}\), etc.) finally recovers the full result of [30] for the PDF of one-dimensional observables:
Remark 2.3.5
Note that, alternatively, we could have directly evaluated a path integral expression for the PDF at z, which necessitates integrating over all paths that start at \(\phi (0)=x\) and end with \(f(\phi (T))=z\). This results in the boundary conditions
for the quadratic fluctuations and functional determinant, thereby making the application of Forman’s theorem and the introduction of the Riccati matrices more involved. Nevertheless, it would also be possible to derive the PDF prefactor results in this section using this direct approach.
Derivation of Proposition 2.3.4
The fluctuation mode \((\textrm{d}\phi _\lambda / \textrm{d}\lambda , \textrm{d}\theta _\lambda / \textrm{d}\lambda )\) satisfies the boundary conditions
as well as the Jacobi equation (27) along \((\phi _\lambda , \theta _\lambda )\). Hence, choosing \((\textrm{d}\phi _\lambda / \textrm{d}\lambda , \textrm{d}\theta _\lambda / \textrm{d}\lambda )\) as the first column of n linearly independent solutions \((\gamma , \zeta ):[0,T] \rightarrow \mathbb {R}^{2n \times n}\) with \(\gamma (0) =0\) and \(Q = \gamma \zeta ^{-1}\) results in
where \((*)_{n\times (n-1)}\) is a placeholder for the further \(n-1\) irrelevant columns. Then
and consequently
3 Prefactor in the Presence of Zero Modes
3.1 Motivation and Finite-Dimensional Examples
In this section, we derive in detail statements analogous to those of the previous section for situations where an r-dimensional continuous family \({{\mathcal {M}}}^r_z\) of instanton solutions exists for a given observable value z. We are in particular interested in the case of dynamical phase transitions due to spontaneous symmetry breaking of the instanton, where the action functional and boundary conditions as a whole possess a certain symmetry, the possible violation of which beyond a critical observable value \(z_{\text {c}}\) gives rise to a continuous family of degenerate instantons and associated flat directions or zero modes in the function space of all variations. An alternative to a phase transition at a critical observable value for zero modes to occur would be the “trivial” case where all instantons at any observable strength must necessarily break the symmetry of the problem, an example of which is sketched in Fig. 1. On the level of rate functions, these two different scenarios roughly look as sketched in Fig. 2. These examples will be discussed in Sects. 4.1 and 4.3.
Both of these situations are not only relevant in many examples, but furthermore convenient from a numerical perspective, since, due to the underlying symmetry of the entire problem, it will turn out that it suffices to consider a single, arbitrarily chosen instanton in \({{\mathcal {M}}}^r_z\) and compute a modified prefactor for this particular instanton by solving the same Riccati equations as before. We will again proceed first on the level of MGFs and afterwards transform onto the PDF. Despite the fact that in the case of spontaneous symmetry breaking, the rate function can become non-convex as in Fig. 2, the final results for the PDF prefactor remain valid in this case as well. The idea is that even though some instantons might be unobtainable through minimization at fixed \(\lambda \) [54], as in Fig. 2 with \(z \in (z_1, z_2)\), they can still be computed directly using different minimization strategies such as penalty methods [48], and of course correspond to some value of \(\lambda \) depending on their final time position and momentum, which can then be used to compute the prefactor. If the rate function branches are then locally convex individually (or convexified appropriately), then the corresponding prefactor derivations go through without changes.
In order to derive appropriately modified prefactor formulas, we will use the following, conceptually simple strategy: First, we split the integration in path space into components along the submanifold of degenerate minimizers and the subspace which is \(L^2\)-orthogonal to it. For each point on the submanifold, we can then use Laplace’s method on the normal space, where all flat directions of the second variation of the action are removed by construction. Then, a boundary-type regularization procedure [42, 65, 66] is used to compute functional determinants with removed zero eigenvalues by integrating a Riccati equation similar to the non-degenerate case.
We start with a brief motivation in finitely many dimensions, as well as two simple examples: Consider the Laplace-type integral
in the case where there is a family of global minimizers \(\mathcal{M}^r\) of \(S:\mathbb {R}^n \rightarrow \mathbb {R}\), and \(h:\mathbb {R}^n \rightarrow \mathbb {R}\) is any continuous function. We assume that \({{\mathcal {M}}}^r = {{\,\mathrm{arg\,min}\,}}S\) is an r-dimensional submanifold of \(\mathbb {R}^n\) with \(0< r < n\). Then, we know that for small \(\varepsilon > 0\), the integral \(J_\varepsilon \) is dominated by the behavior of S in an open neighborhood \(U_{{{\mathcal {M}}}^r}\) of \({{\mathcal {M}}}^r\), such that
where the integration was split into the integration along \(\mathcal{M}^r\) (with surface measure \(\textrm{d}^r \mu \)) and the (entire, for \(\varepsilon \downarrow 0\)) normal space \(N_y {{\mathcal {M}}}^r\) perpendicular to the hypersurface \({{\mathcal {M}}}^r\). This split of integration directions is usually done formally using the Faddeev-Popov method [67] in the physics literature, which consists of inserting a suitable Dirac \(\delta \) function into the initial integral. For each \(y \in {{\mathcal {M}}}^r\), applying Laplace’s method in z yields
where \({\det }'_{n-r}\) denotes the removal of the r zero eigenvalues of the matrix \(\nabla ^2S(y) \in \mathbb {R}^{n \times n}\) from the determinant that correspond to eigenvectors in the tangent space \(T_y{{\mathcal {M}}}^r\). In the second line, we used that S is constant in \({{\mathcal {M}}}^r\) in order to pull the exponential factor out of the integral, evaluated at any \(y_0 \in {{\mathcal {M}}}^r\). Now, there are two cases: If \({\det }'_{n-r}(\nabla ^2 S)\) and h are constant along \({{\mathcal {M}}}^r\), the volume of \({{\mathcal {M}}}^r\) factors out and we obtain (if this volume is finite; otherwise, the integral is infinite and needs to be regularized in some way in order to make sense of it, e.g. by normalizing it with respect to the volume)
Otherwise, the integral along \({{\mathcal {M}}}^r\) in (61) needs to be evaluated explicitly. It is easy to find two-dimensional examples (\(n = 2\), \(r = 1\)) for either case (with \(h \equiv 1\)):
- (i) Consider \(S:\mathbb {R}^2 \rightarrow \mathbb {R}\), \(S(x,y) = (1+x^4)y^2\). Then the set of minimizers of S is given by the \((r=1)\)-dimensional manifold \({{\mathcal {M}}}^1 = \left\{ (x,0) \in \mathbb {R}^2 \right\} \) with \(S|_{{{\mathcal {M}}}^1} = 0\), and Hessian \(\nabla ^2 S(x,0) = \text {diag}(0, 2(1+x^4))\). Since the integration along y for each x is already Gaussian, (61) yields the exact result
$$\begin{aligned} J_\varepsilon = (2 \pi \varepsilon )^{1/2} \int _{-\infty }^\infty \frac{\textrm{d}x}{\sqrt{{\det }'_1(\nabla ^2 S(x,0))}} = \left( \pi \varepsilon \right) ^{1/2} \int _{-\infty }^\infty \frac{\textrm{d}x}{\sqrt{1+x^4}} = \frac{\Gamma \left( \frac{1}{4} \right) ^2}{2} \sqrt{\varepsilon }\,. \end{aligned}$$(63)
Notably, in this case, \({\det }'_{n-r}(\nabla ^2 S)\) is not constant along the family of minimizers, and the dependency of \(\nabla ^2 S\) on x was needed in order to obtain the correct, finite result despite the infinite volume of the family of minimizers. Also, in this example, while the action on \({{\mathcal {M}}}^1\) is constant (and equal to 0) under translations \(x \rightarrow x + \delta x\), this is not true for the action S on all of \(\mathbb {R}^2\).
- (ii) Next, consider \(S:\mathbb {R}^2 \rightarrow \mathbb {R}\), \(S(x,y) = \left( x^2 + y^2 - a^2\right) ^2\) with \(a>0\), such that the set of minimizers is the \((r=1)\)-dimensional manifold \({{\mathcal {M}}}^1 = \left\{ (x,y) \in \mathbb {R}^2 \mid x^2 + y^2 = a^2 \right\} \). Here, the eigenvalues of the Hessian at the minimizers are given by \(\lambda _0 = 0\) and \(\lambda _1 = 8 a^2\). In this case, the eigenvalues are independent of the position on \({{\mathcal {M}}}^1\), since the entire action is rotationally invariant. From (62), we obtain \(J_\varepsilon \overset{\varepsilon \downarrow 0}{\sim }\ (2 \pi \varepsilon )^{1/2} 2 \pi a (8 a^2)^{-1/2} = \pi ^{3/2} \varepsilon ^{1/2}\,,\) in accordance with the \(\varepsilon \downarrow 0\) asymptotics of the exact result \(J_\varepsilon = \pi ^{3/2} \varepsilon ^{1/2} \left[ 1 + \hbox {erf}\left( a^2 / \sqrt{\varepsilon } \right) \right] /2\), which follows from the substitution \(s = x^2 + y^2 - a^2\) in polar coordinates.
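Both two-dimensional examples can be checked by direct quadrature. The following is a minimal sketch with \(h \equiv 1\), using a hand-written composite Simpson rule; the helper names, the chosen value of \(\varepsilon \) and the truncation radius are illustrative assumptions. For (i), it compares the reduced one-dimensional integral with the closed form in (63); for (ii), it compares the radial integral with the leading-order Laplace asymptotics \(\pi ^{3/2} \varepsilon ^{1/2}\).

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0

def example_i(eps):
    """S(x, y) = (1 + x^4) y^2: the Gaussian y-integral is done exactly."""
    # x -> 1/x maps [1, inf) onto (0, 1], so the integral of
    # 1/sqrt(1 + x^4) over the real line is 4 times the one over [0, 1].
    line_integral = 4.0 * simpson(lambda x: 1.0 / math.sqrt(1.0 + x ** 4), 0.0, 1.0)
    numeric = math.sqrt(math.pi * eps) * line_integral
    closed_form = math.gamma(0.25) ** 2 / 2.0 * math.sqrt(eps)
    return numeric, closed_form

def example_ii(eps, a=1.0):
    """S(x, y) = (x^2 + y^2 - a^2)^2 in polar coordinates."""
    # Truncate where the exponent reaches -100 (negligible contribution).
    r_max = math.sqrt(a * a + 10.0 * math.sqrt(eps))
    radial = simpson(lambda r: r * math.exp(-(r * r - a * a) ** 2 / eps),
                     0.0, r_max, n=20000)
    numeric = 2.0 * math.pi * radial
    leading_order = math.pi ** 1.5 * math.sqrt(eps)
    return numeric, leading_order

num_i, ref_i = example_i(0.01)
num_ii, ref_ii = example_ii(0.01)
```

In case (ii), the agreement is only asymptotic in general; for \(a^2 \gg \sqrt{\varepsilon }\), as in the parameters chosen here, the correction is far below the quadrature error.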
3.2 Moment-Generating Function Prefactor Estimates for Freidlin–Wentzell Theory with Zero Modes
In our setup of sample path large deviation theory, we will only consider the second scenario where the volume of the manifold factors out and is finite. Note that in this sense, the volume part in the prefactor can always be trivially found, such as a sphere or box volume of the “equi-observable” hypersurfaces, and the nontrivial part of our analysis is to find the exact way in which the Riccati approach can be adjusted when the second variation functional possesses vanishing eigenvalues.
Usually, when solving the instanton equations (5) for \((\phi _\lambda ,\theta _\lambda )\) in the situation that there is an r-dimensional submanifold, \(r \ge 1\), of global minimizers \(\mathcal{M}^r_\lambda \), we will find a specific parameterization of \({{\mathcal {M}}}^r_\lambda \), \(u \mapsto \phi _\lambda ^u\) for \(u\in D \subseteq \mathbb {R}^r\). Then, a basis of the tangent space \(T_{\phi _\lambda ^u}{{\mathcal {M}}}^r_\lambda \) is given by the zero modes
with \(i = 1, \dots , r\). We denote the corresponding momentum fluctuations as
We make the following two observations:
- The zero modes \(\psi ^u_{\lambda ,i}\), \(i = 1, \dots , r\) satisfy the Jacobi equation (27) (or, equivalently, (30)), since
$$\begin{aligned} \left. \frac{\delta S}{\delta \phi } \right| _{\phi _\lambda ^u} = 0 \; \forall u \in D \quad \overset{\partial /\partial u_i}{\Rightarrow } \quad \left. \frac{\delta ^2 S}{\delta \phi ^2} \right| _{\phi _\lambda ^u} \psi ^u_{\lambda ,i} = \Omega [\phi _\lambda ^u] \psi ^u_{\lambda ,i} = 0 \; \forall u \in D\,, \end{aligned}$$(66)
as well as the boundary conditions \({{\mathcal {A}}}_\lambda ^u\) of the second variation, because
$$\begin{aligned}&\phi ^u_\lambda (0) = x \; \forall u \in D \quad \overset{\partial /\partial u_i}{\Rightarrow }\ \quad \psi ^u_{\lambda ,i}(0) = 0 \nonumber \\&\theta _\lambda ^u(T) = \lambda \nabla f(\phi _\lambda ^u(T)) \quad \overset{\partial /\partial u_i}{\Rightarrow }\ \quad \xi ^u_{\lambda ,i}(T) = \lambda \nabla ^2 f(\phi _\lambda ^u(T)) \psi ^u_{\lambda ,i}(T)\,. \end{aligned}$$(67)
Hence, each of the zero modes is an admissible eigenfunction of the Jacobi operator \(\Omega [\phi _\lambda ^u]\) under \({{\mathcal {A}}}_\lambda ^u\) with eigenvalue \(\lambda ^{(0)}_i = 0\) and it follows that \({{\,\textrm{Det}\,}}_{{{\mathcal {A}}}_\lambda ^u} \left( \Omega [\phi _\lambda ^u] \right) = 0\).
- We can immediately conclude that \(r \le n\) since there are at most n linearly independent solutions of the first order Jacobi equation (27), i.e.
$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t} \left( \begin{array}{c} \gamma \\ \zeta \end{array} \right) = \Gamma \left[ \phi _\lambda ^u \right] \left( \begin{array}{c} \gamma \\ \zeta \end{array} \right) \end{aligned}$$(68)
that satisfy the initial condition \(\gamma (0) = 0 \in \mathbb {R}^n\).
It is now straightforward to formulate the analogue of Proposition 2.2.1 in the presence of zero modes:
For the given parameterization \(u \mapsto \phi _\lambda ^u\), the volume of \({{\mathcal {M}}}^r_\lambda \) can be computed as
where is the Gram matrix defined via
In order to be able to compute the ratio
in \({\tilde{R}}_\lambda \) efficiently using Forman’s theorem, without having to compute and multiply all non-zero eigenvalues of both operators, we use a technique based on boundary perturbations. The concept of the following treatment is described in [42], who discuss the case of an arbitrary number of zero modes with Dirichlet and (anti)-periodic boundary conditions. A related paper in this regard is also [68]. Note, however, that these references do not derive manifestly parameterization-invariant results, and further discuss neither the boundary conditions specific for low dimensional observables in sample path large deviations, nor the relation to efficient numerical prefactor computations using Riccati equations.
The idea of the boundary regularization procedure to compute \({{\,\textrm{Det}\,}}_{{{\mathcal {A}}}_\lambda ^{u_0}}^\prime \left( a(\phi _\lambda ^{u_0}) \Omega [\phi _\lambda ^{u_0}] \right) \) is as follows: We modify the boundary conditions \({{\mathcal {A}}}_\lambda ^{u_0}\), realized through \(M_\lambda ^{u_0}, N_\lambda ^{u_0} \in \mathbb {R}^{2n \times 2n}\), using a small perturbation, that is, we replace them by \(M_\lambda ^{u_0}(\delta ), N_\lambda ^{u_0}(\delta ) \in \mathbb {R}^{2n \times 2n}\) with \(\delta = (\delta _1, \dots , \delta _r) \in \mathbb {R}^r\), such that \(M_\lambda ^{u_0}(0)= M_\lambda ^{u_0}\) and \(N_\lambda ^{u_0}(0)=N_\lambda ^{u_0}\). The boundary perturbation has to be chosen in such a way as to remove all zero eigenvalues of \(\Omega [\phi _\lambda ^{u_0}]\). Then we carry out the following three steps:
1. Explicitly compute the leading order asymptotics of the r nonzero eigenvalues \(\lambda _1^{(0)}(\delta ), \dots , \lambda _r^{(0)} (\delta )\) of \(\Omega [\phi _\lambda ^{u_0}]\) under \(M_\lambda ^{u_0}(\delta ), N_\lambda ^{u_0}(\delta )\) that tend to 0 as \(\delta \rightarrow 0\).
2. Apply Forman’s theorem to evaluate the full, nonzero determinant \({{\,\textrm{Det}\,}}_{{{\mathcal {A}}}_\lambda ^{u_0}(\delta )} \left( a(\phi _\lambda ^{u_0}) \Omega [\phi _\lambda ^{u_0}] \right) \).
3. Evaluate
$$\begin{aligned} {{\,\textrm{Det}\,}}_{{{\mathcal {A}}}_\lambda ^{u_0}}' \left( a(\phi _\lambda ^{u_0}) \Omega [\phi _\lambda ^{u_0}] \right) \overset{\cdot }{=} \lim _{\delta \rightarrow 0} \left[ \frac{{{\,\textrm{Det}\,}}_{{{\mathcal {A}}}_\lambda ^{u_0}(\delta )} \left( a(\phi _\lambda ^{u_0}) \Omega [\phi _\lambda ^{u_0}] \right) }{\prod _{i=1}^r \lambda _i^{(0)}(\delta )} \right] \,. \end{aligned}$$(77)
Of course, steps 2 and 3 only make sense when considering ratios of functional determinants; however, since it is irrelevant to the following discussion, we omit the division by the free determinant for the time being and denote equalities up to division by the free determinant via “\(\overset{\cdot }{=}\)” as in [42].
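The logic of the three steps can be illustrated in a finite-dimensional analogue, where a symmetric matrix with one zero eigenvalue plays the role of the Jacobi operator and a rank-one perturbation plays the role of the modified boundary condition. The sketch below is only such an analogue under these assumptions (the matrix `A`, the perturbation `B` and all function names are hypothetical); it does not involve Forman’s theorem or the actual boundary matrices.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Symmetric matrix with eigenvalues 0, 3, 3 and null vector v = (1, 1, 1)/sqrt(3)
A = [[2.0, -1.0, -1.0], [-1.0, 2.0, -1.0], [-1.0, -1.0, 2.0]]
v = [1.0 / math.sqrt(3.0)] * 3
# Perturbation acting on a single entry, loosely mimicking a boundary term
B = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

def regularized_det_prime(delta):
    # Step 1: leading-order quasi-zero eigenvalue, delta * <v, B v>
    vBv = sum(v[i] * B[i][j] * v[j] for i in range(3) for j in range(3))
    small_eigenvalue = delta * vBv
    # Step 2: full (nonzero) determinant of the perturbed matrix
    perturbed = [[A[i][j] + delta * B[i][j] for j in range(3)] for i in range(3)]
    full_det = det3(perturbed)
    # Step 3: divide out the quasi-zero eigenvalue
    return full_det / small_eigenvalue

def exact_det_prime():
    """Product of the nonzero eigenvalues = sum of principal 2x2 minors (det A = 0)."""
    minors = 0.0
    for i, j in ((0, 1), (0, 2), (1, 2)):
        minors += A[i][i] * A[j][j] - A[i][j] * A[j][i]
    return minors

approx = regularized_det_prime(1e-6)
reference = exact_det_prime()
```

For this matrix, both `approx` and `reference` should be close to the product \(3 \cdot 3\) of the two nonzero eigenvalues.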
In our setup, there are different types of regularization that can be chosen depending on the assumptions. We start with the case of a nonlinear observable with positive definite matrix
where
Importantly, the zero modes \((\psi ^u, \xi ^u)\) are, due to their initial conditions \(\psi ^u(0) = 0\) and \(\xi ^u(0) \ne 0\), part of the n solutions \((\gamma , \zeta )\) that make up the forward Riccati matrix solution with \(Q = \gamma \zeta ^{-1}\) and \(Q(0) = 0\). Now, since \(\nabla ^2 f(\phi ^u_\lambda (T))\) is non-degenerate on the space of final time zero mode states \(\psi ^u(T)\), we conclude that \(\xi ^u(T)\) will also be nondegenerate due to the boundary conditions of the zero modes. Hence, the forward Riccati differential equation for Q remains well-posed and Q(t) does not explode as \(t \rightarrow T\), the only problem being the removal of zero eigenvalues of \({\det }_n \left( 1_{n \times n} - \lambda \nabla ^2 f \left( \phi _\lambda (T) \right) Q_\lambda (T) \right) \) in Proposition 2.2.9.
In this case, the problem can be regularized using the perturbation
where \(\left\{ {\tilde{\xi }}^{u_0}_{\lambda ,1}, \dots ,{\tilde{\xi }}^{u_0}_{\lambda ,r} \right\} \) is any (oriented) orthonormal basis of the vector space \(\text {span} \left\{ \xi ^{u_0}_{\lambda ,1}(T), \dots , \xi ^{u_0}_{\lambda ,r}(T) \right\} \subset \mathbb {R}^n\) spanned by the zero mode momenta at \(t=T\). Let us denote by \(\psi ^{{u_{0}}}_{{\lambda ,i}}(\delta )\) the eigenfunctions of \(\Omega [\phi _\lambda ^{u_0}]\) under these boundary conditions \(\mathcal{A}_\lambda ^{u_0}(\delta )\) that tend to the zero modes \(\psi ^{u_0}_{\lambda ,i}\) as \(\delta \rightarrow 0\). Then we have the following leading order asymptotics of \(\prod _{i=1}^r \lambda _i^{(0)}(\delta )\) for step 1 with this particular regularization:
Derivation of Lemma 3.2.2
The modified boundary conditions at \(t = T\) read
For any \(i,j \in \left\{ 1, \dots , r \right\} \), we compute
Computing the determinant of these expressions yields
In the last step, note that it will not be true in general that \(\psi ^{{u_{0}}}_{{\lambda ,i}}(\delta ) \rightarrow \psi ^{{u_{0}}}_{{\lambda ,i}}\) as \(\delta \rightarrow 0\) for each \(i = 1, \dots , r\) individually (cf. [42]), but due to linearity, the transformation matrices from \(\lim _{\delta \rightarrow 0} \psi ^{u_0}_{\lambda }(\delta )\) to \(\psi ^{u_0}_{\lambda }\) and from \(\lim _{\delta \rightarrow 0} \xi ^{u_0}_{\lambda }(\delta )\) to \(\xi ^{u_0}_{\lambda }\) will coincide and their determinants therefore cancel in the last step.
Derivation of Lemma 3.2.3
We pick an orthonormal basis of \(\mathbb {R}^n\) by extending \(\left\{ {\tilde{\xi }}^{u_0}_{\lambda ,1}, \dots ,{\tilde{\xi }}^{u_0}_{\lambda ,r}\right\} \) by \(n - r\) additional unit vectors \(v_1, \dots , v_{n-r}\). In this basis, the right boundary matrix \(N_\lambda ^{u_0}(\delta )\) from (80) becomes
For the fundamental system of solutions \(\Upsilon \), we choose the initial condition
such that
and
Combining the previous two lemmas with Proposition 3.2.1 and observing that for the solutions \((\gamma , \zeta )\) of the Jacobi equation in Lemma 3.2.3, we have
which yields the following concrete formula to evaluate the MGF prefactor in the presence of zero modes for nondegenerate, nonlinear observables:
The second case that we consider is when the matrix
is not positive definite, which is in particular relevant for the important case of linear observables. Here, the regularization procedure of the previous proposition will not work and the solution of the Riccati matrices with unmodified initial or final conditions can diverge since the zero modes can provide solutions of the Jacobi equation (27) with \(\gamma (0) = 0\) and \(\zeta (T) = 0\). We will instead suppose in the following that the matrix
is positive definite and regularize the final time boundary condition as
where \(\left\{ {\tilde{\psi }}^{u_0}_{\lambda ,1}, \dots ,{\tilde{\psi }}^{u_0}_{\lambda ,r} \right\} \) is any orthonormal basis of the vector space \(\text {span} \left\{ \psi ^{u_0}_{\lambda ,1}(T), \dots , \psi ^{u_0}_{\lambda ,r}(T) \right\} \subset \mathbb {R}^n\) spanned by the zero modes at \(t=T\). Going through a similar calculation as above results in the following Proposition 3.2.5, now with
for the quasi-zero eigenvalue behavior as \(\delta \rightarrow 0\), and final condition
for the fundamental system of solutions \(\Upsilon \) in an orthonormal basis \(\left\{ {\tilde{\psi }}^{u_0}_{\lambda ,1}, \dots ,{\tilde{\psi }}^{u_0}_{\lambda ,r}, v_1, \dots , v_{n-r}\right\} \):
Remark 3.2.6
The final condition of the backward Riccati matrix in Proposition 3.2.5 is to be understood as
in index notation, with
as usual. For linear observables f, it reduces to
3.3 Probability Density Function Prefactor Estimates for Freidlin–Wentzell Theory with Zero Modes
Again performing an inverse Laplace transform leads to a proposition for PDF prefactors in the presence of zero modes. This is the main result of the paper. It constitutes a complete recipe for the computation of the PDF when zero modes are present, since every quantity can be evaluated numerically after integrating a Riccati equation along the symmetry-broken instanton.
Alternatively, the regularization on the left boundary
leads to the following expression for the PDF prefactor using the same techniques as outlined above:
Remark 3.3.3
Note that, again, the initial conditions were modified in a suitable way as to remove divergences from the Riccati equation and render the determinants in the denominator non-zero. While this result is convenient in that it can be used regardless of whether the Hessian \(\nabla ^2 f(\phi _z^{u_0}(T))\) is non-singular, it may be inconvenient for taking the stationary limit \(T \rightarrow \infty \). As an example, consider an SDE with additive noise and initial position \(x = x_*\) at the fixed point. Then \({{\,\textrm{vol}\,}}\left( \theta _z(0) \right) \) will tend to 0 in this case for \(T \rightarrow \infty \). Similarly, the Riccati matrix Q will “forget” its regularizing initial condition and instead tend to its stationary solution \(Q_*\) determined by the Lyapunov equation
Remark 3.3.4
We observe that the determinant of the \(L^2\)-scalar products of the zero modes in (73) cancels in each of the expressions which we have derived via boundary regularization, and we are always left only with integrations over the zero modes at the initial or final time T. This is a generic feature of the regularization procedure as remarked already in [42].
4 Examples
In this section we illustrate the application of the propositions to compute PDF prefactors in the presence of zero modes in four instructive examples. We start with the arguably simplest case in Subsect. 4.1: A multidimensional Ornstein–Uhlenbeck process with a purely radial, linear vector field as drift and the norm of the process as the observable as sketched in Fig. 1 (left). Here, all results on both finite and infinite time horizons T can be found analytically. In Subsect. 4.2, we consider again a diffusion process in a rotationally symmetric vector field with the radius as our observable. Here the vector field is constructed to be non-linear and to possess an angular component to break the detailed balance property of the process. In the limit \(T \rightarrow \infty \), the problem can again be solved exactly, and, in addition to this limiting case, we compare the numerical solution of the instanton and Riccati equations to direct sampling of the SDE for finite times. Third, in Subsect. 4.3, we analyze a three-dimensional diffusion process in a potential landscape of the type sketched in Fig. 1 (right). This is the first concrete example with a dynamical phase transition that is considered in this paper, and, restricting ourselves to the infinite time limit \(T \rightarrow \infty \) for clarity, we show that the Riccati formalism correctly predicts the PDF prefactor in the quadratic approximation and compare it to the full prefactor at different finite noise strengths \(\varepsilon > 0\). Finally, in Subsect. 4.4, we show by means of the one-dimensional KPZ equation with a dynamical phase transition for the average surface height that the formalism developed in this paper remains formally applicable and numerically feasible for out-of-equilibrium systems with infinitely many spatial degrees of freedom. Numerical applications to spatially extended systems in fluid dynamics and turbulence theory are left as a subject of future, separate publications.
4.1 n-Dimensional Ornstein–Uhlenbeck Process with Radius as Observable
We consider the case of an n-dimensional Ornstein–Uhlenbeck process with \(n \ge 2\), as sketched in Fig. 1 (left) for \(n = 2\),
We take \(b(x) = - \beta x\) for the drift with \(\beta > 0\), \(a = 2 \cdot 1_{n \times n}\) for the diffusion matrix and \(f(x) = \Vert x\Vert _n\) for the observable. In this case, the radial symmetry will always necessarily be broken by the instanton at any \(z > 0\) and generate \(n - 1\) zero modes. As a reference, the PDF \(\rho ^\varepsilon \) of \(X^\varepsilon _T\) is always Gaussian for any \(T > 0\) with
Note that the prefactor of the full PDF, given by \(\left( 2 \pi \varepsilon \right) ^{-n/2} \left[ \beta / (1 - \exp \left\{ -2 \beta T\right\} )\right] ^{n/2}\), is just a constant in x, such that the reference radial PDF
with \({{\,\textrm{vol}\,}}_{n-1}\left( S^{n-1}\right) = 2 \pi ^{n/2} / \Gamma (n/2)\) merely acquires a z-dependent prefactor through the multiplication with a hypersphere volume. Here, \(\Gamma \) denotes the gamma function. Furthermore we can evaluate the MGF \(A_f^\varepsilon \) for \(\lambda \ge 0\) using the probability density and applying Laplace’s method:
Starting with the computation of the MGF using instantons, for any unit vector \(e_u \in \mathbb {R}^n\) and with \(\nabla f(x) = x / \Vert x\Vert _n\), a valid solution of the instanton equations is
with corresponding action
so that
as expected.
For the prefactor, we note that with \(n - 1\) zero modes corresponding to angles on the hypersphere, the \(\varepsilon \)-scaling of the prefactor of the MGF in (72) is correct. We first evaluate the prefactor \({\tilde{R}}_\lambda \) according to (106), i.e. using the forward Riccati equation with unmodified initial condition: The solution of the forward Riccati equation
is
and with \(\nabla ^2f(x) = \text {pr}_{x^\perp } / \Vert x\Vert _n\), where \(\text {pr}_{x^\perp }\) denotes the orthogonal projection onto the subspace \(x^\perp \subset \mathbb {R}^n\), we obtain
Hence, \(n-1\) eigenvalues are 0 and
Since \(\nabla ^2 b = 0\), we are left with evaluating
thereby correctly reproducing the MGF (120) including the prefactor. In order to get the PDF (119) using Proposition 3.3.1, all we have to do is note that
which immediately leads to (119) via (104).
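As an independent sanity check of the reference radial PDF, one can verify numerically that the hypersphere volume times the Gaussian prefactor quoted above gives a normalized density in z. The sketch below assumes that the process starts at the origin, so that each component of \(X_T^\varepsilon \) has variance \(\varepsilon (1 - e^{-2 \beta T}) / \beta \); this assumption is consistent with the stated prefactor \(\left( 2 \pi \varepsilon \right) ^{-n/2} \left[ \beta / (1 - \exp \left\{ -2 \beta T\right\} )\right] ^{n/2}\), but the function names and parameter values are illustrative only.

```python
import math

def radial_pdf(z, n, beta, T, eps):
    """Radial density: sphere volume * z^(n-1) * isotropic Gaussian density."""
    # Per-component variance, assuming the process starts at the origin
    sigma2 = eps * (1.0 - math.exp(-2.0 * beta * T)) / beta
    sphere = 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)
    gauss_prefactor = (2.0 * math.pi * sigma2) ** (-n / 2.0)
    return sphere * z ** (n - 1) * gauss_prefactor * math.exp(-z * z / (2.0 * sigma2))

def total_mass(n, beta, T, eps, steps=4000):
    """Composite Simpson rule on [0, 12 standard deviations]; steps must be even."""
    sigma = math.sqrt(eps * (1.0 - math.exp(-2.0 * beta * T)) / beta)
    hi = 12.0 * sigma
    h = hi / steps
    total = radial_pdf(0.0, n, beta, T, eps) + radial_pdf(hi, n, beta, T, eps)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * radial_pdf(k * h, n, beta, T, eps)
    return total * h / 3.0

mass_2d = total_mass(n=2, beta=1.0, T=0.5, eps=0.1)
mass_3d = total_mass(n=3, beta=1.0, T=0.5, eps=0.1)
```

Both masses should equal 1 up to quadrature error, confirming that the z-dependence of the prefactor comes solely from the factor \(z^{n-1}\) of the hypersphere volume.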
Alternatively, we can use the backward Riccati approach (106), i.e. using the backward Riccati equation with modified final condition. Then, the volume term becomes
and solving the Riccati equation
to get
leads to
thereby correctly reproducing the full prefactor.
Finally, we compute the prefactor using Proposition 3.3.2 with a forward Riccati equation with modified initial condition. This is instructive in that it demonstrates the singular limits of the individual terms as \(T\rightarrow \infty \). We note that \({\tilde{R}}_{\lambda }\) and its constituents in the previous paragraphs have a well-behaved limit as \(T \rightarrow \infty \), which is in contrast to the PDF prefactor computation via Proposition 3.3.2 presented here. First
tends to 0 as \(T \rightarrow \infty \), whereas, since with
and
we get
and
such that the regularized denominator from Proposition 3.3.2
also tends to zero as \(T \rightarrow \infty \) and only their quotient \({\tilde{R}}_\lambda \) remains finite.
4.2 Rotationally Symmetric Two-Dimensional Vector Field with Swirl
As a second example, we slightly modify the situation of the previous subsection to a nonlinear radial vector field, to which we then also add a rotationally symmetric nonlinear swirl. Restricting ourselves to a spatial dimension \(n = 2\), we consider the following drift vector field in polar coordinates \((r, \varphi )\):
with unit coordinate vectors \(e_r = x / \Vert x\Vert = (\cos \varphi , \sin \varphi )\) and \(e_\varphi = (-\sin \varphi , \cos \varphi )\). We again consider a diffusion process \((X_t^\varepsilon )_{[0,T]}\) in this vector field starting at \(x_0 = 0\) with final-time observable \(f(X_T^\varepsilon ) = \Vert X_T^\varepsilon \Vert \), and the radial symmetry of this problem will generate one zero mode in this case. Even though the drift is not gradient, the leading order behavior of the PDF \(\rho _f^\varepsilon \) in \(\varepsilon \) as \(T \rightarrow \infty \), i.e. in the stationary case, can be found analytically here. The reason for this is that the drift given in (140) is already specified in terms of its transverse decomposition [3, 69]
where V is the quasi-potential. In our example, we have \(V(x) = V_r(\Vert x\Vert )\) and \(\ell (r,\varphi ) = l(r) e_\varphi \). The stationary PDF of the process itself is given by [31, 33]
Since the transverse vector field \(\ell \) in our example is divergence-free, we conclude that the PDF \(\rho _f^\varepsilon \) of \(f(X^\varepsilon _T)\) as \(T \rightarrow \infty \) and \(\varepsilon \downarrow 0\) will be given by
For finite times, no easy analytical solution is available, so we have to solve the instanton and (forward) Riccati equations numerically in order to obtain the precise small noise asymptotics of the PDF \(\rho _f^\varepsilon \). For the specific example
we compare the results of this numerical procedure to Monte Carlo sampling at a fixed, small noise level \(\varepsilon \) for different times T in Fig. 3. For \(T \in \{0.01, 0.1, 1., 5.\}\), instanton solutions \((\phi _z^{u_0}, \theta _z^{u_0}, \lambda _z)\) were computed directly for different, equidistantly spaced \(z \in [0,3]\) using the augmented Lagrangian method for the final time constraint and the L-BFGS algorithm using adjoints as detailed in [48], with \(n_t = 4000\) time discretization points in all cases and Heun time steps. Here, \(u_0 \in [0, 2 \pi )\) is the arbitrary angle characterizing the numerically found instantons. Afterwards, for each instanton, the forward Riccati equation from Proposition 3.3.1 (i) was solved numerically with the same time discretization and time stepping. In order to evaluate the prefactor (106), the \(\det '\) expression was computed by only taking into account the single positive eigenvalue of \(1_{2 \times 2} - \lambda _z \nabla ^2 f(\phi ^{u}_z(T)) Q_z^{u}(T)\) (the other eigenvalue being close to zero). The zero mode volume prefactor is
where, in the last line, we used that due to rotational symmetry, the scalar product of the tangent vectors is the same as for the original instanton, as well as \(\theta _z^u(T) = \lambda _z \nabla f(\phi _z^u(T)) = \lambda _z \phi _z^u(T) / \Vert \phi _z^u(T)\Vert \). The last ingredient for the prefactor (104), the derivative \(\textrm{d}\lambda _z \ / \textrm{d}z\), was simply computed by numerical differentiation of the obtained map \(z \mapsto \lambda _z\) from the instanton computations. As Fig. 3 shows, both the limiting case \(T \rightarrow \infty \), as well as the Monte Carlo data at smaller T and \(\varepsilon = 0.05\) are well reproduced.
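The numerical differentiation of the map \(z \mapsto \lambda _z\) on an equidistant grid can be sketched as follows. The text does not specify which finite-difference scheme was used, so the second-order stencils below are an assumption, and the tabulated values `lam_toy` are hypothetical stand-ins for the Lagrange multipliers obtained from the instanton computations.

```python
def derivative_on_grid(z, lam):
    """Second-order finite differences of lam with respect to z (uniform grid)."""
    n = len(z)
    h = z[1] - z[0]
    d = [0.0] * n
    # One-sided three-point stencils at the two end points
    d[0] = (-3.0 * lam[0] + 4.0 * lam[1] - lam[2]) / (2.0 * h)
    d[-1] = (3.0 * lam[-1] - 4.0 * lam[-2] + lam[-3]) / (2.0 * h)
    # Central differences in the interior
    for i in range(1, n - 1):
        d[i] = (lam[i + 1] - lam[i - 1]) / (2.0 * h)
    return d

# Equidistant observable values in [0, 3]; the number of points is an arbitrary choice
z_grid = [3.0 * k / 60 for k in range(61)]
# Hypothetical monotone map z -> lambda_z, standing in for instanton output
lam_toy = [z + z ** 3 for z in z_grid]
dlam_dz = derivative_on_grid(z_grid, lam_toy)
# Largest deviation from the exact derivative 1 + 3 z^2 of the toy map
max_error = max(abs(d - (1.0 + 3.0 * z * z)) for d, z in zip(dlam_dz, z_grid))
```

With real instanton data, the accuracy of this step is limited by the convergence tolerance of the optimizer, so a coarser grid or a smoothing fit may be preferable if the computed \(\lambda _z\) are noisy.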
4.3 Dynamical Phase Transition in a Three-Dimensional Gradient System
For a system dimension of \(n = 3\), we consider a first instructive example exhibiting spontaneous symmetry breaking beyond a critical observable value \(z_{\text {c}} > 0\) as sketched in the right subplot of Fig. 1. Choosing a gradient system
on the time interval [0, T] and focusing on the stationary limit \(T \rightarrow \infty \) allows us to treat this case in an exact manner. We assume that the potential has a unique global minimum at \(x_0 = (0,0,0)\) with \(\nabla ^2 V(x_0)\) positive definite. Furthermore, V should be symmetric in the first component \(x_1\), i.e. \(V(-x_1,x_2,x_3) = V(x_1,x_2,x_3)\), and rotationally symmetric in \((x_2,x_3)\) for any \(x_1\), i.e. for all \(x_1 \in \mathbb {R}\) and \(b \ge 0\), \(V(x_1, b \cos u, b \sin u)\) is constant in \(u \in [0, 2 \pi )\). We assume that there exists \(z_{\text {c}} > 0\), such that for all \(x_1 = z \in \mathbb {R}\) with \(|z| < z_{\text {c}}\), the function \(V(z,\cdot ,\cdot ):\mathbb {R}^2 \rightarrow \mathbb {R}\) has a unique, nondegenerate global minimum at \((x_2,x_3) = (0,0)\), and for all z with \(|z|> z_{\text {c}}\), \(V(z,\cdot ,\cdot )\) has a continuous family of global minima at \(({\bar{x}}(z) \cos u, {\bar{x}}(z) \sin u)\) with \({\bar{x}}(z) > 0\) and \(u \in [0,2 \pi )\), as sketched in Fig. 4 (left). A specific example of such a potential is
with constants \(V_0, a, z_{\text {c}} > 0\), which indeed exhibits a Mexican hat-like structure in the \(x_2\)-\(x_3\) plane for \(x_1 = z > z_{\text {c}}\) with minima at radius
As our (linear) observable, we take the first component, \(f(x) = x_1\),
which allows us to test the backward Riccati equation for the prefactor from Proposition 3.2.5 in the limit \(T \rightarrow \infty \). Since the system is gradient, we know that the stationary PDF \(\rho ^\varepsilon _\infty :\mathbb {R}^3 \rightarrow [0, \infty )\) of \(X^\varepsilon \) is given by
with normalization constant
Applying Laplace’s method on the PDF of the marginal distribution
of the first component \(X^\varepsilon _1\) (approximating both \(Z_\varepsilon \) and the \((x_2,x_3)\)-integral) yields
for any \(u_0 \in D = [0, 2 \pi )\). Here, \({\det }_2\) denotes the restriction onto the \((x_2,x_3)\)-plane, and \({\det }_1'\) reduces to the single nonzero eigenvalue of the matrix in the \((x_2,x_3)\)-plane corresponding to the radial eigenvector. For the specific example (147), the result is
as a reference result, with discontinuous second derivative of the rate function at \(z = z_{\text {c}}\) and divergent prefactors as \(z \uparrow z_{\text {c}}\).
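The Laplace step for the \((x_2,x_3)\)-integral below the critical value can be checked numerically. The following sketch rests on several assumptions that are not taken from the text: the stationary density is taken proportional to \(\exp \{-V/\varepsilon \}\), the constant `Z_C` and the transverse potential in `radial_potential` are a hypothetical stand-in rather than the example potential (147), and all function names are illustrative. It compares a direct quadrature of the transverse integral with the leading-order Laplace estimate \(2 \pi \varepsilon / \sqrt{{\det }_2 \nabla ^2 V}\).

```python
import numpy as np

Z_C = 1.0  # hypothetical critical observable value

def curvature(z):
    """Transverse curvature c(z) of the assumed potential at (x2, x3) = (0, 0)."""
    return (Z_C**2 - z**2) / (1.0 + z**2)

def radial_potential(z, r):
    """Assumed V(z, r) - V(z, 0), with r^2 = x2^2 + x3^2 (illustrative only)."""
    return 0.5 * curvature(z) * r**2 + 0.25 * r**4

def transverse_integral(z, eps, r_max=3.0, n=200001):
    """Trapezoidal estimate of the (x2, x3)-integral of exp(-(V - V(z,0,0)) / eps)."""
    r = np.linspace(0.0, r_max, n)
    g = 2.0 * np.pi * r * np.exp(-radial_potential(z, r) / eps)  # polar coordinates
    return np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(r))

def laplace_estimate(z, eps):
    """Leading-order Laplace approximation; only meaningful for |z| < Z_C."""
    return 2.0 * np.pi * eps / curvature(z)  # sqrt(det_2 Hessian) equals c(z) here

eps = 1e-3
ratios = {z: transverse_integral(z, eps) / laplace_estimate(z, eps)
          for z in (0.0, 0.5, 0.9)}
```

In this toy setting the ratio approaches 1 far below `Z_C` and degrades as z approaches `Z_C`, where the Laplace estimate diverges while the integral itself stays finite. This is the same mechanism behind the divergent prefactor as \(z \uparrow z_{\text {c}}\).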
In order to reproduce this result using sample path large deviations, we first note that the unique (for the chosen potential) solution to the instanton equations for any endpoint \(x=(x_1,x_2,x_3) \in \mathbb {R}^3\)
is given by
as \(T \rightarrow \infty \), i.e. by time-reversed deterministic dynamics, such that
By the contraction principle, i.e. by minimizing this result over all \((x_2,x_3) \in \mathbb {R}^2\) for a given \(x_1 = z \in \mathbb {R}\), we obtain the correct rate function
with any \(u_0 \in D = [0,2\pi )\). For the prefactor in the nondegenerate case \(|z| < z_{\text {c}}\), we first evaluate \(\exp \left\{ \int _0^\infty \textrm{tr}\left[ W_z\right] \right\} \) following [31]: The backward Riccati matrix \(W_z\) solves
Defining \(W_z = C_z^{-1} {\dot{C}}_z\) with \(C_z(\infty ) = 1_{3 \times 3}\), \({\dot{C}}_z(\infty ) = 0_{3 \times 3}\), we have, on the one hand,
and on the other hand, from (159),
so
Using the boundary conditions and equation (160) as well as noting that necessarily \({\dot{C}}_z(0) = 0_{3 \times 3}\) in the stationary limit, we obtain
The second ingredient for the PDF prefactor is
thereby correctly reproducing the reference result below the critical observable value \(z_{\text {c}}\) from (153) via Propositions 2.2.10 and 2.3.1.
Above the critical observable value \(z_{\text {c}}\), the final condition for the backward Riccati equation becomes
where \({\tilde{\psi }}^{u_0}_z\) is, in particular, a unit eigenvector corresponding to the single vanishing eigenvalue of the Hessian \(\nabla ^2 V(z,{\bar{x}}(z) \cos u_0, {\bar{x}}(z) \sin u_0)\). Setting \({\dot{C}}_z(\infty ) = - {\tilde{\psi }}^{u_0}_z \otimes {\tilde{\psi }}^{u_0}_z\) in the computation above yields
Hence, as desired, the modified final condition renders the fraction well defined by replacing the single zero eigenvalue of the matrix in the denominator by 1. Furthermore, we have
by restricting to the invariant subspace \(\left( {\tilde{\psi }}^{u_0}_z\right) ^\perp \) of the Hessian \(\nabla ^2 V(z,{\bar{x}}(z) \cos u_0, {\bar{x}}(z) \sin u_0)\) on which it is invertible for the computations, and afterwards reintroducing the full matrix including the modified eigenvalue 1. All in all, we have thus correctly reproduced the PDF prefactor above the critical value in (153). For the specific example potential (147), the situation considered here is sketched and compared to the results of Monte Carlo simulations of the SDE (146) in Fig. 4.
4.4 Average Surface Height for the One-Dimensional KPZ Equation with Flat Initial Condition
The KPZ equation [70], an SPDE describing nonlinear surface growth, and in particular its large deviation statistics have been the subject of various studies. Here, particularly noteworthy works are [43,44,45,46] for an investigation of a short time dynamical phase transition for the distribution of the surface height at one point in space, starting from a stationary surface. Furthermore, recently, in [15], an exact computation of the rate function for the same observable with general deterministic initial condition has been carried out; and for the flat initial condition, the exact distribution of the height at one point in space for all times has already been found in [71]. A systematic short-time expansion for the height distribution at one point and droplet and Brownian initial conditions, which goes beyond the rate function and includes subleading prefactor terms, can be found in [72]. All of the works listed above deal with the KPZ equation on an unbounded spatial domain. Here, we proceed in the spirit of [43,44,45,46], but modify the setup to study continuous symmetry breaking instead of only a discrete mirror symmetry. Accordingly choosing the spatially averaged surface height as an observable necessitates considering a bounded spatial domain. For such a domain, the large deviation statistics of the surface height at one point have been computed in detail in [73], with the analysis of the spatially averaged surface height left as a future task there and predicted to display a second order dynamical phase transition. Here, we will confirm this prediction and compute the leading order PDF prefactors for both phases numerically. Furthermore, we analytically compute the PDF prefactor when the spatially homogeneous instanton dominates, which, in particular, allows us to determine the critical observable value \(z_{\text {c}}\). 
We will focus on a single choice of the only parameter of the system, the non-dimensionalized domain size l, and use \(l = \pi \) throughout this paper. We remark that it would be an interesting future work to systematically study the large deviation properties of the system for different domain sizes l using the methods developed here, and to derive a complete phase diagram in the (l, z) plane for the system, similar to [73].
To be more precise, we consider the KPZ equation in one spatial dimension on a bounded interval in space [0, L] with periodic boundary conditions for the surface height \(H :[0, L] \times [0,T] \rightarrow \mathbb {R}\),
starting from a flat initial profile \(H(\cdot , 0) = H_0 \equiv 0\), and are interested in precise asymptotic estimates for the probability distribution (and in particular its tails) of the spatially averaged surface height at time T,
for small T. In (168), we denote by \(\nu > 0\) the diffusivity, by \(\lambda > 0\) (the choice of sign is without loss of generality) the strength of the nonlinearity, and by \(D > 0\) the noise strength. The noise term \(\eta \) is assumed to be space-time white Gaussian noise with
The non-dimensionalization \(t \rightarrow t T\), \(x \rightarrow \sqrt{\nu T} x\), \(H \rightarrow 2 \nu H / \lambda \) and \(\eta \rightarrow \left( \nu T^3 \right) ^{-1/4} \eta \) leads to the model that we consider in all subsequent computations: For a dimensionless noise strength \(\varepsilon = D \lambda ^2 T^{1/2} / (4 \nu ^{5/2}) > 0\), we consider \(H^{\varepsilon } :[0,l] \times [0,1] \rightarrow \mathbb {R}\) with \(l = L / \sqrt{\nu T}\) the solution of
and are interested in estimating the PDF of the mean surface height
at the final time as \(\varepsilon \downarrow 0\). The small noise limit in these dimensionless variables can be seen to directly correspond to either of the limits \(D \downarrow 0\) or \(\lambda \downarrow 0\) in the physical variables. Additionally, as mentioned above, we choose a fixed and finite non-dimensionalized domain size \(l = \pi \) in all of our numerical computations, so the usual short-time limit \(T \downarrow 0\) considered in KPZ large deviations actually corresponds to simultaneously taking \(T \downarrow 0\) and \(\nu \propto T^{-1} \uparrow \infty \) in this setup if the physical domain size remains constant.
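The relation between the physical and dimensionless parameters can be made concrete with a short sketch. The formulas for \(\varepsilon \) and l are those stated above; the function names and the numerical values of D, \(\lambda \) and L are hypothetical and serve only to exercise the formulas.

```python
import math

def kpz_dimensionless(D, lam, nu, T, L):
    """Return (eps, l) from noise strength D, nonlinearity lam, diffusivity nu, time T, size L."""
    eps = D * lam**2 * math.sqrt(T) / (4.0 * nu**2.5)
    l = L / math.sqrt(nu * T)
    return eps, l

def diffusivity_for_fixed_l(L, T, l=math.pi):
    """Diffusivity nu that keeps l = L / sqrt(nu * T) fixed for given L and T."""
    return L**2 / (l**2 * T)

# Hypothetical physical values, chosen only for illustration.
D, lam, L = 4.0, 1.0, math.pi
eps_1, l_1 = kpz_dimensionless(D, lam, diffusivity_for_fixed_l(L, 1.0), 1.0, L)
eps_2, l_2 = kpz_dimensionless(D, lam, diffusivity_for_fixed_l(L, 0.5), 0.5, L)
```

With l held fixed, \(\nu \propto T^{-1}\) and therefore \(\varepsilon \propto T^{1/2} \nu ^{-5/2} \propto T^3\): halving T in this example reduces \(\varepsilon \) by a factor of 8 while l stays at \(\pi \).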
For spatially white noise, the KPZ equation (171) is only well-posed after renormalization, the noise being too rough for the nonlinearity \(- \tfrac{1}{2} \left( \partial _x H^\varepsilon \right) ^2\) to make sense otherwise [74, 75]. While this is not an issue on the level of instanton computations, the solutions of which are expected to be classically differentiable, renormalization is necessary when dealing with the random fluctuations around the instanton. We interpret (171) as the result of applying a Cole-Hopf transformation to the field \({{\mathcal {Q}}}^\varepsilon :[0, l] \times [0,1] \rightarrow (0,\infty )\), solving the well-posed stochastic heat equation (SHE) with multiplicative noise in the Itô sense
Then, the height field of the KPZ equation (171) is given by
and a formal application of Itô’s lemma shows that the Cole-Hopf transformation generates a counter-term \(-\delta (0)\), where \(\delta \) is Dirac’s delta function, on the right-hand side of (171) that intuitively cancels the divergences in the original KPZ equation. We will compute the contribution of the Gaussian fluctuations to the distribution of the observable (172) within this interpretation of the KPZ equation, i.e. actually consider the observable
for the SHE.
The instanton equations (5) for the example (171) and (172) that determine the instanton \((h_z, {\tilde{h}}_z, \lambda _z)\) written in terms of the original field and its conjugate momentum read (see [76] for an early reference that derives these equations)
In terms of the SHE, the instanton equations for the fields \((q_z, p_z, \lambda _z)\) with
become
The idea is now that a trivial spatially homogeneous critical point \((h_z^{\text {hom}}, {\tilde{h}}_z^{\text {hom}}, \lambda _z^{\text {hom}})\) of the action functional for the average height observable, i.e. a solution of (176), is always given by
with corresponding SHE instantons
leading to the Gaussian rate function
for all such \(z \in \mathbb {R}\) for which this critical point realizes the global minimum of the action under the boundary condition \(f \left( h_z(\cdot , 1) \right) = z\). However, one might expect (with reference to the typical growth patterns of the KPZ equation due to the nonlinearity and diffusion, as sketched in [70], as well as the results and scaling estimates of [73]) that for sufficiently large \(z > z_{\text {c}}\) in the right tail of the distribution of \(f \left( H^\varepsilon (\cdot , 1) \right) \), the KPZ nonlinearity will favor a nonuniform surface growth in order to achieve a large average height, such that the rate function displays a non-equilibrium phase transition to a continuous family of spatially localized global minimizers \(\left\{ \left( h_z^{\text {loc}, u_0}, {\tilde{h}}_z^{\text {loc}, u_0}, \lambda _z^{\text {loc}}\right) \big | u_0 \in [0, l) \right\} \) of the instanton equations. This intuitive picture is indeed confirmed by our numerical computations of instantons for this example, performed directly for (176). The corresponding results for the rate function as well as the space-time evolution of typical instantons are shown in Fig. 5. For these instanton computations, we used a pseudo-spectral discretization in terms of \(n_x = 128\) Fourier modes in space [0, l] with \(l = \pi \) and a second-order explicit Runge-Kutta integrator in time [0, 1] with an integrating factor for the diffusion terms with \(n_t = 2 \cdot 10^4\) equidistant time steps of size \(\Delta t = 5 \cdot 10^{-5}\). The comparably high resolution in time turned out to be necessary for the subsequent Riccati equation integrations, for which the instantons serve as an input, as detailed below. 
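The time stepping described above can be illustrated on a toy problem. The sketch below is not the code used for the instanton computations: it applies one possible integrating-factor Heun scheme to a generic equation \(\partial _t u = \partial _{xx} u + N(u)\) on a periodic domain, with the nonlinearity passed in as a hypothetical callable `nonlinear_hat`, and it omits the coupled forward-backward structure of the instanton equations as well as any anti-aliasing.

```python
import numpy as np

def if_heun_step(u_hat, dt, k, nonlinear_hat):
    """One integrating-factor Heun step for u_t = u_xx + N(u) in Fourier space."""
    E = np.exp(-k**2 * dt)           # exact propagator of the diffusion term over dt
    n0 = nonlinear_hat(u_hat)
    u_pred = E * (u_hat + dt * n0)   # Euler predictor in the integrating-factor variable
    n1 = nonlinear_hat(u_pred)
    return E * u_hat + 0.5 * dt * (E * n0 + n1)  # trapezoidal corrector

def integrate(u0, l, n_t, nonlinear_hat, t_final=1.0):
    """Advance u0 from t = 0 to t_final on the periodic domain [0, l)."""
    n_x = u0.size
    k = 2.0 * np.pi * np.fft.fftfreq(n_x, d=l / n_x)  # angular wavenumbers
    u_hat = np.fft.fft(u0)
    dt = t_final / n_t
    for _ in range(n_t):
        u_hat = if_heun_step(u_hat, dt, k, nonlinear_hat)
    return np.real(np.fft.ifft(u_hat))

# Toy check: with N = 0 the scheme reproduces the heat-equation decay exactly.
n_x, l = 128, np.pi
x = np.arange(n_x) * l / n_x
u0 = np.cos(2.0 * x)
u1 = integrate(u0, l, n_t=200, nonlinear_hat=lambda u_hat: np.zeros_like(u_hat))
```

Treating the diffusion term exactly through the factor `E` removes the diffusive stability restriction on the explicit step, so the step size is set by the accuracy requirements of the nonlinear terms alone.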
In order to directly compute instantons for prescribed observable values z, equidistantly spaced in \([-10, 20]\), we used a penalty-type method and minimized the action using L-BFGS steps with exact discrete adjoint gradient evaluations, reducing the \(L^2\)-norm of the action gradient by a factor of \(10^6\) in each subproblem. For details on the optimization procedure, we refer the reader to [48].
From the results of the instanton computations, we see that this constitutes an example of a dynamical phase transition in an irreversible SPDE where the associated symmetry that is broken is continuous, thereby allowing us to apply the methods developed in the previous section in order to compute not only the large deviation rate function, given by the pointwise minimum of the two branches in Fig. 5, but also a more refined, asymptotically sharp prefactor estimate. The phase transition is second order, as can be seen from the derivative of the rate function in the left subplot of Fig. 6, and we also show the \(L^2\) norm of \(\partial _x h_z\) for the instantons as an order parameter for the different phases in the center subplot of Fig. 6. Since the KPZ equation is a non-equilibrium system, in contrast to the previous example 4.3, the complete Riccati formalism and the corresponding numerical integration of a Riccati partial differential equation with regularized boundary data is now required to get the leading order prefactor.
When the spatially homogeneous instanton dominates, the rate function of the average surface height in the small noise limit is Gaussian with
but the prefactor component \(R_z\) can still depend nontrivially on z. The only restriction on the function \(R_\cdot \) is that at \(z = 0\), we have \(R_0 = 1\) for correct normalization of the PDF as \(\varepsilon \downarrow 0\). In the case of the spatially homogeneous instanton, the prefactor component \(R_z\) can be found analytically using probabilistic methods without explicit reference to the functional integration methods developed here, which is carried out in detail in Appendix C. The analysis of \(R_z\) for the homogeneous instantons in particular yields the prediction that the critical observable value \(z_{\text {c}}\) for the second order phase transition, where the \((k = 1)\)-contribution to the prefactor is found to diverge, is the smallest nontrivial real solution of the equation
for \(l = \pi \) and hence \(z_{\text {c}}(l = \pi ) \approx 2.8259\) as sketched in Fig. 5, which matches the numerical results of the instanton computations quite well.
Now, we turn to the numerical prefactor computation in the SHE formulation using Riccati fields. We use the backward Riccati formalism (see footnote 2 under Notes) from Proposition 3.3.1. The result for the PDF of \(F({{\mathcal {Q}}}^\varepsilon (\cdot , 1))\) as \(\varepsilon \downarrow 0\) is given by
In (184), the prefactor components
and
with volume factor
depend on the backward Riccati field \(W_z :[0, l]^2 \times [0,1] \rightarrow \mathbb {R}\) solving
for both cases along the respective instantons, and with final condition
for the homogeneous instanton and
for the spatially localized instanton. In all of these expressions, the zero mode is given by
with \(u_0 \in [0, l)\) denoting the reference position of the localized instanton, and the normalized zero mode is defined by
Numerically evaluating the prefactor by solving the Riccati equation and differentiating \(\lambda _z^{\text {loc}}\) with respect to z using finite differences, we obtain the results shown in the right panel of Fig. 6 for the leading order prefactor
where \(r(z) = 0\) for \(z < z_{\text {c}}\) and \(r(z) = 1\) for \(z > z_{\text {c}}\). For the solution of the Riccati equation (188), we also used a pseudo-spectral, anti-aliased code at spatial resolution \(n_x = 128\) with the Cole-Hopf transformed KPZ instantons as an input. For the time stepping, the same Heun integrator with an appropriate integrating factor in Fourier space was used, but we had to choose a different time resolution for numerical stability reasons. It turned out that the final condition (190) requires extremely small time steps in the vicinity of \(t = 1\), and accordingly, we divided the time interval [0, 1] into two subintervals \(I_1 = [0, t_0]\) and \(I_2 = [t_0, 1]\), with a smaller time step \(\Delta t_2\) within \(I_2\) than the step \(\Delta t_1\) within \(I_1\). All results shown in Fig. 6 were generated using \(t_0 = 0.99995\), \(\Delta t_1 \approx 1.1 \cdot 10^{-5}\) and \(\Delta t_2 = 5 \cdot 10^{-9}\) with \(n_t = 10^5\) time steps in total. Further increasing the resolution would allow extending the dashed curve in Fig. 6 to higher values of z, the limiting factor being the size of \(\Delta t_2\). We made sure that the results shown are invariant under modifications of \(\Delta t_1\), \(\Delta t_2\) and \(t_0\) as long as these yield finite results.
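The two-level time discretization can be reproduced from the quoted parameters. The sketch below assumes uniform steps within each subinterval; the function name `two_level_time_grid` is hypothetical, and only \(t_0\), \(\Delta t_2\) and \(n_t\) are taken from the text.

```python
import numpy as np

def two_level_time_grid(t0, dt2, n_total, t_final=1.0):
    """Grid on [0, t_final]: fine steps dt2 on [t0, t_final], coarser uniform steps on [0, t0]."""
    n2 = int(round((t_final - t0) / dt2))  # number of fine steps in I_2
    n1 = n_total - n2                      # remaining steps cover I_1
    coarse = np.linspace(0.0, t0, n1 + 1)
    fine = np.linspace(t0, t_final, n2 + 1)
    # Drop the duplicated point t0 when joining the two subgrids.
    return np.concatenate([coarse, fine[1:]]), n1, n2

t, n1, n2 = two_level_time_grid(t0=0.99995, dt2=5e-9, n_total=10**5)
dt1 = 0.99995 / n1
```

Under these assumptions, \(10^4\) of the \(10^5\) steps fall into \(I_2\), and the remaining \(9 \cdot 10^4\) steps on \(I_1\) give \(\Delta t_1 = t_0 / (9 \cdot 10^4) \approx 1.11 \cdot 10^{-5}\), consistent with the value quoted above.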
From the right subplot of Fig. 6, we see that for the spatially homogeneous instanton, the numerical results from solving the backward Riccati equation (188) closely match the analytical calculations from Appendix C. Further, the prefactor beyond the critical observable value \(z_{\text {c}}\) only has a weak dependence on z, and the behavior at \(z > 9\) is only due to the fact that a higher time resolution would be needed there. Furthermore, we show the instanton and the corresponding solution of the Riccati equation at different times for observable values \(z = 2 < z_{\text {c}}\) and \(z = 8 > z_{\text {c}}\) in Fig. 7. All in all, we have demonstrated with this example that the formalism developed in this paper can indeed be employed to analyze nontrivial, spatially extended non-equilibrium systems in the presence of phase transitions.
5 Discussion and Outlook
Going beyond large deviation estimates and obtaining sharp limits for rare events in stochastic systems is important for many applications, including nonequilibrium phase transitions. Importantly, one obtains the full limiting rare event probability or probability density instead of merely its exponential scaling, in regimes where direct sampling methods are completely intractable. In this paper, we have first set out to rederive such prefactor formulas at leading order for unique instantons [30,31,32,33], expressed in terms of Riccati matrices, explicitly using tools from field theory, i.e. by evaluating the appearing functional determinants using Forman’s theorem [40]. The resulting derivations are short and conceptually simple. We stressed the role of the MGF for a vast simplification of the computations, which in particular simplifies the boundary conditions of the second variation operator in path space. Secondly, writing the prefactor in terms of operator determinants allowed us to extend the Riccati formalism to situations where the second variation around the instanton path that is used for the expansion is only positive semi-definite due to the presence of zero modes, i.e. degenerate submanifolds of instantons. We have demonstrated, using boundary-type regularizations [42], that the Riccati approach remains feasible in this case, i.e. that the reduced functional determinant with removed zero eigenvalues can still be expressed through the solution of the same matrix Riccati differential equation, only with modified initial/final conditions or evaluations involving knowledge of the zero modes. Afterwards, we have verified our results in four different examples involving linear and nonlinear, reversible and irreversible SDEs as well as a nonlinear irreversible SPDE, the KPZ equation, exhibiting spontaneous symmetry breaking of the instantons for the average surface height.
With the general treatment of zero modes completed, it is now theoretically possible to compute leading order large deviation prefactors even for multi-dimensional SPDEs such as the two-dimensional or three-dimensional Navier-Stokes equations where spontaneous symmetry breaking of the rotational symmetry of instantons has indeed been observed [47, 48]. The remaining complication for numerical computations is the high dimensionality of the involved Riccati matrices, and it would be interesting future work to consider low-rank approximations of the Riccati differential equations [77] in this regard, that could e.g. make use of the sparsity of the large-scale forcing typically used in turbulence simulations. Alternatively, an approach based on computing only the dominant eigenvalues of a Carleman-Fredholm determinant expression for the prefactor [36] could be used for numerical computations, which will be the subject of a future publication. Another interesting project would be the development of efficient importance sampling algorithms for rare events as e.g. in [78] for systems with non-unique instantons due to symmetry breaking.
Data Availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Notes
We do not attempt to give mathematically strict conditions on the drift field b, diffusion matrix a and observable f in this paper, which, beyond the existence and uniqueness of solutions of (1), would also guarantee the rigorous applicability of the results of the following sections. For the case of component projections as observables and unique instantons, we refer the reader e.g. to [13, 14] for works in this direction.
The system at hand is an example where, regardless of the spontaneous symmetry breaking and indeed already for the spatially homogeneous instanton, the forward Riccati equation can be ill-posed for certain observable values, whereas the backward equation remains well-posed for the same observable values. Conceptually, we conjecture that this is due to the fact that divergences of the backward Riccati matrix \(W = \zeta \gamma ^{-1}\) are related to conjugate points and violations of the positive definiteness of the second variation at the instanton, whereas divergences of the forward Riccati matrix \(Q = \gamma \zeta ^{-1}\) can appear when the momentum passes through zero without “physical” consequences. In the example of this subsection, one can find parameters for which the solution of the forward Riccati equation in (C20) passes through a singularity in (0, T), prohibiting forward numerical integration, while the analytical result (C21) remains finite. This is the reason why we use the backward Riccati approach for all numerical computations in this subsection.
References
Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Springer, Berlin, Heidelberg (2010)
Piterbarg, V.I., Fatalov, V.R.: The Laplace method for probability measures in Banach spaces. Russ. Math. Surv. 50, 1151 (1995). https://doi.org/10.1070/RM1995v050n06ABEH002635
Freidlin, M.I., Wentzell, A.D.: Random Perturbations of Dynamical Systems, vol. 260. Springer, Berlin (2012)
Coleman, S.: The uses of instantons. In: Zichichi, A. (ed.) The Whys of Subnuclear Physics. The Subnuclear Series, vol. 15, pp. 805–941. Springer US, Berlin (1979)
Vainshtein, A., Zakharov, V.I., Novikov, V., Shifman, M.A.: ABC of Instantons. Sov. Phys. Usp. 25, 195 (1982). https://doi.org/10.1070/PU1982v025n04ABEH004533
Chernykh, A.I., Stepanov, M.G.: Large negative velocity gradients in Burgers turbulence. Phys. Rev. E 64, 026306 (2001). https://doi.org/10.1103/PhysRevE.64.026306
E, W., Ren, W., Vanden-Eijnden, E.: Minimum action method for the study of rare events. Commun. Pure Appl. Math. 57, 637 (2004). https://doi.org/10.1002/cpa.20005
Bouchet, F., Laurie, J., Zaboronski, O.: Control and instanton trajectories for random transitions in turbulent flows. J. Phys.: Conf. Ser. 318, 022041 (2011). https://doi.org/10.1088/1742-6596/318/2/022041
Grafke, T., Grauer, R., Schäfer, T., Vanden-Eijnden, E.: Relevance of instantons in Burgers turbulence. EPL 109, 34003 (2015). https://doi.org/10.1209/0295-5075/109/34003
Dematteis, G., Grafke, T., Onorato, M., Vanden-Eijnden, E.: Experimental evidence of hydrodynamic instantons: The universal route to rogue waves. Phys. Rev. X 9, 041057 (2019). https://doi.org/10.1103/PhysRevX.9.041057
Gurarie, V., Migdal, A.: Instantons in the Burgers equation. Phys. Rev. E 54, 4908 (1996). https://doi.org/10.1103/PhysRevE.54.4908
Balkovsky, E., Falkovich, G., Kolokolov, I., Lebedev, V.: Intermittency of Burgers’ turbulence. Phys. Rev. Lett. 78, 1452 (1997). https://doi.org/10.1103/PhysRevLett.78.1452
Deuschel, J.-D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffusions and stochastic volatility I: Theoretical foundations. Commun. Pure Appl. Math. 67, 40 (2014). https://doi.org/10.1002/cpa.21478
Deuschel, J.-D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffusions and stochastic volatility II: Applications. Commun. Pure Appl. Math. 67, 321 (2014). https://doi.org/10.1002/cpa.21483
Krajenbrink, A., Le Doussal, P.: Inverse scattering of the Zakharov–Shabat system solves the weak noise theory of the Kardar–Parisi–Zhang equation. Phys. Rev. Lett. 127, 064101 (2021). https://doi.org/10.1103/PhysRevLett.127.064101
Touchette, H.: The large deviation approach to statistical mechanics. Phys. Rep. 478, 1 (2009). https://doi.org/10.1016/j.physrep.2009.05.002
Grafke, T., Grauer, R., Schäfer, T.: The instanton method and its numerical implementation in fluid mechanics. J. Phys. A 48, 333001 (2015). https://doi.org/10.1088/1751-8113/48/33/333001
Grafke, T., Vanden-Eijnden, E.: Numerical computation of rare events via large deviation theory. Chaos 29, 063118 (2019). https://doi.org/10.1063/1.5084025
Eyring, H.: The activated complex in chemical reactions. J. Chem. Phys. 3, 107 (1935). https://doi.org/10.1063/1.1749604
Kramers, H.A.: Brownian motion in a field of force and the diffusion model of chemical reactions. Physica 7, 284 (1940). https://doi.org/10.1016/S0031-8914(40)90098-2
Bovier, A., Eckhoff, M., Gayrard, V., Klein, M.: Metastability in reversible diffusion processes I: Sharp asymptotics for capacities and exit times. J. Eur. Math. Soc. 6, 399 (2004). https://doi.org/10.4171/JEMS/14
Berglund, N.: Kramers’ law: Validity, derivations and generalisations. Markov Process. Relat. Fields 19, 459 (2013)
Berglund, N., Di Gesù, G., Weber, H.: An Eyring–Kramers law for the stochastic Allen–Cahn equation in dimension two. Electron. J. Probab. 22, 1 (2017). https://doi.org/10.1214/17-EJP60
Bouchet, F., Reygner, J.: Generalisation of the Eyring–Kramers transition rate formula to irreversible diffusion processes. Ann. Henri Poincaré 17, 3499 (2016). https://doi.org/10.1007/s00023-016-0507-4
Landim, C., Seo, I.: Metastability of nonreversible random walks in a potential field and the Eyring–Kramers transition rate formula. Commun. Pure Appl. Math. 71, 203 (2018). https://doi.org/10.1002/cpa.21723
Lehmann, J., Reimann, P., Hänggi, P.: Activated escape over oscillating barriers: the case of many dimensions. Physica Status Solidi (b) 237, 53 (2003). https://doi.org/10.1002/pssb.200301774
Nickelsen, D., Engel, A.: Asymptotics of work distributions: the pre-exponential factor. Eur. Phys. J. B 82, 207 (2011). https://doi.org/10.1140/epjb/e2011-20133-y
Nickelsen, D., Touchette, H.: Noise correction of large deviations with anomalous scaling. Phys. Rev. E 105, 064102 (2022). https://doi.org/10.1103/PhysRevE.105.064102
Kikuchi, L., Adhikari, R., Kappler, J.: Diffusivity dependence of the transition path ensemble. (2022) arXiv:2203.12947
Schorlepp, T., Grafke, T., Grauer, R.: Gel’fand–Yaglom type equations for calculating fluctuations around instantons in stochastic systems. J. Phys. A 54, 235003 (2021). https://doi.org/10.1088/1751-8121/abfb26
Grafke, T., Schäfer, T., Vanden-Eijnden, E.: Sharp Asymptotic Estimates for Expectations, Probabilities, and Mean First Passage Times in Stochastic Systems with Small Noise. (2021) arXiv:2103.04837
Ferré, G., Grafke, T.: Approximate optimal controls via instanton expansion for low temperature free energy computation. Multiscale Model. Simul. 19, 1310 (2021). https://doi.org/10.1137/20M1385809
Bouchet, F., Reygner, J.: Path integral derivation and numerical computation of large deviation prefactors for non-equilibrium dynamics through matrix Riccati equations. J. Stat. Phys. 189, 1 (2022). https://doi.org/10.1007/s10955-022-02983-7
Ellis, R.S., Rosen, J.S.: Asymptotic analysis of Gaussian integrals, II: Manifold of minimum points. Commun. Math. Phys. 82, 153 (1981). https://doi.org/10.1007/BF02099914
Ellis, R.S., Rosen, J.S.: Asymptotic analysis of Gaussian integrals I. Isolated minimum points. Trans. Am. Math. Soc. 273, 447 (1982). https://doi.org/10.2307/1999924
Ben Arous, G.: Méthodes de Laplace et de la phase stationnaire sur l’espace de Wiener. Stochastics 25, 125 (1988). https://doi.org/10.1080/17442508808833536
Tong, S., Vanden-Eijnden, E., Stadler, G.: Extreme event probability estimation using PDE-constrained optimization and large deviation theory, with application to tsunamis. Commun. Appl. Math. Comput. Sci. 16, 181 (2021). https://doi.org/10.2140/camcos.2021.16.181
Psaros, A.F., Kougioumtzoglou, I.A.: Functional series expansions and quadratic approximations for enhancing the accuracy of the Wiener path integral technique. J. Eng. Mech. 146, 04020065 (2020). https://doi.org/10.1061/(ASCE)EM.1943-7889.0001793
Gel’fand, I.M., Yaglom, A.M.: Integration in functional spaces and its applications in quantum physics. J. Math. Phys. 1, 48 (1960). https://doi.org/10.1063/1.1703636
Forman, R.: Functional determinants and geometry. Invent. Math. 88, 447 (1987). https://doi.org/10.1007/BF01391828
Berglund, N., Gentz, B.: The Eyring–Kramers law for potentials with nonquadratic saddles. Markov Process. Relat. Fields 16, 549 (2010)
Falco, G., Fedorenko, A.A., Gruzberg, I.A.: On functional determinants of matrix differential operators with multiple zero modes. J. Phys. A 50, 485201 (2017). https://doi.org/10.1088/1751-8121/aa9205
Janas, M., Kamenev, A., Meerson, B.: Dynamical phase transition in large-deviation statistics of the Kardar–Parisi–Zhang equation. Phys. Rev. E 94, 032133 (2016). https://doi.org/10.1103/PhysRevE.94.032133
Krajenbrink, A., Le Doussal, P.: Exact short-time height distribution in the one-dimensional Kardar–Parisi–Zhang equation with Brownian initial condition. Phys. Rev. E 96, 020102 (2017). https://doi.org/10.1103/PhysRevE.96.020102
Smith, N.R., Kamenev, A., Meerson, B.: Landau theory of the short-time dynamical phase transitions of the Kardar–Parisi–Zhang interface. Phys. Rev. E 97, 042130 (2018). https://doi.org/10.1103/PhysRevE.97.042130
Hartmann, A.K., Meerson, B., Sasorov, P.: Observing symmetry-broken optimal paths of the stationary Kardar–Parisi–Zhang interface via a large-deviation sampling of directed polymers in random media. Phys. Rev. E 104, 054125 (2021). https://doi.org/10.1103/PhysRevE.104.054125
Falkovich, G., Lebedev, V.: Vorticity statistics in the direct cascade of two-dimensional turbulence. Phys. Rev. E 83, 045301 (2011). https://doi.org/10.1103/PhysRevE.83.045301
Schorlepp, T., Grafke, T., May, S., Grauer, R.: Spontaneous symmetry breaking for extreme vorticity and strain in the three-dimensional Navier–Stokes equations. Philos. Trans. R. Soc. A 380, 20210051 (2022). https://doi.org/10.1098/rsta.2021.0051
Alqahtani, M., Grigorio, L., Grafke, T.: Extreme events and instantons in Lagrangian passive scalar turbulence models. Phys. Rev. E 106, 015101 (2022). https://doi.org/10.1103/PhysRevE.106.015101
Bertini, L., De Sole, A., Gabrielli, D., Jona-Lasinio, G., Landim, C.: Macroscopic fluctuation theory. Rev. Mod. Phys. 87, 593 (2015). https://doi.org/10.1103/RevModPhys.87.593
Hurtado, P.I., Garrido, P.L.: Spontaneous symmetry breaking at the fluctuating level. Phys. Rev. Lett. 107, 180601 (2011). https://doi.org/10.1103/PhysRevLett.107.180601
Zarfaty, L., Meerson, B.: Statistics of large currents in the Kipnis–Marchioro–Presutti model in a ring geometry. J. Stat. Mech. 2016, 033304 (2016). https://doi.org/10.1088/1742-5468/2016/03/033304
Heymann, M., Vanden-Eijnden, E.: The geometric minimum action method: a least action principle on the space of curves. Commun. Pure Appl. Math. 61, 1052 (2008). https://doi.org/10.1002/cpa.20238
Alqahtani, M., Grafke, T.: Instantons for rare events in heavy-tailed distributions. J. Phys. A 54, 175001 (2021). https://doi.org/10.1088/1751-8121/abe67b
Langouche, F., Roekaerts, D., Tirapegui, E.: Functional Integration and Semiclassical Expansions. Springer, Dordrecht (1982)
Cugliandolo, L.F., Lecomte, V.: Rules of calculus in the path integral representation of white noise Langevin equations: the Onsager–Machlup approach. J. Phys. A 50, 345001 (2017). https://doi.org/10.1088/1751-8121/aa7dd6
Itami, M., Sasa, S.: Universal form of stochastic evolution for slow variables in equilibrium systems. J. Stat. Phys. 167, 46 (2017). https://doi.org/10.1007/s10955-017-1738-6
Kleinert, H.: Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets. World Scientific, Singapore (2009)
Vilenkin, A., Yamada, M.: Tunneling wave function of the universe. Phys. Rev. D 98, 066003 (2018). https://doi.org/10.1103/PhysRevD.98.066003
Di Tucci, A., Lehners, J.-L.: No-boundary proposal as a path integral with Robin boundary conditions. Phys. Rev. Lett. 122, 201302 (2019). https://doi.org/10.1103/PhysRevLett.122.201302
Ray, D.B., Singer, I.M.: R-torsion and the Laplacian on Riemannian manifolds. Adv. Math. 7, 145 (1971). https://doi.org/10.1016/0001-8708(71)90045-4
Dunne, G.V.: Functional determinants in quantum field theory. J. Phys. A 41, 304006 (2008). https://doi.org/10.1088/1751-8113/41/30/304006
Montroll, E.W.: Markoff chains, Wiener integrals, and quantum theory. Commun. Pure Appl. Math. 5, 415 (1952). https://doi.org/10.1002/cpa.3160050403
Bleistein, N., Handelsman, R.A.: Asymptotic Expansions of Integrals. Ardent Media, London (1975)
McKane, A.J., Tarlie, M.B.: Regularization of functional determinants using boundary perturbations. J. Phys. A 28, 6931 (1995). https://doi.org/10.1088/0305-4470/28/23/032
Kleinert, H., Chervyakov, A.: Simple explicit formulas for Gaussian path integrals with time-dependent frequencies. Phys. Lett. A 245, 345 (1998). https://doi.org/10.1016/S0375-9601(98)00380-6
Faddeev, L.D., Popov, V.N.: Feynman diagrams for the Yang–Mills field. Phys. Lett. B 25, 29 (1967). https://doi.org/10.1016/0370-2693(67)90067-6
Corazza, G., Singh, R.: Unraveling looping efficiency of stochastic Cosserat polymers. Phys. Rev. Res. 4, 013071 (2022). https://doi.org/10.1103/PhysRevResearch.4.013071
Zhou, J.X., Aliyu, M., Aurell, E., Huang, S.: Quasi-potential landscape in complex multi-stable systems. J. R. Soc. Interface 9, 3539 (2012). https://doi.org/10.1098/rsif.2012.0434
Kardar, M., Parisi, G., Zhang, Y.-C.: Dynamic scaling of growing interfaces. Phys. Rev. Lett. 56, 889 (1986). https://doi.org/10.1103/PhysRevLett.56.889
Calabrese, P., Le Doussal, P.: Exact solution for the Kardar–Parisi–Zhang equation with flat initial conditions. Phys. Rev. Lett. 106, 250603 (2011). https://doi.org/10.1103/PhysRevLett.106.250603
Krajenbrink, A., Le Doussal, P., Prolhac, S.: Systematic time expansion for the Kardar–Parisi–Zhang equation, linear statistics of the GUE at the edge and trapped fermions. Nucl. Phys. B 936, 239 (2018). https://doi.org/10.1016/j.nuclphysb.2018.09.019
Smith, N.R., Meerson, B., Sasorov, P.: Finite-size effects in the short-time height distribution of the Kardar–Parisi–Zhang equation. J. Stat. Mech. 2018, 023202 (2018). https://doi.org/10.1088/1742-5468/aaa783
Quastel, J.: Introduction to KPZ. Curr. Dev. Math. 2011, 125 (2011). https://doi.org/10.4310/CDM.2011.v2011.n1.a3
Hairer, M.: Solving the KPZ equation. Ann. Math. 178, 559 (2013). https://doi.org/10.4007/annals.2013.178.2.4
Fogedby, H.C.: Canonical phase-space approach to the noisy Burgers equation: probability distributions. Phys. Rev. E 59, 5065 (1999). https://doi.org/10.1103/PhysRevE.59.5065
Breiten, T., Dolgov, S., Stoll, M.: Solving differential Riccati equations: A nonlinear space-time method using tensor trains. Numer. Algebra Control Optim. 11, 407 (2021). https://doi.org/10.3934/naco.2020034
Ebener, L., Margazoglou, G., Friedrich, J., Biferale, L., Grauer, R.: Instanton based importance sampling for rare events in stochastic PDEs. Chaos 29, 063102 (2019). https://doi.org/10.1063/1.5085119
Evans, L.C.: Mathematical methods for optimization: Dynamic optimization. Lecture Notes. MIT Press, Cambridge (2021)
Corazza, G., Fadel, M.: Normalized Gaussian path integrals. Phys. Rev. E 102, 022135 (2020). https://doi.org/10.1103/PhysRevE.102.022135
Levi, M.: Classical Mechanics with Calculus of Variations and Optimal Control: An Intuitive Introduction, vol. 69. American Mathematical Society, Providence (2014)
Reid, W.T.: Riccati Differential Equations, vol. 86. Academic Press, Berlin (1972)
Clarke, F.H., Zeidan, V.: Sufficiency and the Jacobi condition in the calculus of variations. Can. J. Math. 38, 1199 (1986). https://doi.org/10.4153/CJM-1986-060-5
Feng, J., Kurtz, T.G.: Large Deviations for Stochastic Processes, vol. 131. American Mathematical Society, Providence (2006)
Acknowledgements
The authors wish to thank Baruch Meerson, Pavel Sasorov and Naftali Smith for pointing out important literature and sharing their insights on the dynamical phase transition for the spatially averaged surface height of the KPZ equation. T.S. and R.G. benefited from support through the DFG collaborative research center SFB-1491. T.G. acknowledges the support received from the EPSRC projects EP/T011866/1 and EP/V013319/1.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Giulio Biroli.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Forman’s Theorem for the Second Variation of a General Action Functional
In the remainder of this appendix, we focus on operators originating from the second variation of a generic action functional
for paths \(\phi :[0,T] \rightarrow \mathbb {R}^n\) with boundary conditions that we do not specify in this section. Expanding the action to second order around a stationary path \(\phi \) yields the following quadratic form:
with the convention
and all derivatives of L evaluated along \(\phi \). We transform this expression into the form \(\tfrac{1}{2}\int _0^T \left\langle \gamma , \Omega \gamma \right\rangle _n \textrm{d}t\) via partial integration:
Here, \(\left[ \cdot , \cdot \right] \) denotes the commutator of two operators. With the definition \(\theta := \nabla _{{\dot{\phi }}} L\) for the conjugate momentum and hence
for the momentum fluctuations, the additional boundary term that we obtain and that needs to vanish through the imposition of suitable boundary conditions (cf. main text) for the fluctuations is \( \tfrac{1}{2} \left. \left\langle \gamma , \zeta \right\rangle _n \right| ^T_0\), leaving us with
Written in this way, the Jacobi operator [79], i.e. the second order linear differential operator
realizing the second variation is \(L^2([0,T],\mathbb {R}^n)\)-self-adjoint, i.e. \(\left\langle \gamma _1, \Omega \gamma _2 \right\rangle = \left\langle \Omega \gamma _1, \gamma _2 \right\rangle \) for all fluctuation paths with boundary conditions such that \( \left. \left\langle \zeta _1, \gamma _2 \right\rangle _n \right| ^T_0 - \left. \left\langle \gamma _1, \zeta _2 \right\rangle _n \right| ^T_0 = 0\).
For the first order equation in Forman’s theorem, we read off
The first order version of the Jacobi equation
appearing in Forman’s theorem hence becomes
In many applications, such as in this paper, it is more natural to switch to a Hamiltonian instead of a Lagrangian formulation of the Jacobi equation. In fact, we have already seen above that the natural boundary conditions for the fluctuations include the conjugate momentum fluctuations. For this reason, we associate to the fundamental system of solutions \(\Upsilon \) of \(\Omega \), understood as a first order differential equation (A16) in \((\gamma ,{\dot{\gamma }})\), the following fundamental system of solutions
The transformation is invertible iff \(P_0 = - \nabla ^2_{{\dot{\phi }}} L\) is invertible (which is exactly an assumption of Forman’s theorem) with
A straightforward calculation then shows that
with
where J is the standard \(2n \times 2n\) symplectic matrix. The second equality in (A20) holds if \(\phi \) is a critical point of the action functional, or, equivalently, \((\phi , \theta )\) is a solution of the canonical equations of motion
The Hamiltonian H is defined via
where the second equality follows by assuming strict convexity of L in \({\dot{\phi }}\) and solving the implicit equation \(\theta = \partial L(\phi , {\dot{\phi }}) / \partial {\dot{\phi }}\) for \({\dot{\phi }}\). Let us summarize the results of the transformation in the following proposition.
Remark A.3
If the second order coefficient matrix \(\nabla ^2_{{\dot{\phi }}} L\) does depend on the path around which the expansion is performed, as is the case for multiplicative noise in the main text, then considering variations \(\left( \nabla ^2_{{\dot{\phi }}} L\right) ^{-1/2} \gamma \) instead of \(\gamma \) naturally leads to the computation of the ratio
instead, to which the proposition can then be applied without any further changes (note that for these new operators, the second order coefficient matrix will be negative unity, and the other coefficients are multiplied by \(\left( \nabla ^2_{{\dot{\phi }}} L (\phi _i, {\dot{\phi }}_i)\right) ^{-1}\), which yields the same equation in (A1) as before, thereby leaving (A28) invariant).
Example A.4
For the Freidlin–Wentzell Lagrangian
the corresponding Hamiltonian is given by
The derivatives of L and H are
and
where we use the notation
Hence
and
Example A.5
For the Lagrangian
appearing in quantum mechanics in imaginary time, with Hamiltonian
Jacobi’s equation becomes
or
which is the classical Gel’fand-Yaglom formula [4, 39].
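As a numerical illustration of the Gel’fand-Yaglom formula, the following minimal sketch assumes the one-dimensional Lagrangian \(L = \tfrac{1}{2} {\dot{\phi }}^2 + V(\phi )\) with harmonic potential \(V(\phi ) = \tfrac{1}{2} \omega ^2 \phi ^2\) and Dirichlet boundary conditions, for which the determinant ratio relative to the free operator is known to be \(\sinh (\omega T)/(\omega T)\). The function names, the Runge-Kutta integrator and the parameter values are illustrative choices and not taken from the text.

```python
import math

def jacobi_endpoint(curvature, T, steps=2000):
    """Integrate gamma'' = V''(t) * gamma with gamma(0) = 0, gamma'(0) = 1
    by classical RK4 and return gamma(T)."""
    h = T / steps
    g, dg, t = 0.0, 1.0, 0.0

    def rhs(t, g, dg):
        return dg, curvature(t) * g

    for _ in range(steps):
        k1g, k1d = rhs(t, g, dg)
        k2g, k2d = rhs(t + h / 2, g + h / 2 * k1g, dg + h / 2 * k1d)
        k3g, k3d = rhs(t + h / 2, g + h / 2 * k2g, dg + h / 2 * k2d)
        k4g, k4d = rhs(t + h, g + h * k3g, dg + h * k3d)
        g += h / 6 * (k1g + 2 * k2g + 2 * k3g + k4g)
        dg += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        t += h
    return g

def gelfand_yaglom_ratio(curvature, T, steps=2000):
    """Ratio Det(-d^2/dt^2 + V'') / Det(-d^2/dt^2) with Dirichlet conditions,
    obtained from the endpoint values of two initial value problems."""
    free = jacobi_endpoint(lambda t: 0.0, T, steps)  # free solution: gamma(T) = T
    return jacobi_endpoint(curvature, T, steps) / free

# Assumed example: harmonic potential, V'' = omega^2 along any path
omega, T = 1.5, 2.0
ratio = gelfand_yaglom_ratio(lambda t: omega ** 2, T)
exact = math.sinh(omega * T) / (omega * T)
print(ratio, exact)
```

The point of the sketch is that no eigenvalues are computed: a single initial value problem for the Jacobi equation yields the determinant ratio.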
It is well known from the calculus of variations that, if the Jacobi equation (A15) has no conjugate points [81] in [0, T] (“Jacobi condition”), then it is possible to construct a solution of a certain symmetric matrix Riccati differential equation [82], either forward or backward in time, out of solutions \((\gamma , \zeta ):[0,T] \rightarrow \mathbb {R}^{2n \times n}\) (if, depending on the solution, either \(\gamma (t_0)\) or \(\zeta (t_0)\) is invertible for any \(t_0 \in [0,T]\) and hence for all \(t \in [0,T]\) [83]):
- \(W := \zeta \gamma ^{-1} :[0,T] \rightarrow \mathbb {R}^{n \times n}\) satisfies the backward Riccati equation
$$\begin{aligned} {\dot{W}} = - \nabla ^2_\phi H - W \nabla _\theta \nabla _\phi H - \left( \nabla _\phi \nabla _\theta H \right) W - W \left( \nabla ^2_\theta H \right) W\,. \end{aligned}$$ (A41)
- \(Q = W^{-1} = \gamma \zeta ^{-1}:[0,T] \rightarrow \mathbb {R}^{n \times n}\) solves the forward Riccati equation
$$\begin{aligned} {\dot{Q}} = \nabla ^2_\theta H + Q \nabla _\phi \nabla _\theta H + \left( \nabla _\theta \nabla _\phi H \right) Q + Q \left( \nabla ^2_\phi H\right) Q\,. \end{aligned}$$ (A42)
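The relation between solutions of the Jacobi equation and the two Riccati equations can be checked numerically in the scalar case \(n = 1\). The sketch below assumes constant, hypothetical values for the second derivatives of H and uses the linearized canonical equations \({\dot{\gamma }} = (\nabla _\theta \nabla _\phi H) \gamma + (\nabla ^2_\theta H) \zeta \), \({\dot{\zeta }} = - (\nabla ^2_\phi H) \gamma - (\nabla _\phi \nabla _\theta H) \zeta \), from which (A41) and (A42) follow by differentiating \(\zeta \gamma ^{-1}\) and \(\gamma \zeta ^{-1}\). All names and numbers are illustrative.

```python
def rk4(f, y, t0, t1, steps):
    """Classical RK4 for y' = f(t, y) with y a list; t1 < t0 integrates backward."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

# Hypothetical constant second derivatives of H (scalar case n = 1)
H_pp = 0.5    # d^2 H / d phi^2
H_pt = -1.0   # d^2 H / (d phi d theta)
H_tt = 1.0    # d^2 H / d theta^2

def jacobi(t, y):
    """Linearized canonical equations for (gamma, zeta)."""
    g, z = y
    return [H_pt * g + H_tt * z, -H_pp * g - H_pt * z]

def riccati_W(t, y):
    """Scalar version of the backward Riccati equation (A41)."""
    W = y[0]
    return [-H_pp - 2 * H_pt * W - H_tt * W * W]

def riccati_Q(t, y):
    """Scalar version of the forward Riccati equation (A42)."""
    Q = y[0]
    return [H_tt + 2 * H_pt * Q + H_pp * Q * Q]

T = 1.0
gamma_T, zeta_T = 1.0, 0.3  # assumed final data for the Jacobi fields

# Backward in time: Jacobi fields and W from the final condition at t = T
gamma_0, zeta_0 = rk4(jacobi, [gamma_T, zeta_T], T, 0.0, 2000)
W_0 = rk4(riccati_W, [zeta_T / gamma_T], T, 0.0, 2000)[0]

# Forward in time: Q from the initial condition at t = 0
Q_T = rk4(riccati_Q, [gamma_0 / zeta_0], 0.0, T, 2000)[0]

print(W_0, zeta_0 / gamma_0)   # should agree
print(Q_T, gamma_T / zeta_T)   # should agree
```

For these parameter values \(\gamma \) and \(\zeta \) stay positive on [0, T], so both ratios are well defined; for other choices a zero of \(\gamma \) or \(\zeta \) would show up as a blow-up of the corresponding Riccati solution.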
Remark A.6
If it exists, the solution of the backward matrix Riccati equation W can naturally be connected to the positive definiteness of \(\delta ^2 S\) [79], which is why the fact that Riccati matrix differential equations appear in the functional determinant computations is not very surprising from a calculus of variations perspective: Observing that
and inserting this expression for \(\nabla ^2_\phi L\) into the second variation, we obtain, assuming that \(\nabla ^2_{{\dot{\phi }}} L\) is positive definite (“Legendre condition”),
which can be used to show that \(\delta ^2 S[\phi ][\gamma ] > 0\) for all \(\gamma \ne 0\) under appropriate boundary conditions.
Appendix B: Sharp Moment-Generating Function Estimate for Nondegenerate Instantons from WKB Analysis for a General Hamiltonian
As a reference, we state a general sharp estimate for the MGF of a final-time observable \(f:\mathbb {R}^n \rightarrow \mathbb {R}\)
for a \((\varepsilon > 0)\)-indexed family of continuous-time Markov processes \(\left( X_t^\varepsilon \right) _{t \in [0,T]}\) with state space \(\mathbb {R}^n\), deterministic initial value \(X^0 = x \in \mathbb {R}^n\) and generator \(L_\varepsilon \) which we assume to satisfy a large deviation principle as \(\varepsilon \downarrow 0\). Defining (see e.g. [84])
for test functions \(\varphi :\mathbb {R}^n \rightarrow \mathbb {R}\) as well as the LDT Hamiltonian \(H :\mathbb {R}^n \times \mathbb {R}^n \rightarrow \mathbb {R}\), \((\phi , \theta ) \mapsto H(\phi , \theta )\) via
we have the following result, obtained via WKB analysis of the Kolmogorov backward equation for \(A_f^\varepsilon (\lambda )\):
Remark B.2
As remarked in [31], it is possible to transfer the backward to the forward Riccati equation in general solely on the level of Riccati equations (if both are well-posed for the problem at hand), the general link being
with Q solving
Derivation of Proposition B.1
Analogously to [31], we define
such that \(A_f^\varepsilon (\lambda ) = u_\varepsilon (0, x)\) and \(u_\varepsilon \) solves
The WKB ansatz
where we later assume that \(Z_\varepsilon = Z + {{\mathcal {O}}}(\varepsilon )\), leads to
Expanding \(H_\varepsilon \) yields
so, at order \(\varepsilon ^{-1}\), we obtain the Hamilton-Jacobi equation
for S as expected, which can be solved by the method of characteristics, yielding the instanton equations (B6). Differentiating (B18) twice and plugging in the characteristics results in the Riccati equation (B10). For the determination of the leading order prefactor Z, we note that at order \(\varepsilon ^0\),
so evaluating Z(t, x) along the characteristic \(\phi _\lambda \) where \(\nabla _\theta H(\cdot , \nabla S) = {\dot{\phi }}_\lambda \) results in
which can then directly be integrated to get Z(0, x). \(\square \)
Example B.3
For an Itô diffusion
with the generator \(L_\varepsilon \) acting via
we have
so the Hamiltonian is of course given by
Furthermore, since
and
we indeed arrive at the MGF estimate
with prefactor
where the Riccati matrices \(W_\lambda , Q_\lambda :[0,T] \rightarrow \mathbb {R}^{n \times n}\) solve
and
Example B.4
Since previous papers [30,31,32,33] have mostly dealt with additive noise, we test the more general case of multiplicative noise that is included here in a simple toy example. Consider the one-dimensional Itô SDE
describing geometric Brownian motion. Using Itô’s lemma, this SDE can be solved explicitly to get
and hence the distribution of \(X_T^\varepsilon \) is log-normal with PDF
Choosing
as our observable, we can explicitly evaluate the MGF \(A_f^\varepsilon (\lambda )\) for \(\lambda < 1/(2T)\) by integration of the PDF, obtaining
We will now reproduce this result at leading order using the general theory stated above. For the Hamiltonian
the instanton equations become
In addition to the Hamiltonian H being conserved along the instanton, we can read off that the quantity
is also conserved. We obtain
and hence
from the final time condition, the instanton trajectories then being
The \({{\mathcal {O}}}\left( \varepsilon ^{-1} \right) \)-contribution of the instanton in the exponent becomes
as expected. The prefactor at leading order in \(\varepsilon \) is
for
and hence, for the transformed Riccati solution \({\tilde{W}}_\lambda = \phi _\lambda ^2 W_\lambda \),
This Riccati equation with constant coefficients can easily be integrated to get
where the integration constant \(C_\lambda \), determined through the final condition, is
Evaluating (B43) then reproduces \(R_\lambda \) as found in (B35).
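The structure of this check, an exact MGF compared with its leading-order form, can be illustrated with a small sketch. Since the displayed SDE and observable are not reproduced here, the sketch assumes the driftless equation \(\textrm{d}X = \sqrt{\varepsilon } X \,\textrm{d}W\) with \(X_0 = x\) and the observable \(f(x) = \log ^2 x\), a choice that is consistent with the restriction \(\lambda < 1/(2T)\) but may differ from the example in the text. The function names and the closed-form expressions below follow from a Gaussian integral under these assumptions only. The Riccati computation of the prefactor is not repeated here.

```python
import math

def mgf_quadrature(lam, eps, T, x, half_width=30.0, n=60000):
    """E[exp(lam/eps * log(X_T)^2)] for X_T = x * exp(sqrt(eps) W_T - eps T / 2),
    computed with the trapezoidal rule over the standard normal variable z."""
    m = math.log(x) - 0.5 * eps * T   # mean of log X_T
    s = math.sqrt(eps * T)            # standard deviation of log X_T
    h = 2 * half_width / n
    total = 0.0
    for i in range(n + 1):
        z = -half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(lam / eps * (m + s * z) ** 2 - 0.5 * z * z)
    return total * h / math.sqrt(2 * math.pi)

def mgf_closed_form(lam, eps, T, x):
    """Exact Gaussian integral, valid for lam < 1 / (2 T)."""
    m = math.log(x) - 0.5 * eps * T
    d = 1 - 2 * lam * T
    return math.exp(lam * m * m / (eps * d)) / math.sqrt(d)

def mgf_leading_order(lam, eps, T, x):
    """Sharp small-noise estimate: prefactor * exp(rate / eps)."""
    d = 1 - 2 * lam * T
    rate = lam * math.log(x) ** 2 / d
    prefactor = math.exp(-lam * T * math.log(x) / d) / math.sqrt(d)
    return prefactor * math.exp(rate / eps)

lam, T, x = 0.2, 1.0, 2.0
for eps in (0.1, 0.01):
    exact = mgf_closed_form(lam, eps, T, x)
    print(eps, mgf_quadrature(lam, eps, T, x), exact,
          exact / mgf_leading_order(lam, eps, T, x))
```

Under these assumptions the ratio of the exact MGF to the leading-order estimate equals \(\exp \left( \lambda \varepsilon T^2 / (4 (1 - 2 \lambda T)) \right) \), which tends to 1 as \(\varepsilon \downarrow 0\), as expected for a sharp estimate.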
Appendix C: Prefactor for Spatially Homogeneous KPZ Instantons
In this section, we want to evaluate the term
for the Gaussian fluctuations \(Y = \left( Y(x,t) \right) _{x \in [0, l], \; t \in [0,1]}\) around the spatially homogeneous KPZ instanton (179) for the PDF prefactor in (182), where we consider the fluctuations in the Cole-Hopf transformed fields. These fluctuations satisfy the linear SPDE
with initial condition \(Y(\cdot , 0) \equiv 0\). We define the Fourier transform of Y as
for \(k \in \mathbb {Z}\), such that
Then \(R_z\) becomes
in terms of the Fourier modes \(\left( {\hat{Y}}_k(t) \right) _{k \in \mathbb {Z}, \; t \in [0,1]}\) solving
with white in time and uncorrelated complex Gaussian noise
i.e. for \(k \ne 0\) the real and imaginary parts of \({\hat{\eta }}_k\) are independent real Gaussian variables with variance \((2l)^{-1}\), and \({\hat{\eta }}_{-k} = {\hat{\eta }}_{k}^*\) due to \(\eta \) being real. For \(k = 0\), \(\text {Im}\, {\hat{\eta }}_0 \equiv 0\) and \(\text {Re}\, {\hat{\eta }}_0\) has variance \(l^{-1}\). Hence (simultaneously rescaling all \(\text {Re}\, {\hat{\eta }}_k\) to unit variance)
For \(z = 0\), we trivially have \(R_z = 1\). We first consider the case \(z < 0\), where the spatially homogeneous instanton remains the global minimizer of the action functional for all z. Rescaling to a standard real Ornstein-Uhlenbeck process via
yields
with
Hence, the problem reduces to the computation of the expectation
of a standard one-dimensional Ornstein-Uhlenbeck process with \(\alpha , T > 0\). This problem can be solved using the same functional integration methods as in the main text, or e.g. by using the Feynman-Kac formula. We follow the latter strategy here. In order to cover all cases that will appear for positive z as well, where coefficients 0 and \(+1\) for the Ornstein-Uhlenbeck drift are possible, we consider
with
in the following, where \(\beta \in \{-1, 0, +1 \}\). Then we know that
where the propagator \(K_{\alpha , \beta }(y,s;x,t)\) from point x at time t to point y at time s solves
A Gaussian ansatz for \(K_{\alpha , \beta }\) leads to
with
The solutions of the Riccati equation in the cases that we need are:
Hence, the expectation is
For negative z, the cases that appear are
and we thus find
for the prefactor at negative z, which increases monotonically with increasing absolute value of z and can be seen to be finite for all \(z < 0\). For \(z>0\), a similar analysis leads to the following cases:
and the corresponding \(E(\alpha , \beta , T)\)’s need to be multiplied together for each z to get the prefactor.
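The Feynman-Kac route with a Gaussian ansatz used above can be sketched numerically. The normalization here is an assumption and need not coincide with the definition of \(E(\alpha , \beta , T)\) in the text: the sketch considers \(\textrm{d}Y = \beta Y \,\textrm{d}t + \textrm{d}B\) with \(Y(0) = 0\) and the expectation of \(\exp \left( -\tfrac{a}{2} \int _0^T Y^2 \,\textrm{d}t \right) \). Inserting the hypothetical ansatz \(u(t,y) = \exp \left( -\tfrac{1}{2} w(t) y^2 - c(t) \right) \) into the backward equation gives \({\dot{w}} = w^2 - 2 \beta w - a\) and \({\dot{c}} = -w/2\) with \(w(T) = c(T) = 0\), and the expectation is \(\exp (-c(0))\).

```python
import math

def ou_expectation(a, beta, T, steps=4000):
    """E[exp(-a/2 * int_0^T Y^2 dt)] for dY = beta * Y dt + dB, Y(0) = 0
    (assumed convention), via the scalar Riccati equation from a Gaussian ansatz."""
    hh = -T / steps          # negative step: integrate from t = T down to t = 0
    w, c = 0.0, 0.0          # final conditions w(T) = c(T) = 0

    def rhs(w):
        # (dw/dt, dc/dt); the right-hand sides do not depend on c
        return w * w - 2.0 * beta * w - a, -0.5 * w

    for _ in range(steps):
        k1w, k1c = rhs(w)
        k2w, k2c = rhs(w + 0.5 * hh * k1w)
        k3w, k3c = rhs(w + 0.5 * hh * k2w)
        k4w, k4c = rhs(w + hh * k3w)
        w += hh / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
        c += hh / 6 * (k1c + 2 * k2c + 2 * k3c + k4c)
    return math.exp(-c)

alpha, T = 1.0, 1.0
# beta = 0 (Brownian motion): known closed forms for both signs of a
print(ou_expectation(alpha ** 2, 0.0, T), 1 / math.sqrt(math.cosh(alpha * T)))
print(ou_expectation(-alpha ** 2, 0.0, T), 1 / math.sqrt(math.cos(alpha * T)))
```

For \(\beta = 0\) the two printed pairs agree with the classical results \(\cosh (\alpha T)^{-1/2}\) and \(\cos (\alpha T)^{-1/2}\). In the second case the Riccati solution blows up as \(\alpha T \uparrow \pi /2\), which is the same mechanism by which a factor of the prefactor can diverge at a critical observable value.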
In particular, we can use this result to explicitly find the critical point \(z_{\text {c}} = z_{\text {c}}(l)\) if the dynamical phase transition is second order. At this point the first factor, namely for \(k = 1\), diverges and becomes negative; i.e. at the critical observable value the spatially homogeneous instanton ceases to be a minimizer and transitions into a saddle. Setting the denominator in the third case of (C21) to zero for \(k = 1\) and \(z > 0\), we find that the critical point is determined via the equation
Focusing on \(l = \pi \) as in the main text and numerically determining the smallest nontrivial real solution to (C25) yields
We remark that for other domain sizes, it is possible that the transition is first order and hence the point where the prefactor for the homogeneous instanton diverges is a priori unrelated to the critical point, or that other modes than \(k = 1\) become unstable first.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Schorlepp, T., Grafke, T. & Grauer, R. Symmetries and Zero Modes in Sample Path Large Deviations. J Stat Phys 190, 50 (2023). https://doi.org/10.1007/s10955-022-03051-w
DOI: https://doi.org/10.1007/s10955-022-03051-w