1 Introduction

Understanding the frequency of rare events and the dynamical trajectories which generate them has become an important field of research in many physical situations, including protein folding [1], chemical reactions [2, 3], atmospheric activities [4], glassy systems [5, 6], and disordered media [7]. From the mathematical point of view, the statistical properties of rare events are characterized by large deviation functions [8,9,10,11,12,13,14,15]. In physics, a particular interest in large deviation functions arose in the context of non-equilibrium statistical physics from the discovery of the fluctuation theorem [16,17,18], where the rare event consists in observing an atypical value of a current over a long time window. Large deviation functions have also long been used to study stochastic dynamical systems in a weak noise limit [19,20,21] or extended systems when the system size becomes large [22,23,24].

One of the simplest questions one may ask about the large deviation functions is to consider an empirical observable \(Q_T\) of the form

$$\begin{aligned} Q_T= \int _0^T dt\,f({ C}_t) \end{aligned}$$
(1)

where \(f({ C}_t)\) is a function of the configuration \({ C}_t\) of a stochastic (or a chaotic) system at time t, and to try to determine the probability that this empirical observable takes an atypical value \(q\,T\). The corresponding large deviation function \(\phi (q)\) is then simply defined by [25,26,27,28,29,30,31,32,33,34]

$$\begin{aligned} \text { Prob}(Q_T=q T) \sim e^{-T \, \phi (q)} \qquad \text {for large } T. \end{aligned}$$
(2)

(Here, the precise meaning of the symbol \(\sim \) is that \(\lim _{T\rightarrow \infty }\frac{1}{T}\log \text {Prob}(Q_T=q T)=-\phi (q)\), and this will be used throughout this article.) A rather common situation is when \(\phi (q)\) vanishes at a single value \(q^*\) of q (the most likely value of q) and \(\phi (q) >0\) for \(q \ne q^*\). The main questions raised in the present paper are: what are the dominant trajectories of a stochastic process which contribute to this large deviation function, and how can one describe their effective dynamics? In particular, we want to determine the probability \(P_t({ C}|Q_T=q\, T)\) of finding the system in a configuration C at an arbitrary time t, conditioned on a certain value of \(Q_T\) for large T. Many of these questions have been studied earlier in different contexts spanning Physics, Mathematics, and Computer Science [33,34,35,36,37,38,39,40,41,42,43].

In recent years, a theory for this conditioning problem in a Markov process has been developed [44,45,46,47,48]. The analysis is based on a canonical approach, which consists in weighting all the events by an exponential of \(Q_T\) and then determining the probability

$$\begin{aligned} P_t^{(\lambda )} (C)=\frac{\int dQ \, e^{\lambda Q} P_t(C,Q)}{\sum _{C'}\int dQ \, e^{\lambda Q}P_t(C',Q)} \end{aligned}$$
(3)

where \(P_t(C,Q)\) is the joint probability that the system is in configuration C at time t and that the observable \(Q_T\) takes the value Q, given that the system starts in its steady state. This is in contrast to the previous case where \(Q_T\) in (1) was fixed (which we call the microcanonical case). As we will see (in particular, in Sect. 2 and Appendix A), results for the microcanonical case can be obtained using an equivalence of ensembles, which relates these canonical and microcanonical ensembles in the usual way in the large T limit (which plays here the same role as the thermodynamic limit in standard statistical mechanics). This analogy between the two ensembles, viewed as canonical and microcanonical, has been used earlier [32, 35, 36, 44, 46, 47, 49], and their equivalence has been established [47]. (In earlier works the canonical ensemble has been referred to as the tilted, biased, or s-ensemble [6, 35, 44, 49].) Using the equivalence of ensembles, it was rigorously shown [47] that, in the large T limit, the conditioned dynamics can be effectively described by a Markov process.

In this paper, we shall follow the canonical approach [44,45,46,47,48]. In the first half of the paper we give an alternative derivation of many results obtained earlier [44,45,46,47,48] mostly in the quasi-stationary regime (region III of Fig. 1). We build our analysis for a discrete time Markov process on a finite configuration space. Then, the continuous time Markov process and Langevin dynamics are obtained as limiting cases. Compared to the rigorous approach in [44,45,46,47,48], our derivation is hopefully easier for a general Physics audience. Moreover, it allows us to easily generalize the results to all other regions of Fig. 1, giving explicit results for the conditioned probability and the effective dynamics.

The second half of this paper is devoted to the weak noise limit of the Langevin dynamics. This limit has recently been studied [50, 51] for specific examples with periodic boundary conditions, in the quasi-stationary regime. Application of these ideas to interacting many-body systems will be presented in a forthcoming publication [52].

We will start by reviewing and extending some known [44,45,46,47,48] aspects of the conditioning problem for Markov processes and for the Langevin equation (see Sects. 2 and 3). In the large T limit, one has to distinguish five regions (see Fig. 1), for which we calculate how the measure and the dynamics are modified by the conditioning on \(Q_T\). Then, we will consider the Langevin equation in the weak noise limit, first using a Wentzel–Kramers–Brillouin (WKB) approach [53] (Sects. 4 and 5) and then a variational approach (Sect. 6) based on the search for an optimal path which minimizes an action. This will allow us in particular to obtain the equation followed by the optimal trajectory under conditioning for large T. Lastly, we will see in Sect. 7 that an effect of conditioning is to break causality, in the sense that a trajectory becomes correlated with the noise in the future.

Fig. 1

A schematic of the time evolution of a Markov process \(C_t\) when conditioned on an empirical observable \(Q_T\) measured in a large time interval [0, T]. Different regions denote different parts of the evolution: (I) \(t<0\), (II) \(t\ge 0\) but small, (III) t and \(T-t\) both large (quasi-stationary regime), (IV) \(T-t> 0\) but small, and (V) \(t\ge T\)

2 Markov process

A schematic time evolution of a Markovian stochastic system conditioned to take a certain value of \(Q_T\) is shown in Fig. 1. For large T, one has to consider five regions. The system starts in its steady state far in the past, then evolves to a quasi-stationary regime (region III in Fig. 1), and finally relaxes to the steady state of the unconditioned dynamics. It is known [44,45,46,47] how to construct an effective dynamics that describes the conditioned process in the large T limit. For a Markov chain, the effective dynamics is Markovian, with transition rates which can be expressed (for large T) in terms of the largest eigenvalue and eigenvectors of the tilted Markov matrix [44,45,46,47] in the biased ensemble. Similar connections between a conditioned ensemble and a biased ensemble appeared earlier in many contexts: rare events problems [26, 31, 35, 37, 40, 54,55,56,57], kinetically constrained models [5, 6], optimal control theory [39, 48], and also quantum systems [58]. The effective dynamics for a Markov process has been studied in depth, mostly for the quasi-stationary regime, in [47]. In this section, we give another derivation of the effective dynamics, which extends earlier results to the five regions of Fig. 1.

2.1 The tilted matrix

We focus our discussion here on a discrete-time, irreducible, aperiodic, time-homogeneous Markov process [42] on a finite set of configurations. This Markov process is specified by the probability \(M_0(C',C)\) that the system jumps from configuration C to \(C'\) in one time step. This means that the probability \(P_t(C)\) to be in configuration C at time t evolves by

$$\begin{aligned} P_{t+1}(C')=\sum _{C}M_0(C',C)P_t(C). \end{aligned}$$

We consider that at \(t\rightarrow -\infty \), the process starts in its steady state. (We will see later that the continuous time Markov process and the Langevin dynamics can be obtained as limiting cases.)

For this discrete time Markov process, we want to condition on a general empirical observable [32, 44, 47]

$$\begin{aligned} Q_T= \sum _{t=0}^{T-1} f({{{\mathcal {C}}}}_t) + \sum _{t=0}^{T-1}g({{\mathcal {C}}}_{t+1},{{\mathcal {C}}}_{t}) \end{aligned}$$
(4)

where f and g are arbitrary functions of the configurations. For example, by choosing \(f(C) = \delta _{C,C_a} \) and \(g(C',C)=0\), the observable \(Q_T\) represents the total time spent in a particular configuration \(C_a\). Another choice \(f(C)=0\) and \(g(C',C)=\delta _{C',C_b}\delta _{C,C_a}\) would count the total number of jumps from configuration \(C_a\) to configuration \(C_b\). Large deviations of such empirical observables and their conditioning have been studied in the recent past [6, 32,33,34, 44,45,46,47,48, 59].

Our interest is to describe the dynamics conditioned on a certain value of \(Q_T\) in the large T limit. In particular, we want to know the conditional probability \({\mathcal {P}}_t(C\vert Q_T)\) for the system to be in a configuration C at an arbitrary time t when conditioned on the observable \(Q_T\) defined by (4).

Let us first analyze the special case \(t=T\). If we define the joint probability \(P_T(C,Q\vert C_0)\) for the system to be in configuration C at time T and for the observable \(Q_T\) defined by (4) to take the value Q, given its initial configuration \(C_0\) at time 0, then it satisfies the recursion relation

$$\begin{aligned} P_T(C,Q \vert C_0)=\sum _{C'}M_0(C,C')P_{T-1}(C',Q-f(C')-g(C,C') \vert C_0) \end{aligned}$$
(5)

Then, it is easy to see that the generating function defined by

$$\begin{aligned} G_{T}^{(\lambda )}(C \vert C_0)=\int dQ \;e^{\lambda Q}P_{T}(C,Q \vert C_0) \end{aligned}$$
(6)

satisfies

$$\begin{aligned} G_{T}^{(\lambda )}(C \vert C_0)=\sum _{C'}M_{\lambda }(C,C')G_{T-1}^{(\lambda )}(C'\vert C_0) \end{aligned}$$
(7)

where

$$\begin{aligned} M_\lambda (C,C')= M_0(C,C') e^{\lambda \left[ f(C') + g(C,C')\right] } \end{aligned}$$
(8)

is the tilted matrix. Therefore, \(G_{T}^{(\lambda )}(C \vert C_0)=M_{\lambda }^T(C,C_0)\) is the \((C,C_0)\)th element of the matrix \((M_{\lambda })^T\). For large T (and for real \(\lambda \)), the matrix elements of \((M_{\lambda })^T\) are dominated by the largest eigenvalue \(e^{\mu (\lambda )}\) of \(M_{\lambda }\), resulting in

$$\begin{aligned} G_{T}^{(\lambda )}(C \vert C_0)\simeq e^{T \mu (\lambda )}R_{\lambda }(C)L_{\lambda }(C_0) \end{aligned}$$
(9)

where \(R_{\lambda }(C)\) and \(L_{\lambda }(C)\) are the associated right and left eigenvectors, respectively. In (9) the symbol \(\simeq \) is used, as physicists usually do, to mean that the ratio of the two sides of the equation becomes 1 in the limit \(T\rightarrow \infty \); in fact, as we are considering an irreducible, aperiodic Markov process on a finite configuration space, the Perron-Frobenius theorem [60] ensures that there is a non-vanishing spectral gap and corrections to (9) are exponentially small. For the prefactor in (9) to be correct the eigenvectors must be normalized with \(\sum _C R_{\lambda }(C)L_{\lambda }(C)=1\).

For earlier uses of the tilted matrix see [6, 12, 26, 44, 54, 55, 61] and references therein. For more recent work see [47] where (9) also appears.
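The construction above is straightforward to test numerically. The following minimal sketch (in Python, with a 3-state chain and a function f chosen arbitrarily for illustration) builds the tilted matrix (8), extracts its largest eigenvalue \(e^{\mu (\lambda )}\) together with the eigenvectors \(R_\lambda \) and \(L_\lambda \) normalized by \(\sum _C R_\lambda (C)L_\lambda (C)=1\), and checks the asymptotics (9).

```python
import numpy as np

# Hypothetical 3-state chain, columns sum to 1 with the convention M0(C', C)
M0 = np.array([[0.6, 0.3, 0.1],
               [0.3, 0.5, 0.2],
               [0.1, 0.2, 0.7]])
f = np.array([1.0, 0.0, -0.5])     # f(C): arbitrary illustrative choice
g = np.zeros((3, 3))               # g(C', C): taken to be zero here for simplicity

def tilted_matrix(lam):
    """M_lambda(C', C) = M0(C', C) exp(lam [f(C) + g(C', C)]), cf. Eq. (8)."""
    return M0 * np.exp(lam * (f[None, :] + g))

def principal_triplet(lam):
    """Largest eigenvalue exp(mu) and eigenvectors R, L with sum_C R(C) L(C) = 1."""
    Ml = tilted_matrix(lam)
    wr, vr = np.linalg.eig(Ml)
    wl, vl = np.linalg.eig(Ml.T)
    R = np.abs(vr[:, np.argmax(wr.real)].real)   # Perron vectors can be taken positive
    L = np.abs(vl[:, np.argmax(wl.real)].real)
    L = L / (R @ L)                              # normalization sum_C R(C) L(C) = 1
    return np.log(np.max(wr.real)), R, L

lam, T = 0.3, 100
mu, R, L = principal_triplet(lam)
MT = np.linalg.matrix_power(tilted_matrix(lam), T)
print(np.allclose(MT, np.exp(T * mu) * np.outer(R, L)))   # Eq. (9): corrections are exponentially small
```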

Remarks

  1.

    It follows from (6) and (9) that the cumulants of \(Q_T\), for large T, can be obtained [6, 26, 47, 54] from the derivatives of \(\mu (\lambda )\) at \(\lambda =0\), and that \(\lim _{T\rightarrow \infty }\frac{1}{T}\log \langle e^{\lambda Q_T}\rangle =\mu (\lambda )\).

  2.

    In the case \(\lambda =0\), the largest eigenvalue is 1, with \(L_0(C)=1\), and \(R_0(C)\) is the steady state probability distribution of the Markov process \(M_0\).

2.2 Ensemble equivalence

By an inverse Laplace transformation (6) becomes

$$\begin{aligned} P_T(C,Q\vert C_0)=\frac{1}{2\pi \, i }\int _{- i \infty }^{ i \infty }dz\, e^{-z Q}\, G_{T}^{(z)}(C \vert C_0), \end{aligned}$$
(10)

where the integral is along the imaginary axis on the complex-z plane.

For an irreducible, aperiodic, time-homogeneous Markov chain on a finite configuration space, \(\mu (\lambda )\) is differentiable and convex. When \(\mu (\lambda )\) is strictly convex, using the asymptotics (9) for large T and the method of steepest descent we get

$$\begin{aligned} P_T(C,Q=qT\vert C_0)\simeq e^{-T \phi (q)}\sqrt{\frac{1}{2\pi T \mu ''(\lambda )}}R_{\lambda }(C)L_{\lambda }(C_0) \end{aligned}$$
(11a)

where the large deviation function \(\phi (q)\) and the eigenvalue \(e^{\mu (\lambda )}\) are related by a Legendre transformation (the Gärtner–Ellis theorem [14, 62])

$$\begin{aligned} \phi (q)=\lambda q-\mu (\lambda )\qquad \text {with}\quad \mu '(\lambda )=q. \end{aligned}$$
(11b)

For convex \(\mu (\lambda )\), where the Legendre transformation is self-dual, (11b) gives

$$\begin{aligned} \mu (\lambda )=\lambda q-\phi (q)\qquad \text{ with } \qquad \lambda =\phi '(q). \end{aligned}$$
(12)

Moreover, as \(\mu ''(\lambda )=1/\phi ''(q)\), using (11a) we get

$$\begin{aligned} P_T(C,Q=qT\vert C_0)\simeq e^{-T \phi (q)}\sqrt{\frac{\phi ''(q)}{2\pi T}}R_{\phi '(q)}(C)L_{\phi '(q)}(C_0) \end{aligned}$$
(13)

One simple way to understand the prefactors in (13) is to use it in (6) to recover (9) by a saddle point calculation. As for (9), the symbol \(\simeq \) in (13) means that the ratio of the two sides goes to 1 in the limit \(T\rightarrow \infty \). However, unlike (9), the higher order corrections to (13) (which could be determined using again (6) and (9)) would be algebraically small in T rather than exponentially small.

We see from (13) that, for large T, the conditional distribution of C at the final time is given by

$$\begin{aligned} {\mathcal {P}}_{T}(C\vert Q=qT)= \frac{P_T(C,Q=qT\vert C_0)}{\sum _{C'}P_T(C',Q=qT\vert C_0)}\simeq \frac{R_{\phi '(q)}(C)}{\sum _{C'} R_{\phi '(q)}(C')} \end{aligned}$$
(14)

This shows that the initial condition \(C_0\) is forgotten at large T. On the other hand, in the canonical ensemble, using (9) one has for the probability at the final time

$$\begin{aligned} P_{T}^{(\lambda )}(C)=\frac{G_T^{(\lambda )}(C\vert C_0)}{\sum _{C'}G_T^{(\lambda )}(C'\vert C_0)}\simeq \frac{R_{\lambda }(C)}{\sum _{C'} R_{\lambda }(C')} \end{aligned}$$
(15)

We see that the two expressions (14) and (15) coincide by choosing \(\lambda =\phi '(q)\). This shows that, for large T, the two ensembles are equivalent: fixing the value of \(Q_T=q\, T\) or weighting the events by a factor \(e^{\lambda Q_T}\) with \(\lambda =\phi '(q)\) lead to asymptotically the same distribution of the final configuration C. This equivalence of ensembles has been established earlier in [46, 47].
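The Legendre structure (11b)–(12) can be visualized numerically by computing \(\mu (\lambda )\) for a small chain and transforming on a grid. The sketch below (Python; the 2-state chain and the choice \(f=(1,0)\), for which \(Q_T\) is the time spent in the first state, are hypothetical choices made only for illustration) recovers \(\phi (q)\) and its vanishing at the typical value \(q^*=\mu '(0)\).

```python
import numpy as np

# Hypothetical 2-state chain; f = (1, 0), g = 0, so Q_T is the time spent in state 0
M0 = np.array([[0.7, 0.4],
               [0.3, 0.6]])
f = np.array([1.0, 0.0])

def mu(lam):
    """mu(lambda) = log of the largest eigenvalue of the tilted matrix (8)."""
    return np.log(np.max(np.linalg.eigvals(M0 * np.exp(lam * f[None, :])).real))

# Legendre transformation (11b): phi(q) = lambda q - mu(lambda) with q = mu'(lambda)
lams = np.linspace(-4, 4, 2001)
mus = np.array([mu(l) for l in lams])
qs = np.gradient(mus, lams)            # q = mu'(lambda), numerical derivative
phis = lams * qs - mus                 # phi(q), parametrized by lambda

i0 = np.argmin(np.abs(lams))           # lambda = 0
print("typical value q* = mu'(0) =", qs[i0])
print("min over q of phi(q) =", phis.min())   # vanishes (up to grid error) at q = q*
```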

Remark

The equivalence might not hold for systems with infinitely many configurations, where the spectral gap of the tilted matrix can vanish and \(\mu (\lambda )\) can become non-differentiable [44, 63,64,65]. See [47, 49, 66] for conditions for the equivalence of ensembles to hold.

2.3 The measure conditioned on \(Q_T\) for large T

As shown in Appendix A, the equivalence of ensembles holds not only at time \(t=T\), but at any time t, as long as T is large. The same was established earlier in [45,46,47]. More precisely, generalizing (15), if we define the canonical measure

$$\begin{aligned} P_t^{(\lambda )} (C)=\frac{\int dQ \, e^{\lambda Q} P_t(C,Q)}{\sum _{C'}\int dQ \, e^{\lambda Q}P_t(C',Q)} \end{aligned}$$
(16)

for any time t, then for large T,

$$\begin{aligned} {\mathcal {P}}_t(C \vert Q_T=qT)\simeq P_t^{(\lambda )}(C)\qquad \text{ with }\qquad \lambda =\phi '(q) \end{aligned}$$
(17)

where \(P_t(C,Q)\) is the joint probability that the system is in configuration C at time t and that the observable \(Q_T\) takes the value Q, given that the system starts in its steady state; \({\mathcal {P}}_t(C\vert Q)\) is the corresponding conditional probability. As for (13), the symbol \(\simeq \) in (17) means that the ratio of the two sides goes to 1 in the limit \(T\rightarrow \infty \).

This canonical measure (16), for large T, takes different expressions in the five regions indicated in Fig. 1, as listed below and illustrated numerically after (19). (A derivation is presented in Appendix A for region II and can be easily extended to the other regions. Many of these results can be inferred from the analysis in [46, 47].)

  • Region I. \(t<0\)

    $$\begin{aligned} P_t^{(\lambda )}(C)=\frac{\sum _{C'}L_{\lambda }(C')M_0^{-t}(C',C)R_{0}(C)}{\sum _{C'}L_{\lambda }(C')R_{0}(C')} \end{aligned}$$
    (18a)
  • Region II. \(0\le t\ll T\). One recovers an earlier [46, 47] result

    $$\begin{aligned} P_t^{(\lambda )}(C)=\frac{\sum _{C'}L_{\lambda }(C)M_{\lambda }^{t}(C,C') R_{0}(C')}{e^{t\mu (\lambda )}\sum _{C'}L_{\lambda }(C')R_{0}(C')} \end{aligned}$$
    (18b)
  • Region III. \(1\ll t\) and \(T-t\gg 1\). One recovers an earlier [46, 47] result

    $$\begin{aligned} P_t^{(\lambda )}(C)=R_{\lambda }(C)L_{\lambda }(C) \end{aligned}$$
    (18c)
  • Region IV. \(1\ll t< T\), i.e. \(T-t=\mathcal {O}(1)\)

    $$\begin{aligned} P_t^{(\lambda )}(C)=\frac{\sum _{C'}M_{\lambda }^{T-t}(C',C)R_{\lambda }(C)}{e^{(T-t)\mu (\lambda )}\sum _{C'}R_{\lambda }(C')} \end{aligned}$$
    (18d)
  • Region V. \(T\le t\)

    $$\begin{aligned} P_t^{(\lambda )}(C)=\frac{\sum _{C'}M_{0}^{t-T}(C,C')R_{\lambda }(C')}{\sum _{C'}R_{\lambda }(C')} \end{aligned}$$
    (18e)

To be consistent with the notation of Sect. 2.1 we denote by \(R_0(C)\) the steady state measure of the Markov process \(M_0\). Therefore (15) is the special case \(t=T\) of (18d) and (18e). Another special case is

$$\begin{aligned} P_{t=0}^{(\lambda )}(C)=\frac{L_{\lambda }(C)R_0(C)}{\sum _{C'}L_{\lambda }(C')R_0(C')} \end{aligned}$$
(19)
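The expressions (18a)–(18e) can be compared with the exact canonical measure (16) computed directly from powers of the tilted matrix. The sketch below (Python; a hypothetical 2-state chain, with \(Q_T\) the time spent in the first state) performs this comparison in regions II and III.

```python
import numpy as np

# Hypothetical 2-state chain; Q_T = time spent in state 0, i.e. f = (1, 0), g = 0 in (4)
M0 = np.array([[0.7, 0.4],
               [0.3, 0.6]])
f = np.array([1.0, 0.0])
lam, T = 0.5, 60
Ml = M0 * np.exp(lam * f[None, :])                    # tilted matrix (8)

def top(M):
    w, v = np.linalg.eig(M)
    k = np.argmax(w.real)
    return w[k].real, np.abs(v[:, k].real)            # Perron eigenvalue and positive eigenvector

_, R0 = top(M0); R0 /= R0.sum()                        # steady state of M0
rho, Rl = top(Ml)                                      # rho = exp(mu(lambda))
_, Ll = top(Ml.T); Ll /= Rl @ Ll                       # normalization sum_C R L = 1
mu = np.log(rho)

def P_exact(t):
    """Exact canonical measure (16) at 0 <= t <= T, starting from the steady state."""
    future = np.ones(2) @ np.linalg.matrix_power(Ml, T - t)   # summed over the final configuration
    past = np.linalg.matrix_power(Ml, t) @ R0
    p = future * past
    return p / p.sum()

print(P_exact(T // 2), Rl * Ll)                        # region III, Eq. (18c)
t = 3
p_II = Ll * (np.linalg.matrix_power(Ml, t) @ R0) / (np.exp(t * mu) * (Ll @ R0))   # Eq. (18b)
print(P_exact(t), p_II)
```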

2.4 Time evolution of the tilted process

Again by a straightforward generalization of the reasoning (see Appendix A), one can show that the equivalence of ensembles holds for the dynamics as well [32, 44,45,46,47]. In fact, the tilted dynamics in the canonical ensemble, where events are weighted by \(e^{\lambda Q_T}\), is itself a Markov process [32, 44,45,46,47] even for small T (see Sect. 2.5). For this process, the probability of jump \(W_t^{(\lambda )}(C',C)\) from configuration C at t to \(C'\) at \(t+1\) depends, in general, on time t. For example, for \(t<0\),

$$\begin{aligned} W_t^{(\lambda )}(C',C)= {\sum _{C''',C''} M_{\lambda }^{T}(C''',C'') \ M_0^{-t-1}(C'',C') \ M_0(C',C)R_0(C) \over \sum _{C''',C''} M_{\lambda }^{T}(C''',C'') \ M_0^{-t}(C'',C)R_0(C) } \end{aligned}$$

while for \(0\le t < T\),

$$\begin{aligned} W_t^{(\lambda )}(C',C)=\frac{\sum _{C'',C_0}M_{\lambda }^{T-t-1}(C'',C') M_{\lambda }(C',C)M_{\lambda }^t(C,C_0)R_0(C_0)}{\sum _{C'',C_0}M_{\lambda }^{T-t} (C'',C)M_{\lambda }^t(C,C_0)R_0(C_0)} \end{aligned}$$

For \(t\ge T\), the transition probability is the same as in the unconditioned dynamics, \(W_t^{(\lambda )}(C',C)=M_0(C',C)\).

For large T, the dominant contribution comes from the largest eigenvalue of \(M_{\lambda }\), and one gets in the five regions of Fig. 1:

  • Region I.

    $$\begin{aligned} W_t^{(\lambda )}(C',C)= {\sum _{C''} L_\lambda (C'') \ M_0^{-t-1}(C'',C') \ M_0(C',C) \over \sum _{C''} L_\lambda (C'') \ M_0^{-t}(C'',C) } \end{aligned}$$
    (20a)
  • Regions II and III.

    $$\begin{aligned} W_t^{(\lambda )}(C',C)= { L_\lambda (C') \ M_\lambda (C',C) \over e^{\mu (\lambda )} L_\lambda (C) } \end{aligned}$$
    (20b)

    This result for region III has been obtained earlier in [46, 47].

  • Region IV.

    $$\begin{aligned} W_t^{(\lambda )}(C',C)= {\sum _{C''} \ M_\lambda ^{T-t-1}(C'',C') \ M_\lambda (C',C) \over \sum _{C''} \ M_\lambda ^{T-t}(C'',C) } \end{aligned}$$
    (20c)
  • Region V.

    $$\begin{aligned} W_t^{(\lambda )}(C',C)=M_0(C',C) \end{aligned}$$
    (20d)

Using these expressions for \(W_t^{(\lambda )}\) and the corresponding canonical measure in (18a)–(18d), one can check that

$$\begin{aligned} P_{t+1}^{(\lambda )}(C')=\sum _{C}W_t^{(\lambda )}(C',C)P_t^{(\lambda )}(C) \end{aligned}$$
(21)
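The effective dynamics (20b) has the structure of a Doob transform of \(M_\lambda \), and its two defining properties are easy to check numerically: each column of W sums to 1, and the quasi-stationary measure (18c) is stationary under it. A minimal sketch (Python, reusing the same kind of hypothetical 2-state chain as above):

```python
import numpy as np

M0 = np.array([[0.7, 0.4],
               [0.3, 0.6]])
f = np.array([1.0, 0.0])
lam = 0.5
Ml = M0 * np.exp(lam * f[None, :])                    # tilted matrix (8)

w, v = np.linalg.eig(Ml)
rho, R = np.max(w.real), np.abs(v[:, np.argmax(w.real)].real)
wl, vl = np.linalg.eig(Ml.T)
L = np.abs(vl[:, np.argmax(wl.real)].real); L /= R @ L

# W(C', C) = L(C') M_lambda(C', C) / (exp(mu) L(C)), Eq. (20b)
W = (L[:, None] * Ml) / (rho * L[None, :])

print(W.sum(axis=0))                        # each column sums to 1: W is a Markov matrix
print(np.allclose(W @ (R * L), R * L))      # the quasi-stationary measure (18c) is stationary
```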

Remarks

  1.

    We have seen that by deforming the matrix \(M_0\) one can condition on two kinds of observables: \(f(C_t)\) and \(g(C_{t+1},C_t)\) [see (4)]. It is not possible to condition on other time correlations, such as \(Q_T= \sum _{t=1}^T g( {{\mathcal {C}}}_{t+\tau },{{\mathcal {C}}}_t) \) with \(\tau >1\), by simply deforming the matrix \(M_0\). One could still define a tilted Markov process, but this would be on a much larger set of configurations, since one would need to keep information about \(\tau \) consecutive configurations.

  2.

    By a similar analysis one can describe the time-reversed process [42] conditioned on \(Q_T\). We define \({\mathbb {W}}_t^{(\lambda )}(C,C')\) as the transition probability to jump from \(C'\) at \(t+1\) to C at t in the time-reversed process. In all five time regions, it can be expressed in terms of the corresponding \(W_t^{(\lambda )}\) and \(P_{t}^{(\lambda )}\) of the forward process.

    $$\begin{aligned} {\mathbb {W}}_t^{(\lambda )}(C,C')=W_t^{(\lambda )}(C',C) \frac{P_t^{(\lambda )}(C)}{P_{t+1}^{(\lambda )} (C')} \end{aligned}$$
    (22)

    For example, in the quasi-stationary regime (\(1\ll t\) and \(T-t\gg 1\)),

    $$\begin{aligned} {\mathbb {W}}_t^{(\lambda )}(C,C')={ M_\lambda (C',C) R_\lambda (C) \over e^{\mu (\lambda )} R_\lambda (C') }. \end{aligned}$$
    (23)

    The time-reversed process is useful in describing how a fluctuation is created. For example, the fluctuation leading to an atypical configuration can be described by relaxation from the same configuration in the time-reversed process [52].

2.5 A generalization

The above expressions (18a)–(18e) and (20a)–(20d) can be extended to a more general observable of the form

$$\begin{aligned} Q=\sum _t f_t(C_t)+\sum _t g_t(C_{t+1},C_t) \end{aligned}$$
(24)

where \(f_t(C)\) and \(g_t(C',C)\) are arbitrary functions of configurations in a discrete time irreducible Markov process \(M_0(C',C)\) on a finite configuration space. The observable (4) is just a particular case of (24) with \(f_t(C)=f(C)\) and \(g_t(C',C)=g(C',C)\) for \(t\in [0,T]\) with large T, and both being zero outside this time window.

We consider that the system started at \(t\rightarrow -\infty \) in its steady state and evolves till \(t\rightarrow \infty \), but this can be changed without affecting much of our analysis. One can even generalize to the case when the Markov process \(M_0(C',C)\) depends on time.

Using a reasoning similar to that in Appendix A, we can show that in the canonical ensemble, where the events are weighted by \(e^{\lambda Q}\), the probability \(P_t^{(\lambda )}(C)\) is given by

$$\begin{aligned} P_t^{(\lambda )} (C)=\frac{ Z^{(\lambda )}_t(C)\, {\mathbb {Z}}^{(\lambda )}_t(C)}{\sum _{C'} Z^{(\lambda )}_t(C')\, {\mathbb {Z}}^{(\lambda )}_t(C')} \end{aligned}$$
(25a)

where \(Z^{(\lambda )}_t(C)\) and \({\mathbb {Z}}^{(\lambda )}_t(C)\) follow the recursion relations

$$\begin{aligned} Z^{(\lambda )}_t(C)=&\sum _{C'}e^{\lambda f_{t-1}(C')+\lambda g_{t-1}(C,C')}M_0(C,C')Z^{(\lambda )}_{t-1}(C') \end{aligned}$$
(25b)
$$\begin{aligned} {\mathbb {Z}}^{(\lambda )}_t(C)=&\sum _{C'}e^{\lambda f_{t}(C)+\lambda g_{t}(C',C)}M_0(C',C){\mathbb {Z}}^{(\lambda )}_{t+1}(C') \end{aligned}$$
(25c)

We can also show that the tilted dynamics remains Markovian, and \(P_t^{(\lambda )} (C)\) follows (21) with the transition probability

$$\begin{aligned} W_t^{(\lambda )}(C',C)=&\frac{{\mathbb {Z}}_{t+1}^{(\lambda )}(C')M_0(C',C)e^{\lambda f_{t}(C)+\lambda g_t(C',C)}Z^{(\lambda )}_t(C)}{\sum _{C''}{\mathbb {Z}}_{t+1}^{(\lambda )} (C'')M_0(C'',C)e^{\lambda f_{t}(C)+\lambda g_t(C'',C)}Z^{(\lambda )}_t(C)}\\=&\frac{{\mathbb {Z}}_{t+1}^{(\lambda )}(C')}{{\mathbb {Z}}^{(\lambda )}_t(C)}e^{\lambda f_{t}(C)+\lambda g_t(C',C)}M_0(C',C) \end{aligned}$$
(26)

One can verify using (25c) that \(\sum _{C'}W_t^{(\lambda )}(C',C)=1\).

The expressions (18a)–(18e) and (20a)–(20d) for \(Q=Q_T\) in (4) can be easily recovered from (25a) and (26) by using the corresponding \(f_t(C)\) and \(g_t(C',C)\) and taking the large T limit.
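The recursions (25b)–(25c) and the transition probability (26) translate directly into a short numerical procedure. The sketch below (Python; the chain and the time-dependent observables \(f_t\), \(g_t\) are random, chosen only for illustration) verifies that \(\sum _{C'}W_t^{(\lambda )}(C',C)=1\) and that the measure (25a) indeed evolves according to (21).

```python
import numpy as np
rng = np.random.default_rng(0)

n, N, lam = 3, 8, 0.4
A = rng.random((n, n)); M0 = A / A.sum(axis=0)        # random column-stochastic M0(C', C)
ft = rng.normal(size=(N, n))                          # f_t(C),     t = 0 .. N-1
gt = rng.normal(size=(N, n, n))                       # g_t(C', C), t = 0 .. N-1
w, v = np.linalg.eig(M0)
R0 = np.abs(v[:, np.argmax(w.real)].real); R0 /= R0.sum()   # steady state of M0

# forward recursion (25b), with Z_0 = R0 since the observables vanish for t < 0
Z = [R0]
for t in range(1, N + 1):
    Z.append((M0 * np.exp(lam * (ft[t-1][None, :] + gt[t-1]))) @ Z[t-1])
# backward recursion (25c), with the backward object equal to 1 at t = N
Zb = [None] * (N + 1); Zb[N] = np.ones(n)
for t in range(N - 1, -1, -1):
    Zb[t] = np.exp(lam * ft[t]) * ((M0 * np.exp(lam * gt[t])).T @ Zb[t+1])

P = [Z[t] * Zb[t] / (Z[t] @ Zb[t]) for t in range(N + 1)]   # canonical measure (25a)

t = 3
W = (Zb[t+1][:, None] / Zb[t][None, :]) * np.exp(lam * (ft[t][None, :] + gt[t])) * M0   # Eq. (26)
print(W.sum(axis=0))                        # each column sums to 1
print(np.allclose(W @ P[t], P[t+1]))        # the evolution (21) holds
```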

2.6 Continuous time Markov process

The case of a continuous time Markov process can be obtained [60] by choosing a Markov matrix \(M_0\) in the discrete time case of the form

$$\begin{aligned} M_0(C',C) = \left( 1 - \sum _{C^{''}}{{\mathcal {M}}}_0(C^{''},C) dt\right) \delta _{C',C} + {{\mathcal {M}}}_0(C',C) \, dt+\cdots \end{aligned}$$
(27)

and subsequently taking the limit \(dt \rightarrow 0\) in the corresponding Master equation. Here \({{\mathcal {M}}}_0(C',C)\) is the jump rate from configuration C to \(C'\). Following this construction, it is straightforward to extend the results for the conditioned process from the discrete-time case to the continuous-time case. The details are given in Appendix B.

3 The Langevin dynamics

We now extend the above discussion to a Langevin process on an infinite line defined by the stochastic differential equation

$$\begin{aligned} {\dot{X}}_t=F(X_t)+\eta _t \end{aligned}$$
(28)

where F(x) is an external force and \(\eta _t\) is a Gaussian white noise of mean zero and covariance \(\langle \eta _t\eta _{t'}\rangle =\epsilon \;\delta (t-t')\) with \(\epsilon \) being the noise strength. It is well known [60] that the probability \(P_t(x)\) of the process \(X_t\) to be in x at time t follows a Fokker–Planck equation

$$\begin{aligned} \frac{d}{dt} P_t(x)={\mathcal {L}}_0\cdot P_t(x):=-\frac{d}{dx}\left[ F(x)P_t(x)\right] +\frac{\epsilon }{2}\frac{d^2}{dx^2}P_t(x) \end{aligned}$$
(29)

3.1 The tilted Fokker–Planck operator

Our interest is the dynamics conditioned on an empirical observable, considered already in [32, 44,45,46,47],

$$\begin{aligned} Q_T=\int _0^T dt\, f(X_t)+\int _0^T dX_t \; h(X_t) \end{aligned}$$
(30)

where f and h are functions of \(X_t\). In writing the second integral we mean a special class of observables whose discrete analogue is

$$\begin{aligned} \int _0^T dX_t \; h(X_t)\equiv \sum _{t}(X_{t+dt}-X_t)\left[ \alpha \, h(X_{t+dt})+(1-\alpha )\, h(X_t)\right] \end{aligned}$$
(31)

with \(\alpha \in [0,1]\). The choice \(\alpha =0\) corresponds to the Itô integral and \(\alpha =\frac{1}{2}\) corresponds to the Stratonovich integral in stochastic calculus [67]. One may view (30) as a limiting case of (4).

A large number of relevant empirical observables in statistical physics are of the form (30): for example, the integrated current, the work, the entropy production, and the empirical density [18, 32,33,34, 68, 69].
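The \(\alpha \)-dependence of the discretization (31) can be made concrete with a short simulation. The sketch below (Python; \(F(x)=-x\), \(h(x)=x\) and all parameter values are arbitrary illustrative choices) compares the Itô (\(\alpha =0\)) and Stratonovich (\(\alpha =\frac{1}{2}\)) sums along the same noise realization; pathwise they differ by approximately \(\frac{\epsilon }{2}\int _0^T h'(X_t)\,dt\), here \(\epsilon T/2\).

```python
import numpy as np
rng = np.random.default_rng(1)

# Euler discretization of the Langevin equation (28) with F(x) = -x and noise strength eps
eps, dt, T = 0.5, 1e-4, 10.0
x, ito, strat = 0.0, 0.0, 0.0
for _ in range(int(T / dt)):
    dx = -x * dt + np.sqrt(eps * dt) * rng.normal()
    ito += dx * x                            # alpha = 0:   h(X_t) with h(x) = x
    strat += dx * 0.5 * (x + (x + dx))       # alpha = 1/2: average of h(X_t) and h(X_{t+dt})
    x += dx

print(strat - ito, eps * T / 2)              # the two conventions differ by ~ eps*T/2 for h(x) = x
```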

The Langevin dynamics in (28) can be viewed as a continuous space and time limit of a jump process on a one-dimensional chain (see Appendix C). This way, the effective dynamics conditioned on \(Q_T\) in (30) can be obtained from our results in Sect. 2 by suitably taking the continuous limit. For example, a continuous limit of (7) gives (see Appendix C)

$$\begin{aligned} \frac{d}{dT} G_T^{(\lambda )}(x\vert y)={\mathcal {L}}_\lambda \cdot G_T^{(\lambda )}(x \vert y) \end{aligned}$$
(32)

where the tilted Fokker–Planck operator

$$\begin{aligned} {\mathcal {L}}_\lambda :=\lambda f(x)-\left( \frac{d}{dx}-\lambda h(x)\right) F(x)+\frac{\epsilon }{2}\left( \frac{d}{dx}-\lambda h(x)\right) ^2+\epsilon \left( \alpha -\frac{1}{2}\right) \lambda h'(x) \end{aligned}$$
(33)

For an earlier derivation of (33) when \(f(x)=0\) see [68, 69], and for the general case see [46, 47].

If there is a non-vanishing spectral gap for the largest eigenvalue \(\mu (\lambda )\) of \({\mathcal {L}}_\lambda \), then for large T, one gets, analogous to (9),

$$\begin{aligned} G_{T}^{(\lambda )}(x \vert y)\simeq e^{T\mu (\lambda )}r_{\lambda }(x)\ell _{\lambda }(y) \end{aligned}$$
(34)

where \(r_\lambda (x)\) and \(\ell _\lambda (x)\) are the corresponding right and left eigenvectors defined by

$$\begin{aligned} {\mathcal {L}}_\lambda \cdot r_{\lambda }(x)=\mu (\lambda )r_{\lambda }(x) \quad \text {and}\quad {\mathcal {L}}^{\dagger }_\lambda \cdot \ell _{\lambda }(x)=\mu (\lambda )\ell _{\lambda }(x) \end{aligned}$$
(35)

where \({\mathcal {L}}^{\dagger }_\lambda \) is the operator conjugate to \({\mathcal {L}}_\lambda \).

$$\begin{aligned} {\mathcal {L}}^{\dagger }_\lambda := \lambda f(x)+F(x)\left( \frac{d}{dx}+\lambda h(x)\right) +\frac{\epsilon }{2}\left( \frac{d}{dx}+\lambda h(x)\right) ^2+\epsilon \left( \alpha -\frac{1}{2}\right) \lambda h'(x) \end{aligned}$$
(36)

In (34) the symbol \(\simeq \) means that the sub-leading terms are exponentially small in T. Analogous to (9), for the expression (34) the eigenfunctions should satisfy \(\int dx\,\ell _\lambda (x)r_\lambda (x)=1\), as discussed in [32, 47].
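The spectral problem (35) can also be attacked numerically by discretizing \({\mathcal {L}}_\lambda \) on a grid. The sketch below (Python, a simple finite-difference scheme with \(h=0\); grid size and parameter values are arbitrary) uses the Ornstein–Uhlenbeck choice \(F(x)=-\gamma x\), \(f(x)=x\) of Sect. 3.3, for which the exact largest eigenvalue quoted in (40) is \(\mu (\lambda )=\epsilon \lambda ^2/(2\gamma ^2)\).

```python
import numpy as np

# Finite-difference sketch of the tilted operator (33) with h = 0:
# L_lambda p = lambda f(x) p - d/dx[F(x) p] + (eps/2) p''
gamma, eps, lam = 1.0, 0.5, 1.0
x = np.linspace(-8.0, 8.0, 801)
dx = x[1] - x[0]
F, f = -gamma * x, x

A = np.zeros((len(x), len(x)))
for i in range(1, len(x) - 1):                       # interior points (Dirichlet boundaries)
    A[i, i]   += lam * f[i] - eps / dx**2
    A[i, i+1] += -F[i+1] / (2 * dx) + eps / (2 * dx**2)
    A[i, i-1] += +F[i-1] / (2 * dx) + eps / (2 * dx**2)

mu_num = np.max(np.linalg.eigvals(A).real)
print(mu_num, eps * lam**2 / (2 * gamma**2))         # numerical vs exact mu(lambda), cf. (40)
```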

Remark

Unlike (9) in the discrete Markov process, the existence of a spectral gap for (34) is not assured (see the discussion in [47, 59]). On a one-dimensional line, where F(x) and h(x) are gradients, (35) can be mapped [32, 59, 69, 70] to a Schrödinger equation with potential

$$\begin{aligned} V(x)=\frac{F(x)^2}{2\epsilon }+\frac{F'(x)}{2}-\lambda f(x)-\epsilon \left( \alpha -\frac{1}{2}\right) \lambda h'(x). \end{aligned}$$
(37)

In this case, the question of a spectral gap for \({\mathcal {L}}_\lambda \) maps to the existence of a bound state of the Schrödinger equation with a potential V(x), which is a well studied problem in quantum mechanics (see [53, 71,72,73,74]). On an infinite line, if V(x) grows when \(\vert x\vert \rightarrow \infty \), then there is a bound state [70].

3.2 Canonical measure for the Langevin dynamics

One could similarly derive the canonical measure and the corresponding rate equation. In this way, the continuous analogue \(P_t^{(\lambda )}(x)\) of the canonical probability (16) generalizes (18a)–(18e) and takes the following forms in the five regions of Fig. 1 (see the derivation in Appendix C):

  • Region I

    $$\begin{aligned} P_t^{(\lambda )}(x)=\frac{\left[ e^{-t {\mathcal {L}}_0^\dagger }\cdot \ell _\lambda \right] (x)\,r_0(x)}{\int dy \,\ell _\lambda (y) \, r_0(y)} \end{aligned}$$
    (38a)
  • Region II

    $$\begin{aligned} P_t^{(\lambda )}(x)=\frac{\ell _\lambda (x)\left[ e^{t{\mathcal {L}}_\lambda }\cdot r_0\right] (x)}{e^{t\mu (\lambda )}\int dy \,\ell _\lambda (y) \, r_0(y)} \end{aligned}$$
    (38b)
  • Region III

    $$\begin{aligned} P_t^{(\lambda )}(x)=\ell _\lambda (x)r_\lambda (x) \end{aligned}$$
    (38c)
  • Region IV

    $$\begin{aligned} P_t^{(\lambda )}(x)=\frac{\left[ e^{(T-t){\mathcal {L}}_\lambda ^\dagger }\cdot \ell _0\right] (x)\, r_\lambda (x)}{e^{(T-t)\mu (\lambda )}\int dy \, r_\lambda (y)} \qquad \text {with}\quad \ell _0(x)=1 \end{aligned}$$
    (38d)
  • Region V

    $$\begin{aligned} P_t^{(\lambda )}(x)=\frac{\left[ e^{(t-T){\mathcal {L}}_0}\cdot r_\lambda \right] (x)}{\int dy \,r_\lambda (y)} \end{aligned}$$
    (38e)

These expressions of \(P_{t}^{(\lambda )}(x)\), particularly (38b)–(38d), were already written in [46, 47].

The time evolution of the tilted dynamics is described by a Langevin equation (28) with a modified force \(F_t^{(\lambda )}(x)\), which, in general, depends on time. The force takes different expressions in the five regions indicated in Fig. 1.

  • Region I

    $$\begin{aligned} F_t^{(\lambda )}(x)=F(x)+\epsilon \frac{d}{dx}\log \left[ e^{-t{\mathcal {L}}_0^{\dagger }}\cdot \ell _{\lambda }(x) \right] \end{aligned}$$
    (39a)
  • Regions II and III. We recover an earlier result [46, 47]:

    $$\begin{aligned} F_t^{(\lambda )}(x)=F(x)+\epsilon \left( \lambda h(x)+\frac{d}{dx}\log \ell _{\lambda }(x)\right) \end{aligned}$$
    (39b)
  • Region IV

    $$\begin{aligned} F_t^{(\lambda )}(x)=F(x)+\epsilon \left( \lambda h(x)+\frac{d}{dx}\log \left[ e^{(T-t){\mathcal {L}}_\lambda ^{\dagger }}\cdot \ell _0(x) \right] \right) \end{aligned}$$
    (39c)
  • Region V

    $$\begin{aligned} F_t^{(\lambda )}(x)=F(x) \end{aligned}$$
    (39d)

A derivation is given in Appendix C. One can easily verify that the probabilities (38a)–(38e) follow a Fokker–Planck equation with the corresponding forces (39a)–(39d). To see this, for example in region I, one can simply use that \(\left[ e^{-t {\mathcal {L}}_0^\dagger }\cdot \ell _\lambda \right] (x)\equiv V_t(x)\) in (38a) is a solution of \(\frac{d}{dt}V_t(x)=-{\mathcal {L}}_0^\dagger \cdot V_t(x)\) and that \({\mathcal {L}}_0\cdot r_0(x)=0\).

Remark

We have considered the noise amplitude \(\epsilon \) in (28) to be a constant. A generalization where the amplitude is a function of \(X_t\) involves a choice of the Itô–Stratonovich discretization [67]. The analysis could be easily extended to such cases, as well as to higher dimensions.

3.3 The Ornstein–Uhlenbeck process

As a simple illustrative example [32] one can consider the Langevin equation in a harmonic potential, \(F(x)=-\gamma \, x\). This is known as the Ornstein–Uhlenbeck process [60, 70]. To make our discussion simple, we choose the observable \(Q_T=\int _0^Tds\; X_s\), which corresponds to \(f(x)=x\) and \(h(x)=0\) in (30). In this case, the tilted Fokker–Planck operator (33) gives

$$\begin{aligned} {\mathcal {L}}_\lambda :=\lambda x +\gamma \frac{d}{dx}x+\frac{\epsilon }{2}\frac{d^2}{dx^2} \end{aligned}$$

Its largest eigenvalue and the corresponding eigenvectors are [32, 70]

$$\begin{aligned} \mu (\lambda )=\frac{\epsilon \lambda ^2}{2\gamma ^2}; \qquad r_\lambda (x)=\mathcal {N}e^{-\frac{\gamma }{\epsilon }\left( x-\frac{\mu }{\lambda }\right) ^2}; \qquad \ell _\lambda (x)=e^{\frac{\lambda }{\gamma }x} \end{aligned}$$
(40)

with \(\mathcal {N}\) determined from normalization \(\int dx \ell _\lambda (x)r_\lambda (x)=1\). The Legendre transformation (12) gives the large deviation function \(\phi (q)=\tfrac{\gamma ^2}{2\epsilon }q^2\).

The canonical measure (38a)–(38e) and the effective force (39a)–(39d) can be explicitly evaluated in this example. One would essentially need to evaluate terms like \(\left[ e^{-t {\mathcal {L}}_0^\dagger }\cdot \ell _\lambda \right] (x)\equiv V_t(x)\), which is a solution of \(\frac{d}{dt}V_t(x)=-{\mathcal {L}}_0^\dagger \cdot V_t(x)\) with the initial condition \(V_0(x)=\ell _\lambda (x)\). It is simple to verify that the solution is

$$\begin{aligned} \left[ e^{-t {\mathcal {L}}_0^\dagger }\cdot \ell _\lambda \right] (x)=\exp \left[ \frac{\lambda x}{\gamma }e^{\gamma t}+\frac{\lambda ^2\epsilon }{4\gamma ^3}\left( 1-e^{2\gamma t}\right) \right] \qquad \text {for } t\le 0 \end{aligned}$$

Similarly, one can verify

$$\begin{aligned} \left[ e^{t{\mathcal {L}}_\lambda }\cdot r_0\right] (x)&=\mathcal {N}\exp \left[ \left( 1-e^{-\gamma t}\right) \left\{ \frac{\lambda x}{\gamma }-\frac{\epsilon \lambda ^2}{4\gamma ^3}\left( 3-e^{-\gamma t}\right) \right\} +\frac{\epsilon \lambda ^2 t}{2\gamma ^2}-\frac{\gamma x^2}{\epsilon }\right] \\&\quad \quad \text {for } t\ge 0, \\ \quad \left[ e^{(T-t){\mathcal {L}}_\lambda ^\dagger }\cdot \ell _0\right] (x)&= \exp \left[ \left( 1-e^{-\gamma (T-t)}\right) \left\{ \frac{\lambda x}{\gamma }-\frac{\epsilon \lambda ^2}{4\gamma ^3}\left( 3-e^{-\gamma (T-t)}\right) \right\} \right. \\&\quad \left. +\frac{\epsilon \lambda ^2 (T-t)}{2\gamma ^2}\right] \qquad \text {for } t\le T, \\ \quad \left[ e^{(t-T){\mathcal {L}}_0}\cdot r_\lambda \right] (x)&= \mathcal {N}\exp \left[ -\frac{\gamma }{\epsilon }\left( x-\frac{\epsilon \lambda }{2\gamma ^2} e^{-\gamma (t-T)}\right) ^2\right] \qquad ~ \text {for } t\ge T. \end{aligned}$$

Using these in the general expressions (38a)–(38e) and (39a)–(39d) we find that, in all regions, the canonical measure and the effective force are of the form

$$\begin{aligned} P_t^{(\lambda )}(x)=\sqrt{\frac{\gamma }{\pi \epsilon }}\exp \left[ -\frac{\gamma }{\epsilon }(x-a_t)^2\right] \qquad \text {and}\qquad F_t^{(\lambda )}(x)=-\gamma \left( x-\epsilon \, b_t\right) \end{aligned}$$
(41)

This means that the tilted dynamics is another Langevin equation in a harmonic potential whose minimum is at \(\epsilon \, b_t\). We get, in region I, \(a_t=\frac{\epsilon \lambda }{2\gamma ^2}e^{\gamma t}\) and \(b_t=\frac{ \lambda }{\gamma ^2}e^{\gamma t}\); in region II, \(a_t=\frac{\epsilon \lambda }{\gamma ^2}\left( 1-\frac{1}{2}e^{-\gamma \, t}\right) \) and \( b_t=\frac{ \lambda }{\gamma ^2}\); in region III, \(a_t=\frac{\epsilon \lambda }{\gamma ^2}\) and \(b_t=\frac{ \lambda }{\gamma ^2}\); in region IV, \(a_t=\frac{\epsilon \lambda }{\gamma ^2}\left( 1-\frac{1}{2}e^{-\gamma \, (T-t)}\right) \) and \(b_t=\frac{ \lambda }{\gamma ^2}\left( 1-e^{-\gamma (T-t)}\right) \); in region V, \(a_t=\frac{\epsilon \lambda }{2\gamma ^2} e^{-\gamma \, (t-T)}\) and \(b_t=0\).

For large T, one can get the microcanonical probability \({\mathcal {P}}_t(x\vert q)\) using \(\frac{\epsilon \lambda }{\gamma ^2}=q\) (from \(\lambda =\phi '(q)\)) in the above expression for \(P_t^{(\lambda )}(x)\). From this solution, one can also see that the most likely trajectory followed by the system is \(x(t)=a_t\). A schematic of the trajectory is given in Fig. 2.

Fig. 2

A schematic of the most probable trajectory, for large T, of the conditioned Ornstein–Uhlenbeck process defined in Sect. 3.3. The most probable position changes with time, reaching a time-independent value \(q=\frac{\epsilon \lambda }{\gamma ^2}\) only in the intermediate quasi-stationary region III. The evolution is symmetric under time reversal, with most probable position \(\frac{q}{2}\) at \(t=0\) and \(t=T\)
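The quasi-stationary (region III) result can also be checked by simulating the effective Langevin dynamics directly. The sketch below (Python; parameter values are arbitrary, and the agreement is only statistical) simulates \({\dot{X}}=-\gamma (X-\epsilon b)+\eta \) with \(b=\lambda /\gamma ^2\) and compares the empirical mean and variance with the Gaussian measure (41), i.e. mean \(a=\epsilon \lambda /\gamma ^2\) and variance \(\epsilon /(2\gamma )\).

```python
import numpy as np
rng = np.random.default_rng(2)

gamma, eps, lam = 1.0, 0.5, 0.8
b = lam / gamma**2                                  # region III value of b_t
dt, n_steps, burn = 1e-3, 400_000, 50_000
x, xs = 0.0, []
for i in range(n_steps):
    x += -gamma * (x - eps * b) * dt + np.sqrt(eps * dt) * rng.normal()
    if i >= burn:
        xs.append(x)
xs = np.array(xs)
print(xs.mean(), eps * lam / gamma**2)              # empirical mean vs a_t = eps*lambda/gamma^2
print(xs.var(), eps / (2 * gamma))                  # empirical variance vs eps/(2*gamma), cf. (41)
```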

Remarks

  1.

    In this example, both \(X_t\) and \(Q_T\) are Gaussian variables. The direct calculation of the covariance is an alternative way of re-deriving (41).

  2.

    Here, the canonical measure \(P_t^{(\lambda )}(x)\) is symmetric under \(t\rightarrow T-t\), thus symmetric under time reversal. This is because on a one-dimensional line the force F(x) can be written as the gradient of a potential and the Langevin dynamics satisfies detailed balance. This would not necessarily be the case on a ring or in higher dimensions.

4 Large deviations in the conditioned Langevin dynamics

We shall now discuss the Langevin dynamics on the infinite line when the noise strength \(\epsilon \) is small. This weak noise limit has been of interest in the past [50, 51, 68, 69, 75,76,77], particularly in the Freidlin–Wentzell theory of large deviations for stochastic differential equations [19]. One may also view the fluctuating hydrodynamics description of interacting many-body systems as a generalization of the Langevin equation, where the weak noise limit comes from the large system size [12, 24, 78, 79]. A generalization of our discussion here to a many-body system will be presented in a forthcoming publication [52].

In this weak noise limit, one can describe rare fluctuations in terms of a large deviation function [19,20,21]. For example, the steady state probability of a Langevin equation describing a particle in a potential U(x) has a large deviation form

$$\begin{aligned} P(x)\sim e^{-\frac{2}{\epsilon }U(x)}\qquad \text {for small } \epsilon . \end{aligned}$$

In this Section, we shall show that a similar large deviation description holds, in general, for the canonical measure in the Langevin equation and also for the conditioned probability using the equivalence of ensembles.

4.1 WKB solution of the eigenfunctions

For small \(\epsilon \), one can try the WKB method [53] to determine the largest eigenvalue and associated eigenvectors of the tilted operator \({\mathcal {L}}_\lambda \) in (33). This means that we look for a solution of the type

$$\begin{aligned} r_{\frac{\kappa }{\epsilon }}(x)\sim e^{-\frac{1}{\epsilon }\psi _\text {right}^{(\kappa )}(x)}, \qquad \ell _{\frac{\kappa }{\epsilon }}(x)\sim e^{-\frac{1}{\epsilon }\psi _\text {left}^{(\kappa )}(x)} \end{aligned}$$
(42a)

by setting

$$\begin{aligned} \lambda =\frac{\kappa }{\epsilon } \qquad \text {and}\qquad \mu \left( \frac{\kappa }{\epsilon }\right) \simeq \frac{1}{\epsilon }\chi (\kappa ) \end{aligned}$$
(42b)

in the eigenvalue equations (35). The scaling (42b) is known [50, 51, 77] for specific examples of (30), e.g. work and entropy production. We find that, for small \(\epsilon \), this is indeed a consistent solution to the leading order when \(\psi _\text {left}^{(\kappa )}\) and \(\psi _\text {right}^{(\kappa )}\) satisfy

$$\begin{aligned} F(x)^2-\left( \frac{d}{dx}\psi _\text {left}^{(\kappa )}(x)-\kappa h(x)-F(x)\right) ^2=&\,2\kappa f(x)-2\chi (\kappa ) \end{aligned}$$
(43a)
$$\begin{aligned} F(x)^2-\left( \frac{d}{dx}\psi _\text {right}^{(\kappa )}(x)+\kappa h(x)+F(x)\right) ^2=&\,2\kappa f(x)-2\chi (\kappa ) \end{aligned}$$
(43b)

When we use such a solution in (34) we get

$$\begin{aligned} G_T^{(\frac{\kappa }{\epsilon })}(x\vert y)\sim e^{\frac{T}{\epsilon }\chi (\kappa )-\frac{1}{\epsilon }\psi _\text {right}^{(\kappa )}(x) -\frac{1}{\epsilon }\psi _\text {left}^{(\kappa )}(y)} \end{aligned}$$
(44)

for small \(\epsilon \). This also gives a large deviation form for the canonical measure. In particular, the canonical measure (15) and (19), for small \(\epsilon \), gives

$$\begin{aligned} P_T^{(\frac{\kappa }{\epsilon })}(x)\sim e^{-\frac{1}{\epsilon }\psi _T^{(\kappa )}(x)} \qquad \text {and} \qquad P_0^{(\frac{\kappa }{\epsilon })}(x)\sim e^{-\frac{1}{\epsilon }\psi _0^{(\kappa )}(x)} \end{aligned}$$
(45)

where \(\psi _T^{(\kappa )}(x)= \psi _\text {right}^{(\kappa )}(x)\) and \(\psi _0^{(\kappa )}(x)=\psi _\text {left}^{(\kappa )}(x)+\mathcal {F}(x)\), up to an additive constant [we denote by \(\mathcal {F}(x)\) the large deviation function associated to the steady state probability of the original Langevin equation (28)].

Remarks

  1.

    The solution (44) implies that, for large T and small \(\epsilon \), the joint probability (13) also has a large deviation form given by

    $$\begin{aligned} P_T(x,Q_T=qT \vert y)\sim e^{-\frac{T}{\epsilon }\varPhi (q)-\frac{1}{\epsilon }\psi _\text {right}(x,q)-\frac{1}{\epsilon }\psi _\text {left}(y,q)} \end{aligned}$$

    where \(\varPhi (q)\), \(\psi _\text {right}(x,q)\), and \(\psi _\text {left}(x,q)\) are related to their counterparts \(\chi (\kappa )\), \(\psi _\text {right}^{(\kappa )}(x)\), and \(\psi _\text {left}^{(\kappa )}(x)\), respectively, by the Legendre transformation (11b), which gives

    $$\begin{aligned} \varPhi (q)=\kappa \; q -\chi (\kappa ) \qquad \text {with}\quad \chi '(\kappa )=q. \end{aligned}$$
    (46)

    This is due to the ensemble equivalence discussed in Sect. 2.2. See [47, 49, 66] for general conditions for the equivalence of ensembles to hold.

  2.

    Later, in Sect. 6.3, we will see that (43a43b) are the Hamilton–Jacobi equations in a variational formulation of the problem.

  3.

    The ansatz (42a)–(42b) is not always applicable, for example, in a double well potential. See [51, 80, 81] for other recent applications of the WKB analysis for conditioned stochastic processes.

4.2 Large deviation of the canonical measure

The WKB solution (42a) gives that the conditioned probability at any time t, for large T and small \(\epsilon \), in the two ensembles, has a large deviation form

$$\begin{aligned} P_t^{(\frac{\kappa }{\epsilon })}(x)\sim e^{-\frac{1}{\epsilon }\psi _t^{(\kappa )}(x)} \qquad \text {and} \qquad {\mathcal {P}}_t(x\vert Q=qT)\sim e^{-\frac{1}{\epsilon }\psi _t(x,q)} \end{aligned}$$
(47)

with the two large deviation functions related by the Legendre transformation (46). This is already seen in (45). For other times, this comes from using the WKB solution (42a)–(42b) in the expressions (38a)–(38e) for small \(\epsilon \).

Among these, the simplest case is the quasi-stationary regime, i.e. \(1\ll t\) and \(T-t\gg 1\), where \(P_t^{(\lambda )}(x)=r_\lambda (x)\ell _\lambda (x)\), as given in (38c). Using (42a) we get

$$\begin{aligned} \psi _t^{(\kappa )}(x)\equiv \psi _\text {qs}^{(\kappa )}(x)=\psi _\text {right}^{(\kappa )}(x)+\psi _\text {left}^{(\kappa )}(x) \end{aligned}$$
(48)

(The subscript qs refers to the quasi-stationary state.)

5 Langevin equation on the infinite line

On the infinite line, F(x) is the gradient of a potential U(x), i.e. \(F(x)=-\partial _x U(x)\). (For a finite line with a periodic boundary see [46, 47, 50, 51, 82].) For simplicity, we consider \(Q_T=\int _0^Tdt\, f(X_t)\), i.e. \(h(x)=0\) in (30). Moreover, we consider that \(F(x)^2-2\kappa f(x)\) has a single global minimum [see the remark after (51)] and grows as \(\vert x \vert \rightarrow \infty \). For small \(\epsilon \), together with (37), this ensures that the spectral gap is non-vanishing.

In this case, the two solutions of the Hamilton–Jacobi equations (43a)–(43b) are related,

$$\begin{aligned} \psi _\text {left}^{(\kappa )}(x)=\psi _\text {right}^{(\kappa )}(x)-2U(x)+\text {constant} \end{aligned}$$
(49)

(This would not be true, in general, when F(x) is not the gradient of a potential, for example on a ring with a circular driving force.)

Moreover, using (49), the effective force (39b) in the quasi-stationary regime, for small \(\epsilon \), can be written as

$$\begin{aligned} F_t^{(\frac{\kappa }{\epsilon })}(x)\simeq F(x)-\partial _x \psi _\text {left}^{(\kappa )}(x)=-\frac{1}{2}\partial _x \psi _\text {qs}^{(\kappa )}(x) \end{aligned}$$
(50)

(This is only the leading order term for small \(\epsilon \).) This shows that the tilted process can be viewed as a Langevin dynamics in the potential landscape of the large deviation function \(\psi _\text {qs}^{(\kappa )}(x)\).

5.1 An explicit solution

The Hamilton–Jacobi equations (43a)–(43b) are simple to solve. For example, let us take (43b), which is quadratic and has two solutions \(\psi _\pm ^{(\kappa )}(x)\) satisfying

$$\begin{aligned} \partial _x\psi _\pm ^{(\kappa )}(x)=-F(x)\pm \sqrt{F(x)^2-2\kappa f(x)+2\chi (\kappa )} \end{aligned}$$

When \(F(x)^2-2\kappa f(x)\) has a single global minimum at a value \(x=u\) and grows as \(x\rightarrow \pm \infty \), the only possible choice is

$$\begin{aligned} \partial _x\psi _\text {right}^{(\kappa )}(x)={\left\{ \begin{array}{ll}\partial _x\psi _{+}^{(\kappa )}(x), &{} \text{ for } x\ge u, \\ \partial _x\psi _{-}^{(\kappa )}(x), &{} \text{ for } x\le u. \end{array}\right. } \end{aligned}$$

At the meeting point, the eigenfunction \(r_{\frac{\kappa }{\epsilon }}(x)\) and its derivative are continuous, which leads to continuity of \(\partial _x\psi _\text {right}^{(\kappa )}(x)\). This condition gives

$$\begin{aligned} \chi (\kappa )=\kappa f(u)-\frac{1}{2}F(u)^2 \qquad \text {with}\qquad \kappa =\frac{F(u)F'(u)}{f'(u)} \end{aligned}$$
(51)
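The parametric solution (51) is easy to evaluate numerically. The sketch below (Python; \(\gamma \) and the values of \(\kappa \) are arbitrary) finds the minimum u of \(F(x)^2-2\kappa f(x)\) for the Ornstein–Uhlenbeck choice \(F(x)=-\gamma x\), \(f(x)=x\), and checks that (51) reproduces \(\chi (\kappa )=\kappa ^2/(2\gamma ^2)\), which is \(\epsilon \,\mu (\kappa /\epsilon )\) for the eigenvalue (40) of Sect. 3.3.

```python
import numpy as np
from scipy.optimize import minimize_scalar

gamma = 1.3
F = lambda x: -gamma * x
f = lambda x: x

for kappa in [0.3, 0.7, 1.5]:
    # u minimizes F(x)^2 - 2 kappa f(x); here u = kappa/gamma^2
    u = minimize_scalar(lambda x: F(x)**2 - 2 * kappa * f(x),
                        bounds=(-10, 10), method='bounded').x
    chi = kappa * f(u) - F(u)**2 / 2                       # Eq. (51)
    print(kappa, u, chi, kappa**2 / (2 * gamma**2))        # chi(kappa) vs kappa^2/(2 gamma^2)
```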

Remark

The reason for imposing the condition that \(F(x)^2-2\kappa f(x)\) has a single global minimum is that otherwise one cannot straightforwardly extend the asymptotic solutions \(\psi _\pm ^{(\kappa )}(x)\) to all values of x, similarly to the WKB analysis of a double well potential in quantum mechanics [53]. This is because between the minima the eigenfunction is a superposition of the \(\psi _{+}^{(\kappa )}(x)\) and \(\psi _{-}^{(\kappa )}(x)\) solutions and one has to carefully match the solutions at each minimum.

The second Hamilton–Jacobi equation (43a) is similarly solved. Integrating these solutions we write

$$\begin{aligned} \psi _\text {right}^{(\kappa )}(x)=&\int _{x^\star }^{x}dz\;\left\{ -F(z)+\text {sgn}(x-u)\sqrt{F(z)^2-F(u)^2-2\kappa [f(z)-f(u)]}\right\} \end{aligned}$$
(52a)
$$\begin{aligned} \psi _\text {left}^{(\kappa )}(x)=&K+\int _{x^\star }^{x}dz\;\left\{ F(z)+\text {sgn}(x-u)\sqrt{F(z)^2-F(u)^2-2\kappa [f(z)-f(u)]}\right\} \end{aligned}$$
(52b)

where K and \(x^\star \) are a priori arbitrary constants. To satisfy the normalization \(\int dx \, r_\lambda (x)\ell _\lambda (x){=}1\), one can choose \(K=0\) and \(x^\star =u\) (using the fact that \(F(x)^2-2\kappa f(x)\) has its minimum at \(x=u\)).

Using (52a)–(52b) in (45) one can see that \(\psi _T^{(\kappa )}(x)\) and \(\psi _0^{(\kappa )}(x)\) both have a minimum at \(x_0\) given by \(f(x_0)=f(u)-\frac{1}{2\kappa }F(u)^2\). This makes \(x_0\) the most likely position at times \(t=0\) and \(t=T\), which is different from the quasi-stationary position u.

As a consequence of (52a)–(52b) we get the large deviation function (48) in the quasi-stationary regime

$$\begin{aligned} \psi _\text {qs}^{(\kappa )}(x)=2\,\text {sgn}(x-u)\int _{u}^{x}dz\; \sqrt{F(z)^2-F(u)^2-2\kappa [f(z)-f(u)]} \end{aligned}$$
(53)

This shows that \(x=u\) is the most likely position in the quasi-stationary regime.
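The quadrature (53) can be carried out numerically for any F and f satisfying the assumptions above. The sketch below (Python) does it for the Ornstein–Uhlenbeck choice \(F(x)=-\gamma x\), \(f(x)=x\), where \(u=\kappa /\gamma ^2\) and the integral reduces to \(\psi _\text {qs}^{(\kappa )}(x)=\gamma (x-u)^2\), consistent with the Gaussian quasi-stationary measure (41).

```python
import numpy as np
from scipy.integrate import quad

gamma, kappa = 1.0, 0.4
u = kappa / gamma**2                                # minimum of F^2 - 2*kappa*f
F = lambda x: -gamma * x
f = lambda x: x

def psi_qs(x):
    """Numerical evaluation of Eq. (53)."""
    integrand = lambda z: np.sqrt(max(F(z)**2 - F(u)**2 - 2 * kappa * (f(z) - f(u)), 0.0))
    val, _ = quad(integrand, u, x)
    return 2 * np.sign(x - u) * val

for x in [-1.0, 0.0, 1.0, 2.0]:
    print(x, psi_qs(x), gamma * (x - u)**2)         # quadrature vs gamma*(x-u)^2
```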

Remarks

  1.

    In this example, one could systematically calculate sub-leading corrections in the eigenvalue and eigenvector. Writing

    $$\begin{aligned} r_{\frac{\kappa }{\epsilon }}(x)= e^{-\frac{1}{\epsilon }\psi _\text {right}^{(\kappa )}(x)-{\widetilde{\psi }}_ \text {right}^{(\kappa )}(x)+\cdots }, \qquad \mu \left( \frac{\kappa }{\epsilon }\right) = \frac{1}{\epsilon }\chi (\kappa )+{\widetilde{\chi }}(\kappa )+\cdots \end{aligned}$$

    in (35) (we are using \(h(x)=0\)) and expanding in powers of \(\epsilon \) one would get in the sub-leading order

    $$\begin{aligned} -F'(x)+\left[ F(x)+\partial _x\psi _\text {right}^{(\kappa )}(x)\right] \partial _x {\widetilde{\psi }}_\text {right}^{(\kappa )}(x)-\frac{1}{2}\partial _x^2 \psi _\text {right}^{(\kappa )}(x)={\widetilde{\chi }}(\kappa ) \end{aligned}$$
    (54)

    Using (52a) we see that the term \(F(x)+\partial _x\psi _\text {right}^{(\kappa )}(x)\) in (54) vanishes at \(x=u\). Moreover, from (52a) we get

    $$\begin{aligned} \lim _{x\rightarrow u}\partial _x^2\psi _\text {right}^{(\kappa )}(x)=-F'(u)+\sqrt{F'(u)^2+F(u)F''(u)-\kappa f''(u)} \end{aligned}$$

    This and the fact that \(\partial _x\psi _\text {right}^{(\kappa )}(x)=-F(x)\) for \(x=u\) gives for the sub-leading order correction to the eigenvalue

    $$\begin{aligned} {\widetilde{\chi }}(\kappa )=-\frac{1}{2}\left[ F'(u)+\sqrt{F'(u)^2+F(u)F''(u)-\kappa f''(u)}\right] \end{aligned}$$
    (55)

    An explicit expression for \({\widetilde{\psi }}_\text {right}^{(\kappa )}(x)\) could also be deduced from (52a) and (54).

  2.

    One can also check that the results for the Ornstein–Uhlenbeck process in Sect. 3.3 can be recovered by choosing \(f(x)=x\) and \(F(x)=-\gamma x\).

6 A variational formulation

In this section, we use the path integral formulation of the Langevin equation [19,20,21, 83]. A similar formulation has been used recently for large deviations of empirical observables [51, 69, 75, 76, 82], and it gives an alternative approach to the conditioned dynamics. It has been used in [50, 51] for specific examples of diffusion on a ring, giving explicit results for the effective dynamics in the quasi-stationary state and revealing dynamical phase transitions.

As in Sect. 5, we consider a Langevin equation on an infinite line and \(Q_T=\int _0^Tdt\, f(X_t)\), such that \(F(x)^2-2\kappa f(x)\) has a single global minimum and it grows at \(\vert x \vert \rightarrow \infty \).

We introduce the formulation for the generating function \(G_T^{(\lambda )}(x\vert y)\) for the Langevin dynamics. Using a path integral solution of (32) (see Appendix D for details) one can write, for small \(\epsilon \),

$$\begin{aligned} G_T^{(\frac{\kappa }{\epsilon })}(x\vert y)\sim \int _{z(0)=y}^{z(T)=x}\mathcal {D}[z]e^{\frac{1}{\epsilon }S_T^{(\kappa )}[z(t)]} \end{aligned}$$
(56)

where the action is

$$\begin{aligned} S_T^{(\kappa )}[z]=\int _0^Tdt\left\{ \kappa \, f(z)-\frac{{\dot{z}}^2}{2}+ {\dot{z}}\, F(z)-\frac{F(z)^2}{2} \right\} \end{aligned}$$
(57)

One may view (56) as a sum over all paths (connecting y to x during time T) weighted by \(\exp (\frac{1}{\epsilon }S_T^{(\kappa )}[z])\).

In the small \(\epsilon \) limit, if we assume that (56) is dominated by a single path, we get (44) with

$$\begin{aligned} T\chi (\kappa )-\psi _\text {right}^{(\kappa )}(x)-\psi _\text {left}^{(\kappa )}(y)= \max _{z(t)}S_T^{(\kappa )}[z(t)] \end{aligned}$$
(58)

where the maximum is over all possible trajectories z(t) with \(z(0)=y\) and \(z(T)=x\).

6.1 An explicit solution

We will first show how this variational approach allows one to recover the results of Sect. 5. As before, we limit our discussion to the case where \(F^2(x)-2\kappa \, f(x)\) has a single global minimum at \(x=u\). It will become clear shortly that, in the variational formulation, this condition ensures a single time-independent optimal path.

Fig. 3

A schematic of the optimal path for the variational problem in Sect. 6.1

Using variational calculus we get from (56)–(57) that the optimal path follows

$$\begin{aligned} \ddot{z}=\frac{d}{dz}\left[ \frac{F(z)^2}{2}-\kappa f(z)\right] \end{aligned}$$

Multiplying the above equation with \(2 {\dot{z}}\) and integrating we get

$$\begin{aligned} {\dot{z}}^2=F(z)^2-2\kappa f(z)+K \end{aligned}$$

where K is an integration constant. We see the similarity with the trajectory of a mechanical particle of constant energy \(\frac{1}{2}K\) in a potential \(\kappa f(z)-\frac{F(z)^2}{2}\), which has a single global maximum at \(x=u\). The trajectory has to cover a finite distance from the point y to the point x in a very large time T. The only way this can happen is if the trajectory passes arbitrarily close to u, which is a repulsive fixed point of the mechanical dynamics. This requires an energy almost equal to the maximum of the mechanical potential, with the difference vanishing as T grows. This gives \(K=2\kappa f(u)-F(u)^2\) and the optimal path

$$\begin{aligned} {\dot{z}}^2=F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2 \end{aligned}$$
(59)

Such a trajectory spends most of its time at the position u, and deviates from it only near the boundaries to comply with the conditions \(z(0)=y\) and \(z(T)=x\), as sketched in Fig. 3. Then, we can write the optimal path (59), for large T, as

$$\begin{aligned} {\dot{z}}(t)={\left\{ \begin{array}{ll}\text {sgn}(u-y)\sqrt{F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2}, &{} \text { for }0\le t\ll T, \\ 0, &{} \text {for }1\ll t \text { and }T-t\gg 1,\\ \text {sgn}(x-u)\sqrt{F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2}, &{} \text {for }0\le T-t\ll T. \end{array}\right. } \end{aligned}$$

To use this in the variational formula (58) we substitute \(F(z)^2\) from (59) in the expression (57) and get

$$\begin{aligned} \max _{z(t)}S_T^{(\kappa )}[z(t)]=T \left[ \kappa \, f(u)-\frac{1}{2}F(u)^2\right] +\int _0^{t_0} dt\, {\dot{z}}\, \left[ F(z)-{\dot{z}}\right] +\int _{t_0}^T dt\, {\dot{z}}\left[ F(z)-{\dot{z}}\right] \end{aligned}$$

where \(t_0\in [0,T]\). We see that the integration variable can be changed to z, and when \(1\ll t_0\) and \(T-t_0\gg 1\), we can use \(z(t_0)=u\), in addition to the boundary conditions \(z(0)=y\) and \(z(T)=x\). Using the explicit solution for \({\dot{z}}(t)\) given above, we get

$$\begin{aligned} \max _{z(t)}S_T^{(\kappa )}[z(t)]&=T \left[ \kappa \, f(u)-\frac{1}{2}F(u)^2\right] \\&\quad -\int _u^{y} dz\, \left[ F(z)+\text {sgn}(y-u)\sqrt{F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2}\right] \\&\quad -\int _u^{x} dz\, \left[ -F(z)+\text {sgn}(x-u)\sqrt{F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2}\right] \end{aligned}$$

When we use this result in the variational formula (58) for large T, we get \(\chi (\kappa )=\left[ \kappa \, f(u)-\frac{1}{2}F(u)^2\right] \), in agreement with our earlier result in (51). Moreover, we see that the second and third terms give \(\psi _\text {left}^{(\kappa )}(y)\) and \(\psi _\text {right}^{(\kappa )}(x)\) in (52a)–(52b).
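The optimal-path equation (59) can also be integrated numerically. The sketch below (Python) does this for the Ornstein–Uhlenbeck choice \(F(x)=-\gamma x\), \(f(x)=x\) (so \(u=\kappa /\gamma ^2\)), for which the first branch of the trajectory should approach u exponentially, \(z(t)=u+(y-u)e^{-\gamma t}\); the parameter values are arbitrary.

```python
import numpy as np
from scipy.integrate import solve_ivp

gamma, kappa, y = 1.0, 0.4, -1.0
u = kappa / gamma**2
F = lambda x: -gamma * x
f = lambda x: x

def zdot(t, z):
    """First branch of the optimal path (59), moving from y towards u."""
    arg = F(z[0])**2 - 2 * kappa * f(z[0]) + 2 * kappa * f(u) - F(u)**2
    return [np.sign(u - y) * np.sqrt(max(arg, 0.0))]

sol = solve_ivp(zdot, (0.0, 5.0), [y], dense_output=True, rtol=1e-8)
for t in [0.5, 1.0, 2.0, 4.0]:
    print(t, sol.sol(t)[0], u + (y - u) * np.exp(-gamma * t))
```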

6.2 Large deviation function

One could write a similar variational formula for \(\psi _t^{(\kappa )}(x)\), defined in (47), at an arbitrary time t. For large T,

$$\begin{aligned} \psi _t^{(\kappa )}(x)\simeq \max _{z}A_T^{(\kappa )}[z(\tau )]- \max _{z(t)=x}A_T^{(\kappa )}[z(\tau )] \end{aligned}$$
(60a)

where the action

$$\begin{aligned} A_T^{(\kappa )}[z(\tau )]=\int _{-\infty }^{\infty }d\tau \left\{ a(\tau )\, f(z)-\frac{{\dot{z}}^2}{2}+ {\dot{z}}\, F(z)-\frac{F(z)^2}{2} \right\} \end{aligned}$$
(60b)

with \(a(\tau )=\kappa \) for \(\tau \in [0,T]\) and \(a(\tau )=0\) elsewhere. The first maximization in (60a) is over all paths, whereas the second maximization is over paths which are conditioned to be at \(z(\tau )=x\) for \(\tau =t\).

One may understand the formula (60a) as an optimal contribution from an ensemble of paths with probability weight \(e^{\frac{1}{\epsilon } A_T^{(\kappa )}[z]}\) conditioned to pass through x at time t; the first term in (60a) is due to normalization.

Here, we show how one can use this variational approach to derive \(\psi _t^{(\kappa )}(x)\) at an arbitrary time. For this we impose as in Sect. 6.1 that \(F(x)^2-2\kappa \, f(x)\) has a single global minimum such that the most likely position in the quasi-stationary regime is time independent, \(z(\tau )=u\).

6.2.1 Quasi-stationary regime

Among all five regions in Fig. 1, the simplest to analyze is the quasi-stationary regime, where \(1\ll t\) and \(T-t\gg 1\). Here, for the optimization in (60a), one essentially needs to consider paths which asymptotically reach u, both at small times and at times close to T. A schematic of such a path is given in Fig. 4.

Fig. 4  A schematic of a path leading to a fluctuation x at time t, and subsequent relaxation to the quasi-stationary value u in region III

The analysis is quite similar to that in Sect. 6.1. We find that the optimal path follows

$$\begin{aligned} \frac{dz(\tau )}{d\tau }={\left\{ \begin{array}{ll}\text {sgn}(x-u)\sqrt{F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2}, &{} \text {for }\tau <t,\\ \text {sgn}(u-x)\sqrt{F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2}, &{} \text {for }\tau >t, \end{array}\right. } \end{aligned}$$
(61)

and using this in (60a) we get

$$\begin{aligned} \psi _t^{(\kappa )}(x)=\int _0^t d\tau \, {\dot{z}} \left[ {\dot{z}}-F(z)\right] +\int _t^T d\tau \, {\dot{z}} \left[ {\dot{z}}-F(z)\right] \end{aligned}$$

Changing the integration variable to z and using the solution (61) with the asymptotics sketched in Fig. 4, we get

$$\begin{aligned} \psi _t^{(\kappa )}(x)=&\int _u^x dz \left[ -F(z)+\text {sgn}(x-u)\sqrt{F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2}\right] \\&+\int _u^x dz \left[ F(z)+\text {sgn}(x-u)\sqrt{F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2}\right] \end{aligned}$$

Comparing with the expressions in (52a)–(52b), we see that \(\psi _t^{(\kappa )}(x)=\psi _\text {right}^{(\kappa )}(x)+\psi _\text {left}^{(\kappa )}(x)\), in agreement with our earlier results (48) and (53).
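This identity can be checked numerically. The sketch below (with the same purely illustrative choices as before, \(F(x)=-x\), \(f(x)=x\), \(\kappa =0.5\), none of which are fixed by the text) evaluates \(\psi _t^{(\kappa )}(x)\) both from the z-integrals above and from the time integral along a numerically integrated optimal path (61).

```python
# Consistency check of psi_t(x) in the quasi-stationary regime.
# Illustrative assumptions (not fixed by the text): F(x) = -x, f(x) = x, kappa = 0.5.
import numpy as np
from scipy.integrate import quad

F = lambda z: -z
f = lambda z: z
kappa = 0.5
u = kappa                                           # minimum of F^2 - 2*kappa*f for this choice
g = lambda z: F(z)**2 - 2*kappa*f(z) + 2*kappa*f(u) - F(u)**2
sq = lambda z: np.sqrt(max(g(z), 0.0))

x = 1.2
# (i) the two z-integrals above: psi_right(x) + psi_left(x)
psi_right = quad(lambda z: -F(z) + np.sign(x - u)*sq(z), u, x)[0]
psi_left  = quad(lambda z:  F(z) + np.sign(x - u)*sq(z), u, x)[0]

# (ii) time integral of zdot*(zdot - F) along the optimal path (61):
#      generation segment (u -> x) followed by relaxation (x -> u)
def segment_cost(z0, z1, sign, dt=1e-4, nmax=10**6):
    z, cost = z0, 0.0
    for _ in range(nmax):
        zdot = sign*sq(z)
        cost += zdot*(zdot - F(z))*dt
        z += zdot*dt
        if (z1 - z)*sign <= 0:                      # reached the end point of the segment
            break
    return cost

off = 1e-6*np.sign(x - u)                           # start/stop slightly away from u
cost = (segment_cost(u + off, x, np.sign(x - u)) +
        segment_cost(x, u + off, np.sign(u - x)))
print("psi_right + psi_left =", psi_right + psi_left, "  time integral =", cost)
```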

Remark

From (61) one can see that the optimal path leading to a fluctuation in the quasi-stationary regime, and its subsequent relaxation, follows a deterministic evolution in the potential landscapes built from \(\psi _\text {right}^{(\kappa )}\) and \(\psi _\text {left}^{(\kappa )}\):

$$\begin{aligned} \frac{dz(\tau )}{d\tau }=&F(z)+\frac{d}{dz} \psi _\text {right}^{(\kappa )}(z)=-\frac{d}{dz}\left[ U(z)- \psi _\text {right}^{(\kappa )}(z)\right] \quad \text{ for } \tau <t, \end{aligned}$$
(62a)
$$\begin{aligned} \frac{dz(\tau )}{d\tau }=&F(z)-\frac{d}{dz} \psi _\text {left}^{(\kappa )}(z)=-\frac{d}{dz}\left[ U(z)+\psi _\text {left}^{(\kappa )}(z)\right] \qquad \text{ for } \tau >t. \end{aligned}$$
(62b)
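This identification can be checked directly: differentiating the integral expressions for \(\psi _\text {right}^{(\kappa )}\) and \(\psi _\text {left}^{(\kappa )}\) used above with respect to their end point gives

$$\begin{aligned} \frac{d}{dz} \psi _\text {right}^{(\kappa )}(z)&=-F(z)+\text {sgn}(z-u)\sqrt{F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2}, \\ \frac{d}{dz} \psi _\text {left}^{(\kappa )}(z)&=F(z)+\text {sgn}(z-u)\sqrt{F(z)^2-2\kappa f(z)+2 \kappa f(u)-F(u)^2}, \end{aligned}$$

so that \(F(z)+\frac{d}{dz}\psi _\text {right}^{(\kappa )}(z)\) and \(F(z)-\frac{d}{dz}\psi _\text {left}^{(\kappa )}(z)\) reproduce the two lines of (61), since \(\text {sgn}(z-u)=\text {sgn}(x-u)\) along each segment of the path.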

6.2.2 Region II \((0\le t \ll T)\)

The calculation of \(\psi _t^{(\kappa )}(x)\) in the other regions of time is quite similar. For example, in region II, in the variational formula (60a), one essentially needs to consider paths that start at the minimum of U(x) (with \(F(x)=-U'(x)\)) when \(\tau \rightarrow -\infty \), pass through \(z=x\) at \(\tau =t\ge 0\), and asymptotically reach the quasi-stationary value u for large times \(\tau \gg 1\), as illustrated in Fig. 5.

Following an analysis similar to that in Sect. 6.1, it is straightforward to show that the optimal path in this case is

$$\begin{aligned} {\dot{z}}(\tau )={\left\{ \begin{array}{ll}-F(z), &{} \text {for }\tau \le 0\\ \text {sgn}(x-y)\sqrt{F(z)^2-2\kappa f(z)+K_1}, &{} \text {for } 0\le \tau \le t\\ \text {sgn}(u-x)\sqrt{F(z)^2-2\kappa f(z)+K_2}, &{} \text {for }\tau \ge t, \end{array}\right. } \end{aligned}$$
(63)

where \(K_1\) and \(K_2\) are integration constants, and the optimal path passes through a point \(z(0)=y\) (say) at \(\tau =0\). The solution for \(\tau \le 0\) follows from the condition that, as \(\tau \rightarrow -\infty \), the system starts at the minimum of the potential U(z) with \(F(z)=-U'(z)\). Similarly, requiring that at large times the system relaxes to the quasi-stationary position \(z=u\) fixes the constant \(K_2=2 \kappa f(u)-F(u)^2\). In addition, we have the condition

$$\begin{aligned} t=\int _0^td\tau =\int _y^{x}\frac{dz}{{\dot{z}}}=\int _y^{x}\frac{dz}{\text {sgn}(x-y)\sqrt{F(z)^2-2\kappa f(z)+K_1}} \end{aligned}$$
(64)

where we used the solution (63); this condition fixes the constant \(K_1\).

When we use the solution (63) to write \(F(z)^2\) in the expression (60b), we get

$$\begin{aligned} \max _{z(t)=x}A_T^{(\kappa )}[z(\tau )]=(T-t)\left[ \kappa f(u)-\frac{F(u)^2}{2}\right] +t\,\frac{K_1}{2}-\int _{-\infty }^T d\tau \,{\dot{z}}\big [{\dot{z}}-F(z)\big ] \end{aligned}$$

Using this in (60a), together with the result \(\max _{z}A_T^{(\kappa )}[z(\tau )]=T\left[ \kappa f(u)-\frac{F(u)^2}{2}\right] \), we get

$$\begin{aligned} \psi _t^{(\kappa )}(x)=t \left[ \kappa f(u)-\frac{F(u)^2}{2}\right] -t\,\frac{K_1}{2}+\int _{-\infty }^T d\tau \,{\dot{z}}\big [{\dot{z}}-F(z)\big ] \end{aligned}$$

In this expression, the integration variable can be changed from \(\tau \) to z, and then using the explicit solution (63), we get

$$\begin{aligned} \psi _t^{(\kappa )}(x)=t \left[ \kappa f(u)-\frac{F(u)^2}{2}\right] +\psi _\text {left}^{(\kappa )}(x) +{\widehat{B}}_t^{(\kappa )}(x,y)+\mathcal {F}(y) \end{aligned}$$
(65a)

where \(\psi _\text {left}^{(\kappa )}(x)\) is given in (52b), \(\mathcal {F}(y)=-2\int _0^y dz\, F(z)\), and

$$\begin{aligned} {\widehat{B}}_t^{(\kappa )}(x,y)=-t\,\frac{K_1}{2}+\int _y^x dz\left[ -F(z)+\text {sgn}(x-y)\sqrt{F(z)^2-2\kappa f(z)+K_1} \right] \end{aligned}$$
(65b)

We note that the condition (64) is equivalent to \(\partial _{K_1}{\widehat{B}}_t^{(\kappa )}(x,y)=0\), which relates \(K_1\) to y. In addition, the solution (65a) must be optimal under a variation of y. These two conditions together lead to \(\partial _y{\widehat{B}}_t^{(\kappa )}(x,y)=2F(y)\), which, with the formula (65b), gives \(K_1=2\kappa \, f(y)\). We note that this is equivalent to the continuity of \({\dot{z}}(\tau )\) at \(\tau =0\) in the solution (63). This result for \(K_1\), together with (64) and (65a)–(65b), gives a parametric solution of \(\psi _t^{(\kappa )}(x)\) in region II.

We have checked that the same result can also be derived using the eigenfunction of the tilted Fokker–Planck operator discussed earlier in Sect. 4.
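The parametric solution is easy to evaluate numerically. The following sketch does so for the purely illustrative choices \(F(x)=-x\), \(f(x)=x\), and \(\kappa =0.5\) (so that the minimum of U is at 0 and \(u=\kappa \)); none of these choices are fixed by the text, and the sweep over y is restricted to values for which the square root in (63) stays real.

```python
# Numerical sketch of the parametric solution (64)-(65b) in region II.
# Illustrative assumptions (not fixed by the text): F(x) = -x, f(x) = x, kappa = 0.5,
# so the minimum of U(x) = x^2/2 is at 0 and u = kappa.
import numpy as np
from scipy.integrate import quad

F = lambda z: -z
f = lambda z: z
kappa = 0.5
u = kappa
x = 0.8                                        # conditioned fluctuation at time t

def region_II(y):
    """Given the intermediate point y = z(0), return (t, psi_t(x))."""
    K1 = 2*kappa*f(y)                          # from continuity of zdot at tau = 0
    sq = lambda z: np.sqrt(F(z)**2 - 2*kappa*f(z) + K1)
    sgn = np.sign(x - y)
    t = quad(lambda z: 1.0/(sgn*sq(z)), y, x)[0]                    # eq. (64)
    B = -0.5*t*K1 + quad(lambda z: -F(z) + sgn*sq(z), y, x)[0]      # eq. (65b)
    sq_u = lambda z: np.sqrt(F(z)**2 - 2*kappa*f(z) + 2*kappa*f(u) - F(u)**2)
    psi_left_x = quad(lambda z: F(z) + np.sign(x - u)*sq_u(z), u, x)[0]   # cf. (52b)
    calF = -2*quad(F, 0, y)[0]                 # \mathcal{F}(y), minimum of U taken at 0
    psi = t*(kappa*f(u) - 0.5*F(u)**2) + psi_left_x + B + calF      # eq. (65a)
    return t, psi

# sweep y (kept where the square root stays real) to trace psi_t(x) against t
for y in np.linspace(0.3, 0.75, 4):
    t, psi = region_II(y)
    print(f"y = {y:.2f}   t = {t:.3f}   psi_t(x) = {psi:.4f}")
```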

Fig. 5  A schematic of a path leading to a fluctuation x at time t in region II, and subsequent relaxation to the quasi-stationary position u

6.3 The Hamilton–Jacobi equations from the variational approach

In Sect. 4 we showed how one can write the large deviation function in terms of a solution of the Hamilton–Jacobi equations (43a)–(43b) derived from the tilted Fokker–Planck operator. In this section, we describe how the same equations can be obtained using the variational formulation in (58). The advantage is that this variational approach can be extended to more general problems (see our forthcoming publication [52], where the approach becomes a generalization of the one by Bertini et al. [79]).

We start with a derivation of (43a). Using the definition (6), one can write, for the Langevin equation,

$$\begin{aligned} G_T^{(\lambda )}(x\vert y)=\int dz\, G_{T-t}^{(\lambda )}(x\vert z)G_{t}^{(\lambda )}(z\vert y) \end{aligned}$$
(66)

A schematic illustrating this integration is shown in Fig. 6.

Fig. 6  A schematic of the sample paths contributing to the time convolution in (66)

Using the large deviation form (44) and the path integral representation (56), for small \(\epsilon \), it is straightforward to write

$$\begin{aligned} t\chi (\kappa )-\psi _\text {left}^{(\kappa )}(y)\simeq \max _z\left\{ S_{t}^{(\kappa )}(z, y)-\psi _\text {left}^{(\kappa )}(z)\right\} \end{aligned}$$
(67)

where, from the Action (57), we get for small t,

$$\begin{aligned} S_{t}^{(\kappa )}(z, y)= t \, \kappa f(y)-\frac{t}{2}\left[ \frac{(z-y)}{t}-F(y)\right] ^2+\cdots \end{aligned}$$

Expanding (67) around y we get

$$\begin{aligned} \chi (\kappa )\simeq \kappa \, f(y)-\frac{F(y)^2}{2}+\frac{1}{t}\max _z\left\{ (z-y)[F(y)-\partial _y\psi _\text {left}^{(\kappa )}(y)]-\frac{( z-y)^2}{2\,t}\right\} \end{aligned}$$

Higher order terms in the expansion are negligible in the small t limit.

In this expression, the maximum is attained at

$$\begin{aligned} \frac{( z-y)}{t}=F(y)-\partial _y\psi _\text {left}^{(\kappa )}(y) \end{aligned}$$

Substituting this into the above expression for \(\chi (\kappa )\) and taking the \(t\rightarrow 0\) limit, we recover the Hamilton–Jacobi equation (43a) for \(h(x)=0\). One can similarly derive the Hamilton–Jacobi equation (43b) for \(\psi _\text {right}^{(\kappa )}(x)\). The analysis can be extended to \(h(x)\ne 0\) as well.
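For completeness, carrying out the maximization explicitly, the maximum in the expression above equals \(\frac{t}{2}\left[ F(y)-\partial _y\psi _\text {left}^{(\kappa )}(y)\right] ^2\), so that, in the \(t\rightarrow 0\) limit,

$$\begin{aligned} \chi (\kappa )= \kappa \, f(y)-\frac{F(y)^2}{2}+\frac{1}{2}\left[ F(y)-\partial _y\psi _\text {left}^{(\kappa )}(y)\right] ^2 =\kappa \, f(y)-F(y)\,\partial _y\psi _\text {left}^{(\kappa )}(y)+\frac{1}{2}\left[ \partial _y\psi _\text {left}^{(\kappa )}(y)\right] ^2, \end{aligned}$$

which is the Hamilton–Jacobi equation for \(\psi _\text {left}^{(\kappa )}\) at \(h(x)=0\).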

7 The effect of conditioning on the noise

In Sect. 3, we found (as shown earlier in [44,45,46,47]) that the Langevin dynamics (28) conditioned on \(Q_T\), for large T, can be effectively described by another Langevin equation with a modified force (39a)–(39d) and still a white noise. Here, we show that the noise realizations in the original Langevin equation (28) that are compatible with the conditioning in (30) are colored.

7.1 The effective dynamics

In (39a)–(39d) we saw that the Langevin dynamics conditioned on \(Q_T\) in (30) can be described, in the large T limit, by another Langevin dynamics with an effective force \(F_t^{(\lambda )}(x)\) and a Gaussian white noise \({\widetilde{\eta }}_t\) with mean zero and covariance \(\langle {\widetilde{\eta }}_t{\widetilde{\eta }}_{t'}\rangle =\epsilon \,\delta (t-t')\). In the weak noise limit, the effective force in the quasi-stationary regime (\(t\gg 1\) and \(T-t\gg 1\)) is given by (50) with (53), when \(h(x)=0\) in (30). So the effective dynamics, for large T and small \(\epsilon \), is

$$\begin{aligned} {\dot{X}}_t=-\text {sgn}(X_t-u)\sqrt{F(X_t)^2-F(u)^2-2\kappa [f(X_t)-f(u)]} +{\widetilde{\eta }}_t. \end{aligned}$$
(68)

In this quasi-stationary regime, the most probable position \(X_t=u\) is time independent (under the condition that \(F(x)^2-2\kappa f(x)\) has a single global minimum at \(x=u\)). Writing small fluctuations \(r_t=X_t-u\) around u, we get from (68)

$$\begin{aligned} {\dot{r}}_t=-\varGamma _u \, r_t+{\widetilde{\eta }}_t, \qquad \text {with}\qquad \varGamma _u=\sqrt{F'(u)^2+F(u)F''(u)-\kappa f''(u)}. \end{aligned}$$

The solution

$$\begin{aligned} r_t=\int _{-\infty }^t dt'\, e^{-\varGamma _u\, (t-t')}\,{\widetilde{\eta }}_{t'} \end{aligned}$$

leads to the following correlation

$$\begin{aligned} \left\langle X_tX_{t'}\right\rangle _c=\left\langle r_t r_{t'}\right\rangle =\frac{\epsilon }{2\varGamma _u}e^{-\varGamma _u\vert t-t'\vert }. \end{aligned}$$
(69)

So, for small \(\epsilon \), a typical trajectory of the effective dynamics in the quasi-stationary state has small fluctuations around \(X_t=u\) with correlation (69).
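These predictions are easy to test numerically. The sketch below simulates the effective dynamics (68) with an Euler–Maruyama scheme and compares the measured position autocovariance with (69); the choices \(F(x)=-x\), \(f(x)=x^2\), \(\kappa =0.3\), and \(\epsilon =0.05\) are purely illustrative and are not fixed by the text.

```python
# Numerical test of the position autocovariance (69) in the quasi-stationary regime.
# Illustrative assumptions (not fixed by the text): F(x) = -x, f(x) = x^2,
# kappa = 0.3, eps = 0.05, so that u = 0 and Gamma_u = sqrt(1 - 2*kappa).
import numpy as np

rng = np.random.default_rng(0)
F = lambda x: -x
f = lambda x: x**2
kappa, eps = 0.3, 0.05
u = 0.0
Gamma = np.sqrt(1.0 - 2.0*kappa)   # Gamma_u = sqrt(F'(u)^2 + F(u)F''(u) - kappa f''(u))

def drift(x):                      # effective force in (68)
    g = F(x)**2 - F(u)**2 - 2*kappa*(f(x) - f(u))
    return -np.sign(x - u)*np.sqrt(max(g, 0.0))

dt, nsteps = 0.01, 200_000         # Euler-Maruyama integration of (68)
X = np.empty(nsteps)
x = u
for i in range(nsteps):
    x += drift(x)*dt + np.sqrt(eps*dt)*rng.standard_normal()
    X[i] = x

lag = int(1.0/dt)                  # autocovariance at time lag |t - t'| = 1
measured = np.cov(X[:-lag], X[lag:])[0, 1]
predicted = eps/(2*Gamma)*np.exp(-Gamma*1.0)
print("measured C(1) =", measured, "   predicted by (69) =", predicted)
```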

7.1.1 The conditioned dynamics

If we come back to the original Langevin equation (28),

$$\begin{aligned} {\dot{Y}}_t=F(Y_t)+\eta _t \end{aligned}$$
(70)

then \(\eta _t\) is a priori delta-correlated in time. We now show that conditioning on a value of \(Q_T\), for large T, induces correlations of the noise \(\eta _t\). To do so, we use the fact that, at least for small \(\epsilon \), the trajectories of the dynamics (70) conditioned on \(Q_T\) for large T are the same as those of the effective dynamics (68), and therefore

$$\begin{aligned} {\langle Y_t\vert Q_T \rangle =\langle X_t \rangle \quad \text {and}}\quad \langle Y_t Y_{t'}\vert Q_T \rangle _c=\langle X_t X_{t'}\rangle _c \end{aligned}$$
(71)

Small fluctuations \(s_t=Y_t-u\) in the quasi-stationary regime are generated by a noise realization \(\eta _t\) in (70) given by

$$\begin{aligned} \eta _t\simeq -F(u)+{\dot{s}}_t-F'(u)\, s_t \end{aligned}$$
(72)

Then, using (69), (71), and (72) one gets

$$\begin{aligned} \left\langle Y_t \, \eta _{t'}\vert Q_T\right\rangle _c=\left\langle s_t\, \eta _{t'} \vert Q_T\right\rangle ={\left\{ \begin{array}{ll}g_R(t'-t), &{} \text { for } t'> t\\ g_F(t-t'), &{} \text { for } t'< t \end{array}\right. } \end{aligned}$$
(73a)

where

$$\begin{aligned} g_F(t)=\epsilon \,{ - F'(u)+\varGamma _u \over 2 \varGamma _u}e^{-\varGamma _u t}\qquad \text {and}\qquad g_R(t)=\epsilon \,{ -F'(u)- \varGamma _u \over 2 \varGamma _u}e^{-\varGamma _u t} \end{aligned}$$
(73b)

From (73a), we see that the fluctuation \(s_t\) is correlated not only with the noise in the past, but also with the noise in the future. Of course, when one removes the conditioning, i.e. for \(\kappa =0\), using \(F(0)=0\) (assuming that \(x=0\) is the stable fixed point of the unconditioned dynamics), one has \(\varGamma _0=-F'(0)\) and \(g_R=0\), as one would expect for a Markov process. One can also see, using (69), (71), and (72), that

$$\begin{aligned} \left\langle \eta _t\vert Q_T \right\rangle =-F(u)\qquad \text {and}\qquad \langle \eta _t\eta _{t'}\vert Q_T\rangle _c=\epsilon \, \frac{F'(u)^2-\varGamma _u^2}{2\varGamma _u}\, e^{-\varGamma _u\vert t-t'\vert } \end{aligned}$$
(74)

This means that in the conditioned ensemble, the original white noise \(\eta _t\) in the Langevin equation (70) becomes colored due to the conditioning on \(Q_T\), even in the large T limit.
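The colored correlation (74) can also be checked numerically: along trajectories of the (linearized) effective dynamics one can reconstruct the original noise via (72) and measure its covariance. The sketch below does this for the same illustrative choices as above (\(F(x)=-x\), \(f(x)=x^2\), \(\kappa =0.3\), \(\epsilon =0.05\), none fixed by the text); the agreement is only statistical, since the reconstructed noise also contains a white component.

```python
# Numerical check of the colored noise correlation (74).
# Illustrative assumptions (not fixed by the text): F(x) = -x, f(x) = x^2,
# kappa = 0.3, eps = 0.05, so u = 0, F'(u) = -1 and Gamma_u = sqrt(1 - 2*kappa).
import numpy as np

rng = np.random.default_rng(1)
kappa, eps = 0.3, 0.05
F_u, Fp_u = 0.0, -1.0                     # F(u) and F'(u) for F(x) = -x, u = 0
Gamma = np.sqrt(1.0 - 2.0*kappa)

dt, nsteps = 0.05, 400_000
eta = np.empty(nsteps)                    # reconstructed noise of the original dynamics
val = 0.0                                 # fluctuation s_t = Y_t - u
for i in range(nsteps):
    eta_tilde = np.sqrt(eps/dt)*rng.standard_normal()   # white noise of the effective dynamics
    # eq. (72) combined with the linearised effective equation sdot = -Gamma*s + eta_tilde:
    eta[i] = -F_u + (-Gamma*val + eta_tilde) - Fp_u*val
    val += (-Gamma*val + eta_tilde)*dt                   # Euler step for s_t
lag = int(1.0/dt)                         # covariance at time lag |t - t'| = 1
measured = np.cov(eta[:-lag], eta[lag:])[0, 1]
predicted = eps*(Fp_u**2 - Gamma**2)/(2*Gamma)*np.exp(-Gamma*1.0)
print("measured <eta eta'>_c at lag 1:", measured, "   predicted by (74):", predicted)
```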

8 Summary

In this work we studied how a stochastic system adapts its dynamics when it is conditioned on a certain value of an empirical observable \(Q_T\) of the form (4). This problem has been studied earlier in [32, 44,45,46,47]. The constrained dynamics in the large T limit is described by an effective Markov process [see (21)–(26)] if the original process is itself Markovian. In the case of the Langevin dynamics, the conditioning modifies the effective force [see (39a)–(39d)]. The description in terms of the effective dynamics in the large T limit comes from an equivalence of ensembles between the microcanonical ensemble [where the conditioning is on a fixed value of \(Q_T\), defined in (4) and (30)] and the canonical ensemble (where the dynamics is weighted by \(e^{\lambda \, Q_T}\)). This is similar to the equivalence of thermodynamic ensembles in equilibrium when the volume is large. The equivalence of ensembles and several of the expressions obtained in Sects. 2 and 3 were already known [32, 44,45,46,47,48, 59], mostly in the quasi-stationary regime. Here, we extend them to all regions of time.

In the weak noise limit of the Langevin dynamics, one can introduce large deviation functions which characterize fluctuations in the conditioned dynamics, for large T. Using a WKB solution, we showed in Sect. 4.1 that these large deviation functions can be expressed in terms of the solution of the Hamilton–Jacobi equations (43a)–(43b). The same result can also be derived (see Sect. 6) using a variational formulation, where the large deviation functions are related to the optimal value of the Action that characterizes the path-space probability. Within this variational approach, one can calculate the optimal trajectory, which describes how atypical fluctuations are generated and how they relax [see (61)–(63)]. A similar approach to our variational formulation was used recently [50, 51] in the quasi-stationary regime of a Langevin dynamics in a periodic potential.

One of the rather surprising aspects of the Langevin dynamics (28) is that the noise realizations that are compatible with the conditioning on \(Q_T\) in (30) become correlated in time [see (74)]. Moreover, fluctuations of the position at a given time become correlated with the noise in the future [see (73a)–(73b)].

The examples discussed in this paper are simple, as they deal with a single degree of freedom, but they are part of a theory which is rather general. In a forthcoming publication [52] we shall apply the same ideas to systems with many degrees of freedom [12, 24, 30], e.g. the symmetric exclusion process. The variational approach discussed here for the Langevin dynamics can be generalized to large systems, where the weak noise limit comes from the large volume. Several of the ideas used in this paper will be extended there.

We have seen in (18c) and (38c) that in the quasi-stationary regime the canonical measure is a product of the left and right eigenvectors corresponding to the largest eigenvalue of the tilted matrix. Even in the non-stationary regime [see (25a)] the canonical measure is a product of a left vector and a right vector, which evolve according to linear equations. This is reminiscent of Quantum Mechanics, where the probability is expressed as the product of the wave function and its complex conjugate, as already noted by Schrödinger [84] (see also [85, 86] for additional references).