1 Introduction

In this paper we study the Kardar–Parisi–Zhang (KPZ) equation in one spatial dimension

$$\begin{aligned} \partial _t h= \tfrac{1}{2} \partial _{xx} h+ \tfrac{1}{2} (\partial _x h)^2 + \xi , \end{aligned}$$
(1.1)

where \( h=h(t,x) \), \( (t,x)\in (0,\infty )\times \mathbb {R}\), and \( \xi =\xi (t,x) \) denotes spacetime white noise. The equation was introduced in [KPZ86] to describe the evolution of a randomly growing interface, and is connected to many physical systems including directed polymers in a random environment, last passage percolation, randomly stirred fluids, and interacting particle systems. The equation exhibits integrability and has statistical distributions related to random matrices. We refer to [FS10, Qua11, Cor12, QS15, CW17, CS19] and the references therein for mathematical studies of and related to the KPZ equation.

Due to the roughness of \( h\), the term \( (\partial _xh)^2 \) in (1.1) does not make literal sense, and the well posedness of the Kardar–Parisi–Zhang (KPZ) equation requires renormalization [Hai14, GIP15]. In this paper we work with the notion of Hopf–Cole solution. Informally, exponentiating \( h \) into \( Z= \exp (h) \) brings the Kardar–Parisi–Zhang (KPZ) equation to the Stochastic Heat Equation (SHE)

$$\begin{aligned} \partial _t Z= \tfrac{1}{2} \partial _{xx} Z+ \xi Z. \end{aligned}$$
(1.2)

It is standard to establish the well posedness of (1.2) by chaos expansion; see Sect. 2.1.1 for more discussions on Wiener chaos. For a function-valued initial data \( Z(0,\cdot ) \ge 0 \) that is not identically zero, [Mue91] showed that \( Z(t,x)>0 \) for all \( t>0 \) and \( x\in \mathbb {R}\) almost surely. The Hopf–Cole solution of the Kardar–Parisi–Zhang (KPZ) equation is then defined as \( h\,{:}{=}\,\log Z\). This notion of solution coincides with that of [Hai14, GIP15] under suitable assumptions. An often-considered choice is to start the Stochastic Heat Equation (SHE) from a Dirac delta at the origin, i.e., \( Z(0,\cdot )=\delta _0(\cdot ) \), which is referred to as the narrow wedge initial data for \( h\). For such an initial data, [Flo14] established the positivity of \( Z(t,x) \), so that the Hopf–Cole solution \( h\,{:}{=}\,\log Z\) is well-defined.
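The transformation \( Z=\exp (h) \) can be checked by a purely formal chain-rule computation (ignoring the Itô correction, which is what the renormalization absorbs):

$$\begin{aligned} \partial _t Z= Z\, \partial _t h= Z\big ( \tfrac{1}{2} \partial _{xx} h+ \tfrac{1}{2} (\partial _x h)^2 + \xi \big ), \qquad \tfrac{1}{2} \partial _{xx} Z= \tfrac{1}{2} Z\big ( \partial _{xx} h+ (\partial _x h)^2 \big ), \end{aligned}$$

so the quadratic terms cancel and \( Z \) formally solves (1.2).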

Large deviations of the KPZ equation have been intensively studied in the mathematics and physics communities in recent years. Results are quite fruitful in the long time regime, \( t\rightarrow \infty \). For the narrow wedge initial data, physics literature predicted that the one-point, lower-tail Large Deviation Principle (LDP) rate function should go through a crossover from a cubic power to a \( \frac{5}{2} \) power [KLD18b]. (The prediction of the \( \frac{5}{2} \) power actually first appeared in the short time regime; see the discussion about the short time regime below.) The work [CG20b] derived rigorous, detailed bounds on the one-point tail probabilities for the narrow wedge initial data and in particular proved the cubic-to-\( \frac{5}{2} \) crossover. Similar bounds were obtained in [CG20a] for general initial data. The exact lower-tail rate function was derived in the physics works [SMP17, CGK+18, KLDP18, LD19], and was rigorously proven in [Tsa18, CC19]. Each of these works adopts a different method. In [KLD19], the four methods in [SMP17, CGK+18, KLDP18, Tsa18] were shown to be closely related. As for the upper tail, the physics work [LDMS16] derived a \( \frac{3}{2} \) power law for the entire rate function under the narrow wedge initial data, and [DT19] gave a rigorous proof for this upper-tail Large Deviation Principle (LDP). The work [GL20] extended this upper-tail Large Deviation Principle (LDP) to general initial data.

For the finite time regime, \( t\in (0,\infty ) \) fixed, motivated by studying the positivity or regularity (of the one-point density) of the Stochastic Heat Equation (SHE) or related equations, the works [Mue91, MN08, Flo14, CHN16, HL18] established tail probability bounds of the Stochastic Heat Equation (SHE) or related equations.

In this paper we focus on short time large deviations of the Kardar–Parisi–Zhang (KPZ) equation. Employing the Weak Noise Theory (WNT), the physics works [KK07, KK09, MKV16, KMS16] predicted that the one-point, lower-tail rate function should cross over from a quadratic power law to a \( \frac{5}{2} \) power law for the narrow wedge and flat initial data. By analyzing an exact formula, the physics work [LDMRS16] obtained the entire one-point rate function for the narrow wedge initial data; see Sect. 1.4. This was confirmed by the numerical result [HLDM+18]. From this one-point rate function [LDMRS16] also demonstrated the crossover. The quadratic power arises from the Gaussian nature of the Kardar–Parisi–Zhang (KPZ) equation in short time, while the \( \frac{5}{2} \) power appears to be a persistent trait of the deep lower tail of the Kardar–Parisi–Zhang (KPZ) equation in all time regimes. Our main result gives the first proof of the short time Large Deviation Principle (LDP) for the Kardar–Parisi–Zhang (KPZ) equation and the quadratic-to-\( \frac{5}{2} \) crossover.

Theorem 1.1

Let h denote the solution of the Kardar–Parisi–Zhang (KPZ) equation (1.1) with the initial data \( Z(0,\cdot )=\delta _0(\cdot ) \).

  1. (a)

    For any \( \lambda >0 \), the following limits exist:

    $$\begin{aligned} \lim _{t\rightarrow 0 } t^\frac{1}{2} \log \mathbb {P}\big [ h(2t,0) + \log \sqrt{4\pi t} \le -\lambda \big ]&=: -\Phi (-\lambda ),\\ \lim _{t\rightarrow 0 } t^\frac{1}{2} \log \mathbb {P}\big [ h(2t,0) + \log \sqrt{4\pi t} \ge \lambda \big ]&=: -\Phi (\lambda ). \end{aligned}$$
  2. (b)

    \( \displaystyle \lim _{\lambda \rightarrow 0 } \lambda ^{-2} \Phi (\lambda ) = \tfrac{1}{\sqrt{2\pi }}. \)

  3. (c)

    \( \displaystyle \lim _{\lambda \rightarrow \infty } \lambda ^{-\frac{5}{2}} \Phi (-\lambda ) = \tfrac{4}{15\pi }. \)

Remark 1.2

Our method works also for the flat initial data \( h(0,x) \equiv 0 \), but we treat only the narrow wedge initial data to keep the paper at a reasonable length.

Our result generalizes immediately to \( h(2t,x) \), for \( x\in \mathbb {R}\). This is because, under the delta initial data, the one-point law of \( Z(2t,x)/p(2t,x) \) does not depend on \( x \). This fact can be verified from the Feynman–Kac formula for the Stochastic Heat Equation (SHE).

Remark 1.3

Even though Large Deviation Principle (LDP) rate functions are model dependent, the \( \frac{5}{2} \) tail seems to be somewhat ubiquitous in the Kardar–Parisi–Zhang (KPZ) class. It shows up in all time regimes for the Kardar–Parisi–Zhang (KPZ) equation, and has also been observed in the TASEP [DL98]. A very interesting question is to investigate to what extent the \( \frac{5}{2} \) tail is universal, and to find a unifying approach to understand the origin of the tail.

Remark 1.4

The aforementioned physics works [KK09, MKV16, LDMRS16, KMS16] also derived the asymptotics of the deep upper tail. The prediction is \( \lim _{\lambda \rightarrow \infty } \lambda ^{-3/2} \Phi (\lambda ) = \frac{4}{3} \). We leave this question for future work.

Remark 1.5

The short-time large deviations for the KPZ equation were also studied under other initial data or on a half-line. For the KPZ equation starting from Brownian initial data, the problem was studied in the physics works [KLD17, MS17]. For the half-line KPZ equation, the same problem was studied in the physics works [KLD18a, MV18]; see also [Kra19] for a summary of these results. It is interesting to see whether our method generalizes to these settings.

Let us emphasize that, even though we follow the overarching idea of the Weak Noise Theory (WNT), our method significantly differs from existing physics heuristics. As will be explained below, the Weak Noise Theory (WNT) amounts to establishing a Freidlin–Wentzell LDP and analyzing the corresponding variational problem. The second step—analyzing the variational problem—is the harder step. The physics works [KK09, MKV16, KMS16] provide a convincing heuristic for this step by a formal PDE argument. However, as will be explained in Sect. 1.1.1, making this PDE argument rigorous requires elaborate treatments and seems challenging. We hence adopt a different method.

In Sect. 1.1, we will recall the physics heuristic from [KK09, MKV16, KMS16] and explain why it seems challenging to make the heuristic rigorous. In Sect. 1.2, we will explain our method for proving Theorem 1.1.

1.1 Discussions about the physics heuristics

Here we recall the method used in the physics works [KK09, MKV16, KMS16]. The first step is to perform scaling to turn the short-time Large Deviation Principle (LDP) into a Freidlin–Wentzell Large Deviation Principle (LDP). One scales

$$\begin{aligned} h_\varepsilon (t,x) \,{:}{=}\, h( \varepsilon ^2 t, \varepsilon x) + \log \varepsilon , \end{aligned}$$
(1.3)

which brings the Kardar–Parisi–Zhang (KPZ) equation into

$$\begin{aligned} \partial _t h_\varepsilon = \tfrac{1}{2} \partial _{xx} h_\varepsilon + \tfrac{1}{2} (\partial _x h_\varepsilon )^2 + \sqrt{\varepsilon }\xi . \end{aligned}$$
(1.4)

The term \( \log \varepsilon \) in (1.3) ensures that the narrow wedge initial data stays invariant, and the distributional scaling \( \xi (\varepsilon ^2 t, \varepsilon x) \overset{\text {d}}{=} \varepsilon ^{-3/2} \xi (t,x) \) of the white noise produces the coefficient \( \sqrt{\varepsilon } \) in (1.4). Equation (1.4) is in the standard form for studying Freidlin–Wentzell LDPs. Roughly speaking, for a generic \( \rho \in L^2([0,T]\times \mathbb {R}) \), we expect \( \mathbb {P}[ \sqrt{\varepsilon }\xi \approx \rho ] \approx \exp (-\frac{1}{2} \varepsilon ^{-1}\Vert \rho \Vert _{L^2}^2) \). When the event \( \{\sqrt{\varepsilon }\xi \approx \rho \} \) occurs, one expects \( h_\varepsilon \) to approximate the solution \( \mathsf {h}=\mathsf {h}(\rho ;t,x) \) of

$$\begin{aligned} \partial _t \mathsf {h}= \tfrac{1}{2} \partial _{xx} \mathsf {h}+ \tfrac{1}{2} (\partial _x \mathsf {h})^2 + \rho . \end{aligned}$$
(1.5)

In more formal terms, one expects \( \{h_\varepsilon \} \) to satisfy a Large Deviation Principle (LDP) with speed \( \varepsilon ^{-1} \) and the rate function \( J(f) = \inf \{ \frac{1}{2}\Vert \rho \Vert _{L^2}^2 : \mathsf {h}(\rho ) = f \} \). Once such a Large Deviation Principle (LDP) is established in a suitable space, by the contraction principle we should have

$$\begin{aligned} \Phi (\lambda ) = - \lim _{\varepsilon \rightarrow 0} \varepsilon \log \mathbb {P}\big [ h_\varepsilon (2,0) \ge \lambda \big ]&= \inf \big \{ \tfrac{1}{2}\Vert \rho \Vert _{L^2}^2 \,:\, \mathsf {h}(\rho ;2,0) \ge \lambda \big \},\nonumber \\&\quad \lambda >0, \end{aligned}$$
(1.6)
$$\begin{aligned} \Phi (-\lambda ) = - \lim _{\varepsilon \rightarrow 0} \varepsilon \log \mathbb {P}\big [ h_\varepsilon (2,0) \le -\lambda \big ]&= \inf \big \{ \tfrac{1}{2}\Vert \rho \Vert _{L^2}^2 \,:\, \mathsf {h}(\rho ;2,0) \le -\lambda \big \},\nonumber \\&\quad -\lambda <0. \end{aligned}$$
(1.7)

To find the infimum in (1.7), one can perform variation of \( \frac{1}{2} \Vert \rho \Vert _{L^2}^2 = \frac{1}{2} \int _0^2 \int _{\mathbb {R}} \rho ^2 \, \mathrm {d}x\mathrm {d}t \) in \( \rho \) under the constraint \( \mathsf {h}(\rho ;2,0) = -\lambda \), cf. [MKV16, Sect A, Supplementary Material]. The result suggests that any minimizer \( \rho \) should solve

$$\begin{aligned} \partial _t \rho = -\tfrac{1}{2} \partial _{xx} \rho + \partial _x (\rho \,\partial _x\mathsf {h}). \end{aligned}$$
(1.8)

With a negative Laplacian \( -\tfrac{1}{2} \partial _{xx}\rho \), the equation (1.8) needs to be solved backward in time from the terminal data \( \rho (2,x) = - c(\lambda ) \delta _0(x) \), cf. [MKV16, Sect A, Supplementary Material], where \( c(\lambda )>0 \) is a constant fixed by \( \mathsf {h}(\rho ;2,0) = -\lambda \).

In the near-center regime, i.e., \( \lambda \rightarrow 0 \), standard perturbation arguments can be applied to analyze (1.5) and (1.8) to conclude the quadratic power law.

We will focus on the deep lower tail regime, i.e., \( -\lambda \rightarrow -\infty \). We scale \( \lambda ^{-1}\mathsf {h}(\rho ;t,\lambda ^{1/2}x) \mapsto \mathsf {h}(\rho ;t,x) \) and \( \lambda ^{-1} \rho (t,\lambda ^{1/2}x) \mapsto \rho (t,x) \). To see why such scaling is relevant, note that, under the conditioning \( \mathsf {h}(\rho ;2,0) \le -\lambda \), it is natural to scale \( \mathsf {h}\) by \( \lambda ^{-1} \). Time cannot be scaled since we are probing \( \mathsf {h}\) at \( t=2 \). After scaling \( \mathsf {h}\) by \( \lambda ^{-1} \), we find that the quadratic term \( \frac{1}{2}(\partial _x\mathsf {h})^2 \) in (1.5) gains an excess \( \lambda \) factor compared to the left hand side. To bring the quadratic term back to the same footing as the left hand side, we scale x by \( \lambda ^{-1/2} \). Similar considerations lead to the same scaling of \( \rho \). Under such scaling the Eqs. (1.5) and (1.8) become

$$\begin{aligned} \partial _t \mathsf {h}&= \tfrac{1}{2} \lambda ^{-1} \partial _{xx} \mathsf {h}+ \tfrac{1}{2} (\partial _x \mathsf {h})^2 + \rho , \end{aligned}$$
(1.9)
$$\begin{aligned} \partial _t \rho&= -\tfrac{1}{2} \lambda ^{-1} \partial _{xx} \rho + \partial _x (\rho \,\partial _x \mathsf {h}). \end{aligned}$$
(1.10)

As \( \lambda \rightarrow \infty \) it is tempting to drop the Laplacian terms in (1.9)–(1.10). Doing so produces

$$\begin{aligned} \partial _t \mathsf {h}&= \tfrac{1}{2} (\partial _x \mathsf {h})^2 + \rho , \end{aligned}$$
(1.11)
$$\begin{aligned} \partial _t \rho&= \partial _x (\rho \,\partial _x \mathsf {h}), \end{aligned}$$
(1.12)

with the initial data \( \lim _{t\downarrow 0} t\, \mathsf {h}(t,x) = -\frac{1}{2} x^2 \) and the terminal data \( \rho (2,x) = -c(1)\delta _0(x) \).

Equations (1.11)–(1.12) can be solved by the procedure in [KK09, MKV16, KMS16]. For completeness of presentation we briefly recall the procedure below. It begins by solving (1.11)–(1.12) by power series expansion in x. In view of the initial data of \( \mathsf {h}\) and the terminal data of \( \rho \), it is natural to assume \( \mathsf {h}(t,x)=\mathsf {h}(t,-x) \) and \( \rho (t,x)=\rho (t,-x) \). Under such assumptions, the series terminates at the quadratic power for both \( \mathsf {h}\) and \( \rho \) and produces the solution \( \mathsf {h}(t,x) = k(t) + \frac{1}{2} a(t) x^2 \) and \( \rho (t,x) = -\frac{1}{2\pi } r(t)+\tfrac{1}{2\pi } (r(t)/\ell ^2(t))x^2 \). The factor \( \frac{1}{2\pi } \) is just a convention we choose; the functions a(t), k(t), r(t),  and \( \ell (t) \) can be found by inserting the series solution in (1.11)–(1.12). The only property relevant to our current discussion is that \( r(t)>0 \).
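For orientation, inserting this ansatz into (1.11)–(1.12) and matching powers of \( x \) gives the ODE system (a formal bookkeeping on our part; the data fixing the solution is as in the cited works)

$$\begin{aligned} k' = -\tfrac{1}{2\pi } r, \qquad a' = a^2 + \tfrac{r}{\pi \ell ^2}, \qquad r' = a\, r, \qquad \big ( \tfrac{r}{\ell ^2} \big )' = 3 a\, \tfrac{r}{\ell ^2}, \end{aligned}$$

where the first two equations come from the \( x^0 \) and \( x^2 \) coefficients of (1.11), and the last two from those of (1.12).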

The series solution, however, is nonphysical. Indeed, with \( r(t)>0 \), we have \( \Vert \rho \Vert _{L^2} = \infty \). This issue is rectified by observing that the minimizing \( \rho \) of the right hand side of (1.7) should be nonpositive. This is so because \( \mathsf {h}(\rho ;t,x) \) increases in \( \rho \). Hence the positive part \( \rho _+ \) of \( \rho \) would only make \( \mathsf {h}(\rho ;2,0) = -1 \) harder to achieve while costing excess \( L^2 \) norm. This observation prompts us to truncate

$$\begin{aligned} \rho _*(t,x) \,{:}{=}\, - \tfrac{1}{2\pi } r(t)\big ( 1 - \tfrac{x^2}{\ell (t)^2} \big )_+. \end{aligned}$$

It can be verified that such a \( \rho _*\) and a suitably truncated \( \mathsf {h}\) solve (1.11)–(1.12).

Remark 1.6

It may appear that the preceding scaling applies also to the upper-tail regime \( \lambda \rightarrow \infty \), but that is not the case. In the upper-tail regime, the analyses of the physics works [KK09, MKV16, KMS16] show that, in the pre-scaled coordinates, the optimal \( \rho (t,x) \) concentrates in a small corridor of size \( O(\lambda ^{-1/2}) \) around \( x=0 \). This behavior is in sharp contrast with that of the lower-tail, where the optimal \( \rho (t,x) \) spans across a region in x of width \( O(\lambda ^{1/2}) \) in the pre-scaled coordinate. The distinction of behaviors in the upper- and lower-tail regimes is ubiquitous in the KPZ universality class. As a result, the preceding scaling does not apply to the upper-tail regime.

1.1.1 Challenge in making the PDE argument rigorous.

To make this PDE analysis rigorous requires elaborate treatments and seems challenging. This is so because (1.11)–(1.12) are fully nonlinear equations. Writing \( u = \partial _x \mathsf {h}\) and differentiating (1.11) in \( x \) turns (1.11)–(1.12) into

$$\begin{aligned} \partial _t u&= \tfrac{1}{2} \partial _x (u^2) + \partial _x \rho ,\\ \partial _t \rho&= \partial _x (\rho u). \end{aligned}$$

These equations do not have unique weak solutions, just like the inviscid Burgers equation [Eva98, Chapter 3.4]. One needs to impose certain entropy conditions to ensure the uniqueness of weak solutions, and argue that in the limit \( \lambda \rightarrow \infty \) the solution of (1.11)–(1.12) converges to the entropy solution.

1.2 Our method

Our method, which differs from the physics heuristic described in Sect. 1.1, operates at the level of the Stochastic Heat Equation (SHE) instead of the Kardar–Parisi–Zhang (KPZ) equation. Recall that we defined the solution of the Kardar–Parisi–Zhang (KPZ) equation through the Hopf–Cole transformation, so the solution \( h_\varepsilon \) to (1.4) is given by \( h_\varepsilon \,{:}{=}\, \log Z_\varepsilon + \log (\varepsilon ^{1/2}) \), where \( Z_\varepsilon \) solves

$$\begin{aligned} \partial _t Z_\varepsilon = \tfrac{1}{2} \partial _{xx} Z_\varepsilon + \sqrt{\varepsilon } \xi Z_\varepsilon , \end{aligned}$$
(1.13)

with the delta initial condition \( Z_\varepsilon (0,\cdot ) = \delta _0(\cdot ) \). We seek to establish the Freidlin–Wentzell Large Deviation Principle (LDP) for (1.13). Roughly speaking, the Large Deviation Principle (LDP) states that \( \mathbb {P}[ Z_\varepsilon \approx \mathsf {Z}] \approx \exp (-\varepsilon ^{-1} \frac{1}{2} \Vert \rho \Vert _{L^2}^2) \), where \( \mathsf {Z}=\mathsf {Z}(\rho ;t,x) \) solves the PDE

$$\begin{aligned} \partial _t \mathsf {Z}= \tfrac{1}{2} \partial _{xx} \mathsf {Z}+ \rho \mathsf {Z}. \end{aligned}$$
(1.14)

The precise statement of the Freidlin–Wentzell Large Deviation Principle (LDP) as well as the well posedness of (1.14) will be given in Sect. 1.2.1. Using the contraction principle to specialize the Freidlin–Wentzell Large Deviation Principle (LDP) to one point, we have

$$\begin{aligned} \Phi (\lambda )&= \inf \big \{\tfrac{1}{2} \Vert \rho \Vert _{L^2}^2: \log \mathsf {Z}(\rho ; 2, 0) \ge \lambda \big \}, \end{aligned}$$
(1.15)
$$\begin{aligned} \Phi (-\lambda )&= \inf \big \{\tfrac{1}{2} \Vert \rho \Vert _{L^2}^2: \log \mathsf {Z}(\rho ; 2, 0) \le -\lambda \big \}. \end{aligned}$$
(1.16)

To analyze the variational problems (1.15)–(1.16), we express \( \mathsf {Z}\) by the Feynman–Kac formula as

$$\begin{aligned} \mathsf {Z}(\rho ;t,x)= \mathbb {E}_{0\rightarrow x} \Big [ \exp \Big ( \int _0^t \rho (s,B_\text {b}(s)) \, \mathrm {d}s \Big ) \Big ] p(t,x), \end{aligned}$$
(1.17)

where the expectation \( \mathbb {E}_{0\rightarrow x} \) is taken with respect to a Brownian bridge \( B_\text {b}(s) \) that starts from \( B_\text {b}(0)=0 \) and ends at \( B_\text {b}(t)=x \), and \( p(t,x) \,{:}{=}\, \exp (-x^2/2t)/\sqrt{2\pi t} \) denotes the standard heat kernel.
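As a quick numerical sanity check on (1.17) (an illustration of ours, not part of the paper's argument), one can Monte-Carlo the bridge expectation for the toy linear potential \( \rho (s,y)=ay \). For a Brownian bridge from \( 0 \) to \( x \) on \( [0,t] \), the integral \( \int _0^t a B_\text {b}(s)\,\mathrm {d}s \) is Gaussian with mean \( axt/2 \) and variance \( a^2t^3/12 \), so the expectation equals \( \exp (axt/2+a^2t^3/24) \):

```python
import numpy as np

# Monte-Carlo check of the bridge expectation in (1.17) for rho(s,y) = a*y,
# a toy potential chosen so that the expectation has a closed form.
rng = np.random.default_rng(0)
a, t, x = 0.5, 2.0, 1.0
n, N = 200, 10000                      # time steps, sample paths
ds = t / n
s = np.linspace(0.0, t, n + 1)

dW = rng.normal(0.0, np.sqrt(ds), size=(N, n))
W = np.concatenate([np.zeros((N, 1)), np.cumsum(dW, axis=1)], axis=1)
B = W - (s / t) * W[:, -1:] + (s / t) * x   # Brownian bridge 0 -> x on [0,t]

# trapezoid rule for int_0^t a B(s) ds, then average exp(...)
integral = a * 0.5 * ds * (B[:, 1:] + B[:, :-1]).sum(axis=1)
mc = np.exp(integral).mean()
exact = np.exp(a * x * t / 2 + a**2 * t**3 / 24)
print(mc, exact)
```

The two printed numbers agree to within Monte-Carlo error (a fraction of a percent at this sample size).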

Given the Feynman–Kac formula, a standard perturbation argument can be applied to obtain the quadratic law in the near-center regime, \( \lambda \rightarrow 0 \); this is done in Sect. 4.1.

Here we focus on the deep lower tail regime, i.e., analyzing (1.16) in the limit \( -\lambda \rightarrow -\infty \). The scaling \(\rho (\cdot ,\cdot ) \mapsto \lambda \rho (\cdot , \lambda ^{-\frac{1}{2}} \cdot )\) mentioned in Sect. 1.1 gives

$$\begin{aligned} \Phi (-\lambda ) = \lambda ^{5/2} \inf \big \{\tfrac{1}{2} \Vert \rho \Vert _{L^2}^2: \mathsf {h}_\lambda (\rho ;2,0) \le -1 \big \}, \end{aligned}$$
(1.18)

where

$$\begin{aligned} \begin{aligned} \mathsf {h}_\lambda (\rho ;t,x)\,&\,{:}{=}\, \,(\text {lower order term}) -\tfrac{x^2}{2t} + \lambda ^{-1} \log \mathbb {E}_{0\rightarrow \lambda ^{1/2} x}\\&\Big [ \exp \Big ( \int _0^t \lambda \rho (s,\lambda ^{-\frac{1}{2}} B_\text {b}(s)) \, \mathrm {d}s \Big ) \Big ]. \end{aligned} \end{aligned}$$
(1.19)

The details of this scaling are given in Sect. 4.2.1, and the precise expression of (1.19) is given in (4.11).
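The prefactor \( \lambda ^{5/2} \) in (1.18) simply records how the \( L^2 \) norm transforms under this scaling: substituting \( y = \lambda ^{-1/2} x \), so that \( \mathrm {d}x = \lambda ^{1/2}\, \mathrm {d}y \),

$$\begin{aligned} \tfrac{1}{2} \int _0^2 \!\! \int _{\mathbb {R}} \big ( \lambda \rho (t,\lambda ^{-1/2}x) \big )^2 \, \mathrm {d}x \mathrm {d}t = \lambda ^{2} \cdot \lambda ^{1/2} \cdot \tfrac{1}{2} \int _0^2 \!\! \int _{\mathbb {R}} \rho (t,y)^2 \, \mathrm {d}y \mathrm {d}t = \lambda ^{5/2} \cdot \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2. \end{aligned}$$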

We seek to analyze the right hand side of (1.19) for \( (t,x)=(2,0) \). For a suitable class of \( \rho \), Varadhan’s lemma gives, as \( -\lambda \rightarrow -\infty \),

$$\begin{aligned} \lambda ^{-1} \log \mathbb {E}_{0\rightarrow 0} \Big [ \exp \Big ( \int _0^2 \lambda \rho (s,\lambda ^{-\frac{1}{2}} B_\text {b}(s)) \, \mathrm {d}s \Big ) \Big ] \longrightarrow - \inf _{\gamma } \Big \{ \int _0^2 \tfrac{1}{2} \gamma '(s)^2 - \rho (s,\gamma (s)) \ \mathrm {d}s \Big \}, \end{aligned}$$
(1.20)

where the infimum is taken over all \( H^1 \) paths \( \gamma (s) \) that start and end at 0, i.e., \( \gamma (0)=\gamma (2)=0 \). This limit transition is reminiscent of the convergence (under the zero-temperature limit) of the free energy of a directed polymer to that of a last passage percolation. Our task is hence to find the \( \rho =\rho (s,y) \) with the minimal \( L^2 \) norm such that the right hand side of (1.20) is \( \le -1 \).
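To get a concrete feel for the inner variational problem in (1.20), one can minimize the discretized action over bridge paths by plain gradient descent. The sketch below uses a toy time-independent potential \( \rho (s,y) = -4(1-y^2)_+ \) of our own choosing (not the paper's \( \rho _* \)); it illustrates how a sufficiently deep well makes the optimal path escape to \( |\gamma |>1 \) rather than stay at \( \gamma \equiv 0 \):

```python
import numpy as np

T, n = 2.0, 200
ds = T / n
s = np.linspace(0.0, T, n + 1)
A = 4.0

def rho(y):                     # toy potential, our assumption (not rho_*)
    return -A * np.clip(1.0 - y**2, 0.0, None)

def drho(y):                    # its derivative in y
    return np.where(np.abs(y) < 1.0, 2.0 * A * y, 0.0)

def action(g):                  # discretized  int_0^T 1/2 g'^2 - rho(g) ds
    return 0.5 * np.sum(np.diff(g)**2) / ds - np.sum(rho(g[1:-1])) * ds

g = 0.1 * np.sin(np.pi * s / T) # bridge path, gamma(0) = gamma(T) = 0
for _ in range(40000):          # gradient descent on the discretized action
    grad = (2 * g[1:-1] - g[:-2] - g[2:]) / ds - drho(g[1:-1]) * ds
    g[1:-1] -= 0.004 * grad

print(action(g))
```

The final action lands well below the stay-at-zero cost \( \approx 2A = 8 \), reflecting the escape trade-off between kinetic cost and potential gain.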

It is natural to guess that the minimizing \( \rho \) should be the \( \rho _*\) obtained in the aforementioned PDE heuristic. Taking this explicit \( \rho _*\), we prove the convergence (1.20) (by Varadhan’s lemma) and solve the path variational problem on the right side of (1.20); see Lemma 4.2 and Proposition 4.3. The explicit constant \( \frac{4}{15\pi } \) in Theorem 1.1(c) comes from the \( L^2 \) norm of \( \rho _*\).

The last step is to verify that such a \( \rho _*\) is indeed the minimizer. This is done in Sect. 4.2.3. There we appeal to an identity (4.30) that involves \( \rho _*\). This identity follows from the fact that for \( \rho =\rho _*\), the right hand side of (1.20) is equal to \( -1 \). Using this identity, we show that, for any \( \rho \) that satisfies the required condition \( \mathsf {h}_\lambda (\rho ;2,0) \le -1 \), the quantity \( \langle \rho _*-\rho ,\rho _*\rangle \) is approximately \( \le 0 \); see (4.32). This bound then verifies that \( \rho _*\) is the minimizer.

1.2.1 Freidlin–Wentzell LDP for the Stochastic Heat Equation (SHE).

Here we state our result on the Freidlin–Wentzell Large Deviation Principle (LDP) for the Stochastic Heat Equation (SHE) (1.13). For the purpose of proving Theorem 1.1, it suffices to just consider the narrow wedge initial data, but we also consider function-valued initial data for their independent interest.

Let us set up the notation, first for function-valued initial data. For \( a\in \mathbb {R}\), define the weighted sup norm \( \Vert g \Vert _{a} \,{:}{=}\, \sup _{x \in \mathbb {R}} \{ e^{-a|x|} |g(x)| \} \). Let \( C_a(\mathbb {R}) \,{:}{=}\, \{ g\in C(\mathbb {R}) : \Vert g \Vert _{a} <\infty \} \), and endow this space with the norm \( \Vert \cdot \Vert _{a} \). Slightly abusing notation, for functions that depend also on time, we use the same notation

$$\begin{aligned} \Vert f \Vert _{a} \,{:}{=}\, \sup \big \{ e^{-a|x|} |f(t,x)| \, : \, (t,x)\in [0,T]\times \mathbb {R}\big \} \end{aligned}$$
(1.21)

to denote the analogous norm, and let \( C_a([0,T]\times \mathbb {R}) \,{:}{=}\, \{ f\in C([0,T]\times \mathbb {R}) : \Vert f \Vert _{a} <\infty \} \), endowed with the norm \( \Vert \cdot \Vert _{a} \). Adopt the notation \( C_{a_*^+}(\mathbb {R}) \,{:}{=}\, \cap _{a>a_*}C_a(\mathbb {R}) \) and \( C_{a_*^+}([0,T]\times \mathbb {R}) \,{:}{=}\, \cap _{a>a_*} C_a([0,T]\times \mathbb {R}). \) Let \( p(t,x) \,{:}{=}\, \exp (-\frac{x^2}{2t})/\sqrt{2\pi t} \) denote the standard heat kernel. Recall that the mild solution of (1.13) with a deterministic initial data \( g_*\) is a process \( Z_\varepsilon \) that satisfies

$$\begin{aligned} Z_\varepsilon (t,x) = \int _\mathbb {R}p(t,x-y) g_*(y) \, \mathrm {d}y + \varepsilon ^{\frac{1}{2}} \int _0^t \int _\mathbb {R}p(t-s,x-y) Z_\varepsilon (s,y) \xi (s,y) \, \mathrm {d}y \mathrm {d}s. \end{aligned}$$
(1.22)

It is standard, e.g., [Qua11, Sections 2.1–2.6], to show that for any \( g_*\in C_{a_*^+}(\mathbb {R}) \), there exists a unique mild solution \( Z_\varepsilon \) of (1.13) given by the chaos expansion; see Sect. 2.1.1 for a discussion about chaos expansion. Further, as shown later in Corollary 3.6, the chaos expansion (and hence \( Z_\varepsilon \)) is \( C_{a_*^+}([0,T]\times \mathbb {R}) \)-valued. Next we turn to the rate function. Fix \( g_*\in C_{a_*^+}(\mathbb {R}) \). For \( \rho \in L^2([0,T]\times \mathbb {R}) \), consider the PDE

$$\begin{aligned} \partial _t \mathsf {Z}= \tfrac{1}{2} \partial _{xx} \mathsf {Z}+ \rho \mathsf {Z}, \qquad \mathsf {Z}(\rho ;0,\cdot ) = g_*(\cdot ), \end{aligned}$$

where \( \mathsf {Z}=\mathsf {Z}(\rho ;t,x) \), \( t\in [0,T] \), and \( x\in \mathbb {R}\). This PDE is interpreted in the Duhamel sense as

$$\begin{aligned} \mathsf {Z}(\rho ;t,x) = \int _\mathbb {R}p(t,x-y) g_*(y) \, \mathrm {d}y + \int _0^t \int _\mathbb {R}p(t-s,x-y) \rho (s,y) \mathsf {Z}(\rho ;s,y) \, \mathrm {d}y \mathrm {d}s. \end{aligned}$$
(1.23)

We will show in Sect. 2.1.2 that (1.23) admits a unique \( C_{a_*^+}([0,T]\times \mathbb {R}) \)-valued solution. We will often write \( \mathsf {Z}(\rho ) = \mathsf {Z}(\rho ;\cdot ,\cdot ) \) and accordingly view \( \rho \mapsto \mathsf {Z}(\rho ) \) as a function \( L^2([0,T]\times \mathbb {R})\rightarrow C_{a}([0,T]\times \mathbb {R}) \), for \( a>a_* \). Here \( \rho \) should be viewed as a deviation of the spacetime white noise \( \sqrt{\varepsilon }\xi \). For each such deviation \( \rho \) we run the PDE (1.23) to obtain the corresponding deviation \( \mathsf {Z}(\rho )=\mathsf {Z}(\rho ;t,x) \) of \( Z_\varepsilon \). Now, since the spacetime white noise \( \xi \) is Gaussian with the correlation \( \mathbb {E}[\xi (t,x)\xi (s,y)] = \delta _0(t-s)\delta _0(x-y) \), one expects the rate function to be given by the \( L^2 \) norm of \( \rho \), more precisely

$$\begin{aligned} I(f) \,{:}{=}\, \inf \big \{\tfrac{1}{2}\Vert \rho \Vert _{L^2}^2 \,:\, \rho \in L^2([0,T]\times \mathbb {R}), \mathsf {Z}(\rho ) = f \big \}, \end{aligned}$$
(1.24)

with the convention \( \inf \emptyset \,{:}{=}\, +\infty \).
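The Duhamel formulation lends itself to a direct numerical check by Picard iteration. The sketch below (ours, with grid sizes and the choice \( \rho \equiv 1 \) picked purely for convenience) iterates the fixed-point map on a grid; for a constant \( \rho \equiv c \) the solution is exactly \( e^{ct} \) times the free heat flow of \( g_* \), which the scheme reproduces up to discretization error:

```python
import numpy as np

# Picard iteration for the Duhamel equation (1.23) with constant rho = c,
# where the exact solution is e^{c t} times the heat semigroup of g_*.
L, nx = 6.0, 121
x = np.linspace(-L, L, nx); dx = x[1] - x[0]
T, nt = 1.0, 20
dt = T / nt
c = 1.0

def heat_matrix(t):              # K with (K @ f) * dx ~ heat semigroup
    d = x[:, None] - x[None, :]
    return np.exp(-d**2 / (2 * t)) / np.sqrt(2 * np.pi * t)

g = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)         # initial data g_*
P = [None] + [heat_matrix(m * dt) for m in range(1, nt + 1)]
free = np.array([g] + [P[k] @ g * dx for k in range(1, nt + 1)])

Z = free.copy()
for _ in range(nt + 5):          # Picard iteration; converges in <= nt sweeps
    Znew = free.copy()
    for k in range(1, nt + 1):   # left-endpoint rule in the time integral
        acc = sum(P[k - j] @ (c * Z[j]) for j in range(k))
        Znew[k] = free[k] + acc * dx * dt
    Z = Znew

exact = np.exp(c * T) * free[nt]
err = np.max(np.abs(Z[nt] - exact)) / np.max(exact)
print(err)                       # O(dt) time-discretization error
```

With the left-endpoint rule the iteration is lower-triangular in time, so the discrete fixed point is reached after at most `nt` sweeps; the remaining error is the O(dt) quadrature error of the time integral.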

As for the narrow wedge initial data, we adopt the same notation as in the preceding but replace \( g_*\in C_{a_*^+}(\mathbb {R}) \) with \( g_*=\delta _0 \). More explicitly, the mild solution of the Stochastic Heat Equation (SHE) (1.13) satisfies

$$\begin{aligned} Z_\varepsilon (t,x) = p(t,x) + \varepsilon ^{\frac{1}{2}} \int _0^t \int _\mathbb {R}p(t-s,x-y) Z_\varepsilon (s,y) \xi (s,y) \, \mathrm {d}y \mathrm {d}s, \end{aligned}$$
(1.22-nw)

and the function \( \mathsf {Z}(\rho ) \) now solves

$$\begin{aligned} \mathsf {Z}(\rho ;t,x) = p(t,x) + \int _0^t \int _\mathbb {R}p(t-s,x-y) \rho (s,y) \mathsf {Z}(\rho ;s,y) \, \mathrm {d}y \mathrm {d}s. \end{aligned}$$
(1.23-nw)

Recall that \( Z_\varepsilon \) starts from the delta initial condition \( Z_\varepsilon (0,\cdot )=\delta _0(\cdot ) \). The smoothing effect of the Laplacian in the Stochastic Heat Equation (SHE) makes \( Z_\varepsilon (t,\cdot ) \) function-valued for all \( t>0 \), but when \( t \rightarrow 0 \) the process \( Z_\varepsilon (t,\cdot ) \) becomes singular as it approaches \( \delta _0 \). To avoid the singularity, we work with the space \( C_a([\eta ,T]\times \mathbb {R}) \), \( \eta >0 \), \( a\in \mathbb {R}\), equipped with the norm

$$\begin{aligned} \Vert f \Vert _{a,\eta } \,{:}{=}\, \sup \big \{ e^{-a|x|} |f(t,x)| \, : \, (t,x)\in [\eta ,T]\times \mathbb {R}\big \}. \end{aligned}$$
(1.25)

It is standard to show that (1.22-nw) admits a unique solution that is \( C_a([\eta ,T]\times \mathbb {R}) \)-valued for all \( \eta >0 \) and \( a\in \mathbb {R}\). The same holds for (1.23-nw).

Let \( \Omega \) be a topological space. Recall that a function \( \varphi :\Omega \rightarrow \mathbb {R}\cup \{+\infty \} \) is a good rate function if \( \varphi \) is lower semi-continuous and the set \( \{ f: \varphi (f) \le r \} \) is compact for all \( r<+\infty \). Recall that a sequence \( \{W_\varepsilon \} \) of \( \Omega \)-valued random variables satisfies a Large Deviation Principle (LDP) with speed \( \varepsilon ^{-1} \) and the rate function \( \varphi \) if for any closed \( F \subset \Omega \) and open \( G\subset \Omega \),

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0} \varepsilon \log \mathbb {P}\big [W_\varepsilon \in G\big ] \ge -\inf _{f \in G} \varphi (f), \qquad \limsup _{\varepsilon \rightarrow 0} \varepsilon \log \mathbb {P}\big [W_\varepsilon \in F\big ]&\le -\inf _{f \in F} \varphi (f). \end{aligned}$$

In this paper we prove the following Freidlin–Wentzell Large Deviation Principle (LDP) for the Stochastic Heat Equation (SHE).

Proposition 1.7

  1. (a)

    Fix \( a_*\in \mathbb {R}\), \( g_*\in C_{a_*^+}(\mathbb {R}) \), and \( T<\infty \). Let \( Z_\varepsilon \) be the solution of (1.22) and let \( \mathsf {Z}(\rho ) \) be the solution of (1.23). For any \( a>a_* \), the function \( I: C_a([0,T]\times \mathbb {R})\rightarrow \mathbb {R}\cup \{+\infty \} \) in (1.24) is a good rate function. Further, \( \{Z_\varepsilon \}_\varepsilon \) satisfies a Large Deviation Principle (LDP) in \( C_a([0,T]\times \mathbb {R}) \) with speed \( \varepsilon ^{-1} \) and the rate function \( I\).

  2. (b)

    Fix \( T<\infty \). Let \( Z_\varepsilon \) be the solution of (1.22-nw) and let \( \mathsf {Z}(\rho ) \) be the solution of (1.23-nw). For any \( a\in \mathbb {R}\) and \( \eta \in (0,T) \), the function \( I: C_a([\eta ,T]\times \mathbb {R})\rightarrow \mathbb {R}\cup \{+\infty \} \) in (1.24) is a good rate function. Further, \( \{Z_\varepsilon \}_\varepsilon \) satisfies a Large Deviation Principle (LDP) in \( C_a([\eta ,T]\times \mathbb {R}) \) with speed \( \varepsilon ^{-1} \) and the rate function \( I\).

1.3 Literature on the WNT and Freidlin–Wentzell LDPs for stochastic PDEs

The Weak Noise Theory (WNT), also known as the optimal fluctuation theory, dates back at least to the works [HL66, ZL66, Lif68] in condensed matter physics. In the context of stochastic PDEs, the Weak Noise Theory (WNT) studies large deviations of the solution’s trajectory when the noise is scaled to be weaker and weaker. Such scaling is often equivalent to the short time scaling of a fixed SPDE. (See (1.3)–(1.4) for the case of the Kardar–Parisi–Zhang (KPZ) equation.) In the physics literature, the Weak Noise Theory (WNT) was carried out in [Fog98] for the noisy Burgers equation, in [KK07, KK09] for directed polymers, and in [KMS16, MKV16] for the KPZ equation. The Weak Noise Theory (WNT) is also known as the instanton method in turbulence theory [FKLM96, FGV01, GGS15], the macroscopic fluctuation theory in lattice gases [BDSG+15], and WKB methods in reaction-diffusion systems [EK04, MS11].

The Freidlin–Wentzell Large Deviation Principle (LDP) has been established for various stochastic PDEs, including reaction-diffusion-like stochastic equations [CM97, BDM08], the stochastic Allen–Cahn equation [HW15], and the stochastic Navier–Stokes equation [CD19].

1.4 Some discussions about the rate function \( \Phi \)

The physics work [LDMRS16] used a different method to derive

$$\begin{aligned} \Phi (\lambda ) = {\left\{ \begin{array}{ll} \frac{-1}{\sqrt{4\pi }} \min \limits _{z \in [-1, +\infty )} \big \{ z e^{\lambda } + \text {Li}_{\frac{5}{2}}(-z)\big \}, &{}\lambda \le \lambda _c,\\ \frac{-1}{\sqrt{4\pi }}\min \limits _{z \in [-1, 0)} \big \{z e^{\lambda } + \text {Li}_{\frac{5}{2}} (-z) - \frac{8\sqrt{\pi }}{3} \big (-\log (-z)\big )^{\frac{3}{2}}\big \}, &{} \lambda \ge \lambda _c, \end{array}\right. } \end{aligned}$$

where \(\text {Li}_\nu (z)\) is the polylogarithm function and \( \lambda _c = \log \zeta (\frac{3}{2}) \). Though not completely mathematically rigorous, the derivation is based on convincing arguments and is backed by the numerical result [HLDM+18]. Based on this expression, the work obtained many properties of \( \Phi \), including its analyticity on \( \lambda \in \mathbb {R}\), and lower-order terms in the deep lower-tail regime \( -\lambda \rightarrow -\infty \) (beyond the leading term \( \frac{4}{15\pi }\lambda ^{\frac{5}{2}} \)). Our results do not cover these detailed properties of \( \Phi \). Rigorously proving these properties is an interesting open question.
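As an independent numerical check (ours, not from [LDMRS16]), evaluating the first branch of this formula for small \( \lambda \) by a grid minimization, with \( \text {Li}_{5/2} \) computed from its defining series, reproduces the quadratic behavior \( \Phi (\lambda )\approx \lambda ^2/\sqrt{2\pi } \) of Theorem 1.1(b):

```python
import numpy as np

def li(nu, w, kmax=200):
    # polylogarithm Li_nu(w) via its defining series; valid for |w| <= 1
    k = np.arange(1, kmax + 1)
    return np.sum(w**k / k**nu)

def Phi(lam):
    # first branch of the rate-function formula, minimized over a z-grid
    zs = np.linspace(-0.95, 0.95, 4001)
    vals = [z * np.exp(lam) + li(2.5, -z) for z in zs]
    return -min(vals) / np.sqrt(4 * np.pi)

lam = 0.01
ratio = Phi(lam) / lam**2
print(ratio, 1 / np.sqrt(2 * np.pi))   # ~0.40 vs 0.3989...
```

For small \( \lambda \) the minimizer sits at \( z\approx -2^{3/2}\lambda \), well inside the series' domain of convergence, so restricting the grid to \( |z|\le 0.95 \) is harmless here.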

1.5 Outline of the rest of the paper

In Sect. 2, we recall the formalism of Wiener chaos, recall a result from [HW15] that gives the LDP for finitely many chaos, and prepare some properties of the function \( \mathsf {Z}(\rho ) \). In Sect. 3, we establish tail probability bounds on the Wiener chaos for the SHE. Based on such tail bounds, we leverage the LDP for finitely many chaos into the LDP for the SHE, thereby proving Proposition 1.7. In Sect. 4, we analyze the variational problem given by the one-point LDP for the SHE and prove Theorem 1.1.

2 Wiener Spaces, Wiener Chaos, and the Function \( \mathsf {Z}(\rho ) \)

In this section we recall the formalism of Wiener spaces and chaos, and prepare some properties of \( \mathsf {Z}(\rho ) \).

2.1 Function-valued initial data

Throughout this subsection we fix \( T<\infty \), \( a_*\in \mathbb {R}\), and \( g_*\in C_{a_*^+}(\mathbb {R}) \), and initiate the Stochastic Heat Equation (SHE) (1.13) from \( Z_\varepsilon (0,\cdot ) = g_*(\cdot ) \).

2.1.1 Wiener spaces and chaos.

We will mostly follow [HW15, Section 3]. The basic elements of the Wiener space formalism consist of the triple \( (\mathcal {B},\mathcal {H},\mu ) \), where \( \mathcal {B}\) is a Banach space over \( \mathbb {R}\) equipped with a Gaussian measure \( \mu \), and \( \mathcal {H}\subset \mathcal {B}\) is the Cameron–Martin space of \( \mathcal {B}\). In our setting \( \mathcal {H}= L^2([0,T]\times \mathbb {R}) \), and \( \mathcal {B}\) can be any Banach space such that the embedding \( \mathcal {H}\subset \mathcal {B}\) is dense and Hilbert–Schmidt. To be concrete, fixing an arbitrary orthonormal basis \( \{e_1,e_2,\ldots \} \) of \( \mathcal {H}=L^2([0,T]\times \mathbb {R}) \), we let

$$\begin{aligned} \mathcal {B}\,{:}{=}\, \Big \{ \xi =\sum \xi _i e_i \, : \, \xi _1,\xi _2,\ldots \in \mathbb {R}, \ \Vert \xi \Vert _{\mathcal {B}} <\infty \Big \}, \qquad \big \Vert \sum \xi _i e_i \big \Vert _{\mathcal {B}}^2 \,{:}{=}\, \sum \nolimits _{i\ge 1} \tfrac{1}{i^2}|\xi _i|^2. \end{aligned}$$
(2.1)

Identifying \( \mathcal {B}\) as a subset of \( \mathbb {R}^{\mathbb {Z}_{\ge 1}} \), we set \( \mu \,{:}{=}\, \otimes _{\mathbb {Z}_{\ge 1}} \nu \), where \( \nu \) is the standard Gaussian measure on \( \mathbb {R}\). The space \( \mathcal {B}\) serves as the sample space. For example, for \( f\in L^2([0,T]\times \mathbb {R}) \) with \( f= \sum f_i e_i \), the function

$$\begin{aligned} W(f): \mathcal {B}\rightarrow \mathbb {R},\qquad W(f) \,{:}{=}\, \sum \nolimits _{i\ge 1} f_i \xi _i \end{aligned}$$
(2.2)

should be identified with the random variable \( \int _0^T \int _\mathbb {R}f(t,x) \xi (t,x) \, \mathrm {d}t \mathrm {d}x \). This identification justifies using \( \xi \) to denote both elements of \( \mathcal {B}\) and the spacetime white noise.
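Under \( \mu \) the coordinates \( \xi _i \) are i.i.d. standard Gaussians, so the map (2.2) is an isometry onto the first chaos; for instance,

```latex
\begin{aligned}
\mathbb {E}\big [ W(f)\, W(g) \big ]
  = \sum _{i,j\ge 1} f_i\, g_j\, \mathbb {E}[\xi _i \xi _j]
  = \sum _{i\ge 1} f_i\, g_i
  = \langle f, g \rangle _{L^2([0,T]\times \mathbb {R})},
\end{aligned}
```

which matches the covariance of the white-noise integrals \( \int f\xi \) and \( \int g\xi \).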

The Hermite polynomials \( H_n(x) \) are the unique polynomials satisfying \( \deg (H_n) = n \) and

$$\begin{aligned} e^{\tau x -\frac{\tau ^2}{2}} = \sum _{n=0}^\infty \tau ^n H_n (x). \end{aligned}$$
(2.3)
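Under this normalization \( H_n = \mathrm {He}_n/n! \), where \( \mathrm {He}_n \) are the probabilists’ Hermite polynomials. Expanding the generating function, the first few are:

```latex
\begin{aligned}
H_0(x) &= 1, & H_1(x) &= x, & H_2(x) &= \tfrac{1}{2}(x^2-1),\\
H_3(x) &= \tfrac{1}{6}(x^3-3x), & H_4(x) &= \tfrac{1}{24}(x^4-6x^2+3).
\end{aligned}
```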

The n-th \( \mathbb {R}\)-valued Wiener chaos is the closure in \( L^2(\mathcal {B}\rightarrow \mathbb {R},\mu ) \) of the linear subspace spanned by \( \prod _{i=1}^\infty H_{\alpha _i}(W(e_i)) \), for \( (\alpha _1,\alpha _2,\ldots ) \in \mathbb {Z}_{\ge 0}\times \mathbb {Z}_{\ge 0}\times \ldots \) and \( \alpha _1+\alpha _2+\ldots =n \). Since our goal is to establish a functional Large Deviation Principle (LDP), it is natural to consider Wiener chaos at the functional level. We will follow the formalism of Banach-valued Wiener chaos from [HW15, Section 3]. Fix \( a>a_* \) and consider \( E= C_a([0,T]\times \mathbb {R}) \), which is a separable Banach space. The n-th \( E\)-valued Wiener chaos is the space

$$\begin{aligned}&\Big \{ \Psi \in L^2(\mathcal {B}\rightarrow E,\mu ) \, : \, \int \Psi (\xi ) \psi (\xi ) \mu (\mathrm {d}\xi ) =0, \\&\forall \psi \in (m\text {-th } \mathbb {R}\text {-valued Wiener chaos), with } m\ne n \Big \}. \end{aligned}$$

In probabilistic notation, the n-th \( E\)-valued Wiener chaos consists of \( C_a([0,T]\times \mathbb {R}) \)-valued random variables \( \Psi \) such that \( \mathbb {E}[ \Vert \Psi \Vert _{a}^2 ]<\infty \) and that \( \mathbb {E}[ \Psi \psi ] = 0 \), for all \( \psi \) in the m-th \( \mathbb {R}\)-valued Wiener chaos with \( m\ne n \).

We now turn to the Stochastic Heat Equation (SHE). Set

$$\begin{aligned} Y_{n} (t,x)&\,{:}{=}\,&\int _{\Delta _n (t)} \int _{\mathbb {R}^{n+1}} p(s_{n}-s_{n+1},y_n-y_{n+1}) g_*(y_{n+1}) \mathrm {d}y_{n+1} \nonumber \\&\quad \prod _{i=1}^n p(s_{i-1}-s_{i},y_{i-1}-y_{i}) \xi (s_i, y_i) \, \mathrm {d}s_i \mathrm {d}y_i, \end{aligned}$$
(2.4)

where \( \Delta _n (t) = \{\mathbf {s} = (s_0,s_1, \dots , s_{n+1}) : 0=s_{n+1}< s_n< \dots< s_1 < s_0 = t\} \), with the convention \( s_0\,{:}{=}\,t \) and \( y_0 \,{:}{=}\, x \). Iterating (1.22) gives

$$\begin{aligned} Z_\varepsilon (t,x) = \sum _{n=0}^\infty \varepsilon ^{\frac{n}{2}} Y_{n}(t,x). \end{aligned}$$
(2.5)

We will show later in Proposition 3.5 that each \( Y_n \) defines a \( C_a([0,T]\times \mathbb {R}) \)-valued random variable, and show in Corollary 3.6 that the right hand side of (2.5) converges in \( \Vert \cdot \Vert _{a} \) almost surely. It is standard to show that (2.5) gives the unique mild solution of the Stochastic Heat Equation (SHE). Further, given the n-fold stochastic integral expression in (2.4), it is standard to show that, for fixed \( (t,x)\in [0,T]\times \mathbb {R}\), the random variable \( Y_{n}(t,x) \) lies in the n-th \( \mathbb {R}\)-valued Wiener chaos, and \( Y_{n}\in C_a([0,T]\times \mathbb {R}) =:E\) lies in the n-th \( E\)-valued Wiener chaos. Accordingly, we refer to the series (2.5) as the chaos expansion for the Stochastic Heat Equation (SHE).

Let \( Z_{N,\varepsilon } \,{:}{=}\, \sum _{n=0}^N \varepsilon ^{\frac{n}{2}}Y_{n} \) denote the partial sum of the chaos expansion (2.5). The LDP for finitely many \( E\)-valued Wiener chaos has been established in [HW15, Theorem 3.5]. We next apply this result to obtain an LDP for \( Z_{N,\varepsilon } \). Following the notation in [HW15], we view \( Y_{n} \) as a function \( \mathcal {B}\rightarrow C_a([0,T]\times \mathbb {R}) \), denoted \( Y_{n}(\xi ) \), and define

$$\begin{aligned} (Y_n)_\text {hom}: L^2([0,T]\times \mathbb {R}) \rightarrow C_a([0,T]\times \mathbb {R}), \qquad (Y_{n})_\text {hom}(\rho ) \,{:}{=}\, \int _{\mathcal {B}} Y_{n}(\xi +\rho ) \, \mu (\mathrm {d}\xi ). \end{aligned}$$
(2.6)

The last integral is well-defined for any \( \rho \in L^2([0,T]\times \mathbb {R}) \) by the Cameron–Martin theorem. Further define

$$\begin{aligned} I_N: C_a([0,T]\times \mathbb {R}) \rightarrow \mathbb {R}\cup \{+\infty \}, \quad I_N (f)&\,{:}{=}\, \inf \Big \{\tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 \, : \, \rho \in L^2([0,T]\times \mathbb {R}),\nonumber \\&\sum _{n=0}^N (Y_n)_\text {hom}(\rho ) = f \Big \}, \end{aligned}$$
(2.7)

with the convention \( \inf \emptyset \,{:}{=}\, +\infty \). We now apply [HW15, Theorem 3.5] to obtain an LDP for \( Z_{N,\varepsilon } \).

Proposition 2.1

(Special case of [HW15, Theorem 3.5]) For any fixed \( a>a_* \), the function \( I_N \) in (2.7) is a good rate function. For fixed \( N<\infty \), \( \{ Z_{N,\varepsilon } \,{:}{=}\, \sum _{n=0}^N \varepsilon ^{\frac{n}{2}} Y_n \}_\varepsilon \) satisfies an LDP on \(C_a([0,T]\times \mathbb {R})\) with speed \( \varepsilon ^{-1} \) and the rate function \( I_N \).

Proof

Applying [HW15, Theorem 3.5] with \( \delta (\varepsilon ) = 0 \) and with \( \varvec{\Psi }^{(\varepsilon )} = ( Y_{0}, \varepsilon ^{1/2} Y_{1},\ldots , \varepsilon ^{N/2}Y_N ) \in E^{N+1} \) gives an LDP on \( C_a([0,T]\times \mathbb {R})^{N+1} \) for \( \varvec{\Psi }^{(\varepsilon )} \) with speed \( \varepsilon ^{-1} \) and the rate function \( J(f_0,\ldots ,f_N) \,{:}{=}\, \inf \{ \frac{1}{2} \Vert \rho \Vert _{L^2}^2 : \rho \in L^2([0,T]\times \mathbb {R}), \ (Y_n)_\text {hom}(\rho ) = f_n, n=0,\ldots ,N \}. \) Since the map \( C_a([0,T]\times \mathbb {R})^{N+1} \rightarrow C_a([0,T]\times \mathbb {R}) \), \( (f_0,\ldots ,f_N) \mapsto f_0+\ldots +f_N \) is continuous, the claimed result follows by the contraction principle. \(\quad \square \)
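For the reader’s convenience, the contraction principle invoked in the last step takes the following standard form:

```latex
% Contraction principle: if \{X_\varepsilon \} satisfies an LDP on a Hausdorff
% topological space \mathcal {X} with speed \varepsilon ^{-1} and good rate
% function J, and F:\mathcal {X}\rightarrow \mathcal {Y} is continuous, then
% \{F(X_\varepsilon )\} satisfies an LDP on \mathcal {Y} with speed
% \varepsilon ^{-1} and good rate function
\begin{aligned}
I'(y) \,{:}{=}\, \inf \big \{ J(x) \,:\, x\in \mathcal {X},\ F(x)=y \big \}.
\end{aligned}
% Here \mathcal {X} = C_a([0,T]\times \mathbb {R})^{N+1},
% F(f_0,\ldots ,f_N) = f_0+\cdots +f_N, and I' reduces to I_N in (2.7).
```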

2.1.2 Properties of the function \( \mathsf {Z}(\rho ) \).

Recall that \( \mathsf {Z}(\rho ) \) denotes the solution of (1.23). We begin by developing a series expansion for \( \mathsf {Z}(\rho ) \) that mimics the chaos expansion for the SHE. For fixed \( \rho \in L^2([0,T]\times \mathbb {R}) \), let

$$\begin{aligned} \mathsf {Y}_{n}(\rho ;t,x)&\,{:}{=}\, \int _{\Delta _n(t)}\int _{\mathbb {R}^{n+1}} p(s_n-s_{n+1},y_n-y_{n+1}) g_*(y_{n+1}) \mathrm {d}y_{n+1} \nonumber \\&\quad \prod _{i=1}^n p(s_{i-1}-s_{i},y_{i-1}-y_i) \rho (s_i,y_i) \mathrm {d}s_i \mathrm {d}y_i, \end{aligned}$$
(2.8)

where \( \Delta _n (t) \,{:}{=}\, \{\mathbf {s} = (s_0,s_1, \dots , s_{n+1}) : 0=s_{n+1}< s_n< \dots< s_1 < s_0 = t\} \), with the convention \( s_0\,{:}{=}\,t \) and \( y_0 \,{:}{=}\, x \). Iterating (1.23) shows that the unique solution is given by

$$\begin{aligned} \mathsf {Z}(\rho ;t,x) = \sum _{n=0}^\infty \mathsf {Y}_n(\rho ;t,x), \end{aligned}$$
(2.9)

provided that the right hand side of (2.9) converges in \( \Vert \cdot \Vert _{a} \).

To verify this convergence we proceed to establish a bound on \( \Vert \mathsf {Y}_{n}(\rho ) \Vert _{a} \). Hereafter, we will use \( C=C(a_1,a_2,\ldots ) \) to denote a deterministic positive finite constant. The constant may change from line to line or even within the same line, but depends only on the designated variables \( a_1,a_2,\ldots \). Recall that \( p(t,x) \) denotes the standard heat kernel. The following bounds will be useful in our subsequent analysis. The proof of these bounds is standard and hence omitted.

Lemma 2.2

Fix \( a\in \mathbb {R}\) and \( \theta \in (0,\frac{1}{2}) \). There exists \( C=C(a,\theta ,T) \) such that for all \( x,x'\in \mathbb {R}\) and \( s<t\in [0,T] \),

  1. (a)

    \(p(t,x) \le C t^{-1/2} e^{a|x|} \),

  2. (b)

    \(\int _\mathbb {R}p(t,x-y) e^{a|y|} \mathrm {d}y \le C e^{a|x|}\),

  3. (c)

    \(\int _\mathbb {R}p(t,x-y)^2 e^{a|y|} \mathrm {d}y \le C t^{-\frac{1}{2}} e^{a|x|}\),

  4. (d)

    \(\int _\mathbb {R}\big (p(t,x-y) - p(t,x'-y)\big )^2 e^{a|y|} \mathrm {d}y \le C |x-x'|^{2\theta } \, t^{-\frac{1}{2} - \theta } (e^{a|x|}\vee e^{a|x'|}) \), and

  5. (e)

    \(\int _\mathbb {R}\big (p(t,x-y) - p(s,x-y)\big )^2 e^{a|y|} \mathrm {d}y \le C |t-s|^\theta \, s^{-\frac{1}{2} - \theta } e^{a|x|} \).

Fix \( a\in \mathbb {R}\), \( \eta \in (0,T) \), and \( \theta \in (0,\frac{1}{2}) \). There exists \( C=C(a,\theta ,T,\eta ) \) such that for all \( s<t\in [\eta ,T] \) and \( x,x',y\in \mathbb {R}\),

  1. (i)

    \(|p(t,x-y) - p(t,x'-y)| \le C |x-x'|^\theta (e^{a|x-y|}\vee e^{a|x'-y|}) \), and

  2. (ii)

    \(|p(t, x) - p(s, x)| \le C |t-s| e^{a|x|} \).
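To illustrate the flavor of these estimates, here is a sketch of (b), using \( |y| \le |x| + |x-y| \) and the Gaussian exponential-moment computation:

```latex
\begin{aligned}
\int _\mathbb {R}p(t,x-y)\, e^{a|y|}\, \mathrm {d}y
  \le e^{a|x|} \int _\mathbb {R}p(t,z)\, e^{a|z|}\, \mathrm {d}z
  \le e^{a|x|} \cdot 2\, e^{\frac{a^2 t}{2}}
  \le C(a,T)\, e^{a|x|}.
\end{aligned}
```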

The next lemma gives a bound on \( \Vert \mathsf {Y}_{n}(\rho ) \Vert _{a} \) and verifies the convergence of the right hand side of (2.9).

Lemma 2.3

Fix \( a>a_* \). There exists \( C=C(T,a) \) such that, for all \( \rho \in L^2([0,T]\times \mathbb {R}) \) and \( n\in \mathbb {Z}_{\ge 0}\), we have \( \Vert \mathsf {Y}_{n}(\rho ) \Vert _{a} \le \frac{C^n}{\Gamma (n/2)^{1/2}} \Vert \rho \Vert _{L^2}^n \).

Proof

Throughout this proof we write \( C=C(T,a) \). Let \( F_n(t) \,{:}{=}\, \sup _{x\in \mathbb {R}} e^{-2a|x|}|\mathsf {Y}_n(\rho ;t,x)|^2 \). For \( n=0 \), we have \( \mathsf {Y}_{0}(\rho ;t,x) = \int _\mathbb {R}p(t,x-y) g_*(y)\mathrm {d}y \). That \( g_*\in C_{a_*^+}(\mathbb {R}) \) implies \( |g_*(y)| \le C e^{a|y|} \). Combining this with Lemma 2.2(b) gives \( F_0(t) \le C \). Next, for \( n\ge 1 \), referring to (2.8), we see that \( \mathsf {Y}_n(\rho ;t,x) \) can be expressed iteratively as

$$\begin{aligned} \mathsf {Y}_n(\rho ;t,x) = \int _0^t \int _\mathbb {R}p(t-s,x-y) \mathsf {Y}_{n-1}(\rho ;s,y) \rho (s,y) \mathrm {d}s \mathrm {d}y. \end{aligned}$$

Square both sides and apply the Cauchy–Schwarz inequality to get \( \mathsf {Y}_n(\rho ;t,x)^2 \le \int _0^t \int _\mathbb {R}p(t-s,x-y)^2 \mathsf {Y}_{n-1}(\rho ;s,y)^2 \mathrm {d}s \mathrm {d}y \, \Vert \rho \Vert _{L^2}^2. \) Within the last integral, use \( \mathsf {Y}_{n-1}(\rho ;s,y)^2 \le F_{n-1}(s) e^{2a|y|} \) and Lemma 2.2(c), then multiply both sides by \( e^{-2a|x|} \) and take the supremum over x. We obtain \( F_n(t) \le C \Vert \rho \Vert _{L^2}^2 \int _0^t F_{n-1}(s) (t-s)^{-1/2} \mathrm {d}s \). Iterating this inequality and using \( F_0(t) \le C \) complete the proof. \(\quad \square \)
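The final iteration in the proof of Lemma 2.3 is a standard convolution computation. As a sketch (our reconstruction): assuming inductively \( F_{n-1}(s) \le A_{n-1}\, s^{(n-1)/2} \), the beta integral gives

```latex
\begin{aligned}
\int _0^t (t-s)^{-\frac{1}{2}}\, s^{\frac{n-1}{2}}\, \mathrm {d}s
  = t^{\frac{n}{2}}\, B\big (\tfrac{1}{2}, \tfrac{n+1}{2}\big )
  = t^{\frac{n}{2}}\, \frac{\Gamma (\tfrac{1}{2})\, \Gamma (\tfrac{n+1}{2})}{\Gamma (\tfrac{n}{2}+1)},
\end{aligned}
% so A_n \le C \Vert \rho \Vert _{L^2}^2\, \Gamma (\tfrac{1}{2})\,
% \Gamma (\tfrac{n+1}{2})\, \Gamma (\tfrac{n}{2}+1)^{-1} A_{n-1}. Telescoping
% the ratio of Gamma factors yields
% F_n(t) \le C^{n+1} \Vert \rho \Vert _{L^2}^{2n}\, \pi ^{n/2}\, t^{n/2}
% \big / \Gamma (\tfrac{n}{2}+1), and taking square roots (absorbing powers
% of T and \pi into C) gives the claimed bound on
% \Vert \mathsf {Y}_{n}(\rho ) \Vert _{a}.
```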

As it turns out, the function \( (Y_{n})_\text {hom}(\rho ) \) in (2.6) is equal to \( \mathsf {Y}_{n}(\rho ) \) in (2.8).

Lemma 2.4

For any \( \rho \in L^2([0,T]\times \mathbb {R}) \) and \( n\in \mathbb {Z}_{\ge 0}\), we have \( (Y_n)_\mathrm {hom}(\rho ) = \mathsf {Y}_{n}(\rho ). \)

Proof

Recall the notation \( W(f) \) from (2.2). Since \( \rho \in L^2([0,T]\times \mathbb {R}) \), the Cameron–Martin theorem gives

$$\begin{aligned} (Y_{n})_\text {hom}(\rho ) \,{:}{=}\, \int _{\mathcal {B}} Y_n(\rho +\xi ) \mu (\mathrm {d}\xi ) = \mathbb {E}\big [ \exp \big ( W(\rho ) - \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2\big ) Y_n \big ]. \end{aligned}$$
(2.10)

Taking \( \tau = \Vert \rho \Vert _{L^2} \) and \( x =W(\rho /\Vert \rho \Vert _{L^2}) \) in (2.3) gives \( \exp ( W(\rho ) - \frac{1}{2} \Vert \rho \Vert _{L^2}^2 ) = \sum _{m=0}^\infty \Vert \rho \Vert _{L^2}^m H_m(W(\rho /\Vert \rho \Vert _{L^2})). \) Invoke the well-known identity, cf. [Nua06, Proposition 1.1.4],

$$\begin{aligned} \Vert \rho \Vert _{L^2}^m H_m(W(\rho /\Vert \rho \Vert _{L^2})) = \int _{\Delta _m(T)} \int _{\mathbb {R}^m} \prod _{i=1}^m \rho (s_i, y_i) \xi (s_i, y_i) \mathrm {d}s_i \mathrm {d}y_i, \end{aligned}$$
(2.11)

insert the result into (2.10), and exchange the sum and expectation in the result. We have

$$\begin{aligned} (Y_{n})_\text {hom}(\rho ;t,x) = \sum _{m = 0}^\infty \mathbb {E}\bigg [ \Big ( \int _{\Delta _m (T)} \int _{\mathbb {R}^m} \rho ^{\otimes m}(\mathbf {s}, \mathbf {y}) \prod _{i=1}^m \xi (s_i, y_i) \, \mathrm {d}s_i \mathrm {d}y_i \Big ) \ Y_n(t,x) \bigg ]. \end{aligned}$$

Within the last expression, the random variable on the right hand side of (2.11) belongs to the m-th \( \mathbb {R}\)-valued Wiener chaos. Since \( Y_n \) belongs to the n-th \( E\)-valued Wiener chaos, the expectation is nonzero only when \( m=n \). Calculating this expectation from (2.4) yields the desired result. \(\quad \square \)

2.2 The narrow wedge initial data

Throughout this subsection we fix \( 0<\eta<T<\infty \) and \( a\in \mathbb {R}\), and initiate the Stochastic Heat Equation (SHE) (1.13) from \( Z_\varepsilon (0,\cdot ) = \delta _0(\cdot ) \).

For the Wiener space formalism, the spaces \( \mathcal {H}=L^2([0,T]\times \mathbb {R}) \) and \( \mathcal {B}\) remain the same as in Sect. 2.1.1, while the space \( E\) now changes to \( E= C_a([\eta ,T]\times \mathbb {R}) \). The chaos expansion takes the same form as (2.5) but with

$$\begin{aligned} Y_{n} (t,x) \,{:}{=}\, \int _{\Delta _n (t)} \int _{\mathbb {R}^{n}} p(s_{n}-s_{n+1},y_n) \, \prod _{i=1}^n p(s_{i-1}-s_{i},y_{i-1}-y_{i}) \xi (s_i, y_i) \, \mathrm {d}s_i \mathrm {d}y_i. \end{aligned}$$
(2.4-nw)

Recall the norm \( \Vert \cdot \Vert _{a,\eta } \) from (1.25). Proposition 3.5-nw below asserts that each \( Y_n \) defines a \( C_a([\eta ,T]\times \mathbb {R}) \)-valued random variable, and Corollary 3.6-nw asserts that the right hand side of (2.5) converges in \( \Vert \cdot \Vert _{a,\eta } \) almost surely. The functions \( (Y_{n})_\text {hom}(\rho ) \) and \( I_N \) are defined the same way as in Sect. 2.1.1, but with \( C_a([\eta ,T]\times \mathbb {R}) \) in place of \( C_a([0,T]\times \mathbb {R}) \). More explicitly,

$$\begin{aligned}&(Y_n)_\text {hom}: L^2([0,T]\times \mathbb {R}) \rightarrow C_a([\eta ,T]\times \mathbb {R}), \qquad (Y_{n})_\text {hom}(\rho ) \!\,{:}{=}\,\! \int _{\mathcal {B}} Y_{n}(\xi +\rho ) \, \mu (\mathrm {d}\xi ), \end{aligned}$$
(2.6-nw)
$$\begin{aligned} I_N: C_a([\eta ,T]\times \mathbb {R}) \rightarrow \mathbb {R}\cup \{+\infty \}, \ I_N (f) \!&\,{:}{=}\,\! \inf \Big \{\tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 : \rho \in L^2([0,T]\times \mathbb {R}),\\&\quad \sum _{n=0}^N (Y_n)_\text {hom}(\rho ) = f \Big \}, \end{aligned}$$
(2.7-nw)

with the convention \( \inf \emptyset \,{:}{=}\, +\infty \).

Likewise, for Eq. (1.23-nw), the unique solution is given by the expansion (2.9) but with

$$\begin{aligned} \mathsf {Y}_{n}(\rho ;t,x) \,{:}{=}\, \int _{\Delta _n(t)}\int _{\mathbb {R}^{n}} p(s_n-s_{n+1},y_n) \, \prod _{i=1}^n p(s_{i-1}-s_{i},y_{i-1}-y_i) \rho (s_i,y_i) \mathrm {d}s_i \mathrm {d}y_i. \end{aligned}$$
(2.8-nw)

Similar proofs of Proposition 2.1 and Lemmas 2.3 and 2.4 applied in the current setting give

Proposition 2.1-nw.  For any fixed \( a\in \mathbb {R}\) and \( \eta \in (0,T) \), the function \( I_N \) in (2.7-nw) is a good rate function. For fixed \( N<\infty \), \( \{ Z_{N,\varepsilon } \,{:}{=}\, \sum _{n=0}^N \varepsilon ^{\frac{n}{2}} Y_n \}_\varepsilon \) satisfies an LDP on \(C_a([\eta ,T]\times \mathbb {R})\) with speed \( \varepsilon ^{-1} \) and the rate function \( I_N \).

Lemma 2.3-nw.  Fix \( a\in \mathbb {R}\) and \( 0<\eta<T<\infty \). There exists \( C=C(T,a,\eta ) \) such that, for all \( \rho \in L^2([0,T]\times \mathbb {R}) \) and \( n\in \mathbb {Z}_{\ge 0}\), we have \( \Vert \mathsf {Y}_{n}(\rho ) \Vert _{a,\eta } \le \frac{C^n}{\Gamma (n/2)^{1/2}} \Vert \rho \Vert _{L^2}^n \).

Lemma 2.4-nw.  For any \( \rho \in L^2([0,T]\times \mathbb {R}) \) and \( n\in \mathbb {Z}_{\ge 0}\), we have \( (Y_n)_\mathrm {hom}(\rho ) = \mathsf {Y}_{n}(\rho ). \)

3 Freidlin–Wentzell LDP for the SHE

3.1 Function-valued initial data

Throughout this subsection, we fix \( T<\infty \), \( a_* \in \mathbb {R}\), and \( g_*\in C_{a_*^+}(\mathbb {R}) = \cap _{a>a_*} C_a(\mathbb {R}) \), and let \( Z_\varepsilon \) denote the solution of (1.13) with the initial data \( g_*\).

Recall from Proposition 2.1 that \( Z_{N,\varepsilon } \,{:}{=}\, \sum _{n=0}^N \varepsilon ^{\frac{n}{2}} Y_n \) satisfies an LDP with the rate function \( I_N \) given in (2.7). By Lemma 2.4, the function \( I_N \) can be expressed as

$$\begin{aligned} I_N (f) = \inf \Big \{\tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 \, {:} \, \rho \in L^2([0,T]\times \mathbb {R}), \ \sum _{n=0}^N \mathsf {Y}_n(\rho ) = f \Big \}. \end{aligned}$$
(3.1)

Recall that \( \mathsf {Z}(\rho ) = \sum _{n=0}^\infty \mathsf {Y}_n(\rho ) \). Referring to the definition of \( I\) in (1.24), we see that formally taking \( N\rightarrow \infty \) in (3.1) produces \( I(f) \). The proof of Proposition 1.7 hence amounts to justifying this limit transition at the level of LDPs. Key to justifying such a limit transition is a tight enough bound on the tail probability \( \mathbb {P}[\Vert Y_n \Vert _{a}\ge r] \), which we establish in Sect. 3.1.1.

3.1.1 Tail probability of \( \Vert Y_{n} \Vert _{a} \).

We will utilize the fact that, for any \( (t,x)\in [0,T]\times \mathbb {R}\), the random variable \( Y_n(t,x) \) belongs to the n-th \( \mathbb {R}\)-valued Wiener chaos. For X in the n-th \( \mathbb {R}\)-valued Wiener chaos, the hypercontractivity inequality asserts that higher moments of X are controlled by its second moment, cf. [Nua06, Theorem 1.4.1],

$$\begin{aligned} \mathbb {E}\big [|X|^{p}\big ] \le p^{\frac{np}{2}} \big ( \mathbb {E}\big [|X|^{2}\big ] \big )^{\frac{p}{2}}, \quad \text {for all } p \ge 2. \end{aligned}$$
(3.2)

We now use this inequality to produce a tail probability bound.

Lemma 3.1

Let X be an \(\mathbb {R}\)-valued random variable in the n-th Wiener chaos and let \( \sigma ^2\,{:}{=}\, \mathbb {E}[X^2] \). There exists a universal constant \( C \in (0,\infty ) \) such that, for all \( n\in \mathbb {Z}_{\ge 1}\) and \( r\ge 0 \),

$$\begin{aligned} \mathbb {P}\big [ |X|\ge r \big ] \le \exp \big ( - \tfrac{n}{C}\sigma ^{-\frac{2}{n}} r^{\frac{2}{n}} +n \big ). \end{aligned}$$

Proof

Assume without loss of generality \(\sigma = 1\). We seek to bound \( \mathbb {E}[\exp (\alpha |X|^{2/n})] \) for \(\alpha > 0\). To this end, invoke Taylor expansion to get \( \mathbb {E}[\exp (\alpha |X|^{2/n})] = \sum _{k=0}^{n} \frac{1}{k!} \alpha ^k \mathbb {E}[|X|^{2k/n}] + \sum _{k=n+1}^{\infty } \frac{1}{k!} \alpha ^k \mathbb {E}[|X|^{2k/n}]. \) On the right hand side, use (3.2) to bound the moments for \( k \ge n+1 \). As for \( k\le n \), we simply bound \( \mathbb {E}[|X|^{2k/n}] \le (\mathbb {E}[ |X|^2 ])^{k/n} = 1 \). Combining these bounds gives \( \mathbb {E}[\exp (\alpha |X|^{2/n})] \le \sum _{k = 0}^{n} \frac{1}{k!} \alpha ^k + \sum _{k=n+1}^{\infty } \frac{1}{k!} \alpha ^k (\frac{2k}{n})^{k} . \) The first term on the right hand side is bounded by \(e^{\alpha }\). For the second term, using the inequality \( k^k \le e^k k!\) gives \( \sum _{k= n+1}^{\infty } \frac{1}{k!} \alpha ^k (\frac{2k}{n})^{k} \le \sum _{k = n+1}^\infty (\frac{2e\alpha }{n})^k\). Combining these bounds and setting \(\alpha = n/(4e) \) in the result gives \( \mathbb {E}[\exp (\frac{n}{4e} |X|^{2/n})] \le e^{\frac{n}{4e}} + 2^{-n} \le e^n \). Now applying Markov’s inequality completes the proof. \(\quad \square \)
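The elementary series bounds in this proof (everything after the single use of (3.2)) are deterministic inequalities and can be checked numerically. The following sketch evaluates the bound on \( \mathbb {E}[\exp (\frac{n}{4e}|X|^{2/n})] \) term by term for small n; the function name and the truncation level are ours.

```python
import math

def chaos_mgf_bound(n, kmax=200):
    """Evaluate the series bound from the proof of Lemma 3.1 for a
    unit-variance element X of the n-th Wiener chaos, with alpha = n/(4e):
      E[exp(alpha |X|^(2/n))] <= sum_{k<=n} alpha^k/k!          (E|X|^(2k/n) <= 1)
                               + sum_{k>n} (alpha^k/k!)(2k/n)^k  (hypercontractivity (3.2))
    Returns the head and tail sums separately."""
    alpha = n / (4 * math.e)
    head = sum(alpha**k / math.factorial(k) for k in range(n + 1))
    # With alpha = n/(4e), the k-th tail term simplifies to (k/(2e))^k / k!;
    # evaluate it in log space to avoid floating-point overflow for large k.
    tail = sum(math.exp(k * math.log(k / (2 * math.e)) - math.lgamma(k + 1))
               for k in range(n + 1, kmax))
    return head, tail

# The proof asserts head <= e^{n/(4e)} and tail <= 2^{-n},
# hence the total is <= e^n; check this for small n.
for n in range(1, 11):
    head, tail = chaos_mgf_bound(n)
    assert head <= math.exp(n / (4 * math.e)) + 1e-12
    assert tail <= 2.0**(-n)
    assert head + tail <= math.exp(n)
```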

In light of Lemma 3.1, bounding the tail probability of \( Y_{n}(t,x) \) amounts to bounding its second moment, which we do next. Recall that T, \( g_*\in C_{a_*^+}(\mathbb {R}) \), and \( a_*\in \mathbb {R}\) are fixed throughout this section.

Proposition 3.2

Fix \( a>a_* \), \( \theta _1\in (0,1) \), \(\theta _2\in (0,\frac{1}{2}) \), and \( n \in \mathbb {Z}_{\ge 1}\). There exists \( C=C(T,a,\theta _1,\theta _2) \) such that for all \( t, t'\in [0,T] \) and \( x,x'\in \mathbb {R}\),

  1. (a)

    \(\mathbb {E}\big [Y_n(t, x)^2\big ] \le e^{2a|x|} \frac{C^n}{\Gamma (\frac{n}{2})} \),

  2. (b)

    \( \mathbb {E}\big [\big ( Y_n(t, x) - Y_n(t, x') \big )^2\big ] \le \frac{C^n}{\Gamma (\frac{n}{2})} ( e^{2a|x|} \vee e^{2a|x'|}) |x - x'|^{\theta _1} \), and

  3. (c)

    \( \mathbb {E}\big [\big ( Y_n(t, x) - Y_n(t', x) \big )^2\big ] \le \frac{C^n}{\Gamma (\frac{n}{2})} e^{2a|x|} |t - t'|^{\theta _2} \).

Proof

Fix \( a>a_* \), \( \theta _1 \in (0, 1)\), \(\theta _2 \in (0, \frac{1}{2})\), and \( n \in \mathbb {Z}_{\ge 1}\). Throughout this proof we write \( C=C(T,g_*,a,\theta _1,\theta _2) \).

(a) We begin by developing an iterative bound. It is readily verified from (2.4) that the chaos can be expressed as

$$\begin{aligned} Y_n (t, x) = \int _0^t \int _\mathbb {R}p(t-s,x-y) Y_{n-1} (s, y) \xi (s, y) \mathrm {d}s\mathrm {d}y. \end{aligned}$$
(3.3)

Applying Itô’s isometry gives \( \mathbb {E}[Y_n (t, x)^2] = \int _0^t \int _{\mathbb {R}} p(t-s,x-y)^2 \mathbb {E}[Y_{n-1} (s, y)^2 ] \mathrm {d}s\mathrm {d}y. \) To streamline notation, set \( F_n (s) \,{:}{=}\, \sup _{x \in \mathbb {R}}e^{-2a|x|} \mathbb {E}[Y_n(s, x)^2 ] \). The last integral is bounded by \( \int _0^t F_{n-1}(s) \int _\mathbb {R}p(t-s,x-y)^2 e^{2a|y|} \mathrm {d}y \mathrm {d}s. \) Further using Lemma 2.2(c) to bound the inner integral gives \( \mathbb {E}[Y_n(t, x)^2] \le C \int _0^t (t-s)^{-\frac{1}{2}} e^{2a|x|} F_{n-1} (s) \mathrm {d}s. \) Multiplying both sides by \(\exp (-2a|x|)\) and taking the supremum over x give

$$\begin{aligned} F_n (t) \le C \int _0^t (t-s)^{-\frac{1}{2}} F_{n-1}(s)\mathrm {d}s. \end{aligned}$$
(3.4)

To utilize the iterative bound (3.4), we need to establish a bound on \( F_0(t) \). By definition

$$\begin{aligned} F_0 (t) \,{:}{=}\, \sup _{x\in \mathbb {R}} \Big \{ e^{-2a |x|} \Big ( \int p(t,x-y) g_*(y)\mathrm {d}y \Big )^2 \Big \}. \end{aligned}$$

Note that \( g_*\in C_{a_*^+}(\mathbb {R}) \) implies \( |g_*(y)| \le C e^{a|y|}.\) Insert this bound into the definition of \( F_0(t) \), and use Lemma 2.2(b) to bound the resulting integral (over y). The result gives \( F_0 (t) \le C \). Iterating (3.4) from \( n=1 \) and using \( F_0 (t) \le C \) give \( F_n (t) \le C^n (\Gamma (n/2))^{-1} t^{n/2}\), which concludes the desired result.

(b) Evaluate (3.3) at \( x \) and at \( x' \), take the difference, and apply Itô’s isometry. We have

$$\begin{aligned} \mathbb {E}\big [ \big ( Y_n(t, x) - Y_n(t, x') \big )^2\big ] = \int _0^t \int _\mathbb {R}\big (p(t-s,x-y) - p(t-s,x'-y)\big )^2 \mathbb {E}\big [Y_{n-1}(s,y)^2\big ] \mathrm {d}s \mathrm {d}y. \end{aligned}$$
(3.5)

Use Part (a) to bound \( \mathbb {E}[Y_{n-1}(t,x)^2] \), and apply Lemma 2.2(d) to bound the resulting integral. Doing so produces the desired result.

(c) Assume without loss of generality \(t> t'\). Evaluate (3.3) at \( t \) and at \( t' \), take the difference, and apply Itô’s isometry. We have

$$\begin{aligned} \begin{aligned} \mathbb {E}\big [\big (Y_n(t, x) - Y_n(t', x)\big )^2\big ]&= \int _0^{t'} \int _\mathbb {R}\big (p(t - s,x-y) - p(t' - s,x-y) \big )^2\\&\quad \mathbb {E}\big [Y_{n-1}(s, y)^2\big ]\mathrm {d}s \mathrm {d}y\\&\quad + \int _{t'}^{t} \int _\mathbb {R}p(t - s,x-y)^2\, \mathbb {E}\big [Y_{n-1}(s, y)^2\big ] \mathrm {d}s\mathrm {d}y. \end{aligned} \end{aligned}$$
(3.6)

On the right hand side, use Part (a) to bound \( \mathbb {E}[Y_{n-1}(s,y)^2] \), apply Lemma 2.2(e) and Lemma 2.2(c) to bound the resulting integrals, respectively. Doing so produces the desired result. \(\quad \square \)

Based on Lemma 3.1 and Proposition 3.2, we now derive some pointwise Hölder bounds on \( Y_{n} \).

Corollary 3.3

Fix \( a\in (a_*,\infty ) \), \( \alpha \in (0, \frac{1}{4}) \), and \(\beta \in (0, \frac{1}{2})\). There exists \(C = C(T,a,\alpha ,\beta )\) such that for all \( n\in \mathbb {Z}_{\ge 1}\), \( r \ge 0 \), \( t,t'\in [0,T] \), and \( x,x'\in \mathbb {R}\),

  1. (a)

    \( \mathbb {P}\Big [\, |Y_n (t, x) - Y_n (t, x')| \ge |x-x'|^\beta ( e^{a|x|} \vee e^{a|x'|}) r \Big ] \le \exp \big (- \tfrac{1}{C} n^{\frac{3}{2}} r^\frac{2}{n} + n \big ) \), and

  2. (b)

    \( \displaystyle \mathbb {P}\Big [\, |Y_n (t', x) - Y_n (t, x)| \ge e^{a|x|} |t-t'|^\alpha r \Big ] \le \exp \big (-\tfrac{1}{C} n^{\frac{3}{2}} r^\frac{2}{n} + n \big ) \).

Proof

Set \( U \,{:}{=}\, (e^{-a|x|}\wedge e^{-a|x'|}) \frac{ Y_n (t,x)-Y_n (t,x') }{ |x-x'|^{\beta } } \), \( V \,{:}{=}\, (e^{-a|x|}\wedge e^{-a|x'|}) \frac{ Y_n (t,x)-Y_n (t',x) }{ |t-t'|^{\alpha } } \), \( \sigma ^2 \,{:}{=}\, \mathbb {E}[U^2] \), and \( {\eta }^2 \,{:}{=}\, \mathbb {E}[{V}^2] \). Proposition 3.2(b) and (c) give \( \sigma ^2 \le C^n/\Gamma (\frac{n}{2}) \) and \( {\eta }^2 \le C^n/\Gamma (\frac{n}{2}) \). Taking \( \frac{1}{n} \) power on both sides and using \( \Gamma (\frac{n}{2})^{-1/n} \le C n^{-1/2} \), we have \( \sigma ^{\frac{2}{n}} \le C n^{-1/2} \) and \( {\eta }^{\frac{2}{n}} \le C n^{-1/2} \). Next, since \(Y_n (t, x) \), \( Y_n(t,x') \), \( Y_n(t',x) \), and \( Y_n(t',x') \) belong to the n-th \( \mathbb {R}\)-valued Wiener chaos, U and V also belong to the n-th Wiener chaos. The desired results now follow from Lemma 3.1. \(\quad \square \)

Our next step is to leverage the pointwise bounds in Corollary 3.3 to a functional bound. To this end it is convenient to first work with Hölder seminorms. For \( f\in C([0,T]\times \mathbb {R}) \) and \( k\in \mathbb {Z}\), set

$$\begin{aligned}{}[ f ]_{a,\alpha ,\beta ,k}&\,{:}{=}\, e^{-a|k|} \sup \bigg \{ \frac{ |f(t_1, x_1) - f(t_2, x_2)| }{ |t_1 - t_2|^\alpha + |x_1 - x_2|^\beta } \,:\, (t_1, x_1)\ne (t_2, x_2) \in [0,T]\times [k,k+1] \bigg \}. \end{aligned}$$
(3.7)

This quantity measures the Hölder continuity of f on \( [0,T]\times [k,k+1] \).

Proposition 3.4

Fix \( a\in (a_*,\infty ) \), \( \alpha \in (0,\frac{1}{4}) \), and \( \beta \in (0,\frac{1}{2}) \). There exists \( C=C(T,a,\alpha ,\beta ) \) such that, for all \( r \ge (Cn^{-\frac{1}{2}})^\frac{n}{2} \), \( n\in \mathbb {Z}_{\ge 1}\), and \( k\in \mathbb {Z}\),

$$\begin{aligned} \mathbb {P}\big [ \, [ Y_{n} ]_{a,\alpha ,\beta ,k} \ge r \big ] \le C\,\exp \big ( -\tfrac{1}{C} n^{\frac{3}{2}} r^{\frac{2}{n}} \big ). \end{aligned}$$

Proof

Throughout this proof we write \( C=C(T,a_*,a,\alpha ,\beta ) \).

The proof follows an argument similar to the proof of Kolmogorov’s continuity theorem. The starting point is an inductive partition of \( [0,T]\times [k,k+1] \) into nested rectangles. Let \( \tau _0 \,{:}{=}\, T \) and \( \zeta _0\,{:}{=}\,1 \) denote the side lengths of \( R^{(0)}_{11} \,{:}{=}\, [0,T]\times [k,k+1] \). We proceed by induction in \( \ell =0,1,2,\ldots \). Assume, for \( \ell \ge 0 \), we have obtained the rectangles \( R^{(\ell )}_{ij} \), for \( i=1, \ldots , \prod _{\ell '=0}^{\ell -1} m_{\ell '} \) and \( j=1, \ldots , \prod _{\ell '=0}^{\ell -1} n_{\ell '} \). We partition each \( R^{(\ell )}_{ij} \) into \( m_{\ell } \times n_{\ell } \) rectangles of equal size. The side lengths of the resulting rectangles are therefore \( \tau _{\ell +1} = \tau _{\ell }/m_\ell \) and \( \zeta _{\ell +1} = \zeta _{\ell }/n_\ell \). The numbers \( m_\ell \) and \( n_\ell \) are chosen in such a way that

$$\begin{aligned}&\tfrac{1}{2} \le \tau _\ell ^\alpha /\zeta _\ell ^\beta \le 2, \quad \text {for } \ell = 1,2,\ldots , \end{aligned}$$
(3.8)
$$\begin{aligned}&2 \le m_\ell , n_\ell \le C, \quad \text {for } \ell = 0,1,2,\ldots . \end{aligned}$$
(3.9)

Let \( \mathcal {V}_\ell \,{:}{=}\, \{ (i\tau _\ell ,k+j\zeta _\ell ) : i=0, \ldots , \prod _{\ell '=0}^{\ell -1} m_{\ell '}, j=0, \ldots , \prod _{\ell '=0}^{\ell -1} n_{\ell '} \} \) denote the set of the vertices at the \( \ell \)-th level, and let \( \mathcal {E}_\ell \) denote the corresponding set of edges.

For \( (t_1,x_1)\ne (t_2,x_2) \in [0,T]\times [k,k+1] \), let

$$\begin{aligned} \ell _* = \ell _*(t_1,x_1,t_2,x_2) \,{:}{=}\, \min \{ \ell \in \mathbb {Z}_{\ge 0}\, : |t_1-t_2| \ge \tau _\ell \text { or } |x_1-x_2| \ge \zeta _\ell \}. \end{aligned}$$
(3.10)

It is standard to show that, for any \( f\in C([0,T]\times \mathbb {R}) \),

$$\begin{aligned} |f(t_1,x_1)-f(t_2,x_2)| \le C \sum _{\ell \ge \ell _*} \max _{\mathbf {e}\in \mathcal {E}_\ell } |f(\partial \mathbf {e})|. \end{aligned}$$
(3.11)

Here \( |f(\partial \mathbf {e})| \,{:}{=}\, |f(s_1,y_1)-f(s_2,y_2)| \), where \( (s_1,y_1) \) and \( (s_2,y_2) \) are the two ends of the edge \( \mathbf {e}\in \mathcal {E}_\ell \).
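For completeness, here is a sketch of the chaining argument behind (3.11):

```latex
% For i=1,2 pick v^{(i)}_\ell \in \mathcal {V}_\ell within distance
% (\tau _\ell , \zeta _\ell ) of (t_i,x_i). Since f is continuous and
% v^{(i)}_\ell \rightarrow (t_i,x_i), telescoping gives
\begin{aligned}
f(t_i,x_i) = f\big (v^{(i)}_{\ell _*}\big )
  + \sum _{\ell > \ell _*} \Big ( f\big (v^{(i)}_{\ell }\big ) - f\big (v^{(i)}_{\ell -1}\big ) \Big ).
\end{aligned}
% Each increment joins two vertices at most C grid cells apart at level
% \ell , hence is bounded by
% C \max _{\mathbf {e}\in \mathcal {E}_\ell } |f(\partial \mathbf {e})|;
% by the choice (3.10) of \ell _*, the base points v^{(1)}_{\ell _*} and
% v^{(2)}_{\ell _*} are likewise within C edges at level \ell _*.
% Summing over \ell and over i=1,2 gives (3.11).
```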

Below we will apply (3.11) for \( f=e^{-a|k|}Y_{n} \). To prepare for this application let us first derive a bound on

$$\begin{aligned} \sum _{\ell _0 \ge 0} \mathbb {P}\Big [\, \sum _{\ell \ge \ell _0} \max _{\mathbf {e}\in \mathcal {E}_\ell } e^{-a|k|} |Y_{n}(\partial \mathbf {e})| \ge (\tau _{\ell _0}^\alpha + \zeta _{\ell _0}^\beta ) r \Big ]. \end{aligned}$$
(3.12)

Set \( \delta \,{:}{=}\, (\frac{1}{2} (\frac{1}{4}-\alpha )) \wedge (\frac{1}{2}(\frac{1}{2}-\beta )) \). Fix any edge \( \mathbf {e}\in \mathcal {E}_\ell \). If \( \mathbf {e}\) is in the t direction, apply Corollary 3.3(b) with \( \{(t,x),(t',x)\} = \partial \mathbf {e}\), \( \alpha \mapsto \alpha +\delta \), and \( r\mapsto \tau _\ell ^{-\delta } r \). If \( \mathbf {e}\) is in the x direction, apply Corollary 3.3(a) with \( \{(t,x),(t,x')\} = \partial \mathbf {e}\), \( \beta \mapsto \beta +\delta \), and \( r\mapsto \zeta _\ell ^{-\delta } r \). The result gives

$$\begin{aligned} \mathbb {P}\big [ e^{-a|k|-|a|} |Y_{n}(\partial \mathbf {e})| \ge \tau _\ell ^\alpha r \big ]&\le \exp \big ( -\tfrac{1}{C} n^\frac{3}{2} (\tau ^{-\delta }_\ell r)^{\frac{2}{n}} + n \big ), \quad \text {if } \mathbf {e}\text { is in the } t \text { direction,} \end{aligned}$$
(3.13)
$$\begin{aligned} \mathbb {P}\big [ e^{-a|k|-|a|} |Y_{n}(\partial \mathbf {e})| \ge \zeta _\ell ^\beta r \big ]&\le \exp \big ( -\tfrac{1}{C} n^\frac{3}{2} (\zeta ^{-\delta }_\ell r)^{\frac{2}{n}} + n \big ), \quad \text {if } \mathbf {e}\text { is in the } x \text { direction.} \end{aligned}$$
(3.14)

On the right hand sides of (3.13)–(3.14), use \( m_\ell ,n_\ell \ge 2 \) to bound \( \tau _\ell ^{-\delta } \ge e^{\frac{\ell }{C}} \) and \( \zeta _\ell ^{-\delta } \ge e^{\frac{\ell }{C}} \). Take the union bound of the result over \( \mathbf {e}\in \mathcal {E}_\ell \). The condition \( m_\ell ,n_\ell \le C \) gives \( |\mathcal {E}_\ell | \le C^\ell \). Hence

$$\begin{aligned} \mathbb {P}\Big [\, \max _{\mathbf {e}\in \mathcal {E}_\ell } e^{-a|k|} |Y_{n}(\partial \mathbf {e})| \ge e^{|a|} (\tau _\ell ^\alpha + \zeta _\ell ^\beta ) r \Big ] \le C^\ell \exp \big ( - \tfrac{1}{C} e^{\frac{\ell }{C n}} n^\frac{3}{2} r^{\frac{2}{n}} +n \big ). \end{aligned}$$
(3.15)

Next, the condition \( m_\ell ,n_\ell \ge 2 \) implies \( \tau _\ell \le \tau _{\ell _0} 2^{-\ell +\ell _0} \) and \( \zeta _\ell \le \zeta _{\ell _0} 2^{-\ell +\ell _0} \), and therefore \( \sum _{\ell \ge \ell _0} (\tau _\ell ^\alpha + \zeta _\ell ^\beta )r \le C (\tau _{\ell _0}^\alpha + \zeta _{\ell _0}^\beta )r. \) Use this inequality to take the union bound of (3.15) over \( \ell \ge \ell _0 \) and absorb \( e^{|a|} \) into C. We have

$$\begin{aligned} \mathbb {P}\Big [ \sum _{\ell \ge \ell _0} \max _{\mathbf {e}\in \mathcal {E}_\ell } e^{-a|k|} |Y_{n}(\partial \mathbf {e})| \ge (\tau _{\ell _0}^\alpha +\zeta _{\ell _0}^\beta ) Cr \Big ] \le \sum _{\ell \ge \ell _0}C^\ell \exp \big ( -\tfrac{1}{C} e^{\frac{\ell }{Cn}} n^\frac{3}{2} r^{\frac{2}{n}} + n \big ). \end{aligned}$$

Use \( e^{\frac{\ell }{Cn}} \ge 1 + \frac{\ell }{Cn} \) on the right hand side, sum both sides over \( \ell _0 \in \mathbb {Z}_{\ge 0}\), and rename \( Cr\mapsto r \). Doing so gives (3.12) \(\le \exp ( -\tfrac{1}{C}n^\frac{3}{2} r^{\frac{2}{n}} ) \sum _{\ell _0\ge 0} \sum _{\ell \ge \ell _0} \exp ( -\tfrac{\ell }{C} n^{\frac{1}{2}} r^{\frac{2}{n}} + n + \ell C). \) For all \( r \ge (C_0n^{-\frac{1}{2}})^{\frac{n}{2}} \) and \( C_0 \) sufficiently large, the last double sum is convergent and bounded. Hence

$$\begin{aligned} (3.12)\le C \exp \big ( -\tfrac{1}{C}n^\frac{3}{2} r^{\frac{2}{n}} \big ), \quad \text {for all } r \ge (Cn^{-\frac{1}{2}})^{\frac{n}{2}}. \end{aligned}$$
(3.16)

Now, set \( f= e^{-a|k|} Y_{n} \) in (3.11) and use (3.16). We have that, for any \( r \ge (Cn^{-\frac{1}{2}})^{\frac{n}{2}} \),

$$\begin{aligned} e^{-a|k|} |Y_{n}(t_1,x_1)-Y_{n}(t_2,x_2)| \le C\,(\tau _{\ell _*}^\alpha +\zeta _{\ell _*}^\beta ) r, \quad \forall (t_1,x_1),(t_2,x_2)\in [0,T]\times [k,k+1] \end{aligned}$$
(3.17)

holds with probability \( \ge 1 - C\, \exp ( -\frac{1}{C} n^\frac{3}{2} r^{\frac{2}{n}} ) \). Referring to the definition of \( \ell _* \) in (3.10), we see that either \( |t_1-t_2| \ge \tau _{\ell _*} \) or \( |x_1-x_2| \ge \zeta _{\ell _*} \) holds. Combining this fact with the condition (3.8) gives \( \frac{ \tau _{\ell _*}^\alpha +\zeta _{\ell _*}^\beta }{ |t_1-t_2|^\alpha +|x_1-x_2|^\beta } \le 3. \) Divide both sides of (3.17) by \( |t_1-t_2|^\alpha +|x_1-x_2|^\beta \), use the last inequality on the right hand side, take the supremum over \( (t_1,x_1)\ne (t_2,x_2)\in [0,T]\times [k,k+1] \) in the result, and rename \( 3 C r \mapsto r \). Doing so concludes the desired result. \(\quad \square \)

We now state and prove a bound on \( \mathbb {P}[ \,\Vert Y_{n} \Vert _{a} \ge r ] \).

Proposition 3.5

Fix \( a>a_* \). There exists \( C=C(T,a) \) such that, for all \( r \ge (Cn^{-\frac{1}{2}})^{\frac{n}{2}} \) and \( n\in \mathbb {Z}_{\ge 0}\),

$$\begin{aligned} \mathbb {P}\big [ \, \Vert Y_{n} \Vert _{a} \ge r \big ] \le C\,\exp \big ( -\tfrac{1}{C} n^{\frac{3}{2}} r^{\frac{2}{n}}\big ). \end{aligned}$$

Proof

Throughout this proof we write \( C=C(T,a) \).

For \( n=0 \), note that \( Y_{0}(t,x) = \int _{\mathbb {R}} p(t,x-y) g_*(y) \, \mathrm {d}y \) is deterministic. It is straightforward to check from Lemma 2.2(b) and \( g_*\in C_{a_*^+}(\mathbb {R}) \) that \( \Vert Y_{0} \Vert _{a} < \infty \). Let \( b\,{:}{=}\,(a+a_*)/2 \). For \( n \ge 1 \), note from (2.4) that \( Y_n(0,0) = 0 \). Given this property, from the definitions (1.21) and (3.7) of \( \Vert \cdot \Vert _{a} \) and \( [ \cdot ]_{a,\alpha ,\beta ,k} \) it is straightforward to check

$$\begin{aligned} \Vert Y_n \Vert _{a} \le C \sum _{k\in \mathbb {Z}} [ Y_n ]_{a,\frac{1}{8},\frac{1}{4},k} \le C \sum _{k\in \mathbb {Z}} [ Y_n ]_{b,\frac{1}{8},\frac{1}{4},k} \, e^{-\frac{1}{2} (a-a_*)|k|}. \end{aligned}$$

Apply Proposition 3.4 with \( r \mapsto e^{\frac{1}{2} (a-a_*)|k|} r \) and \( (a,\alpha ,\beta )\mapsto (b,\frac{1}{8},\frac{1}{4}) \), and take the union bound of the result over \( k\in \mathbb {Z}\). We have \( \mathbb {P}[ \, \Vert Y_n \Vert _{a} \ge C r ] \le \sum _{k\in \mathbb {Z}} C\,\exp ( -\tfrac{1}{C} n^{\frac{3}{2}} e^{\frac{|k|}{Cn}} r^{\frac{2}{n}} ). \) Within the last expression, use \( e^{\frac{|k|}{Cn}} \ge 1 +\frac{|k|}{Cn} \), sum the result over \( k\in \mathbb {Z}\), and rename \( Cr \mapsto r \) in the result. Doing so concludes the desired result. \(\quad \square \)

Proposition 3.5 immediately implies

Corollary 3.6

Fix \( a>a_* \). We have \( \mathbb {E}[ \, \Vert Y_{n} \Vert _{a}^k ] < \infty \) for all \( k,n\in \mathbb {Z}_{\ge 0}\), and \( \mathbb {P}[ \sum _{n=0}^\infty \Vert Y_{n} \Vert _{a} <\infty ] = 1 \).

3.1.2 Proof of Proposition 1.7(a).

Recall \( I\) from (1.24). We begin by showing that this function is a good rate function.

Lemma 3.7

For any \( a>a_* \), the function \( I: C_a([0,T]\times \mathbb {R}) \rightarrow \mathbb {R}\cup \{+\infty \} \) is a good rate function.

Proof

Throughout this proof we write \( \mathcal {H}= L^2([0,T]\times \mathbb {R}) \) and \( \Vert \cdot \Vert _\mathcal {H}= \Vert \cdot \Vert _{L^2} \). Recall that \( \mathcal {H}\subset \mathcal {B}\) is the Cameron–Martin subspace of \( \mathcal {B}\).

We begin with a reduction. It is well known that under \( \mu \), the random vector \( \sqrt{\varepsilon }\xi \) satisfies a Large Deviation Principle (LDP) on \( \mathcal {B}\) with speed \( \varepsilon ^{-1} \) and the good rate function \(I_*: \mathcal {B}\rightarrow \mathbb {R}\cup \{+\infty \} \) given by \( I_*(\rho ) \,{:}{=}\, \frac{1}{2} \Vert \rho \Vert ^2_\mathcal {H}\) for \( \rho \in \mathcal {H}\) and \( I_*(\rho ) \,{:}{=}\, + \infty \) for \( \rho \notin \mathcal {H}\), cf. [Led96, Chapter 4]. Recall that \( \mathsf {Z}\) maps \( \mathcal {H}\) to \( C_a([0,T]\times \mathbb {R}) \). We extend the domain of this map to \( \mathcal {B}\) by setting it to be 0 outside \( \mathcal {H}\), i.e.,

$$\begin{aligned} \widetilde{\mathsf {Z}}: \mathcal {B}\rightarrow C_a([0,T]\times \mathbb {R}), \quad \widetilde{\mathsf {Z}}(\zeta ) \,{:}{=}\, \left\{ \begin{array}{ll} \mathsf {Z}(\zeta ), &{}\text { when } \zeta \in \mathcal {H}, \\ 0, &{}\text { otherwise}. \end{array}\right. \end{aligned}$$

Referring to (1.24), we see that \( I\) is a pullback of \(I_*\) via \( \widetilde{\mathsf {Z}} \). Let \( \Omega (r) \,{:}{=}\, \{\zeta \in \mathcal {B}: I_*(\zeta ) \le r \} \) denote a sub-level set of \( I_*\). By [DS01, Lemma 2.1.4], to prove \( I\) is a good rate function, it suffices to construct a sequence of continuous functions \( \varphi _N: \mathcal {B}\rightarrow C_a([0,T]\times \mathbb {R}) \) such that for all \( r<\infty \),

$$\begin{aligned} \lim _{N\rightarrow \infty } \sup _{\zeta \in \Omega (r)} \Vert \widetilde{\mathsf {Z}}(\zeta ) - \varphi _N(\zeta ) \Vert _{a} =0. \end{aligned}$$
(3.18')

Since \( I_*(\zeta )<\infty \) only when \( \zeta \in \mathcal {H}\), we have \( \Omega (r) = \{\rho \in \mathcal {H}: \Vert \rho \Vert ^2_{\mathcal {H}} \le 2r \} \), and (3.18’) reduces to

$$\begin{aligned} \lim _{N\rightarrow \infty } \sup _{\rho \in \Omega (r)} \Vert \mathsf {Z}(\rho ) - \varphi _N(\rho ) \Vert _{a} =0. \end{aligned}$$
(3.18)

We will construct the \( \varphi _N \) via truncation. First, combining (2.9) and Lemma 2.4 gives, for \( \rho \in \mathcal {H}\),

$$\begin{aligned} \mathsf {Z}(\rho ) = \sum _{n=0}^\infty \mathsf {Y}_n(\rho ) = \sum _{n=0}^{N} (Y_n)_\text {hom}(\rho ) + \sum _{n> N} \mathsf {Y}_n(\rho ). \end{aligned}$$
(3.19)

The \( n>N \) terms in (3.19) can be bounded by Lemma 2.3.

Focusing on the \( n\le N \) terms in (3.19), we seek to approximate each \( (Y_n)_\text {hom}(\rho ) \) by a continuous function. To this end we follow the argument in [HW15, Section 3]. Recall the notation \( W(f) \) from (2.2) and recall the orthonormal basis \( \{e_1,e_2,\ldots \} \subset \mathcal {H}\) from Sect. 2.1.1. Regarding \( W(e_i): \mathcal {B}\rightarrow \mathbb {R}\) as a random variable, we let \( \mathcal {F}_k \) be the sigma algebra generated by \( W(e_1), \dots , W(e_k) \), and set \( \Psi _{n,k} \,{:}{=}\, \mathbb {E}[ Y_n | \mathcal {F}_k ] \). Given that \( Y_n \) belongs to the n-th \( E\)-valued Wiener chaos (recall that \( E= C_a([0,T]\times \mathbb {R}) \)), it is standard to check:

  1. (i)

    \( \lim _{k\rightarrow \infty } \mathbb {E}[ \Vert Y_n - \Psi _{n,k} \Vert _{a}^2 ] =0 \),

  2. (ii)

    \( \Psi _{n,k} \) can be expressed as a finite sum of the form \( \Psi _{n,k} = \sum y_\alpha \prod _{i=1}^k W(e_i)^{\alpha _i} \), where \( y_\alpha \in C_a([0,T]\times \mathbb {R}) \) and \( \alpha = (\alpha _1,\alpha _2,\ldots ) \in \mathbb {Z}_{\ge 0}\times \mathbb {Z}_{\ge 0}\times \ldots \).

Now consider the function \( (\Psi _{n,k})_\text {hom}: \mathcal {B}\rightarrow C_a([0,T]\times \mathbb {R}) \) defined by \( (\Psi _{n,k})_\text {hom}(\zeta ) \,{:}{=}\, \int _{\mathcal {B}} \Psi _{n,k}(\xi +\zeta ) \mu (\mathrm {d}\xi ). \) A priori, such an integral is guaranteed to be well-defined only for \( \zeta \in \mathcal {H}\). Yet for the special case considered here, the integral is well-defined for all \( \zeta \in \mathcal {B}\) and the result gives a continuous function \( \mathcal {B}\rightarrow C_a([0,T]\times \mathbb {R}) \). To see why, recall the definition of \( \mathcal {B}\) from (2.1), and for \( \zeta \in \mathcal {B}\) write \( \zeta = \sum _{i\ge 1} \zeta _i e_i \). From (ii) we have \( \int _{\mathcal {B}} \Psi _{n,k}(\xi +\zeta ) \mu (\mathrm {d}\xi ) = \sum y_\alpha \prod _{i=1}^k \mathbb {E}[(\zeta _i+\Xi _i)^{\alpha _i}] \), where \( \Xi _1,\Xi _2,\ldots \) are independent standard \( \mathbb {R}\)-valued Gaussian random variables, and the sum is finite. From the last expression we see that the integral is well-defined and gives a continuous function \( \mathcal {B}\rightarrow C_a([0,T]\times \mathbb {R}) \). Next, for \( \rho \in \mathcal {H}\), by the Cameron–Martin theorem, we have \( \Vert (Y_n)_\text {hom}(\rho ) - (\Psi _{n,k})_\text {hom}(\rho ) \Vert _{a} = \Vert \int _\mathcal {B}\exp \big ( W(\rho ) - \tfrac{1}{2} \Vert \rho \Vert ^2_\mathcal {H}\big ) \big ( Y_n(\xi ) - \Psi _{n,k}(\xi ) \big ) \mu (\mathrm {d}\xi ) \Vert _{a}. \) Applying the Cauchy–Schwarz inequality to the last expression gives

$$\begin{aligned} \Vert (Y_n)_\text {hom}(\rho ) - (\Psi _{n,k})_\text {hom}(\rho ) \Vert _{a}^2 \le \exp \big ( \tfrac{1}{2} \Vert \rho \Vert ^2_\mathcal {H}\big ) \mathbb {E}\big [ \Vert Y_n - \Psi _{n,k} \Vert _{a}^2 \big ]. \end{aligned}$$
(3.20)

The right hand side converges to zero as \( k\rightarrow \infty \) by (i). We have obtained an approximation of \( (Y_n)_\text {hom} \) by the continuous functions \( (\Psi _{n,k})_\text {hom} \).
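Returning to the continuity claim above: each factor \( \mathbb {E}[(\zeta _i+\Xi _i)^{\alpha _i}] \) in the homogenized expression is a polynomial in \( \zeta _i \), computable from the Gaussian moments \( \mathbb {E}[\Xi ^j] = (j-1)!! \) for even j. A minimal numerical sketch (the sample values are hypothetical):

```python
import math, random

def shifted_gauss_moment(z, m):
    # E[(z + Xi)^m] for Xi ~ N(0,1): binomial expansion with Gaussian
    # moments E[Xi^j] = (j-1)!! for even j, 0 for odd j -- a polynomial in z
    total = 0.0
    for j in range(0, m + 1, 2):
        double_fact = 1
        for i in range(j - 1, 0, -2):
            double_fact *= i
        total += math.comb(m, j) * double_fact * z ** (m - j)
    return total

# sanity check at z = 0.7, m = 4, where E[(z+Xi)^4] = z^4 + 6 z^2 + 3
z, m = 0.7, 4
exact = shifted_gauss_moment(z, m)
assert abs(exact - (z ** 4 + 6 * z ** 2 + 3)) < 1e-12
random.seed(0)
mc = sum((z + random.gauss(0, 1)) ** m for _ in range(200000)) / 200000
assert abs(mc - exact) < 0.2
```

Being a finite sum of polynomials in the coordinates \( \zeta _i \), the homogenized map is continuous on \( \mathcal {B}\).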

We now construct \( \varphi _N \). For fixed N, invoke (i) to obtain \( k_n\in \mathbb {Z}_{\ge 1}\) such that \( \mathbb {E}[ \Vert Y_n - \Psi _{n,k_n} \Vert _{a}^2 ] \le (N+1)^{-2} \). Set \( \varphi _N \,{:}{=}\, \sum _{n=0}^N (\Psi _{n,k_n})_\text {hom} \). This is a continuous function \( \mathcal {B}\rightarrow C_a([0,T]\times \mathbb {R}) \) since each \( (\Psi _{n,k_n})_\text {hom} \) is. Subtract \( \varphi _N \) from both sides of (3.19), take \( \Vert \cdot \Vert _{a} \) on both sides, and use (3.20), \( \mathbb {E}[ \Vert Y_n - \Psi _{n,k_n} \Vert _{a}^2 ] \le (N+1)^{-2} \), and Lemma 2.3 to bound the result. We have, for all \( \rho \in \mathcal {H}\),

$$\begin{aligned} \Vert \, \mathsf {Z}(\rho ) - \varphi _N(\rho ) \Vert _{a} \le \exp \big ( \tfrac{1}{4} \Vert \rho \Vert ^2_\mathcal {H}\big ) (N+1)^{-1} + \sum _{n\ge N} \frac{1}{\Gamma (n/2)^{\frac{1}{2}}} \big ( C(a,T)\,\Vert \rho \Vert _{\mathcal {H}}\big )^n. \end{aligned}$$

Now consider \( \rho \in \Omega (r) \), whence \( \Vert \rho \Vert ^2_\mathcal {H}\le 2r \). We see that the desired property (3.18) follows. \(\quad \square \)

Recall that \( Z_{N,\varepsilon } \,{:}{=}\, \sum _{n=0}^N \varepsilon ^{n/2}Y_{n} \). Next we show that \( Z_{N,\varepsilon } \) is an exponentially good approximation of \( Z_\varepsilon \).

Proposition 3.8

For any \( r>0 \) and \( a>a_* \), we have \( \lim \limits _{N \rightarrow \infty }\limsup \limits _{\varepsilon \rightarrow 0} \varepsilon \log \mathbb {P}\big [ \Vert Z_{N, \varepsilon } - Z_\varepsilon \Vert _{a} \ge r \big ] = -\infty . \)

Proof

By definition, \(Z_\varepsilon - Z_{N,\varepsilon } = \sum _{n >N} \varepsilon ^{\frac{n}{2}}Y_{n} \). Fix arbitrary \( N \in \mathbb {Z}_{\ge 1}\) and \( r>0 \). We seek to apply Proposition 3.5 with \( r\mapsto 2^{N-n} \varepsilon ^{-n/2}r \) and \( n > N \). For fixed N and r, the required condition \( 2^{N-n}\varepsilon ^{-n/2}r \ge (Cn^{-1/2})^{n/2} \) is satisfied for all \( n>N \) as long as \( \varepsilon \) is small enough. Summing the result over \( n>N \) and applying the union bound gives

$$\begin{aligned} \mathbb {P}\big [ \Vert Z_\varepsilon - Z_{N,\varepsilon } \Vert _{a} \ge r \big ] \le \sum _{n>N} \mathbb {P}\big [ \Vert Y_n \Vert _{a} \ge 2^{N-n}\varepsilon ^{-\frac{n}{2}}r \big ] \le C \, \sum _{n>N} \exp \big ( - \tfrac{1}{C} \varepsilon ^{-1} n^{\frac{3}{2}} e^{\frac{N-n}{Cn}} \big ), \end{aligned}$$

where \( C=C(T,a,r) \). On the right hand side, use \( e^{\frac{N-n}{Cn}} \ge 1 + \frac{N-n}{Cn} \) and sum the result over \( n>N \). Then apply \( \varepsilon \log (\,\cdot \,) \) to both sides, and take the limits \( \varepsilon \rightarrow 0 \) and \( N\rightarrow \infty \) in order. Doing so concludes the desired result. \(\quad \square \)

We seek to apply [DZ94, Theorem 4.2.16 (b)]. Doing so requires establishing a few properties of the rate functions. Let \( B_r(f) \,{:}{=}\, \{ f'\in C_{a}([0,T]\times \mathbb {R}) : \Vert f'-f \Vert _{a} < r \} \) denote the open ball of radius r around f. Recall \( I\) from (1.24) and recall \( I_N \) from (3.1).

Lemma 3.9

  1. (a)

    For any closed \( F\subset C_{a}([0,T]\times \mathbb {R}) \), we have \( \displaystyle \inf _{f\in F} I(f) \le \liminf _{N\rightarrow \infty } \, \inf _{f\in F} I_N(f). \)

  2. (b)

    For any \( f_0 \in C_a([0,T]\times \mathbb {R}) \), we have \( \displaystyle I(f_0) = \lim _{r\rightarrow 0} \liminf _{N\rightarrow \infty } \, \inf _{f\in B_r(f_0)} I_N(f). \)

Proof

(a) Let A denote the right hand side and assume without loss of generality \( A<\infty \). Referring to the definition of \( I_N \) in (3.1), we let \( \{(N_k,\rho _k)\}_{k=1}^\infty \subset \mathbb {Z}_{\ge 1}\times L^2([0,T]\times \mathbb {R}) \) be such that \( N_1<N_2<\ldots \rightarrow \infty \), \( \frac{1}{2}\Vert \rho _k\Vert _{L^2}^2 \le A + \frac{1}{k} \), and \( \sum _{n=0}^{N_k} \mathsf {Y}_n(\rho _k) =: f_k \in F \). Our next step is to relate \( (\rho _k,f_k) \) to \( I\). Recall that \( \mathsf {Z}(\rho ) = \sum _{n=0}^\infty \mathsf {Y}_n(\rho ) \). Letting \( f'_k \,{:}{=}\, f_k + \sum _{n>N_k} \mathsf {Y}_n(\rho _k) \in C_a([0,T]\times \mathbb {R}) \), we have \( \mathsf {Z}(\rho _{k}) = f'_k \). Referring to the definition of \( I\) in (1.24), we see that \( I(f'_k) \le \frac{1}{2}\Vert \rho _k\Vert _{L^2}^2 \le A+\frac{1}{k} \). Also, \( \Vert f'_k-f_k \Vert _{a} \le \sum _{n>N_k} \Vert \mathsf {Y}_n(\rho _k) \Vert _{a} \). Using Lemma 2.3 and \( \frac{1}{2}\Vert \rho _k\Vert _{L^2}^2 \le A+1 \) to bound the last expression gives

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert f'_k-f_k \Vert _{a} = 0. \end{aligned}$$
(3.21)

By Lemma 3.7, the sequence \( \{f'_k\}_{k=1}^\infty \) is contained in a compact set. Hence, after passing to a subsequence we have \( f'_k \rightarrow f_* \) in \( C_a([0,T]\times \mathbb {R}) \). The condition (3.21) remains true after passing to the subsequence. Since \( f_k\in F \) and F is closed, we have \( f_*\in F \). By Lemma 3.7, \( I\) is lower semi-continuous, whereby \( I(f_*) \le \liminf _{k} I(f'_k) \). Lower bound the left hand side by \( \inf _{f\in F}I(f) \) and upper bound the right hand side by \( \liminf _k (A+\frac{1}{k}) = A \). We conclude the desired result.

(b) Apply Part (a) with \( F=\overline{B_r(f_0)} \) and use the lower semicontinuity of \( I\) on the left hand side of the result. Doing so gives the inequality \( \le \) for the desired result. It hence suffices to show the reverse inequality \( \ge \). To this end, we assume without loss of generality \( I(f_0)<\infty \), and let \( \{\widetilde{\rho }_k\}_{k=1}^\infty \subset L^2([0,T]\times \mathbb {R}) \) be such that \( \frac{1}{2}\Vert \widetilde{\rho }_k\Vert _{L^2}^2 \le I(f_0) + \frac{1}{k} \) and that \( \mathsf {Z}(\widetilde{\rho }_k) = \sum _{n=0}^{\infty } \mathsf {Y}_n(\widetilde{\rho }_k) = f_0 \). Let \( \widetilde{f}_{N,k} \,{:}{=}\, \sum _{n=0}^{N} \mathsf {Y}_n(\widetilde{\rho }_{k}) \). Referring to the definition of \( I_N \) in (3.1), we see that \( I_N(\widetilde{f}_{N,k}) \le \frac{1}{2}\Vert \widetilde{\rho }_k\Vert _{L^2}^2 \le I(f_0)+\frac{1}{k} \). Also, for fixed k, using Lemma 2.3 and \( \frac{1}{2}\Vert \widetilde{\rho }_k\Vert _{L^2}^2 \le I(f_0)+1 \) gives \( \lim _{N\rightarrow \infty }\Vert f_0-\widetilde{f}_{N,k} \Vert _{a} =0. \) This statement implies that, for any given \( r>0 \) and for all N large enough (depending on r and k), we have \( \widetilde{f}_{N,k} \in B_r(f_0) \). From this and \( I_N(\widetilde{f}_{N,k}) \le I(f_0)+\frac{1}{k} \), taking \( N\rightarrow \infty \), then \( k\rightarrow \infty \), and then \( r\rightarrow 0 \), the desired result follows. \(\quad \square \)

We are now ready to complete the proof of Proposition 1.7(a). The Large Deviation Principle (LDP) for \( \{ Z_{N,\varepsilon } \}_\varepsilon \) is established in Proposition 2.1 with the rate function \( I_N \). Given this, we apply [DZ94, Theorem 4.2.16 (b)] to go from the large deviations of \( \{ Z_{N,\varepsilon } \}_\varepsilon \) to that of \( \{ Z_\varepsilon \}_\varepsilon \). This theorem asserts that \( \{Z_\varepsilon \}_\varepsilon \) satisfies a Large Deviation Principle (LDP) with the rate function I, contingent on the following conditions:

  1. (1)

    I is a good rate function,

  2. (2)

    \( \{ Z_{N,\varepsilon } \}_\varepsilon \) is an exponentially good approximation (defined in [DZ94, Definition 4.2.14]) of \( \{ Z_{\varepsilon } \}_\varepsilon \),

  3. (3)

    \( \displaystyle I(f_0) = \sup _{r > 0} \liminf _{N \rightarrow \infty } \inf _{f \in B_r(f_0)} I_N (f) \), and

  4. (4)

    \( \displaystyle \inf _{f \in F} I(f) \le \limsup _{N \rightarrow \infty } \inf _{f \in F} I_N (f)\), for every closed set \( F \subset C_{a}([0,T]\times \mathbb {R}) \).

These conditions are verified by Lemma 3.7, Proposition 3.8, Lemma 3.9(b), and Lemma 3.9(a), respectively. Applying [DZ94, Theorem 4.2.16 (b)] completes the proof of Proposition 1.7(a).

3.2 The narrow wedge initial data, Proof of Proposition 1.7(b)

Throughout this subsection, we fix \( 0<\eta<T<\infty \), \( a\in \mathbb {R}\), and let \( Z_\varepsilon \) denote the solution of (1.13) with the initial data \( Z_\varepsilon (0,\cdot )=\delta _0(\cdot ) \).

The proof of Proposition 1.7(b) parallels that of Proposition 1.7(a), starting with the following analog of Proposition 3.2:

Proposition 3.2-nw.  Fix \( \theta _1 \in (0,\frac{1}{2}) \), \(\theta _2\in (0,1) \), and \( n \in \mathbb {Z}_{\ge 1}\). There exists \( C=C(T,\eta ,a,\theta _1,\theta _2) \) such that for all \( t, t'\in [\eta ,T] \) and \( x,x'\in \mathbb {R}\),

  1. (a)

    \( \mathbb {E}\big [\big ( Y_n(t, x) - Y_n(t, x') \big )^2\big ] \le \frac{C^n}{\Gamma (\frac{n}{2})} (e^{2a|x|} \vee e^{2a|x'|}) |x - x'|^{\theta _2} \), and

  2. (b)

    \( \mathbb {E}\big [\big ( Y_n(t, x) - Y_n(t', x) \big )^2\big ] \le \frac{C^n}{\Gamma (\frac{n}{2})} e^{2a|x|} |t - t'|^{\theta _1} \).

Proof

Throughout this proof we write \( C=C(T,\eta ,a,\theta _1,\theta _2) \).

(a) By [Cor18, Lemma 2.4], we have

$$\begin{aligned} \mathbb {E}[Y_n(t,x)^2] = t^{\frac{n}{2}} 2^{-n} \Gamma (\tfrac{n}{2})^{-1} p(t, x)^2. \end{aligned}$$
(3.22)

The identity (3.5) continues to hold here. Inserting (3.22) into the right hand side of (3.5) gives

$$\begin{aligned} \mathbb {E}\big [(Y_n (t,x) - Y_n(t,x'))^2\big ] \le \frac{C^n}{\Gamma (\frac{n}{2})} \int _0^t \int _\mathbb {R}\big (p(t-s,x-y) - p(t-s,x'-y)\big )^2 p(s, y)^2 \mathrm {d}y \mathrm {d}s. \end{aligned}$$

On the right hand side, divide the integral into two parts for \( s > \eta /2 \) and for \( s < \eta /2 \). For the former use Lemma 2.2(a) to bound \( p(s,y)^2 \le C e^{2a|y|} \) (note that \( s>\eta /2 \)) and use Lemma 2.2(d) to bound the remaining integral; for the latter use Lemma 2.2(i) to bound \( (p(t-s,x-y) - p(t-s,x'-y))^2 \le C |x-x'|^{\theta _2} (e^{2a|x-y|} \vee e^{2a|x'-y|}) \) (note that \( t-s \ge \eta /2 \)) and use Lemma 2.2(c) to bound the remaining integral. Doing so concludes the desired result.

(b) The identity (3.6) continues to hold here. Inserting (3.22) into the right hand side of (3.6) gives

$$\begin{aligned} \mathbb {E}\big [(Y_n(t,x) - Y_n(t',x))^2 \big ]&\le \frac{C^n}{\Gamma (\frac{n}{2})} \Big ( \int _0^{t'} \int _\mathbb {R}\big (p(t-s,x-y) - p(t'-s,x-y) \big )^2 \nonumber \\&\quad p(s, y)^2 \mathrm {d}y \mathrm {d}s \end{aligned}$$
(3.23)
$$\begin{aligned}&\quad + \int _{t'}^{t} \int _\mathbb {R}p(t-s,x-y)^2 p(s, y)^2 \mathrm {d}y \mathrm {d}s \Big ). \end{aligned}$$
(3.24)

On the right hand side of (3.23), divide the integral into two parts for \( s > \eta /2 \) and for \( s < \eta /2 \). For the former use Lemma 2.2(a) to bound \( p(s, y)^2 \le Ce^{2a|y|} \) (note that \( s>\eta /2 \)) and use Lemma 2.2(e) to bound the remaining integral; for the latter use Lemma 2.2(ii) to bound \( (p(t-s,x-y) - p(t'-s,x-y))^2 \le C |t'-t|^{\theta _1} e^{2a|x-y|} \) (note that \( t'-s \ge \eta /2 \)) and use Lemma 2.2(c) to bound the remaining integral. The integral in (3.24) can be evaluated to be \( \int _{t'}^t 4^{-1} \pi ^{-3/2} t^{-1/2}s^{-1/2}(t-s)^{-1/2} \exp (-\frac{x^2}{t}) \mathrm {d}s. \) Using \( s,t\ge \eta \) to bound the last integral gives (3.24) \(\le C |t-t'|^{1/2} e^{2a|x|} \le C |t-t'|^{\theta _1} e^{2a|x|} \). From the preceding bounds we conclude the desired result. \(\quad \square \)
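The closed-form evaluation used for (3.24) rests on the Gaussian identity \( \int _{\mathbb {R}} p(t-s,x-y)^2 p(s,y)^2 \,\mathrm {d}y = 4^{-1}\pi ^{-3/2} (t s(t-s))^{-1/2} e^{-x^2/t} \), obtained by completing the square in the exponent. A numerical sanity check with illustrative sample values:

```python
import math

def p(t, x):
    # standard heat kernel p(t, x) = exp(-x^2 / (2t)) / sqrt(2*pi*t)
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

# closed form: int_R p(t-s, x-y)^2 p(s, y)^2 dy
#            = (1/4) * pi^{-3/2} * (t*s*(t-s))^{-1/2} * exp(-x^2 / t)
t, s, x = 1.3, 0.4, 0.9           # arbitrary sample values with 0 < s < t
N, Y = 20000, 10.0
dy = 2 * Y / N
num = 0.0
for j in range(N):                # midpoint rule on [-Y, Y]
    y = -Y + (j + 0.5) * dy
    num += p(t - s, x - y) ** 2 * p(s, y) ** 2 * dy
exact = 0.25 * math.pi ** -1.5 * (t * s * (t - s)) ** -0.5 * math.exp(-x * x / t)
assert abs(num - exact) < 1e-3 * exact
```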

Given Proposition 3.2-nw, a similar proof of Proposition 3.5-nw adapted to the current setting yields

Proposition 3.5-nw.  There exists \( C=C(T,\eta ,a) \) such that, for all \( r \ge (Cn^{-\frac{1}{2}})^{\frac{n}{2}} \) and \( n\in \mathbb {Z}_{\ge 0}\),

$$\begin{aligned} \mathbb {P}\big [ \, \Vert Y_{n} \Vert _{a,\eta } \ge r \big ] \le C\,\exp \big ( -\tfrac{1}{C} n^{\frac{3}{2}} r^{\frac{2}{n}}\big ). \end{aligned}$$

Corollary 3.6-nw.  We have \( \mathbb {E}[ \, \Vert Y_{n} \Vert _{a,\eta }^k ] < \infty \) for all \( k,n\in \mathbb {Z}_{\ge 0}\), and \( \mathbb {P}[ \sum _{n=0}^\infty \Vert Y_{n} \Vert _{a,\eta } <\infty ] = 1 \).

Given Proposition 3.5-nw, the rest of the proof for Proposition 1.7 (b) follows the arguments in Sect. 3.1.2 mutatis mutandis.

4 The Quadratic and \( \frac{5}{2} \) Laws

Fix \( Z_\varepsilon (0,\cdot ) = \delta _0(\cdot ) \). Our goal is to prove Theorem 1.1. By the scaling (1.3), we have

$$\begin{aligned}&\mathbb {P}\big [ h(2\varepsilon ,0) + \log \sqrt{4\pi \varepsilon } \ge \lambda \big ] = \mathbb {P}\big [ \sqrt{4\pi }Z_\varepsilon (2,0) \ge e^{\lambda } \big ], \quad \mathbb {P}\big [ h(2\varepsilon ,0) + \log \sqrt{4\pi \varepsilon } \le -\lambda \big ] \\&\quad = \mathbb {P}\big [ \sqrt{4\pi }Z_\varepsilon (2,0) \le e^{-\lambda } \big ]. \end{aligned}$$

Hence Theorem 1.1(a) follows from Proposition 1.7(b) (for any \( a\in \mathbb {R}\) and \( T\ge 2 \)) and the contraction principle, with

$$\begin{aligned} \Phi (\lambda )&= \inf \{ \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 \, : \, \sqrt{4\pi }\mathsf {Z}(\rho ;2,0) \ge e^{\lambda } \}, \end{aligned}$$
(4.1)
$$\begin{aligned} \Phi (-\lambda )&= \inf \{ \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 \, : \, \sqrt{4\pi }\mathsf {Z}(\rho ;2,0) \le e^{-\lambda } \}. \end{aligned}$$
(4.2)

Proving Theorem 1.1(b) and (c) thus amounts to evaluating the infima in (4.1) and (4.2), which will be carried out in Sects. 4.1 and 4.2, respectively.

4.1 Near-center tails, proof of Theorem 1.1(b)

In view of (4.1)–(4.2), our goal is to show

$$\begin{aligned} \lim _{\lambda \rightarrow 0} \lambda ^{-2} \inf \{ \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 \, : \, \sqrt{4\pi }\mathsf {Z}(\rho ;2,0) \ge e^{\lambda } \}&= \tfrac{1}{\sqrt{2\pi }}, \end{aligned}$$
(4.3)
$$\begin{aligned} \lim _{\lambda \rightarrow 0} \lambda ^{-2} \inf \{ \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 \, : \, \sqrt{4\pi }\mathsf {Z}(\rho ;2,0) \le e^{-\lambda } \}&= \tfrac{1}{\sqrt{2\pi }}. \end{aligned}$$
(4.4)

The proofs of (4.3) and (4.4) are the same so we consider only (4.3). Fix \( \rho \in L^2([0,2]\times \mathbb {R}) \). Since our goal is to prove (4.3), we assume \( \Vert \rho \Vert _{L^2} \le \lambda \) and \( \lambda \le 1 \). Recall that \( \mathsf {Z}(\rho ;t,x) = \sum _{n=0}^\infty \mathsf {Y}_n(\rho ;t,x) \), where \( \mathsf {Y}_n(\rho ;t,x) \) is given in (2.8-nw). Let \( O(\lambda ^k) \) denote a generic function of \( \lambda \) such that \( |O(\lambda ^k)|\le C\lambda ^k \), for all \( \lambda \in (0,1] \). Specialize at \( (t,x) = (2,0) \) and apply the bound in Lemma 2.3-nw for \( n \ge 2 \). We have

$$\begin{aligned} \sqrt{4\pi } \mathsf {Z}(\rho ;2,0) = 1 + \sqrt{4\pi } \int _0^2 \int _{\mathbb {R}} \rho (s,y) p(2-s, y) p(s, y) \, \mathrm {d}y \mathrm {d}s + O(\lambda ^2). \end{aligned}$$
(4.5)

Now assume \( \sqrt{4\pi }\mathsf {Z}(\rho ;2,0) \ge e^{\lambda } \). Inserting this inequality into (4.5) and Taylor expanding \( e^{\lambda } \) gives

$$\begin{aligned} \sqrt{4\pi } \int _0^2 \int _{\mathbb {R}} \rho (s,y) p(2-s, y) p(s, y) \, \mathrm {d}y \mathrm {d}s \ge \lambda + O(\lambda ^2). \end{aligned}$$

On the left hand side, apply the Cauchy–Schwarz inequality to separate \( \rho (s,y) \) and \( p(2-s, y) p(s, y) \), and use

$$\begin{aligned} \int _0^2 \int _\mathbb {R}p(2-s, y)^2 p(s, y)^2 \mathrm {d}y \mathrm {d}s = 2^{-5/2}\pi ^{-1/2} \end{aligned}$$
(4.6)

We have \( \Vert \rho \Vert _{L^2} \ge (2/\pi )^{1/4}\lambda + O(\lambda ^2) \). Squaring both sides and dividing the result by \( 2\lambda ^2 \) gives the inequality ‘\( \ge \)’ in (4.3).

To show the reverse inequality, take \( \kappa >1 \) and \( \rho (s,y) = \lambda \kappa 2^{3/2} p(2-s, y) p(s, y) \). Inserting this \( \rho \) into (4.5) and using (4.6) give \( \sqrt{4\pi } \mathsf {Z}(\rho ;2,0) \ge 1 + \kappa \lambda + O(\lambda ^2) \). With \( \kappa >1 \), the last expression is larger than \( e^{\lambda } \) for all \( \lambda \) small enough. On the other hand, by using (4.6) we have \( \frac{1}{2}\lambda ^{-2}\Vert \rho \Vert _{L^2}^2 = \frac{\kappa ^2}{\sqrt{2\pi }} \). Hence the left hand side of (4.3) is bounded by \( \frac{\kappa ^2}{\sqrt{2\pi }} \). Now taking \( \kappa \downarrow 1 \) completes the proof.
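The constant in (4.6) and the two constants derived from it in this proof can be checked numerically; the y-integral of \( p(2-s,y)^2p(s,y)^2 \) is Gaussian and evaluates to \( 4^{-1}\pi ^{-3/2}(2s(2-s))^{-1/2} \) (the grid size below is an arbitrary choice):

```python
import math

# y-integral of p(2-s,y)^2 p(s,y)^2 in closed form, then integrate over s
c = 0.25 * math.pi ** -1.5 * 2 ** -0.5
N = 2000000
ds = 2.0 / N
total = sum(c * ((i + 0.5) * ds * (2 - (i + 0.5) * ds)) ** -0.5 for i in range(N)) * ds
I = 2 ** -2.5 * math.pi ** -0.5           # right hand side of (4.6)
assert abs(total - I) < 1e-3

# Cauchy-Schwarz: lambda <= sqrt(4*pi*I) * ||rho||, and sqrt(4*pi*I) = (pi/2)^{1/4}
assert abs(math.sqrt(4 * math.pi * I) - (math.pi / 2) ** 0.25) < 1e-12
# optimizer rho = lambda*kappa*2^{3/2} p p: cost (1/2)*kappa^2*8*I = kappa^2/sqrt(2*pi)
assert abs(0.5 * 8 * I - 1 / math.sqrt(2 * math.pi)) < 1e-12
```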

4.2 Deep lower tail, proof of Theorem 1.1(c)

4.2.1 The Feynman–Kac formula and scaling.

Here we consider the deep lower-tail regime, i.e., \( -\lambda \rightarrow -\infty \). The first step is to express \( \mathsf {Z}(\rho ;t,x) \) by the Feynman–Kac formula. Namely,

$$\begin{aligned} \mathsf {Z}(\rho ;t,x)&= \mathbb {E}_x \Big [ \exp \Big ( \int _0^t \rho (s,B(t-s)) \, \mathrm {d}s \Big ) \, \delta _0(B(t)) \Big ] \end{aligned}$$
(4.7)
$$\begin{aligned}&= \mathbb {E}_{0\rightarrow x} \Big [ \exp \Big ( \int _0^t \rho (s,B_\text {b}(s)) \, \mathrm {d}s \Big ) \Big ] p(t,x). \end{aligned}$$
(4.8)

In (4.7), the expectation \( \mathbb {E}_x \) is taken with respect to a Brownian motion that starts from x, and in (4.8) the \( \mathbb {E}_{0\rightarrow x} \) is taken with respect to a Brownian bridge \( B_\text {b}(s) \) that starts from \( B_\text {b}(0)=0 \) and ends at \( B_\text {b}(t)=x \). Indeed, the expression (4.7) is equivalent to (2.9) upon Taylor-expanding the exponential in (4.7) and exchanging the sum with the expectation. The exchange is justified by the bound in Lemma 2.3-nw. Set

$$\begin{aligned} \mathsf {h}(\rho ;t,x)&\,{:}{=}\, \log (\sqrt{4\pi }\mathsf {Z}(\rho ;t,x)) = \log (\sqrt{4\pi }p(t,x)) \nonumber \\&\quad + \log \mathbb {E}_{0\rightarrow x} \Big [ \exp \Big ( \int _0^t \rho (s,B_\text {b}(s)) \, \mathrm {d}s \Big ) \Big ]. \end{aligned}$$
(4.9)
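The bridge representation (4.8)–(4.9) can be sanity-checked by Monte Carlo in an exactly solvable case: for the linear potential \( \rho (s,y)=y \) (illustrative only; it is not in \( L^2 \)), \( \int _0^t B_\text {b}(s)\,\mathrm {d}s \) is Gaussian with mean \( xt/2 \) and variance \( t^3/12 \) under \( \mathbb {E}_{0\rightarrow x} \), so the expectation equals \( \exp (xt/2+t^3/24) \). A minimal sketch (step and sample counts are arbitrary):

```python
import math, random

random.seed(7)
t, x = 1.0, 0.5
n_steps, n_paths = 100, 10000
dt = t / n_steps
acc = 0.0
for _ in range(n_paths):
    # sample a Brownian path, then pin it into a bridge 0 -> x
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + random.gauss(0.0, math.sqrt(dt)))
    integral = 0.0
    for k in range(n_steps + 1):
        s = k * dt
        bridge = path[k] - (s / t) * path[-1] + (s / t) * x
        w = 0.5 if k in (0, n_steps) else 1.0    # trapezoid rule weights
        integral += w * bridge * dt
    acc += math.exp(integral)
mc = acc / n_paths
exact = math.exp(x * t / 2 + t ** 3 / 24)
assert abs(mc - exact) < 0.02
```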

Take the logarithm on both sides of (4.7)–(4.8) and insert the result into (4.2). We have

$$\begin{aligned} \Phi (- \lambda ) = \inf \big \{ \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 : \, \mathsf {h}(\rho ;2,0) \le -\lambda \big \}. \end{aligned}$$
(4.10)

We expect the right hand side of (4.10) to grow as \( \lambda ^{5/2} \) when \( \lambda \rightarrow \infty \). As pointed out in [KK07, KK09, MKV16, KMS16], such a power law follows from scaling. More precisely, when \( \lambda \rightarrow \infty \), it is natural to scale \( \mathsf {h}\mapsto \lambda ^{-1} \mathsf {h}\) and \( \rho \mapsto \lambda \rho \). Accordingly, for the Brownian bridge in (4.9) to contribute on the same footing, it is desirable to have a factor \( \lambda ^{-1/2} \) multiplying \( B_\text {b}(s) \). This is so because large deviations of \( \lambda ^{-1/2} B_\text {b}(s) \) occur at rate \( \lambda \), which is compatible with the scaling \( \rho \mapsto \lambda \rho \). To implement this scaling, in (4.9) replace \( \rho (t,x) \mapsto \lambda \rho (t,\lambda ^{-1/2} x) \) and \( x\mapsto \lambda ^{1/2} x \) and divide the result by \( \lambda \). Let \( \mathsf {h}_\lambda (\rho ;t,x) \,{:}{=}\, \lambda ^{-1} \mathsf {h}(\lambda \rho (\cdot ,\lambda ^{-1/2} \cdot );t,\lambda ^{1/2} x) \) denote the resulting function on the left hand side. We have

$$\begin{aligned} \mathsf {h}_\lambda (\rho ;t,x)&= \lambda ^{-1}\log (\sqrt{4\pi }p(t,\lambda ^{\frac{1}{2}} x))\nonumber \\&\quad + \lambda ^{-1} \log \mathbb {E}_{0\rightarrow \lambda ^{1/2} x} \Big [ \exp \Big ( \int _0^t \lambda \rho (s,\lambda ^{-\frac{1}{2}} B_\text {b}(s)) \, \mathrm {d}s \Big ) \Big ]. \end{aligned}$$
(4.11)

The replacement \( \rho (t,x) \mapsto \lambda \rho (t,\lambda ^{-1/2} x) \) changes \( \Vert \rho \Vert _{L^2}^2 \) by a factor of \( \lambda ^{5/2} \), so (4.10) translates into

$$\begin{aligned} \Phi (-\lambda ) = \lambda ^\frac{5}{2} \inf \big \{ \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 : \, \mathsf {h}_\lambda (\rho ;2,0) \le -1 \big \}. \end{aligned}$$
(4.12)
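The \( \lambda ^{5/2} \) factor follows from the change of variables \( x\mapsto \lambda ^{-1/2}x \): \( \Vert \lambda \rho (\cdot ,\lambda ^{-1/2}\cdot ) \Vert _{L^2}^2 = \lambda ^{2}\cdot \lambda ^{1/2}\,\Vert \rho \Vert _{L^2}^2 \). A quick numerical confirmation with an arbitrary, illustrative profile:

```python
import math

lam = 7.0
def rho(t, x):
    # arbitrary illustrative profile on [0,2] x R (hypothetical choice)
    return math.exp(-x * x - (t - 1) ** 2)

def l2sq(f, X=12.0, Nt=200, Nx=2400):
    # midpoint rule for the squared L^2 norm on [0,2] x [-X, X]
    dt, dx = 2.0 / Nt, 2 * X / Nx
    total = 0.0
    for i in range(Nt):
        t = (i + 0.5) * dt
        for j in range(Nx):
            x = -X + (j + 0.5) * dx
            total += f(t, x) ** 2 * dt * dx
    return total

base = l2sq(rho)
scaled = l2sq(lambda t, x: lam * rho(t, x / math.sqrt(lam)))
assert abs(scaled - lam ** 2.5 * base) < 2e-3 * scaled
```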

Proving Theorem 1.1(c) hence amounts to proving

$$\begin{aligned} \lim _{\lambda \rightarrow \infty } \big ( \inf \big \{ \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 : \, \mathsf {h}_\lambda (\rho ;2,0) \le -1 \big \} \big ) = \frac{4}{15\pi }. \end{aligned}$$
(4.13)

4.2.2 The optimal deviation \( \rho _*\) and its geodesics.

We begin by introducing a function \( \rho _*\in L^2([0, 2] \times \mathbb {R}) \). The definition of this function is motivated by the physics arguments of [KK09, MKV16, KMS16]; see Sect. 1.1. In the context of Proposition 1.7, \( \rho \) describes possible deviations of the spacetime white noise \( \sqrt{\varepsilon }\xi \). This \( \rho _*\) is a candidate for the optimal \( \rho \), so we refer to \( \rho _*\) as the optimal deviation.

To define \( \rho _*\), consider the unique solution \( r\in C^1[1,2) \) of the equation

$$\begin{aligned} r'(t) = 2^\frac{1}{2} \pi ^{-\frac{1}{2}} \, r^2 \sqrt{r-\pi /2},&\text { for } t \in (1,2), \quad r(1)=\pi /2, \quad \text {and } r|_{(1,2)}>\pi /2, \end{aligned}$$
(4.14)

and symmetrically extend it to \( C^1(0,2) \) by setting \( r(t) \,{:}{=}\, r(2-t) \) for \( t\in (0,1) \). Integrating (4.14) gives

$$\begin{aligned} \frac{(r(t)-\pi /2)^\frac{1}{2}}{r(t)\pi /2} + (\tfrac{2}{\pi })^{\frac{3}{2}} \arctan \Big ( \big ( \tfrac{r(t)}{\pi /2}-1 \big )^{\frac{1}{2}} \Big ) = (\tfrac{2}{\pi })^{\frac{1}{2}} \, |t-1|. \end{aligned}$$
(4.15)

Let us note a few useful properties of r(t) . It can be checked from (4.15) that \( \lim _{s\downarrow 0}r(s)=\lim _{s\uparrow 2}r(s)=+\infty \). The integral \( \int _0^2 r(t) \, \mathrm {d}t= 2 \int _1^2 r(t) \, \mathrm {d}t \) can be evaluated with the aid of (4.14): perform the change of variables \( 2 \int _1^2 r(t) \, \mathrm {d}t = 2\int _{\pi /2}^\infty \frac{r}{r'(t)} \mathrm {d}r \) and use (4.14) to substitute \( r'(t) \). The result reads

$$\begin{aligned} \int _0^2 r(t) \, \mathrm {d}t = \int _0^2 |r(t)| \, \mathrm {d}t = 2\pi . \end{aligned}$$
(4.16)
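The evaluation of (4.16) can be mirrored numerically. The sketch below (truncation level and mesh are ad hoc choices) performs the change of variables described above, removes the endpoint singularity with \( u=\sqrt{r-\pi /2} \), and recovers \( 2\pi \).

```python
import math

a = math.pi / 2               # r(1) = pi/2
c = math.sqrt(2.0 / math.pi)  # prefactor in (4.14)

# int_0^2 r dt = 2 * int_{pi/2}^infty r / r'(t) dr
#             = 2 * int_{pi/2}^infty dr / (c * r * sqrt(r - a)).
# Substituting u = sqrt(r - a) removes the endpoint singularity:
#             = (4 / c) * int_0^infty du / (u^2 + a).
U, n = 200.0, 200000          # truncation and mesh, ad hoc
du = U / n
integral = sum(1.0 / (((k + 0.5) * du) ** 2 + a) for k in range(n)) * du
integral += 1.0 / U           # analytic estimate of the tail beyond U
total = (4.0 / c) * integral
print(total, 2 * math.pi)     # the two values nearly agree
```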

Set \( \ell (t) \,{:}{=}\, 1/r(t) \) for \( t\in (0,2) \), and let \( \ell (0)\,{:}{=}\,0 \) and \( \ell (2)\,{:}{=}\,0 \) so that \( \ell \in C[0,2] \). We define

$$\begin{aligned} \rho _*(t,x) \,{:}{=}\, - \frac{r(t)}{2\pi } \,\Big ( 1 - \frac{x^2}{\ell (t)^2} \Big )_+. \end{aligned}$$
(4.17)

Next, setting \( \rho =\rho _*\) in (4.9), we seek to characterize the \( \lambda \rightarrow \infty \) limit of the resulting function:

$$\begin{aligned} \mathsf {h}_*(t,x) \,{:}{=}\, \lim _{\lambda \rightarrow \infty } \mathsf {h}_\lambda (\rho _*;t,x), \end{aligned}$$
(4.18)

for all \( (t,x)\in (0,2]\times \mathbb {R}\). Even though only \( \mathsf {h}_*(2,0) \) will be relevant toward the proof of (4.13), we treat general \( (t,x)\in (0,2]\times \mathbb {R}\) as it is of independent interest.

Remark 4.1

Indeed, with \( \rho _*\) being the optimal deviation of the spacetime white noise, the function \( \mathsf {h}_*\) should be viewed as the limit shape of \( h_{\varepsilon ,\lambda }(t,x) \,{:}{=}\, \lambda ^{-1}\log Z_\varepsilon (t,\lambda ^{1/2}x) \) under the conditioning \( \{ h_{\varepsilon ,\lambda }(2,0) \le -1 \} \) with \( \lambda \gg 1 \). An explicit expression of \( \mathsf {h}_*(1,x) \) is given in [HMS19]. One can show that [HMS19, Eq’s (10)–(11)] coincide with the variational expression of \( \mathsf {h}_*\) given in (4.22) below.

Proving that \( \mathsf {h}_*\) is the limit shape of \( h_{\varepsilon ,\lambda } \) remains open, which we leave for future work.

To characterize (4.18), we first turn the limit into a minimization problem over paths, by using Varadhan’s lemma. To set up notation, we let \( H^1_{0,x}[0,t] \) denote the space of \( H^1 \) functions \( \gamma \) on [0, t] such that \( \gamma (0)=0 \) and \( \gamma (t)=x \), and likewise for \( C_{0,x}[0,t] \). For \( \gamma \in H^1_{0,x}[0,t] \), set

$$\begin{aligned} U(\gamma ;t,x) = \int _0^t \tfrac{1}{2} \gamma '(s)^2 - \rho _*(s,\gamma (s)) \ \mathrm {d}s. \end{aligned}$$
(4.19)

Lemma 4.2

For any \( (t,x)\in (0,2]\times \mathbb {R}\),

$$\begin{aligned} \lim _{\lambda \rightarrow \infty } \mathsf {h}_\lambda (\rho _*;t,x) =: \mathsf {h}_*(t,x) = - \inf \big \{ U(\gamma ;t,x) : \gamma \in H^1_{0,x}[0,t] \big \}. \end{aligned}$$
(4.20)

Proof

Let \( F(\gamma ) \,{:}{=}\, \int _0^t \rho _*(s,\gamma (s))\, \mathrm {d}s \). In (4.11), set \( \rho \mapsto \rho _*\) and let \( \lambda \rightarrow \infty \) to get

$$\begin{aligned} \lim _{\lambda \rightarrow \infty } \mathsf {h}_\lambda (\rho _*;t,x) = -\tfrac{x^2}{2t} + \lim _{\lambda \rightarrow \infty } \lambda ^{-1} \log \mathbb {E}_{0\rightarrow \lambda ^{1/2} x} \big [ \exp \big ( \lambda F(\lambda ^{-\frac{1}{2}} B_\text {b}(s)) \big ) \big ]. \end{aligned}$$
(4.21)

In (4.21) we have tacitly assumed that the last limit exists. To prove the existence of the limit and to evaluate it we appeal to Varadhan’s lemma. To start, let us establish the Large Deviation Principle (LDP) for \( \{ \lambda ^{-1/2} B_\text {b}(s) : s\in [0,t] \} \). Express \( B_\text {b}\) as \( B_\text {b}(s) = B(s) + (x-B(t)) s /t \), where \( B\) denotes a standard Brownian motion. Since the map \( \gamma \mapsto \gamma +(x-\gamma (t))s/t \) from \( \{ \gamma \in C[0,t] : \gamma (0)=0\} \) to \( C_{0,x}[0,t] \) is continuous, we can use the contraction principle to push forward the LDP for \( \lambda ^{-1/2}B\). The result asserts that \( \lambda ^{-1/2}B_\text {b}\) enjoys an LDP with speed \( \lambda \) and the rate function \( I_\mathrm {bb}(\gamma ) \,{:}{=}\, \inf \{ \frac{1}{2} \int _0^t (\gamma '(s)-v-\frac{x}{t})^2 \mathrm {d}s : v\in \mathbb {R}\} \) for \( \gamma \in H^1_{0,x}[0,t] \) and \( I_\mathrm {bb}(\gamma ) = +\infty \) otherwise. Optimizing over \( v\in \mathbb {R}\) gives

$$\begin{aligned} I_\mathrm {bb}(\gamma ) = \left\{ \begin{array}{ll} \int _0^t \frac{1}{2}\gamma '(s)^2 \mathrm {d}s - \frac{x^2}{2t}, &{}\text { for } \gamma \in H^1_{0,x}[0,t],\\ +\infty , &{}\text { for } \gamma \in C_{0,x}[0,t]{\setminus } H^1_{0,x}[0,t]. \end{array}\right. \end{aligned}$$
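The optimization over the shift can be checked on a discretized bridge. In this sketch (random slopes and an ad hoc grid of shifts, purely illustrative), the shift \( v \) absorbs the \( v+x/t \) of the text, and the grid minimum of \( \frac{1}{2}\int (\gamma '-v)^2 \) matches the closed form \( \frac{1}{2}\int \gamma '^2 - x^2/(2t) \).

```python
import random

# Discretized path on [0, t]: piecewise-constant slopes gamma'(s).
t, n = 2.0, 1000
ds = t / n
random.seed(0)
slopes = [random.gauss(0.0, 1.0) for _ in range(n)]
x = sum(slopes) * ds               # endpoint gamma(t) = x

def half_energy(v):
    # (1/2) * int_0^t (gamma'(s) - v)^2 ds on the grid
    return 0.5 * sum((g - v) ** 2 for g in slopes) * ds

# Brute-force minimum over a grid of shifts v (the optimum is v = x / t).
direct = min(half_energy(k * 0.01) for k in range(-500, 501))
closed = 0.5 * sum(g * g for g in slopes) * ds - x ** 2 / (2.0 * t)
print(direct, closed)              # the two values nearly agree
```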

To apply Varadhan’s lemma we need to check, for \( F(\gamma ) \,{:}{=}\, \int _0^t \rho _*(s,\gamma (s))\, \mathrm {d}s \):

  1. (i) \( F: C_{0,x}[0,t] \rightarrow \mathbb {R}\) is continuous. This statement would follow if \( \rho _*\) were uniformly continuous on \( [0,t]\times \mathbb {R}\). The function \( \rho _*(s,y) \), however, is discontinuous at (0, 0) and (2, 0) . To circumvent this issue, for small \( \delta >0 \), we consider the truncation \( \rho _*^\delta (s,y) \,{:}{=}\, \mathbf {1}_{\{|s-1|<1-\delta \}}\rho _*(s,y) \). The truncated functional \( F_\delta (\gamma )\,{:}{=}\,\int _0^t \rho _*^\delta (s,\gamma (s))\, \mathrm {d}s \) is continuous on \( C_{0,x}[0,t] \). The difference \( F -F_\delta \) is bounded by \( |(F-F_\delta )(\gamma )| \le \int _{|s-1|>1-\delta } |\rho _*(s,\gamma (s))|\, \mathrm {d}s \le \frac{1}{2\pi } \int _{|s-1|>1-\delta } |r(s)| \mathrm {d}s \). By (4.16), the last expression converges to zero as \( \delta \rightarrow 0 \), uniformly in \( \gamma \in C_{0,x}[0,t] \). Hence \( F \), being a uniform limit of continuous functionals, is continuous on \( C_{0,x}[0,t] \).

  2. (ii) \( \displaystyle \lim _{M\rightarrow \infty } \limsup _{\lambda \rightarrow \infty } \lambda ^{-1} \log \mathbb {E}_{0\rightarrow x }\big [ \exp \big (\lambda F(\lambda ^{-1/2}B_\text {b})\big ) \mathbf {1}\{F(\lambda ^{-1/2}B_\text {b})>M\}\big ] = -\infty \). This holds since \( \rho _*\le 0 \), which implies \( F \le 0 \).

Varadhan’s lemma applied to the last term in (4.21) completes the proof. \(\quad \square \)

Lemma 4.2 expresses \( \mathsf {h}_*(t,x) \) in terms of a variational problem over paths. We refer to a minimizing path in (4.20), if one exists, as a geodesic. The next step is to identify the geodesics. Let

$$\begin{aligned} \Omega \,{:}{=}\, \{ (s,y) : s\in [0,2], |y| \le \ell (s) \} \end{aligned}$$

denote the support of \( \rho _*\), with the boundary \( \partial \Omega = \{ (s,y) : s\in [0,2], |y| =\ell (s) \} \).

Proposition 4.3

  1. (a) For any \( (t,x)\in (0,2]\times \mathbb {R}\), the infimum

    $$\begin{aligned} \mathsf {h}_*(t,x) = -\inf \big \{ U(\gamma ;t,x) : \gamma \in H^1_{0,x}[0,t] \big \} \end{aligned}$$
    (4.22)

    is attained in \( H^1_{0,x}[0,t] \).

  2. (b) When \( (t,x)=(2,0) \), the geodesics are \( \alpha \ell (\cdot ) \), \( |\alpha | \le 1 \).

  3. (c) When \( (t,x) \in \Omega \cap \{t\in (0,2)\} \), the unique geodesic is \( (x/\ell (t)) \ell (\cdot ) \).

  4. (d) When \( (t,x) \in \Omega ^\mathrm {c}\cap \{t\in (0,2]\} \), the geodesic is the unique \( C^1_{0,x}[0,t] \) path such that \( \gamma |_{[0,t_*]}=\ell |_{[0,t_*]} \) and \( \gamma |_{[t_*,t]} \) is linear, for some \( t_*\in (0,t) \).

See Fig. 1 for an illustration of these geodesics.

Fig. 1

The solid curves are the geodesics for (4.22), with the thick ones being \( \pm \ell (\cdot ) \). The geodesics outside \( \pm \ell (\cdot ) \) are linear and touch \( \pm \ell (\cdot ) \) tangentially

Remark 4.4

An intriguing feature of Proposition 4.3(b) is the nonuniqueness of the geodesics between (0, 0) and (2, 0) . For any \( |\alpha | \le 1 \), \( \gamma =\alpha \ell \) is one such geodesic, so the geodesics span the lens-shaped region \( \Omega \). For the exponential Last Passage Percolation (LPP), [BGS19] proved that the point-to-point geodesic (in the context of LPP) does not concentrate around any given path under a lower-tail conditioning. Though the setups differ, the result of [BGS19] and Proposition 4.3(b) are consistent. It is an intriguing question to explore a deeper connection between these two phenomena. For example, is it true that for LPP under lower-tail conditioning, the distribution of the geodesic spans a lens-like region?

To streamline the proof of Proposition 4.3, let us prepare a few technical tools. The Euler–Lagrange equation for (4.19) is

$$\begin{aligned} \gamma '' = - \partial _x \rho _*(s,\gamma (s)) = \left\{ \begin{array}{ll} - \tfrac{r(s) }{\pi \ell (s)^2} \gamma , &{} \text { when } (s,\gamma (s)) \in \Omega ^\circ , \\ 0, &{} \text{ when } (s,\gamma (s)) \in \Omega ^\mathrm {c}. \end{array}\right. \end{aligned}$$
(4.23)

Equation (4.23) is ambiguous when \( (s,\gamma (s)) \in \partial \Omega \) because \( \partial _x\rho _*\) is not continuous there. We will avoid referencing (4.23) when \( (s,\gamma (s)) \in \partial \Omega \). It will be convenient to also consider

$$\begin{aligned} \gamma '' = - \tfrac{r(s) }{\pi \ell (s)^2} \gamma , \end{aligned}$$
(4.24)

which coincides with (4.23) in \( \Omega ^\circ \).
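That \( \ell \) solves (4.24) (part (b) of the lemma below, with \( \alpha =1 \)) can be probed numerically: at \( \gamma =\ell \) the right side of (4.24) equals \( -\tfrac{r}{\pi \ell ^2}\ell = -r^2/\pi \) since \( \ell =1/r \). The sketch below inverts the implicit relation (4.15) by bisection and compares a centered second difference of \( \ell \) with \( -r(t)^2/\pi \); the sample point, bisection bounds, and tolerances are ad hoc choices.

```python
import math

A = math.pi / 2

def t_of_r(r):
    # The implicit relation (4.15), solved for t on the branch t in [1, 2).
    lhs = math.sqrt(r - A) / (A * r) \
        + (2.0 / math.pi) ** 1.5 * math.atan(math.sqrt(r / A - 1.0))
    return 1.0 + lhs / math.sqrt(2.0 / math.pi)

def r_of_t(t):
    # t_of_r is increasing in r, so invert by bisection on [pi/2, 1e12].
    lo, hi = A + 1e-12, 1e12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if t_of_r(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

ell = lambda t: 1.0 / r_of_t(t)

t0, h = 1.5, 1e-4                     # sample point and step, ad hoc
second_diff = (ell(t0 + h) - 2.0 * ell(t0) + ell(t0 - h)) / h ** 2
target = -r_of_t(t0) ** 2 / math.pi   # RHS of (4.24) at gamma = ell
print(second_diff, target)            # the two values nearly agree
```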

Lemma 4.5

  1. (a) The function \( \ell \) is strictly concave and \( \lim _{s\downarrow 0} |\ell '(s)| = +\infty \).

  2. (b) For any \( \alpha \in \mathbb {R}\), the function \( \alpha \ell (s) \) solves (4.24) for \( s\in (0,2) \).

  3. (c) For any \( |\alpha |\le 1 \), \( U(\alpha \ell ;2,0) = 1 \).

  4. (d) In \( (\partial \Omega )^\mathrm {c} \), any geodesic of (4.22) is \( C^2 \) and solves (4.23).

  5. (e) When \( (t,x)\in \Omega \), any geodesic of (4.22) lies entirely in \( \Omega \).

  6. (f) Let \( \gamma \in H^1_{0,x}[0,t] \) be a geodesic of (4.22), and consider \( (t_*,\gamma (t_*)) \in \partial \Omega \) with \( t_*\in (0,t) \). Then

    $$\begin{aligned} \lim \limits _{\beta \downarrow 0 } \Big ( \frac{1}{\beta } \int _{t_*}^{t_*+\beta } \gamma '(s)\mathrm {d}s - \frac{1}{\beta } \int _{t_*-\beta }^{t_*} \gamma '(s)\mathrm {d}s \Big ) = 0. \end{aligned}$$

Proof

Parts (a)–(c) follow by straightforward calculations from \( \ell (s)=1/r(s) \), (4.14), and (4.16). Part (d) follows by a standard variational argument.

(e) The geodesic \( \gamma \) starts and ends within \( \Omega \), i.e., \( (0,\gamma (0)) =(0,0)\in \Omega \) and \( (t,\gamma (t)) = (t,x) \in \Omega \). If the geodesic ever leaves \( \Omega \), then there exist \( t_1<t_2\in [0,t] \) such that \( \gamma |_{(t_1,t_2)} \) lies outside \( \Omega \) and \( (t_i,\gamma (t_i)) \in \partial \Omega \) for \( i=1,2 \). See Fig. 2 for an illustration. Let us compare the functional \( U(\cdot ;t,x) \) (cf. (4.19)) restricted to the segments \( \gamma |_{[t_1,t_2]} \) and \( \pm \ell |_{[t_1,t_2]} \), where the \( \pm \) sign depends on which side of the boundary \( (t_1,\gamma (t_1)) \) and \( (t_2,\gamma (t_2)) \) belong to, cf. Fig. 2. First, \( \rho _*\) vanishes along both segments. Next, the strict concavity of \( \ell \) from Part (a) implies \( \int _{t_1}^{t_2} \gamma '(s)^2 \mathrm {d}s > \int _{t_1}^{t_2} \ell '(s)^2 \mathrm {d}s \). Therefore, replacing the segment \( \gamma |_{[t_1,t_2]} \) with \( \pm \ell |_{[t_1,t_2]} \) decreases the value of \( U(\gamma ;t,x) \). This contradicts the assumption that \( \gamma \) is a geodesic. Hence the geodesic must stay completely within \( \Omega \).

Fig. 2

Illustration of Part (e) of the proof of Lemma 4.5

(f) The idea is to perform variation. Fix a neighborhood O of \( t_* \) with \( \overline{O}\subset (0,2) \). For \( f\in C^\infty _c(O) \) consider

$$\begin{aligned} F(\alpha )\,{:}{=}\, \int _0^t \tfrac{1}{2}(\gamma '+\alpha f')^2 - \rho _*(s,\gamma +\alpha f) \ \mathrm {d}s. \end{aligned}$$

The derivative \( \partial _x\rho _*\) is bounded on \( \overline{O}\times \mathbb {R}\) (even though not continuous). Taylor expanding F around \( \alpha =0 \) then gives \( \int \gamma '(s) f'(s) \mathrm {d}s \le c \int |f(s)| \mathrm {d}s \), for some constant \( c<\infty \). Within the last inequality, substitute \( f(s)\mapsto f(s+u) \), integrate the result over \( u\in [-\frac{1}{2}\beta ,\frac{1}{2}\beta ] \), and divide both sides by \( \beta \). This gives

$$\begin{aligned} \frac{1}{\beta } \int \gamma '(s) \big (f(s+\tfrac{1}{2}\beta )-f(s-\tfrac{1}{2}\beta )\big ) \mathrm {d}s = \frac{1}{\beta } \int \big (\gamma '(s-\tfrac{1}{2} \beta ) -\gamma '(s+\tfrac{1}{2} \beta )\big ) f(s)\, \mathrm {d}s \le c \int |f(s)|\,\mathrm {d}s. \end{aligned}$$

This inequality holds for smooth f(s) supported in \( \{s:s\pm \frac{1}{2}\beta \in O\} \). Since \( \gamma '\in L^2[0,t] \), the inequality extends to \( f\in L^2 \). Specializing \( f= \pm \mathbf {1}_{(t_*-\frac{1}{2}\beta ,t_*+\frac{1}{2}\beta )}\) and taking \( \beta \downarrow 0 \) gives the desired result. \(\quad \square \)

Proof of Proposition 4.3

(a) The proof follows from a standard direct-method argument. Take any minimizing sequence \( \{ \gamma _n \} \). For such a sequence, \( \{\gamma '_n\} \) is bounded in \( L^2[0,t] \). By the Banach–Alaoglu theorem, after passing to a subsequence we have \( \gamma '_n \rightarrow \eta \in L^2[0,t] \) weakly in \( L^2[0,t] \). Let \( \gamma (\overline{s}) \,{:}{=}\, \int _0^{\overline{s}} \eta (s) \mathrm {d}s \). We then have \( \gamma _n \rightarrow \gamma \) in \( C_{0,x}[0,t] \) and \( \int _0^t \gamma '(s)^2 \mathrm {d}s = \Vert \eta \Vert _{L^2}^2 \le \liminf _{n}\Vert \gamma '_n\Vert _{L^2}^2 \). Also, by Property (i) in the proof of Lemma 4.2, \( \int _0^t \rho _*(s,\gamma _n(s))\mathrm {d}s \rightarrow \int _0^t \rho _*(s,\gamma (s))\mathrm {d}s \). We have verified that \( \gamma \in H^1_{0,x}[0,t] \) is a geodesic.

(b) The proof amounts to showing that any geodesic must be of the form \( \alpha \ell \), for some \( |\alpha |\le 1 \). Once this is done, Lemma 4.5(c) guarantees that any such path is a geodesic.

We begin with a reduction. For a geodesic \( \gamma \in H^1_{0,0}[0,2] \), consider its first and second halves \( \gamma _1 \,{:}{=}\, \gamma |_{[0,1]} \) and \( \gamma _2(s) \,{:}{=}\, \gamma (2-s)|_{s\in [0,1]} \). Joining each half with its mirror image end-to-end gives the symmetric paths \( \overline{\gamma }_i(s) \,{:}{=}\, \gamma _i(s)\mathbf {1}_{[0,1]}(s) + \gamma _i(2-s)\mathbf {1}_{(1,2]}(s) \), for \( s\in [0,2] \) and \( i=1,2 \). These symmetrized paths are also geodesics. To see why, note that since \( \rho _*(s,y) \) is symmetric around \( s=1 \), we have \( U(\overline{\gamma }_i;2,0) = 2 U(\gamma _{i};1,\gamma (1)) \), for \( i=1,2 \), and \( U(\gamma ;2,0) = U(\gamma _{1};1,\gamma (1)) + U(\gamma _{2};1,\gamma (1)) \). On the other hand, \( \gamma \) being a geodesic implies \( U(\gamma ;2,0) \le U(\overline{\gamma }_i;2,0) \), for \( i=1,2 \). From these relations we infer that \( U(\overline{\gamma }_1;2,0)=U(\overline{\gamma }_2;2,0) = U(\gamma ;2,0) \), namely, the symmetrized paths \( \overline{\gamma }_1 \) and \( \overline{\gamma }_2 \) are also geodesics. Recall that our goal is to show any geodesic must be of the form \( \alpha \ell \), for some \( |\alpha |\le 1 \). If we can establish the statement for \( \overline{\gamma }_1 \) and \( \overline{\gamma }_2 \), the same immediately follows for \( \gamma \). Hence, without loss of generality, hereafter we consider only symmetric geodesics.

Fix a geodesic \( \gamma \in H^1_{0,0}[0,2] \). As argued in the preceding paragraph, we can and shall assume \( \gamma (s) \) is symmetric around \( s=1 \), and by Lemma 4.5(e) the path lies entirely in \( \Omega \). The last condition implies \( |\gamma (1)| \le \ell (1) \). Consider first the case \( |\gamma (1)| < \ell (1) \). By Lemma 4.5(d), within a neighborhood of \( s=1 \) the path \( \gamma (s) \) is \( C^2 \) and solves (4.23) and therefore (4.24). The symmetry of \( \gamma \) gives \( \gamma '(1)=0 \). The uniqueness of the ODE (4.24) and Lemma 4.5(b) now imply \( \gamma (s) = \alpha \ell (s) \), for \( \alpha = \gamma (1)/\ell (1) \) and for all s in a neighborhood of \( s=1 \). This matching \( \gamma (s) = \alpha \ell (s) \) extends to \( s\in (0,2) \) by a standard continuity argument. This concludes the desired result for the case \( |\gamma (1)| < \ell (1) \).

Turning to the case \( |\gamma (1)|=\ell (1) \), we need to show \( \gamma =\pm \ell \). Let us argue by contradiction. Assuming the contrary, we can find \( t_2\in (0,1)\cup (1,2) \) such that \( (t_2,\gamma (t_2)) \in \Omega ^\circ \). By the symmetry of \( \gamma \) around \( s=1 \) we can and shall assume \( t_2\in (1,2) \). Tracking along \( \gamma \) backward in time from \( t_2 \), we let \( t_* \,{:}{=}\, \sup \{ s\in [0,t_2] : |\gamma (s)| = \ell (s) \} \) be the last contact time with \( \partial \Omega \) before \( t_2 \). Indeed \( t_* \in [1,t_2) \) and \( \gamma (t_*) = \pm \ell (t_*) \). Let us take ‘\( + \)’ for simplicity of notation; see Fig. 3 for an illustration. The case for ‘\( - \)’ can be treated by the same argument. By Lemma 4.5(d), \( \gamma |_{(t_*,t_2)} \) solves (4.23) and therefore (4.24). On the other hand, \( \ell \) also solves (4.24) by Lemma 4.5(b). These facts along with the well-posedness of (4.24) at \( (t_*,\ell (t_*)) \) imply that \( \gamma |_{[t_*,t_2)} \in C^2[t_*,t_2) \) and \( \lim _{\beta \downarrow 0}\gamma '(t_*+\beta ) \ne \ell '(t_*) \). Either ‘<’ or ‘>’ holds between these two quantities. The property \( \{(s,\gamma (s))\}_{s\in (t_*,t_2)} \subset \Omega ^\circ \) tells us that it is ‘<’, namely \( \lim _{\beta \downarrow 0}\gamma '(t_*+\beta ) < \ell '(t_*) \). Combining this inequality with Lemma 4.5(f) gives \( \lim _{\beta \downarrow 0} \frac{1}{\beta }\int _{t_*-\beta }^{t_*} \gamma '(s) \mathrm {d}s = \lim _{\beta \downarrow 0} \frac{1}{\beta }(\ell (t_*)-\gamma (t_*-\beta )) < \ell '(t_*) \). Recall from Lemma 4.5(a) that \( \ell \) is concave. The last inequality then forces \( \gamma (t_*-\beta ) > \ell (t_*-\beta ) \) for all small enough \( \beta >0 \). This contradicts the fact that \( \gamma \) lies within \( \Omega \). We have reached a contradiction and hence completed the proof for the case \( |\gamma (1)|=\ell (1) \).

Fig. 3

Illustration of Part (b) of the proof of Proposition 4.3. Only the portion \( s\ge t_* \) of the curve \( \gamma (s) \) is shown

(c) Our goal is to characterize the geodesic between (0, 0) and \( (t,x) \). The idea is to ‘embed’ such a minimization problem into a minimization problem between (0, 0) and (2, 0) . More precisely consider

$$\begin{aligned} \inf \big \{ U(\gamma ;2,0) : \gamma \in H^1_{0,0}[0,2], \ \gamma (t)=x \big \}. \end{aligned}$$
(4.25)

The infimum is taken over all \( H^1 \) paths that join (0, 0) and (2, 0) and pass through \( (t,x) \). Such an infimum can be divided into two parts as

$$\begin{aligned} \inf \big \{ U(\gamma ;2,0) : \gamma \in H^1_{0,0}[0,2], \ \gamma (t)=x \big \} = \inf \big \{ U(\gamma ;t,x) : \gamma \in H^1_{0,x}[0,t] \big \} + \inf \Big \{ \int _t^2 \tfrac{1}{2} \overline{\gamma }'(s)^2 - \rho _*(s,\overline{\gamma }(s)) \, \mathrm {d}s : \overline{\gamma } \in H^1_{x,0}[t,2] \Big \}. \end{aligned}$$
(4.26)

Take any geodesic \( \gamma \in H^1_{0,x}[0,t] \) for the first infimum in (4.26) and any geodesic \( \overline{\gamma } \in H^1_{x,0}[t,2] \) for the second infimum in (4.26). (The existence of such geodesics can be established by the same argument as in Part (a).) The concatenated path \( \gamma _\text {c}(s) \,{:}{=}\, \gamma (s)\mathbf {1}_{s\in [0,t]} + \overline{\gamma }(s)\mathbf {1}_{s\in (t,2]} \) is a geodesic for (4.25). Hence \( U(\gamma _\text {c};2,0) \le U(\widetilde{\gamma };2,0) \), for any \( \widetilde{\gamma }\in H^1_{0,0}[0,2] \) that passes through \( (t,x) \). Set \( \alpha =x/\ell (t) \). The last inequality holds in particular for \( \widetilde{\gamma }= \alpha \ell \). On the other hand, under the current assumption \( (t,x)\in \Omega \), we have \( |\alpha | \le 1 \), so Part (b) asserts that \( \alpha \ell \) minimizes (4.25) even without the constraint \( \gamma (t)=x \). Therefore, \( U(\gamma _\text {c};2,0) = U(\alpha \ell ;2,0) \), and \( \gamma _\text {c} \) itself is a geodesic for \( \inf \{ U(\widetilde{\gamma };2,0) : \widetilde{\gamma }\in H^1_{0,0}[0,2] \} \). The last statement and Part (b) force \( \gamma _\text {c} = \alpha \ell \), which concludes the desired result.

(d) Fix a geodesic \( \gamma \in H^1_{0,x}[0,t] \). By Lemma 4.5(d) and the fact that \( (\partial _x \rho _*)|_{\Omega ^\mathrm {c}} = 0 \), the path \( \gamma \) is linear outside \( \Omega \). Tracking along \( \gamma \) backward in time from t, we let \( t_* \,{:}{=}\, \inf \{ s\in [0,t] : |\gamma (s)|> \ell (s) \} \) be the first hitting time of the boundary. By Lemma 4.5(a) we must have \( t_* > 0 \). The segment \( \gamma |_{[0,t_*]} \) is itself a geodesic for \( U(\cdot ;t_*,\gamma (t_*)) \). Since \( (t_*,\gamma (t_*))=(t_*,\pm \ell (t_*)) \in \Omega \), Part (c) implies that \( \gamma |_{[0,t_*]} = \pm \ell |_{[0,t_*]} \). The path \( \gamma \) is \( C^1 \) except possibly at \( s=t_* \), but Lemma 4.5(f) guarantees that \( \gamma (s) \) is also \( C^1 \) at \( s=t_* \). For the given \( (t,x)\in \Omega ^\mathrm {c} \), there is exactly one \( t_*\in (0,t) \) that satisfies all the prescribed properties, so we have identified the unique geodesic \( \gamma \). \(\quad \square \)

Given Lemma 4.2 and Proposition 4.3, it is possible to evaluate \( \mathsf {h}_*(t,x) \) by calculating \( U(\gamma ;t,x) \) along the geodesic(s) given in Proposition 4.3. In particular, Proposition 4.3(b) and Lemma 4.5(c) give

$$\begin{aligned} \mathsf {h}_*(2,0) \,{:}{=}\, \lim _{\lambda \rightarrow \infty } \mathsf {h}_\lambda (\rho _*;2,0) = -1. \end{aligned}$$
(4.27)

Also, a straightforward calculation from (4.17) (with the help of (4.16)) gives \( \frac{1}{2} \Vert \rho _*\Vert _{L^2}^2 = \frac{4}{15\pi } \).
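This value can be reproduced from facts already at hand: with \( x=\alpha \ell (t) \), the squared norm factorizes as \( \Vert \rho _*\Vert _{L^2}^2 = \int _0^2 (\tfrac{r}{2\pi })^2 \ell (t)\, \mathrm {d}t \cdot \int _\mathbb {R}(1-\alpha ^2)_+^2\, \mathrm {d}\alpha \), and since \( \ell =1/r \), (4.16) reduces the t-integral of \( r/4\pi ^2 \) to \( 2\pi /4\pi ^2 \). The sketch below (ad hoc mesh) evaluates the \( \alpha \)-moment, 16/15, and assembles \( 4/(15\pi ) \).

```python
import math

# With x = alpha * ell(t):  ||rho_*||^2
#   = int_0^2 (r/(2 pi))^2 * ell(t) dt * int_R (1 - alpha^2)_+^2 d alpha.
# Since ell = 1/r, (r/(2 pi))^2 * ell = r / (4 pi^2), and (4.16) gives
# int_0^2 r dt = 2 pi.  It remains to evaluate the alpha-moment.
n = 200000                       # midpoint mesh on [-1, 1], ad hoc
da = 2.0 / n
moment = sum((1.0 - (-1.0 + (k + 0.5) * da) ** 2) ** 2 for k in range(n)) * da
half_norm_sq = 0.5 * moment * (2.0 * math.pi) / (4.0 * math.pi ** 2)
print(half_norm_sq, 4.0 / (15.0 * math.pi))   # both ≈ 0.08488
```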

We are now ready to prove one side of the inequalities in (4.13), namely

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \big ( \inf \big \{ \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 : \, \mathsf {h}_\lambda (\rho ;2,0) \le -1 \big \} \big ) \le \tfrac{1}{2} \Vert \rho _*\Vert _{L^2}^2 = \tfrac{4}{15\pi }. \end{aligned}$$
(4.28)

To show (4.28) we would like to have \( \mathsf {h}_\lambda (\rho _*;2,0) \le -1 \) for all large enough \( \lambda \), but (4.27) only gives the inequality for \( \lambda =+\infty \). We circumvent this issue by scaling. Fix \( \kappa >1 \) and let \( (\rho _*)_\kappa (t,x) \,{:}{=}\, \kappa \rho _*(t,\kappa ^{-1/2}x) \). Referring to the scaling from (4.9) to (4.11), we see that \( \mathsf {h}_\lambda ((\rho _*)_\kappa ;2,0) = \kappa \mathsf {h}_{\lambda \kappa }(\rho _*;2,0) \). This identity together with (4.27) implies \( \mathsf {h}_\lambda ((\rho _*)_\kappa ;2,0) < -1 \) for all large enough \( \lambda \). On the other hand, \( \frac{1}{2}\Vert (\rho _*)_\kappa \Vert _{L^2}^2 = \frac{\kappa ^{5/2}}{2}\Vert \rho _*\Vert _{L^2}^2 \), so the left hand side of (4.28) is at most \( \frac{\kappa ^{5/2}}{2}\Vert \rho _*\Vert _{L^2}^2 \). Letting \( \kappa \downarrow 1 \) concludes (4.28).

4.2.3 The reverse inequality.

To prove (4.13), it now remains only to show the reverse inequality. Fix any \( \rho \in L^2([0,2]\times \mathbb {R}) \) with \( \mathsf {h}_\lambda (\rho ;2,0) \le -1 \).

The first step is to relate \( \mathsf {h}_\lambda (\rho ;2,0) \) to the functional \( U(\gamma ;2,0) \), cf. (4.19). Within (4.11), set \( (t,x)\mapsto (2,0) \), express the Brownian bridge as \( B_\text {b}(t) = B(t) - t B(2)/2 \), where \( B\) denotes a standard Brownian motion, and apply the Cameron–Martin–Girsanov theorem with \( \lambda ^{1/2}\gamma \in H^1_{0,0}[0,2] \) being the drift/shift. The result gives

$$\begin{aligned} \mathsf {h}_\lambda (\rho ;2,0)&= - \int _0^2 \tfrac{1}{2} \gamma '(t)^2 \mathrm {d}t + \lambda ^{-1} \log \mathbb {E}_{0 \rightarrow 0} \\&\quad \Big [ \exp \Big ( \int _0^2\Big (\lambda \rho (t,\gamma +\lambda ^{-\frac{1}{2}} B_\text {b}) \, \mathrm {d}t + \lambda ^{\frac{1}{2}}\gamma '(t) \mathrm {d}B(t)\Big ) \Big ) \Big ]. \end{aligned}$$

Applying Jensen’s inequality to the last term yields, for any \( \gamma \in H^1_{0,0}[0,2] \),

$$\begin{aligned} -1 \ge \mathsf {h}_\lambda (\rho ;2,0) \ge -\lambda ^{-1} \log \sqrt{4\pi } - \int _0^2 \tfrac{1}{2} \gamma '(t)^2 - \mathbb {E}_{0 \rightarrow 0} \big [ \rho (t,\gamma +\lambda ^{-\frac{1}{2}} B_\text {b}) \big ] \ \mathrm {d}t. \end{aligned}$$
(4.29)

On the right hand side, the first term vanishes as \( \lambda \rightarrow \infty \), and the second term resembles the functional \( U(\gamma ;2,0) \). The differences are that \( \rho \) replaces \( \rho _*\), and that there is an additional expectation over \( \lambda ^{-\frac{1}{2}} B_\text {b}\).

We next use (4.29) to derive a useful inequality. First, recall from Lemma 4.5(c) that, for all \( |\alpha | \le 1 \),

$$\begin{aligned} -1 = -U(\alpha \ell ;2,0) = - \int _0^2 \tfrac{1}{2} (\alpha \ell ')^2 - \rho _*(t,\alpha \ell ) \ \mathrm {d}t. \end{aligned}$$
(4.30)

Substitute \( \gamma \mapsto \alpha \ell \) in (4.29) and subtract (4.30) from the result. This gives, for all \( |\alpha | \le 1 \),

$$\begin{aligned} \int _0^2 \big ( \rho _*(t,\alpha \ell ) - \mathbb {E}_{0 \rightarrow 0} \big [ \rho (t,\alpha \ell +\lambda ^{-\frac{1}{2}} B_\text {b}) \big ] \big )\, \mathrm {d}t \ge -\lambda ^{-1} \log \sqrt{4\pi }. \end{aligned}$$

Multiply both sides by \( -\frac{1}{2\pi }(1-\alpha ^2)_+ \) and integrate the result over \( \alpha \in \mathbb {R}\). On the left hand side of the result, swap the integrals, multiply the integrand by \( 1=r(t)\ell (t) \), and recognize \( -\frac{r(t)}{2\pi }(1-x^2/\ell (t)^2)_+ = \rho _*(t,x) \). We have

$$\begin{aligned} \int _0^2\int _\mathbb {R}\rho _*(t,\alpha \ell ) \Big ( \rho _*(t,\alpha \ell ) - \mathbb {E}_{0 \rightarrow 0} \big [ \rho (t,\alpha \ell +\lambda ^{-\frac{1}{2}} B_\text {b}) \big ] \Big )\, \ell (t) \mathrm {d}\alpha \mathrm {d}t \le \lambda ^{-1} \tfrac{15}{16} \log \sqrt{4\pi }. \end{aligned}$$
(4.31)

To see why (4.31) is useful, let us pretend for a moment that \( \lambda =+\infty \) in (4.31). The discussion in this paragraph is informal, and serves merely as a motivation for the rest of the proof. Informally set \( \lambda =+\infty \) in (4.31), and perform the change of variables \( x=\alpha \ell (t) \) on the left hand side. The result gives \( \langle \rho _*,\rho _*-\rho \rangle \le 0 \) and hence \( \Vert \rho _*\Vert _{L^2}^2 + \Vert \rho -\rho _*\Vert _{L^2}^2 \le \Vert \rho \Vert _{L^2}^2 \). The last inequality implies \( \Vert \rho _*\Vert _{L^2}^2 \le \Vert \rho \Vert _{L^2}^2 \), which is the desired result.
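The algebra behind this last step is the plain \( L^2 \) polarization identity, which holds with no condition on \( \rho \). The sketch below checks it on random stand-in vectors (purely illustrative; these are not the actual \( \rho \), \( \rho _*\)).

```python
import random

random.seed(1)
n = 1000
# Stand-ins for rho_* and rho on a discrete grid.
rho_star = [random.gauss(0.0, 1.0) for _ in range(n)]
rho = [random.gauss(0.0, 1.0) for _ in range(n)]

dot = lambda u, v: sum(a * b for a, b in zip(u, v))
diff = [a - b for a, b in zip(rho, rho_star)]        # rho - rho_*
inner = dot(rho_star, [-d for d in diff])            # <rho_*, rho_* - rho>

# ||rho||^2 = ||rho_*||^2 + ||rho - rho_*||^2 - 2 <rho_*, rho_* - rho>,
# so inner <= 0 would force ||rho_*||^2 + ||rho - rho_*||^2 <= ||rho||^2.
lhs = dot(rho, rho)
rhs = dot(rho_star, rho_star) + dot(diff, diff) - 2.0 * inner
print(abs(lhs - rhs))              # numerically zero
```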

In light of the preceding discussion, we seek to develop an estimate of \( \langle \rho _*,\rho _*-\rho \rangle \). To alleviate heavy notation we will often abbreviate \( \lambda ^{-1/2}B_\text {b}=: \mathrm {bb}\). Write \( \langle \rho _*,\rho _*-\rho \rangle = \int (\rho _*^2 - \rho _*\rho )(t,x) \mathrm {d}x \mathrm {d}t. \) Within the integral add and subtract \( \mathbb {E}[\rho _*^2(t,x-\mathrm {bb})] \) and \( \mathbb {E}[\rho _*(t,x-\mathrm {bb})\rho (t,x)] \). This gives \( \langle \rho _*,\rho _*-\rho \rangle = A_1 + A_2 + A_3 \), where

$$\begin{aligned} A_1&\,{:}{=}\, \mathbb {E}\int _0^2 \int _{\mathbb {R}} \rho _*(t,x-\mathrm {bb}) \big ( \rho _*(t,x-\mathrm {bb}) - \rho (t,x) \big ) \ \mathrm {d}x \mathrm {d}t,\\ A_2&\,{:}{=}\, \mathbb {E}\int _0^2 \int _{\mathbb {R}} \rho _*^2(t,x) - \rho _*^2(t,x-\mathrm {bb}) \ \mathrm {d}x \mathrm {d}t,\\ A_3&\,{:}{=}\, \mathbb {E}\int _0^2 \int _{\mathbb {R}} \big ( \rho _*(t,x-\mathrm {bb}) - \rho _*(t,x) \big ) \rho (t,x) \ \mathrm {d}x \mathrm {d}t. \end{aligned}$$

For \( A_1 \), the change of variables \( x=\alpha \ell (t) + \mathrm {bb}= \alpha \ell (t) + \lambda ^{-1/2} B_\text {b}(t) \) reveals that \( A_1 \) is equal to the left hand side of (4.31). Hence \( A_1 \le \lambda ^{-1} \tfrac{15}{16}\log \sqrt{4\pi } \). The term \( A_2 \) does not depend on \( \rho \), and it is readily checked from (4.17) that \( \lim _{\lambda \rightarrow \infty } |A_2| =0 \). As for \( A_3 \), the Cauchy–Schwarz inequality gives \( |A_3|\le A_{31}^{1/2} \Vert \rho \Vert _{L^2} \), where \( A_{31} \,{:}{=}\, \mathbb {E}\int ( \rho _*(t,x-\mathrm {bb}) - \rho _*(t,x) )^2 \mathrm {d}t \mathrm {d}x \). The term \( A_{31} \) does not depend on \( \rho \), and it is readily checked from (4.17) that \( \lim _{\lambda \rightarrow \infty } |A_{31}| =0 \). Adopt the notation \( o_\lambda (1) \) for a generic quantity that depends only on \( \lambda \) such that \( \lim _{\lambda \rightarrow \infty }|o_\lambda (1)| = 0 \). Collecting the preceding results on \( A_1 \), \( A_2 \), and \( A_3 \) now gives

$$\begin{aligned} \langle \rho _*,\rho _*-\rho \rangle \le o_\lambda (1) ( 1 + \Vert \rho \Vert _{L^2} ). \end{aligned}$$
(4.32)

Since \( \Vert \rho \Vert _{L^2}^2 = \Vert \rho _*\Vert _{L^2}^2 + \Vert \rho -\rho _*\Vert _{L^2}^2 - 2\langle \rho _*,\rho _*-\rho \rangle \), the bound (4.32) implies \( \Vert \rho _*\Vert _{L^2}^2 \le (1+o_\lambda (1))\Vert \rho \Vert _{L^2}^2 + o_\lambda (1).\) This inequality holds for all \( \rho \in L^2 \) with \( \mathsf {h}_\lambda (\rho ;2,0) \le -1 \), and \( o_\lambda (1)\rightarrow 0 \) does not depend on \( \rho \). The desired result hence follows:

$$\begin{aligned} \liminf _{\lambda \rightarrow \infty } \big ( \inf \big \{ \tfrac{1}{2} \Vert \rho \Vert _{L^2}^2 : \, \mathsf {h}_\lambda (\rho ;2,0) \le -1 \big \} \big ) \ge \tfrac{1}{2} \Vert \rho _*\Vert _{L^2}^2 = \tfrac{4}{15\pi }. \end{aligned}$$