1 Introduction

In this paper, we study a specific long-range directed polymer model, that is, the Cauchy directed polymer model on \({\mathbb {Z}}^{1+1}\). The long-range directed polymer model is an extension of the classic nearest-neighbor directed polymer model. For details about the nearest-neighbor model, we refer to [8, 12, 13]; for details about the long-range model, we refer to [7, 16].

1.1 The Model

We now introduce the Cauchy directed polymer model on \({\mathbb {Z}}^{1+1}\). The model consists of a random field and a heavy-tailed random walk on \({\mathbb {Z}}\), whose increment distribution is in the domain of attraction of the 1-stable law. The random field models the random environment and the random walk models the polymer chain. The polymer chain interacts with the random environment. We want to investigate whether this interaction significantly influences the behavior of the polymer chain compared to the case with no random environment.

To be precise, we denote the random walk, its probability and expectation by \(S=(S_{n})_{n\ge 0}, {\mathbf {P}}\), and \({\mathbf {E}}\) respectively. The random walk S starts at 0 and has i.i.d. increments satisfying

$$\begin{aligned} \left\{ \begin{array}{l} {\mathbf {P}}(S_{1}-S_{0}>k)/{\mathbf {P}}(|S_{1}-S_{0}|>k)\sim p,\\ {\mathbf {P}}(|S_{1}-S_{0}|>k)\sim k^{-1}L(k), \end{array}\right. \quad \quad \text{ for } \text{ some }~p\in [0,1]~\text{ as }~k\rightarrow \infty , \end{aligned}$$
(1.1)

where \(L(\cdot )\) is a function slowly varying at infinity, i.e., \(L(\cdot ):(0,\infty )\rightarrow (0,\infty )\) and for any \(a>0\), \(\lim _{t\rightarrow \infty }L(at)/L(t)=1\). The condition (1.1) is necessary and sufficient for \(S_{1}\) to belong to the domain of attraction of a 1-stable law, i.e., there exist a positive sequence \((a_{n})_{n\ge 1}\) and a real sequence \((b_{n})_{n\ge 1}\), such that

$$\begin{aligned} \frac{S_{n}-b_{n}}{a_{n}}\overset{d}{\rightarrow }G,\quad \text{ as }~n\rightarrow \infty , \end{aligned}$$
(1.2)

where \(\overset{d}{\rightarrow }\) stands for weak convergence and G is some 1-stable random variable. When G is symmetric, its law is the Cauchy distribution. In this paper, with a slight abuse of terminology, we say that any 1-stable law is Cauchy for convenience.

Convergence (1.2) is a well-known result. It can be shown that

$$\begin{aligned}&n{\mathbf {P}}(|S_{1}|>a_{n})\sim 1,\quad \text{ as }~n\rightarrow \infty . \end{aligned}$$
(1.3)
$$\begin{aligned}&b_{n}={\left\{ \begin{array}{ll} n{\mathbf {E}}[S_{1}],&{}\quad \text{ if }\,\, {\mathbf {E}}[|S_{1}|]<\infty ,\\ n{\mathbf {E}}[S_{1}\mathbb {1}_{\{|S_{1}|\le a_{n}\}}], &{}\quad \text{ if }~{\mathbf {E}}[|S_{1}|]=\infty . \end{array}\right. } \end{aligned}$$
(1.4)

Furthermore, we have:

$$\begin{aligned} a_{n}=n\varphi (n) \end{aligned}$$
(1.5)

with \(\varphi (n)=n^{-1}\sup \{x: x^{-1}nL(x)\ge 1\}\), where \(\varphi (\cdot )\) can be proved to be slowly varying at infinity.
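For instance, if \(L(\cdot )\equiv c\) for some constant \(c>0\) (a special case given here only for illustration), then the defining relation yields

$$\begin{aligned} \varphi (n)\equiv c\quad \text{ and }\quad a_{n}=cn, \end{aligned}$$

which is consistent with (1.3), since \(n{\mathbf {P}}(|S_{1}|>cn)\sim n\cdot c/(cn)=1\).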

The random field, its probability and expectation are denoted by \(\omega :=(\omega _{n,x})_{n\in {\mathbb {N}}, x\in {\mathbb {Z}}}, {\mathbb {P}}\) and \({\mathbb {E}}\) respectively. Here \(\omega \) is a family of i.i.d. random variables independent of the random walk S. We assume that \(\omega \)’s moment generating function is finite in a neighborhood of 0, meaning that there exists a constant \(c>0\), such that

$$\begin{aligned} \lambda (\beta ):=\log {\mathbb {E}}[\exp (\beta \omega _{n,x})]<\infty ,\quad \forall \beta \in (-c,c). \end{aligned}$$
(1.6)

Besides (1.6), we also assume that

$$\begin{aligned} {\mathbb {E}}[\omega _{n,x}] = 0\quad \text{ and }\quad {\mathbb {E}}[(\omega _{n,x})^{2}]=1. \end{aligned}$$
(1.7)

Given the random environment \(\omega \) and polymer length N, the law of the polymer is defined via a Gibbs transformation of the law of the underlying random walk, namely,

$$\begin{aligned} \frac{\text{ d }{\mathbf {P}}_{N,\beta }^{\omega }}{\text{ d }{\mathbf {P}}}(S):=\frac{1}{Z_{N,\beta }^{\omega }} \exp \left( \sum \limits _{n=1}^{N}\beta \omega _{n,S_{n}}\right) , \end{aligned}$$

where \(\beta >0\) is the inverse temperature and

$$\begin{aligned} Z_{N,\beta }^{\omega }={\mathbf {E}}\left[ \exp \left( \sum \limits _{n=1}^{N}\beta \omega _{n,S_{n}}\right) \right] \end{aligned}$$

is the partition function.

It turns out that \(Z_{N,\beta }^{\omega }\) plays a key role in the study of the directed polymer model. In [5], Bolthausen first showed that the normalized partition function

$$\begin{aligned} {\hat{Z}}_{N,\beta }^{\omega }:=\exp (-N\lambda (\beta ))Z_{N,\beta }^{\omega } \end{aligned}$$

converges to a limit \({\hat{Z}}_{\infty ,\beta }^{\omega }\) almost surely with either \({\mathbb {P}}({\hat{Z}}_{\infty ,\beta }^{\omega }=0)=0\) or \({\mathbb {P}}({\hat{Z}}_{\infty ,\beta }^{\omega }=0)=1\) (depending on \(\beta \)). The range of \(\beta \) satisfying the former is called the weak disorder regime and the range of \(\beta \) satisfying the latter is called the strong disorder regime. It has been shown (cf. [10, 16]) that in the weak disorder regime, the polymer chain still fluctuates on scale \(a_{n}\), similar to the underlying random walk. This phenomenon is called delocalization. It is believed that in the strong disorder regime, there should be a narrow corridor in space-time with distance to the origin much larger than \(a_{n}\) at time n, to which the polymer chain is attracted with high probability. This phenomenon is called localization.

There actually exists a stronger condition than strong disorder, which we now introduce. As in the physics literature, we define the free energy of the system by

$$\begin{aligned} p(\beta ):=\lim \limits _{N\rightarrow \infty }\frac{1}{N}\log {\hat{Z}}_{N,\beta }^{\omega }. \end{aligned}$$
(1.8)

Celebrated results like [11, Proposition 2.5] and [7, Proposition 3.1] show that the limit in (1.8) exists almost surely and

$$\begin{aligned} p(\beta )=\lim \limits _{N\rightarrow \infty }\frac{1}{N}{\mathbb {E}}[\log {\hat{Z}}_{N,\beta }^{\omega }] \end{aligned}$$
(1.9)

is non-random. By Jensen’s inequality, we have a trivial bound \(p(\beta )\le 0\). It is easy to see that if \(p(\beta )<0\), then \({\hat{Z}}_{N,\beta }^{\omega }\) decays exponentially fast and thus strong disorder holds. Therefore, we call the range of \(\beta \) with \(p(\beta )<0\) the very strong disorder regime.
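For completeness, here is the short computation behind this bound. By Jensen’s inequality, Fubini’s theorem, and the fact that for any fixed trajectory the sites \((n,S_{n})_{1\le n\le N}\) are distinct (so that the corresponding \(\omega \)’s are independent),

$$\begin{aligned} {\mathbb {E}}[\log {\hat{Z}}_{N,\beta }^{\omega }]\le \log {\mathbb {E}}[{\hat{Z}}_{N,\beta }^{\omega }]=\log {\mathbf {E}}\left[ \prod \limits _{n=1}^{N}{\mathbb {E}}\left[ e^{\beta \omega _{n,S_{n}}-\lambda (\beta )}\right] \right] =0, \end{aligned}$$

which yields \(p(\beta )\le 0\) after dividing by N and letting \(N\rightarrow \infty \).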

It has been shown in [10, Theorem 3.2] and [7, Theorem 6.1] that as \(\beta \) increases, there is a phase transition from the weak disorder regime, through the strong disorder regime, to the very strong disorder regime, which we summarize in the following.

Theorem 1.1

There exist \(0\le \beta _{1}\le \beta _{2}\le \infty \), such that weak disorder holds if and only if \(\beta \in \{0\}\cup (0,\beta _{1})\); strong disorder holds if and only if \(\beta \in (\beta _{1},\infty )\); and very strong disorder holds if and only if \(\beta \in (\beta _{2},\infty )\).

In [16, Proposition 1.13], the author showed that for the Cauchy directed polymer with \(b_{n}\equiv 0\) in (1.2), \(\beta _{1}=0\) if and only if the random walk S is recurrent. Let \({\tilde{S}}\) be an independent copy of S. Since \(S-{\tilde{S}}\) is symmetric and thus \(b_{n}\equiv 0\) in (1.2), one can easily check that the recurrence of \(S-{\tilde{S}}\) is also equivalent to \(\beta _{1}=0\) by the same method used in [16, Proposition 1.13]. When \(\beta _{1}=0\), the model is called disorder relevant, since for arbitrarily small \(\beta >0\), disorder modifies the large scale behavior of the underlying random walk.

It is conjectured that \(\beta _{1}=\beta _{2}\), i.e., the strong disorder regime coincides with the very strong disorder regime (excluding the critical \(\beta \)). So far, the conjecture has only been proved for the nearest-neighbor directed polymer on \({\mathbb {Z}}^{d+1}\) for \(d=1\) in [9] and \(d=2\) in [15], and for the long-range directed polymer with underlying random walks in the domain of attraction of an \(\alpha \)-stable law for some \(\alpha \in (1,2]\) in [16].

The main purpose of this paper is to show that for the disorder relevant Cauchy directed polymer, under some regularity assumptions on the random walk, \(\beta _{1}=\beta _{2}\), i.e., \(\beta _{2}=0\). We will present the precise results in the next subsection.

1.2 Main Results

Recall that S is the random walk defined in (1.1) and \({\tilde{S}}\) is an independent copy of S. Note that the expected local time of \(S-{\tilde{S}}\) at the origin up to time N is given by

$$\begin{aligned} D(N):=\sum \limits _{n=1}^{N}{\mathbf {P}}^{\bigotimes 2}(S_{n}={\tilde{S}}_{n})=\sum \limits _{n=1}^{N}\sum \limits _{x\in {\mathbb {Z}}}{\mathbf {P}}(S_{n}=x)^{2}, \end{aligned}$$
(1.10)

where \({\mathbf {P}}^{\bigotimes 2}\) denotes the product measure governing \((S,{\tilde{S}})\). The quantity \(D(\cdot )\) is known as the overlap, which will be crucial in our analysis.

Note that \(S-{\tilde{S}}\) is symmetric, and by [14, Chapter VIII.8, Corollary],

$$\begin{aligned} \frac{S_{n}-{\tilde{S}}_{n}}{a_{n}}\overset{d}{\rightarrow }H,\quad \text{ as }~n\rightarrow \infty , \end{aligned}$$
(1.11)

where \(a_{n}\) is the same as in (1.2) and H is some symmetric Cauchy random variable. If \((S-{\tilde{S}})/h\) is an irreducible aperiodic random walk on \({\mathbb {Z}}\), where h denotes the span of the lattice on which \(S-{\tilde{S}}\) lives, then by Gnedenko’s local limit theorem (cf. [4, Theorem 8.4.1]),

$$\begin{aligned} {\mathbf {P}}^{\bigotimes 2}(S_{n}={\tilde{S}}_{n})\sim \frac{g(0)h}{a_{n}}, \end{aligned}$$
(1.12)

where \(g(\cdot )\) is the density of H. Hence, \(S-{\tilde{S}}\) is recurrent if and only if \(\sum _{n=1}^{\infty }a_{n}^{-1}=\infty \). Therefore, for the disorder relevant Cauchy directed polymer model, the overlap D(N) tends to infinity as N tends to infinity.
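To get a feel for how slowly the overlap diverges, here is a small numerical sketch; it plays no role in the proofs, and the specific walk is an assumption chosen only for concreteness. We take \({\mathbf {P}}(S_{1}=k)={\mathbf {P}}(S_{1}=-k)=C/k^{2}\) for \(k\ge 1\) with \(C=3/\pi ^{2}\), which satisfies (1.14) with \(p=q=1/2\) and \(L(\cdot )\equiv 2C\). By (1.10) and Fourier inversion on \({\mathbb {Z}}\), \(D(N)\) is an integral of powers of the characteristic function of \(S_{1}\), which for this walk has a closed form.

```python
import numpy as np

# Illustrative sketch only (not part of the paper): the overlap D(N) for the walk
# P(S_1 = k) = P(S_1 = -k) = C/k^2, k >= 1, with C = 3/pi^2 (so the pmf sums to 1).
# By (1.10) and Fourier inversion on Z,
#   D(N) = sum_{n=1}^N P(S_n = S~_n)
#        = (1/(2*pi)) * int_{-pi}^{pi} phi^2 * (1 - phi^(2N)) / (1 - phi^2) dtheta,
# where phi is the characteristic function of S_1 (real, since the walk is symmetric).

C = 3.0 / np.pi ** 2


def phi(theta):
    """phi(theta) = 2C * sum_{k>=1} cos(k*theta)/k^2, computed via the classical
    Fourier series sum_{k>=1} cos(k*t)/k^2 = pi^2/6 - pi*t/2 + t^2/4, 0 <= t <= 2*pi."""
    t = np.abs(theta)
    return 2.0 * C * (np.pi ** 2 / 6.0 - np.pi * t / 2.0 + t ** 2 / 4.0)


def overlap(N, grid_size=400_000):
    """Approximate D(N); a log-spaced grid resolves the peak of the integrand at 0."""
    theta = np.geomspace(1e-12, np.pi, grid_size)
    f2 = phi(theta) ** 2                      # strictly below 1 for theta > 0
    integrand = f2 * (1.0 - f2 ** N) / (1.0 - f2)
    # trapezoidal rule on (0, pi], doubled by symmetry, then divided by 2*pi
    return ((integrand[1:] + integrand[:-1]) * np.diff(theta)).sum() / (2.0 * np.pi)


for N in [10, 10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6]:
    D = overlap(N)
    print(f"N = {N:>8d}   D(N) = {D:7.3f}   D(N)/log N = {D / np.log(N):.3f}")
```

Running the sketch shows \(D(N)\) growing roughly like a constant multiple of \(\log N\), in line with the recurrence criterion above.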

We mention that in [16, Proposition 3.1], the author showed that

$$\begin{aligned} \sum \limits _{n=1}^{\infty }\frac{1}{nL(n)}=\infty \Leftrightarrow \sum \limits _{n=1}^{\infty }\frac{1}{a_{n}}=\infty , \end{aligned}$$
(1.13)

where \(L(\cdot )\) was introduced in (1.1). When an explicit closed form for \(a_{n}\) is hard to deduce, (1.13) provides an alternative way of checking the recurrence of \(S-{\tilde{S}}\).
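As a simple illustration of (1.13) (with a hypothetical choice of L, not used elsewhere), if \(L(n)=(\log n)^{\gamma }\) for some \(\gamma \ge 0\) and all large n, then

$$\begin{aligned} \sum \limits _{n\ge 2}\frac{1}{nL(n)}=\sum \limits _{n\ge 2}\frac{1}{n(\log n)^{\gamma }}=\infty \quad \Longleftrightarrow \quad \gamma \le 1, \end{aligned}$$

so by (1.13), \(S-{\tilde{S}}\) is recurrent exactly when \(\gamma \le 1\), even though no explicit closed form for \(a_{n}\) is immediate.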

To prove \(\beta _{2}=\beta _{1}=0\), we need an extra assumption on the distribution of S, which is

$$\begin{aligned} \begin{aligned} {\mathbf {P}}(S_{1}=k)\sim pL(k)k^{-2},\quad \text{ as }~k\rightarrow \infty ,\\ {\mathbf {P}}(S_{1}=-k)\sim qL(k)k^{-2},\quad \text{ as }~k\rightarrow \infty . \end{aligned} \end{aligned}$$
(1.14)

By [4, Proposition 1.5.10], the stronger regularity condition (1.14) implies (1.1). The reason that we assume (1.14) is that we want better control over the local behavior of S. The following result will be used in our proof.

Theorem 1.2

[1, Theorem 2.4] Let S be a random walk that satisfies (1.14), and \(a_{n}\) and \(b_{n}\) be the constants in (1.2). Then there exist positive constants \(c_{1}\) and \(c_{2}\), such that for any \(|k|\ge a_{n}\) with \({\mathbf {P}}(S_{n}-\lfloor b_{n}\rfloor =k)\ne 0\),

$$\begin{aligned} c_{1}nL(|k|)k^{-2}\le {\mathbf {P}}(S_{n}-\lfloor b_{n}\rfloor =k)\le c_{2}nL(|k|)k^{-2}. \end{aligned}$$
(1.15)

Remark 1.1

Although only the upper bound for \({\mathbf {P}}(S_{n}-\lfloor b_{n}\rfloor =k)\) was presented in [1, Theorem 2.4], the author also showed that if \(|k|/a_{n}\rightarrow \infty \) as \(n\rightarrow \infty \), then

$$\begin{aligned} {\mathbf {P}}(S_{n}-\lfloor b_{n}\rfloor =k)\sim (p\mathbb {1}_{k>0}+q\mathbb {1}_{k<0})nL(|k|)k^{-2},\quad \text{ as }~n\rightarrow \infty . \end{aligned}$$
(1.16)

One may check that the lower bound in (1.15) can be proved by the method developed in proving (1.16). Note that by (1.1) and (1.3), \(n\sim a_{n}/L(a_{n})\), and hence

$$\begin{aligned} \frac{nL(|k|)}{k^{2}}\sim \frac{a_{n}L(|k|)}{k^{2}L(a_{n})},\quad \text{ as }~n\rightarrow \infty , \end{aligned}$$
(1.17)

which will be useful later.

Now we are ready to present our main results. Recall \(\beta _{1}\) and \(\beta _{2}\) from Theorem 1.1. Throughout the rest of this paper, we assume \(\beta _{1}=0\), i.e., the model is disorder relevant, which is equivalent to the recurrence of \(S-{\tilde{S}}\) by the discussion following Theorem 1.1.

We first show that, with some extra assumptions on the underlying random walk S, \(\beta _{2}=\beta _{1}=0\), i.e., the free energy is strictly negative as soon as \(\beta >0\).

Theorem 1.3

Let the Cauchy directed polymer model be defined as in Sect. 1.1. We assume that the underlying random walk S satisfies (1.14) and \(S-{\tilde{S}}\) is recurrent. We set

$$\begin{aligned} D^{-1}(x):=\max \{N: D(N)\le x\}. \end{aligned}$$
(1.18)

If the centering constant \(b_{n}\equiv 0\) in (1.2), then for arbitrarily small \(\epsilon >0\), there exists a \(\beta ^{(1)}>0\), such that for any \(\beta \in (0,\beta ^{(1)})\),

$$\begin{aligned} p(\beta )\le -(D^{-1}((1+\epsilon )\beta ^{-2}))^{-(1+\epsilon )}. \end{aligned}$$
(1.19)

If we drop the assumption \(b_{n}\equiv 0\), then some technical difficulties will arise. We will elaborate on this point when we prove Theorem 1.3.

We can also give a lower bound for the free energy, and the lower bound is valid under fairly general conditions.

Theorem 1.4

Let the Cauchy directed polymer model be defined as in Sect. 1.1. If \(S-{\tilde{S}}\) is recurrent, then for arbitrarily small \(\epsilon >0\), there exists a \(\beta ^{(2)}>0\), such that for \(\beta \in (0,\beta ^{(2)})\),

$$\begin{aligned} p(\beta )\ge -D^{-1}((1-\epsilon )\beta ^{-2})^{-(1-\epsilon )}. \end{aligned}$$
(1.20)

Note that in Theorem 1.4, the underlying random walk S only needs to satisfy (1.1). Neither (1.14) nor \(b_{n}\equiv 0\) is needed.

In particular, if S satisfies (1.14) with the slowly varying function \(L(\cdot )\equiv c\), then \(a_{n}\) can be chosen to be \(c(p+q)n\). Hence, \(D(N)\sim \log {N}/(c(p+q))\) and \(D^{-1}(x)\sim \exp (c(p+q)x)\). Since \(S-{\tilde{S}}\) is recurrent due to (1.13), we have:

Corollary 1.5

Let the Cauchy directed polymer model be defined as in Sect. 1.1. If the underlying random walk S satisfies (1.14) with \(L(\cdot )\equiv c\) and the centering constant \(b_{n}\equiv 0\) in (1.2), then by Theorems 1.3 and 1.4,

$$\begin{aligned} \lim \limits _{\beta \rightarrow 0}\beta ^{2}\log (-p(\beta ))=-c(p+q). \end{aligned}$$
(1.21)
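For the reader’s convenience, here is a sketch of how (1.21) follows from Theorems 1.3 and 1.4 together with \(D^{-1}(x)\sim \exp (c(p+q)x)\): for small \(\beta \),

$$\begin{aligned} -(1+\epsilon )\beta ^{2}\log D^{-1}\big ((1+\epsilon )\beta ^{-2}\big )\le \beta ^{2}\log (-p(\beta ))\le -(1-\epsilon )\beta ^{2}\log D^{-1}\big ((1-\epsilon )\beta ^{-2}\big ), \end{aligned}$$

where the lower bound comes from (1.19) and the upper bound from (1.20). Since \(\log D^{-1}(x)\sim c(p+q)x\), the two outer terms tend to \(-(1+\epsilon )^{2}c(p+q)\) and \(-(1-\epsilon )^{2}c(p+q)\) respectively as \(\beta \rightarrow 0\); letting \(\epsilon \rightarrow 0\) gives (1.21).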

1.3 Organization and Discussion

Theorem 1.3 will be proved in Sect. 2 by a now classic fractional-moment/coarse-graining/change-of-measure procedure. We will adapt the approaches developed in [2, 3].

Theorem 1.4 will be proved in Sect. 3 using a second moment computation introduced in [3] and a concentration inequality developed in [6].

Although our proof techniques are adaptations of known methods, some new and subtle arguments are needed, since a random walk in the Cauchy domain of attraction is much harder to deal with than the 2-dimensional simple random walk.

We believe that the approach in this paper can be applied to handle the 2-dimensional long-range directed polymer with stable exponent \(\alpha =2\), which is the critical case for the long-range directed polymer on \({\mathbb {Z}}^{2+1}\). With some regularity assumptions on the underlying random walk S, one can prove \(\beta _{2}=\beta _{1}=0\) if \(S-{\tilde{S}}\) is recurrent by our methods.

It does not seem likely that one can prove the upper bound (1.19) under the general condition of Theorem 1.4 by the fractional-moment/coarse-graining/change-of-measure procedure. One may have to find a totally new approach to deal with the upper bound in the general case.

2 Proof of Theorem 1.3

We start with the fractional-moment method. Recalling (1.9), for any \(\theta \in (0,1)\),

$$\begin{aligned} p(\beta )=\lim \limits _{N\rightarrow \infty }\frac{1}{N}{\mathbb {E}}[\log {\hat{Z}}_{N,\beta }^{\omega }]\le \varliminf \limits _{N\rightarrow \infty }\frac{1}{\theta N}\log {\mathbb {E}}[({\hat{Z}}_{N,\beta }^{\omega })^{\theta }] \end{aligned}$$

by Jensen’s inequality. In this proof, \(\theta \) cannot be chosen arbitrarily. In fact, we will see later that \(\theta \) should be larger than 1/2. Our strategy is then to choose a coarse-graining length \(l=l(\beta )\), write \(N=ml\), and let m tend to infinity. Along the subsequence \(N=ml\), we have

$$\begin{aligned} p(\beta )\le \varliminf \limits _{m\rightarrow \infty }\frac{1}{ml\theta }\log {\mathbb {E}}[({\hat{Z}}_{ml,\beta }^{\omega })^{\theta }]. \end{aligned}$$

If we can prove

$$\begin{aligned} {\mathbb {E}}[({\hat{Z}}_{ml,\beta }^{\omega })^{\theta }]\le 2^{-m}, \end{aligned}$$
(2.1)

then we obtain \(p(\beta )<0\). In order to further prove the upper bound (1.19) for any \(\epsilon >0\), one appropriate choice of l is

$$\begin{aligned} l=l(\beta ):=\inf \{n\in {\mathbb {N}}: D(\lfloor n^{1-\epsilon ^{2}}\rfloor )\ge (1+\epsilon )\beta ^{-2}\}. \end{aligned}$$
(2.2)

Note that D(N) tends to infinity as N tends to infinity, since \(S-{\tilde{S}}\) is recurrent. Thus, l tends to infinity as \(\beta \) tends to 0.

Now we introduce the coarse-graining method. First, we partition the real line \({\mathbb {R}}\) into blocks of size \(a_{l}\) by setting

$$\begin{aligned} I_{y}:=ya_{l}+(-a_{l}/2, a_{l}/2],\quad \forall y\in {\mathbb {Z}}, \end{aligned}$$

where \(a_{l}\) is the scaling constant in (1.2). Since \(a_{l}\) tends to infinity as l tends to infinity, we may and do take \(a_{l}\) to be an integer, so that \(ya_{l}\) is also an integer. Note that \((I_{y})_{y\in {\mathbb {Z}}}\) is a disjoint family and \(\cup _{y\in \mathbb {Z}}I_{y}=\mathbb {R}\). Next, for any \({\mathcal {Y}}=(y_{1},\ldots ,y_{m})\in {\mathbb {Z}}^{m}\), define

$$\begin{aligned} {\mathcal {T}}_{{\mathcal {Y}}}=\{S_{il}\in I_{y_{i}}, \text{ for }~1\le i\le m\}, \end{aligned}$$

and we say \({\mathcal {Y}}\) is a coarse-grained trajectory for \(S\in {\mathcal {T}}_{{\mathcal {Y}}}\). We can now decompose the partition function \({\hat{Z}}_{ml,\beta }^{\omega }\) in terms of different coarse-grained trajectories by

$$\begin{aligned} {\hat{Z}}_{ml,\beta }^{\omega }=\sum \limits _{{\mathcal {Y}}\in {\mathbb {Z}}^{m}}{\mathbf {E}}\left[ \exp \left( \sum \limits _{n=1}^{ml}(\beta \omega _{n,S_{n}}-\lambda (\beta ))\right) \mathbb {1}_{\{S\in {\mathcal {T}}_{{\mathcal {Y}}}\}}\right] :=\sum \limits _{{\mathcal {Y}}\in {\mathbb {Z}}^{m}}Z_{{\mathcal {Y}}}. \end{aligned}$$

By the inequality \((\sum _{n}a_{n})^{\theta }\le \sum _{n}a_{n}^{\theta }\) for any positive sequence \((a_{n})_{n}\) and \(\theta \in (0,1]\),

$$\begin{aligned} {\mathbb {E}}[({\hat{Z}}_{ml,\beta }^{\omega })^{\theta }]\le \sum \limits _{{\mathcal {Y}}\in {\mathbb {Z}}^{m}}{\mathbb {E}}[(Z_{{\mathcal {Y}}})^{\theta }]. \end{aligned}$$
(2.3)

Therefore, to prove (2.1), we only need to prove

Proposition 2.1

If l is sufficiently large, then uniformly in \(m\in {\mathbb {N}}\), we have

$$\begin{aligned} \sum \limits _{{\mathcal {Y}}\in {\mathbb {Z}}^{m}}{\mathbb {E}}[(Z_{{\mathcal {Y}}})^{\theta }]\le 2^{-m}. \end{aligned}$$

To prove Proposition 2.1, we need a change-of-measure argument. For any \({\mathcal {Y}}\in {\mathbb {Z}}^{m}\), we introduce a positive function \(g_{{\mathcal {Y}}}(\omega )\), which can be considered as a probability density after scaling. Then by Hölder’s inequality,

$$\begin{aligned} {\mathbb {E}}[(Z_{{\mathcal {Y}}})^{\theta }]={\mathbb {E}}[g_{{\mathcal {Y}}}^{-\theta }(g_{{\mathcal {Y}}}Z_{{\mathcal {Y}}})^{\theta }]\le \left( {\mathbb {E}}[g_{{\mathcal {Y}}}^{-\theta /(1-\theta )}]\right) ^{1-\theta }\left( {\mathbb {E}}[g_{{\mathcal {Y}}}Z_{{\mathcal {Y}}}]\right) ^{\theta }. \end{aligned}$$
(2.4)

Here \({\mathcal {M}}_{g_{{\mathcal {Y}}}}(\cdot ):={\mathbb {E}}[g_{{\mathcal {Y}}}\mathbb {1}_{(\cdot )}]\) can be considered as a new measure. We will choose \(g_{{\mathcal {Y}}}\) such that the expected value of \(Z_{{\mathcal {Y}}}\) under \({\mathcal {M}}_{g_{\mathcal {Y}}}\) is significantly smaller than that under the original measure \({\mathbb {E}}\), and the cost of change of measure, the term \({\mathbb {E}}[g_{{\mathcal {Y}}}^{-\theta /(1-\theta )}]\), is not too large.

To choose \(g_{{\mathcal {Y}}}\), we need to first introduce some notation. We can first choose an integer R (not dependent on \(\beta \)) and then define space-time blocks (with the convention \(y_{0}=0\))

$$\begin{aligned} B_{i,y_{i-1}}:=[(i-1)l+1,\ldots ,il]\times {\tilde{I}}_{y_{i-1}},~\text{ for }~i=1,\ldots ,m, \end{aligned}$$

where

$$\begin{aligned} {\tilde{I}}_{y}=ya_{l}+(-Ra_{l},Ra_{l}). \end{aligned}$$
(2.5)

Since S is in the domain of attraction of a 1-stable Lévy process, the graph of \((S_{(i-1)l+k})_{k=1}^{l}\) with \(S_{(i-1)l}\in I_{y_{i-1}}\) is contained in \(B_{i,y_{i-1}}\) with probability close to 1 when R is large enough. Therefore, it suffices to perform the change of measure on \(\omega \) in \(B=\cup _{i=1}^{m}B_{i,y_{i-1}}\). By translation invariance, it is natural to choose

$$\begin{aligned} g_{{\mathcal {Y}}}(\omega )=\prod \limits _{i=1}^{m}g_{i,y_{i-1}}(\omega ) \end{aligned}$$

such that each \(g_{i,y_{i-1}}\) depends only on \(\omega \) in \(B_{i,y_{i-1}}\).

To make \({\mathbb {E}}[g_{{\mathcal {Y}}}Z_{{\mathcal {Y}}}]\) small, we construct \(g_{{\mathcal {Y}}}\) according to the following heuristics. We first set a threshold. For any block \(B_{i,y}\), if the contribution of \(\omega \) in \(B_{i, y}\) to the partition function exceeds the threshold, then we choose \(g_{i,y}\) to be small. If the contribution of \(\omega \) in \(B_{i, y}\) to the partition function is less than the threshold, then we simply set \(g_{i,y}\) to be 1. Before we present the exact construction of \(g_{{\mathcal {Y}}}\), we need to define some auxiliary quantities, which will help us compute the contribution to \(Z_{N,\beta }^{\omega }\) from each block \(B_{i,y}\).

For arbitrarily small \(\epsilon >0\), we introduce

$$\begin{aligned} u=u(l):=\lfloor l^{1-\epsilon ^{2}}\rfloor \quad \text{ and }\quad q=q(l):=\frac{1}{\epsilon ^{2}}\max \left\{ \log \left( \sqrt{\varphi (l)}\right) ,\log D(l)\right\} , \end{aligned}$$
(2.6)

where \(\varphi (\cdot )\) is the slowly varying function in (1.5). Note that by (2.2), u and q both tend to infinity as \(\beta \) tends to 0, and the definitions of q and u ensure that

$$\begin{aligned} q\ll u\ll l\quad \text{ and }\quad 1+\epsilon \le \beta ^{2}D(u)\le 1+2\epsilon . \end{aligned}$$
(2.7)

We will use (2.7) repeatedly.
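Purely as a back-of-the-envelope illustration of these scales (with assumed constants, ignoring integer parts, and pretending that \(D(N)=\kappa \log N\) exactly, which is the order of growth in the case \(L(\cdot )\equiv c\) of Sect. 1.2), one can tabulate l, u, q and \(\beta ^{2}D(u)\) for a few values of \(\beta \); the point is the drastic separation \(q\ll u\ll l\).

```python
import math

# Back-of-the-envelope illustration of the scales in (2.2), (2.6) and (2.7); it plays
# no role in the proofs.  kappa and phi0 are assumed constants chosen purely for
# illustration, D(N) = kappa * log N is an idealization, and integer parts are ignored.

kappa, phi0, eps = 1.0, 0.6, 0.1


def D(x):
    """Idealized overlap, D(x) = kappa * log x."""
    return kappa * math.log(x)


for beta in (0.3, 0.2, 0.1):
    # (2.2): l = inf{n : D(n^(1 - eps^2)) >= (1 + eps) * beta^(-2)}, solved for log l
    log_l = (1.0 + eps) / (beta ** 2 * kappa * (1.0 - eps ** 2))
    l = math.exp(log_l)
    u = l ** (1.0 - eps ** 2)                                  # (2.6): u = l^(1 - eps^2)
    q = (1.0 / eps ** 2) * max(math.log(math.sqrt(phi0)), math.log(D(l)))
    print(f"beta = {beta:.1f}:  l ~ {l:.2e},  u ~ {u:.2e},  q ~ {q:.0f},  "
          f"beta^2 D(u) = {beta ** 2 * D(u):.3f}")
```

Because integer parts are ignored, the last column sits exactly at the lower edge \(1+\epsilon \) of (2.7); with the actual definitions it lies in \([1+\epsilon ,1+2\epsilon ]\).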

Then we define \(X(\omega )\) depending on \(\omega \) in \(B_{1,0}\) by

$$\begin{aligned} X(\omega ):=\frac{1}{\sqrt{2Rla_{l}}D(u)^{q/2}}\sum \limits _{{\underline{x}}\in ({\tilde{I}}_{0})^{q+1},{\underline{t}}\in J_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{x}})\omega _{{\underline{t}},{\underline{x}}} \end{aligned}$$
(2.8)

with

$$\begin{aligned}&{\underline{x}}:=(x_{0},\ldots ,x_{q})\quad \text{ and }\quad {\underline{t}}:=(t_{0},\ldots ,t_{q}),\\&J_{l,u}:=\{{\underline{t}}: 1\le t_{0}<\cdots <t_{q}\le l, t_{i}-t_{i-1}\le u, \forall i=1,\ldots ,q\}, \end{aligned}$$
$$\begin{aligned} {\mathbf {P}}({\underline{t}},{\underline{x}}):=\prod \limits _{i=1}^{q}{\mathbf {P}}(S_{t_{i}}-S_{t_{i-1}}=x_{i}-x_{i-1}), \end{aligned}$$
(2.9)

and

$$\begin{aligned} \omega _{{\underline{t}},{\underline{x}}}:=\prod \limits _{i=0}^{q}\omega _{t_{i},x_{i}} \end{aligned}$$

where the constant R is chosen to be the same as in (2.5).

We can regard \(X(\omega )\) as an approximation of the contribution from \(\omega \) in \(B_{1,0}\) to the normalized partition function \({\hat{Z}}_{N,\beta }^{\omega }\). It can be viewed as something like the qth order term in the Taylor expansion of \({\hat{Z}}_{N,\beta }^{\omega }\) in \(\omega \). We introduce this approximation since \(X(\omega )\) is a multilinear combination of the \(\omega _{t,x}\)’s, which is tractable, while it is hardly possible to compute with the partition function directly. One may refer to [2, Section 4.2] for more discussion concerning the choice of \(X(\omega )\).

It is not hard to check that by (1.7) and (1.10),

$$\begin{aligned} {\mathbb {E}}[X(\omega )]=0\quad \text{ and }\quad {\mathbb {E}}[(X(\omega ))^{2}]\le 1. \end{aligned}$$
(2.10)
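For the reader’s convenience, here is a sketch of this computation. Since the t-indices in (2.8) are strictly increasing, each summand of \(X(\omega )\) is a product of \(q+1\) \(\omega \)’s at distinct sites, so \({\mathbb {E}}[X(\omega )]=0\); moreover, distinct index tuples \(({\underline{t}},{\underline{x}})\) involve distinct collections of sites, so the cross terms vanish in the second moment and

$$\begin{aligned} {\mathbb {E}}[(X(\omega ))^{2}]=\frac{1}{2Rla_{l}D(u)^{q}}\sum \limits _{{\underline{x}}\in ({\tilde{I}}_{0})^{q+1},{\underline{t}}\in J_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{x}})^{2}\le \frac{|{\tilde{I}}_{0}|\, l}{2Rla_{l}D(u)^{q}}\prod \limits _{i=1}^{q}\sum \limits _{t=1}^{u}\sum \limits _{z\in {\mathbb {Z}}}{\mathbf {P}}(S_{t}=z)^{2}\le 1, \end{aligned}$$

where we sum over \(x_{0}\in {\tilde{I}}_{0}\) (at most \(2Ra_{l}\) values), over \(t_{0}\in \{1,\ldots ,l\}\), and over the increments \((t_{i}-t_{i-1}, x_{i}-x_{i-1})\) using (1.10) and the constraint \(t_{i}-t_{i-1}\le u\) in \(J_{l,u}\).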

Then, by translation invariance, for the contribution from \(\omega \) in any block \(B_{i,y}\), we can define

$$\begin{aligned} X^{(i,y)}(\omega ):=X(\theta _{l}^{i-1,y}\omega ), \end{aligned}$$
(2.11)

where \(\theta _{l}^{i-1,y}\omega _{j,x}:=\omega _{j+(i-1)l,x+ya_{l}}\) is a shift operator.

Now we can set

$$\begin{aligned} g_{i,y}(\omega ):=\exp \left( -K\mathbb {1}_{\{X^{(i,y)}(\omega )\ge \exp (K^{2})\}}\right) , \end{aligned}$$
(2.12)

where K is a fixed constant independent of any other parameter. We then have

$$\begin{aligned} {\mathbb {E}}[(g_{i,y})^{-\theta /(1-\theta )}]=1+(\exp (\theta K/(1-\theta ))-1){\mathbb {P}}(X^{(i,y)}(\omega )\ge \exp (K^{2}))\le 2 \end{aligned}$$

by Chebyshev’s inequality and (2.10) if we choose K large enough. Since \(g_{i,y_{i-1}}\) and \(g_{j,y_{j-1}}\) are defined on disjoint blocks \(B_{i,y_{i-1}}\) and \(B_{j,y_{j-1}}\) for \(i\ne j\), by independence of \(\omega \) in \(B_{i,y_{i-1}}\) and \(B_{j,y_{j-1}}\),

$$\begin{aligned} \left( {\mathbb {E}}[g_{{\mathcal {Y}}}^{-\theta /(1-\theta )}]\right) ^{1-\theta }=\left( \prod \limits _{i=1}^{m}{\mathbb {E}}[g_{i,y_{i-1}}^{-\theta /(1-\theta )}]\right) ^{1-\theta }\le 2^{m(1-\theta )}\le 2^{m}. \end{aligned}$$
(2.13)

Next, we turn to analyze \({\mathbb {E}}[g_{{\mathcal {Y}}}Z_{{\mathcal {Y}}}]\) in (2.4). We can rewrite it as

$$\begin{aligned} {\mathbb {E}}[g_{{\mathcal {Y}}}Z_{{\mathcal {Y}}}]={\mathbf {E}}\left[ {\mathbb {E}}\left[ g_{{\mathcal {Y}}}\exp \left( \sum \limits _{n=1}^{ml}(\beta \omega _{n,S_{n}}-\lambda (\beta ))\right) \right] \mathbb {1}_{\{S\in {\mathcal {T}}_{{\mathcal {Y}}}\}}\right] . \end{aligned}$$
(2.14)

For any given trajectory of S, we define a change of measure by

$$\begin{aligned} \frac{\text{ d }{\mathbb {P}}^{S}}{\text{ d }{\mathbb {P}}}(\omega ):=\exp \left( \sum \limits _{n=1}^{ml}(\beta \omega _{n,S_{n}}-\lambda (\beta ))\right) . \end{aligned}$$

We can check that \({\mathbb {P}}^{S}\) is a probability measure, and \(\omega \) remains a family of independent random variables under \({\mathbb {P}}^{S}\), but the distribution of \(\omega _{n,S_{n}}\) is exponentially tilted with

$$\begin{aligned} {\mathbb {E}}^{S}[\omega _{n,x}]=\lambda '(\beta )\mathbb {1}_{\{S_{n}=x\}}\quad \text{ and }\quad {\mathbb {V}}\text{ ar }^{S}(\omega _{n,x})=1+(\lambda ''(\beta )-1)\mathbb {1}_{\{S_{n}=x\}}. \end{aligned}$$
(2.15)

Since \(\lambda (\cdot )\) is smooth on \((-c,c)\) by (1.6), with \(\lambda '(0)={\mathbb {E}}[\omega _{n,x}]=0\) and \(\lambda ''(0)={\mathbb {E}}[(\omega _{n,x})^{2}]=1\) by (1.7), one can check that

$$\begin{aligned} \lim \limits _{\beta \rightarrow 0}\frac{\lambda '(\beta )}{\beta }=1\quad \text{ and }\quad \lim \limits _{\beta \rightarrow 0}\lambda ''(\beta )=1. \end{aligned}$$

Hence, for the \(\epsilon \) given in Theorem 1.3, when \(\beta \) is sufficiently small, we have

$$\begin{aligned} \left| \frac{\lambda '(\beta )}{\beta }-1\right| \le \epsilon ^{3}\quad \text{ and }\quad |\lambda ''(\beta )-1|\le \frac{\epsilon ^{3}}{2}. \end{aligned}$$
(2.16)

By independence of \(\omega \), (2.14) can be further rewritten as

$$\begin{aligned} {\mathbb {E}}[g_{{\mathcal {Y}}}Z_{{\mathcal {Y}}}]={\mathbf {E}}\left[ {\mathbb {E}}^{S}[g_{{\mathcal {Y}}}] \mathbb {1}_{\{S\in {\mathcal {T}}_{{\mathcal {Y}}}\}}\right] ={\mathbf {E}}\left[ \prod \limits _{i=1}^{m}{\mathbb {E}}^{S}[g_{i,y_{i-1}}]\mathbb {1}_{\{S_{il}\in I_{y_{i}}\}}\right] . \end{aligned}$$
(2.17)

Applying the Markov property by consecutively conditioning on \(S_{(m-1)l}, S_{(m-2)l}, \ldots \) and taking the maximum over \(x\in I_{y_{i-1}}\) at each step, (2.17) can be bounded above by

$$\begin{aligned} \prod \limits _{i=1}^{m}\max \limits _{x\in I_{y_{i-1}}}{\mathbf {E}}\left[ {\mathbb {E}}^{S}[g_{i,y_{i-1}}]\mathbb {1}_{\{S_{il}\in I_{y_{i}}\}}\bigg |S_{(i-1)l}=x\right] . \end{aligned}$$

Using translation invariance (2.11) and noting that \(f(y_{1},y_{2},\ldots ,y_{m})=(y_{1}, y_{2}-y_{1},\ldots , y_{m}-y_{m-1})\) is a bijection from \({\mathbb {Z}}^{m}\) to \({\mathbb {Z}}^{m}\), we sum \(({\mathbb {E}}[g_{{\mathcal {Y}}}Z_{{\mathcal {Y}}}])^{\theta }\) over \({\mathcal {Y}}\in {\mathbb {Z}}^{m}\) and then obtain

$$\begin{aligned} \sum \limits _{{\mathcal {Y}}\in {\mathbb {Z}}^{m}}\left( {\mathbb {E}}[g_{{\mathcal {Y}}}Z_{{\mathcal {Y}}}]\right) ^{\theta }\le \left( \sum \limits _{y\in {\mathbb {Z}}}\max \limits _{x\in I_{0}}\left( {\mathbf {E}}^{x}\left[ {\mathbb {E}}^{S}[g_{1,0}]\mathbb {1}_{\{S_{l}\in I_{y}\}}\right] \right) ^{\theta }\right) ^{m}, \end{aligned}$$
(2.18)

where \({\mathbf {E}}^{x}\) is the expectation with respect to \({\mathbf {P}}^{x}\), the law of the random walk S starting at x.

Remark 2.1

Here we explain why we have to assume \(b_{n}\equiv 0\) in (1.2). For the coarse-grained trajectory of S, \(S_{il}-S_{(i-1)l}-b_{l}\) should be of scale \(a_{l}\). However, if \({\mathbf {E}}[S_{1}]\) does not exist, then \(b_{n}\) may not be proportional to n. Hence, for \(k>n\),

$$\begin{aligned} (S_{kl}-b_{kl})-(S_{nl}-b_{nl})\overset{d}{=}S_{(k-n)l}-(b_{kl}-b_{nl})\overset{d}{\ne }S_{(k-n)l}-b_{(k-n)l}, \end{aligned}$$

which will cause the subsequent use of the Markov property to fail.

If \(b_{n}\) is proportional to n, then when handling the coarse-grained trajectory and defining (2.9), we can replace all \(S_{n}\) by \(S_{n}-b_{n}\) such that all of our arguments are still valid. Nevertheless, we just assume \(b_{n}\equiv 0\) for simplicity.

Now by (2.4), (2.13) and (2.18), to prove Proposition 2.1, we only need to show

Proposition 2.2

For small enough \(\beta >0\),

$$\begin{aligned} \sum \limits _{y\in {\mathbb {Z}}}\max \limits _{x\in I_{0}}\left( {\mathbf {E}}^{x}\left[ {\mathbb {E}}^{S}[g_{1,0}]\mathbb {1}_{\{S_{l}\in I_{y}\}}\right] \right) ^{\theta }\le \frac{1}{4}. \end{aligned}$$
(2.19)

To prove Proposition 2.2, we split the summation in (2.19) into two parts.

Firstly, since \(g_{1,0}\le 1\),

$$\begin{aligned} \sum \limits _{|y|\ge M}\max \limits _{x\in I_{0}}\left( {\mathbf {E}}^{x}\left[ {\mathbb {E}}^{S}[g_{1,0}]\mathbb {1}_{\{S_{l}\in I_{y}\}}\right] \right) ^{\theta }\le \sum \limits _{|y|\ge M}\max \limits _{x\in I_{0}}{\mathbf {P}}^{x}(S_{l}\in I_{y})^{\theta }. \end{aligned}$$
(2.20)

By Theorem 1.2, when M is large enough and fixed, for any \(k\ge M\) and \(j\in \{1,\ldots ,a_{l}-1\}\),

$$\begin{aligned} {\mathbf {P}}(S_{l}=ka_{l}+j)\le C\frac{a_{l}L(ka_{l}+j)}{(ka_{l}+j)^{2}L(a_{l})}\le C \frac{L(ka_{l}+j)}{k^{2}a_{l}L(a_{l})}. \end{aligned}$$

Then, by the Potter bounds (cf. [4, Theorem 1.5.6]), for any \(\gamma >0\), there exists a constant C, such that uniformly in k and j,

$$\begin{aligned} \frac{L(ka_{l}+j)}{L(a_{l})}\le Ck^{\gamma }. \end{aligned}$$

Hence, the summand in (2.20) can be uniformly bounded from above by \(Ck^{\theta (\gamma -2)}\). Therefore, when \(\gamma <1\), we can choose \(\theta \) close enough to 1 such that \(\theta (\gamma -2)<-1\), and then (2.20) can be bounded from above by 1/8 for sufficiently large M.

Next, we turn to the control of the summand in (2.19) for \(|y|\le M\). We can first apply a trivial bound

$$\begin{aligned} {\mathbf {E}}^{x}\left[ {\mathbb {E}}^{S}[g_{1,0}]\mathbb {1}_{\{S_{l}\in I_{y}\}}\right] \le {\mathbf {E}}^{x}\left[ {\mathbb {E}}^{S}[g_{1,0}]\right] . \end{aligned}$$
(2.21)

Then we want to show

Lemma 2.3

For any \(\eta >0\), we can choose K large enough in (2.12), which only depends on \(\eta \), such that for small enough \(\beta >0\), we have

$$\begin{aligned} \max \limits _{x\in I_{0}}{\mathbf {E}}^{x}\left[ {\mathbb {E}}^{S}[g_{1,0}]\right] \le \eta . \end{aligned}$$

By (2.21) and Lemma 2.3, if we choose \(\eta =(16M)^{-1/\theta }\), then

$$\begin{aligned} \sum \limits _{|y|\le M}\max \limits _{x\in I_{0}}\left( {\mathbf {E}}^{x}\left[ {\mathbb {E}}^{S}[g_{1,0}]\mathbb {1}_{\{S_{l}\in I_{y}\}}\right] \right) ^{\theta }\le \frac{1}{8}. \end{aligned}$$
(2.22)

Combining (2.22) and the upper bound for (2.20), we deduce Proposition 2.2. Therefore, it only remains to prove Lemma 2.3.

Indeed, Lemma 2.3 follows from the following two lemmas.

Lemma 2.4

For any \(\delta >0\), we can choose a large enough R in (2.5), which only depends on \(\delta \) and the \(\epsilon \) in Theorem 1.3, such that for small enough \(\beta >0\), and for any \(x\in I_{0}\), we have

$$\begin{aligned} {\mathbf {P}}^{x}({\mathbb {E}}^{S}[X]\ge (1+\epsilon ^{2})^{q})\ge 1-\delta . \end{aligned}$$

Lemma 2.5

If \(\beta \) is positive and sufficiently small, then for any trajectory S of the underlying random walk, we have

$$\begin{aligned} {\mathbb {V}}\text{ ar }^{S}(X)\le (1+\epsilon ^{3})^{q}. \end{aligned}$$

We postpone the proof of Lemmas 2.4 and 2.5, and deduce Lemma 2.3 first.

Proof of Lemma 2.3

By the definition of \(g_{1,0}\), for any trajectory S, we have the following trivial bound

$$\begin{aligned} {\mathbb {E}}^{S}[g_{1,0}]\le \exp (-K)+{\mathbb {P}}^{S}(X(\omega )\le \exp (K^{2})). \end{aligned}$$
(2.23)

By Chebyshev’s inequality,

$$\begin{aligned} {\mathbb {P}}^{S}(X(\omega )\le \exp (K^{2}))\le (\exp (K^{2})-{\mathbb {E}}^{S}[X])^{-2}{\mathbb {V}}\text{ ar }^{S}(X). \end{aligned}$$
(2.24)

We denote \(A=\{{\mathbb {E}}^{S}[X]\ge (1+\epsilon ^{2})^{q}\}\). For any \(x\in I_{0}\), by (2.24), Lemma 2.4 and Lemma 2.5, we then have

$$\begin{aligned}&{\mathbf {E}}^{x}\left[ {\mathbb {P}}^{S}(X(\omega )\le \exp (K^{2}))\right] \nonumber \\&\quad \le {\mathbf {P}}^{x}(A^{c})+{\mathbf {E}}^{x}\left[ {\mathbb {P}}^{S}(X(\omega )\le \exp (K^{2}))\mathbb {1}_{A}\right] \nonumber \\&\quad \le \delta +\frac{(1+\epsilon ^{3})^{q}}{2(1+\epsilon ^{2})^{2q}}, \end{aligned}$$
(2.25)

where we use the fact that \((1+\epsilon ^{2})^{q}-\exp (K^{2})\ge (1+\epsilon ^{2})^{q}/\sqrt{2}\) to obtain the last line, since q can be made arbitrarily large by choosing \(\beta \) close enough to 0.

Now we first take the \({\mathbf {E}}^{x}\)-expectation on both sides of (2.23). Then, we choose K large enough such that \(\exp (-K)<\eta /3\), and \(\delta \le \eta /3\) in Lemma 2.4. Next, we take \(\beta \) small enough so that the last line of (2.25) is smaller than \(2\eta /3\), which implies Lemma 2.3. \(\square \)

The proofs of Lemmas 2.4 and 2.5 involve some long and tedious computations. Hence, we put each proof in its own subsection to make the structure clearer, and we will state some intermediate steps as lemmas to clarify the proofs.

2.1 Proof of Lemma 2.4

In this subsection, we prove Lemma 2.4.

Proof of Lemma 2.4

First, we recall the definition (2.8) of X. Note that \(\omega \) is a family of independent random variables under \({\mathbb {P}}^{S}\), and by (2.15), \({\mathbb {E}}^{S}[\omega _{n,x}]=0\) if \(S_{n}\ne x\). Hence, for any trajectory of S, we have

$$\begin{aligned} \begin{aligned} {\mathbb {E}}^{S}[X]=&\frac{(\lambda '(\beta ))^{q+1}}{\sqrt{2Rla_{l}}D(u)^{q/2}}\sum \limits _{{\underline{t}}\in J_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})})\mathbb {1}_{\{S_{t_{k}}\in {\tilde{I}}_{0},\forall k\in \{0,\ldots ,q\}\}}\\ \ge&\frac{(\lambda '(\beta ))^{q+1}}{\sqrt{2Rla_{l}}D(u)^{q/2}}\sum \limits _{{\underline{t}}\in J_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})})\mathbb {1}_{\left\{ \max \limits _{1\le t\le l}|S_{t}|\le Ra_{l}\right\} }, \end{aligned} \end{aligned}$$
(2.26)

where

$$\begin{aligned} {\underline{S}}^{({\underline{t}})}:=(S_{t_{0}},\ldots ,S_{t_{q}}) \end{aligned}$$
(2.27)

and we will use notation (2.27) in what follows. We emphasize that in (2.26), the trajectory \({\underline{S}}^{({\underline{t}})}\) is substituted for the \({\underline{x}}\) in (2.9); it should not be confused with the random walk appearing inside the probabilities in (2.9).

Note that for any \(x\in I_{0}=(-a_{l}/2, a_{l}/2]\),

$$\begin{aligned} {\mathbf {P}}^{x}\left( \max \limits _{1\le t\le l}|S_{t}|>Ra_{l}\right) \le {\mathbf {P}}\left( \max \limits _{1\le t\le l}|S_{t}|>(R-1)a_{l}\right) . \end{aligned}$$
(2.28)

Since S is attracted to some 1-stable Lévy process, for any \(\delta >0\), we can choose \(R=R(\delta ,\epsilon )\) large enough such that uniformly in l, the probability in (2.28) is smaller than \(\delta /2\). In what follows, we will simply write R for \(R(\delta ,\epsilon )\).

On the event \(\{\max _{1\le t\le l}|S_{t}|\le Ra_{l}\}\), by (2.16), (2.6) and (2.7), we have

$$\begin{aligned} \begin{aligned} {\mathbb {E}}^{S}[X]&\ge \frac{\beta }{\sqrt{2R\varphi (l)}}(1-\epsilon ^{3})^{q+1}(\beta ^{2}D(u))^{q/2}\frac{1}{lD(u)^{q}}\sum \limits _{{\underline{t}}\in J_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})})\\&\ge \frac{\beta }{\sqrt{2R}}\frac{(1-\epsilon ^{3})^{q+1}(1+\epsilon )^{q/2}}{\exp (\epsilon ^{2}q)}\frac{1}{lD(u)^{q}}\sum \limits _{{\underline{t}}\in J_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})}). \end{aligned} \end{aligned}$$
(2.29)

Note that for \(\epsilon \) small enough, by (2.7),

$$\begin{aligned}&\beta \frac{(1-\epsilon ^{3})^{q+1}(1+\epsilon )^{q/2}}{(1+\epsilon ^{2})^{2q}\exp (\epsilon ^{2}q)}\ge \beta \left( 1+\frac{\epsilon }{20}\right) ^{q}\\&\quad \ge \beta \left( 1+\frac{\epsilon }{20}\right) ^{\frac{1}{\epsilon ^{2}}\log {D(l)}}\gg \beta \exp (\log {D(u)})\ge \frac{1+\epsilon }{\beta }\gg 1. \end{aligned}$$

Hence,

$$\begin{aligned} \frac{\beta }{\sqrt{2R}}\frac{(1-\epsilon ^{3})^{q+1}(1+\epsilon )^{q/2}}{\exp (\epsilon ^{2}q)}\ge (1+\epsilon ^{2})^{2q} \end{aligned}$$

and (2.29) implies that

$$\begin{aligned} {\mathbb {E}}^{S}[X]\ge (1+\epsilon ^{2})^{2q}\frac{1}{lD(u)^{q}}\sum \limits _{t\in J_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})}). \end{aligned}$$
(2.30)

Recall that the probability in (2.28) is smaller than \(\delta /2\). Since (2.30) holds on the event \(\{\max _{1\le t\le l}|S_{t}|\le Ra_{l}\}\), we have

$$\begin{aligned} {\mathbf {P}}^{x}\left( {\mathbb {E}}^{S}[X]<(1+\epsilon ^{2})^{q}\right) \le \frac{\delta }{2}+{\mathbf {P}}^{x}\left( \frac{1}{lD(u)^{q}}\sum \limits _{t\in J_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})})<\frac{1}{(1+\epsilon ^{2})^{q}}\right) .\qquad \end{aligned}$$
(2.31)

To bound the probability on the right-hand side of (2.31), we introduce a random variable

$$\begin{aligned} W_{l}=W_{l}(S):=\frac{1}{lD(u)^{q}}\sum \limits _{{\underline{t}}\in J'_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})}), \end{aligned}$$

where

$$\begin{aligned} J'_{l,u}=\{{\underline{t}}\in J_{l,u}: 1\le t_{0}\le l/2\}. \end{aligned}$$

Since \(J'_{l,u}\subset J_{l,u}\), it suffices to prove

$$\begin{aligned} {\mathbf {P}}^{x}\left( W_{l}<\frac{1}{(1+\epsilon ^{2})^{q}}\right) \le \frac{\delta }{2}. \end{aligned}$$
(2.32)

Note that by the definition of \({\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})})\), the law of \(W_{l}\) does not depend on the starting point \(S_{0}=x\). Hence, during the rest of the proof, we simply write \({\mathbf {P}}\) instead of \({\mathbf {P}}^{x}\) for short. Our strategy to prove (2.32) is to show that the mean of \(W_{l}\) is 1/2 and that the variance of \(W_{l}\) can be controlled.

First, recalling the definitions of l, u, and q, when \(\beta \) is small enough, \(l/2+qu<l\). Since the value of \({\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})})\) does not depend on \(S_{t_{0}}\), we have

$$\begin{aligned} {\mathbf {E}}\left[ \sum \limits _{\{t\in J'_{l,u}\}}{\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})})\right]= & {} \frac{l}{2}\sum \limits _{\{t\in J'_{l,u},t_{0}=1\}}{\mathbf {E}}\left[ {\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})})\right] \nonumber \\= & {} \frac{l}{2}\left( \sum \limits _{t=1}^{u}\sum \limits _{x\in {\mathbb {Z}}}{\mathbf {P}}(S_{t}=x)^{2}\right) ^{q}=\frac{l}{2}D(u)^{q}. \end{aligned}$$
(2.33)

Therefore, \({\mathbf {E}}[W_{l}]=1/2\). By Chebyshev’s inequality, we have:

$$\begin{aligned} {\mathbf {P}}\left( W_{l}-{\mathbf {E}}[W_{l}]<\frac{1}{(1+\epsilon ^{2})^{q}}-{\mathbf {E}}[W_{l}]\right) \le 4{\mathbf {V}}\text{ ar }(W_{l}), \end{aligned}$$
(2.34)

It remains to control the variance of \(W_{l}\). We define

$$\begin{aligned} Y_{j}=\frac{1}{D(u)^{q}}\sum \limits _{{\underline{t}}\in J'_{l,u}(j)}{\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})})-1, \end{aligned}$$

where \(J'_{l,u}(j)=\{{\underline{t}}\in J'_{l,u}: t_{0}=j\}\). It is obvious that \(W_{l}-{\mathbf {E}}[W_{l}]=\left( \sum _{j=1}^{l/2}Y_{j}\right) /l\) and \({\mathbf {E}}[Y_{j}]=0\) by (2.33). Then we have

$$\begin{aligned} {\mathbf {V}}\text{ ar }(W_{l})=\frac{1}{l^{2}}\sum \limits _{j_{1},j_{2}=1}^{\frac{l}{2}}{\mathbf {E}}[Y_{j_{1}}Y_{j_{2}}]. \end{aligned}$$
(2.35)

By Gnedenko’s local limit theorem (cf. [4, Theorem 8.4.1]), there exists a constant \(C_{1}\), such that for any \(t\ge 1\) and \(x\in {\mathbb {Z}}\),

$$\begin{aligned} {\mathbf {P}}(S_{t}=x)\le \frac{C_{1}}{a_{t}}. \end{aligned}$$
(2.36)

Hence, by (2.36), (1.12) and (1.10),

$$\begin{aligned} Y_{j}\le \frac{1}{D(u)^{q}}\sum \limits _{{\underline{t}}\in J'_{l,u}(j)}{\mathbf {P}}({\underline{t}},{\underline{S}}^{({\underline{t}})})\le \frac{1}{D(u)^{q}}\left( \sum \limits _{t=1}^{u}\frac{C_{1}}{a_{t}}\right) ^{q}\le (C_{2})^{q}. \end{aligned}$$
(2.37)

Next, we will show that most summands in (2.35) are zero. Note that for \(j\in \{1,\ldots , l/2\}\) and any \({\underline{t}}\in J'_{l,u}(j)\), we have \(t_{q}-t_{0}\le qu\). If we denote the increments of S by \((Z_{n})_{n\ge 1}\), then \(Y_{j}\) only depends on \((Z_{j+1},\ldots ,Z_{j+qu})\). Therefore, for \(|j_{1}-j_{2}|>qu\), \(Y_{j_{1}}\) and \(Y_{j_{2}}\) are independent and \({\mathbf {E}}[Y_{j_{1}}Y_{j_{2}}]={\mathbf {E}}[Y_{j_{1}}]{\mathbf {E}}[Y_{j_{2}}]=0\). By (2.37),

$$\begin{aligned} {\mathbf {V}}\text{ ar }(W_{l})\le \frac{qu}{l}(C_{2})^{2q}\le q(C_{2})^{2q}l^{-\epsilon ^{2}}. \end{aligned}$$

Then the right-hand side of (2.34) is bounded above by \((C_{3})^{q}l^{-\epsilon ^{2}}\), which tends to 0 as \(\beta \) tends to 0 by the definitions of q and l. This proves (2.32) and completes the proof of Lemma 2.4. \(\square \)

2.2 Proof of Lemma 2.5

In this subsection, we prove Lemma 2.5. We will use C to denote a generic constant in the proof, whose value may change from line to line.

Proof of Lemma 2.5

For any trajectory of S, we shift the environment by

$$\begin{aligned} {\hat{\omega }}_{n,x}:=\omega _{n,x}-\lambda '(\beta )\mathbb {1}_{\{S_{n}=x\}}. \end{aligned}$$
(2.38)

It is not hard to check that under \({\mathbb {P}}^{S}\), \({\hat{\omega }}\) is a family of independent random variables with mean 0. Besides, when \(\beta \) is small enough, by (2.15) and (2.16), the variance of \({\hat{\omega }}_{n,x}\) can be bounded by \(1+(\epsilon ^{3}/2)\).

To bound \({\mathbb {V}}\text{ ar }^{S}(X)\), we start by observing that

$$\begin{aligned} {\mathbb {E}}^{S}[X^{2}]=\frac{1}{2Rla_{l}D(u)^{q}}{\mathbb {E}}^{S}\left[ \left( \sum \limits _{{\underline{x}}\in ({\tilde{I}}_{0})^{q+1},{\underline{t}}\in J_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{x}})\prod \limits _{j=0}^{q}\left( {\hat{\omega }}_{t_{j},x_{j}}+\lambda '(\beta )\mathbb {1}_{\{S_{t_{j}}=x_{j}\}}\right) \right) ^{2}\right] .\nonumber \\ \end{aligned}$$
(2.39)

A simple expansion shows that

$$\begin{aligned} \prod \limits _{j=0}^{q}\left( {\hat{\omega }}_{t_{j},x_{j}}+\lambda '(\beta )\mathbb {1}_{\{S_{t_{j}}=x_{j}\}}\right) =\sum \limits _{r=0}^{q+1}(\lambda '(\beta ))^{r}\sum \limits _{\begin{array}{c} A\subset \{0,\ldots ,q\}\\ |A|=r \end{array}}\prod \limits _{k\in A}\mathbb {1}_{\{S_{t_{k}}=x_{k}\}}\prod \limits _{j\in \{0,\ldots ,q\}\backslash A}{\hat{\omega }}_{t_{j},x_{j}}. \end{aligned}$$

Therefore, the square term in \({\mathbb {E}}^{S}\) in (2.39) is the summation over \({\underline{x}},{\underline{x}}'\in ({\tilde{I}}_{0})^{q+1}, {\underline{t}},{\underline{t}}'\in J_{l,u}\) of \({\mathbf {P}}({\underline{t}},{\underline{x}}){\mathbf {P}}({\underline{t}}',{\underline{x}}')\) times

$$\begin{aligned} \sum \limits _{r=0}^{q+1}\sum \limits _{r'=0}^{q+1}(\lambda '(\beta ))^{r+r'}\sum \limits _{\begin{array}{c} A\subset \{0,\ldots ,q\},|A|=r\\ B\subset \{0,\ldots ,q\},|B|=r' \end{array}}\prod \limits _{\begin{array}{c} k\in A\\ k'\in B \end{array}}\mathbb {1}_{\{S_{t_{k}}=x_{k}\}}\mathbb {1}_{\{S_{t'_{k'}}=x'_{k'}\}}\prod \limits _{\begin{array}{c} j\in \{0,\ldots ,q\}\backslash A\\ j'\in \{0,\ldots ,q\}\backslash B \end{array}}{\hat{\omega }}_{t_{j},x_{j}}{\hat{\omega }}_{t'_{j'},x'_{j'}}.\nonumber \\ \end{aligned}$$
(2.40)

Note that \({\hat{\omega }}\) is a family of independent and mean-zero random variables under \({\mathbb {P}}^S\). When taking \({\mathbb {P}}^{S}\)-expectation in (2.40), the summand is nonzero if and only if \(r=r'\) and

$$\begin{aligned} \{(t_{j},x_{j})|~j\in \{0,\ldots ,q\}\backslash A\}=\{(t'_{j'},x'_{j'})|~j'\in \{0,\ldots ,q\}\backslash B\}. \end{aligned}$$

Hence, to compute the \({\mathbb {P}}^{S}\)-expectation of (2.40), we can first fix \((t_{j},x_{j})\) for \(j\in \{0, \ldots , q\}\backslash A\), and then define a set of \((q-r+1)\)-tuples:

$$\begin{aligned} {\mathcal {S}}_{q-r}:=\{{\underline{s}}:=(s_{0},\ldots ,s_{q-r}): 1\le s_{0}<\cdots <s_{q-r}\le l, s_{q-r}-s_{0}\le qu\}. \end{aligned}$$

For any given \({\underline{s}}\in {\mathcal {S}}_{q-r}\), we further define a related set of r-tuples:

$$\begin{aligned} {\mathcal {T}}_{r}({\underline{s}}):=\{{\underline{t}}=(t_{1},\ldots ,t_{r}):~1\le t_{1}<\cdots <t_{r}\le l, {\underline{s}}\cdot {\underline{t}}\in J_{l,u}\}, \end{aligned}$$

where \({\underline{s}}\cdot {\underline{t}}\) is a \((q+1)\)-tuple, which contains all the entries of \({\underline{s}}\) and \({\underline{t}}\) and the entries are ordered from the smallest to the largest.

Now we can derive a more explicit bound for \({\mathbb {V}}\text{ ar }^{S}(X)\). Note that the \({\mathbb {P}}^{S}\)-expectation of the term \(r=r'=q+1\) in (2.40) is exactly the term \({\mathbb {E}}^{S}[X]^{2}\), so we can subtract it on both sides of (2.39), and by recalling \({\mathbb {E}}^{S}[({\hat{\omega }}_{n,x})^{2}]\le (1+\epsilon ^{3}/2)\le 2\) (see the discussion after (2.38)), we obtain

$$\begin{aligned} {\mathbb {V}}\text{ ar }^{S}(X)&\le \frac{(1+\epsilon ^{3}/2)^{q+1}}{2Rla_{l}D(u)^{q}}\sum \limits _{{\underline{x}}\in ({\tilde{I}}_{0})^{q+1},{\underline{t}}\in J_{l,u}}{\mathbf {P}}({\underline{t}},{\underline{x}})^{2}\nonumber \\&\quad +\frac{1}{2Rla_{l}D(u)^{q}}\sum \limits _{r=1}^{q}(\lambda '(\beta ))^{2r}2^{q+1-r}\nonumber \\&\quad \times \sum \limits _{{\underline{s}}\in {\mathcal {S}}_{q-r}}\sum \limits _{{\underline{x}}\in ({\tilde{I}}_{0})^{q+1-r}} \sum \limits _{{\underline{t}},{\underline{t}}'\in {\mathcal {T}}_{r}({\underline{s}})}{\mathbf {P}}(({\underline{s}}\cdot {\underline{t}}),({\underline{x}},{\underline{S}}^{({\underline{t}})})){\mathbf {P}}(({\underline{s}}\cdot {\underline{t}}'),({\underline{x}},{\underline{S}}^{({\underline{t}}')})),\qquad \end{aligned}$$
(2.41)

where the first term on the right-hand side of (2.41) corresponds to \(r=0\), and it is actually equal to \((1+\epsilon ^{3}/2)^{q+1}{\mathbb {E}}[X^{2}]\) and bounded above by \((1+\epsilon ^{3}/2)^{q+1}\). For the \((q+1)\)-tuple \(({\underline{x}},{\underline{S}}^{({\underline{t}})})\) in the last summation, its ith element is \(x_{j}\) if and only if the ith element in \({\underline{s}}\cdot {\underline{t}}\) is \(s_{j}\), while it is \(S_{t_{j}}\) if and only if the ith element in \({\underline{s}}\cdot {\underline{t}}\) is \(t_{j}\).

Finally, we will bound

$$\begin{aligned}&\sum \limits _{{\underline{s}}\in {\mathcal {S}}_{q-r}}\sum \limits _{{\underline{x}}\in ({\tilde{I}}_{0})^{q+1-r}} \sum \limits _{{\underline{t}},{\underline{t}}'\in {\mathcal {T}}_{r}({\underline{s}})}{\mathbf {P}}(({\underline{s}}\cdot {\underline{t}}),({\underline{x}},{\underline{S}}^{({\underline{t}})})){\mathbf {P}}(({\underline{s}}\cdot {\underline{t}}'),({\underline{x}},{\underline{S}}^{({\underline{t}}')}))\nonumber \\&\quad =\sum \limits _{{\underline{s}}\in {\mathcal {S}}_{q-r}}\sum \limits _{{\underline{x}}\in ({\tilde{I}}_{0})^{q+1-r}}\left( \sum \limits _{{\underline{t}}\in {\mathcal {T}}_{r}({\underline{s}})}{\mathbf {P}}(({\underline{s}}\cdot {\underline{t}}),({\underline{x}},{\underline{S}}^{({\underline{t}})}))\right) ^{2}, \end{aligned}$$
(2.42)

which is the most complicated part of the proof.

First, let us denote \(s_{-1}:=0\) and \(s_{q-r+1}:=l\). We can split the summation \(\sum _{{\underline{t}}\in {\mathcal {T}}_{r}({\underline{s}})}{\mathbf {P}}(({\underline{s}}\cdot {\underline{t}}),({\underline{x}},{\underline{S}}^{({\underline{t}})}))\) according to the position of \(t_{1}\). We have

$$\begin{aligned} \sum \limits _{{\underline{t}}\in {\mathcal {T}}_{r}({\underline{s}})}{\mathbf {P}}(({\underline{s}}\cdot {\underline{t}}),({\underline{x}},{\underline{S}}^{({\underline{t}})}))=\sum \limits _{k=0}^{q+1-r}\sum \limits _{{\underline{t}}\in {\mathcal {T}}_{r}({\underline{s}}),t_{1}\in (s_{k-1},s_{k})}{\mathbf {P}}(({\underline{s}}\cdot {\underline{t}}),({\underline{x}},{\underline{S}}^{({\underline{t}})})). \end{aligned}$$

We observe that

$$\begin{aligned}&\sum \limits _{{\underline{t}}\in {\mathcal {T}}_{r}({\underline{s}}),t_{1}\in (s_{k-1},s_{k})}{\mathbf {P}}(({\underline{s}}\cdot {\underline{t}}),({\underline{x}},{\underline{S}}^{({\underline{t}})}))\le \sum \limits _{\begin{array}{c} 0= m_{0}=\cdots =m_{k-1}<\\ m_{k}\le m_{k+1}\le \cdots \le m_{q-r}\le r \end{array}}\mathbb {1}_{{\underline{t}}\in {\mathcal {T}}_{r}({\underline{s}})}\nonumber \\&\quad \times \prod \limits _{i=1}^{q-r}\sum \limits _{s_{i-1}<t_{m_{i-1}+1}<\cdots<t_{m_{i}}<s_{i}}{\mathbf {P}}((s_{i-1},t_{m_{i-1}+1},\ldots ,t_{m_{i}},s_{i}),(x_{i-1},S_{t_{m_{i-1}+1}},\ldots ,S_{t_{m_{i}}},x_{i}))\nonumber \\&\quad \times \sum \limits _{0<t_{1}<\cdots<t_{m_{0}}<s_{0}}{\mathbf {P}}((t_{1},\ldots ,t_{m_{0}},s_{0}),(S_{t_{1}},\ldots ,S_{t_{m_{0}}},x_{0}))\nonumber \\&\quad \times \sum \limits _{s_{q-r}<t_{m_{q-r}+1}<\cdots<t_{r}<l}{\mathbf {P}}((s_{q-r},t_{m_{q-r}+1},\ldots ,t_{r}),(x_{q-r},S_{t_{m_{q-r}+1}},\ldots ,S_{t_{r}})). \end{aligned}$$
(2.43)

Here \(m_{i}\) denotes the number of t-indices before \(s_{i}\). If \(m_{0}=0\), then the third line of (2.43) is simply 1, and so is the fourth line of (2.43) if \(m_{q-r}=r\).

We can bound the factor in the second line of (2.43) for any \(i\in \{1,\ldots ,q-r\}\) according to the following lemma.

Lemma 2.6

There exists a constant C, such that for any \(j\in {\mathbb {N}}\), any \(x\in {\mathbb {Z}}\), and any \((z_{i})_{i=1}^{j}\in {\mathbb {Z}}^{j}\),

$$\begin{aligned} \sum \limits _{\begin{array}{c} 0<t_{1}<\cdots<t_{j}<s\\ |t_{i}-t_{i-1}|\le u, i=1,\ldots ,j \end{array}}{\mathbf {P}}((0,t_{1},\ldots ,t_{j},s),(0,z_{1},\ldots ,z_{j},x))\le (CD(u))^{j}p_{s}(0,x),\qquad \end{aligned}$$
(2.44)

where \(t_{0}:=0\) by convention and we use the notation

$$\begin{aligned} p_{t}(x,y)={\mathbf {P}}(S_{t}=y-x) \end{aligned}$$

for any \(t\ge 1\) and \(y,x\in {\mathbb {Z}}\).

Proof of Lemma 2.6

Recall the definition (2.9) for \({\mathbf {P}}({\underline{t}},{\underline{x}})\) and note that the product of the first two factors of \({\mathbf {P}}((0,t_{1},\ldots ,t_{j},s),(0,z_{1},\ldots ,z_{j},x))\) is

$$\begin{aligned} {\mathbf {P}}(S_{t_{1}}=z_{1}){\mathbf {P}}(S_{t_{2}}-S_{t_{1}}=z_{2}-z_{1}). \end{aligned}$$
(2.45)

We now show an upper bound for (2.45) when it is non-zero. By Gnedenko’s local limit theorem (cf. [4, Theorem 8.4.1]), there exists a constant C, such that for all \(t\in {\mathbb {N}}\) and any \(|x|\le 2a_{t}\) with \({\mathbf {P}}(S_{t}=x)\ne 0\),

$$\begin{aligned} {\mathbf {P}}(S_{t}=x)\ge \frac{C}{a_{t}}. \end{aligned}$$
(2.46)

When \(|z_{2}|\le 2a_{t_{2}}\), by (2.36) and (2.46), we have

$$\begin{aligned} \frac{{\mathbf {P}}(S_{t_{1}}=z_{1}){\mathbf {P}}(S_{t_{2}}-S_{t_{1}}=z_{2}-z_{1})}{{\mathbf {P}}(S_{t_{2}}=z_{2})}\le C\frac{a_{t_{2}}}{a_{t_{1}}a_{t_{2}-t_{1}}}=C\frac{t_{2}\varphi (t_{2})}{t_{1}\varphi (t_{1})(t_{2}-t_{1})\varphi (t_{2}-t_{1})}.\nonumber \\ \end{aligned}$$
(2.47)

Suppose \(t_{1}\ge t_{2}-t_{1}\) (the case \(t_{1}<t_{2}-t_{1}\) is symmetric). Then \(t_{2}/t_{1}\le 2\). By the Potter bounds (cf. [4, Theorem 1.5.6]),

$$\begin{aligned} \frac{{\mathbf {P}}(S_{t_{1}}=z_{1}){\mathbf {P}}(S_{t_{2}}-S_{t_{1}}=z_{2}-z_{1})}{{\mathbf {P}}(S_{t_{2}}=z_{2})}\le \frac{C}{a_{t_{1}}\wedge a_{t_{2}-t_{1}}}. \end{aligned}$$
(2.48)

When \(|z_{2}|\ge 2a_{t_{2}}\), by (1.15),

$$\begin{aligned} {\mathbf {P}}(S_{t_{2}}=z_{2})\ge Ct_{2}L(|z_{2}|)/(z_{2})^{2}. \end{aligned}$$
(2.49)

Suppose \(|z_{1}|\ge |z_{2}-z_{1}|\) (the other case is symmetric). Then \(|z_{1}|\ge a_{t_{2}}\ge a_{t_{1}}\). We can apply the upper bound in (1.15) to \({\mathbf {P}}(S_{t_{1}}=z_{1})\) and apply (2.36) to \({\mathbf {P}}(S_{t_{2}-t_{1}}=z_{2}-z_{1})\), and then by (2.49), we have

$$\begin{aligned} \frac{{\mathbf {P}}(S_{t_{1}}=z_{1}){\mathbf {P}}(S_{t_{2}}-S_{t_{1}}=z_{2}-z_{1})}{{\mathbf {P}}(S_{t_{2}}=z_{2})}\le \frac{t_{1}(z_{2})^{2}L(|z_{1}|)}{t_{2}(z_{1})^{2}L(|z_{2}|)}\frac{C}{a_{t_{2}-t_{1}}}. \end{aligned}$$

Since \(t_{1}/t_{2}\le 1\) and \(|z_{2}|/|z_{1}|\le 2\), by the Potter bounds (cf. [4, Theorem 1.5.6]), (2.48) also holds, i.e., we have established (2.48) for any \(z_{2}\in {\mathbb {Z}}\).

Then, by (2.48), (1.12) and (1.10), we have

$$\begin{aligned}&\sum \limits _{\begin{array}{c} 0<t_{1}<\cdots<t_{j}<s\\ |t_{i}-t_{i-1}|\le u, i=1,\ldots ,j \end{array}}{\mathbf {P}}((0,t_{1},\ldots ,t_{j},s),(0,z_{1},\ldots ,z_{j},x))\\&\quad \le \sum \limits _{\begin{array}{c} 0<t_{1}<\cdots<t_{j}<s\\ |t_{i}-t_{i-1}|\le u, i=1,\ldots ,j \end{array}}\frac{C}{a_{t_{1}}\wedge a_{t_{2}-t_{1}}}{\mathbf {P}}((0,t_{2},\ldots ,t_{j},s),(0,z_{2},\ldots ,z_{j},x))\\&\quad \le \sum \limits _{0<t_{1}<2u}\frac{C}{a_{t_{1}}\wedge a_{2u-t_{1}}}\sum \limits _{\begin{array}{c} 0<t_{2}<\cdots<t_{j}<s\\ |t_{i}-t_{i-1}|\le u, i=2,\ldots ,j, t_{1}:=0 \end{array}}{\mathbf {P}}((0,t_{2},\ldots ,t_{j},s),(0,z_{2},\ldots ,z_{j},x))\\&\quad \le 2CD(u)\sum \limits _{\begin{array}{c} 0<t_{2}<\cdots<t_{j}<s\\ |t_{i}-t_{i-1}|\le u, i=2,\ldots ,j,t_{1}:=0 \end{array}}{\mathbf {P}}((0,t_{2},\ldots ,t_{j},s),(0,z_{2},\ldots ,z_{j},x)). \end{aligned}$$

By induction, we then prove (2.44). \(\square \)

The case \(r=q\) in (2.41) will be dealt with later. For \(1\le r\le q-1\) in (2.41), i.e., \(|{\underline{s}}|\ge 2\), we apply Lemma 2.6 to all terms in (2.43) with t-indices larger than \(s_{k}\) to obtain an upper bound

$$\begin{aligned}&\mathbb {1}_{{\underline{t}}\in {\mathcal {T}}_{r}({\underline{s}})}\sum \limits _{\begin{array}{c} 0= m_{0}=\cdots =m_{k-1}<\\ m_{k}\le m_{k+1}\le \cdots \le m_{q-r}\le r \end{array}}\left( \sum \limits _{0<t_{1}<\cdots<t_{m_{0}}<s_{0}}{\mathbf {P}}((t_{1},\ldots ,t_{m_{0}},s_{0}),(S_{t_{1}},\ldots ,S_{t_{m_{0}}},x_{0}))\right) \nonumber \\&\qquad \qquad \times \prod \limits _{i=1}^{k-1}p_{s_{i}-s_{i-1}}(x_{i-1},x_{i})\nonumber \\&\qquad \qquad \times \sum \limits _{s_{k-1}<t_{m_{k-1}+1}<\cdots<t_{m_{k}}<s_{k}}{\mathbf {P}}((s_{k-1},t_{m_{k-1}+1},\ldots ,t_{m_{k}},s_{k}),(x_{k-1},S_{t_{m_{k-1}+1}},\ldots ,S_{t_{m_{k}}},x_{k}))\nonumber \\&\qquad \qquad \times (CD(u))^{m_{q-r}-m_{k}}\prod \limits _{i=k+1}^{q-r}p_{s_{i}-s_{i-1}}(x_{i-1},x_{i})\nonumber \\&\qquad \qquad \times \sum \limits _{s_{q-r}<t_{m_{q-r}+1}<\cdots<t_{r}<l}{\mathbf {P}}((s_{q-r},t_{m_{q-r}+1},\ldots ,t_{r}),(x_{q-r},S_{t_{m_{q-r}+1}},\ldots ,S_{t_{r}})). \end{aligned}$$
(2.50)

Recall that the factor in the first line of (2.50) is 1 if \(m_{0}=0\), and note that if \(m_{q-r}<r\), i.e., if some t-indices are larger than \(s_{q-r}\), we should further bound the last line in (2.50) from above by \((CD(u))^{r-m_{q-r}}\), which is due to (1.12) and (1.10).

Note that the number of possible interlacements of \(0\le m_{0}\le \cdots \le m_{q-r}\le r\) is not larger than \(2^{q}\). Hence, according to the value of k, (2.50) can be bounded above by

$$\begin{aligned} J_{0}&=2^{q}\mathbb {1}_{{\underline{t}}\in {\mathcal {T}}_{r}({\underline{s}})}\sum \limits _{m_{k}=1}^{r}(CD(u))^{r-m_{k}}\nonumber \\&\quad \times \sum \limits _{0<t_{1}<\cdots<t_{m_{0}}<s_{0}}{\mathbf {P}}((t_{1},\ldots ,t_{m_{0}},s_{0}),(S_{t_{1}},\ldots ,S_{t_{m_{0}}},x_{0}))\prod \limits _{i=1}^{q-r}p_{s_{i}-s_{i-1}}(x_{i-1},x_{i})\nonumber \\ \end{aligned}$$
(2.51)

if \(k=0\);

$$\begin{aligned} J_{k}&=2^{q}\mathbb {1}_{{\underline{t}}\in {\mathcal {T}}_{r}({\underline{s}})}\sum \limits _{m_{k}=1}^{r}(CD(u))^{r-m_{k}}\prod \limits _{i=1}^{k-1}p_{s_{i}-s_{i-1}}(x_{i-1},x_{i})\nonumber \\&\quad \times \sum \limits _{s_{k-1}<t_{1}<\cdots<t_{m_{k}}<s_{k}}{\mathbf {P}}((s_{k-1},t_{1},\ldots ,t_{m_{k}},s_{k}),(x_{k-1},S_{t_{1}},\ldots ,S_{t_{m_{k}}},x_{k}))\nonumber \\&\quad \times \prod \limits _{i=k+1}^{q-r}p_{s_{i}-s_{i-1}}(x_{i-1},x_{i}) \end{aligned}$$
(2.52)

if \(1\le k\le q-r\); and

$$\begin{aligned}&J_{q+1-r}=2^{q}\mathbb {1}_{{\underline{t}}\in {\mathcal {T}}_{r}({\underline{s}})}\sum \limits _{m_{k}=1}^{r}\prod \limits _{i=1}^{q-r}p_{s_{i}-s_{i-1}}(x_{i-1},x_{i})\nonumber \\&\quad \times \sum \limits _{s_{q-r}<t_{1}<\cdots <t_{r}\le l}{\mathbf {P}}((s_{q-r},t_{1},\ldots ,t_{r}),(x_{q-r},S_{t_{1}},\ldots ,S_{t_{r}})) \end{aligned}$$
(2.53)

if \(k=q+1-r\).

Now we can expand the square term in (2.42) and then bound (2.42) from above by

$$\begin{aligned} \sum \limits _{{\underline{s}}\in {\mathcal {S}}_{q-r}}\sum \limits _{{\underline{x}}\in ({\tilde{I}}_{0})^{q+1-r}}\sum \limits _{k,k'=0}^{q+1-r}\sum \limits _{m_{k},m'_{k'}=1}^{r}J_{k}J_{k'}, \end{aligned}$$

where the expressions for \(J_{k}\) and \(J_{k'}\) are given by (2.51), (2.52), or (2.53). We will use different summation strategies to bound

$$\begin{aligned} \sum \limits _{{\underline{s}}\in {\mathcal {S}}_{q-r}}\sum \limits _{{\underline{x}}\in ({\tilde{I}}_{0})^{q+1-r}}\sum \limits _{m_{k},m'_{k'}=1}^{r}J_{k}J_{k'} \end{aligned}$$
(2.54)

for different k and \(k'\). There are two basic cases:

$$\begin{aligned} \begin{aligned}&\text{ Case } \text{ A: }~k=k',\\&\text{ Case } \text{ B: }~k\ne k', \end{aligned} \end{aligned}$$

and we start by bounding Case A: \(k=k'\).

According to the common value of \(k=k'\), there are three sub-cases of Case A:

$$\begin{aligned} \begin{aligned}&\text{ Case } \text{ A1: }~k=k'=0,\\&\text{ Case } \text{ A2: }~k=k'=q+1-r,\\&\text{ Case } \text{ A3: }~1\le k=k'\le q-r. \end{aligned} \end{aligned}$$

Case A1: \(k=k'=0\):

If \(k=k'=0\) in (2.54), then we can first fix the position of \(s_{0}\), which has at most l choices. Note that we have the term \(\prod _{i=1}^{q-r}\sum _{(x_{1},\ldots ,x_{q-r})\in ({\tilde{I}}_{0})^{q-r}}(p_{s_{i}-s_{i-1}}(x_{i-1},x_{i}))^{2}\). Hence, for any \(x_{0}\), we can sum over \(s_{1},\ldots ,s_{q-r}\) and \(x_{1},\ldots ,x_{q-r}\) by (1.10), which gives \(\prod _{i=1}^{q-r}D(s_{i}-s_{i-1})\). By Potter bounds [4, Theorem 1.5.6],

$$\begin{aligned} \prod \limits _{i=1}^{q-r}D(s_{i}-s_{i-1})\le (CD(u))^{q-r}\prod \limits _{i=1}^{q-r}\frac{s_{i}-s_{i-1}}{u}. \end{aligned}$$
(2.55)

Since \(s_{q-r}\le qu\), \(\prod _{i=1}^{q-r}((s_{i}-s_{i-1})/u)\le (q/(q-r))^{q-r}\). Hence, (2.55) is bounded above by \(C^{q}D(u)^{q-r}\).
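This product bound is just the AM–GM inequality applied to the increments \(s_{i}-s_{i-1}\), whose sum is at most qu; for the reader's convenience, here is the one-line computation:

$$\begin{aligned} \prod \limits _{i=1}^{q-r}\frac{s_{i}-s_{i-1}}{u}\le \left( \frac{1}{q-r}\sum \limits _{i=1}^{q-r}\frac{s_{i}-s_{i-1}}{u}\right) ^{q-r}\le \left( \frac{q}{q-r}\right) ^{q-r}=\left( 1+\frac{r}{q-r}\right) ^{q-r}\le e^{r}\le e^{q}, \end{aligned}$$

so the right-hand side of (2.55) is at most \((eCD(u))^{q-r}\), which is absorbed into \(C^{q}D(u)^{q-r}\) after enlarging C.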

Next, we use the trivial bound

$$\begin{aligned} \sum \limits _{0<t'_{1}<\cdots<t'_{m'_{0}}<s_{0}}{\mathbf {P}}((t'_{1},\ldots ,t'_{m'_{0}},s_{0}),(S_{t'_{1}},\ldots ,S_{t'_{m'_{0}}},x_{0}))\le (CD(u))^{m'_{0}} \end{aligned}$$

and then sum over \(s_{0}-t_{m_{0}}\) and \(x_{0}\) by

$$\begin{aligned} \sum \limits _{s_{0}-t_{m_{0}}=1}^{u}\sum \limits _{x_{0}\in {\tilde{I}}_{0}}p_{s_{0}-t_{m_{0}}}(S_{t_{m_{0}}},x_{0})\le \sum \limits _{t=1}^{u}1=u. \end{aligned}$$

Finally, we use the trivial bound

$$\begin{aligned} \sum \limits _{0<t_{1}<\cdots <t_{m_{0}}}{\mathbf {P}}((t_{1},\ldots ,t_{m_{0}}),(S_{t_{1}},\ldots ,S_{t_{m_{0}}}))\le (CD(u))^{m_{0}-1}. \end{aligned}$$

Now we obtain that for any \(m_{0}\) and \(m'_{0}\),

$$\begin{aligned} \sum \limits _{{\underline{s}}\in {\mathcal {S}}_{q-r}}\sum \limits _{x\in ({\tilde{I}}_{0})^{q+1-r}}(J_{0})^{2}\le C^{q}ulD(u)^{q+r-1}. \end{aligned}$$

Case A2: \(k=k'=q+1-r\):

If \(k=k'=q+1-r\) in (2.54), then we can first fix the position of \(s_{q-r}\) and then apply the strategy above to obtain that

$$\begin{aligned} \sum \limits _{{\underline{s}}\in {\mathcal {S}}_{q-r}}\sum \limits _{x\in ({\tilde{I}}_{0})^{q+1-r}}(J_{q+1-r})^{2}\le C^{q}ulD(u)^{q+r-1}. \end{aligned}$$

Case A3: \(1\le k=k'\le q-r\):

If \(1\le k=k'\le q-r\) in (2.54), then \(s_{k-1}<t_{1}\) and \(s_{k-1}<t'_{1}\) by (2.52), and we can first fix the position of \(s_{k-1}\), which has at most l choices. Note that we have the term \(\prod _{i=k-1}^{1}\sum _{(x_{k-2},\ldots ,x_{0})\in ({\tilde{I}}_{0})^{k-1}}(p_{s_{i}-s_{i-1}}(x_{i-1},x_{i}))^{2}\). Hence, for any \(x_{k-1}\), we can sum over \(s_{0},\ldots ,s_{k-2}\) and \(x_{0},\ldots ,x_{k-2}\) by (1.10) and (2.55) (keeping \((s_{k-1},x_{k-1})\) fixed for the moment), which gives \(C^{q}D(u)^{k-1}\). By the same argument, we can then sum over \(s_{k+1},\ldots ,s_{q-r}\) and \(x_{k+1},\ldots ,x_{q-r}\) (keeping \((s_{k},x_{k})\) fixed for the moment), which gives \(C^{q}D(u)^{q-r-k}\). These summations and products together give \(C^{q}D(u)^{q-r-1}\).

Next, we apply Lemma 2.6 to obtain

$$\begin{aligned}&\sum \limits _{s_{k-1}<t'_{1}<\cdots<t'_{m'_{k}}<s_{k}}{\mathbf {P}}((s_{k-1},t'_{1},\ldots ,t'_{m'_{k}},s_{k}),(x_{k-1},S_{t'_{1}},\ldots ,S_{t'_{m'_{k}}},x_{k}))\\&\quad \le (CD(qu))^{m'_{k}}p_{s_{k}-s_{k-1}}(x_{k-1},x_{k}). \end{aligned}$$

Then it remains to bound

$$\begin{aligned} \begin{aligned} \sum \limits _{s_{k}-s_{k-1}=1}^{m_{k}u}&\sum \limits _{s_{k-1}<t_{1}<\cdots<t_{m_{k}}<s_{k}}\sum \limits _{x_{k-1},x_{k}\in {\tilde{I}}_{0}}\\&p_{s_{k}-s_{k-1}}(x_{k-1},x_{k}){\mathbf {P}}((s_{k-1},t_{1},\ldots ,t_{m_{k}},s_{k}),(x_{k-1},S_{t_{1}},\ldots ,S_{t_{m_{k}}},x_{k})). \end{aligned} \end{aligned}$$
(2.56)

Note that by (1.14), there exists \(T>0\) such that for all \(t\ge T\), \({\mathbf {P}}(S_{t}=x)>0\) for any \(x\in {\mathbb {Z}}\). Hence, we can split (2.56) into three parts:

(i) \(\min \{t_{1}-s_{k-1}, s_{k}-t_{m_{k}}\}\ge T\);

(ii) \(\min \{t_{1}-s_{k-1}, s_{k}-t_{m_{k}}\}<T\) and \(\max \{t_{1}-s_{k-1}, s_{k}-t_{m_{k}}\}\ge T\);

(iii) \(\max \{t_{1}-s_{k-1}, s_{k}-t_{m_{k}}\}<T\).

To deal with part (i) in (2.56), we need the following lemma.

Lemma 2.7

For any \(\epsilon >0\), there exists a constant C, such that for any \(k\ge 2\) and all \(n\ge k\),

$$\begin{aligned} \sum \limits _{\begin{array}{c} j_{1}+\cdots +j_{k}=n\\ j_{i}>0, \forall i\in \{1,\ldots ,k\} \end{array}}\frac{1}{a_{j_{1}+j_{2}+n}}\left( \prod \limits _{i=3}^{k}\frac{1}{a_{j_{i}}}\mathbb {1}_{\{k\ge 3\}}+\mathbb {1}_{\{k<3\}}\right) \le n^{\epsilon ^{4}}C^{k-1}D(n)^{k-2}. \end{aligned}$$
(2.57)

Proof of Lemma 2.7

We prove it by induction.

For \(k=2\), by Potter bounds [4, Theorem 1.5.6],

$$\begin{aligned} \sum \limits _{\begin{array}{c} j_{1}+j_{2}=n\\ j_{1},j_{2}>0 \end{array}}\frac{1}{a_{j_{1}+j_{2}+n}}=\frac{n-1}{a_{2n}}\le \frac{1}{2\varphi (2n)}\le Cn^{\epsilon ^{4}}. \end{aligned}$$

Suppose (2.57) is valid for \(k\ge 2\) and then for \(k+1\), since \(a_{(\cdot )}\) is increasing,

$$\begin{aligned}&\sum \limits _{\begin{array}{c} j_{1}+\cdots +j_{k+1}=n\\ j_{i}>0, \forall i\in \{1,\ldots ,k+1\} \end{array}}\frac{1}{a_{j_{1}+j_{2}+n}}\prod \limits _{i=3}^{k+1}\frac{1}{a_{j_{i}}}\\&\quad \le \sum \limits _{j_{k+1}=1}^{n-k}\frac{1}{a_{j_{k+1}}}\sum \limits _{\begin{array}{c} j_{1}+\cdots +j_{k}=n-j_{k+1}\\ j_{i}>0, \forall i\in \{1,\ldots ,k\} \end{array}}\frac{1}{a_{j_{1}+j_{2}+n-j_{k+1}}}\prod \limits _{i=3}^{k}\frac{1}{a_{j_{i}}}\\&\quad \le \sum \limits _{j_{k+1}=1}^{n-k}\frac{1}{a_{j_{k+1}}}(n-j_{k+1})^{\epsilon ^{4}}C^{k-1}D(n-j_{k+1})^{k-2}\\&\quad \le n^{\epsilon ^{4}}C^{k}D(n)^{k-1}. \end{aligned}$$

This completes the induction and the proof of the lemma. \(\square \)
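As a numerical illustration (not part of the proof), the bound (2.57) can be checked in the simplest toy case \(L\equiv 1\). The Python sketch below assumes, purely for illustration, that \(a_{j}=j\) and \(D(n)=\sum _{j\le n}1/j\); it computes the left-hand side of (2.57) by brute force over compositions of n and compares it with \(n^{\epsilon ^{4}}D(n)^{k-2}\), the ratio staying bounded in n for each fixed k, as the lemma predicts.

\begin{verbatim}
# Sanity check of (2.57) in the toy case L(x) = 1, where we assume,
# for illustration only, a_j = j and D(n) = sum_{j <= n} 1/j.
from itertools import combinations

def compositions(n, k):
    # all (j_1, ..., j_k) with j_i >= 1 and j_1 + ... + j_k = n
    for cuts in combinations(range(1, n), k - 1):
        yield [b - a for a, b in zip((0,) + cuts, cuts + (n,))]

def a(j):
    return j                      # illustrative choice a_j = j

def D(n):
    return sum(1.0 / a(j) for j in range(1, n + 1))

def lhs(n, k):
    total = 0.0
    for j in compositions(n, k):
        prod = 1.0
        for ji in j[2:]:          # factors 1/a_{j_i}, i >= 3 (empty if k = 2)
            prod /= a(ji)
        total += prod / a(j[0] + j[1] + n)
    return total

eps = 0.5
for k in (2, 3, 4):
    for n in (10, 20, 40):
        bound = n ** (eps ** 4) * D(n) ** (k - 2)
        print(f"k={k}, n={n}: LHS/bound = {lhs(n, k) / bound:.4f}")
\end{verbatim}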

Since \(\min \{t_{1}-s_{k-1}, s_{k}-t_{m_{k}}\}\ge T\), we have

$$\begin{aligned}&\sum \limits _{x_{k-1},x_{k}\in {\tilde{I}}_{0}}p_{t_{1}-s_{k-1}}(x_{k-1},S_{t_{1}})p_{s_{k}-s_{k-1}}(x_{k-1},x_{k})p_{s_{k}-t_{m_{k}}}(S_{t_{m_{k}}},x_{k})\\&\quad \le C p_{t_{1}-s_{k-1}+s_{k}-s_{k-1}+s_{k}-t_{m_{k}}}(S_{t_{1}},S_{t_{m_{k}}})\le \frac{C}{a_{t_{1}-s_{k-1}+s_{k}-t_{m_{k}}+s_{k}-s_{k-1}}}, \end{aligned}$$

where

$$\begin{aligned}&p_{t_{1}-s_{k-1}}(x_{k-1},S_{t_{1}})\le Cp_{t_{1}-s_{k-1}}(S_{t_{1}},x_{k-1}),\\&p_{s_{k}-t_{m_{k}}}(S_{t_{m_{k}}},x_{k})\le Cp_{s_{k}-t_{m_{k}}}(x_{k},S_{t_{m_{k}}}) \end{aligned}$$

follow from the arguments (2.46)–(2.49). Then, by Lemma 2.7, part (i) in (2.56) is bounded above by

$$\begin{aligned}&C\sum \limits _{s_{k}-s_{k-1}=1}^{m_{k}u}\sum \limits _{s_{k-1}<t_{1}<\cdots<t_{m_{k}}<s_{k}}\frac{1}{a_{t_{1}-s_{k-1}+s_{k}-t_{m_{k}}+s_{k}-s_{k-1}}}\prod \limits _{i=2}^{m_{k}}\frac{1}{a_{t_{i}-t_{i-1}}}\nonumber \\&\quad \le C^{m_{k}}(m_{k}u)^{1+\epsilon ^{4}}(D(m_{k}u))^{m_{k}-1}\le C^{m_{k}}(qu)^{1+\epsilon ^{4}}(D(qu))^{m_{k}-1}. \end{aligned}$$
(2.58)

We will use the following lemma to handle D(qu).

Lemma 2.8

Recall that \(u\rightarrow \infty \) as the inverse temperature \(\beta \rightarrow 0\). We have

$$\begin{aligned} \lim \limits _{\beta \rightarrow 0}\frac{D(qu)}{D(u)}=1. \end{aligned}$$
(2.59)

Proof of Lemma 2.8

Without loss of generality, we may assume that \(D(\cdot )\) and \(\varphi (\cdot )\) are differentiable by [4, Theorem 1.8.2]. Then by definition of \(D(\cdot )\), it follows that \(D'(u)\sim (u\varphi (u))^{-1}\).

We will apply [4, Proposition 2.3.2, Theorem 2.3.1] to prove (2.59), which reduces (2.59) to showing

$$\begin{aligned} \lim \limits _{\beta \rightarrow 0}\frac{uD'(u)\log q}{D(u)}=0. \end{aligned}$$

By recalling the definition of u and q from (2.6), we need to show

$$\begin{aligned} \lim \limits _{\beta \rightarrow 0}\max \left\{ \frac{f_{1}(u):=\log \left( \log \sqrt{\varphi \left( u^{\frac{1}{1-\epsilon ^{2}}}\right) }\vee e\right) }{\varphi (u)D(u)},\frac{f_{2}(u):=\log \log {D\left( u^{\frac{1}{1-\epsilon ^{2}}}\right) }}{\varphi (u)D(u)}\right\} =0.\nonumber \\ \end{aligned}$$
(2.60)

We will prove (2.60) by showing that both \(f_{1}(u)/\varphi (u)D(u)\) and \(f_{2}(u)/\varphi (u)D(u)\) tend to 0 as \(\beta \) tends to 0.

For \(f_{1}(u)/\varphi (u)D(u)\), note that \(\varphi (u)D(u)\rightarrow \infty \) as \(\beta \rightarrow 0\). Then by L'Hôpital's rule, we have

$$\begin{aligned} \begin{aligned}&\lim \limits _{\beta \rightarrow 0}\frac{\log \log \sqrt{\varphi \left( u^{\frac{1}{1-\epsilon ^{2}}}\right) }}{\varphi (u)D(u)}\\&\quad =\lim \limits _{\beta \rightarrow 0}\frac{1}{\log \varphi \left( u^{\frac{1}{1-\epsilon ^{2}}}\right) }\frac{1}{\varphi \left( u^{\frac{1}{1-\epsilon ^{2}}}\right) }\frac{u^{\frac{\epsilon ^{2}}{1-\epsilon ^{2}}}\varphi \left( u^{\frac{1}{1-\epsilon ^{2}}}\right) }{(1-\epsilon ^{2})(\varphi '(u)D(u)+\varphi (u)D'(u))}\\&\quad =\lim \limits _{\beta \rightarrow 0}\frac{1}{\log \varphi \left( u^{\frac{1}{1-\epsilon ^{2}}}\right) }\frac{1}{\varphi \left( u^{\frac{1}{1-\epsilon ^{2}}}\right) }\frac{u^{\frac{1}{1-\epsilon ^{2}}}\varphi \left( u^{\frac{1}{1-\epsilon ^{2}}}\right) }{(1-\epsilon ^{2})u(\varphi '(u)D(u)+1)}=0, \end{aligned} \end{aligned}$$

where we use the property that \(\lim _{x\rightarrow \infty }x\varphi '(x)/\varphi (x)=0\) by [4, Section 1.8].

By the same computation as above, we also have

$$\begin{aligned} \lim \limits _{\beta \rightarrow 0}\frac{\log \log D(u^{\frac{1}{1-\epsilon ^{2}}})}{\varphi (u)D(u)}=0 \end{aligned}$$

and thus (2.59) is proved. \(\square \)
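As a numerical illustration of (2.59) (again, not part of the proof), one can take a concrete slowly varying function, say \(D(n)=\log n\), together with a q that grows sub-polynomially in u, say \(q=(\log u)^{2}\); both choices are assumptions standing in for the actual D and the q of (2.6). The ratio D(qu)/D(u) then decreases to 1, although rather slowly:

\begin{verbatim}
# Illustration of Lemma 2.8 with the assumed choices D(n) = log n and
# q = (log u)^2, so that D(qu)/D(u) = 1 + log(q)/log(u) -> 1 as u grows.
import math

for u in (1e3, 1e6, 1e9, 1e12, 1e15):
    q = math.log(u) ** 2
    print(f"u = {u:.0e}: D(qu)/D(u) = {math.log(q * u) / math.log(u):.4f}")
\end{verbatim}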

By Lemma 2.8, (2.58) can be bounded above by \((2C)^{m_{k}}(qu)^{1+\epsilon ^{4}}D(u)^{m_{k}-1}\).

For part (ii) in (2.56), let us assume \(s_{k}-t_{m_{k}}\ge T\) and \(t_{1}-s_{k-1}<T\). Then

$$\begin{aligned}&\sum \limits _{x_{k-1},x_{k}\in {\tilde{I}}_{0}}p_{t_{1}-s_{k-1}}(x_{k-1},S_{t_{1}})p_{s_{k}-s_{k-1}}(x_{k-1},x_{k})p_{s_{k}-t_{m_{k}}}(S_{t_{m_{k}}},x_{k})\\&\quad \le C\sum \limits _{x_{k-1}\in {\tilde{I}}_{0}}p_{t_{1}-s_{k-1}}(x_{k-1},S_{t_{1}}) p_{s_{k}-s_{k-1}+s_{k}-t_{m_{k}}}(x_{k-1},S_{t_{m_{k}}})\\&\quad \le \sum \limits _{x_{k-1}\in {\tilde{I}}_{0}}p_{t_{1}-s_{k-1}}(x_{k-1},S_{t_{1}})\frac{C}{a_{s_{k}-t_{m_{k}}+s_{k}-s_{k-1}}}\le \frac{C}{a_{s_{k}-t_{m_{k}}+s_{k}-s_{k-1}}}. \end{aligned}$$

It is not hard to check, by following the proof of Lemma 2.7, that

$$\begin{aligned} \sum \limits _{\begin{array}{c} j_{1}+\cdots +j_{k}=n\\ j_{i}>0, \forall i\in \{1,\ldots ,k\} \end{array}}\frac{1}{a_{j_{1}+n}}\left( \prod \limits _{i=2}^{k}\frac{1}{a_{j_{i}}}\mathbb {1}_{\{k\ge 2\}}+\mathbb {1}_{\{k<2\}}\right) \le n^{\epsilon ^{4}}C^{k-1}D(n)^{k-1}. \end{aligned}$$

Hence, part (ii) in (2.56) can be bounded above by \(TC^{m_{k}-1}(qu)^{1+\epsilon ^{4}}D(u)^{m_{k}-1}\), where T comes from \(\sum _{t_{1}-s_{k-1}=1}^{T}\).

For part (iii) in (2.56), we have

$$\begin{aligned}&\sum \limits _{x_{k-1},x_{k}\in {\tilde{I}}_{0}}p_{t_{1}-s_{k-1}}(x_{k-1},S_{t_{1}})p_{s_{k}-s_{k-1}}(x_{k-1},x_{k})p_{s_{k}-t_{m_{k}}}(S_{t_{m_{k}}},x_{k})\\&\quad \le \frac{C}{a_{s_{k}-s_{k-1}}}\sum \limits _{x_{k-1},x_{k}\in {\tilde{I}}_{0}}p_{t_{1}-s_{k-1}}(x_{k-1},S_{t_{1}})p_{s_{k}-t_{m_{k}}}(S_{t_{m_{k}}},x_{k})\le \frac{C}{a_{s_{k}-s_{k-1}}}. \end{aligned}$$

Similarly, part (iii) can be bounded above by \(T^{2}C^{m_{k}-2}(qu)^{1+\epsilon ^{4}}D(u)^{m_{k}-1}\). Hence, (2.56) can be bounded above by \(C^{m_{k}}(qu)^{1+\epsilon ^{4}}D(u)^{m_{k}-1}\) and we obtain that for any \(m_{k}\) and \(m'_{k}\),

$$\begin{aligned} \sum \limits _{{\underline{s}}\in {\mathcal {S}}_{q-r}}\sum \limits _{{\underline{x}}\in ({\tilde{I}}_{0})^{q+1-r}}(J_{k})^{2}\le C^{q}(qu)^{1+\epsilon ^{4}}lD(u)^{q+r-1}, \end{aligned}$$

which finishes the estimate for Case A3.

Now all sub-cases of Case A have been handled and we turn to Case B for (2.54). Recall that \(k\ne k'\) in Case B, and we may assume without loss of generality that \(k<k'\). First, we can fix the position of \(s_{k-1}\), which has at most l choices. Next, if \(k'=q+1-r\), then we just use the trivial bound

$$\begin{aligned} \sum \limits _{s_{q-r}<t'_{1}<\cdots <t'_{r}\le l}{\mathbf {P}}((s_{q-r},t'_{1},\ldots ,t'_{r}),(x_{q-r},S_{t'_{1}},\ldots ,S_{t'_{r}}))\le (CD(u))^{r}, \end{aligned}$$

while if \(k'<q+1-r\), then we apply Lemma 2.6 to obtain

$$\begin{aligned}&\sum \limits _{s_{k'-1}<t'_{1}<\cdots<t'_{m'_{k'}}<s_{k'}}{\mathbf {P}}((s_{k'-1},t'_{1},\ldots ,t'_{m'_{k'}},s_{k'}),(x_{k'-1},S_{t'_{1}},\ldots ,S_{t'_{m'_{k'}}},x_{k'}))\\&\quad \le (CD(u))^{m'_{k'}}p_{s_{k'}-s_{k'-1}}(x_{k'-1},x_{k'}). \end{aligned}$$

According to the value of k, there are two sub-cases in Case B:

$$\begin{aligned} \begin{aligned}&\text{ Case } \text{ B1: }~k=0,\\&\text{ Case } \text{ B2: }~k>0. \end{aligned} \end{aligned}$$

Case B1:

If \(k=0\) in (2.54), then we have the term \(\prod _{i=1}^{q-r}\sum _{(x_{1},\ldots ,x_{q-r})\in ({\tilde{I}}_{0})^{q-r}}(p_{s_{i}-s_{i-1}}(x_{i-1},x_{i}))^{2}\) and for any \(x_{0}\), we can sum over \(s_{1},\ldots ,s_{q-r}\) and \(x_{1},\ldots ,x_{q-r}\) by (1.10) and (2.55) to obtain an upper bound \(C^{q}D(u)^{q-r}\). Then we can complete the estimate by

$$\begin{aligned} \sum \limits _{s_{0}-t_{m_{0}}=1}^{u}\sum \limits _{x_{0}\in {\tilde{I}}_{0}}p_{s_{0}-t_{m_{0}}}(S_{t_{m_{0}}},x_{0})\le u \end{aligned}$$

and

$$\begin{aligned} \sum \limits _{0<t_{1}<\cdots <t_{m_{0}}}{\mathbf {P}}((t_{1},\ldots ,t_{m_{0}}),(S_{t_{1}},\ldots ,S_{t_{m_{0}}}))\le (CD(u))^{m_{0}-1}. \end{aligned}$$

Case B2:

If \(k>0\) in (2.54), then we have the term \(\prod _{i=k-1}^{1}\sum _{(x_{k-2},\ldots ,x_{0})\in ({\tilde{I}}_{0})^{k-1}}(p_{s_{i}-s_{i-1}}(x_{i-1},x_{i}))^{2}\), and for any \(x_{k-1}\), we can sum over \(s_{0},\ldots ,s_{k-2}\) and \(x_{0},\ldots ,x_{k-2}\) by (1.10) and (2.55) (keeping \((s_{k-1},x_{k-1})\) fixed for the moment), which gives \(C^{q}D(u)^{k-1}\). By the same argument, we can then sum over \(s_{k+1},\ldots ,s_{q-r}\) and \(x_{k+1},\ldots ,x_{q-r}\) (keeping \((s_{k},x_{k})\) fixed for the moment), which gives \(C^{q}D(u)^{q-r-k}\). These summations and products together give \(C^{q}D(u)^{q-r-1}\), and we can then complete the estimate by bounding

$$\begin{aligned} \begin{aligned} \sum \limits _{s_{k}-s_{k-1}=1}^{m_{k}u}&\sum \limits _{s_{k-1}<t_{1}<\cdots<t_{m_{k}}<s_{k}}\sum \limits _{x_{k-1},x_{k}\in {\tilde{I}}_{0}}\\&p_{s_{k}-s_{k-1}}(x_{k-1},x_{k}){\mathbf {P}}((s_{k-1},t_{1},\ldots ,t_{m_{k}},s_{k}),(x_{k-1},S_{t_{1}},\ldots ,S_{t_{m_{k}}},x_{k})). \end{aligned} \end{aligned}$$

This is bounded exactly as in (2.56)–(2.58).

According to the upper bounds in Case A and Case B, we can obtain an upper bound \(C^{q}q^{2}(qu)^{1+\epsilon ^{4}}lD(u)^{q+r}\) for (2.54) by summing over \(m_{k}\) and \(m'_{k'}\). Recall that our analysis in Case A and Case B is based on \(1\le r\le q-1\). Hence, for \(1\le r\le q-1\), we can sum over k and \(k'\) to bound (2.42) from above by \(C^{q}u^{1+\epsilon ^{4}}lD(u)^{q+r}\), since \(q^{5+\epsilon ^{2}}\ll C^{q}\).

It still remains to bound the case \(r=q\) in (2.42), where \({\underline{s}}=\{s_{0}\}\). This is relatively simple. We use the expression in the first line of (2.42). Suppose that the t-index right beside \(s_{0}\) is \(t_{j}\). Without loss of generality, we may assume \(s_{0}<t_{j}\). Then we have

$$\begin{aligned} \sum \limits _{t_{j}-s_{0}=1}^{u}\sum \limits _{x_{0}\in {\tilde{I}}_{0}}p_{t_{j}-s_{0}}(x_{0},S_{t_{j}})\le u. \end{aligned}$$

For the other \(t,t'\)-indices, we just use the trivial bound

$$\begin{aligned} \sum \limits _{t=1}^{u}p_{t}(0,S_{t})\le D(u) \end{aligned}$$

and then we obtain an upper bound \(C^{q}ulD(u)^{q+r-1}\) for the case \(r=q\) in (2.42).

Finally, we substitute everything into (2.41). Recalling that \(\lambda '(\beta )\sim \beta \) and \(\beta ^{2}D(u)<(1+2\epsilon )\), we have

$$\begin{aligned} \begin{aligned} {\mathbb {V}}\text{ ar }^{S}(X)&\le (1+\epsilon ^{3}/2)^{q+1}+\frac{C^{q}u^{1+\epsilon ^{4}} }{2Ra_{l}}\sum \limits _{r=1}^{q}(1+2\epsilon )^{r}\\&\le (1+\epsilon ^{3}/2)^{q+1}+\frac{q(2C)^{q}}{2R}l^{-\epsilon ^{3}}\\&\le (1+\epsilon ^{3}/2)^{q+1}+1\le (1+\epsilon ^{3})^{q} \end{aligned} \end{aligned}$$

and we conclude Lemma 2.5. \(\square \)

3 Proof of Theorem 1.4

In this proof, for any given \(\beta \) and \(\epsilon \), we will estimate the partition function at a special time N, defined by

$$\begin{aligned} N_{\beta ,\epsilon }:=\max \{n: D(n)\le (1-\epsilon )/\beta ^{2}\}. \end{aligned}$$
(3.1)
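As a quick illustration of this definition (with a toy choice of the overlap function, not the actual D of the model), if one assumes \(D(n)=\log n\), then \(N_{\beta ,\epsilon }\approx \exp ((1-\epsilon )/\beta ^{2})\) grows extremely fast as \(\beta \rightarrow 0\); the following Python sketch computes it directly from (3.1):

\begin{verbatim}
# Toy computation of N_{beta,eps} from (3.1), assuming for illustration
# that D(n) = log n; then N_{beta,eps} is roughly exp((1 - eps)/beta^2).
import math

def N_beta_eps(beta, eps, D=math.log):
    n = 1
    while D(n + 1) <= (1 - eps) / beta ** 2:
        n += 1
    return n

for beta in (1.0, 0.7, 0.5, 0.4):
    print(f"beta = {beta}: N = {N_beta_eps(beta, eps=0.1)}")
\end{verbatim}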

By [11, Proposition 2.5], we have

$$\begin{aligned} p(\beta )=\sup \limits _{N}\frac{1}{N}{\mathbb {E}}[\log {\hat{Z}}_{N,\beta }^{\omega }]\ge \frac{1}{N_{\beta ,\epsilon }}{\mathbb {E}}[\log {\hat{Z}}_{N_{\beta ,\epsilon },\beta }^{\omega }]. \end{aligned}$$

To simplify the notation, we will simply write N for \(N_{\beta ,\epsilon }\) in what follows; this causes no ambiguity. We emphasize that this choice of N always satisfies (3.1).

To show (1.19), we need to bound \({\mathbb {E}}[\log {\hat{Z}}_{N,\beta }^{\omega }]\) appropriately. The key ingredient of the proof is the following result proved in [6]. Here we cite a version stated in [3].

Proposition 3.1

[3, Proposition 4.3] Let \(m\in {\mathbb {N}}\) and let \(\eta = (\eta _{1},\ldots ,\eta _{m})\) be a random vector for which there exists a constant \(K>0\) such that

$$\begin{aligned} {\mathbb {P}}(|\eta |\le K)=1. \end{aligned}$$
(3.2)

Then for any convex function f, there exists a constant \(C_{1}\), uniform in m, \(\eta \) and f, such that for any a, any M, and any \(t>0\), the inequality

$$\begin{aligned} {\mathbb {P}}\left( f(\eta )\ge a, |\triangledown f(\eta )|\le M\right) {\mathbb {P}}\left( f(\eta )\le a-t\right) \le 2\exp \left( -\frac{t^{2}}{C_{1}K^{2}M^{2}}\right) \end{aligned}$$
(3.3)

holds, where \(|\triangledown f|:=\sqrt{\sum \limits _{i=1}^{m}\left( \frac{\partial f}{\partial x_{i}}\right) ^{2}}\) is the norm of the gradient of f.

We will apply Proposition 3.1 to \(\log {\hat{Z}}_{N,\beta }^{\omega }\) and the environment \(\omega \). However, this proposition is only valid for bounded, finite-dimensional random vectors. Since \(\log {\hat{Z}}_{N,\beta }^{\omega }\) is a function of a random field with countably many coordinates and \(\omega \) may be unbounded, we need to restrict the range of the random walk S so that \(\log {\hat{Z}}_{N,\beta }^{\omega }\) is determined by finitely many \(\omega _{i,x}\)'s and, respectively, to truncate \(\omega \) so that it is bounded.

First, we define a subset of \({\mathbb {N}}\times {\mathbb {Z}}\) by

$$\begin{aligned} {\mathcal {T}}={\mathcal {T}}_{N}:=\{(n,x): 1\le n\le N, |x-b_{N}|\le Ra_{N}\}, \end{aligned}$$

where R is a constant that will be determined later and \(a_{N}, b_{N}\) have been introduced in (1.2). We will choose R large enough so that the trajectory of S up to time N falls entirely in \({\mathcal {T}}\) with probability close to 1 for any \(N=N_{\beta ,\epsilon }\). We may also assume without loss of generality that \(a_{N}\) is an integer.
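The following Monte Carlo sketch illustrates this choice of R in one concrete special case of (1.1): a walk with symmetric standard Cauchy steps, for which \(p=1/2\), \(b_{N}=0\), and \(a_{N}\) is of order N (the window width \(Ra_{N}\) is therefore approximated by RN below). This is an illustration only, not the general setting of the theorem.

\begin{verbatim}
# Monte Carlo illustration: for a walk with symmetric standard Cauchy steps,
# the whole path up to time N stays in a centred window of width ~ R*N with
# probability close to 1 once R is large (illustrative special case of (1.1)).
import random, math

random.seed(1)
N, trials = 1000, 2000

def stays_in_window(R):
    hits = 0
    for _ in range(trials):
        s, inside = 0.0, True
        for _ in range(N):
            s += math.tan(math.pi * (random.random() - 0.5))  # Cauchy step
            if abs(s) > R * N:
                inside = False
                break
        hits += inside
    return hits / trials

for R in (1, 5, 20):
    print(f"R = {R}: P(path stays in window) ~ {stays_in_window(R):.3f}")
\end{verbatim}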

Then we define

$$\begin{aligned} {\bar{Z}}_{N,\beta }^{\omega }:={\mathbf {E}}\left[ \exp \left( \beta \sum \limits _{n=1}^{N}\omega _{n,S_{n}}-N\lambda (\beta )\right) \mathbb {1}_{\{S\in {\mathcal {T}}\}}\right] , \end{aligned}$$
(3.4)

where \(\{S\in {\mathcal {T}}\}:=\{S:(n,S_{n})\in {\mathcal {T}},\quad \forall 1\le n\le N\}\). Note that \({\bar{Z}}_{N,\beta }^{\omega }\le {\hat{Z}}_{N,\beta }^{\omega }\). Readers may check that \(\log {\bar{Z}}_{N,\beta }^{\omega }\) is indeed a finite-dimensional convex function, and hence we can apply Proposition 3.1 to \(\log {\bar{Z}}_{N,\beta }^{\omega }\). Since our goal is to find a lower bound for \({\mathbb {E}}[\log {\hat{Z}}_{N,\beta }^{\omega }]\), we first estimate the left tail of \(\log {\bar{Z}}_{N,\beta }^{\omega }\), which can be done by bounding the first probability on the left-hand side of (3.3) from below.

We show the following result.

Lemma 3.2

For arbitrarily small \(\epsilon >0\), there exist \(\beta _{\epsilon }\) and \(M=M_{\epsilon }\), such that for any \(\beta \in (0,\beta _{\epsilon })\), it follows that

$$\begin{aligned} {\mathbb {P}}\left( {\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2},\big |\triangledown \log {\bar{Z}}_{N_{\beta ,\epsilon },\beta }^{\omega }\big |\le M\right) \ge \frac{\epsilon }{100}. \end{aligned}$$
(3.5)

To prove Lemma 3.2, we need a result from [2], which we state as

Lemma 3.3

[2, Lemma 6.4] For any \(\epsilon >0\), if \(\beta \) is sufficiently small such that \(N=N_{\beta ,\epsilon }\) is large enough, then

$$\begin{aligned} {\mathbb {E}}[({\hat{Z}}_{N,\beta }^{\omega })^{2}]\le \frac{10}{\epsilon } \end{aligned}$$
(3.6)

Proof of Lemma 3.2

By Lemma 3.3 and the fact \({\bar{Z}}_{N,\beta }^{\omega }\le {\hat{Z}}_{N,\beta }^{\omega }\),

$$\begin{aligned} {\mathbb {E}}[({\bar{Z}}_{N,\beta }^{\omega })^{2}]\le \frac{10}{\epsilon }. \end{aligned}$$

Then by Paley–Zygmund inequality, we have

$$\begin{aligned} {\mathbb {P}}\left( {\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2}\right) \ge \frac{\left( {\mathbf {P}}(S\in {\mathcal {T}})-\frac{1}{2}\right) ^{2}}{{\mathbb {E}}[({\bar{Z}}_{N,\beta }^{\omega })^{2}]}\ge \frac{\epsilon }{50}, \end{aligned}$$

where the last inequality holds by choosing R large enough in \({\mathcal {T}}\).
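The inequality used above is a Paley–Zygmund-type second-moment bound: for \(Z\ge 0\) with \({\mathbb {E}}[Z]\ge \frac{1}{2}\), the Cauchy–Schwarz inequality gives \({\mathbb {E}}[Z]\le \frac{1}{2}+\sqrt{{\mathbb {E}}[Z^{2}]{\mathbb {P}}(Z\ge \frac{1}{2})}\), whence \({\mathbb {P}}(Z\ge \frac{1}{2})\ge ({\mathbb {E}}[Z]-\frac{1}{2})^{2}/{\mathbb {E}}[Z^{2}]\) (here \({\mathbb {E}}[{\bar{Z}}_{N,\beta }^{\omega }]={\mathbf {P}}(S\in {\mathcal {T}})\)). The following Monte Carlo sketch illustrates the bound on a toy log-normal variable standing in for \({\bar{Z}}_{N,\beta }^{\omega }\) (an illustrative assumption, not the actual partition function):

\begin{verbatim}
# Monte Carlo check of the Paley-Zygmund-type bound
#   P(Z >= 1/2) >= (E[Z] - 1/2)^2 / E[Z^2]   for Z >= 0 with E[Z] >= 1/2,
# on a toy log-normal Z with mean 1 (an illustrative stand-in for \bar Z).
import random, math

random.seed(0)
sigma = 1.0
Z = [math.exp(sigma * random.gauss(0, 1) - sigma ** 2 / 2) for _ in range(10 ** 5)]

mean = sum(Z) / len(Z)
second_moment = sum(z * z for z in Z) / len(Z)
lhs = sum(z >= 0.5 for z in Z) / len(Z)
rhs = (mean - 0.5) ** 2 / second_moment

print(f"P(Z >= 1/2) ~ {lhs:.3f}  >=  (E[Z]-1/2)^2/E[Z^2] ~ {rhs:.3f}")
\end{verbatim}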

Using the notation

$$\begin{aligned} f(\omega ):=\log {\bar{Z}}_{N,\beta }^{\omega }, \end{aligned}$$

we have

$$\begin{aligned} {\mathbb {P}}\left( {\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2},|\triangledown f(\omega )|\le M\right)= & {} {\mathbb {P}}\left( {\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2}\right) -{\mathbb {P}}\left( {\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2},|\triangledown f(\omega )|>M\right) \nonumber \\\ge & {} \frac{\epsilon }{50}-\frac{1}{M^{2}}{\mathbb {E}}\left[ |\triangledown f(\omega )|^{2}\mathbb {1}_{\{{\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2}\}}\right] . \end{aligned}$$
(3.7)

To compute \(\triangledown f(\omega )\), we find that

$$\begin{aligned} \begin{aligned} \frac{\partial }{\partial \omega _{k,x}}\log {\bar{Z}}_{N,\beta }^{\omega }&=\frac{\beta }{{\bar{Z}}_{N,\beta }^{\omega }}{\mathbf {E}}\left[ \exp \left( \beta \sum \limits _{n=1}^{N}\omega _{n,S_{n}}-N\lambda (\beta )\right) \mathbb {1}_{\{S_{k}=x,S\in {\mathcal {T}}\}}\right] \\&\le \frac{\beta }{{\bar{Z}}_{N,\beta }}{\mathbf {E}}\left[ \exp \left( \beta \sum \limits _{n=1}^{N}\omega _{n,S_{n}}-N\lambda (\beta )\right) \mathbb {1}_{\{S_{k}=x\}}\right] . \end{aligned} \end{aligned}$$

Then

$$\begin{aligned} \begin{aligned} |\triangledown f(\omega )|^{2}&=\sum \limits _{(k,x)\in {\mathcal {T}}}\left| \frac{\partial }{\partial \omega _{k,x}}\log {\bar{Z}}_{N,\beta }^{\omega }\right| ^{2}\\&\le \frac{\beta ^{2}}{({\bar{Z}}_{N,\beta }^{\omega })^{2}}\sum \limits _{k=1}^{N}\sum \limits _{x\in {\mathbb {Z}}}\left( {\mathbf {E}}\left[ \exp \left( \beta \sum \limits _{n=1}^{N}\omega _{n,S_{n}}-N\lambda (\beta )\right) \mathbb {1}_{\{S_{k}=x\}}\right] \right) ^{2}. \end{aligned} \end{aligned}$$

Note that

$$\begin{aligned}&\left( {\mathbf {E}}\left[ \exp \left( \beta \sum \limits _{n=1}^{N}\omega _{n,S_{n}}-N\lambda (\beta )\right) \mathbb {1}_{\{S_{k}=x\}}\right] \right) ^{2}\\&\quad ={\mathbf {E}}^{\bigotimes 2}\left[ \exp \left( \beta \sum \limits _{n=1}^{N}(\omega _{n,S_{n}}+\omega _{n,{\tilde{S}}_{n}})-2N\lambda (\beta )\right) \mathbb {1}_{\{S_{k}={\tilde{S}}_{k}=x\}}\right] . \end{aligned}$$

Therefore,

$$\begin{aligned} |\triangledown f(\omega )|^{2}\le \frac{\beta ^{2}}{({\bar{Z}}_{N,\beta }^{\omega })^{2}}{\mathbf {E}}^{\bigotimes 2}\left[ \sum \limits _{k=1}^{N}\mathbb {1}_{\{S_{k}={\tilde{S}}_{k}\}}\exp \left( \beta \sum \limits _{n=1}^{N}(\omega _{n,S_{n}}+\omega _{n,{\tilde{S}}_{n}}) -2N\lambda (\beta )\right) \right] \end{aligned}$$

Then we have

$$\begin{aligned} {\mathbb {E}}\left[ |\triangledown f(\omega )|^{2}\mathbb {1}_{\{{\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2}\}}\right] \le 4{\mathbf {E}}^{\bigotimes 2}\left[ \beta ^{2}\sum \limits _{k=1}^{N}\mathbb {1}_{\{S_{k}={\tilde{S}}_{k}\}}\exp \left( \gamma (\beta )\sum \limits _{n=1}^{N}\mathbb {1}_{\{S_{n}={\tilde{S}}_{n}\}}\right) \right] , \end{aligned}$$
(3.8)

where

$$\begin{aligned} \gamma (\beta ):=\lambda (2\beta )-2\lambda (\beta ). \end{aligned}$$

We denote

$$\begin{aligned} Y:=\sum \limits _{n=1}^{N}\mathbb {1}_{\{S_{n}={\tilde{S}}_{n}\}} \end{aligned}$$

for short. It is not hard to check that

$$\begin{aligned} \lambda (2\beta )-2\lambda (\beta )\sim \beta ^{2},\quad \text{ as }~\beta \rightarrow 0. \end{aligned}$$
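For completeness, here is the short Taylor computation behind this asymptotic; it uses the normalization \({\mathbb {E}}[\omega _{1,0}]=0\) and \({\mathbb {V}}\text{ ar }(\omega _{1,0})=1\), which we assume in this sketch (it is the normalization implicit in the displayed asymptotic):

$$\begin{aligned} \lambda (\beta )=\log {\mathbb {E}}[e^{\beta \omega _{1,0}}]=\frac{\beta ^{2}}{2}+O(\beta ^{3}),\quad \text{ so }\quad \lambda (2\beta )-2\lambda (\beta )=\frac{(2\beta )^{2}}{2}-2\cdot \frac{\beta ^{2}}{2}+O(\beta ^{3})=\beta ^{2}+O(\beta ^{3}). \end{aligned}$$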

Hence, when \(\beta \) is sufficiently small, we have

$$\begin{aligned}&{\mathbf {E}}^{\bigotimes 2}\left[ \beta ^{2}\sum \limits _{k=1}^{N}\mathbb {1}_{\{S_{k}={\tilde{S}}_{k}\}}\exp \left( \gamma (\beta )\sum \limits _{n=1}^{N}\mathbb {1}_{\{S_{n}={\tilde{S}}_{n}\}}\right) \right] \nonumber \\&\quad \le {\mathbf {E}}^{\bigotimes 2}\left[ \beta ^{2}Y\exp ((1+\epsilon ^{3})\beta ^{2}Y)\right] \le {\mathbf {E}}^{\bigotimes 2}\left[ C_{\epsilon }\exp ((1+\epsilon ^{2})\beta ^{2}Y)\right] , \end{aligned}$$
(3.9)

where \(C_{\epsilon }\) is a constant such that

$$\begin{aligned} x\exp ((1+\epsilon ^{3})x)\le C_{\epsilon }\exp ((1+\epsilon ^{2})x),\quad \forall x\ge 0. \end{aligned}$$
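Such a constant exists because \(x\,e^{-(\epsilon ^{2}-\epsilon ^{3})x}\) is bounded on \([0,\infty )\); one admissible explicit choice, obtained by maximizing this function at \(x=(\epsilon ^{2}-\epsilon ^{3})^{-1}\), is

$$\begin{aligned} C_{\epsilon }=\sup \limits _{x\ge 0}x\,e^{-(\epsilon ^{2}-\epsilon ^{3})x}=\frac{1}{e(\epsilon ^{2}-\epsilon ^{3})}. \end{aligned}$$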

Again by Lemma 3.3,

$$\begin{aligned} {\mathbf {E}}^{\bigotimes 2}\left[ \beta ^{2}\sum \limits _{k=1}^{N}\mathbb {1}_{\{S_{k}={\tilde{S}}_{k}\}}\exp \left( \gamma (\beta )\sum \limits _{n=1}^{N}\mathbb {1}_{\{S_{n}={\tilde{S}}_{n}\}}\right) \right] \le \frac{10C_{\epsilon }}{\epsilon }. \end{aligned}$$
(3.10)

We can choose \(M=M_{\epsilon }=20\sqrt{10C_{\epsilon }}/\epsilon ^{2}\); combining (3.7)–(3.10), we then conclude Lemma 3.2. \(\square \)

We can now prove Theorem 1.4. Readers should keep in mind that \(N=N_{\beta ,\epsilon }\).

Proof of Theorem 1.4

Because the environment \(\omega \) has a finite moment generating function, we can find some positive constants \(C_{2}\) and \(C_{3}\), such that

$$\begin{aligned} {\mathbb {P}}(|\omega _{1,0}|\ge t)\le C_{2}\exp (-C_{3}t). \end{aligned}$$

Note that we only need to control the environment at indices in \({\mathcal {T}}\). Since \(|{\mathcal {T}}|\le N(2Ra_{N}+1)\), a union bound gives

$$\begin{aligned} {\mathbb {P}}\left( \max \limits _{(n,x)\in {\mathcal {T}}}|\omega _{n,x}|\ge t\right) \le C_{4}Na_{N}\exp (-C_{3}t). \end{aligned}$$
(3.11)

Note that

$$\begin{aligned} \left\{ \max \limits _{(n,x)\in {\mathcal {T}}}|\omega _{n,x}|<t\right\} \subset \left\{ \omega _{n,x}>-t,\quad \forall (n,x)\in {\mathcal {T}}\right\} \end{aligned}$$

and, recalling the definition of \({\bar{Z}}_{N,\beta }^{\omega }\) from (3.4), we obtain the rough bound

$$\begin{aligned} {\mathbb {P}}\left( \log {\bar{Z}}_{N,\beta }^{\omega }<-(\beta t+\lambda (\beta ))N\right) \le C_{4}Na_{N}\exp (-C_{3}t). \end{aligned}$$
(3.12)

We will use (3.12) later to bound the left tail of \(\log {\hat{Z}}_{N,\beta }^{\omega }\) for large t.

In order to apply Proposition 3.1, we need to truncate the environment appropriately. We set \({\tilde{\omega }}_{n,x}:=\omega _{n,x}\mathbb {1}_{\{|\omega _{n,x}|\le (\log N)^{2}\}}\) and define

$$\begin{aligned} f({\tilde{\omega }}):=\log {\mathbf {E}}\left[ \exp \left( \beta \sum \limits _{n=1}^{N}{\tilde{\omega }}_{n,S_{n}}-N\lambda (\beta )\right) \mathbb {1} _{\{S\in {\mathcal {T}}\}}\right] . \end{aligned}$$

Then

$$\begin{aligned}&{\mathbb {P}}\left( {\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2},|\triangledown \log {\bar{Z}}_{N,\beta }^{\omega }|\le M\right) \\&\quad ={\mathbb {P}}\left( {\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2},|\triangledown \log {\bar{Z}}_{N,\beta }^{\omega }|\le M,{\tilde{\omega }}=\omega \right) \\&\qquad +{\mathbb {P}}\left( {\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2},|\triangledown \log {\bar{Z}}_{N,\beta }^{\omega }|\le M,{\tilde{\omega }}\ne \omega \right) \\&\quad \le {\mathbb {P}}\left( f({\tilde{\omega }})\ge -\log 2,|\triangledown f({\tilde{\omega }})|\le M\right) +{\mathbb {P}}({\tilde{\omega }}\ne \omega ) \end{aligned}$$

By Lemma 3.2 and (3.11),

$$\begin{aligned}&{\mathbb {P}}\left( f({\tilde{\omega }})\ge -\log 2,|\triangledown f({\tilde{\omega }})|\le M\right) \\&\quad \ge {\mathbb {P}}\left( {\bar{Z}}_{N,\beta }^{\omega }\ge \frac{1}{2},|\triangledown \log {\bar{Z}}_{N,\beta }^{\omega }|\le M\right) -{\mathbb {P}}({\tilde{\omega }}\ne \omega )\\&\quad \ge \frac{\epsilon }{100}-C_{4}Na_{N}\exp (-C_{3}(\log N)^{2})\ge \frac{\epsilon }{200}, \end{aligned}$$

where the last inequality holds for large N, i.e., for small \(\beta \). Now we apply Proposition 3.1 to \(f({\tilde{\omega }})\) and we obtain

$$\begin{aligned} {\mathbb {P}}\left( f({\tilde{\omega }})\le -\log 2-t\right) \le \frac{400}{\epsilon }\exp \left( -\frac{t^{2}}{C_{1}(\log N)^{4}M^{2}}\right) . \end{aligned}$$

Finally,

$$\begin{aligned}&{\mathbb {P}}\left( \log {\bar{Z}}_{N,\beta }^{\omega }\le -\log 2-t\right) \nonumber \\&\quad ={\mathbb {P}}\left( \log {\bar{Z}}_{N,\beta }^{\omega }\le -\log 2-t,{\tilde{\omega }}=\omega \right) +{\mathbb {P}}\left( \log {\bar{Z}}_{N,\beta }^{\omega }\le -\log 2-t,{\tilde{\omega }}\ne \omega \right) \nonumber \\&\quad \le {\mathbb {P}}\left( f({\tilde{\omega }})\le -\log 2-t\right) +{\mathbb {P}}({\tilde{\omega }}\ne \omega )\nonumber \\&\quad \le \frac{400}{\epsilon }\exp \left( -\frac{t^{2}}{C_{1}(\log N)^{4}M^{2}}\right) +C_{4}Na_{N}\exp (-C_{3}(\log N)^{2}). \end{aligned}$$
(3.13)

We can now bound the left tail of \(\log {\hat{Z}}_{N,\beta }^{\omega }\). Since it is larger than \(\log {\bar{Z}}_{N,\beta }^{\omega }\), we can rewrite (3.12) and (3.13) as

$$\begin{aligned} {\mathbb {P}}\left( \log {\hat{Z}}_{N,\beta }^{\omega }<-(\beta t+\lambda (\beta ))N\right) \le C_{4}Na_{N}\exp (-C_{3}t) \end{aligned}$$
(3.14)

and respectively,

$$\begin{aligned}&{\mathbb {P}}\left( \log {\hat{Z}}_{N,\beta }^{\omega }\le -\log 2-t\right) \nonumber \\&\quad \le \frac{400}{\epsilon }\exp \left( -\frac{t^{2}}{C_{1}(\log N)^{4}M^{2}}\right) +C_{4}Na_{N}\exp (-C_{3}(\log N)^{2}). \end{aligned}$$
(3.15)

For \(\log {\hat{Z}}_{N,\beta }^{\omega }\) taking large negative values (say, smaller than \(-N^{2}\)), we use the bound (3.14), which shows that the contribution of the event \(\{\log {\hat{Z}}_{N,\beta }^{\omega }<-N^{2}\}\) to \({\mathbb {E}}[\log {\hat{Z}}_{N,\beta }^{\omega }]\) is bounded below by some constant \(-C\). For \(\log {\hat{Z}}_{N,\beta }^{\omega }\) taking moderately negative values, we use the bound (3.15), which shows that the corresponding contribution is bounded below by \(-{\tilde{C}}_{\epsilon }(\log N)^{2}\) for some constant \({\tilde{C}}_{\epsilon }\). Therefore, we obtain

$$\begin{aligned} p(\beta )\ge \frac{1}{N}{\mathbb {E}}\left[ \log {\hat{Z}}_{N,\beta }^{\omega }\right] \ge -\frac{C_{5,\epsilon }(\log N)^{2}}{N}\ge -\frac{1}{D^{-1}\left( (1-\epsilon )/\beta ^{2}\right) ^{1-\epsilon }} \end{aligned}$$

for \(\beta \) small enough, where the last inequality is due to the definition of \(N=N_{\beta ,\epsilon }\). \(\square \)