
1 Introduction and Main Results

In Sect. 1.1 we provide some background on metastability. In Sect. 1.2 we define our model: spin-flip dynamics on the Erdős-Rényi random graph ERn(p). In Sect. 1.3 we identify the metastable pair for the dynamics, corresponding to the ‘minus-phase’ and the ‘plus-phase’, respectively. In Sect. 1.4 we recall the definition of spin-flip dynamics on the complete graph K n, which serves as a comparison object, and recall what is known about the average metastable crossover time for spin-flip dynamics on K n (Theorem 1.3 below). In Sect. 1.5 we transfer the sharp asymptotics for K n to a somewhat rougher asymptotics for ERn(p) (Theorem 1.4 below). In Sect. 1.6 we close by placing our results in the proper context and giving an outline of the rest of the paper.

1.1 Background

Interacting particle systems, evolving according to a Metropolis dynamics associated with an energy functional called the Hamiltonian, may end up being trapped for a long time near a state that is a local minimum but not a global minimum. The deepest local minima are called metastable states, the global minimum is called the stable state. The transition from a metastable state to the stable state marks the relaxation of the system to equilibrium. To describe this relaxation, it is of interest to compute the crossover time and to identify the set of critical configurations the system has to visit in order to achieve the transition. The critical configurations represent the saddle points in the free energy landscape: the set of mini-max configurations that must be hit by any path that achieves the crossover.

Metastability for interacting particle systems on lattices has been studied intensively in the past three decades. Various different approaches have been proposed, which are summarised in the monographs by Olivieri and Vares [12], Bovier and den Hollander [4]. Recently, there has been interest in metastability for interacting particle systems on random graphs, which is much more challenging because the crossover time typically depends in a delicate manner on the realisation of the graph.

In the present paper we are interested in metastability for spin-flip dynamics on the Erdős-Rényi random graph. Our main result is an estimate of the average crossover time from the ‘minus-phase’ to the ‘plus-phase’ when the spins feel an external magnetic field at the vertices in the graph as well as a ferromagnetic interaction along the edges in the graph. Our paper is part of a larger enterprise in which the goal is to understand metastability on graphs. Jovanovski [11] analysed the case of the hypercube, Dommers [7] the case of the random regular graph, Dommers, den Hollander, Jovanovski and Nardi [10] the case of the configuration model, and den Hollander and Jovanovski [6] the case of the hierarchical lattice. Each case requires carrying out a detailed combinatorial analysis that is model-specific, even though the metastable behaviour is ultimately universal. For lattices like the hypercube and the hierarchical lattice a full identification of the relevant quantities is possible, while for random graphs like the random regular graph and the configuration model so far only the communication height is well understood, while the set of critical configurations and the prefactor remain somewhat elusive.

The equilibrium behaviour of the Ising model on random graphs is well understood. See e.g. Dommers et al. [8, 9].

1.2 Spin-Flip Dynamics on ERn(p)

Let \(\mathrm {ER}_n(p)=\left (V,E\right )\) be a realisation of the Erdős-Rényi random graph on \(\left |V\right |=n\in \mathbb {N}\) vertices with edge retention probability p ∈ (0, 1), i.e., each edge in the complete graph K n is present with probability p and absent with probability 1 − p, independently of other edges (see Fig. 1). We write \(\mathbb {P}_{\mathrm {ER}_n(p)}\) to denote the law of ERn(p). For typical properties of ERn(p), see van der Hofstad [13, Chapters 4–5].

Fig. 1

A realization of the Erdős-Rényi random graph with n = 12 and \(p= \frac 13\)

Each vertex carries an Ising spin that can take the values − 1 or + 1. Let \(S_{n} =\left \{-1,+1\right \}^{V}\) denote the set of spin configurations on V , and let H n be the Hamiltonian on S n defined by

$$\displaystyle \begin{aligned} H_{n}\left(\sigma\right)=-\frac{1}{n}\sum_{\left(v,w\right)\in E} \sigma(v)\sigma(w) -h\sum_{v\in V}\sigma(v),\quad \sigma\in S_{n}. {} \end{aligned} $$
(1.1)

In other words, single spins interact with an external magnetic field h ∈ (0, ∞), while pairs of neighbouring spins interact with each other with a ferromagnetic coupling strength 1∕n.
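As a concrete illustration (not part of the original analysis), the following Python sketch samples ERn(p) and evaluates the Hamiltonian (1.1); the function names are ours. Since the pair term is identical for the two homogeneous configurations, the all-plus state has strictly lower energy than the all-minus state whenever h > 0.

```python
import random

def sample_er(n, p, rng):
    """Sample ER_n(p): each edge of the complete graph K_n is kept with probability p."""
    return [(v, w) for v in range(n) for w in range(v + 1, n) if rng.random() < p]

def hamiltonian(sigma, edges, n, h):
    """Energy (1.1): pair interaction with coupling 1/n plus external field h."""
    pair = -sum(sigma[v] * sigma[w] for v, w in edges) / n
    field = -h * sum(sigma)
    return pair + field

rng = random.Random(0)
n, p, h = 12, 1 / 3, 0.1
edges = sample_er(n, p, rng)
e_minus = hamiltonian([-1] * n, edges, n, h)   # all spins -1
e_plus = hamiltonian([+1] * n, edges, n, h)    # all spins +1
# e_plus = -|E|/n - h*n and e_minus = -|E|/n + h*n, so e_plus < e_minus when h > 0.
```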

Let \(\boxminus =\left \{-1\right \} ^{V}\) and \(\boxplus =\left \{+1\right \} ^{V}\) denote the configurations where all spins are − 1, respectively, + 1. Since

$$\displaystyle \begin{aligned} \sum_{\left(v,w\right)\in E}\sigma(v)\sigma(w) = \left|E\right| - 2\left|\partial_{E}\sigma\right|, \qquad \sum_{v\in V}\sigma(v) = 2\left|\sigma\right| - n, \end{aligned} $$
(1.2)

we have the geometric representation

$$\displaystyle \begin{aligned} H_{n}\left(\sigma\right) = H_{n}\left(\boxminus\right) + \frac{2}{n}\left|\partial_{E}\sigma\right| - 2h\left|\sigma\right|, \end{aligned} $$
(1.3)

where

$$\displaystyle \begin{aligned} \partial_{E}\sigma=\left\{ \left(v,w\right)\in E\colon \sigma (v) = -\sigma (w) = +1\right\} \end{aligned} $$
(1.4)

is the edge-boundary of σ and

$$\displaystyle \begin{aligned} \left|\sigma\right|=\left|\left\{ v\in V\colon\,\sigma(v)=+1\right\}\right| \end{aligned} $$
(1.5)

is the vertex-volume of σ.
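The quantities (1.4) and (1.5) can be computed directly. The sketch below (helper names are ours) also checks one convenient form of the geometric representation, H n(σ) = H n(⊟) + (2∕n)|∂ Eσ|− 2h|σ|, which follows from (1.1) by counting, edge by edge, whether the endpoint spins agree.

```python
import random

def sample_er(n, p, rng):
    return [(v, w) for v in range(n) for w in range(v + 1, n) if rng.random() < p]

def hamiltonian(sigma, edges, n, h):
    return -sum(sigma[v] * sigma[w] for v, w in edges) / n - h * sum(sigma)

def edge_boundary(sigma, edges):
    """(1.4): number of edges whose endpoints carry opposite spins."""
    return sum(1 for v, w in edges if sigma[v] != sigma[w])

def volume(sigma):
    """(1.5): number of vertices with spin +1."""
    return sigma.count(+1)

rng = random.Random(1)
n, p, h = 10, 0.5, 0.2
edges = sample_er(n, p, rng)
sigma = [rng.choice([-1, +1]) for _ in range(n)]
lhs = hamiltonian(sigma, edges, n, h)
rhs = (hamiltonian([-1] * n, edges, n, h)
       + (2 / n) * edge_boundary(sigma, edges)
       - 2 * h * volume(sigma))
```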

In the present paper we consider a spin-flip dynamics on S n commonly referred to as Glauber dynamics, defined as the continuous-time Markov process with transition rates

$$\displaystyle \begin{aligned} r\left(\sigma,\sigma'\right)=\left\{ \begin{array}{ll} \mathrm{e}^{-\beta [H_{n}(\sigma')-H_{n}(\sigma)]_{+}}, & \mbox{if }\left\Vert \sigma-\sigma'\right\Vert =2,\\ 0, & \mbox{if }\left\Vert \sigma-\sigma'\right\Vert >2, \end{array} \right. \quad \sigma,\sigma' \in S_{n}, {} \end{aligned} $$
(1.6)

where ∥⋅∥ is the 1-norm on S n. This dynamics has as reversible stationary distribution the Gibbs measure

$$\displaystyle \begin{aligned} \mu_n\left(\sigma\right)=\frac{1}{Z_n} \mathrm{e}^{-\beta H_{n}(\sigma)}, \quad \sigma \in S_{n}, \end{aligned} $$
(1.7)

where β ∈ (0, ∞) is the inverse temperature and Z n is the normalizing partition sum. We write

$$\displaystyle \begin{aligned} \left\{\xi_{t}\right\} _{t\geq0} \end{aligned} $$
(1.8)

to denote the path of the random dynamics and \(\mathbb {P}_\xi \) to denote its law given ξ 0 = ξ. For χ ⊂ S n, we write

$$\displaystyle \begin{aligned} \tau_\chi = \inf\{t \geq 0\colon\, \xi_t \in \chi,\,\xi_{t-} \notin \chi\} \end{aligned} $$
(1.9)

to denote the first hitting/return time of χ.
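Reversibility of (1.6) with respect to (1.7) amounts to the detailed-balance relation e^{−βH n(σ)} r(σ, σ′) = e^{−βH n(σ′)} r(σ′, σ) for neighbouring configurations. A numerical sketch (helper names are ours):

```python
import math
import random

def sample_er(n, p, rng):
    return [(v, w) for v in range(n) for w in range(v + 1, n) if rng.random() < p]

def hamiltonian(sigma, edges, n, h):
    return -sum(sigma[v] * sigma[w] for v, w in edges) / n - h * sum(sigma)

def rate(sigma, sigma_prime, edges, n, h, beta):
    """Metropolis rate (1.6); ||sigma - sigma'|| = 2 in the 1-norm means one spin flip."""
    if sum(a != b for a, b in zip(sigma, sigma_prime)) != 1:
        return 0.0
    d_h = hamiltonian(sigma_prime, edges, n, h) - hamiltonian(sigma, edges, n, h)
    return math.exp(-beta * max(d_h, 0.0))

rng = random.Random(2)
n, p, h, beta = 8, 0.5, 0.1, 1.5
edges = sample_er(n, p, rng)
sigma = [rng.choice([-1, +1]) for _ in range(n)]
flipped = sigma.copy()
flipped[0] = -flipped[0]
# Detailed balance for the pair (sigma, flipped):
lhs = math.exp(-beta * hamiltonian(sigma, edges, n, h)) * rate(sigma, flipped, edges, n, h, beta)
rhs = math.exp(-beta * hamiltonian(flipped, edges, n, h)) * rate(flipped, sigma, edges, n, h, beta)
```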

We define the magnetization of σ by

$$\displaystyle \begin{aligned} m\left(\sigma\right)=\frac{1}{n}\sum_{v\in V}\sigma(v), {} \end{aligned} $$
(1.10)

and observe the relation

$$\displaystyle \begin{aligned} m\left(\sigma\right)=\frac{2\left|\sigma\right|}{n}-1, \quad \sigma \in S_{n}. {} \end{aligned} $$
(1.11)

We will frequently switch between working with volume and working with magnetization. Equation (1.11) ensures that these are in one-to-one correspondence. Accordingly, we will frequently look at the dynamics from the perspective of the induced volume process and magnetization process,

$$\displaystyle \begin{aligned} \left\{\left|\xi_{t}\right|\right\}_{t\geq0}, \quad \left\{m(\xi_{t})\right\}_{t\geq0}, \end{aligned} $$
(1.12)

which are not Markov.

1.3 Metastable Pair

For fixed n, the Hamiltonian in (1.1) achieves a global minimum at \(\boxplus \) and a local minimum at \(\boxminus \). In fact, \(\boxminus \) is the deepest local minimum not equal to \(\boxplus \) (at least for h small enough). However, in the limit as n → ∞, these do not form a metastable pair of configurations because entropy comes into play.

Definition 1.1 (Metastable Regime)

The parameters β, h are said to be in the metastable regime when

$$\displaystyle \begin{aligned} \beta \in (1/p,\infty), \qquad h \in \big(0,p \chi(\beta p)\big), \end{aligned} $$
(1.13)

with (see Fig. 2)

$$\displaystyle \begin{aligned} \chi(\lambda) = \sqrt{1-\frac{1}{\lambda}} - \frac{1}{2\lambda} \log\left[\lambda\left(1+\sqrt{1-\frac{1}{\lambda}}\,\right)^{2}\right], \qquad \lambda \geq 1. \end{aligned} $$
(1.14)
Fig. 2

Plot of λ ↦ χ(λ)

We have limλ→∞ χ(λ) = 1 and limλ↓1 χ(λ) = 0 (with slope \(\frac 12\)). Hence, for β → ∞ any h ∈ (0, p) is metastable, while for β ↓ 1∕p or p ↓ 0 no h ∈ (0, ∞) is metastable. The latter explains why we do not consider the non-dense Erdős-Rényi random graph with p = p n ↓ 0 as n → ∞. \(\blacksquare \)
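A quick numerical look at (1.14) confirms these two limits (a sketch; the sample values of λ are ours):

```python
import math

def chi(lam):
    """The function (1.14) entering the metastable regime (1.13)."""
    s = math.sqrt(1.0 - 1.0 / lam)
    return s - math.log(lam * (1.0 + s) ** 2) / (2.0 * lam)

# chi vanishes as lambda -> 1 and approaches 1 as lambda -> infinity,
# so the admissible field window (0, p*chi(beta*p)) opens up as beta*p grows.
near_one = chi(1.0001)
large = chi(1.0e6)
```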

The threshold β c = 1∕p is the critical inverse temperature: the static model has a phase transition at h = 0 when β > β c and no phase transition when β ≤ β c (see e.g. Dommers et al. [9]).

To define the proper metastable pair of configurations, we need the following definitions. Let

$$\displaystyle \begin{aligned} \begin{array}{ll} &\varGamma_n = \{-1,-1+\frac{2}{n},\ldots,1-\frac{2}{n},1\},\\ &I_n(a) = - \frac{1}{n} \log {n \choose \frac{1+a}{2}n}, \quad J_n(a) = 2\beta(pa+h) - 2 I^{\prime}_n(a). \end{array}\end{aligned} $$
(1.15)

Define

$$\displaystyle \begin{aligned} \begin{array}{ll} {\mathbf{m}}_n &= \min\left\{a \in \varGamma_n\colon\, J_n(a) \leq 0\right\},\\ {\mathbf{t}}_n &= \min\left\{a\in \varGamma_n\colon\, a>{\mathbf{m}}_n,\, J_n(a) \geq 0\right\},\\ {\mathbf{s}}_n &= \min\left\{a\in \varGamma_n\colon\, a>{\mathbf{t}}_n,\, J_n(a) \leq 0\right\}. \end{array} {}\end{aligned} $$
(1.16)

The numbers on the left-hand side of (1.16) play the role of magnetizations. Further define

$$\displaystyle \begin{aligned} {\mathbf{M}}_n= \frac{n}{2}({\mathbf{m}}_n+1), \quad {\mathbf{T}}_n= \frac{n}{2}({\mathbf{t}}_n+1), \quad {\mathbf{S}}_n= \frac{n}{2}({\mathbf{s}}_n+1),\end{aligned} $$
(1.17)

which are the volumes corresponding to (1.16), and

$$\displaystyle \begin{aligned} A_k =\left\{ \sigma\in S_{n}\colon\,\left|\sigma\right|=k\right\}, \qquad k \in \{0,1,\ldots,n-1,n \}, {} \end{aligned} $$
(1.18)

the set of configurations with volume k. Define

$$\displaystyle \begin{aligned} R_n(a)=-\frac{1}{2}pa^{2}-ha+\frac{1}{\beta}I_n(a) \end{aligned} $$
(1.19)

and note that

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} R^{\prime}_n(a) = -pa-h+\frac{1}{\beta}I^{\prime}_n(a) = - \frac{1}{2\beta} J_n(a). \end{array} \end{aligned} $$
(1.20)

The motivation behind the definitions in (1.15), (1.16) and (1.19) will become clear in Sect. 2. Via Stirling’s formula it follows that

$$\displaystyle \begin{aligned} J_n(a) = 2\beta(pa+h) +\log\left(\frac{1-a+\frac{1}{n}}{1+a+\frac{1}{n}}\right) + O(n^{-2}), \quad a \in \varGamma_n. \end{aligned} $$
(1.21)

We will see that, in the limit as n → when (β, h) is in the metastable regime defined by (1.13), the numbers in (1.16) are well-defined: \(A_{{\mathbf {M}}_n}\) is the metastable set, \(A_{{\mathbf {S}}_n}\) is the stable set, \(A_{{\mathbf {T}}_n}\) is the top set, i.e., the set of saddle points that lie in between \(A_{{\mathbf {M}}_n}\) and \(A_{{\mathbf {S}}_n}\). Our key object of interest will be the crossover time from \(A_{{\mathbf {M}}_n}\) to \(A_{{\mathbf {S}}_n}\) via \(A_{{\mathbf {T}}_n}\).

Note that

$$\displaystyle \begin{aligned} \varGamma_n \to [-1,1], \quad I_n(a) \to I(a), \quad J_n(a) \to J_{p,\beta,h}(a), \qquad n\to\infty, \end{aligned} $$
(1.22)

with

$$\displaystyle \begin{aligned} J_{p,\beta,h}(a) = 2\beta(pa+h) + \log\left(\frac{1-a}{1+a}\right) \end{aligned} $$
(1.23)

and

$$\displaystyle \begin{aligned} I(a) = \frac{1-a}{2} \log \left(\frac{1-a}{2}\right) + \frac{1+a}{2} \log \left(\frac{1+a}{2}\right). \end{aligned} $$
(1.24)

Accordingly,

$$\displaystyle \begin{aligned} {\mathbf{m}}_n \to \mathbf{m}, \quad {\mathbf{t}}_n \to \mathbf{t}, \quad {\mathbf{s}}_n \to \mathbf{s}, \qquad n\to\infty, \end{aligned} $$
(1.25)

with m, t, s the three successive zeroes of J (see Fig. 4 and recall (1.16)). Define

$$\displaystyle \begin{aligned} R_{p,\beta,h}(a)=-\frac{1}{2}pa^{2}-ha+\frac{1}{\beta}I(a). \end{aligned} $$
(1.26)

Note that R p,β,h(a) plays the role of free energy: \(-\frac {1}{2}pa^{2}-ha\) and \(\frac {1}{\beta } I(a)\) represent the energy, respectively, entropy at magnetisation a. Note that I(a) equals the relative entropy of the probability measure \(\frac 12(1+a)\delta _{+1} + \frac 12(1-a)\delta _{-1}\) with respect to the counting measure δ +1 + δ −1. Also note that

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} R^{\prime}_{p,\beta,h}(a) = -pa-h+\frac{1}{\beta}I'(a) = - \frac{1}{2\beta} J_{p,\beta,h}(a). \end{array} \end{aligned} $$
(1.27)
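For concrete parameter values in the metastable regime (1.13), the three zeroes m < t < s of J p,β,h can be located numerically by scanning for sign changes and refining with bisection. In the sketch below the parameter values are our own choice, with βp = 1.8 and h = 0.05 < pχ(βp) ≈ 0.13.

```python
import math

def J(a, p, beta, h):
    """The limiting function (1.23)."""
    return 2.0 * beta * (p * a + h) + math.log((1.0 - a) / (1.0 + a))

def zeroes_of_J(p, beta, h, grid=4000):
    """Find sign changes of a -> J(a) on (-1, 1), then refine each by bisection."""
    roots = []
    pts = [-1.0 + 2.0 * (i + 1) / (grid + 2) for i in range(grid + 1)]
    for lo, hi in zip(pts, pts[1:]):
        if J(lo, p, beta, h) * J(hi, p, beta, h) < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if J(lo, p, beta, h) * J(mid, p, beta, h) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

p, beta, h = 0.6, 3.0, 0.05          # beta * p = 1.8 > 1, h < p * chi(beta * p)
m, t, s = zeroes_of_J(p, beta, h)
```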

Remark 1.2

As shown in Corollary 3.6 below, if h ∈ (p, ∞), then (1.6) leads to non-metastable behaviour where the dynamics ‘drifts’ through a sequence of configurations with volume growing from M to S within time O(1). \(\blacksquare \)

1.4 Metastability on K n

Let K n be the complete graph on n vertices (see Fig. 3). Spin-flip dynamics on K n, commonly referred to as Glauber dynamics for the Curie-Weiss model, is defined as in Sect. 1.2 but with the Curie-Weiss Hamiltonian

$$\displaystyle \begin{aligned} H_{n}(\sigma)=-\frac{1}{2n} \sum_{1\leq i,j\leq n}\sigma(i)\sigma(j) -h\sum_{1\leq i\leq n}\sigma(i),\quad \sigma\in S_{n}. \end{aligned} $$
(1.28)

This is the same as (1.1) with p = 1, except for the diagonal term \(-\frac {1}{2n} \sum _{1 \leq i \leq n}\sigma (i)\sigma (i)\) \(= -\frac 12\), which shifts H n by a constant and has no effect on the dynamics. The advantage of (1.28) is that we may write

$$\displaystyle \begin{aligned} H_{n}(\sigma) = n \left[-\frac 12 m(\sigma)^2-hm(\sigma)\right], \end{aligned} $$
(1.29)

which shows that the energy is a function of the magnetization only, i.e., the Curie-Weiss model is a mean-field model. Clearly, this property fails on ERn(p).
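The mean-field property can be verified directly: under (1.28)–(1.29) the energy depends on σ only through m(σ), so permuting the spins leaves it unchanged (sketch; helper names are ours).

```python
import random

def h_curie_weiss(sigma, h):
    """(1.28): the double sum over ALL ordered pairs (i, j) equals (sum of spins)^2."""
    n = len(sigma)
    total = sum(sigma)
    return -total * total / (2.0 * n) - h * total

def h_mean_field(sigma, h):
    """(1.29): the same energy written as a function of m(sigma) alone."""
    n = len(sigma)
    m = sum(sigma) / n
    return n * (-0.5 * m * m - h * m)

rng = random.Random(3)
n, h = 20, 0.1
sigma = [rng.choice([-1, +1]) for _ in range(n)]
shuffled = sigma.copy()
rng.shuffle(shuffled)   # same magnetization, hence same energy
```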

Fig. 3

The complete graph with n = 9

Fig. 4
figure 4

Plot of R p,β,h(a) as a function of the magnetization a. The metastable set A M has magnetization m < 0, the stable set A S has magnetization s > 0, the top set has magnetization t < 0. Note that \(R_{p,\beta ,h}(-1) = - \frac 12p+h\), \(R_{p,\beta ,h}(0) = - \beta ^{-1}\log 2\), \(R_{p,\beta ,h}(+1) = - \frac 12 p-h\) and \(R_{p,\beta ,h}^{\prime }(-1) = -\infty \), \(R_{p,\beta ,h}^{\prime }(0) = - h\), \(R_{p,\beta , h}^{\prime }(+1) = \infty \)

For the Curie-Weiss model it is known that there is a critical inverse temperature β c = 1 such that, for β > β c, h small enough and in the limit as n → ∞, the stationary distribution μ n given by (1.7) and (1.28) has two phases: the ‘minus-phase’, where the majority of the spins are − 1, and the ‘plus-phase’, where the majority of the spins are + 1. These two phases are the metastable state, respectively, the stable state for the dynamics. In the limit as n → ∞, the dynamics of the magnetization introduced in (1.12) (which is Markov) converges to a Brownian motion on [−1, +1] in the double-well potential a ↦ R 1,β,h(a) (see Fig. 4).

The following theorem can be found in Bovier and den Hollander [4, Chapter 13]. For p = 1, the metastable regime in (1.13) becomes

$$\displaystyle \begin{aligned} \beta \in (1,\infty), \qquad h \in \big(0,\chi(\beta)\big). \end{aligned} $$
(1.30)

Theorem 1.3 (Average Crossover Time on K n)

Subject to (1.30), as n → ∞, uniformly in \(\xi \in A_{{\mathbf {M}}_n}\),

$$\displaystyle \begin{aligned} \begin{array}{ll} &\mathbb{E}_{\xi}\left[\tau_{A_{{\mathbf{S}}_n}}\right]\\ &= [1+o_n(1)]\,\frac{\pi}{1+\mathbf{t}} \sqrt{\frac{1-{\mathbf{t}}^{2}}{1-{\mathbf{m}}^{2}}} \frac{1}{\beta\sqrt{R_{1,\beta,h}^{\prime\prime}(\mathbf{m})[-R_{1,\beta,h}^{\prime\prime}(\mathbf{t})]}} \,\mathrm{e}^{\beta n [R_{1,\beta,h}(\mathbf{t})-R_{1,\beta,h}(\mathbf{m})]}. \end{array} \end{aligned} $$
(1.31)

Figure 4 illustrates the setting: the average crossover time from \(A_{{\mathbf {M}}_n}\) to \(A_{{\mathbf {S}}_n}\) depends on the free energy barrier R 1,β,h(t) − R 1,β,h(m) and on the curvature of R 1,β,h at m and t. Note that m, s, t in Fig. 4 are the limits as n → ∞ of m n, s n, t n defined in (1.16) for p = 1.

1.5 Metastability on ERn(p)

Unlike for the spin-flip dynamics on K n, the induced processes defined in (1.12) are not Markovian. This is due to the random geometry of ERn(p). However, we will see that they are almost Markovian, a fact that we will exploit by comparing the dynamics on ERn(p) with that on K n, but with a ferromagnetic coupling strength p∕n rather than 1∕n and with an external magnetic field that is a small perturbation of h.

As shown in Lemma 2.2 below, in the metastable regime the function a ↦ R p,β,h(a) has a double-well structure just like in Fig. 4, so that the metastable state A M and the stable state A S are separated by a free energy barrier represented by A T.

We are finally in a position to state our main theorem.

Theorem 1.4 (Average Crossover Time on ERn(p))

Subject to (1.13), with \(\mathbb {P}_{\mathrm {ER}_n(p)}\) -probability tending to 1 as n → ∞, and uniformly in \(\xi \in A_{{\mathbf {M}}_n}\),

$$\displaystyle \begin{aligned} \mathbb{E}_{\xi}\left[\tau_{A_{{\mathbf{S}}_n}}\right] = n^{E_n}\,\mathrm{e}^{\beta n [R_{p,\beta,h}(\mathbf{t})-R_{p,\beta,h}(\mathbf{m})]} \end{aligned} $$
(1.32)

where the random exponent E n satisfies

$$\displaystyle \begin{aligned} \lim_{n\to\infty} \mathbb{P}_{\mathrm{ER}_n(p)}\left(|E_n| \leq \beta (\mathbf{t}-\mathbf{m})\frac{11}{6}\right)=1. \end{aligned} $$
(1.33)

Thus, apart from a polynomial error term, the average crossover time is the same as on the complete graph with ferromagnetic interaction strength p∕n instead of 1∕n.

1.6 Discussion and Outline

We discuss the significance of our main theorem.

  1. 1.

    Theorem 1.4 provides an estimate on the average crossover time from \(A_{{\mathbf {M}}_n}\) to \(A_{{\mathbf {S}}_n}\) on ERn(p) (recall Fig. 4). The estimate is uniform in the starting configuration. The exponential term in the estimate is the same as on K n, but with a ferromagnetic interaction strength p∕n rather than 1∕n. The multiplicative error term is at most polynomial in n. Such an error term is not present on K n, for which the prefactor is known to be a constant up to a multiplicative factor 1 + o(1) (as shown in Theorem 1.3). The randomness of ERn(p) manifests itself through a more complicated prefactor, which we do not know how to identify. What is interesting is that, apparently, ERn(p) is so homogeneous for large n that the prefactor is at most polynomial. We expect the prefactor to be random under the law \(\mathbb {P}_{\mathrm {ER}_n(p)}\).

  2. 2.

    It is known that on K n the crossover time divided by its average has an exponential distribution in the limit as n → ∞, as is typical for metastable behaviour. The same is true on ERn(p). A proof of this fact can be obtained in a straightforward manner from the comparison properties underlying the proof of Theorem 1.4. These comparison properties, which are based on coupling of trajectories, also allow us to identify the typical set of trajectories followed by the spin-flip dynamics prior to the crossover. We will not spell out the details.

  3. 3.

    The proof of Theorem 1.4 is based on estimates of transition probabilities and transition times between pairs of configurations with different volume, in combination with a coupling argument. Thus we are following the path-wise approach to metastability (see [4] for background). Careful estimates are needed because on ERn(p) the processes introduced in (1.12) are not Markovian, unlike on K n. The proof is based on a double coupling strategy: (1) a sandwich of the Erdős-Rényi dynamics between two small perturbations of the Curie-Weiss dynamics, with the goal to identify the leading order term of the average crossover time with the help of Theorem 1.3; (2) a two-level coupling (defined in Sect. 6), with the goal to prove asymptotic independence from the starting configuration (see also the beginning of Sects. 5.2 and 7).

  4. 4.

    Bovier et al. [5] use capacity estimates and concentration of measure estimates to show that the prefactors form a tight family of random variables under the law \(\mathbb {P}_{\mathrm {ER}_n(p)}\) as n → ∞, which constitutes a considerable sharpening of (1.32). The result is valid for β > β c and h small enough. The starting configuration is not arbitrary, but is drawn according to the last-exit-biased distribution for the transition from \(A_{{\mathbf {M}}_n}\) to \(A_{{\mathbf {S}}_n}\), as is common in the potential-theoretic approach to metastability. The exponential limit law is therefore not immediate.

  5. 5.

    Another interesting model is where the randomness sits in the vertices rather than in the edges, namely, Glauber spin-flip dynamics with Hamiltonian

    $$\displaystyle \begin{aligned} H_n(\sigma) = -\frac{1}{n} \sum_{1 \leq i,j \leq n} \sigma(i) \sigma(j) - \sum_{1 \leq i \leq n} h_i\sigma(i),\end{aligned} $$
    (1.34)

    where h i, 1 ≤ i ≤ n, are i.i.d. random variables drawn from a common probability distribution ν on \(\mathbb {R}\). The metastable behaviour of this model was analysed in Bovier et al. [3] (discrete ν) and Bianchi et al. [1] (continuous ν). In particular, the prefactor was computed up to a multiplicative factor 1 + o(1), and turns out to be rather involved (see [4, Chapters 14–15]). Our model is even harder because the interaction between the spins runs along the edges of ERn(p), which have an intricate spatial structure. Consequently, the so-called lumping technique (employed in [3] and [1] to monitor the magnetization on the level sets of the magnetic field) can no longer be used. For the dynamics under (1.34) the exponential law was proved in Bianchi et al. [2].

Outline

The remainder of the paper is organized as follows. In Sect. 2 we define the perturbed spin-flip dynamics on K n (Definition 2.1 below) and explain why Definition 1.1 identifies the metastable regime (Lemma 2.2 below). In Sect. 3 we collect a few basic facts about the geometry of ERn(p) and the spin-flip dynamics on ERn(p). In Sect. 4 we derive rough capacity estimates for the spin-flip dynamics on ERn(p). In Sect. 5 we derive refined capacity estimates. In Sect. 6 we show that two copies of the spin-flip dynamics starting near the metastable state can be coupled in a short time. In Sect. 7 we prove Theorem 1.4. In Sect. 8, finally, we do a technical computation of hitting times that is needed in the proof.

2 Preparations

In Sect. 2.1 we define the perturbed spin-flip dynamics on K n that will be used as comparison object. In Sect. 2.2 we do a rough metastability analysis of the perturbed model. In Sect. 2.3 we show that R p,β,h has a double-well structure if and only if (β, h) is in the metastable regime, in the sense of Definition 1.1 (Lemma 2.2 below).

Define

$$\displaystyle \begin{aligned} J^*_n(a) = 2\beta\left(p\left(a+\frac{2}{n}\right)+h\right) +\log\left(\frac{1-a}{1+a+\frac{2}{n}}\right), \quad a \in \varGamma_n.\end{aligned} $$
(2.1)

We see from (1.21) that \(J_n(a) = J^*_n(a) + O(n^{-2})\) when \(\beta p = \frac {1}{1-a^2}\). This will be useful for the analysis of the ‘free energy landscape’.

2.1 Perturbed Curie-Weiss

We will compare the dynamics on ERn(p) with that on K n, but with a ferromagnetic coupling strength p∕n rather than 1∕n, and with an external magnetic field that is a small perturbation of h.

Definition 2.1 (Perturbed Curie-Weiss)

  1. (1)

    Let

    $$\displaystyle \begin{aligned} \begin{array}{rcl} {} H_{n}^{u}\left(\sigma\right) & =&\displaystyle -\frac{p}{2n}\sum_{1\leq i,j\leq n}\sigma(i)\sigma(j) -h_n^{u}\sum_{1\leq i\leq n}\sigma(i), \quad \sigma\in S_{n}, \end{array} \end{aligned} $$
    (2.2)
    $$\displaystyle \begin{aligned} \begin{array}{rcl} {} H_{n}^{l}\left(\sigma\right) & =&\displaystyle -\frac{p}{2n}\sum_{1\leq i,j\leq n}\sigma(i)\sigma(j) -h_n^{l}\sum_{1\leq i\leq n}\sigma(i), \quad \,\sigma\in S_{n}, \end{array} \end{aligned} $$
    (2.3)

    be the Hamiltonians on S n corresponding to the Curie-Weiss model on n vertices with ferromagnetic coupling strength p∕n, and with external magnetic fields \(h_n^{u}\) and \(h_n^{l}\) given by

    $$\displaystyle \begin{aligned} h_n^{u} = h+\frac{\left(1+\epsilon\right)\log(n^{11/6})}{n}, \qquad h_n^{l} = h-\frac{\left(1+\epsilon\right)\log(n^{11/6})}{n}, {}\end{aligned} $$
    (2.4)

    where 𝜖 > 0 is arbitrary. The indices u and l stand for upper and lower, and the choice of exponent \(\frac {11}{6}\) will become clear in Sect. 4.

  2. (2)

    The equilibrium measures on S n corresponding to (2.2) and (2.3) are denoted by \(\mu _n^{u}\) and \(\mu _n^{l}\), respectively (recall (1.7)).

  3. (3)

    The Glauber dynamics based on (2.2) and (2.3) are denoted by

    $$\displaystyle \begin{aligned} \{\xi_{t}^{u}\}_{t\geq 0}, \qquad \{\xi_{t}^{l}\}_{t\geq 0},\end{aligned} $$
    (2.5)

    respectively.

  4. (4)

    The analogues of (1.16) and (1.17) are written \({\mathbf {m}}^u_n,{\mathbf {t}}^u_n, {\mathbf {s}}^u_n\), \({\mathbf {M}}^u_n,{\mathbf {T}}^u_n,{\mathbf {S}}^u_n\) and \({\mathbf {m}}^l_n,{\mathbf {t}}^l_n, {\mathbf {s}}^l_n\), \({\mathbf {M}}^l_n,{\mathbf {T}}^l_n,{\mathbf {S}}^l_n\), respectively. \(\blacksquare \)

In what follows we will suppress the n-dependence from most of the notation. Almost all of the analysis in Sects. 2–7 pertains to the dynamics on ERn(p).

2.2 Metastability for Perturbed Curie-Weiss

Recall that \(\{\xi _{t}^{u}\}_{t\geq 0}\) and \(\{\xi _{t}^{l}\} _{t\geq 0}\) denote the Glauber dynamics for the Curie-Weiss model driven by (2.2) and (2.3), respectively. An important feature is that their magnetization processes

$$\displaystyle \begin{aligned} \begin{array}{ll} \left\{\theta_{t}^{u}\right\} _{t\geq0} & =\{m(\xi_{t}^{u})\} _{t\geq0},\\ \left\{\theta_{t}^{l}\right\} _{t\geq0} & =\{m(\xi_{t}^{l})\} _{t\geq0}, \end{array} {} \end{aligned} $$
(2.6)

are continuous-time Markov processes themselves (see e.g. Bovier and den Hollander [4, Chapter 13]) with state space \(\varGamma _n=\{ -1,-1+\frac {2}{n}, \ldots ,1-\frac {2}{n},1\}\) and transition rates

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} &&q^{u}\left(a,a'\right)=\left\{ \begin{array}{ll} \frac{n}{2}\left(1-a\right)\mathrm{e}^{-\beta [p(-2a-\frac{2}{n})-2h^{u}]_{+}}, & \mbox{if }a'=a+\frac{2}{n},\\ \frac{n}{2}\left(1+a\right)\mathrm{e}^{-\beta [p(2a-\frac{2}{n})+2h^{u}]_{+}}, & \mbox{if }a'=a-\frac{2}{n},\\ 0, & \mbox{otherwise}, \end{array} \right. \end{array} \end{aligned} $$
(2.7)
$$\displaystyle \begin{aligned} \begin{array}{rcl} {} &&q^{l}\left(a,a'\right)=\left\{ \begin{array}{ll} \frac{n}{2}\left(1-a\right) \mathrm{e}^{-\beta [p(-2a-\frac{2}{n})-2h^{l}]_{+}}, & \mbox{if }a'=a+\frac{2}{n},\\ \frac{n}{2}\left(1+a\right) \mathrm{e}^{-\beta [p(2a-\frac{2}{n})+2h^{l}]_{+}}, & \mbox{if }a'=a-\frac{2}{n},\\ 0, & \mbox{otherwise}, \end{array} \right. \end{array} \end{aligned} $$
(2.8)

respectively. The processes in (2.6) are reversible with respect to the Gibbs measures

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \nu^{u}\left(a\right) & =&\displaystyle \frac{1}{z^{u}} \mathrm{e}^{\beta n(\frac 12 pa^{2} +h^{u}a)}{n \choose \frac{1+a}{2}n},\quad a\in \varGamma_n, \end{array} \end{aligned} $$
(2.9)
$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \nu^{l}\left(a\right) & =&\displaystyle \frac{1}{z^{l}} \mathrm{e}^{\beta n(\frac 12 pa^{2} +h^{l}a)}{n \choose \frac{1+a}{2}n},\quad a\in \varGamma_n, \end{array} \end{aligned} $$
(2.10)

respectively.
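Reversibility of the chains (2.6) with respect to (2.9)–(2.10) can be checked via detailed balance, ν u(a) q u(a, a + 2∕n) = ν u(a + 2∕n) q u(a + 2∕n, a). The sketch below (helper names are ours) works in log-scale; note that we write the exponent of the downward rate as the energy difference of a downward step, p(2a − 2∕n) + 2h u, which is what makes the balance exact.

```python
import math

def log_binom(n, k):
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log_nu(a, n, p, beta, hu):
    """Log of the unnormalised Gibbs weight (2.9) at magnetization a."""
    k = round((1.0 + a) * n / 2)            # number of +1 spins
    return beta * n * (0.5 * p * a * a + hu * a) + log_binom(n, k)

def log_q_up(a, n, p, beta, hu):
    """Log rate for a -> a + 2/n, as in (2.7)."""
    return math.log(0.5 * n * (1.0 - a)) - beta * max(p * (-2.0 * a - 2.0 / n) - 2.0 * hu, 0.0)

def log_q_down(a, n, p, beta, hu):
    """Log rate for a -> a - 2/n; the exponent is the energy cost of a downward step."""
    return math.log(0.5 * n * (1.0 + a)) - beta * max(p * (2.0 * a - 2.0 / n) + 2.0 * hu, 0.0)

n, p, beta, hu = 10, 0.6, 3.0, 0.05
a = -0.6                                    # a point of Gamma_n; a + 2/n = -0.4
lhs = log_nu(a, n, p, beta, hu) + log_q_up(a, n, p, beta, hu)
rhs = log_nu(a + 2.0 / n, n, p, beta, hu) + log_q_down(a + 2.0 / n, n, p, beta, hu)
```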

Define

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \varPsi^{u}\left(a\right) & =&\displaystyle -\frac 12 pa^{2}-h^{u}a,\quad a\in \varGamma_n, \end{array} \end{aligned} $$
(2.11)
$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \varPsi^{l}(a) & =&\displaystyle -\frac 12 pa^{2}-h^{l}a,\quad a\in \varGamma_n. \end{array} \end{aligned} $$
(2.12)

Note that (2.7) and (2.9) can be written as

$$\displaystyle \begin{aligned} \begin{array}{rcl} q^{u}\left(a,a+\frac{2}{n}\right) & = &\displaystyle \frac{n}{2}\left(1-a\right) \mathrm{e}^{-\beta n [\varPsi^{u}(a+\frac{2}{n})-\varPsi^{u}(a)]_{+}},\\ \nu^{u}\left(a\right) & = &\displaystyle \frac{1}{z^{u}} \mathrm{e}^{-\beta n\varPsi^{u}(a)} {n \choose \frac{n}{2}\left(1+a\right)}, {} \end{array} \end{aligned} $$
(2.13)

and similar formulas hold for (2.8) and (2.10). The properties of the function \(\nu ^{u}\colon \varGamma _n\to \left [0,1\right ]\) can be analysed by looking at the ratio of adjacent values:

$$\displaystyle \begin{aligned} \frac{\nu^{u}\left(a+\frac{2}{n}\right)}{\nu^{u}\left(a\right)} =\exp\left(2\beta\left(p\left(a+\frac{2}{n}\right)+h^{u}\right) +\log\Big(\frac{1-a}{1+a+\frac{2}{n}}\Big)\right), {} \end{aligned} $$
(2.14)

which suggests that ‘local free energy wells’ in ν u can be found by looking at where the sign of

$$\displaystyle \begin{aligned} 2\beta\left(p\left(a+\frac{2}{n}\right)+h^{u}\right)+\log\left(\frac{1-a}{1+a+\frac{2}{n}}\right) \end{aligned} $$
(2.15)

changes from negative to positive. To that end note that, in the limit n → ∞, the second term is positive for a < 0, tends to ∞ as a →−1, is negative for a ≥ 0, tends to −∞ as a → 1, and tends to 0 as a → 0. The first term is linear in a, and for appropriate choices of p, β and h u (see Definition 1.1) is negative near a = −1 and becomes positive at some value a < 0. This implies that, for appropriate choices of p, β and h u, the sum of the two terms in (2.15) can change sign + →−→ + on the interval \(\left [-1,0\right ]\), and can change sign + →− on \(\left [0,1\right ]\). Assuming that our choice of p, β and h u corresponds to this change-of-signs sequence, we define m u, t u and s u as in (1.16) with h replaced by h u. This observation makes it clear that the sets in the right-hand side of (1.16) indeed are non-empty.

The interval \(\left [{\mathbf {m}}^{u},{\mathbf {t}}^{u}\right ]\) poses a barrier for the process \(\{\theta _{t}^{u}\}_{t \geq 0}\) due to a negative drift, which delays the initiation of the convergence to equilibrium while the process passes through the interval \(\left [{\mathbf {t}}^{u}, {\mathbf {s}}^{u}\right ]\). The same is true for the process \(\{\xi _{t}^{u}\}_{t\geq 0}\). Similar observations hold for \(\{\theta _{t}^{l}\}_{t \geq 0}\) and \(\{\xi _{t}^{l}\}_{t\geq 0}\). Recall Fig. 4.

2.3 Double-Well Structure

Lemma 2.2 (Metastable Regime)

The potential R p,β,h defined in (1.26) has a double-well structure if and only if βp > 1 and 0 < h < pχ(βp), with χ defined in (1.14).

Proof

In order for R p,β,h to have a double-well structure, it must have two local minima on the interval \(\left (-1,1\right )\), separated by a local maximum. From (1.22), (1.27) and (2.14) it follows that

$$\displaystyle \begin{aligned} J_{p,\beta,h}(a)=2\lambda\left(a+\frac{h}{p}\right)+\log\left(\frac{1-a}{1+a}\right), \qquad \lambda=\beta p, \end{aligned} $$
(2.16)

must have three zeroes in \(\left (-1,1\right )\), which requires that it takes a negative value at its local minimum. Since

$$\displaystyle \begin{aligned} J_{p,\beta,h}^{\prime}(a)=2\left(\lambda-\frac{1}{1-a^2}\right), \qquad a \in [-1,1], \end{aligned} $$
(2.17)

it must therefore be that λ > 1. The local minimum is attained when

$$\displaystyle \begin{aligned} \lambda=\frac{1}{1-a^{2}}, \end{aligned} $$
(2.18)

i.e., when \(a=a_\lambda =-\sqrt {1-\frac {1}{\lambda }}\) (a λ must be negative because it lies in (m, t); recall Fig. 4). Since

$$\displaystyle \begin{aligned} 0 > J_{p,\beta,h}(a_\lambda)=2\lambda\left(a_\lambda+\frac{h}{p}\right) +\log\left(\frac{1-a_\lambda}{1+a_\lambda}\right), \end{aligned} $$
(2.19)

it must therefore be that

$$\displaystyle \begin{aligned} \frac{h}{p} < \chi(\lambda) \end{aligned} $$
(2.20)

with χ(λ) given by (1.14). □
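The computation above can be sanity-checked numerically: at the critical field h = pχ(λ), the local minimum of J p,β,h, attained at a λ = −√(1 − 1∕λ), sits exactly at height zero, while for smaller h it is strictly negative (sketch; the value of p is our own choice).

```python
import math

def chi(lam):
    """(1.14)."""
    s = math.sqrt(1.0 - 1.0 / lam)
    return s - math.log(lam * (1.0 + s) ** 2) / (2.0 * lam)

def J(a, p, beta, h):
    """(2.16) with lambda = beta * p."""
    return 2.0 * beta * (p * a + h) + math.log((1.0 - a) / (1.0 + a))

p = 0.7
residuals = []
for lam in (1.5, 2.0, 5.0):
    a_lam = -math.sqrt(1.0 - 1.0 / lam)     # location (2.18) of the local minimum
    residuals.append(J(a_lam, p, lam / p, p * chi(lam)))   # should vanish at h = p*chi
below = J(a_lam, p, lam / p, 0.5 * p * chi(lam))           # half the critical field
```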

3 Basic Facts

In this section we collect a few facts that will be needed in Sect. 4 to derive capacity estimates for the dynamics on ERn(p). In Sect. 3.1 we derive a large deviation bound for the degree of typical vertices of ERn(p) (Lemma 3.2 below). In Sect. 3.2 we do the same for the edge-boundary of typical configurations (Lemma 3.3 below). In Sect. 3.3 we derive upper and lower bounds for the jump rates of the volume process (Lemmas 3.4–3.5 and Corollary 3.6 below), and show that the return times to the metastable set conditional on not hitting the top set are small (Lemma 3.7 below). In Sect. 3.4 we use the various bounds to show that the probability for the volume process to grow by n 1∕3 is almost uniform in the starting configuration (Lemma 3.8 and Corollary 3.9 below).

Definition 3.1 (Notation)

For a vertex v ∈ V , we will write v ∈ σ to mean σ(v) = +1 and \(v\notin\sigma\) to mean σ(v) = −1. Similarly, we will denote by \(\overline {\sigma }\) the configuration obtained from σ by flipping the spin at every vertex, i.e., σ(v) = +1 if and only if \(\overline {\sigma } (v)=-1\). For two configurations σ, σ′ we will say that σ ⊆ σ′ if and only if v ∈ σ ⇒ v ∈ σ′. By σ ∪ σ′ we denote the configuration satisfying v ∈ σ ∪ σ′ if and only if v ∈ σ or v ∈ σ′. A similar definition applies to σ ∩ σ′. We will also write σ ∼ σ′ when there is a v ∈ V  such that \(\sigma =\sigma '\cup \left \{v\right \} \) or \(\sigma '=\sigma \cup \left \{v\right \}\). We will say that σ and σ′ are neighbours. We write deg(v) for the degree of v ∈ V . \(\blacksquare \)

3.1 Concentration Bounds for ERn(p)

Recall that \(\mathbb {P}_{\mathrm {ER}_n(p)}\) denotes the law of ERn(p).

Lemma 3.2 (Concentration of Degrees and Energies)

With \(\mathbb {P}_{\mathrm {ER}_n(p)}\) -probability tending to 1 as n → ∞ the following is true. For any 𝜖 > 0 and any \(c>\sqrt {\frac {1}{8}\log 2}\),

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} & &\displaystyle pn-(1+\epsilon) \sqrt{n\log n}<\deg(v) <pn+(1+\epsilon) \sqrt{n\log n} \qquad \forall\,v \in V,\qquad \end{array} \end{aligned} $$
(3.1)
(3.2)

Proof

These bounds are immediate from Hoeffding’s inequality and a union bound. □
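The arithmetic behind the proof can be made explicit: deg(v) is a sum of n − 1 independent Bernoulli(p) variables, so Hoeffding gives \(\mathbb{P}[|\deg(v)-p(n-1)|\geq t]\leq 2\mathrm{e}^{-2t^{2}/(n-1)}\), and a union bound over the n vertices with \(t=(1+\epsilon)\sqrt{n\log n}\) yields a bound of order \(n^{1-2(1+\epsilon)^{2}}\), which vanishes as n → ∞. A minimal numeric sketch (the parameter values are illustrative):

```python
import math

def union_bound(n, eps):
    # Hoeffding + union bound for n vertices, each of degree Bin(n-1, p):
    # P[ max_v |deg(v) - p(n-1)| >= t ] <= 2 n exp(-2 t^2 / (n-1))
    t = (1 + eps) * math.sqrt(n * math.log(n))
    return 2 * n * math.exp(-2 * t**2 / (n - 1))

# the bound is roughly 2 n^{1 - 2(1+eps)^2} and vanishes as n grows
assert union_bound(10**4, 0.1) < union_bound(10**3, 0.1) < 1e-2
```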

3.2 Edge Boundaries of ERn(p)

We partition the configuration space as

$$\displaystyle \begin{aligned} S_{n}=\bigcup_{k=0}^{n}A_{k}, \end{aligned} $$
(3.3)

where A k is defined in (1.18). For 0 ≤ k ≤ n and \(-pk\left (n-k\right )\leq i\leq \left (1-p\right )k\left (n-k\right )\), define

$$\displaystyle \begin{aligned} \phi_{i}^{k}=\left|\left\{ \sigma\in A_{k}\colon\,\left|\partial_{E}\sigma\right|=pk\left(n-k\right)+i\right\}\right|, {} \end{aligned} $$
(3.4)

i.e., \(\phi _{i}^{k}\) counts the configurations σ with volume k whose edge-boundary size \(\left |\partial _{E}\sigma \right |\) deviates by i from its mean, which is equal to \(pk\left (n-k\right )\). For 0 ≤ k ≤ n, let \(\mathbb {P}_k\) denote the uniform distribution on A k.

Lemma 3.3 (Upper Bound on Edge-Boundary Sizes)

With \(\mathbb {P}_{\mathrm {ER}_n(p)}\) -probability tending to 1 as n → ∞ the following are true. For \(-pk\left(n-k\right)\leq j\leq \left(1-p\right)k\left(n-k\right)\), i ≥ 0 and \(\varrho \colon \,\mathbb {N}\to \mathbb {R}_{+}\),

$$\displaystyle \begin{aligned} \mathbb{P}_k\left[\phi_{j}^{k}\geq\varrho\left(n\right){n \choose k}p^{pk(n-k)+j} (1-p)^{(1-p)k(n-k)-j}{k(n-k) \choose pk(n-k)+j}\right]\leq\frac{1}{\varrho(n)} {} \end{aligned} $$
(3.5)

and

$$\displaystyle \begin{aligned} \begin{array}{ll} \mathbb{P}_k\left[\sum_{j\geq i}\phi_{j}^{k}\geq\varrho\left(n\right){n \choose k} \mathrm{e}^{-\frac{2i^{2}}{k(n-k)}}\right] & \leq\frac{1}{\varrho\left(n\right)},\\[0.4cm] \mathbb{P}_k\left[\sum_{j\leq-i}\phi_{j}^{k}\geq\varrho\left(n\right){n \choose k} \mathrm{e}^{-\frac{2i^{2}}{k(n-k)}}\right] & \leq\frac{1}{\varrho\left(n\right)}. \end{array} {} \end{aligned} $$
(3.6)

Proof

Write ≃ to denote equality in distribution. Note that if \(\sigma \simeq \mathbb {P}_k\), then \(\left |\partial _{E}\sigma \right |\simeq \mathrm {Bin}\left (k\left (n-k\right ),p\right )\), and hence

$$\displaystyle \begin{aligned} \mathbb{P}_k\left[\left|\partial_{E}\sigma\right|=i\right]=p^{i} \left(1-p\right)^{k\left(n-k\right)-i}{k\left(n-k\right) \choose i}. \end{aligned} $$
(3.7)

In particular,

$$\displaystyle \begin{aligned} \mathbb{E}\left[\phi_{j}^{k}\right]={n \choose k}\,p^{pk(n-k)+j}\left(1-p\right)^{(1-p)k(n-k)-j}{k(n-k) \choose pk(n-k)+j}. \end{aligned} $$
(3.8)

Hence, by Markov’s inequality, the claim in (3.5) follows. Moreover,

$$\displaystyle \begin{aligned} \mathbb{E}\left[\sum_{j\geq i}\phi_{j}^{k}\right]={n \choose k}\, \mathbb{P}\left[\mathrm{Bin}\left(k(n-k),p\right)\geq pk(n-k)+i\right] \leq{n \choose k}\,\mathrm{e}^{-\frac{2i^{2}}{k(n-k)}}, \end{aligned} $$
(3.9)

where we again use Hoeffding’s inequality. Hence, by Markov’s inequality, we get the first line in (3.6). The proof of the second line is similar. □
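The key step of the proof, bounding the binomial tail of \(|\partial_E\sigma|\) by the Hoeffding estimate used in (3.9), can be verified exactly for a small instance (the values of N and p below are illustrative, with N playing the role of k(n − k)):

```python
import math

def binom_pmf(N, p, i):
    # P[Bin(N,p) = i], cf. (3.7) with N = k(n-k)
    return math.comb(N, i) * p**i * (1 - p) ** (N - i)

N, p = 200, 0.3             # N plays the role of k(n-k)
mean = N * p
for i in (10, 20, 30):      # deviation above the mean
    tail = sum(binom_pmf(N, p, j) for j in range(int(mean) + i, N + 1))
    hoeffding = math.exp(-2 * i**2 / N)
    # exact binomial tail vs the Hoeffding bound of (3.9)
    assert tail <= hoeffding
```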

3.3 Jump Rates for the Volume Process

The following lemma establishes bounds on the rate at which configurations in A k jump forward to A k+1 and backward to A k−1. In Sect. 8 we will sharpen the error in the prefactors in (3.10)–(3.11) from 2n^{2/3} to O(1) and the error in the exponents in (3.10)–(3.11) from 3n^{−1/3} to O(n^{−1/2}). The formulas in (3.13) and (3.14) show that for large, respectively, small magnetization the forward, respectively, backward rate is maximal.

Lemma 3.4 (Bounds on Forward Jump Rates)

With \(\mathbb {P}_{\mathrm {ER}_n(p)}\) -probability tending to 1 as n → ∞ the following are true.

  1. (a)

    For \(2n^{1/3} \leq k \leq n-2n^{1/3}\),

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\big(n-k-2n^{2/3}\big) \mathrm{e}^{-2\beta [\vartheta_{k}+3n^{-1/3}]_{+}}\\ &\qquad \leq\sum_{\xi\in A_{k+1}}r\left(\sigma,\xi\right)\leq\big(n-k-2n^{2/3}\big) \mathrm{e}^{-2\beta [\vartheta_{k}-3n^{-1/3}]_{+}}+2n^{2/3}, \qquad \sigma\in A_{k}, \end{array} {} \end{aligned} $$
    (3.10)

    and

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\big(k-2n^{2/3}\big) \mathrm{e}^{-2\beta[-\vartheta_{k}+3n^{-1/3}]_{+}}\\ &\qquad \leq\sum_{\xi\in A_{k-1}}r\left(\sigma,\xi\right)\leq\big(k-2n^{2/3}\big) \mathrm{e}^{-2\beta[-\vartheta_{k}-3n^{-1/3}]_{+}}+2n^{2/3}, \qquad \sigma\in A_{k}, \end{array} {} \end{aligned} $$
    (3.11)

    where

    $$\displaystyle \begin{aligned} \vartheta_{k}=p\left(1-\frac{2k}{n}\right)-h. \end{aligned} $$
    (3.12)
  2. (b)

    For \(n-\frac {n}{3}(p+h)\leq k< n\),

    $$\displaystyle \begin{aligned} \sum_{\xi\in A_{k+1}}r\left(\sigma,\xi\right)=n-k, \qquad \sigma\in A_{k}. {} \end{aligned} $$
    (3.13)
  3. (c)

    For \(0 < k \leq \frac {n}{3}(p-h)\),

    $$\displaystyle \begin{aligned} \sum_{\xi\in A_{k-1}}r\left(\sigma,\xi\right)=k, \qquad \sigma\in A_k. {} \end{aligned} $$
    (3.14)

Proof

The proof is via probabilistic counting.

  1. (a)

    Write \(\mathbb {P}\) for the law under which σ ∈ S n is a uniformly random configuration and \(v \in \overline {\sigma }\) is a uniformly random vertex. By Hoeffding’s inequality, the probability that v has more than \(p\left |\sigma \right |+n^{2/3}\) neighbours in σ (i.e., w ∈ V  such that \(\left (v,w\right )\in E\) and \(\sigma \left (w\right )=+1\)) is bounded by

    $$\displaystyle \begin{aligned} \mathbb{P}\left[\left|E(v,\sigma)\right| \geq p\left|\sigma\right|+n^{2/3}\right] \leq \mathrm{e}^{-2n^{1/3}}, \end{aligned} $$
    (3.15)

    where

    $$\displaystyle \begin{aligned} E(v,\sigma) = \left\{w\in\sigma\colon\,\left(v,w\right)\in E\right\}. \end{aligned} $$
    (3.16)

    Define the event

    $$\displaystyle \begin{aligned} R^{+}\left(\sigma\right)=\left\{ \exists\,\zeta\subseteq\overline{\sigma},\, \zeta\in A_{2n^{2/3}}\colon\,\left|E(v,\sigma)\right|\geq p\left|\sigma\right|+n^{2/3}\,\,\forall\, v\in\zeta\right\}, \end{aligned} $$
    (3.17)

    i.e., the event that the configuration \(\overline {\sigma }\) contains at least 2n^{2/3} vertices v, each with at least \(p\left |\sigma \right |+n^{2/3}\) neighbours in σ. Then, for \(0 \leq k \leq n-2n^{2/3}\),

    $$\displaystyle \begin{aligned} \mathbb{P}\left[R^{+}\left(\sigma\right)\right]\leq{\left|\overline{\sigma}\right| \choose 2n^{2/3}}\left(\mathrm{e}^{-2n^{1/3}}\right)^{2n^{2/3}}\leq2^{n} \mathrm{e}^{-4n}. {} \end{aligned} $$
    (3.18)

    Hence the probability that some configuration σ ∈ S n satisfies condition R +(σ) is bounded by

    $$\displaystyle \begin{aligned} \mathbb{P}\left[\bigcup_{\sigma\in S_{n}}R^{+}\left(\sigma\right)\right]\leq4^{n} \mathrm{e}^{-4n} \leq \mathrm{e}^{-2n}. {} \end{aligned} $$
    (3.19)

    Thus, with \(\mathbb {P}_{\mathrm {ER}_n(p)}\)-probability tending to 1 as n → ∞ there are no configurations σ ∈ S n satisfying condition R +(σ). The same holds for the event

    $$\displaystyle \begin{aligned} R^{-}\left(\sigma\right)=\left\{ \exists\,\zeta\subseteq\overline{\sigma},\,\zeta\in A_{2n^{2/3}} \colon\,\left|E(v,\sigma)\right| \leq p\left|\sigma\right|-n^{2/3}\,\,\forall\, v\in\zeta\right\}, \end{aligned} $$
    (3.20)

    for which

    $$\displaystyle \begin{aligned} \mathbb{P}\left[\,\bigcup_{\sigma\in S_{n}}R^{-}\left(\sigma\right)\right] \leq \mathrm{e}^{-2n}. {} \end{aligned} $$
    (3.21)

    Now let σ ∈ A k, and observe that σ has n − k neighbours in A k+1 and k neighbours in A k−1. But if \(\xi =\sigma \cup \left \{ v\right \} \in A_{k+1}\), then by (1.3),

    $$\displaystyle \begin{aligned} \begin{array}{rcl} H_{n}\left(\xi\right)-H_{n}\left(\sigma\right) & = &\displaystyle \frac{2}{n}\Big(\left|E(v,\overline{\sigma})\right|-\left|E(v,\sigma)\right|\Big)-2h\\ & = &\displaystyle \frac{2}{n}\big(\deg\left(v\right)-2\left|E(v,\sigma)\right|\big)-2h\\ & \leq &\displaystyle \frac{2}{n}\big(pn+n^{1/2}\log n-2\left|E(v,\sigma)\right|\big)-2h, {} \end{array} \end{aligned} $$
    (3.22)

    where the last inequality uses (3.1), together with the bound \((1+\epsilon)\sqrt{n\log n}\leq n^{1/2}\log n\) for n large enough. Similarly,

    $$\displaystyle \begin{aligned} \begin{array}{rcl} H_{n}\left(\xi\right)-H_{n}\left(\sigma\right) & \geq &\displaystyle \frac{2}{n}\left(pn-n^{1/2}\log n -2\left|E(v,\sigma)\right|\right)-2h. {} \end{array} \end{aligned} $$
    (3.23)

    The events R +(σ) in (3.17) and R −(σ) in (3.20) guarantee that, for any configuration σ, at most 2n^{2/3} vertices in the configuration \(\overline {\sigma }\) have a number of neighbours in σ that deviates from \(p\left |\sigma \right |\) by more than n^{2/3}. In other words, the configuration σ has at most 2n^{2/3} neighbouring configurations in A k+1 whose energy difference deviates from \(\frac {2}{n}(pn-2pk)-2h\) by more than 6n^{−1/3}. Since on the complements of R +(σ) and R −(σ) with σ ∈ A k all but at most 4n^{2/3} vertices \(v\in \overline {\sigma }\) satisfy \(\big ||E(v,\sigma )|-pk\big |\leq n^{2/3}\) (and \(n^{1/2}\log n \leq n^{2/3}\) for n large enough), from (3.19) and (3.21) we get that, with \(\mathbb {P}_{\mathrm {ER}_n (p)}\)-probability at least \(1-\mathrm{e}^{-2n}\),

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\left|\left\{ \xi\in A_{k+1}\colon\,\xi\sim\sigma,\, H_{n}\left(\xi\right)-H_{n}\left(\sigma\right)\geq\frac{2}{n} \left(pn-2pk+3n^{2/3}\right)-2h\right\} \right| \leq 2n^{2/3},\\[0.2cm] &\left|\left\{ \xi\in A_{k+1}\colon\,\xi\sim\sigma,\, H_{n}\left(\xi\right)-H_{n}\left(\sigma\right)\leq\frac{2}{n} \left(pn-2pk-3n^{2/3}\right)-2h\right\} \right|\leq2n^{2/3}, \end{array} \end{aligned} $$
    (3.24)

    and hence, by (1.6), the rate at which the Markov chain starting at σ ∈ A k jumps to A k+1 satisfies

    $$\displaystyle \begin{aligned} \begin{array}{rcl} {} & &\displaystyle \sum_{\xi\in A_{k+1}}r\left(\sigma,\xi\right)\geq\big(n-k-2n^{2/3}\big) \mathrm{e}^{-2\beta[\vartheta_k+3n^{-1/3}]_{+}}, \end{array} \end{aligned} $$
    (3.25)
    $$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{\xi\in A_{k+1}}r\left(\sigma,\xi\right)\leq\big(n-k-2n^{2/3}\big) \mathrm{e}^{-2\beta[\vartheta_k-3n^{-1/3}]_{+}}+2n^{2/3}. {} \end{array} \end{aligned} $$
    (3.26)

    Here the term n − k − 2n^{2/3} comes from the exclusion of the at most 2n^{2/3} neighbouring configurations whose energy difference deviates from \(\frac {2}{n}(pn-2pk)-2h\) by more than 6n^{−1/3}. Similarly, with \(\mathbb {P}_{\mathrm {ER}_n (p)}\)-probability at least \(1-\mathrm{e}^{-2n}\),

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\Big|\Big\{ \xi\in A_{k-1}\colon\,\xi\sim\sigma, \, H_{n}\left(\xi\right)-H_{n}\left(\sigma\right) \geq \frac{2}{n}\left(-pn+2pk+3n^{2/3}\right)+2h\Big\} \Big| \leq 2n^{2/3},\\ &\Big|\Big\{ \xi\in A_{k-1}\colon\,\xi\sim\sigma, \, H_{n}\left(\xi\right)-H_{n}\left(\sigma\right) \leq\frac{2}{n}\left(-pn+2pk-3n^{2/3}\right)+2h\Big\} \Big| \leq 2n^{2/3}, \end{array} \end{aligned} $$
    (3.27)

    and hence, by (1.6), the rate at which the Markov chain starting at σ ∈ A k jumps to A k−1 satisfies

    $$\displaystyle \begin{aligned} \begin{array}{rcl} {} & &\displaystyle \sum_{\xi\in A_{k-1}}r\left(\sigma,\xi\right)\leq\big(k-2n^{2/3}\big) \mathrm{e}^{-2\beta[-\vartheta_k-3n^{-1/3}]_{+}}+2n^{2/3}, \end{array} \end{aligned} $$
    (3.28)
    $$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{\xi\in A_{k-1}}r\left(\sigma,\xi\right)\geq\big(k-2n^{2/3}\big) \mathrm{e}^{-2\beta[-\vartheta_k+3n^{-1/3}]_{+}}. {} \end{array} \end{aligned} $$
    (3.29)

    This proves (3.10) and (3.11).

  2. (b)

    To get (3.13), note that for ξ = σ ∪{v} with \(v\notin \sigma \),

    $$\displaystyle \begin{aligned} \begin{array}{rcl} H_{n}\left(\xi\right)-H_{n}\left(\sigma\right) & = &\displaystyle \frac{2}{n}\Big(\left|E(v,\overline{\sigma})\right|-\left|E(v,\sigma)\right|\Big)-2h\\ & = &\displaystyle \frac{2}{n}\Big(\deg\left(v\right)-2\left|E(v,\sigma)\right|\Big)-2h\\ & \leq &\displaystyle 2\left(\frac{2(n-k)}{n}-p+n^{-1/2}\log n -h\right) \end{array} \end{aligned} $$
    (3.30)

    for n large enough, which is ≤ 0 when \(n-k \leq \frac {n}{3}(p+h)\), so that \(r\left (\sigma ,\xi \right )=1\) by (1.6).

  3. (c)

    To get (3.14), note that for ξ = σ ∖{v} with \(v\in \sigma \),

    $$\displaystyle \begin{aligned} \begin{array}{rcl} H_{n}\left(\xi\right)-H_{n}\left(\sigma\right) & = &\displaystyle \frac{2}{n}\Big(\left|E(v,\sigma)\right|-\left|E(v,\overline{\sigma})\right|\Big)+2h\\ & = &\displaystyle \frac{2}{n}\Big(2\left|E(v,\sigma)\right|-\deg\left(v\right)\Big)+2h\\ & \leq &\displaystyle 2\left(\frac{2k}{n}-p+n^{-1/2}\log n+h\right) \end{array} \end{aligned} $$
    (3.31)

    for n large enough, which is ≤ 0 when \(k \leq \frac {n}{3}(p-h)\), so that \(r\left (\sigma ,\xi \right )=1\) by (1.6). □
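As a consistency check on parts (b) and (c), one can verify that in the stated volume ranges the relevant exponent \([\pm\vartheta_k]_+\) from (3.10)–(3.11) vanishes, so that each of the available spin flips has Metropolis rate 1. A sketch (the parameter values are illustrative):

```python
def theta(k, n, p, h):
    # vartheta_k from (3.12)
    return p * (1 - 2 * k / n) - h

n, p, h = 10**6, 0.5, 0.1
# part (c): for 0 < k <= (n/3)(p-h) the backward exponent [-theta_k]_+
# vanishes, so each of the k possible flips +1 -> -1 has Metropolis rate 1
for k in (1, n // 10, int(n * (p - h) / 3)):
    assert max(-theta(k, n, p, h), 0.0) == 0.0
# part (b): for n - (n/3)(p+h) <= k < n the forward exponent [theta_k]_+
# vanishes, so each of the n - k possible flips -1 -> +1 has rate 1
for k in (n - int(n * (p + h) / 3), n - 1):
    assert max(theta(k, n, p, h), 0.0) == 0.0
```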

The following lemma is technical and merely serves to show that near A M transitions involving a flip from + 1 to − 1 typically occur at rate 1. Write ξ v to denote the configuration obtained from ξ by flipping the sign at vertex v ∈ V .

Lemma 3.5 (Attraction Towards the Metastable State)

Suppose that \(\left |\xi \right |=[1+o_{n}(1)]\,\mathbf {M}\) . Then \(r\left (\xi ,\xi ^{v}\right )=1\) for all but \(O(n^{2/3})\) many \(v\in \xi \).

Proof

We want to show that

$$\displaystyle \begin{aligned} H_{n}\left(\xi^{v}\right)<H_{n}\left(\xi\right){} \end{aligned} $$
(3.32)

for all but O(n 2∕3) many v ∈ ξ. Note that by (3.20) and (3.21) there are at most 2n 2∕3 many v ∈ ξ such that \(|E(v,\overline {\xi })|\leq p(n-|\xi |)-n^{2/3}\), and at most 2n 2∕3 many v ∈ ξ such that |E(v, ξ)|≥ p|ξ| + n 2∕3. Hence, by (1.3), for all but at most 4n 2∕3 many v ∈ ξ we have that

$$\displaystyle \begin{aligned} \begin{array}{rcl} H_{n}\left(\xi^{v}\right) & = &\displaystyle H_{n}\left(\xi\right) +\frac{2}{n}\left(\left|E\left(v,\xi\right)\right|-\left|E\left(v,\overline{\xi}\,\right)\right|\right)+2h\\ & = &\displaystyle H_{n}\left(\xi\right)+\frac{2p}{n}\left(2\left|\xi\right|-n\right)+2h+o_{n}(1)\\ & = &\displaystyle H_{n}\left(\xi\right)+\frac{2p}{n}\left(2\mathbf{M}-n\right)+2h+o_{n}(1)\\ & = &\displaystyle H_{n}\left(\xi\right)+2p\mathbf{m}+2h+o_{n}(1), \end{array} \end{aligned} $$
(3.33)

where we use (1.17). From the definition of m in (1.16) it follows that 2p m + 2h + o n(1) < 0: recall from the discussion near the end of Sect. 2.2 that m < 0 and hence \(\log (\frac {1-\mathbf {m}}{1+\mathbf {m}}) > 0\), so that, m being a zero of J p,β,h, \(2\beta (p\mathbf {m}+h)=-\log (\frac {1-\mathbf {m}}{1+\mathbf {m}})<0\). Hence (3.32) follows. □

We can now prove the claim made in Remark 1.2, namely, that there is no metastable behaviour outside the regime in (1.13). Recall the definition of S n in (1.17), which requires the function J in (1.23) to have two zeroes. If it has only one zero, then denote that zero by a′ and define \({\mathbf {S}}_n=\frac {n}{2}(a'+1)\). Let \(A_{{\mathbf {S}}_n+O(n^{2/3})}\) be the union of all A k with \(|k-{\mathbf {S}}_n| = O(n^{2/3})\).

Corollary 3.6 (Non-metastable Regime)

Suppose that β ∈ (1∕p, ∞) and h ∈ (p, ∞). Then {ξ t}t≥0 has a drift towards \(A_{{\mathbf {S}}_n+O(n^{2/3})}\) . Consequently, \(\mathbb {E}_{\xi _{0}}[\tau _{\mathbf {s}}] = O(1)\) for any initial configuration ξ 0 ∈ S n.

Proof

If β ∈ (1∕p, ∞) and h ∈ (p, ∞), then the function \(a \mapsto J_{p,\beta ,h}(a)=2\beta (pa+h) +\log (\frac {1-a}{1+a})\) has a unique root in the interval (0, 1). Indeed, J p,β,h(a) > 0 for a ∈ [−1, 0], \(J_{p,\beta ,h}^{\prime }(0) = 2(\beta p-1)>0\), while \(a \mapsto \log (\frac {1-a}{1+a})\) is concave and tends to −∞ as a ↑ 1. We claim that the process {ξ t}t≥0 drifts towards that root, i.e., if we denote the root by a′, then the process drifts towards the set \(A_{\frac {n}{2}(a'+1)}\), which by convention we identify with \(A_{{\mathbf {S}}_n}\). Note that if h ∈ (p, ∞), then \(\vartheta _{k}=p(1-\frac {2k}{n}) -h<0\) for all 0 ≤ k ≤ n (recall (3.12)) and so, by Lemma 3.4,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sum_{\xi\in A_{k+1}}r\left(\sigma,\xi\right) & \geq &\displaystyle n-k-2n^{2/3},\\ \sum_{\xi\in A_{k-1}}r\left(\sigma,\xi\right) & \leq &\displaystyle \big(k-2n^{2/3}\big) \mathrm{e}^{-2\beta[-\vartheta_k-3n^{-1/3}]_{+}}+2n^{2/3}. \end{array} \end{aligned} $$
(3.34)

Thus, for \(k\leq \frac {n}{2}-4n^{2/3}\), \(\sum _{\xi \in A_{k+1}}r(\sigma ,\xi ) > \sum _{\xi \in A_{k-1}} r(\sigma ,\xi )\). Similarly, for \(k \geq \frac {n}{2} + 4n^{2/3}\), the opposite inequality holds. Therefore there is a drift towards \(A_{{\mathbf {S}}_n+O(n^{2/3})}\). □
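The drift comparison in the proof can be checked numerically from (3.34): for h > p the forward lower bound dominates the backward upper bound on the whole range \(k\leq \frac{n}{2}-4n^{2/3}\). A sketch (the parameter values are illustrative):

```python
import math

def forward_lb(k, n):
    # lower bound on the rate A_k -> A_{k+1} from (3.34); for h > p we
    # have theta_k < 0, so no exponential factor appears
    return n - k - 2 * n ** (2 / 3)

def backward_ub(k, n, p, h, beta):
    theta = p * (1 - 2 * k / n) - h          # (3.12), negative since h > p
    return (k - 2 * n ** (2 / 3)) * math.exp(
        -2 * beta * (-theta - 3 * n ** (-1 / 3))
    ) + 2 * n ** (2 / 3)

n, p, h, beta = 10**6, 0.5, 0.7, 3.0         # h > p, beta > 1/p
for k in range(1, int(n / 2 - 4 * n ** (2 / 3)), n // 50):
    assert forward_lb(k, n) > backward_ub(k, n, p, h, beta)
```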

We close this section with a lemma stating that the average return time to \(A_{{\mathbf {M}}_n}\) conditional on not hitting \(A_{{\mathbf {T}}_n}\) is of order 1 and has an exponential tail. This will be needed to control the time between successive attempts to go from \(A_{{\mathbf {M}}_n}\) to \(A_{{\mathbf {T}}_n}\), until the dynamics crosses \(A_{{\mathbf {T}}_n}\) and moves to \(A_{{\mathbf {S}}_n}\) (recall Fig. 4).

Lemma 3.7 (Conditional Return Time to the Metastable Set)

There exists a C > 0 such that, with \(\mathbb {P}_{\mathrm {ER}_n(p)}\) -probability tending to 1 as n → ∞, uniformly in \(\xi \in A_{{\mathbf {M}}_n}\),

$$\displaystyle \begin{aligned} \mathbb{P}_\xi\left[\tau_{A_{{\mathbf{M}}_n}} \geq k \mid \tau_{A_{{\mathbf{M}}_n}} < \tau_{A_{{\mathbf{T}}_n}} \right] \leq \mathrm{e}^{-Ck} \qquad \forall\, k. \end{aligned} $$
(3.35)

Proof

The proof is given in Sect. 8. □

3.4 Uniformity in the Starting Configuration

The following lemma shows that the probability of the event \(\{ \tau _{A_{k+o(n^{1/3})}}<\tau _{A_{k}}\}\) is almost uniform as a function of the starting configuration in A k.

Lemma 3.8 (Uniformity of Hitting Probability of Volume Level Sets)

With \(\mathbb {P}_{\mathrm {ER}_n(p)}\) -probability tending to 1 as n → ∞, the following is true. For every 0 ≤ k < m ≤ n,

$$\displaystyle \begin{aligned} \frac{\max_{\sigma\in A_{k}}\mathbb{P}_{\sigma}\left[\tau_{A_{m}}<\tau_{A_{k}}\right]} {\min_{\sigma\in A_{k}}\mathbb{P}_{\sigma}\left[\tau_{A_{m}}<\tau_{A_{k}}\right]} \leq\left[1+o_{n}(1)\right] \mathrm{e}^{K(m-k)n^{-1/3}} {} \end{aligned} $$
(3.36)

with K = K(β, h, p) ∈ (0, ∞).

Proof

The proof proceeds by estimating the probability of trajectories from A k to A m. Observe that

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \mathrm{e}^{-2\beta[\vartheta_{k}+3n^{-1/3}]_{+}} & \geq &\displaystyle \mathrm{e}^{-2\beta[\vartheta_{k}]_{+}}\left(1-6\beta n^{-1/3}\right) \quad \forall\,n,\\ \mathrm{e}^{-2\beta[\vartheta_{k}-3n^{-1/3}]_{+}} & \leq &\displaystyle \mathrm{e}^{-2\beta[\vartheta_{k}]_{+}}\left(1+7\beta n^{-1/3}\right) \quad n \mbox{ large enough}, \end{array} \end{aligned} $$
(3.37)

and that similar estimates hold for \(\mathrm{e} ^{-2\beta [-\vartheta _{k}+3n^{-1/3}]_{+}}\) and \(\mathrm{e} ^{-2\beta [-\vartheta _{k}-3n^{-1/3}]_{+}}\). We will bound the ratio on the left-hand side of (3.36) by looking at two random processes on \(\left \{ 0,\ldots ,n\right \} \), one of which bounds \(\max _{\sigma \in A_{k}}\mathbb {P}_{\sigma }\left [\tau _{A_{m}}<\tau _{A_{k}}\right ]\) from above and the other of which bounds \(\min _{\sigma \in A_{k}}\mathbb {P}_{\sigma }\left [\tau _{A_{m}} <\tau _{A_{k}}\right ]\) from below. The proof comes in three steps.

  1.

    We begin with the following observation. Suppose that \(\{X_{t}^{+}\}_{t \geq 0}\) and \(\{X_{t}^{-}\}_{t \geq 0}\) are two continuous-time Markov chains taking unit steps in \(\left \{0,\ldots ,n\right \}\) at rates r +(k, k ± 1) and r −(k, k ± 1), respectively. Furthermore, suppose that for every 0 ≤ k ≤ n − 1,

    $$\displaystyle \begin{aligned} r^{-}\left(k,k+1\right)\leq\min_{\sigma\in A_{k}}\sum_{\xi\in A_{k+1}}r\left(\sigma,\xi\right) \leq\max_{\sigma\in A_{k}}\sum_{\xi\in A_{k+1}}r\left(\sigma,\xi\right)\leq r^{+}\left(k,k+1\right), \end{aligned} $$
    (3.38)

    and for every 1 ≤ k ≤ n,

    $$\displaystyle \begin{aligned} r^{-}\left(k,k-1\right)\geq\max_{\sigma\in A_{k}}\sum_{\xi\in A_{k-1}}r\left(\sigma,\xi\right) \geq\min_{\sigma\in A_{k}}\sum_{\xi\in A_{k-1}}r\left(\sigma,\xi\right)\geq r^{+}\left(k,k-1\right). {} \end{aligned} $$
    (3.39)

    Then

    $$\displaystyle \begin{aligned} \frac{\max_{\sigma\in A_{k}}\mathbb{P}_{\sigma}\left[\tau_{A_{m}}<\tau_{A_{k}}\right]} {\min_{\sigma\in A_{k}}\mathbb{P}_{\sigma}\left[\tau_{A_{m}}<\tau_{A_{k}}\right]} \leq\frac{\mathbb{P}_{k}^{X^{+}}\left[\tau_{m}<\tau_{k}\right]}{\mathbb{P}_{k}^{X^{-}} \left[\tau_{m}<\tau_{k}\right]}. {} \end{aligned} $$
    (3.40)

    Indeed, from (3.38) and (3.39) it follows that we can couple the three Markov chains \(\{X_{t}^{+}\}_{t\geq 0}\), \(\{X_{t}^{-}\}_{t\geq 0}\) and {ξ t}t≥0 in such a way that, for any 0 ≤ k ≤ n and any σ 0 ∈ A k, if \(X_{0}^{-}=X_{0}^{+}=|\sigma _{0}|=k\), then

    $$\displaystyle \begin{aligned} X_{t}^{-}\leq\left|\sigma_{t}\right|\leq X_{t}^{+}, \qquad t \geq 0. \end{aligned} $$
    (3.41)

    This immediately guarantees that, for any 0 ≤ k ≤ m ≤ n,

    $$\displaystyle \begin{aligned} \mathbb{P}_{k}^{X^{-}}\left[\tau_{m}<\tau_{k}\right] \leq\min_{\sigma\in A_{k}}\mathbb{P}_{\sigma}\left[\tau_{A_{m}}<\tau_{A_{k}}\right] \leq\max_{\sigma\in A_{k}}\mathbb{P}_{\sigma}\left[\tau_{A_{m}}<\tau_{A_{k}}\right] \leq\mathbb{P}_{k}^{X^{+}}\left[\tau_{m}<\tau_{k}\right], \end{aligned} $$
    (3.42)

    which proves the claim in (3.40). To get (3.38) and (3.39), we pick \(r^{-}\left (i,j\right )\) and \(r^{+}\left (i,j\right )\) such that

    $$\displaystyle \begin{aligned} r^{-}(i,j) = \left\{\begin{array}{ll} \left(n-i-\left(2+6\beta\right)n^{2/3}\right) \mathrm{e}^{-2\beta[\vartheta_{i}]_{+}}, &j=i+1,\\ \min\left\{ i,\,\left(i+\left(-2+7\beta\right)n^{2/3}\right) \mathrm{e}^{-2\beta[-\vartheta_{i}]_{+}}+2n^{2/3}\right\}, &j=i-1,\\ 0, &\mbox{otherwise}, \end{array} \right. \end{aligned} $$
    (3.43)

    and

    $$\displaystyle \begin{aligned} r^{+}(i,j) = \left\{\begin{array}{ll} \min\left\{ n-i,\,\left(n-i+\left(-2+7\beta\right)n^{2/3}\right) \mathrm{e}^{-2\beta[\vartheta_{i}]_{+}} +2n^{2/3}\right\}, &j=i+1,\\ \left(i-\left(2+6\beta\right)n^{2/3}\right) \mathrm{e}^{-2\beta[-\vartheta_{i}]_{+}}, &j=i-1,\\ 0, &\mbox{otherwise}, \end{array} \right. \end{aligned} $$
    (3.44)

    and note that, by Lemma 3.4 and (3.37), the inequalities (3.38)–(3.39) are indeed satisfied.

  2.

    We continue from (3.40). Our task is to estimate the right-hand side of (3.40). Let \(\mathcal {G}\) be the set of all unit-step paths from k to m that only hit m after their final step:

    $$\displaystyle \begin{aligned} \begin{array}{ll} \mathcal{G}&=\bigcup_{M\in\mathbb{N}}\Big\{ \left\{ \gamma_{i}\right\} _{i=0}^{M}\colon\, \gamma_{0}=k,\,\gamma_{M}=m,\,\gamma_{i}\in\left\{ 0,\ldots,m-1\right\}\mbox{ for } 0 \leq i<M\\ &\qquad \qquad \qquad \mbox{ and } \left|\gamma_{i+1}-\gamma_{i}\right|=1\mbox{ for } 0 \leq i<M\Big\}. \end{array} \end{aligned} $$
    (3.45)

    We will show that

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\frac{\mathbb{P}_{k}^{X^{+}}\left[X_{t}^{+}\mbox{ follows trajectory }\gamma\right]} {\mathbb{P}_{k}^{X^{-}}\left[X_{t}^{-}\mbox{ follows trajectory }\gamma\right]}\\[0.3cm] &\leq\exp\left(\left[24\beta+4\mathrm{e}^{2\beta(p+h+1)}\right] \left(m-k\right)n^{-1/3}\right)\quad \forall\,\,\gamma\in\mathcal{G}, \end{array} \end{aligned} $$
    (3.46)

    which will settle the claim. (Note that the paths realising {τ m < τ k} form a subset of \(\mathcal {G}\).) To that end, let \(\gamma ^{\star }\in \mathcal {G}\) be the path \(\gamma ^{\star }=\left \{k,k+1,\ldots ,m\right \}\). We claim that

    $$\displaystyle \begin{aligned} \sup_{\gamma\in\mathcal{G}}\frac{\mathbb{P}_{k}^{X^{+}}\left[X_{t}^{+} \mbox{ follows trajectory }\gamma\right]} {\mathbb{P}_{k}^{X^{-}}\left[X_{t}^{-}\mbox{ follows trajectory }\gamma\right]} \leq\frac{\mathbb{P}_{k}^{X^{+}}\left[X_{t}^{+}\mbox{ follows trajectory }\gamma^{\star}\right]} {\mathbb{P}_{k}^{X^{-}}\left[X_{t}^{-}\mbox{ follows trajectory }\gamma^{\star}\right]}. {} \end{aligned} $$
    (3.47)

    Indeed, if \(\gamma =\left (\gamma _{0},\ldots ,\gamma _{M}\right )\in \mathcal {G}\), then by the Markov property we have that

    $$\displaystyle \begin{aligned} \mathbb{P}_{k}^{X^{+}}\left[X_{t}^{+}\mbox{ follows trajectory }\gamma\right] = \prod_{i=0}^{M-1}\mathbb{P}_{\gamma_{i}}^{X^{+}}\left[\tau_{\gamma_{i+1}}<\tau_{\gamma_{i}}\right], \end{aligned} $$
    (3.48)

    with a similar expression for \(\mathbb {P}_{k}^{X^{-}}[X_{t}^{-}\mbox{ follows trajectory }\gamma ]\). Therefore, noting that γ i − 1 = 2γ i − γ i+1 when γ i+1 = γ i + 1 and γ i + 1 = 2γ i − γ i+1 when γ i+1 = γ i − 1, we have

    $$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \frac{\mathbb{P}_{k}^{X^{+}}\left[X_{t}^{+}\mbox{ follows trajectory }\gamma\right]} {\mathbb{P}_{k}^{X^{-}}\left[X_{t}^{-}\mbox{ follows trajectory }\gamma\right]} = \prod_{i=0}^{M-1}\frac{\mathbb{P}_{\gamma_{i}}^{X^{+}}\left[\tau_{\gamma_{i+1}}<\tau_{\gamma_{i}}\right]} {\mathbb{P}_{\gamma_{i}}^{X^{-}}\left[\tau_{\gamma_{i+1}}<\tau_{\gamma_{i}}\right]}\\ & &\displaystyle = \prod_{i=0}^{M-1}\left(\frac{r^{+}\left(\gamma_{i},\gamma_{i+1}\right)}{r^{+} \left(\gamma_{i},\gamma_{i+1}\right)+r^{+}\left(\gamma_{i},2\gamma_{i}-\gamma_{i+1}\right)}\right) \left(\frac{r^{-}\left(\gamma_{i},\gamma_{i+1}\right)}{r^{-} \left(\gamma_{i},\gamma_{i+1}\right)+r^{-}\left(\gamma_{i},2\gamma_{i}-\gamma_{i+1}\right)}\right)^{-1}. \end{array} \end{aligned} $$
    (3.49)

    Since, whenever γ i+1 = γ i − 1,

    $$\displaystyle \begin{aligned} \begin{array}{ll} \frac{r^{-}\left(\gamma_{i},\gamma_{i+1}\right)}{r^{-}\left(\gamma_{i},\gamma_{i+1}\right) +r^{-}\left(\gamma_{i},2\gamma_{i}-\gamma_{i+1}\right)} & = \frac{r^{-}\left(\gamma_{i},\gamma_{i}-1\right)}{r^{-}\left(\gamma_{i},\gamma_{i}-1\right) +r^{-}\left(\gamma_{i},\gamma_{i}+1\right)}\\[0.2cm] & \geq \frac{r^{+}\left(\gamma_{i},\gamma_{i}-1\right)}{r^{+}\left(\gamma_{i},\gamma_{i}-1\right) +r^{+}\left(\gamma_{i},\gamma_{i}+1\right)}\\[0.2cm] & = \frac{r^{+}\left(\gamma_{i},\gamma_{i+1}\right)}{r^{+}\left(\gamma_{i},\gamma_{i+1}\right) +r^{+}\left(\gamma_{i},2\gamma_{i}-\gamma_{i+1}\right)}, \end{array} \end{aligned} $$
    (3.50)

    we get

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\prod_{i=0}^{M-1}\left(\frac{r^{+}\left(\gamma_{i},\gamma_{i+1}\right)} {r^{+}\left(\gamma_{i},\gamma_{i+1}\right)+r^{+}\left(\gamma_{i},2\gamma_{i}-\gamma_{i+1}\right)}\right) \left(\frac{r^{-}\left(\gamma_{i},\gamma_{i+1}\right)}{r^{-}\left(\gamma_{i},\gamma_{i+1}\right) +r^{-}\left(\gamma_{i},2\gamma_{i}-\gamma_{i+1}\right)}\right)^{-1}\\[0.2cm] &\qquad \leq \prod_{i=k}^{m-1}\left(\frac{r^{+}\left(i,i+1\right)}{r^{+}\left(i,i+1\right)+r^{+}\left(i,i-1\right)}\right) \left(\frac{r^{-}\left(i,i+1\right)}{r^{-}\left(i,i+1\right)+r^{-}\left(i,i-1\right)}\right)^{-1}\\[0.2cm] &\qquad = \frac{\mathbb{P}_{k}^{X^{+}}\left[X_{t}^{+}\mbox{ follows trajectory }\gamma^{\star}\right]} {\mathbb{P}_{k}^{X^{-}}\left[X_{t}^{-}\mbox{ follows trajectory }\gamma^{\star}\right]}. \end{array} \end{aligned} $$
    (3.51)

    This proves the claim in (3.47).

  3.

    Next, consider the ratio

    $$\displaystyle \begin{aligned} \frac{r^{-}\left(i,i+1\right)+r^{-}\left(i,i-1\right)}{r^{+}\left(i,i+1\right) +r^{+}\left(i,i-1\right)} = \frac{A}{B} \end{aligned} $$
    (3.52)

    with

    $$\displaystyle \begin{aligned} \begin{array}{rcl} A & = &\displaystyle \left(n-i-\left(2+6\beta\right)n^{2/3}\right) \mathrm{e}^{-2\beta[\vartheta_{i}]_{+}} +\min\left\{ i,\,\left(i+\left(-2+7\beta\right)n^{2/3}\right) \mathrm{e}^{-2\beta[-\vartheta_{i}]_{+}}+2n^{2/3}\right\},\\ B & = &\displaystyle \min\left\{ n-i,\,\left(n-i+\left(-2+7\beta\right)n^{2/3}\right) \mathrm{e}^{-2\beta[\vartheta_{i}]_{+}}+2n^{2/3}\right\} +\left(i-\left(2+6\beta\right)n^{2/3}\right) \mathrm{e}^{-2\beta[-\vartheta_{i}]_{+}}, \end{array} \end{aligned} $$
    (3.53)

    and the ratio

    $$\displaystyle \begin{aligned} \frac{r^{+}\left(i,i+1\right)}{r^{-}\left(i,i+1\right)} = \frac{C}{D} \end{aligned} $$
    (3.54)

    with

    $$\displaystyle \begin{aligned} \begin{array}{rcl} C & = &\displaystyle \min\left\{ n-i,\,\left(n-i+\left(-2+7\beta\right)n^{2/3}\right) \mathrm{e}^{-2\beta[\vartheta_{i}]_{+}}+2n^{2/3}\right\},\\ D & = &\displaystyle \left(n-i-\left(2+6\beta\right)n^{2/3}\right) \mathrm{e}^{-2\beta[\vartheta_{i}]_{+}}. \end{array} \end{aligned} $$
    (3.55)

    Note from (3.52) that for 𝜗 i ≥ 0 (i.e., \(i\leq \frac {n}{2}(1-p^{-1}h)\), in which case also \(i<n(1-\frac {1}{3}\left (p+h\right )))\),

    $$\displaystyle \begin{aligned} \frac{r^{-}\left(i,i+1\right)+r^{-}\left(i,i-1\right)}{r^{+}\left(i,i+1\right)+r^{+}\left(i,i-1\right)} \leq1+\frac{13\beta \mathrm{e}^{2\beta(p-h)}}{n^{1/3}}, {} \end{aligned} $$
    (3.56)

    and from (3.54) it follows that in this case

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \frac{r^{+}\left(i,i+1\right)}{r^{-}\left(i,i+1\right)} & \leq &\displaystyle 1+\frac{3\left(3+13\beta\right) \mathrm{e}^{2\beta(p-h)}}{n^{1/3}\left(p+h\right)}. {} \end{array} \end{aligned} $$
    (3.57)

    Similarly, for 𝜗 i < 0 we have that

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \frac{r^{-}\left(i,i+1\right)+r^{-}\left(i,i-1\right)}{r^{+}\left(i,i+1\right)+r^{+}\left(i,i-1\right)} & \leq &\displaystyle 1+\frac{2 \mathrm{e}^{2\beta(p+h)}}{n^{1/3}} {} \end{array} \end{aligned} $$
    (3.58)

    and

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \frac{r^{+}\left(i,i+1\right)}{r^{-}\left(i,i+1\right)} & \leq &\displaystyle 1+\frac{6\left(2+6\beta\right)}{n^{1/3}\left(p+h\right)}. {} \end{array} \end{aligned} $$
    (3.59)

    Combining (3.56)–(3.59), we get that, for all 1 ≤ i ≤ n − 1,

    $$\displaystyle \begin{aligned} \frac{r^{-}\left(i,i+1\right)+r^{-}\left(i,i-1\right)}{r^{+}\left(i,i+1\right) +r^{+}\left(i,i-1\right)}\times\frac{r^{+}\left(i,i+1\right)}{r^{-}\left(i,i+1\right)}\leq1+Kn^{-1/3}, \end{aligned} $$
    (3.60)

    where

    $$\displaystyle \begin{aligned} K=\max\left\{ \mathrm{e}^{2\beta(p-h)}\left(\frac{9+39\beta}{p+h}+13\beta\right),\, 2 \mathrm{e}^{2\beta(p+h)}+\frac{12+36\beta}{p+h}\right\}. \end{aligned} $$
    (3.61)

    Therefore

    $$\displaystyle \begin{aligned} \frac{\mathbb{P}_{k}^{X^{+}}\left[X_{t}^{+}\mbox{ follows trajectory }\gamma^{\star}\right]} {\mathbb{P}_{k}^{X^{-}}\left[X_{t}^{-}\mbox{ follows trajectory }\gamma^{\star}\right]} \leq \prod_{i=k}^{m-1}\left(1+\frac{K}{n^{1/3}}\right) \leq \mathrm{e}^{Kn^{-1/3}(m-k)}. \end{aligned} $$
    (3.62)

    Together with (3.40) and (3.47), this proves (3.36). □
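Steps 1–3 can be combined into a direct numeric check: with rates of the form in (3.44), taking for r − the mirror image of (3.44) (an assumption in this sketch), each factor in (3.49) stays below 1 + Kn^{−1/3} with K from (3.61). A sketch (the parameter values are illustrative):

```python
import math

def rates(i, n, p, h, beta):
    # r^+ as in (3.44); r^- taken as its mirror image (an assumption here)
    th = p * (1 - 2 * i / n) - h              # vartheta_i from (3.12)
    up, um = max(th, 0.0), max(-th, 0.0)
    c23 = n ** (2 / 3)
    rp_f = min(n - i, (n - i + (-2 + 7 * beta) * c23) * math.exp(-2 * beta * up) + 2 * c23)
    rp_b = (i - (2 + 6 * beta) * c23) * math.exp(-2 * beta * um)
    rm_f = (n - i - (2 + 6 * beta) * c23) * math.exp(-2 * beta * up)
    rm_b = min(i, (i + (-2 + 7 * beta) * c23) * math.exp(-2 * beta * um) + 2 * c23)
    return rp_f, rp_b, rm_f, rm_b

n, p, h, beta = 10**6, 0.5, 0.1, 1.0
K = max(math.exp(2 * beta * (p - h)) * ((9 + 39 * beta) / (p + h) + 13 * beta),
        2 * math.exp(2 * beta * (p + h)) + (12 + 36 * beta) / (p + h))  # (3.61)
for i in range(n // 10, 9 * n // 10, n // 20):
    rp_f, rp_b, rm_f, rm_b = rates(i, n, p, h, beta)
    step = (rp_f / (rp_f + rp_b)) / (rm_f / (rm_f + rm_b))  # one factor in (3.49)
    assert step <= 1 + K / n ** (1 / 3)
```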

An application of the path-comparison methods used in Step 2 of the proof of Lemma 3.8 yields the following.

Corollary 3.9

With \(\mathbb {P}_{\mathrm {ER}_n(p)}\) -probability tending to 1 as n → ∞ the following is true. For every 0 ≤ k < m ≤ n,

$$\displaystyle \begin{aligned} \frac{\max_{\sigma\in A_{k}}\mathbb{E}_{\sigma}\left[\tau_{A_{m}}\mid\tau_{A_{m}}<\tau_{A_{k}}\right]} {\min_{\sigma\in A_{k}}\mathbb{E}_{\sigma}\left[\tau_{A_{m}}\mid\tau_{A_{m}}<\tau_{A_{k}}\right]} \leq\left[1+o_{n}(1)\right] \mathrm{e}^{K(m-k)n^{-1/3}} {} \end{aligned} $$
(3.63)

with K = K(β, h, p) ∈ (0, ∞).

4 Capacity Bounds

The goal of this section is to derive various capacity bounds that will be needed to prove Theorem 1.4 in Sects. 6–7. In Sect. 4.1 we derive capacity bounds for the processes \(\{\xi _{t}^{l}\}_{t\geq 0}\) and \(\{\xi _{t}^{u}\}_{t\geq 0}\) on K n introduced in (2.6) (Lemma 4.1 below). In Sect. 4.2 we do the same for the process {ξ t}t≥0 on ERn(p) (Lemma 4.2 below). In Sect. 4.3 we use the bounds to rank-order the mean return times to \(A_{\mathbf {M^{l}}}\), A M and \(A_{\mathbf {M^{u}}}\), respectively (Lemma 4.3 below). This ordering will be needed in the construction of a coupling in Sect. 6.

Define the Dirichlet form for {ξ t}t≥0 by

$$\displaystyle \begin{aligned} \mathcal{E}\left(f,f\right)=\frac{1}{2}\sum_{\sigma,\sigma'\in S_{n}}\mu\left(\sigma\right) r\left(\sigma,\sigma'\right)\left[f\left(\sigma\right)-f\left(\sigma'\right)\right]^{2}, \qquad f\colon\,S_{n}\to\left[0,1\right], {} \end{aligned} $$
(4.1)

and for \(\{\theta _{t}^{u}\}_{t\geq 0}\) and \(\{\theta _{t}^{l}\}_{t\geq 0}\) by

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathcal{E}^{u}\left(f,f\right) & = &\displaystyle \frac{1}{2}\sum_{a,a'\in \varGamma_n}\nu^{u}\left(a\right)q^{u} \left(a,a'\right)\left[f\left(a\right)-f\left(a'\right)\right]^{2},\\ \mathcal{E}^{l}\left(f,f\right) & = &\displaystyle \frac{1}{2} \sum_{a,a'\in \varGamma_n}\nu^{l}\left(a\right)q^{l}\left(a,a'\right) \left[f\left(a\right)-f\left(a'\right)\right]^{2}, \qquad f\colon\, \varGamma_n\to\left[0,1\right]. \end{array} \end{aligned} $$
(4.2)

For A, B ⊆ S n, define the capacity between A and B for {ξ t}t≥0 by

$$\displaystyle \begin{aligned} \mathrm{cap}\left(A,B\right)=\min_{f\in Q\left(A,B\right)}\mathcal{E}\left(f,f\right), {} \end{aligned} $$
(4.3)

where

$$\displaystyle \begin{aligned} Q\left(A,B\right)=\left\{f\colon\, S_{n}\to\left[0,1\right],\, f_{|A}\equiv1,\, f_{|B}\equiv0\right\}, {} \end{aligned} $$
(4.4)

and similarly for \(\mathrm {cap}^{u}\left (A,B\right )\) and \(\mathrm {cap}^{l}\left (A,B\right )\).

4.1 Capacity Bounds on K n

First we derive capacity bounds for \(\{\xi ^l_{t}\} _{t\geq 0}\) and \(\{\xi ^u_{t}\} _{t\geq 0}\) on K n. A useful reformulation of (4.3) is given by

$$\displaystyle \begin{aligned} \mathrm{cap}\left(A,B\right)=\sum_{\sigma\in A}\sum_{\sigma'\in S_{n}} \mu\left(\sigma\right)r\left(\sigma,\sigma'\right)\mathbb{P}_{\sigma}\left(\tau_{B}<\tau_{A}\right). {} \end{aligned} $$
(4.5)
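Formula (4.5) is the standard escape-probability reading of the capacity. Spelled out (with \(\tau_A\) understood as the return time to A after the first jump, which is how it is used in (4.42) below), and noting that the probability factor does not depend on σ′, the double sum factorises:

```latex
% Escape-probability form of (4.5): the double sum factorises as
\mathrm{cap}\left(A,B\right)
  = \sum_{\sigma\in A}\mu\left(\sigma\right)\lambda\left(\sigma\right)
    \,\mathbb{P}_{\sigma}\left(\tau_{B}<\tau_{A}\right),
\qquad
\lambda\left(\sigma\right)=\sum_{\sigma'\in S_{n}}r\left(\sigma,\sigma'\right),
% i.e. equilibrium mass on A, times total jump rate, times the
% probability of reaching B before returning to A.
```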

Lemma 4.1 (Capacity Bounds for \(\{\xi ^u_{t}\} _{t\geq 0}\) and \(\{\xi ^l_{t}\} _{t\geq 0}\))

For \(a,b \in \left [{\mathbf {m}}^{u},{\mathbf {s}}^{u}\right ]\) with a < b,

$$\displaystyle \begin{aligned} \frac{\left(1-b + \frac{2}{n} \right)}{2n\left(b-a\right)^{2}}\leq\frac{\mathrm{cap}^{u}\left(a,b\right)} {C^{\star}\left(b\right)}\leq\frac{n\left(1-b\right)}{2} {} \end{aligned} $$
(4.6)

with

$$\displaystyle \begin{aligned} C^{\star}\left(b\right)=\frac{1}{z^{u}} \mathrm{e}^{-\beta n\varPsi^{u}(b)} {n \choose \frac{n}{2}\left(1+\min \left( b, {\mathbf{t}}^{u}\right)\right)}. \end{aligned} $$
(4.7)

For \(a,b \in \left [{\mathbf {m}}^{l},{\mathbf {s}}^{l}\right ]\) with a < b, analogous bounds hold for \(\mathrm {cap}^{l}\left (a,b\right )\).

Proof

We will prove the upper and lower bounds only for \(\mathrm {cap}^{u}\left (a,b\right )\), the proof for \(\mathrm {cap}^{l}\left (a,b\right )\) being identical. Note from the definition in (4.3) that

$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \mathrm{cap}^{u}\left(a,b\right)\\ & &\displaystyle = \min_{f\in Q(a,b)}\sum_{i=0}^{\frac{n}{2}\left(b-a\right)-1}\nu^{u} \left(a+\frac{2i}{n}\right)q^{u}\left(a+\frac{2i}{n},a+\frac{2\left(i+1\right)}{n}\right)\\ & &\displaystyle \qquad \times \left[f\left(a+\frac{2i}{n}\right)-f\left(a+\frac{2\left(i+1\right)}{n}\right)\right]^{2}, \end{array} \end{aligned} $$
(4.8)

where it is easy to see that the set \(Q\left (a,b\right )\) in (4.4) may be reduced to

$$\displaystyle \begin{aligned} Q\left(a,b\right)=\Big\{ f\colon \varGamma_n\to\left[0,1\right],\, f\left(x\right)=1 \mbox{ for } x\leq a,\, f\left(x\right)=0\mbox{ for }x\geq b\Big\}. \end{aligned} $$
(4.9)

Note that for every f ∈ Q(a, b) there is some \(0\leq i\leq \frac {n}{2}\left (b-a\right )-1\) such that

$$\displaystyle \begin{aligned} \left|f\left(a+\frac{2i}{n}\right)-f\left(a+\frac{2\left(i+1\right)}{n}\right)\right| \geq\left(\frac{n}{2}\left(b-a\right)\right)^{-1}. {} \end{aligned} $$
(4.10)

Also note that, by (2.13),

$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \nu^{u}\left(a+\frac{2i}{n}\right)q^{u}\left(a+\frac{2i}{n},a+\frac{2\left(i+1\right)}{n}\right)\\ & &\displaystyle = \frac{1}{z^{u}}\frac{n}{2}\left(1-a-\frac{2i}{n}\right) \mathrm{e}^{-\beta n\max\big\{ \varPsi^{u}\big(a+\frac{2i}{n}\big), \varPsi^{u}\big(a+\frac{2(i+1)}{n}\big)\big\}} {n \choose \frac{n}{2}\left(1+a\right)+i}, \end{array} \end{aligned} $$
(4.11)

and that, for any \(\delta \in \mathbb {R}\),

$$\displaystyle \begin{aligned} \varPsi^{u}\left(a+\delta\right)-\varPsi^{u}\left(a\right)=-\delta\left(pa+h^{u}+\frac{p}{2}\delta\right), {} \end{aligned} $$
(4.12)

so that

$$\displaystyle \begin{aligned} \max\left\{ \varPsi^{u}\left(a+\frac{2i}{n}\right), \varPsi^{u}\left(a+\frac{2\left(i+1\right)}{n}\right)\right\} \leq \frac{2}{n}\left(pa+h^{u}+\frac{p}{n}\right) + \varPsi^{u}\left(a+\frac{2i}{n}\right). \end{aligned} $$
(4.13)

Combining (4.9)–(4.13) with \(\delta =\frac {2}{n}\), we get

$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \mathrm{cap}^{u}\left(a,b\right)\\ & &\displaystyle \qquad \geq \min_{0\leq i\leq\frac{n}{2}\left(b-a\right)-1}\frac{2\left(1-b + \frac{2}{n}\right) \mathrm{e}^{-2\beta(p+h^{u}+\frac{p}{n})}}{nz^{u}\left(b-a\right)^{2}} \mathrm{e}^{-\beta n\varPsi^{u}\big(a+\frac{2i}{n}\big)}{n \choose \frac{n}{2}\left(1+a\right)+i}\\ & &\displaystyle \qquad = \frac{2\left(1-b + \frac{2}{n} \right) \mathrm{e}^{-2\beta(p+h^{u} + \frac{p}{n})}}{nz^{u}\left(b-a\right)^{2}} \mathrm{e}^{-\beta n\varPsi^{u}\left(\min(b,{\mathbf{t}}^{u})\right)}{n \choose \frac{n}{2} \left(1+\min(b,{\mathbf{t}}^{u})\right)}, \end{array} \end{aligned} $$
(4.14)

where we use (4.12) plus the fact that, by the definition of m u, t u, s u, for \(a,b\in \left [{\mathbf {m}}^{u},{\mathbf {s}}^{u}\right ]\) with a < b, the function \(i \mapsto \mathrm{e} ^{-\beta n\varPsi ^{u}\left (a+\frac {2i}{n}\right )}{n \choose \frac {n}{2}(1+a)+i}\) is decreasing on [m u, t u] and increasing on [t u, s u]. This settles the lower bound in (4.6).

Arguments similar to the ones above give

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathrm{cap}^{u}\left(a,b\right) & \leq &\displaystyle \nu^{u}\left(\min(b,{\mathbf{t}}^u)-\frac{2}{n}\right)q^{u} \left(\min(b,{\mathbf{t}}^u)-\frac{2}{n},\min(b,{\mathbf{t}}^u)\right)\\ & \leq &\displaystyle \frac{n\left(1-b\right) \mathrm{e}^{2\beta\left(p+h^{u}\right)}}{2z^{u}} \mathrm{e}^{-\beta n\varPsi^{u}\left(\min(b,{\mathbf{t}}^u)\right)} {n \choose \frac{n}{2}\left(1+\min(b,{\mathbf{t}}^u)\right)},\\ \end{array} \end{aligned} $$
(4.15)

where for the first inequality we use the test function f ≡ 1 on \(\left [-1,\min (b,{\mathbf {t}}^u)-\frac {2}{n}\right ]\) and f ≡ 0 on \(\left [\min (b,{\mathbf {t}}^u),1\right ]\) in (4.9). □
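The one-dimensional capacity computations in the proof above rest on the series law for effective resistances. The following numerical sketch (illustrative only; the generic conductances `cond` stand in for the products \(\nu^{u}q^{u}\) and are not taken from the model) checks that the series formula agrees with the variational definition (4.3): the harmonic profile attains the minimum of the Dirichlet form, and any perturbed admissible profile has larger energy.

```python
import random

def dirichlet(cond, f):
    # E(f,f) = sum over edges i ~ i+1 of c_i * (f_i - f_{i+1})^2
    return sum(c * (f[i] - f[i + 1]) ** 2 for i, c in enumerate(cond))

def capacity_series(cond):
    # 1D chains: capacity = inverse of the total resistance (series law)
    return 1.0 / sum(1.0 / c for c in cond)

def harmonic_profile(cond):
    # the minimiser of the Dirichlet form drops across edge i
    # proportionally to its resistance 1/c_i
    R = sum(1.0 / c for c in cond)
    f, cur = [1.0], 1.0
    for c in cond:
        cur -= (1.0 / c) / R
        f.append(cur)
    return f

random.seed(0)
cond = [random.uniform(0.5, 2.0) for _ in range(10)]  # c_i = mu_i * r_{i,i+1}
f_star = harmonic_profile(cond)
cap = capacity_series(cond)
assert abs(dirichlet(cond, f_star) - cap) < 1e-12
# variational characterisation (4.3): any other admissible f has larger energy
g = [min(1.0, max(0.0, x + random.uniform(-0.05, 0.05))) for x in f_star]
g[0], g[-1] = 1.0, 0.0
assert dirichlet(cond, g) >= cap - 1e-12
```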

4.2 Capacity Bounds on ERn(p)

Next we derive capacity bounds for {ξ t}t≥0 on ERn(p). The proof is analogous to what was done in Lemma 4.1 for \(\{\theta _{t}^{u}\} _{t\geq 0}\) and \(\{\theta _{t}^{l}\}_{t\geq 0}\) on K n.

Define the set of direct paths between A ⊆ S n and B ⊆ S n by

$$\displaystyle \begin{aligned} \mathcal{L}_{A,B}=\left\{ \gamma=(\gamma_0,\ldots,\gamma_{|\gamma|})\colon A\to B:\,\left|\gamma_{i+1}\right| =\left|\gamma_{i}\right|+1\mbox{ for all }0\leq i<\left|\gamma\right|\right\}, {} \end{aligned} $$
(4.16)

which may be empty. Recall from (3.12) that \(\vartheta _k = p(1-\frac {k}{n})-h\).
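The cardinality of \(\mathcal{L}_{A,B}\) for the magnetization levels \(A_{k}\), \(A_{k'}\) is computed in (4.29) below. As a sanity check (added here, not part of the paper), it can be confirmed by brute-force enumeration for small n: a direct path is determined by the starting k-set of plus spins and the order in which the k′− k extra vertices are flipped.

```python
from itertools import combinations, permutations
from math import comb, factorial

def count_direct_paths(n, k, kp):
    # enumerate paths A_k -> A_{k'} that add one +1 spin per step
    total = 0
    for start in combinations(range(n), k):
        rest = [v for v in range(n) if v not in start]
        # each ordering of kp-k distinct extra vertices gives one direct path
        total += sum(1 for _ in permutations(rest, kp - k))
    return total

n, k, kp = 6, 2, 5
assert count_direct_paths(n, k, kp) == comb(n, k) * factorial(n - k) // factorial(n - kp)
```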

Lemma 4.2 (Capacity Bounds for {ξ t}t≥0)

With \(\mathbb {P}_{\mathrm {ER}_n(p)}\) -probability tending to 1 as n →∞ the following is true. For every 0 ≤ k < k′≤ n and every \(\varrho \colon \mathbb {N}\to \mathbb {R}_{+}\) satisfying \(\lim _{n\to \infty } \varrho \left (n\right )=\infty \),

(4.17)

where

$$\displaystyle \begin{aligned} \begin{array}{rcl} k_{m} = \mathrm{argmin}_{k\leq j \leq k'} {n \choose j}\mathrm{e}^{-\beta\,2j\vartheta_{j}}. \end{array} \end{aligned} $$
(4.18)

Proof

Recall from (4.1) and (4.3) that

$$\displaystyle \begin{aligned} \mathrm{cap}\left(A_{k},A_{k'}\right) = \min_{f\in Q(A_{k},A_{k'})}\mathcal{E}\left(f,f\right) = \min_{f\in Q(A_{k},A_{k'})}\,\,\frac{1}{2} \sum_{\sigma,\xi\in S_{n}}\mu\left(\sigma\right)r\left(\sigma,\xi\right) \left[f\left(\sigma\right)-f\left(\xi\right)\right]^{2}, \end{aligned} $$
(4.19)

where

$$\displaystyle \begin{aligned} Q(A_k,A_{k'})=\left\{ f\colon\, S_{n}\to\left[0,1\right]\colon\, f|{}_{A_{k}}\equiv1,\, f|{}_{A_{k'}}\equiv0\right\}. \end{aligned} $$
(4.20)

The proof comes in three Steps.

  1. 1.

    We first prove the upper bound in (4.17). Let \(B=\bigcup _{j=k}^{k_{m}-1}A_{j}\), and note that, by (1.7),

    (4.21)

    Recall from (3.4) that \(\phi _{i}^{k}\) denotes the cardinality of the set of all σ ∈ A k with \(\left |\partial _{E}\sigma \right |=pk\left (n-k\right )+i\). Note from (1.3) that for any \(\xi \in A_{k_{m}}\) such that \(\left |\partial _{E}\xi \right |=pk_{m}\left (n-k_{m}\right )+i\),

    (4.22)

    There are \({n \choose k_{m}}\) terms in the sum, and therefore we get

    (4.23)

    with \(Y=\sqrt {\log (\varrho (n)^{2}n^{5/6})k_{m}(n-k_{m})}\). The choice of Y  will become clear shortly. The summand in the right-hand side can be bounded as follows. By the sandwich in (3.2) in Lemma 3.2, the sum over i < −Y  can be restricted to − cn 3∕2 ≤ i < −Y , since with high probability no configuration has a boundary size that deviates by more than cn 3∕2 from the mean. But, using Lemma 3.3, we can also bound from above the number of configurations whose boundary size deviates from the mean by at least Y , i.e., we can bound \(\phi _{i}^{k_{m}}\) for − cn 3∕2 ≤ i < −Y . Taking a union bound over 0 ≤ k ≤ n and − cn 3∕2 ≤ i < −Y , we get

    $$\displaystyle \begin{aligned} \mathbb{P}\left[\bigcup_{k=0}^{n}\,\,\bigcup_{i=-cn^{3/2}}^{-Y}\left\{ \phi_{i}^{k_{m}} \geq\varrho\left(n\right)n^{5/2}{n \choose k_{m}} \mathrm{e}^{-\frac{2i^{2}}{k_{m}\left(n-k_{m}\right)}}\right\} \right]\leq\frac{1}{\varrho\left(n\right)}. {} \end{aligned} $$
    (4.24)

    Thus, with \(\mathbb {P}_{\mathrm {ER}_n(p)}\)-probability at least \(1-\frac {1}{\varrho \left (n\right )}\),

    $$\displaystyle \begin{aligned} \begin{array}{ll} \sum_{i<-Y}\phi_{i}^{k_{m}} \mathrm{e}^{-\beta\left(2k_{m}\,\vartheta_{k_{m}}+\frac{2i}{n}\right)} &\leq \sum_{i>Y}\varrho\left(n\right)n^{5/2}{n \choose k_{m}} \mathrm{e}^{-2i\left(\frac{i}{k_{m}\left(n-k_{m}\right)}-\frac{\beta}{n}\right)} \mathrm{e}^{-\beta\,2k_{m}\,\vartheta_{k_{m}}}\\[0.2cm] &\leq \varrho\left(n\right)n^{5/2}{n \choose k_{m}} \mathrm{e}^{-\beta\,2k_{m}\,\vartheta_{k_{m}}} \mathrm{e}^{-2\log(\varrho(n)n^{5/6})}\\[0.2cm] &\leq\frac{n^{5/6}}{\varrho\left(n\right)}{n \choose k_{m}} \mathrm{e}^{-\beta\,2k_{m}\,\vartheta_{k_{m}}}, \end{array} \end{aligned} $$
    (4.25)

    where we use that, for i > Y  and n sufficiently large,

    $$\displaystyle \begin{aligned} \frac{i}{k_{m}\left(n-k_{m}\right)}-\frac{\beta}{n} \geq\sqrt{\frac{\log(\varrho\left(n\right)^{2}n^{5/6})}{k_{m}\left(n-k_{m}\right)}}-\frac{\beta}{n} \geq\sqrt{\frac{\log(\varrho\left(n\right)n^{5/6})}{k_{m}\left(n-k_{m}\right)}}. \end{aligned} $$
    (4.26)

    The above inequality also clarifies our choice of Y . Substituting it into (4.23), we see that

    (4.27)

    A similar bound holds for \(\sum _{\xi \in A_{k_{m}-1}} \mathrm{e} ^{-\beta H_{n}\left (\xi \right )}\). A union bound over 1 ≤ k m ≤ n increases the exponent \(\frac {5}{6}\) to \(\frac {11}{6}\). Together with (4.21), this proves the upper bound in (4.17).

  2. 2.

    We next derive a combinatorial bound that will be used later for the proof of the lower bound in (4.17). Note that if \(f\in Q(A_k,A_{k'})\) and \(\gamma \in \mathcal {L}_{A_{k},A_{k'}}\) (recall (4.16)), then there must be some 1 ≤ i ≤ k′− k such that

    $$\displaystyle \begin{aligned} \left|f\left(\gamma_{i-1}\right)-f\left(\gamma_{i}\right)\right|\geq\left(k'-k\right)^{-1}. {} \end{aligned} $$
    (4.28)

    A simple counting argument shows that

    $$\displaystyle \begin{aligned} \left|\mathcal{L}_{A_{k},A_{k'}}\right|={n \choose k}\,\frac{\left(n-k\right)!}{\left(n-k'\right)!}, \end{aligned} $$
    (4.29)

    since for each σ ∈ A k there are \(\left (n-k\right )\times \left (n-k-1\right )\times \cdots \times \left (n-k'+1\right )\) paths in \(\mathcal {L}_{A_{k},A_{k'}}\) from σ to \(A_{k'}\). Let

    $$\displaystyle \begin{aligned} \begin{array}{ll} b_{i} & =\left|\left\{ \left(\sigma,\xi\right)\in A_{k+i-1}\times A_{k+i}\colon\, \left|f\left(\sigma\right)-f\left(\xi\right)\right| \geq\left(k'-k\right)^{-1},\,\sigma\sim\xi\right\} \right|,\\ &\qquad 1\leq i\leq k'-k. \end{array} \end{aligned} $$
    (4.30)

    We claim that

    $$\displaystyle \begin{aligned} \exists\,1\leq i_{\blacklozenge}\leq k'-k\colon \qquad b_{i_{\blacklozenge}}\geq\frac{k}{k'-k}{n \choose k+i_{\blacklozenge}}. \end{aligned} $$
    (4.31)

    Indeed, the number of paths in \(\mathcal {L}_{A_{k},A_{k'}}\) that pass through \(\sigma \in A_{k+i_{\blacklozenge }-1}\) followed by a move to \(\xi \in A_{k+i_{\blacklozenge }}\) equals

    $$\displaystyle \begin{aligned} z_{i_{\blacklozenge}}=\frac{\left(k+i_{\blacklozenge}-1\right)!}{k!}\times \frac{\left(n-k-i_{\blacklozenge}\right)!}{\left(n-k'\right)!}, {} \end{aligned} $$
    (4.32)

    where the first term in the product counts the number of paths from \(\sigma \in A_{k+i_{\blacklozenge }-1}\) to A k, while the second term counts the number of paths from \(\xi \in A_{k+i_{\blacklozenge }}\) to \(A_{k'}\). Thus, if (4.31) fails, then

    $$\displaystyle \begin{aligned} \sum_{i=1}^{k'-k}b_{i}z_{i} < \frac{k}{k'-k}\sum_{i=1}^{k'-k}\frac{1}{k+i}\frac{n!}{k!\left(n-k'\right)!} \leq \frac{n!}{k!\left(n-k'\right)!}=\left|\mathcal{L}_{A_{k},A_{k'}}\right|, \end{aligned} $$
    (4.33)

    which in turn implies that (4.28) does not hold for some \(\gamma \in \mathcal {L}_{A_{k},A_{k'}}\) (use that \(\sum _{i}b_{i}z_{i}\) counts, with multiplicity, the paths that contain a jump satisfying (4.28)), which is a contradiction. Hence the claim in (4.31) holds.

  3. 3.

    In this part we prove the lower bound in (4.17). By Lemma 3.3 we have that, with \(\mathbb {P}_{\mathrm {ER}_n(p)}\)-probability at least \(1-\frac {1}{\varrho (n)}\), for any Y ≥ 0,

    $$\displaystyle \begin{aligned} \sum_{j\geq\sqrt{Y\left(k+i_{\blacklozenge}\right)\left(n-k-i_{\blacklozenge}\right)}} \phi_{j}^{k+i_{\blacklozenge}}\leq\varrho\left(n\right){n \choose k+i_{\blacklozenge}} \mathrm{e}^{-2Y}. \end{aligned} $$
    (4.34)

    Picking \(Y=\log \left (2\varrho \left (n\right )n^{3/2}/k\right )\), we get that

    $$\displaystyle \begin{aligned} \sum_{j\geq\sqrt{Y\left(k+i_{\blacklozenge}\right)\left(n-k-i_{\blacklozenge}\right)}} \phi_{j}^{k+i_{\blacklozenge}} \leq\frac{1}{4}\frac{k^{2}\varrho(n)}{\varrho(n)^{2}n^{3}} {n \choose k+i_{\blacklozenge}} \leq\frac{1}{2}\frac{k}{n\left(k'-k\right)}{n \choose k+i_{\blacklozenge}}, \end{aligned} $$
    (4.35)

    and so at least half of the configurations contributing to \(b_{i_{\blacklozenge }}\) have an edge-boundary of size at most

    $$\displaystyle \begin{aligned} p\left(k+i_{\blacklozenge}\right)\left(n-k-i_{\blacklozenge}\right) +\sqrt{Y\left(k+i_{\blacklozenge}\right)\left(n-k-i_{\blacklozenge}\right)}. \end{aligned} $$
    (4.36)

    If \(\xi \in A_{k+i_{\blacklozenge }}\) is such a configuration, then by Lemma 3.2 the same is true for any σ ∼ ξ (i.e., configurations differing at only one vertex), since

    (4.37)

    This implies

    (4.38)

    Therefore

    (4.39)

    Since (4.39) is true for any \(f\in Q(A_k,A_{k'})\), the lower bound in (4.17) follows, with k m defined in (4.18).
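The path count \(z_{i_{\blacklozenge}}\) in (4.32), on which Step 2 rests, can likewise be spot-checked by enumeration for small n. The following sketch (illustrative only, with the hypothetical values n = 6, k = 1, k′ = 5, i = 2 chosen for speed) counts the direct paths that visit a fixed σ and then move to a fixed neighbour ξ.

```python
from itertools import combinations, permutations
from math import factorial

def direct_paths(n, k, kp):
    # a direct path = (starting k-set, ordering of the kp-k added vertices)
    for start in combinations(range(n), k):
        rest = [v for v in range(n) if v not in start]
        for order in permutations(rest, kp - k):
            yield frozenset(start), order

n, k, kp, i = 6, 1, 5, 2
sigma = frozenset({0, 1})       # a configuration in A_{k+i-1}
xi = frozenset({0, 1, 2})       # its neighbour in A_{k+i}
hits = 0
for start, order in direct_paths(n, k, kp):
    levels = [start]
    for v in order:
        levels.append(levels[-1] | {v})
    # does the path visit sigma and then move to xi?
    for a, b in zip(levels, levels[1:]):
        if a == sigma and b == xi:
            hits += 1
# formula (4.32): paths to sigma from A_k, times paths from xi to A_{k'}
z = (factorial(k + i - 1) // factorial(k)) * (factorial(n - k - i) // factorial(n - kp))
assert hits == z
```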

4.3 Hitting Probabilities on ERn(p)

Let \(\mu _{A_{\mathbf {M}}}\) be the equilibrium distribution μ conditioned to the set A M. Write \(\mathbb {P}^{l}\) and \(\mathbb {P}^{u}\) to denote the laws of the processes \(\{\xi _{t}^{l}\} _{t\geq 0}\) and \(\{ \xi _{t}^{u}\} _{t\geq 0}\), respectively. The following lemma is the crucial sandwich for comparing the crossover times of the dynamics on ERn(p) and the perturbed dynamics on K n.

Lemma 4.3 (Rank Ordering of Hitting Probabilities)

With \(\mathbb {P}_{\mathrm {ER}_n(p)}\) -probability tending to 1 as n →∞,

$$\displaystyle \begin{aligned} \begin{array}{ll} &\max_{\xi\in A_{{\mathbf{M}}^l}} \mathbb{P}_{\xi}^{l}\left[\tau_{{\mathbf{S}}^l}<\tau_{{\mathbf{M}}^l}\right] \leq\mathbb{P}_{\mu_{A_{\mathbf{M}}}}\left[\tau_{\mathbf{S}}<\tau_{\mathbf{M}}\right]\\[0.2cm] &\leq \min_{\sigma\in A_{{\mathbf{M}}^u}}\mathbb{P}^{u}_{\sigma}\left[\tau_{{\mathbf{S}}^u} <\tau_{{\mathbf{M}}^u}\right]. \end{array} {} \end{aligned} $$
(4.40)

Proof

The proof comes in three Steps.

  1. 1.

    Recall from (1.10) that the magnetization of σ ∈ A k is \(m\left (\sigma \right )=2\frac {k}{n}-1\). We first observe that the maximum and the minimum in (4.40) are redundant, because by symmetry

    $$\displaystyle \begin{aligned} \begin{array}{ll} \max_{\xi\in A_{k}}\mathbb{P}_{\xi}^{l}\left[\tau_{A_{k'}}<\tau_{A_{k}}\right] &= \min_{\xi\in A_{k}}\mathbb{P}_{\xi}^{l}\left[\tau_{A_{k'}}<\tau_{A_{k}}\right],\\[0.2cm] \min_{\xi\in A_{k}}\mathbb{P}_{\xi}^{u}\left[\tau_{A_{k'}}<\tau_{A_{k}}\right] &= \max_{\xi\in A_{k}}\mathbb{P}_{\xi}^{u}\left[\tau_{A_{k'}}<\tau_{A_{k}}\right]. \end{array} \end{aligned} $$
    (4.41)

    Recall that \(\{\xi _{t}^{l}\}_{t\geq 0}\) is the Markov process on S n governed by the Hamiltonian \(H_{n}^{l}\) in (2.3), and that the associated magnetization process \(\{\theta _{t}^{l}\} _{t\geq 0}=\{m(\xi _{t}^{l})\} _{t\geq 0}\) is a Markov process on the set Γ n in (1.15) with transition rates given by q l in (2.8). Denoting by \(\hat {\mathbb {P}}^{l}\) the law of \(\{\theta _{t}^{l}\} _{t\geq 0}\), we get from (4.5) that for any 0 ≤ k ≤ k′ < n, and with \(a=\frac {2k}{n}-1\) and \(b=\frac {2k'}{n}-1\),

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \mathrm{cap}^{l}\left(a,b\right) & =&\displaystyle \sum_{u \in \varGamma_n} \nu ^{l}\left(a\right) q^{l}\left(a,u\right) \hat{\mathbb{P}}_{a}^{l}\left[\tau_{b}<\tau_{a}\right]\\ & =&\displaystyle \nu ^{l}\left(a\right) \left[ q^{l}\left(a,a+\frac{2}{n}\right) + q^{l}\left(a,a-\frac{2}{n}\right) \right] \hat{\mathbb{P}}_{a}^{l}\left[\tau_{b}<\tau_{a}\right], \end{array} \end{aligned} $$
    (4.42)

    and therefore

    $$\displaystyle \begin{aligned} \begin{array}{ll} \max_{\xi\in A_{k}}\mathbb{P}_{\xi}^{l}\left[\tau_{A_{k'}}<\tau_{A_{k}}\right] &= \hat{\mathbb{P}}_{a}^{l}\left[\tau_{b}<\tau_{a}\right]\\ &= \left[\nu^{l}\left(a\right)\left(q^{l}\left(a,a+\frac{2}{n}\right) +q^{l}\left(a,a-\frac{2}{n}\right)\right)\right]^{-1}\mathrm{cap}^{l}\left(a,b\right). {} \end{array} \end{aligned} $$
    (4.43)

    By (2.13), using the abbreviations

    $$\displaystyle \begin{aligned} \varPsi_{1}=\max\left\{ \varPsi^{l}\left(a\right),\varPsi^{l}\left(a+\frac{2}{n}\right)\right\}, \qquad \varPsi_{2}=\max\left\{ \varPsi^{l}\left(a\right),\varPsi^{l}\left(a-\frac{2}{n}\right)\right\}, \end{aligned} $$
    (4.44)

    we have, with the help of (4.12),

    $$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \nu^{l}\left(a\right)\left(q^{l}\left(a,a+\frac{2}{n}\right)+q^{l}\left(a,a-\frac{2}{n}\right)\right) \\ & &\displaystyle \quad = \frac{1}{z^{l}}\frac{n}{2}{n \choose \frac{n}{2}\left(1+a\right)} \Big(\left(1-a\right) \mathrm{e}^{-\beta n\varPsi_{1}}+\left(1+a\right)\mathrm{e}^{-\beta n\varPsi_{2}}\Big) {}\\ & &\displaystyle \quad \geq \frac{1}{z^{l}}n \mathrm{e}^{-2\beta\left(p\left|a\right|+h^{l}+\frac{p}{n}\right)} \mathrm{e}^{-\beta n\varPsi^{l}\left(a\right)} {n \choose \frac{n}{2}\left(1+a\right)}. \end{array} \end{aligned} $$
    (4.45)

    From Lemma 4.1 we have that

    $$\displaystyle \begin{aligned} \mathrm{cap}^{l}\left(a,b\right)\leq\frac{n\left(1-a\right)}{2z^{l}} \mathrm{e}^{-\beta n\varPsi^{l}\left(b\right)}{n \choose \frac{n}{2}\left(1+b\right)}. {} \end{aligned} $$
    (4.46)

    Putting (4.43), (4.45) and (4.46) together, we get

    $$\displaystyle \begin{aligned} \begin{array}{ll} \max_{\xi\in A_{k}}\mathbb{P}_{\xi}^{l}\left[\tau_{A_{k'}}<\tau_{A_{k}}\right] &\leq \frac{\left(1-a\right)}{2} \mathrm{e}^{2\beta\left(p\left|a\right|+h^{l} +\frac{p}{n}\right)} \mathrm{e}^{-\beta n\left[\varPsi^{l}\left(b\right)-\varPsi^{l}\left(a\right)\right]}\\ &\qquad \times {n \choose \frac{n}{2}\left(1+b\right)}{n \choose \frac{n}{2}\left(1+a\right)}^{-1}. \end{array} \end{aligned} $$
    (4.47)

    Similarly, denoting by \(\hat {\mathbb {P}}^{u}\) the law of \(\{\theta _{t}^{u}\} _{t\geq 0}\), we have

    $$\displaystyle \begin{aligned} \begin{array}{rcl} {} \min_{\xi\in A_{k}}\mathbb{P}_{\xi}^{u}\left[\tau_{A_{k'}}<\tau_{A_{k}}\right] & = &\displaystyle \hat{\mathbb{P}}_{a}^{u}\left[\tau_{b}<\tau_{a}\right]\\ & = &\displaystyle \left[\nu^{u}\left(a\right)\left(q^{u}\left(a,a+\frac{2}{n}\right) +q^{u}\left(a,a-\frac{2}{n}\right)\right)\right]^{-1}\mathrm{cap}^{u}\left(a,b\right), \end{array} \end{aligned} $$
    (4.48)

    where

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\left[\nu^{u}\left(a\right)\left(q^{u}\left(a,a+\frac{2}{n}\right) +q^{u}\left(a,a-\frac{2}{n}\right)\right)\right]^{-1}\\[0.2cm] &\geq\left[\frac{n}{z^{u}} \mathrm{e}^{2\beta\left(p\left|a\right|+h^{u}+\frac{p}{n}\right)} \mathrm{e}^{-\beta n\varPsi^{u}\left(a\right)}{n \choose \frac{n}{2}\left(1+a\right)}\right]^{-1}, \end{array} \end{aligned} $$
    (4.49)

    and, by Lemma 4.1,

    $$\displaystyle \begin{aligned} \mathrm{cap}^{u}\left(a,b\right) \geq \frac{1}{2nz^{u}} \mathrm{e}^{-\beta n\varPsi^{u}\left(b\right)}{n \choose \frac{n}{2}\left(1+b\right)}. \end{aligned} $$
    (4.50)

    Putting (4.48)–(4.50) together, we get

    $$\displaystyle \begin{aligned} \min_{\xi\in A_{k}}\mathbb{P}_{\xi}^{u}\left[\tau_{A_{k'}}<\tau_{A_{k}}\right] \geq \frac{1}{n} \mathrm{e}^{-\beta n\left[\varPsi^{l}\left(b\right)-\varPsi^{l} \left(a\right)\right]}{n \choose \frac{n}{2}\left(1+b\right)} {n \choose \frac{n}{2}\left(1+a\right)}^{-1}. {} \end{aligned} $$
    (4.51)
  2. 2.

    Recall from (4.5) that

    $$\displaystyle \begin{aligned} \mathrm{cap}\left(A_{k},A_{k'}\right) =\sum_{\sigma\in A_{k}}\sum_{\xi \in S_{n}}\mu(\sigma)r(\sigma,\xi) \mathbb{P}_{\sigma}\left[\tau_{A_{k'}}<\tau_{A_{k}}\right]. \end{aligned} $$
    (4.52)

    Split

    $$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{\sigma \in A_{k}}\sum_{\xi \in S_{n}} \mu(\sigma)r(\sigma,\xi)\\ & &\displaystyle = \sum_{\sigma \in A_{k}} \sum_{\xi \in A_{k+1}} \mu(\sigma)r(\sigma,\xi) + \sum_{\sigma \in A_{k}} \sum_{\xi \in A_{k-1}} \mu(\sigma)r(\sigma,\xi)\\ & &\displaystyle = \frac{1}{Z} \sum_{\sigma \in A_{k}} \sum_{\xi \in A_{k+1}} \mathrm{e}^{-\beta\max\left\{ H_{n}\left(\sigma\right),H_{n}\left(\xi\right)\right\}} +\frac{1}{Z}\sum_{\sigma\in A_{k}}\sum_{\xi\in A_{k-1}} \mathrm{e}^{-\beta\max\left\{ H_{n}\left(\sigma\right),H_{n}\left(\xi\right)\right\}}. \end{array} \end{aligned} $$
    (4.53)

    By Lemma 3.3 and a reasoning similar to that leading to (4.38),

    (4.54)

    with \(Y=\log (\sqrt {2\varrho (n)})\). Indeed, by (3.6) fewer than \(\frac {1}{2}{n \choose k}\) configurations in A k have an edge-boundary of size \(\geq pk\left (n-k\right )+\sqrt {k\left (n-k\right )Y}\). Moreover, if ξ ∼ ξ′, then, by Lemma 3.2,

    $$\displaystyle \begin{aligned} \mathrm{e}^{\beta[ H_{n}(\xi')-H_{n}(\xi)]} \leq [1+o(1)]\,\mathrm{e}^{\beta\left(p+h\right)}, \end{aligned} $$
    (4.55)

    and since we may absorb this constant inside the error term \(\varrho \left (n\right )\), we get that

    (4.56)
    (4.57)

    and hence

    (4.58)
  3. 3.

    Similar bounds can be derived for \(\mathbb {P}_{\mu _{A_{k}}}\left [\tau _{A_{k'}} < \tau _{A_{k}}\right ]\). Indeed, by Lemma 3.5, \(r\left (\sigma ,\xi \right ) = 1\) for every σ ∈ A k and all but O(n 2∕3) of its neighbouring configurations ξ ∈ S n. Therefore

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \mathrm{cap}(A_{k},A_{k'}) & =&\displaystyle n\,[1+o(1)] \sum_{\sigma \in A_{k}} \mu(\sigma) \mathbb{P}_{\sigma}\left[\tau_{A_{k'}} < \tau_{A_{k}}\right]\\ & =&\displaystyle n\,[1+o(1)]\,\mu(A_{k})\,\mathbb{P}_{\mu_{A_{k}}}\left[\tau_{A_{k'}} < \tau_{A_{k}}\right] \end{array} \end{aligned} $$
    (4.59)

    and hence

    $$\displaystyle \begin{aligned} \mathbb{P}_{\mu_{A_{k}}}\left[\tau_{A_{k'}} < \tau_{A_{k}}\right] = [1+o(1)]\, \frac{\mathrm{cap}(A_{k}, A_{k'})}{n\mu(A_{k})}. \end{aligned} $$
    (4.60)

    Note that

    $$\displaystyle \begin{aligned} \mu(A_{k}) = \frac{1}{Z}\sum_{\sigma \in A_{k}}\,\mathrm{e}^{-\beta H_{n}(\sigma)}, \end{aligned} $$
    (4.61)

    and we have already produced bounds for a sum like (4.61) in Lemma 4.2. Referring to (4.28), we see that

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{P}_{\mu_{A_{k}}}\left[\tau_{A_{k'}} < \tau_{A_{k}}\right] & \leq&\displaystyle [1+o(1)]\, \frac{\mathrm{cap}( A_{k}, A_{k'})}{\frac{1}{Z}\,\mathrm{e}^{-(\beta + \frac{1}{\sqrt{3}})\sqrt{\log n}} {n \choose k} \mathrm{e}^{-\beta2k\vartheta_{k}}}, \end{array} \end{aligned} $$
    (4.62)
    $$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{P}_{\mu_{A_{k}}}\left[\tau_{A_{k'}} < \tau_{A_{k}}\right] & \geq&\displaystyle [1+o(1)]\, \frac{\mathrm{cap}(A_{k}, A_{k'})}{\frac{1}{Z}\,n^{17/6} \mathrm{e}^{-(\beta + \frac{1}{\sqrt{3}})\sqrt{\log n}}{n \choose k}\mathrm{e}^{-\beta 2k\vartheta_{k}}}.\quad \end{array} \end{aligned} $$
    (4.63)

    Finally, we note that if we let Δ h = h − h u, then

    $$\displaystyle \begin{aligned} \frac{\mathrm{e}^{-\beta n[\varPsi^{u}({\mathbf{s}}^{u})-\varPsi^{u}({\mathbf{m}}^{u})]}} {\mathrm{e}^{-\beta n[\varPsi(\mathbf{s})-\varPsi(\mathbf{m})]}} = \mathrm{e}^{\beta nC_{\beta,h,p}\varDelta_{h}}, \end{aligned} $$
    (4.64)

    where C β,h,p is a constant that depends on the parameters β, p and h. A similar expression follows for the ratio

    $$\displaystyle \begin{aligned} {n \choose \frac{n}{2}(1+{\mathbf{s}}^u)}{n \choose \frac{n}{2}(1+{\mathbf{m}}^u)}^{-1} \left[{n \choose \frac{n}{2}(1+\mathbf{s})}{n \choose \frac{n}{2}(1+\mathbf{m})}^{-1}\right]^{-1}. \end{aligned} $$
    (4.65)

    From this the statement of the lemma follows.
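The mechanism behind (4.64) is that the magnetic field enters Ψ linearly. With \(\varPsi (a)=-\frac {p}{2}a^{2}-ha\) (recalled in the proof of Lemma 5.1 below) and \(\varPsi ^{u}\) the same expression with h replaced by \(h^{u}=h-\varDelta _{h}\), the following exact identity holds (added here as an illustration, not part of the original computation):

```latex
\varPsi^{u}(x) = \varPsi(x) + \varDelta_{h}\,x
\qquad\Longrightarrow\qquad
\varPsi^{u}({\mathbf{s}}^{u})-\varPsi^{u}({\mathbf{m}}^{u})
  = \varPsi({\mathbf{s}}^{u})-\varPsi({\mathbf{m}}^{u})
    + \varDelta_{h}\left({\mathbf{s}}^{u}-{\mathbf{m}}^{u}\right),
% so the perturbed barrier differs from the unperturbed one by a term of
% order n*Delta_h in the exponent, which is the content of (4.64).
```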

5 Invariance Under Initial States and Refined Capacity Estimates

In this section we use Lemma 4.3 to control the time it takes {m(ξ t)}t≥0 to cross the interval [t u, s u] ∩ [t l, s l], which will be a good indicator of the time it takes {ξ t}t≥0 to reach the basin of the stable state s. In particular, our aim is to control this time by comparing it with the time it takes \(\{\theta _{t}^{u}\} _{t\geq 0}\) and \(\{\theta _{t}^{l}\} _{t\geq 0}\) defined in (2.6) to do the same for s u and s l. In Sect. 5.1 we derive bounds on the probability of certain rare events for the dynamics on ERn(p) (Lemmas 5.1–5.4 below). In Sect. 5.2 we use these bounds to prove that hitting times are close to being uniform in the starting configuration.

5.1 Estimates for Rare Events

In this section we prove four lemmas that serve as a preparation for the coupling in Sect. 6. Lemma 5.1 shows that the dynamics starting anywhere in A M is unlikely to stray away from A M by much during a time window that is comparatively small. Lemma 5.2 bounds the total variation distance between two exponential random variables whose means are close. Lemma 5.3 bounds the tail of the distribution of the first time when all the vertices have been updated. Lemma 5.4 bounds the number of returns to A M before A S is hit.

We begin by deriving upper and lower bounds on the number of jumps \(N_{\xi }\left (t\right )\) taken by the process {ξ t}t≥0 up to time t. By Lemma 3.4, the jump rate from any σ ∈ S n is bounded by

$$\displaystyle \begin{aligned} n\,\mathrm{e}^{-2\beta\left(p+h\right)}\leq\sum_{\sigma'\in S_{n}}r\left(\sigma,\sigma'\right)\leq n. \end{aligned} $$
(5.1)

Hence \(N_{\xi }\left (t\right )\) can be stochastically bounded from above by a Poisson random variable with parameter tn, and from below by a Poisson random variable with parameter \(tn\,\mathrm{e}^{-2\beta (p+h)}\). It therefore follows that, for any M ≥ 0,

$$\displaystyle \begin{aligned} \begin{array}{ll} \mathbb{P}\left[N_{\xi}\left(t\right) \geq M\right] &\leq \chi_M(nt),\\[0.2cm] \mathbb{P}\left[N_{\xi}\left(t\right) < M\right] &\leq 1- \chi_M\big(nt\,\mathrm{e}^{-2\beta(p+h)}\big), \end{array} {} \end{aligned} $$
(5.2)

where we abbreviate \(\chi _M(u) = \mathrm{e}^{-u} \sum _{k\geq M} u^{k}/k!\), \(u \in \mathbb {R}_{+}\), \(M \in \mathbb {N}\).
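The Poisson-tail bound (5.2) can be checked numerically. The following sketch (illustrative only; the value \(\beta (p+h)=0.1\) is a hypothetical choice, not taken from the paper) implements \(\chi_M\) via the complementary finite sum and compares the bound against a Monte Carlo simulation of the jump count.

```python
import math
import random

def chi(M, u):
    # chi_M(u) = P[Poisson(u) >= M] = e^{-u} * sum_{k>=M} u^k / k!
    return 1.0 - math.exp(-u) * sum(u**k / math.factorial(k) for k in range(M))

# sanity: chi_1(u) = P[Poisson(u) >= 1] = 1 - e^{-u}
assert abs(chi(1, 2.0) - (1.0 - math.exp(-2.0))) < 1e-12

def poisson_sample(lam):
    # inverse-transform sampling for a Poisson(lam) variate
    x, k, p = random.random(), 0, math.exp(-lam)
    cum = p
    while x > cum:
        k += 1
        p *= lam / k
        cum += p
    return k

# Monte Carlo check of the upper bound in (5.2): jumps occur at rate <= n,
# so P[N_xi(t) >= M] <= chi_M(n*t)
random.seed(1)
n, t, M = 20, 0.5, 15
rate = n * math.exp(-2 * 0.1)   # a rate in [n e^{-2 beta (p+h)}, n], beta(p+h)=0.1 assumed
freq = sum(poisson_sample(rate * t) >= M for _ in range(20000)) / 20000
assert freq <= chi(M, n * t) + 0.01
```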

5.1.1 Localisation

The purpose of the next lemma is to show that the probability of {ξ t}t≥0 straying too far from A M during its first \(n^{2}\log n\) jumps is very small. The seemingly arbitrary choice of \(n^{2}\log n\) is in fact related to the Coupon Collector’s problem.

Lemma 5.1 (Localisation)

Let ξ 0 ∈ A M, \(T=\inf \left \{ t\geq 0\colon N_{\xi }\left (t\right )\geq n^{2}\log n\right \}\) , and let \(C_{1}\in \mathbb {R}\) be a sufficiently large constant, possibly dependent on p and h (but not on n). Then

$$\displaystyle \begin{aligned} \mathbb{P}_{\xi_{0}}\left[\xi_{t}\in A_{\mathbf{M}+C_{1}n^{5/6}} \mathit{\mbox{ for some }} 0\leq t\leq T\right] \leq \mathrm{e}^{-n^{2/3}}. \end{aligned} $$
(5.3)

Proof

The idea of the proof is to show that {ξ t}t≥0 returns many times to A M before reaching \(A_{\mathbf {M}+C_{1}n^{5/6}}\). The proof comes in three Steps.

  1. 1.

    We begin by showing that \(T\leq n^{2}\log n\) with probability \(\geq 1- \mathrm{e} ^{-n^{3}}\) (in other words, it takes less than \(n^{2}\log n\) time to make \(n^{2}\log n\) steps). Indeed, by the second line of (5.2),

    $$\displaystyle \begin{aligned} \begin{array}{rcl} {} & &\displaystyle \mathbb{P}\left[T>n^{2}\log n\right]\\[0.2cm] & &\displaystyle \qquad = \mathbb{P}\left[N_{\xi}\left(n^{2}\log n\right)<n^{2}\log n\right] \\ & &\displaystyle \qquad \leq 1-\chi_{n^{2} \log n}\Big((n^{3} \log n)\,\mathrm{e}^{-2\beta\left(p+h\right)}\Big) \\ & &\displaystyle \qquad \leq \sum_{k=0}^{n^{2}\log n}\exp\left(-(n^{3}\log n)\, \mathrm{e}^{-2\beta\left(p+h\right)}+k\log\left(\frac{e\,n^{3}\log n}{k}\right)\right) \\ & &\displaystyle \qquad \leq (n^{2}\log n)\exp\left(-(n^{3}\log n)\, \mathrm{e}^{-2\beta\left(p+h\right)}+n^{5/2}\right) \\ & &\displaystyle \qquad \leq \mathrm{e}^{-n^{3}}, \end{array} \end{aligned} $$
    (5.4)

    where for the second inequality we use that \(k! \geq (\frac {k}{e})^{k}\), \(k \in \mathbb {N}\), and for the third inequality that, for n sufficiently large,

    $$\displaystyle \begin{aligned} k\log\left(\frac{en^{3}\log n}{k}\right) \leq (n^{2}\log n) \log\left(e\,n^{3}\log n\right)\leq n^{5/2}. \end{aligned} $$
    (5.5)

    Next, observe that

    $$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \mathbb{P}_{\xi_{0}}\left[\xi_{t}\in A_{\mathbf{M}+C_{1}n^{5/6}} \mbox{ for some }0\leq t\leq T\right]\\ & &\displaystyle = \mathbb{P}_{\xi_{0}}\left[\xi_{t}\in A_{\mathbf{M}+C_{1}n^{5/6}} \mbox{ for some }0\leq t\leq T,\, T\leq n^{2}\log n \right]\\ & &\displaystyle \quad +\, \mathbb{P}_{\xi_{0}}\left[\xi_{t}\in A_{\mathbf{M}+C_{1}n^{5/6}} \mbox{ for some }0\leq t\leq T,\, T>n^{2}\log n \right]\\ & &\displaystyle \leq (n^{2}\log n) \max_{\sigma\in A_{\mathbf{M}}} \mathbb{P}_{\sigma}\left[\tau_{A_{\mathbf{M}+C_{1}n^{5/6}}}<\tau_{A_{\mathbf{M}}}\right] + \mathrm{e}^{-n^{3}}. \end{array} \end{aligned} $$
    (5.6)

    Here, the inequality follows from (5.4) and the observation that the event \(\xi _{t}\in A_{\mathbf {M}+C_{1}n^{5/6}} \mbox{ for some } 0\leq t\leq T\) with \(T \leq n^{2}\log n\) is contained in the event that \(A_{\mathbf {M}+C_{1}n^{5/6}}\) is visited before the \((n^{2}\log n)\)-th return to A M. From Lemma 4.3 and (4.47) it follows that

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\max_{\sigma\in A_{\mathbf{M}}}\mathbb{P}_{\sigma} \left[\tau_{A_{\mathbf{M}+C_{1}n^{5/6}}}<\tau_{A_{\mathbf{M}}}\right]\\ {} &\leq\frac{\left(1-a\right)}{2} \mathrm{e}^{2\beta\left(p\left|a\right|+h^{l}+\frac{p}{n}\right)} \mathrm{e}^{-\beta n[\varPsi\left(b\right)-\varPsi\left(a\right)]} {n \choose \frac{n}{2}\left(1+b\right)}{n \choose \frac{n}{2}\left(1+a\right)}^{-1} \end{array} \end{aligned} $$
    (5.7)

    with \(a=\frac {2\mathbf {M}}{n}-1\) and \(b=a+C_{1}n^{-1/6}\).

  2. 2.

    Our assumption on the parameters β, p and h is that \(2\beta (p(a+\frac {2}{n})+h) +\log (\frac {1-a}{1+a+\frac {2}{n}})\) is negative in two disjoint regions. Recall that the first region lies between \(a_{1}=\frac {2\mathbf {M}}{n}-1\) and \(a_{2}=\frac {2\mathbf {T}}{n}-1\). This, in particular, implies that the derivative of \(2\beta (p(a+\frac {2}{n})+h) +\log (\frac {1-a}{1+a+\frac {2}{n}})\) at a = a 1 is

    $$\displaystyle \begin{aligned} 2\beta p-\frac{1}{1-a_{1}}-\frac{1}{1+a_{1}}=-\delta_{1}<0 {} \end{aligned} $$
    (5.8)

    for some δ 1 > 0. Recall that \(\varPsi (a)=-\frac {p}{2}a^{2}-ha\), so that \(\varPsi (b)-\varPsi (a) = (a-b)(\frac {p}{2}(a+b)+h)\), which gives

    $$\displaystyle \begin{aligned} \begin{array}{rcl} {} & &\displaystyle \mathrm{e}^{-\beta n\left[\varPsi\left(b\right)-\varPsi\left(a\right)\right]} {n \choose \frac{n}{2}\left(1+b\right)}{n \choose \frac{n}{2}\left(1+a\right)}^{-1} \end{array} \end{aligned} $$
    (5.9)
    $$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle =\exp\Bigg(\beta n\left(b-a\right)\left(pa+h\right)+\beta n\left(b-a\right)^{2}\frac{p}{2}\\ & &\displaystyle \qquad +\frac{n}{2}\log\left(\frac{\left(1+a\right)^{\left(1+a\right)}\left(1-a\right)^{\left(1-a\right)}} {\left(1+b\right)^{\left(1+b\right)}\left(1-b\right)^{\left(1-b\right)}}\right) + O(\log n)\Bigg), \end{array} \end{aligned} $$
    (5.10)

    where we use Stirling’s approximation in the last line. Since b = a + C 1 n −1∕6, we have

    $$\displaystyle \begin{aligned} \begin{array}{rcl} {} \mbox{r.h.s. (5.9)} = \exp\left(\beta C_{1}n^{5/6}\left(pa+h\right)+\frac{p}{2}\beta C_{1}^{2}n^{2/3} +\frac{n}{2}\log F\right) \end{array} \end{aligned} $$
    (5.11)

    with

    $$\displaystyle \begin{aligned} F=\left(1-U_n(a)\right)^{1+a} \left(1+V_n(a)\right)^{1-a}\left(W_n(a)\right)^{C_{1}n^{-1/6}}, \end{aligned} $$
    (5.12)

    where

    $$\displaystyle \begin{aligned} U_n(a) = \frac{C_{1}n^{-1/6}}{1+a+C_{1}n^{-1/6}}, \qquad V_n(a) = \frac{C_{1}n^{-1/6}}{1-a-C_{1}n^{-1/6}}, \qquad W_n(a) = \frac{U_n(a)}{V_n(a)} = \frac{1-a-C_{1}n^{-1/6}}{1+a+C_{1}n^{-1/6}}. \end{aligned} $$
    (5.13)

    From the Taylor series expansion of \(\log \left (1+x\right )\) for \(0\leq \left |x\right |<1\), we obtain

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\frac{n}{2}\left(1+a\right)\log\left(1-U_n(a)\right) \leq \frac{n}{2}\left(1+a\right)\left(-U_n(a)-\frac{1}{2}\left(U_n(a)\right)^{2}\right),\\ &\frac{n}{2}\left(1-a\right)\log\left(1+V_n(a)\right) \leq \frac{n}{2}\left(1-a\right)\left(V_n(a) -\frac{1}{2}\left(V_n(a)\right)^{2}+O(n^{-1/2})\right), \end{array} \end{aligned} $$
    (5.14)

    and

    $$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \frac 12 C_{1}n^{5/6}\log\left(\frac{U_n(a)}{V_n(a)}\right)\\ & &\displaystyle \quad = \frac 12 C_{1}n^{5/6}\log\left(\frac{1-a}{1+a} \frac{1-a-C_{1}n^{-1/6}}{1-a}\frac{1+a}{1+a+C_{1}n^{-1/6}}\right)\\ & &\displaystyle \quad \leq \frac 12 C_{1}n^{5/6}\left(\log\left(\frac{1-a}{1+a}\right) -\frac{C_{1}n^{-1/6}}{1-a}-U_n(a)-O(n^{-2/3})\right). \end{array} \end{aligned} $$
    (5.15)

    By the definition of m, we have

    $$\displaystyle \begin{aligned} C_{1}n^{5/6}\left(\beta\left(pa+h\right)+\frac{1}{2}\log\left(\frac{1-a}{1+a}\right)\right)\leq0. \end{aligned} $$
    (5.16)

    Hence we get

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\beta C_{1}n^{5/6}\left(pa+h\right)+\frac{p}{2}\beta n^{2/3}C_{1}^{2}+\frac{n}{2}\log F\\[0.2cm] &\leq \frac{p}{2}\beta n^{2/3}C_{1}^{2}-\frac{\frac 12 C_{1}\left(1+a\right)n^{5/6}} {1+a+C_{1}n^{-1/6}}+\frac{\frac 12 C_{1}\left(1-a\right)n^{5/6}}{1-a-C_{1}n^{-1/6}} -\frac 12 C_{1}^{2}n^{2/3}\,G \end{array} \end{aligned} $$
    (5.17)

    with

    $$\displaystyle \begin{aligned} \begin{array}{ll} G&=\frac{1}{1-a}+\frac{1}{1+a+C_{1}n^{-1/6}}\\ &\quad +\left(\frac{1-a}{2}\right) \left(\frac{1}{1-a-C_{1}n^{-1/6}}\right)^{2}+\left(\frac{1+a}{2}\right)\left(\frac{1}{1+a+C_{1}n^{-1/6}}\right)^{2}. \end{array} \end{aligned} $$
    (5.18)

    Hence

    $$\displaystyle \begin{aligned} \begin{array}{ll} \mbox{r.h.s. (5.17)} &\leq \frac{p}{2}\beta n^{2/3}C_{1}^{2}\\ &\quad +\frac 12 C_{1}^{2} n^{2/3}\left(\frac{1}{1-a-C_{1}n^{-1/6}} +\frac{1}{1+a+C_{1}n^{-1/6}}\right)-\frac 12 C_{1}^{2}n^{2/3}\,G\\[0.2cm] &\leq n^{2/3} \frac 12 C_{1}^{2}\left(p\beta-\frac{1}{2}\frac{1}{1-a-C_{1}n^{-1/6}} -\frac{1}{2}\frac{1}{1+a+C_{1}n^{-1/6}}+O(n^{-1/6})\right)\\[0.2cm] &= n^{2/3} \frac 12 C_{1}^{2}\left(p\beta-\frac{1}{2}\frac{1}{1-a}-\frac{1}{2}\frac{1}{1+a} +O(n^{-1/6})\right) \leq -\frac 14 C_{1}^{2}\delta_{1}n^{2/3}. \end{array} \end{aligned} $$
    (5.19)
  3. 3.

    Combine (5.7), (5.9) and (5.19), and pick C 1 large enough, to get the claim in (5.3).

5.1.2 Update Times

The following two lemmas give useful bounds for the coupling scheme. The symbol ≃ stands for equality in distribution.

Lemma 5.2 (Total Variation Between Exponential Distributions)

Let \(X\simeq \mathrm {Exp}\left (\lambda \right )\) and \(Y\simeq \mathrm {Exp}\left (\lambda +\delta \right )\) . Then the total variation distance between the distributions of X and Y  is bounded by

$$\displaystyle \begin{aligned} d_{TV}\left(X,Y\right)\leq\frac{2\delta}{\lambda+\delta}. \end{aligned} $$
(5.20)

Proof

Elementary. □
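Since the proof is omitted, here is a quick numerical sanity check of (5.20), offered only as an illustration: for two exponential densities the crossing point has the closed form \(x^* = \log((\lambda+\delta)/\lambda)/\delta\), and the total variation distance is the gap between the two distribution functions at \(x^*\). The parameter grid below is an arbitrary choice.

```python
import math

def tv_exponentials(lam, delta):
    # densities lam*e^{-lam*x} and (lam+delta)*e^{-(lam+delta)*x} cross at
    # x* = log((lam+delta)/lam)/delta; the total variation distance is the
    # difference of the two CDFs at the crossing point
    xstar = math.log((lam + delta) / lam) / delta
    return math.exp(-lam * xstar) - math.exp(-(lam + delta) * xstar)

# check d_TV <= 2*delta/(lam+delta), the bound of (5.20), on a small grid
for lam in (0.1, 1.0, 5.0):
    for delta in (0.01, 0.5, 2.0):
        assert 0.0 < tv_exponentials(lam, delta) <= 2 * delta / (lam + delta)
```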

Lemma 5.3 (Update Times)

Let \(T_{update}^{\xi }\) be the first time {ξ t}t≥0 has experienced an update at every vertex:

$$\displaystyle \begin{aligned} T_{update}^{\xi}=\inf\left\{ t\geq0\colon\,\forall\,v\in V\,\exists\,0 \leq s\leq t\colon\, \xi_{s}\left(v\right)=-\xi_{0}\left(v\right)\right\} . \end{aligned} $$
(5.21)

Then, for any y > 0,

$$\displaystyle \begin{aligned} \mathbb{P}\left[T_{update}^{\xi}\geq y\right] \leq\frac{\exp\left(-\lambda y + \log n\right)}{1-\exp\left(-\lambda y\right)}, \qquad \lambda=\mathrm{e}^{-2\beta\left(p+h\right)}. \end{aligned} $$
(5.22)

Proof

Recall that for σ ∈ S n and v ∈ V , σ v denotes the configuration satisfying \(\sigma ^{v}\left (w\right )=\sigma \left (w\right )\) for w ≠ v, and \(\sigma ^{v}\left (v\right ) =-\sigma \left (v\right )\). From (1.3) and (1.6) it follows that

$$\displaystyle \begin{aligned} r\left(\sigma,\sigma^{v}\right)\geq\lambda, \end{aligned} $$
(5.23)

and so \(T_{update}^{\xi }\) is dominated by the maximum of n i.i.d. \(\mathrm {Exp}\left (\lambda \right )\) random variables. Therefore

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{P}\left[T_{update}^{\xi}\leq y\right] & \geq &\displaystyle \left(1-\mathrm{e}^{-\lambda y}\right)^{n} = \exp\big(n\log (1-\mathrm{e}^{-\lambda y})\big)\\ & \geq &\displaystyle \exp\left(-\frac{n\mathrm{e}^{-\lambda y}}{1-\mathrm{e}^{-\lambda y}}\right) \geq 1-\frac{n\mathrm{e}^{-\lambda y}}{1-\mathrm{e}^{-\lambda y}}, \end{array} \end{aligned} $$
(5.24)

which proves the claim. □
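The chain of inequalities in (5.24) can be checked numerically: the exact tail of the maximum of n i.i.d. Exp(λ) variables is \(1-(1-\mathrm{e}^{-\lambda y})^n\), and it should stay below the right-hand side of (5.22). The values of n, λ and y below are arbitrary illustrative choices, not parameters from the model.

```python
import math

def tail_bound(n, lam, y):
    # right-hand side of (5.22)
    return math.exp(-lam * y + math.log(n)) / (1 - math.exp(-lam * y))

def exact_tail(n, lam, y):
    # P[max of n i.i.d. Exp(lam) >= y] = 1 - (1 - e^{-lam*y})^n,
    # which dominates P[T_update >= y] in the proof
    return 1 - (1 - math.exp(-lam * y)) ** n

n, lam = 1000, 0.2
for y in (20.0, 40.0, 80.0):
    assert exact_tail(n, lam, y) <= tail_bound(n, lam, y)
```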

5.1.3 Returns

The next lemma establishes a lower bound on the number of returns to A M before A S is reached. Let \(g_{\xi _{0}}\left (A_{\mathbf {M}},A_{\mathbf {S}}\right )\) denote the number of jumps that \(\left \{ \xi _{t}\right \} _{t\geq 0}\) makes into the set A M before reaching A S. More precisely, let \(\left \{ s_{i}\right \} _{i\in \mathbb {N}_{0}}\) denote the jump times of the process \(\left \{ \xi _{t}\right \} _{t\geq 0}\), i.e., s 0 = 0 and

$$\displaystyle \begin{aligned} s_{i}=\inf\left\{ s>s_{i-1}\colon\,\xi_{s}\neq\xi_{s_{i-1}}\right\}, \end{aligned} $$
(5.25)

and define for the process (ξ t)t≥0 starting at ξ 0,

$$\displaystyle \begin{aligned} g_{\xi_{0}}\left(A_{\mathbf{M}},A_{\mathbf{S}}\right) =\left|\left\{ i\in\mathbb{N}_0\colon\,\xi_{s_{i}}\in A_{\mathbf{M}}, \xi_{s} \notin A_{\mathbf{S}}\,\,\forall\, s \leq s_{i}\right\} \right|. \end{aligned} $$
(5.26)

Lemma 5.4 (Bound on Number of Returns)

For any ξ 0 ∈ A M and any δ > 0,

    $$\displaystyle \begin{aligned} \mathbb{P}_{\xi_{0}}\big[g_{\xi_{0}}\left(A_{\mathbf{M}},A_{\mathbf{S}}\right) < \mathrm{e}^{([R_{p}(\mathbf{t}) - R_{p}(\mathbf{m})]-\delta)n}\big] \leq \mathrm{e}^{-\delta n + Cn^{2/3}} {} \end{aligned} $$
(5.27)

for some constant C that does not depend on n.

Proof

Let Y  be a geometric random variable with success probability \(\mathrm{e} ^{-[R_{p}(t) - R_{p}(m)]n +Cn^{2/3}}\). Then, by Lemma 4.3, every time the process {ξ t}t≥0 starts afresh from A M, it has probability at most \(\mathbb {P}^{u}_{\xi }[\tau _{S^{u}} < \tau _{M^{u}}]\) of making it to A S. Using the bounds from that lemma, it follows that Y  is stochastically dominated by \(g_{\xi _{0}}\left (A_{\mathbf {M}},A_{\mathbf {S}}\right )\). Hence

$$\displaystyle \begin{aligned} \begin{array}{ll} \mathbb{P}\left[Y\leq \mathrm{e}^{([R_{p}(t) - R_{p}(m)]-\delta)n}\right] &\leq \mathrm{e}^{([R_{p}(t) - R_{p}(m)]-\delta)n}\,\mathrm{e}^{-[R_{p}(t) - R_{p}(m)]n+Cn^{2/3}}\\ &\leq \mathrm{e}^{-\delta n+Cn^{2/3}}. \end{array} \end{aligned} $$
(5.28)

5.2 Uniform Hitting Time

In this section we show that if Theorem 1.4 holds for some initial configuration in A M, then it holds for all initial configurations in A M. The proof of this claim, which will be needed in Sect. 7, relies on a coupling construction in which the two processes starting anywhere in A M meet with a sufficiently high probability long before either one reaches A S. Details of the coupling construction are given in Sect. 6.

The idea of the proof is that, for {ξ t}t≥0 starting in A M, the starting configuration is irrelevant for the metastable crossover time because the latter is very large. We verify this by showing that “local mixing” takes place long before the crossover to A S occurs. More precisely, we show that if \(\xi _{0},\tilde {\xi }_{0}\) are any two initial configurations in A M, then there is a coupling such that the trajectory \(t \mapsto \xi _{t}\) intersects the trajectory \(t \mapsto \tilde {\xi }_{t}\) well before either strays too far from A M. The coupling is such that there is a probability, small but not too small, that ξ t and \(\tilde {\xi }_{t}\) coincide once every vertex has had a chance to update, which happens by a time t that is not too large. It follows that after a large number of trials the two trajectories intersect with high probability.

Proof

Consider two copies of the process, {ξ t}t≥0 and \(\{\tilde {\xi }_{t}\} _{t\geq 0}\). Let δ > 0 and \(T_{0}=\mathrm{e} ^{([R_{p,\beta ,h}(\mathbf {t})-R_{p,\beta ,h}(\mathbf {m})]-\delta )n}\). In order to simplify the notation and differentiate between the two processes, we abbreviate the crossover time \(\tau _{A_{\mathbf {S}}}\) by

$$\displaystyle \begin{aligned} \tau^{\xi}=\inf\left\{ t\geq0\colon\xi_{t}\in A_{\mathbf{S}}\right\}, \end{aligned} $$
(5.29)

with a similar definition for \(\tau ^{\tilde {\xi }}\). We will show that \(\mathbb {E}_{\xi _{0}}[\tau ^{\xi }] \leq [1+o_{n}(1)]\mathbb {E}_{\tilde {\xi }_{0}} [\tau ^{\tilde {\xi }} ]\), with the proof for the inequality in the other direction being identical. The proof comes in two Steps.

  1. 1.

    We start with the following observation. From Corollary 3.9, we immediately get that

    $$\displaystyle \begin{aligned} \mathbb{E}_{\xi_{0}}\big[\tau^{\xi}\big] / \mathbb{E}_{\tilde{\xi}_{0}}\big[\tau^{\tilde{\xi}}\big] = \mathrm{e}^{O(n^{2/3})}. {} \end{aligned} $$
    (5.30)

    Furthermore, the relation in (5.30) together with the initial steps in the proof of Theorem 1.4 implies that, for any initial configuration ξ 0,

    $$\displaystyle \begin{aligned} \mathbb{E}_{\xi_{0}}\left[\tau^{\xi}\right]=\mathrm{e}^{n[R_{p,\beta,h}(\mathbf{t})-R_{p,\beta,h}(\mathbf{m})] + O(n^{2/3})}. {} \end{aligned} $$
    (5.31)

    Note: Step 2 in Sect. 7 shows that if ξ 0 is distributed according to the law \(\mu _{A_{\mathbf {M}}}\), then

    $$\displaystyle \begin{aligned} \mathbb{E}_{\xi_{0}}[\tau^{\xi}] = \mathrm{e}^{n[R_{p,\beta,h}(\mathbf{t}) -R_{p,\beta,h}(\mathbf{m})] + O(\log n)}. \end{aligned} $$
    (5.32)

    (Recall from Sect. 4.3 that \(\mu _{A_{\mathbf {M}}}\) is the equilibrium distribution μ conditioned on the set A M.) Let \(\{\xi _{t},\tilde {\xi }_{t}\} _{t\geq 0}\) be the coupling of the two processes described in Sect. 6, and note that

    (5.33)

    where \(\hat {\mathbb {E}}\) denotes expectation with respect to the law of the joint process. The above inequality splits the expectation based on whether the coupling has succeeded (in merging the two processes) by time T 0 or not. Note that

    (5.34)

    and

    (5.35)

    Also note from the definition of the coupling that, for any σ ∈ S n and any A ⊆ S n, \(\hat {\mathbb {E}}_{(\sigma ,\sigma )}[\tau _{A}^{\xi }]=\mathbb {E}_{\sigma }[\tau _{A}]\), because the two trajectories coincide when they start from the same configuration. Hence

    (5.36)

    where we use the Markov property. Similarly, observe that

    (5.37)

    and

    (5.38)

    Thus, (5.33) becomes

    $$\displaystyle \begin{aligned} \begin{array}{ll} \mathbb{E}_{\xi_{0}}\left[\tau^{\xi}\right] &\leq 2T_{0}+\mathbb{E}_{\tilde{\xi}_{0}}\left[\tau_{A_{\mathbf{S}}}\right] +\left(\mathbb{P}_{\tilde{\xi}_{0}}\left[\tau_{A_{\mathbf{S}}}<T_{0}\right] +\hat{\mathbb{P}}_{\left(\xi_{0},\tilde{\xi}_{0}\right)}\big[\xi_{T_{0}} \neq \tilde{\xi}_{T_{0}}\big]\right)\\ &\qquad \qquad \qquad \qquad \qquad \times \Big(T_{0} +\max_{\sigma\in\bigcup_{i<\mathbf{S}}A_{i}} \mathbb{E}_{\sigma}\left[\tau_{A_{\mathbf{S}}}\right]\Big). \end{array} \end{aligned} $$
    (5.39)

    We will show that the leading term in the right-hand side is \(\mathbb {E}_{\tilde {\xi }_{0}} [\tau _{A_{\mathbf {S}}}]\), and all other terms are of smaller order. From (5.31) we know that T 0 is of smaller order, and that

    $$\displaystyle \begin{aligned} \max_{\sigma\in\bigcup_{i<\mathbf{S}}A_{i}} \mathbb{E}_{\sigma}\left[\tau_{A_{\mathbf{S}}}\right] = \mathrm{e}^{O(n^{2/3})}\mathbb{E}_{\tilde{\xi}_{0}}\left[\tau_{A_{\mathbf{S}}}\right]. \end{aligned} $$
    (5.40)

    Hence it suffices to show that the sum \(\mathbb {P}_{\tilde {\xi }_{0}}[\tau _{A_{\mathbf {S}}}<T_{0}] +\hat {\mathbb {P}}_{(\xi _{0},\tilde {\xi }_{0})}[\xi _{T_{0}} \neq \tilde {\xi }_{T_{0}}]\) is exponentially small in n.

  2. 2.

    By Corollary 6.3, the probability \(\hat {\mathbb {P}}_{(\xi _{0},\tilde {\xi }_{0})} [\xi _{T_{0}}\neq \tilde {\xi }_{T_{0}}]\) is bounded from above by \(\mathrm{e} ^{-\delta n + O(n^{2/3})}\). To bound \(\mathbb {P}_{\tilde {\xi }_{0}}[\tau _{A_{\mathbf {S}}}<T_{0}]\), we first need to limit the number of steps that \(\{\tilde {\xi _t}\}_{t\geq 0}\) can take until time T 0. From (5.2) and Stirling’s approximation we have that

    $$\displaystyle \begin{aligned} \mathbb{P}\left[N_{\tilde{\xi}}\left(T_{0}\right)\ge 3nT_{0}\right] \leq & \sum_{k=0}^{\infty} \mathrm{e}^{nT_{0}+k}\left(\frac{nT_{0}}{3nT_{0}+k}\right)^{3nT_{0}+k}\\ \leq & \mathrm{e}^{nT_{0}}\left(\frac{1}{3}\right)^{3nT_{0}} \sum_{k=0}^{\infty} \mathrm{e}^k \left(\frac{1}{3}\right)^{k} \leq 11\left(0.91\right)^{3n \mathrm{e}^{\frac 12 n[R_{p,\beta,h}(\mathbf{t})-R_{p,\beta,h}(\mathbf{m})]}}. \end{aligned} $$
    (5.41)

    It therefore follows that with high probability \(\{\tilde {\xi }_t\}_{t \geq 0}\) does not make more than 3nT 0 steps until time T 0. Hence

    $$\displaystyle \begin{aligned} \mathbb{P}_{\tilde{\xi}_{0}}\big[\tau_{A_{\mathbf{S}}}<T_{0}\big] \leq \mathbb{P}_{\tilde{\xi}_{0}}\left[\tau_{A_{\mathbf{S}}}<T_{0}, N_{\tilde{\xi}}\left(T_{0}\right)<3nT_{0}\right] {+}11\left(0.91\right)^{3n \mathrm{e}^{\frac 12 n[R_{p,\beta,h}(\mathbf{t})-R_{p,\beta,h}(\mathbf{m})]}}. \end{aligned} $$
    (5.42)

    Finally, note that the event \(\{\tau _{A_{\mathbf {S}}}^{\tilde {\xi }}<T_{0},N_{\tilde {\xi }}(T_{0})<3nT_{0}\}\) implies that \(\{\tilde {\xi }_{t}\}_{t\geq 0}\) makes fewer than 3nT 0 returns to the set A M before reaching A S. By Lemma 5.4, the probability of this event is bounded from above by \(3nT_{0}\, \mathrm{e} ^{-[R_{p,\beta ,h}(\mathbf {t})-R_{p,\beta ,h}(\mathbf {m})]n+Cn^{2/3}} = 3n\mathrm{e} ^{-\delta n+Cn^{2/3}} \leq 3n\mathrm{e} ^{-\frac 12 \delta n+Cn^{2/3}}\), and hence

    $$\displaystyle \begin{aligned} \mathbb{P}_{\tilde{\xi}_{0}}\big[\tau_{A_{\mathbf{S}}}^{\tilde{\xi}}<T_{0}\big] \leq 4n\, \mathrm{e}^{-\frac 12\delta n+Cn^{2/3}}. {} \end{aligned} $$
    (5.43)

    Hence, from (5.43) we obtain

    $$\displaystyle \begin{aligned} \mathbb{E}_{\xi_{0}}\big[\tau_{A_{\mathbf{S}}}^{\xi}\big] =\mathbb{E}_{\tilde{\xi}_{0}}\big[\tau_{A_{\mathbf{S}}}^{\tilde{\xi}}\big]\left[1+o\left(1\right)\right], \end{aligned} $$
    (5.44)

    which settles the claim. □

6 A Coupling Scheme

In this section we define a coupling of {ξ t}t≥0 and \(\{\tilde {\xi }_{t}\}_{t \geq 0}\) with arbitrary starting configurations in A M. The coupling is divided into a short-term scheme, defined in Sect. 6.1 and analysed in Lemma 6.1 below, followed by a long-term scheme, defined in Sect. 6.2 and analysed in Corollary 6.3 below. The goal of the coupling is to keep the process {m(ξ t)}t≥0 bounded from above by \(\{\theta _{t}^{u}\}_{t\geq 0}\) and from below by \(\{\theta _{t}^{l}\}_{t\geq 0}\) (the precise meaning will become clear in the sequel).

6.1 Short-Term Scheme

Lemma 6.1 (Short-Term Coupling)

With \(\mathbb {P}_{\mathrm {ER}_n(p)}\) -probability tending to 1 as n →∞, there is a coupling \(\{\xi _{t},\tilde {\xi }_{t}\}_{t \geq 0}\) of {ξ t}t≥0 and \(\{\tilde {\xi }_{t}\}_{t \geq 0}\) such that

$$\displaystyle \begin{aligned} \mathbb{P}\big[\xi_{2n}\neq\tilde{\xi}_{2n}\big] \leq O\left(\mathrm{e}^{-n^{2/3}}\right) \end{aligned} $$
(6.1)

for any initial states ξ 0 ∈ A M and \(\tilde {\xi }_{0} \in A_{\mathbf {M}}\).

Proof

The main idea behind the proof is as follows. Define

$$\displaystyle \begin{aligned} W_{1}^{t} =\left\{ v \in V\colon\, \xi_{t}(v)=-\tilde{\xi}_{t}(v)\right\} = \xi_{t}\, \varDelta\, \tilde{\xi}_{t}, \end{aligned} $$
(6.2)

i.e., the symmetric difference between the two configurations ξ t and \(\tilde {\xi }_{t}\), and

$$\displaystyle \begin{aligned} W_{2}^{t} = \left\{ v \in V\colon\, \xi_{t}(v)=\tilde{\xi}_{t}(v)\right\} = V \setminus W_{1}^{t}. \end{aligned} $$
(6.3)

The coupling we are about to define will result in the set \(W_{1}^{t}\) shrinking at a higher rate than the set \(W_{2}^{t}\), which will imply that \(W_{1}^{t}\) contracts to the empty set. The proof comes in eight Steps.

  1. 1.

    We begin with bounds on the relevant transition rates that will be required in the proof. Recall from Lemma 3.4 (in particular, (3.19) and (3.21)) that with \(\mathbb {P}_{\mathrm {ER}_{n}(p)}\)-probability at least 1 −e−2n there are at most 2n 2∕3 vertices \(v\in \overline {\xi _{t}}\) (i.e., \(\xi _{t}\left (v\right )=-1\)) such that \(|E(v,\xi _t)|=\left |\left \{ w\in \xi _{t} \colon \, \left (v,w\right )\in E\right \} \right |\geq p\left |\xi _{t}\right |+n^{2/3}\), and similarly at most 2n 2∕3 vertices \(v\in \overline {\xi _{t}}\) such that \(\left |E(v,\xi _t)\right |\leq p\left |\xi _{t}\right |-n^{2/3}\). Analogous bounds are true for \(\tilde {\xi }_{t}\), t ≥ 0. Denote the set of bad vertices for ξ t by

    $$\displaystyle \begin{aligned} B_{t}=\big\{ v\in \overline{\xi_{t}}\colon\,\big| |E(v,\xi_t)|-p|\xi_{t}|\big|\geq n^{2/3}\}, \end{aligned} $$
    (6.4)

    and the set of bad vertices for \(\tilde {\xi }_{t}\) by \(\tilde {B}_{t}\). Let \(\hat {B}_{t}=B_{t}\cup \tilde {B}_{t}\). Recall that \(\xi _{t}^{v}\) denotes the configuration obtained from ξ t by flipping the sign at vertex v ∈ V . If \(v\notin \hat {B}_{t}\), then from (1.3) and Lemma 3.2 it follows that, for vξ t,

    $$\displaystyle \begin{aligned} \begin{array}{rcl} H_{n}\left(\xi_{t}^{v}\right)-H_{n}\left(\xi_{t}\right) & = &\displaystyle \frac{2}{n}\left(\left|\partial_{E}\xi_{t}^{v}\right|-\left|\partial_{E}\xi_{t}\right|\right)-2h\\ & = &\displaystyle \frac{2}{n}\left(\mathrm{deg}\left(v\right)-2\left|E(v,\xi_t)\right|\right)-2h\\ & \leq &\displaystyle \frac{2}{n}\left(pn+n^{1/2}\log n-2p\left|\xi_{t}\right|+2n^{2/3}\right)-2h, \end{array} \end{aligned} $$
    (6.5)

    and similarly, for v ∈ ξ t,

    $$\displaystyle \begin{aligned} H_{n}\left(\xi_{t}^{v}\right)-H_{n}\left(\xi_{t}\right)\leq\frac{2}{n}\left(pn+n^{1/2}\log n -2p\left(n-\left|\xi_{t}\right|\right)+2n^{2/3}\right)+2h. \end{aligned} $$
    (6.6)

    Again, by (1.3) and Lemma 3.2, we have similar lower bounds, namely, if \(v\notin \hat {B}_{t}\), then, for vξ t,

    $$\displaystyle \begin{aligned} H_{n}(\xi_{t}^{v})-H_{n}(\xi_{t}) \geq\frac{2}{n}\left(pn-n^{1/2}\log n-2p\left|\xi_{t}\right|-2n^{2/3}\right)-2h, \end{aligned} $$
    (6.7)

    and, for v ∈ ξ t,

    $$\displaystyle \begin{aligned} H_{n}(\xi_{t}^{v})-H_{n}(\xi_{t}) \geq\frac{2}{n}\left(pn-n^{1/2}\log n-2p\left(n-\left|\xi_{t}\right|\right) -2n^{2/3}\right)+2h. \end{aligned} $$
    (6.8)

    Identical bounds hold for \(H_{n}(\tilde {\xi }_{t}^{v})-H_{n}(\tilde {\xi }_{t})\). Therefore, if \(v\notin \hat {B}_{t}\), and if either \(v\in \xi _{t}\cap \tilde {\xi }_{t}\) or \(v\notin \xi _{t}\cup \tilde {\xi }_{t}\), then

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\left|r(\xi_{t},\xi_{t}^{v})-r(\tilde{\xi}_{t},\tilde{\xi}_{t}^{v})\right|\\ &=\left| \mathrm{e}^{-\beta[H_{n}(\xi_{t}^{v})-H_{n}(\xi_{t})]_{+}} -\mathrm{e}^{-\beta[H_{n}(\tilde{\xi}_{t}^{v})-H_{n}(\tilde{\xi}_{t})]_{+}}\right|\\ &=\mathrm{e}^{-\beta[H_{n}(\xi_{t}^{v})-H_{n}(\xi_{t})]_{+}} \left|1-\mathrm{e}^{\beta\big([H_{n}(\xi_{t}^{v}) -H_{n}(\xi_{t})]_{+}-[H_{n}(\tilde{\xi}_{t}^{v}) -H_{n}(\tilde{\xi}_{t})]_{+}\big)}\right|\\ &\leq\left[1+o_{n}(1)\right] \mathrm{e}^{-\beta[H_{n}(\xi_{t}^{v}) -H_{n}(\xi_{t})]_{+}} \left(\mathrm{e}^{8\beta n^{-1/3} + \frac{4p}{n}\big(|\xi_{t}|-|\tilde{\xi}_{t}| \big)}-1\right)\\ & \leq\left[1+o_{n}(1)\right]\left(\mathrm{e}^{8\beta n^{-1/3} + \frac{4p}{n}\big(|\xi_{t}|-|\tilde{\xi}_{t}|\big)}-1\right)\\ & \leq\left[1+o_{n}(1)\right]\left(8\beta n^{-1/3} + \frac{4p}{n}\big(|\xi_{t}|-|\tilde{\xi}_{t}|\big)\right). \end{array} \end{aligned} $$
    (6.9)
  2. 2.

    Having established the above bounds on the transition rates, we give an explicit construction of the coupling \(\{\xi _{t},\tilde {\xi }_{t}\} _{t\geq 0}\).

Definition 6.2

  1. (I)

    We first define the coupling for time t = 0. For t > 0 this coupling will be renewed after each renewal of \(\{\xi _{t},\tilde {\xi }_{t}\}_{t \geq 0}\), i.e., whenever either of the two processes jumps to a new state. To that end, for every \(v\in W_{2}^{0}\) (i.e., \(\xi _{0}(v)=\tilde {\xi }_{0}(v)\)), couple the exponential random variables \(e_{0}^{v}\simeq \mathrm {Exp}(r(\xi _{0},\xi _{0}^{v}))\) and \(\tilde {e}_{0}^{v} \simeq \mathrm {Exp} (r(\tilde {\xi }_{0},\tilde {\xi }_{0}^{v}))\) associated with the transitions \(\xi _{0}\to \xi _{0}^{v}\) and \(\tilde {\xi }_{0} \to \tilde {\xi }_{0}^{v}\) according to the following scheme:

    1. 1.

      Choose a point

      $$\displaystyle \begin{aligned} \left(x,y\right)\in\{\left(x',y'\right)\colon\,0\leq x'<\infty,\,0\leq y'\leq r\left(\xi_{0}, \xi_{0}^{v}\right) \mathrm{e}^{-r\left(\xi_{0},\xi_{0}^{v}\right)x'}\} \end{aligned}$$

      uniformly and set \(e_{0}^{v}=x\). Note that, indeed, this gives \(e_{0}^{v} \simeq \mathrm {Exp}(r(\xi _{0},\xi _{0}^{v}))\).

    2. 2.

      If the value y from step 1 satisfies \(y\leq r(\tilde {\xi }_{0},\tilde {\xi }_{0}^{v}) \exp (-r(\tilde {\xi }_{0},\tilde {\xi }_{0}^{v})x)\), then set \(\tilde {e}_{0}^{v} =e_{0}^{v}=x\). Else, choose

      $$\displaystyle \begin{aligned} \begin{array}{ll} &\left(x^*,y^*\right)\in\\ &\left\{ \left(x',y'\right)\colon0\leq x'<\infty,\, r\left(\xi_{0},\xi_{0}^{v}\right) \mathrm{e}^{-r\left(\xi_{0},\xi_{0}^{v}\right)x'} <y'\leq r(\tilde{\xi}_{0},\tilde{\xi}_{0}^{v}) \mathrm{e}^{-r(\tilde{\xi}_{0},\tilde{\xi}_{0}^{v})x'}\right\} \end{array} \end{aligned}$$

      uniformly and independently from the sampling in step 1, and set \(\tilde {e}_{0}^{v}=x^*\). Note that this too gives \(\tilde{e}_{0}^{v}\simeq \mathrm {Exp}(r(\tilde {\xi }_{0},\tilde {\xi }_{0}^{v}))\).

  2. (II)

    For every \(v\in W_{1}^{0}\), sample the random variables \(e_{0}^{v}\simeq \mathrm {Exp}(r(\xi _{0},\xi _{0}^{v}))\) and \(\tilde {e}_{0}^{v} \simeq \mathrm {Exp}(r(\tilde {\xi }_{0},\tilde {\xi }_{0}^{v}))\) associated with the transitions \(\xi _{0}\to \xi _{0}^{v}\) and \(\tilde {\xi }_{0} \to \tilde {\xi }_{0}^{v}\) independently. At time t = 0, we use the above rules to define the jump times associated with any vertex v ∈ V . Recall that \(W_{2}^{0}\) is the set of vertices where the two configurations agree in sign. The aim of the coupling defined above is to preserve that agreement. Following every renewal, we re-sample all transition times anew (i.e., we choose new copies of the exponential variables as was done above). We proceed in this way until the first of the following two events happens: either \(\xi _{t} = \tilde {\xi }_{t}\), or \(n\log n\) transitions have been made by either one of the two processes.
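The sampling rules in (I) amount to a maximal coupling of two exponential clocks: a uniform point under the first density is reused for the second clock whenever it also lies under the second density, so the clocks agree with probability \(1-d_{TV}\), consistent with Lemma 5.2. A minimal Monte Carlo sketch (the rates 1.0 and 1.1 are arbitrary illustrative choices, not model quantities):

```python
import random, math

def couple_exponentials(r1, r2, rng):
    # Sample a uniform point under the Exp(r1) density; if it also lies under
    # the Exp(r2) density, both clocks take the same value (rule 1), otherwise
    # the second clock is resampled from the leftover region (rule 2).
    x = rng.expovariate(r1)
    y = rng.uniform(0.0, r1 * math.exp(-r1 * x))
    if y <= r2 * math.exp(-r2 * x):
        return x, x  # the two clocks agree
    # leftover region {(x', y'): f(x') < y' <= g(x')}, via rejection sampling
    while True:
        xs = rng.expovariate(r2)
        ys = rng.uniform(0.0, r2 * math.exp(-r2 * xs))
        if ys > r1 * math.exp(-r1 * xs):
            return x, xs

rng = random.Random(0)
pairs = [couple_exponentials(1.0, 1.1, rng) for _ in range(20000)]
match = sum(a == b for a, b in pairs) / len(pairs)
# matching probability is 1 - d_TV, bounded below via Lemma 5.2
assert match >= 1 - 2 * 0.1 / 1.1 - 0.02  # small Monte Carlo slack
```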

  3. 3.

    The purpose of limiting the number of jumps to \(n\log n\) is to allow us to apply Lemma 5.1, which we use to keep the volumes of the two processes close to each other. Further down we will also show that, with high probability, no more than \(n\log n\) transitions occur up to time 2n. By (6.9) and Lemma 5.2, if \(v\notin \hat {B}_{t}\), then

    $$\displaystyle \begin{aligned} \mathbb{P}\left[e_{t}^{v}\neq \tilde{e}_{t}^{v}\right] \leq\frac{2(8\beta n^{-1/3}+ \frac{4p}{n} (|\xi_{t}|-|\tilde{\xi}_{t}|))}{\mathrm{e}^{-2\beta(p+h)}}. {} \end{aligned} $$
    (6.10)

    On the other hand, if \(v\in \hat {B}_{t}\) and we let \(z=\frac {2\beta \left (p+h\right )}{1-\mathrm{e} ^{-2\beta (p+h)}}\), then

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{P}\left[e_{t}^{v}\neq \tilde{e}_{t}^{v}\right] & =&\displaystyle d_{TV}\left(e_{t}^{v},\tilde{e}_{t}^{v}\right) \leq \mathrm{e}^{-2\beta\left(p+h\right)}\int_{0}^{z} \mathrm{d}x\, \exp\left(-x\mathrm{e}^{-2\beta\left(p+h\right)}\right) \\ & =&\displaystyle 1-\exp\left(-\frac{2\beta\left(p+h\right)\mathrm{e}^{-2\beta\left(p+h\right)}} {1-\mathrm{e}^{-2\beta\left(p+h\right)}}\right). {} \end{array} \end{aligned} $$
    (6.11)

    Observe that, for \(v\in W_{1}^{t}\), with \(\mathbb {P}_{\mathrm {ER}_n(p)}\)-high probability

    $$\displaystyle \begin{aligned} \sum_{v\in W_{1}^{t}} \left[r(\xi_{t},\xi_{t}^{v}) + r(\tilde{\xi}_{t},\tilde{\xi}_{t}^{v})\right] \geq \left[1+o_{n}(1)\right]\left|W_{1}^{t}\right|. \end{aligned} $$
    (6.12)

    Indeed, by the concentration inequalities of Lemma 3.2 and the bound in Lemma 5.1, it follows that |ξ t| and \(|\tilde {\xi }_{t}|\) are of similar magnitude:

    $$\displaystyle \begin{aligned} \mathbb{P}\big[||\xi_{t}| - |\tilde{\xi}_{t}|| \geq n^{5/6}\big] \leq \mathrm{e}^{-n^{2/3}}. \end{aligned} $$
    (6.13)

    Therefore, with \(\mathbb {P}_{\mathrm {ER}_{n}(p)}\)-high probability, for all but O(n 2∕3) such v,

    $$\displaystyle \begin{aligned} H(\xi_{t}) - H(\xi_{t}^{v}) = \left[1+o_{n}(1)\right] \big[H\big(\tilde{\xi}_{t}^{v}\big)-H\big(\tilde{\xi}_{t})\big], \end{aligned} $$
    (6.14)

    from which (6.12) follows. The rate at which the set \(W_{2}^{t}\) shrinks is equal to the rate at which it loses \(v\in W_{2}^{t}\) such that \(v \notin \hat {B}_{t}\), plus the rate at which it loses \(v\in W_{2}^{t}\) such that \(v \in \hat {B}_{t}\). From (6.9) it follows that the former is bounded from above by \(|W_{2}^{t}|(8\beta n^{-1/3} + \frac {4p}{n}(|\xi _{t}|-|\tilde {\xi }_{t}|))\), while by (3.19) the latter is bounded by 4n 2∕3. Therefore, defining the stopping time

    $$\displaystyle \begin{aligned} \upsilon_{i} = \inf \left\{t\colon\, |W_1^{t}|=i\right\}, \end{aligned} $$
    (6.15)

    we have that

    $$\displaystyle \begin{aligned} \mathbb{P}_{(\xi_{t},\tilde{\xi}_{t})} \big[ \upsilon_{|W_1^{t}|-1} < \upsilon_{|W_1^{t}|+1} \big] \geq \frac{|W_{1}^{t}|}{|W_{2}^{t}| [8\beta n^{-1/3} + \frac{4p}{n}(|\xi_{t}| -|\tilde{\xi}_{t}|)] + 4n^{2/3}+|W_{1}^{t}|}. \end{aligned} $$
    (6.16)

    From Lemma 5.1 we know that (with probability \(\geq 1-\mathrm{e} ^{-n^{2/3}}\)) neither |ξ t| nor \(|\tilde {\xi }_{t}|\) strays above M + Cn 5∕6 or below M − Cn 5∕6 within \(n^{2}\log n\) steps. Thus,

    $$\displaystyle \begin{aligned} \left||\xi_{t}|-|\tilde{\xi}_{t}|\right| \leq Cn^{5/6}. \end{aligned} $$
    (6.17)

    Hence, for \(\left |W_{1}^{t}\right | \geq n^{6/7}\) we have that (6.16) is equal to 1 − o n(1).

  4. 4.

    Next suppose that \(\left |W_{1}^{t}\right | < n^{6/7}\). To bound the rate at which the set \(W_{2}^{t}\) shrinks, we argue as follows. The rate at which a matching vertex v becomes non-matching equals

    $$\displaystyle \begin{aligned} |r( \xi_{t},\xi_{t}^v) - r(\tilde{\xi}_{t},\tilde{\xi}_{t}^{v})|. \end{aligned} $$
    (6.18)

    Let

    $$\displaystyle \begin{aligned} \begin{array}{ll} B_{1} &= -2h + \frac{2}{n}\big(\mathrm{deg}(v) - 2|E(v,\xi_{t})|\big),\\[0.2cm] B_{2} &= -2h + \frac{2}{n}\big(\mathrm{deg}(v) - 2|E(v,\tilde{\xi}_{t})|\big),\\[0.2cm] B_{3} &= -h + \frac{1}{n}\big(\mathrm{deg}(v) - 2|E(v,\xi_{t}\cap \tilde{\xi}_{t})|\big). \end{array} \end{aligned} $$
    (6.19)

    For \(v\notin \xi _{t} \cup \tilde {\xi }_{t}\), we can estimate

    $$\displaystyle \begin{aligned} \begin{array}{ll} |r( \xi_{t},\xi_{t}^v) - r(\tilde{\xi}_{t},\tilde{\xi}_{t}^{v})| &= \left| \mathrm{e}^{-\beta\left[B_{1}\right]_{+}} - \mathrm{e}^{-\beta\left[B_{2}\right]_{+}}\right|\\ &\leq \mathrm{e}^{-2\beta [B_{3}]_+} \left| \mathrm{e}^{-\frac{4\beta}{n} |E(v,\xi_{t} \backslash \tilde{\xi}_{t})|} -\mathrm{e}^{-\frac{4\beta}{n} |E(v,\tilde{\xi}_{t} \backslash \xi_{t})|} \right| \\[0.2cm] &\leq \mathrm{e}^{-2\beta [B_{3}]_+} \frac{4\beta}{n}\left| |E(v,\xi_{t} \backslash \tilde{\xi}_{t})| -|E(v,\tilde{\xi}_{t} \backslash \xi_{t})| \right| \\[0.2cm] & \leq \left[1+o_{n}(1)\right] \mathrm{e}^{-2\beta[-p\mathbf{m}-h]_+} \frac{4\beta}{n} \left|E\left(v, W_{1}^{t}\right)\right|, \end{array} \end{aligned} $$
    (6.20)

    where we note that \(W_1^t = \xi _{t} \backslash \tilde {\xi }_{t} \cup \tilde {\xi }_{t} \backslash \xi _{t}\) and use that, by Lemma 3.2 and the bound \(|\xi _{t} \backslash \tilde {\xi }_{t}| \leq |W_{1}^{t}| \leq n^{6/7}\),

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\frac{1}{n}\left(\mathrm{deg}(v) - 2|E(v,\xi_{t}\cap \tilde{\xi}_{t})|\right) = [1+o_{n}(1)]\,p\left(1 - \frac{2|\xi_{t}\cap \tilde{\xi}_{t}|}{n}\right)\\[0.2cm] &= [1+o_{n}(1)]\,p\left(1 - \frac{2|\xi_{t}|}{n}\right) = [1+o_{n}(1)]\,p\left(1 - \frac{2\mathbf{M}}{n}\right) = -[1+o_{n}(1)]\,p\mathbf{m}. \end{array} \end{aligned} $$
    (6.21)

    Note that since ξ t and \(\tilde {\xi }_{t}\) disagree at most at n 6∕7 vertices, and since \(|\xi _{t}| =[1+o_{n}(1)]\mathbf {M} = [1+o_{n}(1)]\frac {n}{2}(1+\mathbf {m})\), we have that \(|\{ v\in V\colon\, v \notin \xi _{t} \cup \tilde {\xi }_{t}\}| = [1{+}o_{n}(1)]\frac {n}{2}(1-\mathbf {m})\). Furthermore, since \(|E(v, W_{1}^{t})| \leq [1{+}o_n(1)] p|W_{1}^{t}|\) and |V | = n, we have that

    $$\displaystyle \begin{aligned} \sum_{v\notin \xi_{t}\cup\tilde{\xi}_{t}} | r( \xi_{t},\xi_{t}^v) - r(\tilde{\xi}_{t},\tilde{\xi}_{t}^{v})| \leq \left[1+o_{n}(1)\right] \mathrm{e}^{-2\beta[-p\mathbf{m}-h]_+}\,2\beta p (1-\mathbf{m})\,|W_{1}^{t}|. \end{aligned} $$
    (6.22)

    For \(v\in \xi _{t}\cap \tilde {\xi }_{t}\), on the other hand, Lemma 3.5 gives that

    $$\displaystyle \begin{aligned} r(\xi_{t},\xi_{t}^{v})=r(\tilde{\xi}_{t},\tilde{\xi}_{t}^{v})=1 \end{aligned} $$
    (6.23)

    for all but O(n 2∕3) many such v. If v is such that \(r(\xi _{t},\xi _{t}^{v}) \neq r(\tilde {\xi }_{t},\tilde {\xi }_{t}^{v})\), then a computation identical to the one leading to (6.22) gives that

    $$\displaystyle \begin{aligned} \sum_{v\in\xi_{t}\cap\tilde{\xi}_{t}} |r\left(\xi_{t},\xi_{t}^{v}\right) -r(\tilde{\xi}_{t},\tilde{\xi}_{t}^{v})| = O(n^{-1/6}) \big|W_{1}^{t}\big|. {} \end{aligned} $$
    (6.24)

    Combining (6.22) and (6.24), we obtain

    $$\displaystyle \begin{aligned} \sum_{v \in W_{2}^{t}} |r\left(\xi_{t},\xi_{t}^{v}\right) -r(\tilde{\xi}_{t},\tilde{\xi}_{t}^{v})| \leq \left[1+o_{n}(1)\right] \mathrm{e}^{-2\beta[-p\mathbf{m}-h]_+}\,2\beta p (1-\mathbf{m})\,\big|W_{1}^{t}\big|, \end{aligned} $$
    (6.25)

    which bounds the rate at which \(W_2^{t}\) shrinks.

  5. 5.

    To bound the rate at which \(W_{1}^{t}\) shrinks, we argue as follows. The rate at which a non-matching vertex v becomes matching equals

    $$\displaystyle \begin{aligned} r(\xi_{t},\xi_{t}^{v})+r(\tilde{\xi}_{t},\tilde{\xi}_{t}^{v}). \end{aligned} $$
    (6.26)

    Note that, for every \(v\in W_{1}^{t}\),

    $$\displaystyle \begin{aligned} H\left(\xi_{t}^{v}\right)-H\left(\xi_{t}\right) =-\left[1+o_{n}(1)\right]\big[H(\tilde{\xi}_{t}^{v})-H(\tilde{\xi}_{t})\big], \end{aligned} $$
    (6.27)

    since, up to an additive correction of magnitude \(|W_{1}^{t}|=O(n^{6/7})\), v has the same number of neighbours in ξ t as in \(\tilde {\xi }_{t}\). Hence it follows that

    $$\displaystyle \begin{aligned} \sum_{v \in W_{1}^{t}} \left[r(\xi_{t},\xi_{t}^{v})+r(\tilde{\xi}_{t},\tilde{\xi}_{t}^{v})\right] =\left[1+o_{n}(1)\right]\big(\mathrm{e}^{-2\beta [-p\mathbf{m}-h]_+}+\mathrm{e}^{-2\beta [p\mathbf{m}+h]_+}\big)|W_{1}^{t}|, {} \end{aligned} $$
    (6.28)

    which bounds the rate at which \(W_{1}^{t}\) shrinks.

  6. 6.

    Combining (6.25) and (6.28), and noting that p m + h < 0, we see that \(\left |W_{1}^{t}\right |\) is contracting when

    $$\displaystyle \begin{aligned}{}[1+o_n(1)] \left(\mathrm{e}^{2\beta(p\mathbf{m}+h)} + 1\right)\,\left|W_{1}^{t}\right| >\left[1+o_{n}(1)\right]\left(\mathrm{e}^{2\beta(p\mathbf{m}+h)}\,2\beta p(1-\mathbf{m})\right)\,\left|W_{1}^{t}\right|. \end{aligned} $$
    (6.29)

    For this in turn it suffices that

    $$\displaystyle \begin{aligned} \mathrm{e}^{2\beta(p\mathbf{m}+h)} + 1 > \mathrm{e}^{2\beta(p\mathbf{m}+h)} 2\beta p(1-\mathbf{m}). \end{aligned} $$
    (6.30)
  7.

    Note from the definition of m in (1.16) that, up to a correction factor of 1 + o n(1), m solves the equation J p,β,h(m) = 0 with

    $$\displaystyle \begin{aligned} J_{p,\beta,h}(a) = 2\lambda\left(a+\frac{h}{p}\right)+\log\left(\frac{1-a}{1+a}\right), \qquad \lambda = \beta p, \end{aligned} $$
    (6.31)

    i.e.,

    $$\displaystyle \begin{aligned} \frac{1+\mathbf{m}}{1-\mathbf{m}} = \mathrm{e}^{2\lambda\left(\mathbf{m}+\frac{h}{p}\right)}. \end{aligned} $$
    (6.32)

    Hence from (6.30) it follows that \(|W_{1}^{t}|\) is contracting whenever we are in the metastable regime and the inequality

    $$\displaystyle \begin{aligned} \lambda < \frac{1}{1-{\mathbf{m}}^2} {} \end{aligned} $$
    (6.33)

    is satisfied. From (2.17) it follows that the equality

    $$\displaystyle \begin{aligned} \lambda = \frac{1}{1-a^2} \end{aligned} $$
    (6.34)

    holds for \(a=a_{\lambda }=-\sqrt {1-1/\lambda }\), which satisfies m < a λ < t < 0, and therefore

    $$\displaystyle \begin{aligned} \frac{1}{1-{\mathbf{m}}^2} > \frac{1}{1-a_{\lambda}^2} = \lambda. \end{aligned} $$
    (6.35)

    This shows that \(|W_{1}^{t}|\) is contracting whenever we are in the metastable regime.
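The equivalence of (6.30) and (6.33), and the fact that both hold in the metastable regime, can be sanity-checked numerically. The sketch below is illustrative only (the parameter values p, λ, h are assumptions, not taken from the model analysis): it solves the mean-field equation (6.32) by fixed-point iteration and verifies (6.30) and (6.33) directly.

```python
import math

# Illustrative parameters (assumed, not from the paper): lambda = beta*p > 1
# puts us past the mean-field critical point, and h > 0 is small.
p, lam, h = 0.8, 1.5, 0.05
beta = lam / p

# Solve the mean-field equation (6.32), equivalently m = tanh(lam*m + beta*h),
# starting near -1 so that the iteration converges to the metastable root m < 0.
m = -0.9
for _ in range(200):
    m = math.tanh(lam * m + beta * h)

assert abs(m - math.tanh(lam * m + beta * h)) < 1e-12  # fixed point reached
assert p * m + h < 0              # the sign condition used in step 6

# (6.33): lam < 1/(1 - m^2), which via (6.32) is equivalent to (6.30)
assert lam < 1 / (1 - m**2)

# (6.30) checked directly
E = math.exp(2 * beta * (p * m + h))
assert E + 1 > E * 2 * beta * p * (1 - m)
print(round(m, 4), round(1 / (1 - m**2), 3))
```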

  8.

    To conclude, we summarise the implication of the contraction of the process \(|W_{1}^{t}|\). The probability in (6.16) is equal to \(1-O_{n}(n^{\frac {5}{6}-\frac {6}{7}})\) for \(|W_{1}^{t}| > n^{6/7}\), and is strictly larger than \(\frac {1}{2}\) for \(|W_{1}^{t}| \leq n^{6/7}\). Furthermore, from (6.12) we know that the rate at which \(W_1^{t}\) shrinks is ≥ 1. This allows us to ensure that sufficiently many steps are made by time 2n to allow \(W_{1}^{t}\) to contract to the empty set. In particular, the number of steps taken by \(W_{1}^{t}\) up to time 2n is bounded from below by a Poisson process N(t) with unit rate, for which we have

    $$\displaystyle \begin{aligned} \mathbb{P}\left[N\left(2n\right) \leq \frac{3n}{2}\right] \leq 2n\,\frac{\left(2n\right)^{\frac{3n}{2}}\mathrm{e}^{-2n}}{\left(\frac{3n}{2}\right)!} \leq 2n\left(\frac{4\mathrm{e}}{3}\right)^{\frac{3n}{2}}\mathrm{e}^{-2n} \leq 1.07^{-\frac{n}{2}}. \end{aligned} $$
    (6.36)

    In other words, with probability exponentially close to 1, we have that at least 3n∕2 jumps are made in time 2n. To bound the probability that \(W_{1}^{t}\) has not converged to the empty set, note that this probability decreases in the number of transitions made by \(W_{1}^{t}\). Therefore, without loss of generality, we may assume that \(\frac {3n}{2}\) transitions were made, and that we start with \(|W_{1}^{0}|=n\). We claim that, with high probability, in time 2n, \(W_{1}^{t}\) takes at most \(\frac {100n}{\log n}\) increasing steps (i.e., i → i + 1) in the interval [n^{5∕6}, n]. Indeed, note that the probability of the latter occurring is less than

    $$\displaystyle \begin{aligned} 2^{\frac{3n}{2}}\,O\big(n^{-1/42}\big)^{\frac{100n}{\log n}} = O\big(\mathrm{e}^{-n}\big). \end{aligned} $$
    (6.37)

    It follows that at least \(\frac {n}{2}[1+o_{n}(1)]\) steps are taken in the interval [0, n^{5∕6}]. But then, using (6.16), we have that the probability of an increasing step is at most \(\frac {1}{2}-\epsilon \) for some 𝜖 > 0, and therefore the probability of that event is at most

    $$\displaystyle \begin{aligned} \begin{array}{ll} &2^{\frac{n}{2}[1+o_{n}(1)]}\left(\frac{1}{2}+\epsilon\right)^{\frac{n}{4}[1+o_{n}(1)]} \left(\frac{1}{2}-\epsilon\right)^{\frac{n}{4}[1+o_{n}(1)]}\\[0.2cm] &= 4^{\frac{n}{4}[1+o_{n}(1)]}\left(\frac{1}{4}-\epsilon^{2}\right)^{\frac{n}{4}[1+o_{n}(1)]} = \left(1-4\epsilon^{2}\right)^{\frac{n}{4}[1+o_{n}(1)]}. \end{array} \end{aligned} $$
    (6.38)

    Finally, observing that in the entire proof so far the largest probability for any of the bounds not to hold is \(O(\mathrm{e} ^{-n^{2/3}})\) (see (6.13) and the paragraph following (6.16)), we get

    $$\displaystyle \begin{aligned} \mathbb{P}\left[|W_{1}^{2n}|>0\right] \leq O\big(\mathrm{e}^{-n^{2/3}}\big) \end{aligned} $$
    (6.39)

    and so the claim of the lemma follows. □
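As an aside, the chain of inequalities in (6.36) can be checked numerically for a moderate value of n (the choice n = 200 below is arbitrary), computing the exact Poisson tail through log-probabilities:

```python
import math

def poisson_cdf(k, mu):
    # exact P[Poisson(mu) <= k], summed via log-probabilities to avoid overflow
    return sum(math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))
               for j in range(k + 1))

n = 200                                   # arbitrary moderately large even n
lhs = poisson_cdf(3 * n // 2, 2 * n)      # P[N(2n) <= 3n/2]
rhs = 1.07 ** (-n / 2)                    # the final bound in (6.36)
assert lhs <= rhs
print(lhs, rhs)
```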

6.2 Long-Term Scheme

Corollary 6.3 (Long-Term Coupling)

Let δ > 0. With \(\mathbb {P}_{\mathrm {ER}_n(p)}\)-probability tending to 1 as n →∞, there is a coupling of {ξ t}t≥0 and \(\{\tilde {\xi }_{\tilde {t}}\}_{\tilde{t} \geq 0}\) , and there are times t and \(\tilde {t}\) with \(\max (t,\tilde {t})<\mathrm{e} ^{n\varGamma ^{\star }-\delta n}\) , such that

$$\displaystyle \begin{aligned} \mathbb{P}\big[\xi_{t} \neq \tilde{\xi}_{\tilde{t}}\big] \leq \mathrm{e}^{-n\delta+ O(n^{2/3})}. \end{aligned} $$
(6.40)

Proof

Let s i be the ith return-time of {ξ t}t≥0 to A M. Define \(\{\tilde {\mathbf {s}}_{i}\}_{i\geq 0}\) in an analogous manner for \(\{\tilde {\xi }_t\}_{t \geq 0}\). Then we can define a coupling of {ξ t}t≥0 and \(\{\tilde {\xi }_{t}\}_{t \geq 0}\) as follows. For i ≥ 0 and 0 ≤ s ≤ 2n, couple \(\xi _{\mathbf {s_{i}}+s}\) and \(\tilde {\xi }_{\tilde {\mathbf {s}}_{\mathbf {i}}+s}\) as described in Lemma 6.1. For times t ∈ (s i + 2n, s i+1) and \(\tilde {t} \in (\tilde {\mathbf {s}}_{\mathbf {i}}+2n,\tilde {\mathbf {s}}_{\mathbf {i+1}})\), let {ξ t}t≥0 and \(\{\tilde {\xi }_{\tilde {t}}\}_{t \geq 0}\) run independently of each other. Terminate this coupling at the first time t such that t = s i + s for some s ≤ 2n and \(\xi _{t}=\tilde {\xi }_{\tilde {t}}\) with \(\tilde {t} =\tilde {\mathbf {s}}_{\mathbf {i}}+s\), from which point onward we simply let \(\xi _{t}=\tilde {\xi }_{\tilde {t}}\). It is easy to see that the coupling above is an attempt at repeating the coupling scheme of Lemma 6.1 until the paths of the two processes have crossed. To avoid having to wait until both processes are in A M at the same time, the coupling defines a joint distribution of ξ t and \(\tilde {\xi }_{\tilde {t}}\).

Note that, by Lemma 5.4, with probability of at least \(1-\mathrm{e} ^{-\delta n + O(n^{2/3})}\), {ξ t}t≥0 will visit A M at least \(\mathrm{e} ^{(\varGamma ^{\star } - \delta )n}\) times before reaching A S, for any δ > 0. The same statement is true for \(\{\tilde {\xi }_{\tilde {t}}\}_{\tilde {t} \geq 0}\). Assuming that the aforementioned event holds for both ξ t and \(\tilde {\xi }_{\tilde {t}}\), we see that the probability that the coupling does not succeed (i.e., the two trajectories do not intersect as described earlier) is at most

$$\displaystyle \begin{aligned} \left[O\big(\mathrm{e}^{-n^{2/3}}\big)\right]^{\mathrm{e}^{(\varGamma^{\star}-\delta) n}}. \end{aligned} $$
(6.41)

Therefore, the probability that the coupling does not succeed before either of {ξ t}t≥0 or \(\{\tilde {\xi }_{\tilde {t}}\}_{\tilde {t} \geq 0}\) reaches A S is at most \(\mathrm{e} ^{-\delta n + O(n^{2/3})}\). □

7 Proof of the Main Metastability Theorem

In this section we prove Theorem 1.4.

Proof

The key is to show that with \(\mathbb {P}_{\mathrm {ER}_n(p)}\)-probability tending to 1 as n →∞, for any \(\xi _{0}^{u} \in A_{{\mathbf {M}}^{u}}\), ξ 0 ∈ A M and \(\xi _{0}^{l}\in A_{{\mathbf {M}}^{l}}\),

$$\displaystyle \begin{aligned} \mathbb{E}_{\xi_{0}^{u}}\left[\tau_{{\mathbf{S}}^{u}}\right] \leq \mathbb{E}_{\xi_{0}}\left[\tau_{\mathbf{S}}\right] \leq \mathbb{E}_{\xi_{0}^{l}}\left[\tau_{{\mathbf{S}}^{l}}\right]. \end{aligned} $$
(7.1)

Note that \(\mathbb {E}_{\xi _{0}}\left [\tau _{\mathbf {S}}\right ]\) is the same for all ξ 0 ∈ A M up to a multiplicative factor of 1 + o n(1), as was shown in Sects. 5.2 and 6. Therefore it suffices to find some convenient ξ ∈ A M for which we can prove the aforementioned bounds.

  1.

    Our proof follows four steps:

    (1)

      Recall that for A ⊂ S n, μ A is the probability distribution μ conditioned to the set A. Starting from the initial distribution \(\mu _{A_{\mathbf {M}}}\) on the set A M, the trajectory segment taken by ξ t from ξ 0 to ξ τ, with \(\tau =\min \left \{\tau _{\mathbf {M}}, \tau _{\mathbf {S}}\right \}\), can be coupled to the analogous trajectory segments taken by \(\xi _{t}^{l}\) and \(\xi _{t}^{u}\), starting in \(A_{{\mathbf {M}}^{l}}\) and \(A_{{\mathbf {M}}^{u}}\), respectively, and this coupling can be done in such a way that the following two conditions hold:

      (a)

        If ξ t reaches A S before returning to A M (i.e., τ S < τ M), then \(\xi _{t}^{u}\) reaches \(A_{{\mathbf {S}}^{u}}\) before returning to \(A_{{\mathbf {M}}^{u}}\).

      (b)

        If ξ t returns to A M before reaching A S (i.e., τ M < τ S), then \(\xi _{t}^{l}\) returns to \(A_{{\mathbf {M}}^{l}}\) before reaching \(A_{{\mathbf {S}}^{l}}\).

    (2)

      We show that if ξ t has initial distribution \(\mu _{A_{\mathbf {M}}}\) and τ M < τ S, then upon returning to A M the distribution of ξ t is once again given by \(\mu _{A_{\mathbf {M}}}\). This implies that the argument in Step (1) can be applied repeatedly, and that the number of returns ξ t makes to A M before reaching A S is bounded from below by the number of returns \(\xi _{t}^{u}\) makes to \(A_{{\mathbf {M}}^{u}}\) before reaching \(A_{{\mathbf {S}}^{u}}\), and is bounded from above by the number of returns \(\xi _{t}^{l}\) makes to \(A_{{\mathbf {M}}^{l}}\) before reaching \(A_{{\mathbf {S}}^{l}}\).

    (3)

      Using Lemma 3.7, we bound the time between unsuccessful excursions, i.e., the expected time it takes for ξ t, when starting from \(\mu _{A_{\mathbf {M}}}\), to return to A M, given that τ M < τ S. This bound is used together with the outcome of Step (2) to obtain the bound

      $$\displaystyle \begin{aligned} \mathbb{E}_{\mu_{A_{{\mathbf{M}}^{u}}}}^{u}\left[\tau_{\mathbf{S}}^{u}\right] \leq \mathbb{E}_{\mu_{A_{\mathbf{M}}}}\left[\tau_{\mathbf{S}}\right] \leq \mathbb{E}_{\mu_{A_{{\mathbf{M}}^{l}}}}^{l}\bigl[\tau_{\mathbf{S}}^{l}\bigr]. \end{aligned} $$
      (7.2)

      Here, the fact that the conditional average return time is bounded by some large constant rather than 1 does not affect the sandwich in (7.2), because the errors coming from the perturbation of the magnetic field in the Curie-Weiss model are polynomial in n (see below).

    (4)

      We complete the proof by showing that, for any distribution μ 0 restricted to A M,

      $$\displaystyle \begin{aligned} \mathbb{E}_{\mu_{0}}\left[\tau_{\mathbf{S}}\right] = \left[1+o_{n}(1)\right]\,\mathbb{E}_{\mu_{A_{\mathbf{M}}}}\left[\tau_{\mathbf{S}}\right]. \end{aligned} $$
      (7.3)
  2.

    Before we turn to the proof of these steps, we explain how the bound on the exponent in the prefactor of Theorem 1.4 comes about. Return to (2.4). The magnetic field h is perturbed to \(h \pm \, (1+\epsilon ) \log (n^{11/6})/n\). We need to show how this affects the formulas for the average crossover time in the Curie-Weiss model. For this we use the computations carried out in [4, Chapter 13]. According to [4, Eq. (13.2.4)] we have, for any \(\xi \in A_{{\mathbf {M}}_n}\) and any 𝜖 > 0,

    $$\displaystyle \begin{aligned} \mathbb{E}_{\xi}\left[\tau_{A_{{\mathbf{S}}_n}}\right] = [1+o_n(1)]\,\frac{2}{1-\mathbf{t}}\, \mathrm{e}^{\beta n [R_n(\mathbf{t})-R_n(\mathbf{m})]}\,\frac{1}{n}\,{\mathbf{S}}_n \end{aligned} $$
    (7.4)

    with

    (7.5)

    where R n is the free energy defined by \(R^{\prime }_n= -J_n/2\beta \) (recall (1.20)). (Here we suppress the dependence on β, h and note that (7.4) carries an extra factor \(\frac {1}{n}\) because [4, Chapter 13] considers a discrete-time dynamics where at every unit of time a single spin is drawn uniformly at random and is flipped with a probability that is given by the right-hand side of (1.6).) According to [4, Eq. (13.2.5)–(13.2.6)] we have

    $$\displaystyle \begin{aligned} I_n(a) - I(a) = [1+o_n(1)] \frac{1}{2n} \log \left(\frac 12 \pi n(1-a^2)\right), \qquad a \in [-1,1], \end{aligned} $$
    (7.6)

    so that

    $$\displaystyle \begin{aligned} \mathrm{e}^{\beta n[R_n(a)-R(a)]} = [1+o_n(1)]\,\sqrt{\frac 12\pi n(1-a^2)}, \qquad a \in [-1,1], \end{aligned} $$
    (7.7)

    where R is the limiting free energy defined by R′ = −J∕2β (recall (1.27)). Inserting (7.7) into (7.4), we get

    $$\displaystyle \begin{aligned} \mathbb{E}_{\xi}\left[\tau_{A_{{\mathbf{S}}_n}}\right] = [1+o_n(1)]\,\frac{2}{1+\mathbf{t}}\, \sqrt{\frac{1-{\mathbf{t}}^2}{1-{\mathbf{m}}^2}}\, \mathrm{e}^{\beta n [R(\mathbf{t})-R(\mathbf{m})]}\,\frac{1}{n}\,{\mathbf{S}}^*_n \end{aligned} $$
    (7.8)

    with

    (7.9)

    Finally, according to [4, Eq. (13.2.9)–(13.2.11)] we have, with the help of a Gaussian approximation,

    $$\displaystyle \begin{aligned} \lim_{n\to\infty} \frac{1}{n}\,{\mathbf{S}}^*_n = \frac{\pi}{2\beta \sqrt{R^{\prime\prime}(\mathbf{m})\,[-R^{\prime\prime}(\mathbf{t})]}}. \end{aligned} $$
    (7.10)

    Putting together (7.8) and (7.10), we see how Theorem 1.3 arises as the correct formula for the Curie-Weiss model.

  3.

    The above computations are for β, h fixed and p = 1. We need to investigate what changes when p ∈ (0, 1), β is fixed, but h depends on n:

    $$\displaystyle \begin{aligned} h_n = h \pm (1+\epsilon)\,\frac{\log (n^{11/6})}{n}. \end{aligned} $$
    (7.11)

    We write \(R_n^n\) to denote R n when h is replaced by h n. In the argument in [4, Chapter 13] leading up to (7.4), the approximation only enters through the prefactor. But since h n → h as n →∞, the perturbation affects the prefactor only by a factor 1 + o n(1). Since h plays no role in (7.6) and \(R^n_n(a)-R^n(a) = \frac {1}{\beta }[I_n(a)-I(a)]\) (recall (1.19) and (1.26)), we get (7.8) with exponent \(\beta n[R^n(\mathbf{t}) - R^n(\mathbf{m})]\) and (7.9) with exponent \(\beta n[R^n(a) - R^n(\mathbf{t})] - \beta n[R^n(a') - R^n(\mathbf{m})]\). The latter affects the Gaussian approximation behind (7.10) only by a factor 1 + o n(1). However, the former leads to an error term in the exponent, compared to the Curie-Weiss model, that equals

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\beta n[R^n(\mathbf{t})-R^n(\mathbf{m})]-\beta n[R(\mathbf{t})-R(\mathbf{m})] = \beta n \int_{\mathbf{m}}^{\mathbf{t}} \mathrm{d} a\, [(R^n)'(a)-R'(a)]\\[0.2cm] &\qquad = \beta n \int_{\mathbf{m}}^{\mathbf{t}} \mathrm{d} a\, [-(h_n-h)] = \beta (\mathbf{t}-\mathbf{m})\,n (h-h_n)\\[0.2cm] &\qquad = \mp \beta (\mathbf{t}-\mathbf{m})\,(1+\epsilon) \log(n^{11/6}). \end{array} \end{aligned} $$
    (7.12)

    The exponential of this equals \(n^{\mp \beta (\mathbf{t}-\mathbf{m})\,(1+\epsilon)\,(11/6)}\), which proves Theorem 1.4 with the bound in (1.33) because 𝜖 is arbitrary.

Proof of Step (1)

This step is a direct application of Lemma 4.3.

Proof of Step (2)

Write =d to denote equality in distribution. Let \(\xi _{0}=^d\mu _{A_{\mathbf {M}}}\), and recall that τ M is the first return time of ξ t to A M once the initial state ξ 0 has been left. We want to show that \(\xi _{\tau _{\mathbf {M}}}=^d\mu _{A_{\mathbf {M}}}\) or, in other words, that \(\mathbb {P}_{\mu _{A_{\mathbf {M}}}}[\xi _{\tau _{\mathbf {M}}}=\sigma ] = \mu _{A_{\mathbf {M}}} (\sigma )\) for any σ ∈ A M. To facilitate the argument, we begin by defining the set of all finite permissible trajectories \(\mathcal {T}\), i.e.,

$$\displaystyle \begin{aligned} \mathcal{T}=\bigcup_{N\in\mathbb{N}}\left\{ \gamma=\left\{ \gamma_{i}\right\} _{i=0}^{N} \in S_{n}^{N}\colon\left|\left|\gamma_{i}\right|-\left|\gamma_{i+1}\right|\right|=1\,\, \forall\,0 \leq i \leq N-1\right\}. \end{aligned} $$
(7.13)

Let \(\gamma \in \mathcal {T}\) be any finite trajectory beginning at γ 0 ∈ A M, ending at γ |γ|−1 = σ ∈ A M, and satisfying γ iA M for 0 < i < |γ|− 1. Then the probability that ξ t follows the trajectory γ is given by

$$\displaystyle \begin{aligned} \begin{array}{ll} \mathbb{P}\left[\xi_{t}\mbox{{ follows }}\gamma\right] &= \mu_{A_{\mathbf{M}}}\left(\gamma_{0}\right)P\left(\gamma_{0},\gamma_{1}\right) \times \cdots \times P\left(\gamma_{\left|\gamma\right|-2},\sigma\right)\\[0.2cm] &= \frac{1}{\mu\left(A_{\mathbf{M}}\right)}\mu\left(\gamma_{0}\right)P\left(\gamma_{0},\gamma_{1}\right) \times \cdots \times P\left(\gamma_{\left|\gamma\right|-2},\sigma\right)\\[0.2cm] &= \frac{1}{\mu\left(A_{\mathbf{M}}\right)}\mu\left(\sigma\right)P\left(\sigma,\gamma_{\left|\gamma\right|-2}\right) \times \cdots \times P\left(\gamma_{1},\gamma_{0}\right)\\[0.2cm] &= \mu_{A_{\mathbf{M}}}\left(\sigma\right)P\left(\sigma,\gamma_{\left|\gamma\right|-2}\right) \times \cdots \times P\left(\gamma_{1},\gamma_{0}\right), \end{array} \end{aligned} $$
(7.14)

where the third line follows from reversibility. Thus, if we let \(\mathcal {T}(\sigma )\) be the set of all trajectories in \(\mathcal {T}\) that begin in A M, end at σ, and do not visit A M in between, then we get

$$\displaystyle \begin{aligned} \begin{array}{ll} \mathbb{P}_{\mu_{A_{\mathbf{M}}}}\left[\xi_{\tau_{\mathbf{M}}}=\sigma\right] &= \sum_{\gamma\in\mathcal{T}\left(\sigma\right)}\mu_{A_{\mathbf{M}}}\left(\sigma\right) P\left(\sigma,\gamma_{\left|\gamma\right|-2}\right) \times \cdots \times P\left(\gamma_{1},\gamma_{0}\right)\\[0.2cm] &= \mu_{A_{\mathbf{M}}}\left(\sigma\right)\mathbb{P}_{\sigma}\left[\tau_{\mathbf{M}}<\infty\right]\\[0.2cm] &= \mu_{A_{\mathbf{M}}}\left(\sigma\right), \end{array} \end{aligned} $$
(7.15)

where we use recurrence and the law of total probability, since the trajectories in \(\mathcal {T}(\sigma )\), when reversed, give all possible trajectories that start at σ ∈ A M and return to A M in a finite number of steps. This shows that if ξ t has initial distribution \(\mu _{A_{\mathbf {M}}}\), then it also has this distribution upon every return to A M.
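The path-reversal identity behind (7.14)–(7.15) uses only reversibility. A minimal numerical illustration, for a hypothetical four-state Metropolis chain with an arbitrary energy landscape H (not the model of the paper), is:

```python
import math

H = [0.0, 1.3, 0.4, 2.1]               # arbitrary energy landscape (assumed)
n = len(H)
Z = sum(math.exp(-e) for e in H)
mu = [math.exp(-e) / Z for e in H]     # reversible equilibrium distribution

def P(x, y):
    # Metropolis nearest-neighbour transition probabilities
    if abs(x - y) == 1:
        return 0.5 * min(1.0, math.exp(-(H[y] - H[x])))
    if x == y:
        return 1.0 - sum(P(x, z) for z in (x - 1, x + 1) if 0 <= z < n)
    return 0.0

# detailed balance: mu(x) P(x,y) = mu(y) P(y,x)
for x in range(n - 1):
    assert abs(mu[x] * P(x, x + 1) - mu[x + 1] * P(x + 1, x)) < 1e-14

# path-reversal identity as in the third line of (7.14)
gamma = [0, 1, 2, 1, 2, 3]
fwd = mu[gamma[0]] * math.prod(P(a, b) for a, b in zip(gamma, gamma[1:]))
rev = list(reversed(gamma))
bwd = mu[rev[0]] * math.prod(P(a, b) for a, b in zip(rev, rev[1:]))
assert abs(fwd - bwd) < 1e-12
print(fwd)
```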

We can now define a segment-wise coupling of the trajectory taken by ξ t with the trajectories taken by \(\xi _{t}^{u}\) and \(\xi _{t}^{l}\). First, we define the subsets of trajectories that start and end in particular regions of the state space: (1) \(\mathcal {T}_{\sigma ,L,K}\) is the set of trajectories that start at a particular configuration σ and end in A K without ever visiting A K or A L in between, for some K < L; (2) \(\mathcal {T}_{\sigma ,L,L}\) is the set of trajectories that start at some σ and end in A L without ever visiting A K or A L in between; (3) \(\mathcal {T}_{\sigma ,L}\) is the union of the two aforementioned sets. In explicit form,

$$\displaystyle \begin{aligned} \begin{array}{ll} \mathcal{T}_{\sigma,L,K} &= \left\{ \gamma\in\mathcal{T}\colon\gamma_{0}=\sigma,\gamma_{\left|\gamma\right|-1} \in A_{K},K<\left|\gamma_{j}\right|<L\,\,\forall\,0<j<\left|\gamma\right|-1\right\},\\[0.2cm] \mathcal{T}_{\sigma,L,L} &= \left\{ \gamma\in\mathcal{T}\colon\gamma_{0}=\sigma,\gamma_{\left|\gamma\right|-1} \in A_{L},K<\left|\gamma_{j}\right|<L\,\,\forall\,0<j<\left|\gamma\right|-1\right\},\\[0.2cm] \mathcal{T}_{\sigma,L} & = \mathcal{T}_{\sigma,L,K}\cup\mathcal{T}_{\sigma,L,L}. \end{array} \end{aligned} $$
(7.16)

By Step (1), for any \(\xi _{0}^{l}\in A_{{\mathbf {M}}^{l}}\) and \(\xi _{0}^{u}\in A_{{\mathbf {M}}^{u}}\),

$$\displaystyle \begin{aligned} \mathbb{P}_{\xi_{0}^{l}}^{l}\bigl[\mathcal{T}_{\xi_{0}^{l},{\mathbf{S}}^{l},{\mathbf{S}}^{l}}\bigr] \leq\mathbb{P}_{\xi_{0}}\left[\mathcal{T}_{\xi_{0},\mathbf{S},\mathbf{S}}\right] \leq \mathbb{P}_{\xi_{0}^{u}}^{u}\left[\mathcal{T}_{\xi_{0}^{u},{\mathbf{S}}^{u},{\mathbf{S}}^{u}}\right]. {} \end{aligned} $$
(7.17)

It is clear that the two probabilities at either end of (7.17) are independent of the starting points \(\xi _{0}^{l}\) and \(\xi _{0}^{u}\). By the argument given above, if for the probability in the middle \(\xi _{0}=^d\mu _{A_{\mathbf {M}}}\), then each subsequent return to A M also has this distribution. For this reason, we may define a coupling of the trajectories as follows.

Sample a trajectory segment γ l from \(\mathcal {T}_{\xi _{0}^{l},{\mathbf {S}}^{l}}\) for the process \(\xi _{t}^{l}\). If γ l happens to be in \(\mathcal {T}_{\xi _{0}^{l},{\mathbf {S}}^{l}, {\mathbf {S}}^{l}}\), then by (7.17) we may sample a trajectory segment γ from \(\mathcal {T}_{\xi _{0},\mathbf {S},\mathbf {S}}\) for the process ξ t, and a trajectory segment γ u from \(\mathcal {T}_{\xi _{0}^{u},{\mathbf {S}}^{u},{\mathbf {S}}^{u}}\) for the process ξ u. Otherwise, \(\gamma ^{l}\in \mathcal {T}_{\xi _{0}^{l},{\mathbf {S}}^{l},{\mathbf {M}}^{l}}\), and we independently take \(\gamma \in \mathcal {T}_{\xi _{0},\mathbf {S},\mathbf {S}}\) with probability \(\mathbb {P}_{\xi _{0}} [\mathcal {T}_{\xi _{0},\mathbf {S},\mathbf {S}}]-\mathbb {P}_{\xi _{0}^{l}}^{l}[\mathcal {T}_{\xi _{0}^{l}, {\mathbf {S}}^{l},{\mathbf {S}}^{l}}]\), and \(\gamma \in \mathcal {T}_{\xi _{0},\mathbf {S},\mathbf {M}}\) otherwise. If \(\gamma \in \mathcal {T}_{\xi _{0},\mathbf {S},\mathbf {S}}\), then sample γ u from \(\mathcal {T}_{\xi _{0}^{u},{\mathbf {S}}^{u},{\mathbf {S}}^{u}}\). Otherwise \(\gamma \in \mathcal {T}_{\xi _{0}, \mathbf {S},\mathbf {M}}\), and so take independently \(\gamma ^{u}\in \mathcal {T}_{\xi _{0}^{u}, {\mathbf {S}}^{u},{\mathbf {S}}^{u}}\) with probability \(\mathbb {P}_{\xi _{0}^{u}}^{u}[\mathcal {T}_{\xi _{0}^{u}, {\mathbf {S}}^{u},{\mathbf {S}}^{u}}]-\mathbb {P}_{\xi _{0}}[\mathcal {T}_{\xi _{0},\mathbf {S},\mathbf {S}}]\), and \(\gamma ^{u}\in \mathcal {T}_{\xi _{0}^{u},{\mathbf {S}}^{u},{\mathbf {M}}^{u}}\) with the remaining probability. We glue together the sampling of segments leaving and returning to \(A_{{\mathbf {M}}^{l}}/A_{\mathbf {M}} /A_{{\mathbf {M}}^{u}}\) with the next sampling of such segments. This results in trajectories for ξ u, ξ, and ξ l that reach \(A_{{\mathbf {S}}^{u}}/A_{\mathbf {S}}/A_{{\mathbf {S}}^{l}}\), in that particular order.
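The construction above can be abstracted as a monotone coupling of nested events: when the per-excursion success probabilities are ordered as in (7.17), a single uniform variable can drive all three processes so that success of the lower one implies success of the middle and upper ones. The sketch below, with hypothetical probabilities p_l ≤ p_m ≤ p_u, is an equivalent reformulation of the residual-probability sampling described above, not the literal construction:

```python
import random

# Hypothetical per-excursion probabilities of ending in the "S" class,
# ordered as in (7.17): lower <= middle <= upper (values illustrative).
p_l, p_m, p_u = 0.2, 0.35, 0.5
random.seed(1)

def coupled_draw():
    # one uniform drives all three processes, so the success events are nested
    u = random.random()
    return (u < p_l, u < p_m, u < p_u)

draws = [coupled_draw() for _ in range(10_000)]
# success of the lower process implies success of the middle and upper ones
assert all((not lo or mi) and (not mi or up) for lo, mi, up in draws)
# each marginal keeps its prescribed probability
freq_m = sum(mi for _, mi, _ in draws) / len(draws)
assert abs(freq_m - p_m) < 0.03
```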

Proof of Step (3) and Step (4)

These two steps are immediate from Lemma 3.7.

8 Conditional Average Return Time for Inhomogeneous Random Walk

In this section we prove Lemma 3.7. In Sects. 8.1–8.2 we compute the harmonic function and the conditional average return time for an arbitrary nearest-neighbour random walk on a finite interval. In Sect. 8.3 we use these computations to prove the lemma.

8.1 Harmonic Function

Consider a nearest-neighbour random walk on the set {0, …, N} with strictly positive transition probabilities p(x, x + 1) and p(x, x − 1), 0 < x < N, and with 0 and N acting as absorbing boundaries. Let τ 0 and τ N denote the first hitting times of 0 and N. The harmonic function is defined as

$$\displaystyle \begin{aligned} h_N(x) = \mathbb{P}_x(\tau_N<\tau_0), \quad 0 \leq x \leq N, \end{aligned} $$
(8.1)

where \(\mathbb {P}_x\) is the law of the random walk starting from x. This is the unique solution of the recursion relation

$$\displaystyle \begin{aligned} h_N(x) = p(x,x+1) h_N(x+1) + p(x,x-1) h_N(x-1), \quad 0<x<N, \end{aligned} $$
(8.2)

with boundary conditions

$$\displaystyle \begin{aligned} h_N(0)= 0, \qquad h_N(N) = 1. \end{aligned} $$
(8.3)

Since p(x, x + 1) + p(x, x − 1) = 1, the recursion can be written as

$$\displaystyle \begin{aligned} p(x,x+1)[h_N(x+1)-h_N(x)] = p(x,x-1)[h_N(x)-h_N(x-1)]. \end{aligned} $$
(8.4)

Define Δh N(x) = h N(x + 1) − h N(x), 0 ≤ x < N. Iteration gives

$$\displaystyle \begin{aligned} \varDelta h_N(x) = \pi[1,x]\,\varDelta h_N(0), \quad 0 \leq x < N, \end{aligned} $$
(8.5)

where we define

$$\displaystyle \begin{aligned} \pi(I) = \prod_{z \in I} \frac{p(z,z-1)}{p(z,z+1)}, \quad I \subseteq \{1,\ldots,N-1\}, \end{aligned} $$
(8.6)

with the convention that the empty product equals 1. Since h N(0) = 0, we have

$$\displaystyle \begin{aligned} h_N(x) = \sum_{z=0}^{x-1} \varDelta h_N(z) = \left(\sum_{z=0}^{x-1} \pi[1,z]\right) \varDelta h_N(0), \qquad 0 < x \leq N. \end{aligned} $$
(8.7)

Put C = Δh N(0), and abbreviate

$$\displaystyle \begin{aligned} R(x) = \sum_{z=0}^{x-1} \pi[1,z], \quad 0 \leq x \leq N. \end{aligned} $$
(8.8)

Since h N(N) = 1, we have C = 1∕R(N). Therefore we arrive at

$$\displaystyle \begin{aligned} h_N(x) = \frac{R(x)}{R(N)}, \quad 0 \leq x \leq N. \end{aligned} $$
(8.9)

Remark 8.1

For simple random walk we have \(p(x,x \pm 1) = \frac 12\), hence π[1, x] = 1 and R(x) = x, and so

$$\displaystyle \begin{aligned} h_N(x) = \frac{x}{N}, \quad 0 \leq x \leq N, \end{aligned} $$
(8.10)

which is the standard gambler’s ruin formula.
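The formulas (8.6)–(8.9) are easy to verify in exact rational arithmetic. The sketch below (with the arbitrary choice N = 6) recovers the gambler's-ruin formula (8.10) for the simple random walk and checks that h_N solves the recursion (8.2) for a biased walk:

```python
from fractions import Fraction

def harmonic(p_down, p_up):
    # h_N(x) = R(x)/R(N), with R as in (8.8) and pi as in (8.6);
    # p_up[z-1] = p(z, z+1) and p_down[z-1] = p(z, z-1) for z = 1..N-1
    N = len(p_up) + 1
    pi = [Fraction(1)]                       # pi[1, z], empty product at z = 0
    for z in range(1, N):
        pi.append(pi[-1] * Fraction(p_down[z - 1]) / Fraction(p_up[z - 1]))
    R = [sum(pi[:x]) for x in range(N + 1)]  # R(x) = sum_{z=0}^{x-1} pi[1, z]
    return [R[x] / R[N] for x in range(N + 1)]

N = 6
# simple random walk: p(x, x±1) = 1/2  ->  gambler's ruin h_N(x) = x/N
h = harmonic([Fraction(1, 2)] * (N - 1), [Fraction(1, 2)] * (N - 1))
assert h == [Fraction(x, N) for x in range(N + 1)]

# a biased walk: check that h_N solves the recursion (8.2) with h(0)=0, h(N)=1
p_up = [Fraction(2, 3)] * (N - 1)
p_down = [Fraction(1, 3)] * (N - 1)
h = harmonic(p_down, p_up)
assert h[0] == 0 and h[N] == 1
for x in range(1, N):
    assert h[x] == p_up[x - 1] * h[x + 1] + p_down[x - 1] * h[x - 1]
```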

8.2 Conditional Average Hitting Time

We are interested in the quantity

$$\displaystyle \begin{aligned} e_N(x) = \mathbb{E}_x(\tau_N \mid \tau_N<\tau_0), \qquad 0 < x \leq N. \end{aligned} $$
(8.11)

The conditioning amounts to taking the Doob transformed random walk, which has transition probabilities

$$\displaystyle \begin{aligned} q(x,x \pm 1) = p(x,x \pm1 )\,\frac{h_N(x \pm 1)}{h_N(x)}. \end{aligned} $$
(8.12)

We have the recursion relation

$$\displaystyle \begin{aligned} e_N(x) = 1 + q(x,x+1) e_N(x+1) + q(x,x-1) e_N(x-1), \quad 0 < x < N, \end{aligned} $$
(8.13)

with boundary conditions

$$\displaystyle \begin{aligned} e_N(N) = 0, \quad e_N(1) = 1 + e_N(2). \end{aligned} $$
(8.14)

Putting f N(x) = h N(x)e N(x), we get the recursion

$$\displaystyle \begin{aligned} f_N(x) = h_N(x) + p(x,x+1) f_N(x+1) + p(x,x-1) f_N(x-1), \quad 0 < x < N, \end{aligned} $$
(8.15)

which can be rewritten as

$$\displaystyle \begin{aligned} p(x,x+1)[f_N(x+1)-f_N(x)] = p(x,x-1)[f_N(x)-f_N(x-1)] - h_N(x). \end{aligned} $$
(8.16)

Define Δf N(x) = f N(x + 1) − f N(x), 0 < x < N. Iteration gives

$$\displaystyle \begin{aligned} \varDelta f_N(x) = \pi(1,x]\,\varDelta f_N(1) - \chi(1,x], \quad 0 < x < N, \end{aligned} $$
(8.17)

with

$$\displaystyle \begin{aligned} \chi(1,x] = \sum_{y=2}^x \pi(y,x]\,\frac{h_N(y)}{p(y,y+1)}, \quad 0 < x < N. \end{aligned} $$
(8.18)

Since f N(N) = 0, we have

$$\displaystyle \begin{aligned} f_N(x) = - \sum_{z=x}^{N-1} \varDelta f_N(z) = \sum_{z=x}^{N-1} \chi(1,z] - \left(\sum_{z=x}^{N-1} \pi(1,z]\right) \varDelta f_N(1), \quad 0<x<N, \end{aligned} $$
(8.19)

or

$$\displaystyle \begin{aligned} e_N(x) = \frac{1}{h_N(x)} \sum_{z=x}^{N-1} \chi(1,z] - \frac{1}{h_N(x)} \left(\sum_{z=x}^{N-1} \pi(1,z]\right) \varDelta f_N(1), \quad 0 < x < N. \end{aligned} $$
(8.20)

Put C = Δf N(1), and abbreviate

$$\displaystyle \begin{aligned} A(x) = \sum_{z=x}^{N-1} \pi(1,z], \quad B(x) = \sum_{z=x}^{N-1} \chi(1,z], \quad 0 < x \leq N. \end{aligned} $$
(8.21)

Then

$$\displaystyle \begin{aligned} e_N(x) = \frac{1}{h_N(x)} \bigl[B(x)-C A(x)\bigr]. \end{aligned} $$
(8.22)

Since e N(1) = 1 + e N(2), we have

$$\displaystyle \begin{aligned} C = \frac{[h_N(2)B(1)-h_N(1)B(2)]-h_N(1)h_N(2)}{h_N(2)A(1)-h_N(1)A(2)}. \end{aligned} $$
(8.23)

Abbreviate

$$\displaystyle \begin{aligned} \bar{R}(x) = \sum_{z=0}^{x-1} \pi(1,z], \quad \bar{S}(x) = \sum_{z=0}^{x-1} \chi(1,z], \quad 0 < x \leq N. \end{aligned} $$
(8.24)

Then

$$\displaystyle \begin{aligned} A(x) = \bar{R}(N)-\bar{R}(x), \quad B(x) = \bar{S}(N)-\bar{S}(x), \quad 0< x < N. \end{aligned} $$
(8.25)

Note that \(h_N(x) = R(x)/R(N) = \bar {R}(x)/\bar {R}(N)\), because π[1, z] = π(1)π(1, z] and a common factor π(1) drops out. Note further that \(\bar {R}(1) = 1\), \(\bar {R}(2) = 2\), while \(\bar {S}(1)=\bar {S}(2)=0\). Therefore

$$\displaystyle \begin{aligned} C = \frac{\bar{S}(N)}{\bar{R}(N)} - \frac{2}{\bar{R}(N)^2}. \end{aligned} $$
(8.26)

Therefore we arrive at

$$\displaystyle \begin{aligned} e_N(x) = \bar{S}(N) - \frac{\bar{R}(N)}{\bar{R}(x)}\,\bar{S}(x) + \frac{2}{\bar{R}(x)} - \frac{2}{\bar{R}(N)}, \quad 0 < x \leq N. \end{aligned} $$
(8.27)

Abbreviating

$$\displaystyle \begin{aligned} \bar{T}(x) = \bar{S}(x)\bar{R}(N) = \sum_{z=0}^{x-1} \sum_{y=2}^z \frac{\pi(y,z]}{p(y,y+1)}\,\bar{R}(y), \quad \bar{U}(x) = \frac{\bar{T}(x)-2}{\bar{R}(x)}, \end{aligned} $$
(8.28)

we can write

$$\displaystyle \begin{aligned} e_N(x) = \bar{U}(N)- \bar{U}(x), \quad 0 < x \leq N. \end{aligned} $$
(8.29)

Remark 8.2

For simple random walk we have \(p(x,x\pm 1) = \frac 12\), π(y, z] = 1, \(\bar {R}(x)=x\), \(\bar {S}(x) = \frac {1}{3N}(x^3-7x+6)\) and \(\bar {U}(x) = \frac 13(x^2-7)\), and so

$$\displaystyle \begin{aligned} e_N(x) = \frac 13\,(N^2-x^2), \qquad 0 < x \leq N. \end{aligned} $$
(8.30)

This is to be compared with the unconditional average hitting time \(\mathbb {E}_x(\tau ) = x(N-x)\), 0 ≤ x ≤ N, where τ = τ 0 ∧ τ N is the first hitting time of {0, N}.
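The closed form of Remark 8.2 also provides a test of the general formulas: the sketch below evaluates (8.24) and (8.28)–(8.29) in exact arithmetic for the simple random walk (the choice N = 10 is arbitrary) and compares with (8.30):

```python
from fractions import Fraction

half = Fraction(1, 2)

def pi_interval(y, z, ratio):
    # pi(y, z] = prod_{w=y+1}^{z} p(w, w-1)/p(w, w+1); empty product = 1
    out = Fraction(1)
    for w in range(y + 1, z + 1):
        out *= ratio(w)
    return out

def e_cond(N, p_up):
    # conditional mean hitting time e_N(x) via (8.24) and (8.28)-(8.29);
    # p_up(w) = p(w, w+1), and p(w, w-1) = 1 - p_up(w) as in Sect. 8.1
    ratio = lambda w: (1 - p_up(w)) / p_up(w)
    Rbar = [sum(pi_interval(1, z, ratio) for z in range(x))
            for x in range(N + 1)]
    def Tbar(x):
        return sum(pi_interval(y, z, ratio) / p_up(y) * Rbar[y]
                   for z in range(x) for y in range(2, z + 1))
    Ubar = lambda x: (Tbar(x) - 2) / Rbar[x]
    return lambda x: Ubar(N) - Ubar(x)

N = 10
e = e_cond(N, lambda w: half)              # simple random walk
for x in range(1, N + 1):
    assert e(x) == Fraction(N * N - x * x, 3)   # the closed form (8.30)
```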

8.3 Application to Spin-Flip Dynamics

We will use the formulas in (8.6), (8.24) and (8.28)–(8.29) to obtain an upper bound on the conditional return time to the metastable state. This bound will be sharp enough to prove Lemma 3.7. We first do the computation for the complete graph (Curie-Weiss model). Afterwards we turn to the Erdős-Rényi random graph (our spin-flip dynamics).

8.3.1 Complete Graph

We monitor the magnetization of the continuous-time Curie-Weiss model by looking at the magnetization at the times of the spin-flips. This gives a discrete-time random walk on the set Γ n defined in (1.15). This set consists of n + 1 sites. We first consider the excursions to the left of m n (recall (1.16)). After that we consider the excursions to the right.

  1.

    For the Curie-Weiss model we have (use the formulas in Lemma 3.4 without the error terms)

    $$\displaystyle \begin{aligned} \sigma \in A_k\colon \quad \sum_{\xi \in A_{k+1}} r(\sigma,\xi) = (n-k)\,\mathrm{e}^{-2\beta [\vartheta_k]_+}, \quad \sum_{\xi \in A_{k-1}} r(\sigma,\xi) = k\, \mathrm{e}^{-2\beta [-\vartheta_k]_+}, \end{aligned} $$
    (8.31)

    where \(\vartheta _k = p(1-\frac {2k}{n})-h\). Hence, the quotient of the rate to move downwards, respectively, upwards in magnetization equals

    $$\displaystyle \begin{aligned} Q(k) = \frac{\sum_{\xi \in A_{k-1}} r(\sigma,\xi) }{\sum_{\xi \in A_{k+1}} r(\sigma,\xi)} = \frac{k}{n-k}\,\mathrm{e}^{2\beta ([\vartheta_k]_+ - [-\vartheta_k]_+)}. \end{aligned} $$
    (8.32)

    It is convenient to change variables by writing \(k = \frac {n}{2}(a_k+1)\), so that 𝜗 k = −pa k − h. The metastable state corresponds to \(k = {\mathbf {M}}_n = \frac {n}{2}({\mathbf {m}}_n+1)\), i.e., a k = m n. We know from (1.16)–(1.15) that m n is the smallest solution of the equation J n(m n) = 0 (rounded off by 1∕n to fall in Γ n). Hence m n = m + O(1∕n) with m the smallest solution of the equation J p,β,h(m) = 0, satisfying \(\frac {1-\mathbf {m}}{1+\mathbf {m}} = \mathrm{e} ^{-2\beta (p\mathbf {m}+h)}\) (recall (1.23)). Hence we can write (for ease of notation we henceforth ignore the error O(1∕n))

    $$\displaystyle \begin{aligned} Q(k) = \frac{F({\mathbf{m}}_n)}{F(a_k)}, \qquad F(a) = \frac{1-a}{1+a}\,\mathrm{e}^{2\beta p a}. \end{aligned} $$
    (8.33)

    Here, we use that [𝜗 k]+ − [−𝜗 k]+ = 𝜗 k, which holds because \(0 = R^{\prime }_{p,\beta ,h} (\mathbf {m}) = -p\mathbf {m}-h+\beta ^{-1}I'(\mathbf {m})\) with I′(m) < 0 because m < 0 (recall (1.27)), so that − p m n − h > 0 for n large enough, which implies that also − pa − h > 0 for all a < m n for n large enough. We next note that (recall (1.27) and (2.17))

    $$\displaystyle \begin{aligned} \begin{array}{ll} \frac{d}{da} \log \left[\frac{F({\mathbf{m}}_n)}{F(a)}\right] &= -2\left(\beta p-\frac{1}{1-a^2}\right) = - J_{p,\beta,h}^{\prime}(a) = 2\beta R^{\prime\prime}_{p,\beta,h}(a) \geq \delta\\ &\quad \mbox{ for some } \delta>0, \end{array} \end{aligned} $$
    (8.34)

    where the inequality comes from the fact that aR p,β,h(a) has a positive curvature that is bounded away from zero on [−1, m] (recall Fig. 4).
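The identity (8.33) can be confirmed numerically: at the fixed point m of the mean-field equation, Q(k) computed from (8.32) agrees with F(m)∕F(a_k) at every site with 𝜗 k > 0. The parameter values below are illustrative assumptions:

```python
import math

# Illustrative parameters (assumed, not from the paper): beta*p > 1, h > 0 small
p, beta, h = 0.7, 2.0, 0.04
n = 10_000

# metastable magnetization: negative root of J_{p,beta,h}, i.e. m = tanh(beta*(p*m+h))
m = -0.9
for _ in range(300):
    m = math.tanh(beta * (p * m + h))
assert p * m + h < 0

F = lambda a: (1 - a) / (1 + a) * math.exp(2 * beta * p * a)   # F from (8.33)

for k in range(1, n // 4):                # sites with a_k <= -1/2 < -h/p
    a_k = 2 * k / n - 1
    theta = -p * a_k - h                  # theta_k, positive on this range
    assert theta > 0
    Q = (k / (n - k)) * math.exp(2 * beta * theta)   # (8.32) with theta_k > 0
    assert abs(Q - F(m) / F(a_k)) < 1e-9 * Q
```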

  2.

    We view the excursions to the left of m n as starting from site N in the set {0, …, N} with \(N = {\mathbf {M}}_n = \frac {n}{2}({\mathbf {m}}_n+1)\). From (8.28)–(8.29), we get

    $$\displaystyle \begin{aligned} \begin{array}{ll} e_N(x) &= \sum_{z=0}^{N-1} \sum_{y=2}^z \frac{\pi(y,z]}{p(y,y+1)}\,\frac{\bar{R}(y)}{\bar{R}(N)} - \sum_{z=0}^{x-1} \sum_{y=2}^z \frac{\pi(y,z]}{p(y,y+1)}\,\frac{\bar{R}(y)}{\bar{R}(x)}\\ &\quad + \frac{2}{\bar{R}(N)\bar{R}(x)} [\bar{R}(N)-\bar{R}(x)]\\[0.2cm] &\leq \sum_{z=x}^{N-1} \sum_{y=2}^z \frac{\pi(y,z]}{p(y,y+1)}\,\frac{\bar{R}(y)}{\bar{R}(N)} + \frac{2}{\bar{R}(x)}\\[0.2cm] &\leq 2 \sum_{z=1}^{N-1} \sum_{y=2}^z \pi(y,z] + 2. \end{array} \end{aligned} $$
    (8.35)

    Here, we use that \(p(y,y+1) \geq \frac 12\) and \(1 = \bar {R}(1) \leq \bar {R}(y) \leq \bar {R}(N)\) for all 0 < y < N (recall (8.24) and note that \(x \mapsto \bar {R}(x)\) is non-decreasing). The bound is independent of x. Using the estimate

    $$\displaystyle \begin{aligned} Q(x) = \frac{p(x,x-1)}{p(x,x+1)} \leq \mathrm{e}^{-\epsilon(N-x)/N}, \quad 0 < x < N, \quad \mbox{ for some } \epsilon = \epsilon(\delta)>0, \end{aligned} $$
    (8.36)

    which comes from (8.34), we can estimate

    $$\displaystyle \begin{aligned} \pi(y,z] {\leq} \prod_{x=y+1}^z \mathrm{e}^{-\epsilon (N-x)/N} = \exp\left[ -\epsilon \sum_{x=y+1}^z (N-x)/N \right], \quad \! 0 \leq y \leq z <N, \end{aligned} $$
    (8.37)

    from which it follows that

    $$\displaystyle \begin{aligned} \sum_{z=1}^{N-1} \sum_{y=2}^z \pi(y,z] = O(N/\epsilon), \qquad N\to\infty. \end{aligned} $$
    (8.38)

    Thus we arrive at

    $$\displaystyle \begin{aligned} e_N(x) = O(N), \qquad N \to \infty, \quad \mbox{ uniformly in } 0<x<N. \end{aligned} $$
    (8.39)

    To turn (8.39) into a tail estimate, we use the Chebyshev inequality: (8.39) implies that every N time units there is a probability at least c to hit N, for some c > 0 and uniformly in 0 < x < N. Hence

    $$\displaystyle \begin{aligned} \mathbb{P}_x(\tau_N \geq kN \mid \tau_N < \tau_0) \leq (1-c)^k \qquad \forall\, k \in \mathbb{N}_0. \end{aligned} $$
    (8.40)
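The chain of estimates (8.36)–(8.40) can be illustrated on a toy birth-death chain whose one-step ratio is exactly the bound in (8.36). The sketch below computes the expected hitting time of N from the standard difference recursion; the chain and the parameter values are illustrative assumptions (in particular we take p(0, 1) = 1 at the reflecting boundary).

```python
import math

def expected_hit_time(N, eps):
    """Mean hitting time of N, started from 0, for a birth-death chain on
    {0, ..., N} with reflection at 0 (p(0,1) = 1) and one-step ratio
    q_x / p_x = exp(-eps * (N - x) / N), mimicking the bound (8.36)."""
    # Difference recursion: d(x) = e(x) - e(x+1), d(0) = 1,
    # d(x) = (1 + q_x * d(x-1)) / p_x for 0 < x < N; then e(0) = sum of d.
    d = [1.0]
    for x in range(1, N):
        Q = math.exp(-eps * (N - x) / N)  # drift toward N: Q < 1
        p_up = 1.0 / (1.0 + Q)
        q_down = Q / (1.0 + Q)
        d.append((1.0 + q_down * d[-1]) / p_up)
    return sum(d)
```

For moderate N and ε the value stays within a small multiple of N, in line with (8.39); it is trivially at least N, since the walk moves by single steps.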
  3.

For excursions to the right of m n the argument is similar. Now \(N = {\mathbf {T}}_n - {\mathbf {M}}_n = \frac {n}{2}({\mathbf {t}}_n-{\mathbf {m}}_n)\) (recall (1.17)), and the roles of 0 and N are interchanged. Both near 0 and near N the drift towards M n vanishes linearly (because of the non-zero curvature). If we condition the random walk not to hit N, then the average hitting time of 0 starting from x is again O(N), uniformly in x.

  4.

Returning from the discrete-time random walk to the continuous-time Curie-Weiss model, we note that spin-flips occur at a rate of order n per unit of time. Since N ≍ n as n →∞, (8.40) and its analogue for excursions to the right give that, uniformly in \(\xi \in A_{{\mathbf {M}}_n}\),

$$\displaystyle \begin{aligned} \mathbb{P}_\xi\left[\tau_{A_{{\mathbf{M}}_n}} \geq k \mid \tau_{A_{{\mathbf{M}}_n}} < \tau_{A_{{\mathbf{T}}_n}} \right] \leq \mathrm{e}^{-Ck} \qquad \forall\, k \in \mathbb{N}_0, \end{aligned} $$
    (8.41)

    for some C > 0, which is the bound in (3.35).

8.3.2 Erdős-Rényi Random Graph

We next show that the above argument extends to our spin-flip dynamics once we take into account that the rates to move downwards and upwards in magnetization are perturbed by small errors. In what follows we write p CW(x, x ± 1) for the transition probabilities in the Curie-Weiss model and p ER(x, x ± 1) for the transition probabilities that serve as uniform upper and lower bounds for the transition probabilities in our spin-flip model. Recall that the latter actually depend on the configuration and not just on the magnetization, but Lemma 3.4 provides us with uniform bounds that allow us to sandwich the magnetization between the magnetizations of two perturbed Curie-Weiss models.

  1.

    Suppose that

$$\displaystyle \begin{aligned} \frac{p^{\mathrm{ER}}(x,x-1)}{p^{\mathrm{ER}}(x,x+1)} = \frac{p^{\mathrm{CW}}(x,x-1)}{p^{\mathrm{CW}}(x,x+1)} \big[1+O(N^{-1/2})\big]. \end{aligned} $$
    (8.42)

    Then there exists a C > 0 large enough such that

    $$\displaystyle \begin{aligned} \pi^{\mathrm{ER}}(y,z] \leq C \pi^{\mathrm{CW}}(y,z], \qquad 0 \leq y \leq z <N. \end{aligned} $$
    (8.43)

Indeed, as long as z − y ≤ C 1 N 1∕2 we have the bound in (8.43) (with C depending on C 1). On the other hand, if z − y > C 1 N 1∕2 with C 1 large enough, then the drift of the Curie-Weiss model sets in and overrules the error: recall from (8.36) that the drift at distance N 1∕2 from N is of order \(N^{1/2}/N = N^{-1/2}\). It follows from (8.43) that (8.38)–(8.40) carry over, with suitably adapted constants, and hence so does (8.41).
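The first half of this dichotomy rests on the elementary bound \((1+cN^{-1/2})^{C_1 N^{1/2}} \leq \mathrm{e}^{cC_1}\), which is what makes the multiplicative errors in (8.42) harmless over windows of length C 1 N 1∕2. A minimal numerical sketch (the constants c and C 1 are illustrative assumptions):

```python
import math

# Assumed constants: c in the per-step error 1 + c/sqrt(N), window length C1*sqrt(N).
N = 10_000
c, C1 = 1.0, 3.0
L = int(C1 * math.sqrt(N))  # window length z - y of order N^{1/2}

# Worst-case ratio pi_ER(y, z] / pi_CW(y, z] over such a window:
# each of the L factors is off by at most a factor 1 + c/sqrt(N).
ratio = (1.0 + c / math.sqrt(N)) ** L

# Since 1 + x <= e^x, the ratio is bounded by a constant independent of N.
bound = math.exp(c * C1)
```

Here ratio stays below e^{cC₁} no matter how large N is; this constant is what gets absorbed into the C of (8.43).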

  2.

To prove (8.42), we must show that (8.32) holds up to a multiplicative error 1 + O(n −1∕2). In the argument that follows we assume that k is such that 𝜗 k ≥ δ for some fixed δ > 0. We comment later on how to extend the argument to other values of k. Recall that 𝜗 k = −pa k − h and that 𝜗 k ≥ δ > 0 for all a k ∈ [−1, m] for n large enough.

  3.

    Let σ ∈ A k and σ v ∈ A k−1, where σ v is obtained from σ by flipping the sign at vertex v ∈ σ from + 1 to − 1. Write the transition rate from σ to σ v as

    $$\displaystyle \begin{aligned} \begin{array}{ll} r(\sigma,\sigma^v) &= \exp\left( -\beta \left[ 2p(\frac{2k}{n}-1)+2h + \frac{2}{n}\big(\epsilon(\sigma,v)-\epsilon(\overline{\sigma},v)\big)\right]_{+}\right)\\ &= \exp\left( -2\beta \left[ -\vartheta_k + \frac{1}{n}\big(\epsilon(\sigma,v)-\epsilon(\overline{\sigma},v)\big)\right]_{+}\right). \end{array} \end{aligned} $$
    (8.44)

Here, \(2p(\frac {2k}{n}-1)=\frac {2}{n}p[k-(n-k)]\) equals \(\frac {2}{n}\) times the average under \(\mathbb {P}_{\mathrm {ER}_n(p)}\) of \(E(\sigma ,v) - E(\overline {\sigma },v)\), with E(σ, v) the number of edges between the support of σ and v and \(E(\overline {\sigma },v)\) the number of edges between the support of \(\overline {\sigma }\) and v (recall the notation in Definition 3.1), and \(\epsilon (\sigma ,v)-\epsilon (\overline {\sigma },v)\) is an error term that arises from deviations from this average. Since − 𝜗 k ≤−δ, the error terms are not seen except when they represent a large deviation of size at least δn. A union bound over all the vertices and all the configurations, in combination with Hoeffding’s inequality, guarantees that, with \(\mathbb {P}_{\mathrm {ER}_n(p)}\)-probability tending to 1 as n →∞, for any σ there are at most \((\log 2)/(2\delta^2) = O(1)\) many vertices that can lead to a large deviation of size at least δn. Since r(σ, σ v) ≤ 1, we obtain

    $$\displaystyle \begin{aligned} \sum_{v\in\sigma} r(\sigma,\sigma^v) = O(1) + [n-k-O(1)]\,\mathrm{e}^{-2\beta [-\vartheta_k]_+}. \end{aligned} $$
    (8.45)

    This is a refinement of (3.10).
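The counting argument above can be illustrated by simulation: for a vertex v in the support of σ, the difference E(σ, v) − E(σ̄, v) concentrates around p[(k − 1) − (n − k)], and a deviation of size δn is a large-deviation event. A rough seeded sketch (the values of n, p, k, δ are illustrative assumptions):

```python
import random

random.seed(42)
n, p, delta = 2000, 0.5, 0.1  # assumed sizes for the illustration
k = n // 2                    # number of +1 spins (support of sigma)

def edge_count(m, p):
    # Number of ER_n(p)-edges from a fixed vertex into a set of m other vertices.
    return sum(random.random() < p for _ in range(m))

# For each vertex v in the support of sigma, E(sigma, v) ~ Bin(k-1, p) and
# E(sigma-bar, v) ~ Bin(n-k, p); count deviations of their difference from
# its mean p*((k-1) - (n-k)) that are of size at least delta*n.
mean = p * ((k - 1) - (n - k))
bad = sum(
    1
    for _ in range(k)
    if abs(edge_count(k - 1, p) - edge_count(n - k, p) - mean) >= delta * n
)
```

With these sizes a deviation of δn is roughly nine standard deviations, so the empirical count is 0 for a typical realisation; the worst-case count (log 2)∕(2δ²) = O(1) in the text is what survives the union bound over all 2ⁿ configurations.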

  4.

Similarly, let σ ∈ A k and σ v ∈ A k+1, where σ v is obtained from σ by flipping the sign at vertex v∉σ from − 1 to + 1. Write the transition rate from σ to σ v as

    $$\displaystyle \begin{aligned} \begin{array}{ll} r(\sigma,\sigma^v) &= \exp\left( -\beta \left[2p(1-\frac{2k}{n}) - 2h + \frac{2}{n}\big(\epsilon(\overline{\sigma},v)-\epsilon(\sigma,v)\big)\right]_{+}\right)\\ &= \exp\left( -2\beta \left[\vartheta_k + \frac{1}{n}\big(\epsilon(\overline{\sigma},v)-\epsilon(\sigma,v)\big)\right]_{+}\right). \end{array} \end{aligned} $$
    (8.46)

We cannot remove [⋅]+ when the error terms represent a large deviation of order δn. By the same argument as above, this happens for at most \((\log 2)/(2\delta^2) = O(1)\) many vertices v. For all other vertices, we can remove [⋅]+ and write

    $$\displaystyle \begin{aligned} r(\sigma,\sigma^v) = \mathrm{e}^{-2\beta \vartheta_k} \,\exp\left( \frac{1}{n} \big(\epsilon(\overline{\sigma},v)-\epsilon(\sigma,v)\big) \right). \end{aligned} $$
    (8.47)

    Next, we sum over v and use the inequality, valid for δ small enough,

    $$\displaystyle \begin{aligned} \mathrm{e}^{-(1+\delta) \frac{1}{M} |\sum_{i=1}^M a_i|} \leq \frac 1M \sum_{i=1}^M \mathrm{e}^{a_i} \leq \mathrm{e}^{(1+\delta) \frac{1}{M} |\sum_{i=1}^M a_i|} \qquad \forall\,0 \leq |a_i| \leq \delta,\,\, 1 \leq i \leq M. \end{aligned} $$
    (8.48)

    This gives

    $$\displaystyle \begin{aligned} \begin{array}{ll} &\sum_{v \notin \sigma} r(\sigma,\sigma^v) = O(1) + [k-O(1)]\,\mathrm{e}^{-2\beta \vartheta_k}\,\mathrm{e}^{O(|S_n|)},\\[0.2cm] &S_n = \frac{1}{[k-O(1)]} \frac{1}{n} \sum_{v \notin \sigma} \big(\epsilon(\overline{\sigma},v)-\epsilon(\sigma,v)\big). \end{array} \end{aligned} $$
    (8.49)

We know from Lemma 3.2 that, with \(\mathbb {P}_{\mathrm {ER}_n(p)}\)-probability tending to 1 as n →∞,

    $$\displaystyle \begin{aligned} |S_n| \leq \frac{c n^{3/2}}{[k-O(1)]n} \qquad \forall\,c > \sqrt{\frac{1}{8}\log 2}. \end{aligned} $$
    (8.50)

    Since we may take \(k \geq \frac {n}{3}(p-h)\) (recall (3.14)), we obtain

    $$\displaystyle \begin{aligned} \sum_{v \notin \sigma} r(\sigma,\sigma^v) = O(1) + [k-O(1)]\,\mathrm{e}^{-2\beta \vartheta_k}\,\mathrm{e}^{O(n^{-1/2})}. \end{aligned} $$
    (8.51)

    This is a refinement of (3.11).
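The step from (8.50) to (8.51) is elementary: with \(k \geq \frac{n}{3}(p-h)\) the right-hand side of (8.50) is at most \(3c/((p-h)\sqrt{n}\,) = O(n^{-1/2})\). A quick numerical sanity check (the values of c, p, h are illustrative assumptions):

```python
import math

c, p, h = 0.5, 0.5, 0.05  # assumed constants; note c > sqrt((log 2)/8) ~ 0.294
for n in (10**4, 10**6, 10**8):
    k = (p - h) * n / 3                 # lower bound on k allowed by (3.14)
    bound = c * n**1.5 / (k * n)        # right-hand side of (8.50)
    # Algebra: c*n^{3/2}/(k*n) = 3*c/((p-h)*sqrt(n)) = O(n^{-1/2}).
    assert abs(bound - 3.0 * c / ((p - h) * math.sqrt(n))) < 1e-12
```

This is why the multiplicative error e^{O(|S_n|)} in (8.49) collapses to e^{O(n^{-1∕2})} in (8.51).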

  5.

    The same argument works when we assume that k is such that 𝜗 k ≤−δ for some fixed δ > 0: simply reverse the arguments in Steps 3 and 4. It therefore remains to explain what happens when 𝜗 k ≈ 0, i.e., \(a_k \approx -\frac {h}{p}\). We then see from (1.27) that \(R^{\prime }_{p,\beta ,h}(a_k) \approx \beta ^{-1} I'(a_k) < 0\), so that a k lies in the interval [t, 0], which is beyond the top state (recall Fig. 4).