1 Introduction

We consider a Galton–Watson tree \({\mathbb {T}}\) with offspring distribution \(\nu \). The measure \(\mathrm{GW}\) denotes the Galton–Watson measure on the space of trees, and \(E_{\mathrm{GW}}\) is the expectation with respect to \(\mathrm{GW}\). The root is denoted by \( e \). We suppose that the mean number of children \(m:=E_{\mathrm{GW}}[\nu ]\) is strictly greater than 1 so that the tree is super-critical. We write \(\mathrm{GW}^*\) for the Galton–Watson measure conditioned on \({\mathbb {T}}\) being infinite.

We call \(\nu (x)\) the number of children of the vertex x in \({\mathbb {T}}\). For \(x\in {\mathbb {T}}\backslash \{e\}\), we denote by \( {x_*}\) the parent of x, that is, the neighbour of x which lies on the path from x to the root \( e \), and by \(xi,1\le i\le \nu (x)\) the children of x. We let |x| be the height of the vertex x, that is, the graph distance between the root and x. Fix a tree \({\mathbb {T}}\). For \(\lambda \ge 0\), the \(\lambda \)-biased random walk \((X_n)_{n\ge 0}\) is the Markov chain on the graph \({\mathbb {T}}\) which starts at \( e \) and such that

$$\begin{aligned} \mathrm{P}_{{\mathbb {T}}}(X_{n+1}= {x_*}\,|\, X_n=x)= & {} {\lambda \over \lambda +\nu (x)}, \end{aligned}$$
(1.1)
$$\begin{aligned} \mathrm{P}_{{\mathbb {T}}}(X_{n+1}=xi\,|\, X_n=x)= & {} {1\over \lambda + \nu (x)} \; \; \text{ for } \text{ any } \;\; 1\le i\le \nu (x) . \end{aligned}$$
(1.2)

To define the transition probabilities from the root, we artificially add a parent \({e_*}\) to the root, and we suppose that the Markov chain is reflected at \({e_*}\). We denote by \(\mathrm{P}_{{\mathbb {T}}}\) the quenched probability associated to the Markov chain \((X_n)_n\) on the tree \({\mathbb {T}}\) and by \(\mathbb {P}\), resp. \(\mathbb {P}^*\), the annealed probability obtained by averaging \(\mathrm{P}_{\mathbb {T}}\) over \(\mathrm{GW}\), resp. \(\mathrm{GW}^*\). They are associated to the expectations \(\mathrm{E}_{{\mathbb {T}}}, \mathbb {E}\) and \(\mathbb {E}^*\).
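As an illustration (not part of the original argument), the transition mechanism (1.1)–(1.2) is straightforward to simulate. The following sketch, with helper names of our own choosing, grows the Galton–Watson tree lazily and runs the reflected walk, with the parent \({e_*}\) encoded as the vertex \(-1\).

```python
import random

def biased_walk(p, lam, n_steps, seed=0):
    """Simulate the lambda-biased walk on a Galton-Watson tree.

    p[k] is the offspring probability P(nu = k); the tree is grown
    lazily, and e_* is encoded as the vertex -1 (reflecting)."""
    rng = random.Random(seed)
    parent = {0: -1}              # vertex 0 is the root e
    children = {}
    def kids(v):                  # sample the children of v on first visit
        if v not in children:
            children[v] = []
            for _ in range(rng.choices(range(len(p)), weights=p)[0]):
                u = len(parent)
                parent[u] = v
                children[v].append(u)
        return children[v]
    x, path = 0, [0]
    for _ in range(n_steps):
        if x == -1:                                    # reflected at e_*
            x = 0
        elif rng.random() < lam / (lam + len(kids(x))):
            x = parent[x]          # step to the parent, proba lam/(lam+nu(x))
        else:
            x = rng.choice(kids(x))  # step to a uniform child, proba 1/(lam+nu(x))
        path.append(x)
    return path, parent
```

In the null recurrent regime studied here one would take `lam` equal to the mean `m` of `p`.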

When \(\lambda <m\), the Markov chain is transient for \(\mathrm{GW}^*\)-almost every tree, see Lyons [21]. We refer to the works of Lyons, Pemantle and Peres [24–26] for the study of the transient biased random walk and for open questions. We consider here the null recurrent case \(\lambda =m \in (1,\infty )\). Peres and Zeitouni [31] showed a central limit theorem for the height of the walk.

Theorem

(Peres, Zeitouni [31]) Assume \(m\in (1,\infty ), \lambda =m\) and some exponential moments for \(\nu \). Let \(\sigma ^2:= {m(m-1) \over E_{\mathrm{GW}}[\nu (\nu -1)]}\). For \(\mathrm{GW}^*\)-almost every tree, the process \(\left\{ |X_{\lfloor nt \rfloor }| / \sqrt{ \sigma ^2 n}\right\} _{t\ge 0} \) converges in law towards \((|B_{t}|)_{t\ge 0}\), where \((B_{t})_{t\ge 0}\) is a standard Brownian motion.

This theorem was proved by finding an explicit invariant measure on the space of trees, and by showing an invariance principle for a martingale which approximates the process \((|X_{n}|)_{n\ge 0}\). Dembo and Sun [10] extended the theorem to the case where \({\mathbb {T}}\) is a multi-type Galton–Watson tree, assuming only a moment of order \(4+\varepsilon \), for some \(\varepsilon >0\). A natural question is now to understand the trace of the walk \((X_{n})_{n}\) in the tree \({\mathbb {T}}\). Let \(\mathcal {R}_{n}:=\{X_{k},\, k\le n\}\) be the set of vertices visited by the walk until time n, and \(R_{n}\) its cardinality, also called the range. Notice that \(\mathcal {R}_{n}\) is a tree. We will consider it as a metric space, where each edge has length 1. From this point of view, \(\mathcal {R}_{n}\) is an unlabeled tree. Our main theorem says that \(\mathcal {R}_{n}\), suitably normalized, converges in distribution to the real tree coded by \((|B_{t}|)_{t\in [0,1]}\). We briefly recall its construction, taken from [12]. We refer to [20] for a review on random real trees.

Let g be a continuous function from [0, 1] to \({\mathbb {R}}_{+}\) such that \(g(0)=0\) (it is usually assumed that \(g(1)=0\) but it is not the case here). For any \(s,t \in [0,1]\), define \(d_{g}(s,t):= g(s)+g(t)-2 \min \{g(r); r\in [\min (s,t),\max (s,t)]\}\). Using the equivalence relation \(s\sim t \Leftrightarrow d_{g}(s,t)=0\), we see that \(d_{g}\) defines a metric on the quotient space \({\mathcal {T}}_{g}:= [0,1] / \sim \). The metric space \(({\mathcal {T}}_{g},d_{g})\) is the real tree encoded by g. The space of all real trees is equipped with the Gromov-Hausdorff metric \(d_{G}\) (see Sect. 1 of [20]). Taking for g a normalized Brownian excursion, \({\mathcal {T}}_{g}\) is the continuum random tree, also called Brownian tree, introduced by Aldous [1, 2]. Here we will take for g the reflected Brownian motion \(|\mathbf{B}|:=(|B_{t}|)_{t\in [0,1]}\). In this case, \({\mathcal {T}}_{g}\) can be seen as a Brownian forest explored up to time 1. For \(r>0\), the notation \(r\mathcal {R}_{n}\) denotes the tree \(\mathcal {R}_{n}\) with edge length r.
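As a concrete illustration (ours, not from [12]), the distance \(d_{g}\) can be evaluated directly on samples of the coding function; indices into the sample array play the role of the points s, t.

```python
def d_g(g, s, t):
    """Tree (pseudo-)distance between the points coded by the sample
    indices s and t, for a sampled function g with g[0] == 0."""
    lo, hi = min(s, t), max(s, t)
    return g[s] + g[t] - 2 * min(g[lo:hi + 1])
```

Two indices s, t with `d_g(g, s, t) == 0` code the same point of the quotient tree \({\mathcal {T}}_{g}\); for instance, on the height samples of a walk, times at which the walk sits at the same vertex are identified.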

Theorem 1.1

Assume that \(m\in (1,\infty ), \lambda =m\) and let \(\sigma ^2:= {m(m-1) \over E_{\mathrm{GW}}[\nu (\nu -1)]}\in (0,\infty )\). Under \(\mathbb {P}^*\) (annealed case) and under \(\mathrm{P}_{{\mathbb {T}}}\) for \(\mathrm{GW}^*\)-a.e. tree \({\mathbb {T}}\) (quenched case), the following joint convergence in law holds as \(n\rightarrow \infty \):

$$\begin{aligned} {1\over \sqrt{ \sigma ^2 n}} \left( \mathcal {R}_{n}, \left\{ |X_{\lfloor nt \rfloor }| \right\} _{t\in [0,1]}\right) \Rightarrow \left( {\mathcal {T}}_{|\mathbf{B}| },|\mathbf{B} | \right) \end{aligned}$$

for the Gromov-Hausdorff topology on the space of real trees and the Skorokhod topology on the space of càdlàg functions.

Therefore, asymptotically, the random walk \((X_{n})_{n}\) looks like the contour function of its trace, see Section 3 of [20] (this statement is not very precise because we deal with unlabeled trees here). The theorem also extends the result of Peres and Zeitouni [31] on the convergence of the height of the random walk under a second moment assumption. The idea of the proof is to look at the local times of the random walk. This strategy was used in the papers of Kesten, Kozlov and Spitzer [19] in the case of random walks in random environment on \(\mathbb {Z}\), and of Basdevant and Singh [4–6] in the case of multi-excited random walks on \(\mathbb {Z}\) and on the regular tree. Call an excursion of \((X_{n})_{n}\) the trajectory of the walk before hitting the parent of its starting point. During one excursion, the visited tree marked with the local times of the edges \(( {x_*},x)\) (i.e. the number of times the directed edge has been crossed) forms under \(\mathbb {P}\) a multi-type Galton–Watson tree, with initial type 1. We will show that the successive excursions from the root \( e \) are close to being independent (they are identically distributed but not independent under \(\mathbb {P}\)). Therefore, \(\mathcal {R}_{n}\) is close to a concatenation of i.i.d. multi-type Galton–Watson trees, and we use a result of [8] on the scaling limit of multi-type Galton–Watson trees to complete the proof in the annealed case. In the quenched case, we show that there is an averaging phenomenon, which is reminiscent of (but much easier to prove than) what happens for the large deviations of transient biased random walks on Galton–Watson trees [9].

The same strategy can be applied to the case of random walks in random environment on Galton–Watson trees, see Faraud [14]. We give more details in Sect. 6.

The paper is organized as follows. In Sect. 2, we recall the result of [8] on the scaling limit of certain two-type Galton–Watson trees with edge lengths. In Sect. 3, we describe the process of the local times of an excursion of the biased random walk. In Sect. 4, we construct different reduced trees associated to the trace of an excursion. These reduced trees are simpler to deal with since they fall into the scope of [8], but still contain all the information needed. We prove Theorem 1.1 in Sect. 5. Finally, Sect. 6 deals with the case of random walks in random environment on a Galton–Watson tree.

2 Preliminaries

Let T be a finite rooted ordered tree. We refer to Neveu [30] for the formal construction of a tree. By the representation of [30], we can label the vertices of a tree through the set of words \(\bigcup _{n\ge 0} {\mathbb {N}}^n\). The generation of a vertex is the length of its label, the root being of generation 0. Since the set of words is equipped with the lexicographical order, we can rank the vertices of T from the smallest (the root) to the biggest. This gives a way to explore the tree, called the depth-first search, and the index of a given vertex is its rank in the depth-first search (the index of the root being set to 0).

We put on each edge of the tree a non-negative mark, which stands for its length. When not specified, the length of an edge is set to 1. Therefore the tree T is endowed with a natural metric (or pseudo-metric in the case where some edges have length 0). The height of a vertex is by definition the distance of the vertex to the root. The height function of the tree T is the function that maps any integer \(k\in [\![0,\#T-1 ]\!]\) to the height of the vertex of index k in the depth-first search. A forest is a sequence of finite trees \((T_{i})_{i}\), and the height function of a forest is the concatenation of the height functions of the trees \((T_{i})_{i}\).
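The conventions above can be made concrete; the sketch below (ours, with hypothetical data structures) computes the height function of a forest whose edges carry lengths, unspecified lengths defaulting to 1.

```python
def height_function(forest):
    """Height function of a forest, in depth-first search order.

    Each tree is a pair (children, length): children[v] is the ordered
    list of children of v (the root is 0), and length[(v, u)] is the
    length of the edge from v to its child u (default 1)."""
    H = []
    for children, length in forest:
        stack = [(0, 0)]                   # (vertex, height of vertex)
        while stack:
            v, h = stack.pop()
            H.append(h)                    # vertices popped in depth-first order
            for u in reversed(children.get(v, [])):   # leftmost child on top
                stack.append((u, h + length.get((v, u), 1)))
    return H
```

The height function of a forest is then the concatenation of the individual height functions, as in the text.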

Let us introduce the result of [8] that will be used in our proof. Let T be a leafed Galton–Watson tree with edge lengths (as defined in Section 1.1 of [8]); that is, T is a two-type Galton–Watson tree with edge lengths, with types denoted by s and f, such that vertices of type s give no offspring (they are sterile). The offspring distribution of a vertex of type f can be represented by a random point process \(\theta =(\delta _{t(i),\ell (i)})_{i\le N}\) on \(\{s,f\}\times {\mathbb {R}}_{+}\), where N is the number of children, t(i) is the type (s or f) of the i-th child for the lexicographical order, and \(\ell (i)\) is the length of its edge. The type of the root is f, and it gives birth according to \(\theta \). Children at generation 1 of type s have no offspring, while the ones of type f give birth independently according to i.i.d. copies of \(\theta \), and so on. We suppose that \(\sum _{i=1}^{N} \mathbf{1}_{\{ t(i)=f \}}\) has mean 1 and some finite variance \(\Sigma _{f}^2\in (0,\infty )\) (we take \(\Sigma _{f}\ge 0\)). This means that the Galton–Watson tree composed of the vertices of type f is critical. Moreover, we suppose that

(i) \(E[N]=: C_{1}^{-1}<\infty \),

(ii) \(y^{2}E\left[ \sum _{i=1}^N \mathbf{1}_{\{ t(i)=f, \ell (i)>y\}} \right] \) goes to 0 as \(y\rightarrow \infty \),

(iii) \(y^2P\left( \max _{i\le N,t(i)=s}\ell (i)>y\right) \) goes to 0 as \(y\rightarrow \infty \).

We denote by \(C_2\) the quantity \(E\left[ \sum _{i=1}^N \ell (i)\mathbf{1}_{\{ t(i)=f \}}\right] \), which is finite thanks to (ii). Finally, we take i.i.d. trees \((T_{i})_{i}\) distributed as T, and we call H the height function associated to the forest, and \(H_{f}\) the height function of the forest restricted to the vertices of type f. The following result comes from Theorem 1 of [8] (beware that \(H_f\) is different from the process \(H^1\) introduced in [8], since in our case the lengths are not reset to 1). It states that the height function H is asymptotically given by a deterministic rescaling in time of \(H_{f}\). Loosely speaking, the forest composed of the vertices of type f captures all the randomness.

Theorem A

[8] The following joint convergence in law holds as \(n\rightarrow \infty \) for the Skorokhod topology of càdlàg functions:

$$\begin{aligned} {1\over \sqrt{n}}\left( H(\lfloor nt\rfloor ),H_{f}(\lfloor nt\rfloor )\right) _{t\ge 0} \Rightarrow {2C_{2}\over \Sigma _{f}} \left( |B_{C_{1} t }|,|B_{t}|\right) _{t\ge 0}, \end{aligned}$$

where B is a standard Brownian motion.

To be precise, this result is obtained by combining Theorem 1 of [8] applied on the one hand to the leafed Galton–Watson tree with edge lengths T, and on the other hand to the single-type Galton–Watson tree with edge lengths composed only of the vertices of type f.

Let \(\chi (k)\) be the index (for the depth-first search) of the k-th vertex of type f visited by the depth-first search in the forest. Let \( \bar{\ell }(n)\) be the maximal length of the edges explored by the depth-first search until n vertices of type f have been visited. The following lemma can be found in [8]: (i) is the equation lying right below equation (2.10) in the proof of Proposition 5 (where \(\chi \) is denoted by \(\psi \)), and (ii) comes from equation (2.4) in the proof of Proposition 5, together with our condition (iii), which controls the lengths of the edges of type s.

Lemma 2.1

The following convergences hold in probability:

$$\begin{aligned} (i)\; \lim _{k\rightarrow \infty } {\chi (k) \over k} \mathop {=}\limits ^{(\mathrm{P})} {1\over C_{1}}; \quad (ii)\; \lim _{n\rightarrow \infty } {{ \bar{\ell }}(n) \over \sqrt{n} } \mathop {=}\limits ^{(\mathrm{P})} 0. \end{aligned}$$

3 Description of the process of local times

Recall that \({\mathbb {T}}\) is a Galton–Watson tree, in which we artificially added a parent \({e_*}\) to the root \( e \). Let \(\tau _{{e_*}}^{(1)}:=\min \{n\ge 1\,:\, X_{n}={e_*}\}\) be the hitting time of \({e_*}\) in \({\mathbb {T}}\). Setting \(X_{-1}={e_*}\) (which is coherent with the setting \(X_0= e \)), we let for each vertex \(x\in {\mathbb {T}}{\setminus }\{{e_*}\}\),

$$\begin{aligned} N_{x}^{(1)}:=\sum _{n= 0}^{\tau _{{e_*}}^{(1)}} \mathbf{1}_{\{X_{n-1}= {x_*},X_{n}=x\}} \end{aligned}$$

which stands for the number of crossings of the directed edge \(( {x_*},x)\) during one excursion. Notice that in this case, \(N_{ e }^{(1)}=1\). More generally, if we define for \(k\ge 2, \)

$$\begin{aligned} \tau _{{e_*}}^{(k)} := \min \{n>\tau _{{e_*}}^{(k-1)}\,:\, X_{n} ={e_*}\}, \end{aligned}$$

then we can set for each vertex \(x\in {\mathbb {T}}{\setminus }\{{e_*}\}, \)

$$\begin{aligned} N_{x}^{(k)} := \sum _{n= 0}^{\tau _{{e_*}}^{(k)}} \mathbf{1}_{\{X_{n-1}= {x_*},X_{n}=x\}}. \end{aligned}$$

The random variable \(\tau _{{e_*}}^{(k)}\) stands for the k-th visit time at \({e_*}\), and \(N_{x}^{(k)}\) is the local time on the directed edge \(( {x_*},x)\) up to that time. Notice that \(N_{ e }^{(k)}=k\).
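For illustration (a sketch of ours, matching the convention \(X_{-1}={e_*}\) above, with \({e_*}\) encoded as \(-1\)), the local times \(N^{(k)}\) can be read off a trajectory by counting the crossings of each directed edge \(( {x_*},x)\) up to the k-th visit to \({e_*}\):

```python
from collections import Counter

def edge_local_times(path, parent, k):
    """N^{(k)}: number of crossings of each directed edge (x_*, x) up to
    the k-th visit to e_* (encoded as -1), for path = [X_0, X_1, ...]."""
    N = Counter()
    visits, prev = 0, -1          # prev starts at e_*, i.e. X_{-1} = e_*
    for x in path:
        if parent.get(x) == prev:  # the walk moved from the parent x_* to x
            N[x] += 1
        if x == -1:
            visits += 1
            if visits == k:
                break
        prev = x
    return N
```

On the short hand-built trajectory used in the test below, the walk makes two excursions, and one checks \(N_{ e }^{(k)}=k\).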

Lemma 3.1

Let \(k\ge 1\). Under \(\mathbb {P}\), the marked tree \(\big (\{x\in {\mathbb {T}}\backslash \{{e_*}\}\,:\, N_{x}^{(k)}\ge 1\}, N^{(k)} \big )\) is a multi-type Galton–Watson tree with initial type k.

Proof

Let us construct the tree \({\mathbb {T}}\) and the Markov chain \((X_{n})_{n\ge 0}\) simultaneously. Recall from the setting of Neveu [30] that we can see \({\mathbb {T}}\backslash \{{e_*}\}\) as a subset of the set of words \(U:=\bigcup _{n\ge 0} {\mathbb {N}}^n\). On each word \(x\in U\) independently, we let \(\nu (x)\) be distributed as \(\nu \). In the case where x is a vertex of \({\mathbb {T}}\), \(\nu (x)\) is the number of children of x in \({\mathbb {T}}\). Furthermore, on each word \(x\in U\) independently, we attach a sequence \(\mathcal P_x\) of i.i.d. random variables, each equal to a child xi (with \(i\le \nu (x)\)) with probability \({1\over m+\nu (x)}\) and to the parent \( {x_*}\) with probability \({m\over m+\nu (x)}\). Then the Markov chain \((X_n)_{n\ge 0}\) is a function of the processes \((\mathcal P_x,x\in U)\). For a child xi of x, we observe that \(N_{xi}^{(k)}\) is the number of appearances of xi in the sequence \(\mathcal P_x\) until \( {x_*}\) has appeared \(N_{x}^{(k)}\) times. In particular, the law of \((N_{xi}^{(k)},i\le \nu (x))\) given \((N_{y}^{(k)},|y|\le |x|)\) only depends on \(N_{x}^{(k)}\). It implies the lemma. \(\square \)

Let us consider the setting of Section 1.2 of [8]. Let

$$\begin{aligned} \mathbf{T}:= \{x\in {\mathbb {T}}\backslash \{{e_*}\} \,:\, N_{x}^{(1)}\ge 1\} \end{aligned}$$

be the subtree of \({\mathbb {T}}\) visited by the walk during one excursion. The marked tree \((\mathbf{T},N^{(1)})\) is a multi-type Galton–Watson tree, and we want to check that the assumptions \(({\mathbf {H}}_M)\), \(({\mathbf {H}}_R^{1})\) and \(({\mathbf {H}}_Q)\) of [8] hold. Indeed, in the next section we will build from \((\mathbf{T},N^{(1)})\) a leafed Galton–Watson tree with edge lengths, which, according to a result of [8], will satisfy hypotheses (i), (ii) and (iii) of Theorem A if \(\mathbf{T}\) satisfies these assumptions. The mean matrix is, for \(i,j\ge 1, \)

$$\begin{aligned} m_{i,j}:=\mathbb {E}\left[ \sum _{|x|=1} \mathbf{1}_{\{N_{x}^{(i)}=j\}} \right] = \left( {\begin{array}{c}i+j-1\\ j\end{array}}\right) {m^{i+1}\over (m+1)^{i+j}}. \end{aligned}$$

We notice that the vectors \((a_{i})_{i\ge 1}\) and \((b_{i})_{i\ge 1}\) given by \(a_{i}:=(m-1)m^{-i}\) and \(b_{i}:=(1-m^{-1})\, i\) are respectively left and right eigenvectors associated to the eigenvalue 1, normalized such that \(\sum _{i\ge 1} a_{i}=1\) and \(\sum _{i\ge 1} a_{i}b_{i}=1\). This ensures that \(\mathbf{T}\) satisfies hypothesis \(({\mathbf {H}}_M)\). In this context, a version of the many-to-one lemma reads as follows (its proof goes by induction on n). For any bounded function \(f:{\mathbb {N}}^n\rightarrow {\mathbb {R}}\), we have, denoting by \(x_{i}\) the ancestor of x at generation i,

$$\begin{aligned}&\mathbb {E}\left[ \sum _{|x|=n,N_{x}^{(k)}\ge 1} f(N_{x_{1}}^{(k)},N_{x_{2}}^{(k)},\ldots ,N_{x_{n-1}}^{(k)},N_{x}^{(k)}) \right] \nonumber \\&\quad = k\mathbb {E}\left[ {1\over \hat{N}_{n}}f(\hat{N}_{1},\hat{N}_2,\ldots ,\hat{N}_{n-1},\hat{N}_{n}) \Big | \hat{N}_{0} = k\right] \end{aligned}$$
(3.3)

where \((\hat{N}_{i})_{i\ge 0}\) is a Markov chain on \({\mathbb {N}}\backslash \{0\}\) with transition probabilities from i to j given by

$$\begin{aligned} m_{i,j}{b_{j}\over b_{i}} = \left( {\begin{array}{c}i+j-1\\ i\end{array}}\right) {m^{i+1}\over (m+1)^{i+j}}. \end{aligned}$$

We can check that the probability distribution \(\pi \) on \({\mathbb {N}}\backslash \{0\}\) given by \(\pi _{i}:=a_{i}b_{i}\) is then a reversible measure for \((\hat{N}_{i})_{i\ge 0}\). The return time at 1 of this Markov chain is easily controlled by the following lemma.
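These eigenvector relations and the reversibility of \(\pi \) are easy to confirm numerically. The sketch below (ours, not part of the proof) truncates the infinite sums at a fixed level, for one value of m.

```python
from math import comb

m, K = 1.5, 400                 # any m > 1; K truncates the infinite sums
a = [(m - 1) * m ** (-i) for i in range(1, K + 1)]      # left eigenvector a_i
b = [(1 - 1 / m) * i for i in range(1, K + 1)]          # right eigenvector b_i

def mean_mat(i, j):             # mean matrix m_{i,j}
    return comb(i + j - 1, j) * m ** (i + 1) / (m + 1) ** (i + j)

def p_hat(i, j):                # transition probability m_{i,j} b_j / b_i
    return mean_mat(i, j) * b[j - 1] / b[i - 1]

for i in range(1, 20):
    # right eigenvector: sum_j m_{i,j} b_j = b_i
    assert abs(sum(mean_mat(i, j) * b[j - 1] for j in range(1, K + 1)) - b[i - 1]) < 1e-9
    # left eigenvector: sum_j a_j m_{j,i} = a_i
    assert abs(sum(a[j - 1] * mean_mat(j, i) for j in range(1, K + 1)) - a[i - 1]) < 1e-9
    # detailed balance: pi_i p_hat(i,j) = pi_j p_hat(j,i) with pi_i = a_i b_i
    for j in range(1, 20):
        assert abs(a[i - 1] * b[i - 1] * p_hat(i, j)
                   - a[j - 1] * b[j - 1] * p_hat(j, i)) < 1e-12
```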

Lemma 3.2

Let \(\hat{\gamma }_{1}:=\min \{i\ge 1\,:\, \hat{N}_{i} =1\}\). There exists \(r>0\) such that \(\mathbb {E}\left[ \mathrm{e}^{r\hat{\gamma }_{1}}\, |\, \hat{N}_{0}=1\right] <\infty \).

Proof

A computation, with \(\hat{p}_{i,j}\) denoting the transition probabilities of the Markov chain \((\hat{N}_{i})_{i\ge 0}\), leads to

$$\begin{aligned} \sum _{j\ge 1}\hat{p}_{i,j} j =1+\frac{1}{m}(i+1). \end{aligned}$$

Now, for \(i_{0}\) large enough, we have \(1+\frac{1}{m}(i+1)<d\, i\) for all \(i>i_0\) and some \(d<1\). It implies that, starting in the set \(\{i\le i_{0}\}\), the return time to this set admits exponential moments (see e.g. Theorem 15.2.5 in [27]). The probability to go from \(i\le i_{0}\) to 1 in one step is uniformly bounded from below by some positive constant. It implies that the number of hits of the set \(\{i\le i_{0}\}\) before time \(\hat{\gamma }_{1}\) is stochastically dominated by a geometric random variable. Therefore \(\hat{\gamma }_{1}\) is stochastically dominated by the sum of a geometric number of i.i.d. random variables which have exponential moments. It implies the lemma. \(\square \)
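The drift computation in the proof above can be checked numerically; in this sketch (ours), \(\hat p_{i,j}\) is the transition probability displayed before Lemma 3.2.

```python
from math import comb

m = 1.5                                  # any m > 1

def p_hat(i, j):                         # transition probability of (N_hat)
    return comb(i + j - 1, i) * m ** (i + 1) / (m + 1) ** (i + j)

for i in range(1, 30):
    # rows sum to 1 (negative-binomial distribution of the next state)
    assert abs(sum(p_hat(i, j) for j in range(1, 600)) - 1) < 1e-12
    # drift identity: sum_j p_hat(i,j) j = 1 + (i+1)/m
    drift = sum(p_hat(i, j) * j for j in range(1, 600))
    assert abs(drift - (1 + (i + 1) / m)) < 1e-9
```

The drift being smaller than \(i\) for large \(i\) is exactly the contraction used in the proof.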

This ensures that equation (A.1) of the Appendix of [8] is satisfied with \(V(i)=i\), and therefore \(\mathbf{T}\) satisfies condition \(({\mathbf {H}}_R^{alt})\) of [8], and thus condition \(({\mathbf {H}}_R^{1})\) according to Proposition 9 of [8]. Finally, a computation yields

$$\begin{aligned} Q_{j,k}^{i}:= & {} \mathbb {E}\left[ \left( \sum _{|x|=1} \mathbf{1}_{\{N_{x}^{(i)}=j\}}\right) \left( \sum _{|x|=1} \mathbf{1}_{\{N_{x}^{(i)}=k\}}\right) \right] -\delta _{j,k}m_{j,k} \\= & {} \left( {\begin{array}{c}i+j+k-1\\ i-1,j,k\end{array}}\right) E_{\mathrm{GW}}[\nu (\nu -1)]\frac{m^i}{(m+2)^{i+j+k}} \end{aligned}$$

and

$$\begin{aligned} \eta ^2:= \sum _{i,j,k\ge 1} a_i b_j Q_{j,k}^{i}b_k =2\frac{E_{\mathrm{GW}}[\nu (\nu -1)]}{m^2}<\infty , \end{aligned}$$

which ensures that \(\mathbf{T}\) satisfies hypothesis \(({\mathbf {H}}_Q)\).
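The value of \(\eta ^2\) can be confirmed by truncating the triple sum numerically (a sketch of ours; the common factor \(E_{\mathrm{GW}}[\nu (\nu -1)]\) appears on both sides and is divided out).

```python
from math import exp, lgamma, log

m, K = 1.5, 60                           # any m > 1; K truncates the sums

def a(i): return (m - 1) * m ** (-i)
def b(i): return (1 - 1 / m) * i

def q(i, j, k):
    """Q^i_{j,k} divided by E[nu(nu-1)]: the multinomial coefficient
    (i+j+k-1; i-1, j, k) times m^i / (m+2)^(i+j+k), via log-gamma."""
    logmult = lgamma(i + j + k) - lgamma(i) - lgamma(j + 1) - lgamma(k + 1)
    return exp(logmult + i * log(m) - (i + j + k) * log(m + 2))

eta2 = sum(a(i) * b(j) * q(i, j, k) * b(k)
           for i in range(1, K + 1)
           for j in range(1, K + 1)
           for k in range(1, K + 1))
assert abs(eta2 - 2 / m ** 2) < 1e-6     # eta^2 = 2 E[nu(nu-1)] / m^2
```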

4 Reduction of trees

Following an idea of Miermont [28] further developed in [8], we will see that the important vertices in the marked tree \((\mathbf{T},N^{(1)})\) are the vertices of type 1. Therefore, we choose to work with some simpler trees constructed as follows.

The tree \(\mathbf{T}^{(r)}\)

Fig. 1

The tree \(\mathbf{T}\) (top) and the associated tree \(\mathbf{T}^{(r)}\) (bottom). The notation (2, 5) means that the vertex has type \(N_{x}^{(1)}=2\) and is the 5-th distinct vertex visited by the walk

Draw the tree \(\mathbf{T}\) in the plane and erase all the edges (but keep the vertices, remember the genealogy and the trajectory \((X_{n})_{n\le \tau _{{e_*}}^{(1)}}\)). Draw an edge between x and any descendant y such that x is the youngest ancestor of y in \(\mathbf{T}\) with type 1 (excluding y itself). The length of the edge between x and y is set to be \(|y|-|x|\), where |z| is the generation of z in the tree \({\mathbb {T}}\), or equivalently in \(\mathbf{T}\). Re-order the resulting tree so that the order in which the walk \((X_{n})_{n}\) first hits the vertices is given by the depth-first search order (this is indeed possible, since an edge \(( {x_*},x)\) is crossed upwards only once when the vertex \(x\in {\mathbb {T}}\) has type 1). Call \(\mathbf{T}^{(r)}\) the tree obtained in this way, see Fig. 1. The tree \(\mathbf{T}^{(r)}\) will be used to encode the trace \(\mathcal {R}_{n}\). If we set the type of a vertex z to f if \(N_{z}^{(1)}=1\) and to s otherwise, then we see from Proposition 1 of [8] that \(\mathbf{T}^{(r)}\) is a leafed Galton–Watson tree with edge lengths as introduced in Sect. 2. This reduced tree is studied in Sections 1.3 and 3 of [8]. Proposition 8 of [8] ensures that \(\sum _{i=1}^{N} \mathbf{1}_{\{ t(i)=f \}}\) has mean 1 and variance \(\Sigma _{f}^2 = {\eta ^2 \over b_{1}^2 a_{1}}\). Proposition 7 of [8] ensures that (i) is satisfied with \(C_{1} =a_{1} \). Conditions (ii) and (iii) are exactly assumption \(({\mathbf {H}}_R^{1})\) of [8], which we proved to hold in the last subsection. Section 3.4 of [8] gives \(C_{2}={1\over a_{1}b_{1}}\).

The tree \(\mathbf{T}^{(w)}\)

Consider the tree \(\mathbf{T}^{(r)}\), and let x be a vertex of type f. For any child y of x in \(\mathbf{T}^{(r)}\) with type s, denoting by \(k_{y}\) the number of times the vertex y has been visited by the walk \((X_{n})_{n\le \tau _{{e_*}}^{(1)}}\) in \({\mathbb {T}}\), root \(k_{y}-1\) copies of y at x (which makes \(k_y\) copies of y in total). Root also \(k_{x}-1\) new edges of length 0 at x. Do this for any vertex x of type f, and re-order the new edges so that the height function of the tree is exactly given by \((|X_{n}|)_{n\le \tau _{{e_*}}^{(1)}}\). Call \(\mathbf{T}^{(w)}\) this tree. Again, \(\mathbf{T}^{(w)}\) is a leafed Galton–Watson tree with edge lengths as introduced in Sect. 2. The old vertices inherit their types s or f from \(\mathbf{T}^{(r)}\), whereas the types of the newly created vertices are all set to s. The values of \(\Sigma _{f}\) and \(C_{2}\) remain unchanged, but we need to compute \(C_{1}\). We observe that in this case \(C_{1}^{-1}\) is by construction

$$\begin{aligned} \mathbb {E}\left[ k_{ e } -1 + \sum _{x\in {\mathbb {T}}\backslash \{{e_*}, e \}} k_x\mathbf{1}_{\{ N_{x_{i}}^{(1)}\ne 1,\forall i \in [\![ 1, |x|]\!] \}} + \sum _{x\in {\mathbb {T}}\backslash \{{e_*}, e \}} \mathbf{1}_{\{ N_{x_{i}}^{(1)}\ne 1,\forall i \in [\![ 1, |x|-1]\!] \}}{} \mathbf{1}_{\{ N_{x}^{(1)}=1\}} \right] \end{aligned}$$

where we recall that \(x_{i}\) is the ancestor of x at generation i in \({\mathbb {T}}\). We notice that the term inside the expectation can be rewritten as

$$\begin{aligned} 2 \sum _{ x\in {\mathbb {T}}\backslash \{{e_*}, e \}} N_{x}^{(1)}{} \mathbf{1}_{\{ N_{x_{i}}^{(1)}\ne 1,\forall i \in [\![ 1, |x|-1]\!] \}}. \end{aligned}$$

Therefore

$$\begin{aligned} C_{1}^{-1} = 2 \sum _{\ell \ge 1} \mathbb {E}\left[ \sum _{|x|=\ell } N_{x}^{(1)}{} \mathbf{1}_{\{ N_{x_{i}}^{(1)}\ne 1,\forall i \in [\![ 1, \ell -1]\!] \}} \right] . \end{aligned}$$

By Eq. (3.3), we get

$$\begin{aligned}&2\mathbb {E}\left[ \sum _{x\in {\mathbb {T}}\backslash \{{e_*}, e \}} N_{x}^{(1)}\mathbf{1}_{\{ N_{x_{i}}^{(1)}\ne 1,\forall i \in [\![ 1, |x|-1]\!] \}} \right] \\&\quad = 2\sum _{\ell \ge 1} \mathbb {P}\left( \hat{N}_{i}\ne 1,\forall i \in [\![ 1, \ell -1]\!] \,|\, \hat{N}_{0} =1 \right) \\&\quad = 2 \mathbb {E}\left[ \hat{\gamma }_{1} \,|\, \hat{N}_{0} =1 \right] = {2\over \pi (1)}=\frac{2}{a_1b_1} \end{aligned}$$

where we recall that \(\pi \) is the invariant probability measure of the Markov chain \((\hat{N}_{k})_{k}\). Therefore the value of \(C_{1}\) is now \(\frac{a_1b_1}{2}\).

Remark

During the procedure, and by an abuse of notation, a vertex \(x\in {\mathbb {T}}\) was referred to by the same name in the trees \(\mathbf{T}, \mathbf{T}^{(r)}\) and \(\mathbf{T}^{(w)}\). We will always do so.

Let \((\mathbf{T}_i,\mathbf{T}_{i}^{(r)},\mathbf{T}_{i}^{(w)})_{i\ge 1}\) be i.i.d. copies of the trees \((\mathbf{T},\mathbf{T}^{(r)},\mathbf{T}^{(w)})\). We call \(\mathbf{F}, \mathbf{F}^{(r)}, \mathbf{F}^{(w)}\) the associated forests. We let \(H^{(r)}\) and \(H^{(w)}\) be the height functions of \(\mathbf{F}^{(r)}\) and \(\mathbf{F}^{(w)}\). Theorem A yields the following proposition.

Proposition 4.1

The following joint convergence in law holds as \(n\rightarrow \infty \) for the Skorokhod topology on the space of càdlàg functions:

$$\begin{aligned} {1\over \sqrt{\sigma ^2 n}}\left( H^{(r)}( {\lfloor nt \rfloor }), H^{(w)}({\lfloor nt \rfloor })\right) _{t\ge 0} \Rightarrow \left( |B_{\frac{2}{b_1}t}|, |B_{ t}|\right) _{t\ge 0} \end{aligned}$$

where \((B_{t})_{t\ge 0}\) is a standard Brownian motion.

Proof

Theorem A applied to \(\mathbf{F}^{(r)}\) implies that we have the following convergence in law:

$$\begin{aligned} \frac{1}{\sqrt{n}}\left( H^{(r)}( {\lfloor nt \rfloor }),H_f^{(r)}( {\lfloor nt \rfloor })\right) _{t\ge 0} \Rightarrow {2b_1\sqrt{a_1}\over \eta }(a_1b_1)^{-1}\left( |B_{a_1 t}|, |B_t| \right) _{t\ge 0} \end{aligned}$$

where \(H_f^{(r)}\) denotes the height function of \(\mathbf{F}^{(r)}\) restricted to vertices of type f. Similarly, Theorem A implies that

$$\begin{aligned} \frac{1}{\sqrt{n}}\left( H^{(w)}( {\lfloor nt \rfloor }),H_f^{(w)}( {\lfloor nt \rfloor })\right) _{t\ge 0} \Rightarrow {2b_1\sqrt{a_1}\over \eta }(a_1b_1)^{-1}\left( |B_{{a_{1}b_{1}\over 2} t}|, |B_t| \right) _{t\ge 0} \end{aligned}$$

where \(H_f^{(w)}\) denotes the height function of \(\mathbf{F}^{(w)}\) restricted to the vertices of type f. Finally, notice that \(H_f^{(w)}=H_f^{(r)}\). Using the Brownian scaling with factor \(\frac{a_1b_1}{2}\) yields the proposition, with \(\sigma ^2=\frac{2b_{1}}{\eta ^2}=\frac{m(m-1)}{E_{\mathrm{GW}}[\nu (\nu -1)]}\). \(\square \)

5 Proof of Theorem 1.1

Recall that \(\mathcal {R}_{n}\) denotes the set of vertices visited by the walk before time n, and that \(R_{n}\) is the cardinality of \(\mathcal {R}_{n}\).

Lemma 5.1

Let \({\varepsilon }\in (0,1)\). Let \(\mathcal S_{{\varepsilon }}\) be the event

$$\begin{aligned} \mathcal S_{{\varepsilon }} := \left\{ \forall \, n\ge 1,\, \sum _{|x|=n} 1 \ge {\varepsilon }m^n \right\} . \end{aligned}$$

There exists a constant \(c>0\) such that for any \(a\ge 1\) and \(k\ge 1\),

$$\begin{aligned} \mathbb {P}\left( \mathcal S_{{\varepsilon }},R_{\tau _{{e_*}}^{(ka)}} < k^2 \right) \le 3\mathrm{e}^{-c{\varepsilon }\sqrt{a}}. \end{aligned}$$
(5.4)

Proof

Since the excursions from the root are i.i.d. under the quenched law, the probability \(\mathrm{P}_{{\mathbb {T}}}\left( R_{\tau _{{e_*}}^{(ka)}} < k^2 \right) \) is smaller than \(\mathrm{P}_{{\mathbb {T}}}(R_{ \tau _{{e_*}}^{(1)} }<k^2)^{ka}\). On the event \(\left\{ \mathrm{P}_{{\mathbb {T}}}(R_{\tau _{{e_*}}^{(1)}}\ge k^2) \ge {1\over k\sqrt{a} }\right\} \), we get that \( \mathrm{P}_{{\mathbb {T}}}\left( R_{\tau _{{e_*}}^{(ka)}} < k^2 \right) \le \left( 1- {1\over k\sqrt{a}}\right) ^{ka}\le \mathrm{e}^{-\sqrt{a}}\). Therefore, in order to prove (5.4), we only need to bound the probability

$$\begin{aligned} \mathrm{GW}\left( \mathcal S_{{\varepsilon }}, \mathrm{P}_{{\mathbb {T}}}(R_{\tau _{{e_*}}^{(1)}} \ge k^2) < {1\over k\sqrt{a}} \right) . \end{aligned}$$

For any \(x\in {\mathbb {T}}\), let \((X_{n}^{(x)})_{n\ge 0}\) be the Markov chain starting at x. We can couple all \((X_{n}^{(x)},n\ge 0, x\in {\mathbb {T}})\) so that \(X_{n}^{(x)}\) is the trajectory of \((X_{n})_{n\ge 0}\) after the first visit time at x. Let \(\mathcal E_{x,k}\) be the event that the walk \((X_{n}^{(x)})_{n\ge 0}\) visits more than \(k^2\) distinct vertices before hitting \( {x_*}\). Notice that under \(\mathrm{P}_{{\mathbb {T}}}\), the events \((\mathcal E_{x,k},|x|=\ell )\) are mutually independent and independent of \(F_{\ell }:=\sigma \{N_{x}^{(1)},|x|\le \ell \}\). The probability \(\mathrm{P}_{{\mathbb {T}}}(R_{\tau _{{e_*}}^{(1)}} \ge k^2)\) is greater than the probability that there exists x such that \(N_{x}^{(1)}\ge 1\) and \(\mathcal E_{x,k}\) holds. By comparison with a one-dimensional random walk, \(\mathrm{P}_{{\mathbb {T}}}(N_{x}^{(1)}\ge 1) = {m - 1\over m^{|x|+1}-1}\). Using independence, we have for any \(\ell \ge 1, \)

$$\begin{aligned} \mathrm{P}_{{\mathbb {T}}}(R_{\tau _{{e_*}}^{(1)}} \ge k^2) \ge {m - 1\over m^{\ell +1}-1} \mathrm{P}_{{\mathbb {T}}}\left( \bigcup _{|x|=\ell } \mathcal E_{x,k}\right) . \end{aligned}$$

Choosing the largest integer \(\ell \ge 1\) such that \({m - 1\over m^{\ell +1}-1} \ge {2\over k\sqrt{a} }\), we get \(\mathrm{P}_{{\mathbb {T}}}(R_{\tau _{{e_*}}^{(1)}} \ge k^2) \ge {2\over k\sqrt{a} } \mathrm{P}_{{\mathbb {T}}}\left( \bigcup _{|x|=\ell } \mathcal E_{x,k}\right) \) hence, for such an \(\ell \),

$$\begin{aligned} \mathrm{GW}\left( \mathcal S_{{\varepsilon }}, \mathrm{P}_{{\mathbb {T}}}(R_{\tau _{{e_*}}^{(1)}} \ge k^2)< {1\over k\sqrt{a}} \right)\le & {} \mathrm{GW}\left( \mathcal S_{{\varepsilon }}, \mathrm{P}_{{\mathbb {T}}}\left( \bigcup _{|x|=\ell } \mathcal E_{x,k}\right)<1/2 \right) \\\le & {} \mathrm{GW}\left( \sum _{|x|=\ell } 1 \ge {\varepsilon }m^\ell , \mathrm{P}_{{\mathbb {T}}}\left( \bigcup _{|x|=\ell } \mathcal E_{x,k}\right) <1/2 \right) . \end{aligned}$$

By independence, we have

$$\begin{aligned} \mathbb {P}\left( \bigcap _{|x|=\ell } \mathcal E_{x,k}^c \; \Big |\; \sum _{|x|=\ell } 1 \ge {\varepsilon }m^\ell \right) \le \mathbb {P}\left( R_{\tau _{{e_*}}^{(1)}}< k^2 \right) ^{{\varepsilon }m^\ell }. \end{aligned}$$

Notice that the subtree of \({\mathbb {T}}\) composed of vertices x of type \(N_{x}^{(1)}=1\) is a critical Galton–Watson tree with finite variance. It implies that with probability greater than \({c\over k}\) (for some constant \(c>0\)), the number of vertices of type 1 is greater than \(k^2\). In particular, \(\mathbb {P}\left( R_{\tau _{{e_*}}^{(1)}} < k^2\right) \le 1- {c\over k}\). It implies that

$$\begin{aligned} \mathbb {P}\left( \bigcap _{|x|=\ell } \mathcal E_{x,k}^c \; \Big |\; \sum _{|x|=\ell } 1 \ge {\varepsilon }m^\ell \right) \le \mathrm{e}^{-c'{\varepsilon }\sqrt{a}}. \end{aligned}$$

By Markov's inequality, this yields

$$\begin{aligned} \mathrm{GW}\left( \mathrm{P}_{{\mathbb {T}}}\left( \bigcap _{|x|=\ell } \mathcal E_{x,k}^c \right) > 1/2 \; \Big | \; \sum _{|x|=\ell } 1 \ge {\varepsilon }m^\ell \right) \le 2\mathrm{e}^{-c'{\varepsilon }\sqrt{a}}. \end{aligned}$$

The proof is complete. \(\square \)
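The one-dimensional comparison used in the proof above is the classical gambler's ruin: on the path from \({e_*}\) to x, the induced walk steps towards the root with probability \(m/(m+1)\) and towards x with probability \(1/(m+1)\). The sketch below (ours, not part of the proof) solves the harmonic equations and recovers the formula \((m-1)/(m^{|x|+1}-1)\).

```python
def hit_prob(m, n):
    """P(the biased walk on {0,...,n} with down-probability m/(m+1),
    started at 1, hits n before 0), via the harmonic recursion
    (m+1) h(k) = h(k+1) + m h(k-1)."""
    h = [0.0, 1.0]                       # h(0) = 0; normalize h(1) = 1 for now
    for k in range(1, n):
        h.append((m + 1) * h[k] - m * h[k - 1])
    return h[1] / h[n]                   # rescale so that h(n) = 1

m = 1.7
for depth in range(0, 6):                # |x| = depth, so n = depth + 1
    n = depth + 1
    assert abs(hit_prob(m, n) - (m - 1) / (m ** n - 1)) < 1e-12
```

The recursion has the explicit solution \(h(k)=(m^k-1)/(m-1)\), which gives the closed form used in the text.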

The next lemma shows that with high probability we cannot find any long path in the tree consisting only of vertices y whose local times are at least 2. We call \(]\!] e ,x [\![\) the set of vertices that lie on the path from the root \( e \) to the vertex x, excluding \( e \) and x.

Lemma 5.2

There exists a constant \(c>0\) such that for any \(\ell ,k\ge 1\),

$$\begin{aligned} \mathbb {P}\left( \exists |x|\ge \ell \,:\, N_{y}^{(k)} \ge 2,\, \forall \, y\in ]\!] e ,x [\![ \; \mathrm{s.t.} \;\, |y|\in [\ell /2,\ell ] \right) \le 2 k^2\mathrm{e}^{-c \ell } . \end{aligned}$$
(5.5)

Proof

We set by convention \(N^{(0)}_{x}=0\) for any \(x\in {\mathbb {T}}\). The event in (5.5) is contained in the union of the following two events

$$\begin{aligned} \mathcal E_{1}:= & {} \bigcup _{i=1}^k \bigcup _{|x|=\ell } \left\{ \forall \, y\in ]\!] e ,x [\![ \; \mathrm{s.t.} \;\, |y|\in [\ell /2,\ell ], N^{(i)}_{y}-N^{(i-1)}_{y}\ge 2 \right\} , \\ \mathcal E_{2}:= & {} \bigcup _{1\le i< j\le k} \bigcup _{|y|=\ell /2} \left\{ N^{(i)}_{y}-N^{(i-1)}_{y}\ge 1, N^{(j)}_{y}-N^{(j-1)}_{y}\ge 1 \right\} . \end{aligned}$$

In words, \(\mathcal E_{1}\) is the event that there exists an excursion from the root during which we can find a path from generation \(\ell /2\) to generation \(\ell \) along which the walk crossed every (directed) edge at least twice. If this is not the case, then necessarily two distinct excursions from the root crossed the same vertex at generation \(\ell /2\); this is our event \(\mathcal E_{2}\). Let us bound both probabilities. By the union bound, we have

$$\begin{aligned} \mathbb {P}(\mathcal E_{1})\le & {} k\mathbb {P}\left( \bigcup _{|x|=\ell }\left\{ \forall \, y\in ]\!] e ,x [\![ \; \mathrm{s.t.} \;|y|\in [\ell /2,\ell ], N_{y}^{(1)}\ge 2 \right\} \right) \\\le & {} k \mathbb {E}\left[ \sum _{|x|=\ell } \mathbf{1}_{\{ \forall \, y\in ]\!] e ,x [\![ \;\mathrm{s.t.} \; |y|\in [\ell /2,\ell ], N_{y}\ge 2 \}} \right] . \end{aligned}$$

By Eq. (3.3), the last expectation is

$$\begin{aligned} \mathbb {E}\left[ {1\over \hat{N}_{\ell }} \mathbf{1}_{\{ \forall \, i \in [\ell /2,\ell ], \hat{N}_{i}\ge 2 \}} \mid \hat{N}_{0}=1 \right] \le {1\over 2}\mathbb {P}\left( \forall \, i \in [\ell /2,\ell ], \hat{N}_{i}\ge 2 \mid \hat{N}_{0}=1\right) . \end{aligned}$$

Using a union bound over the last time before time \(\ell /2\) at which \(\hat{N}_{i}=1\), then the simple Markov property, we find that

$$\begin{aligned} \mathbb {P}\left( \forall \, i \in [\ell /2,\ell ], \hat{N}_{i}\ge 2 \mid \hat{N}_{0}=1 \right) \le (\ell /2) \mathbb {P}\left( \hat{\gamma }_{1}>\ell /2\right) \end{aligned}$$

which is exponentially small by Lemma 3.2. This gives the desired upper bound for \(\mathbb {P}(\mathcal E_{1})\). Let us now bound the probability of the event \(\mathcal E_{2}\). We have (supposing \(\ell \) even for simplicity),

$$\begin{aligned} \mathbb {P}(\mathcal E_{2}) \le {k(k-1)\over 2} \mathbb {P}\left( \exists \, |y|=\ell /2\,:\, N_{y}^{(1)}\ge 1, N^{(2)}_{y}-N_{y}^{(1)}\ge 1\right) . \end{aligned}$$

The probability on the right-hand side is less than

$$\begin{aligned} \mathbb {E}\left[ \sum _{|y|=\ell /2} \mathbf{1}_{\{ N_{y}^{(1)}\ge 1, N^{(2)}_{y}-N_{y}^{(1)}\ge 1 \}} \right] = \mathbb {E}\left[ \sum _{|y|=\ell /2} \mathrm{P}_{{\mathbb {T}}}(N_{y}^{(1)}\ge 1)^2 \right] . \end{aligned}$$
(5.6)

We have already seen (by comparison with the biased one-dimensional random walk) that \(\mathrm{P}_{{\mathbb {T}}}(N_{y}\ge 1) = {m-1\over m^{|y|+1} -1}\). Hence, the last expectation is less than \( m^{-\ell /2}\). Therefore \(\mathbb {P}(\mathcal E_{2})\le k^2 m^{-\ell /2}\) and the proof is complete. \(\square \)
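The one-dimensional comparison behind this identity is the classical gambler's-ruin formula: for a nearest-neighbour walk on \(\{0,\dots ,N\}\) whose down/up odds ratio is m, the probability of reaching N before 0 from state 1 is \((m-1)/(m^N-1)\); taking \(N=|y|+1\) gives the displayed value. A quick numerical sanity check with illustrative parameters, solving the harmonic equations directly:

```python
import numpy as np

m, N = 2.0, 6                          # odds ratio q/p = m; target level N = |y| + 1
p, q = 1.0 / (1.0 + m), m / (1.0 + m)  # up / down step probabilities

# h(i) = P(reach N before 0 | start at i) solves the harmonic equation
# h(i) = p*h(i+1) + q*h(i-1) for 0 < i < N, with h(0) = 0 and h(N) = 1.
A = np.eye(N + 1)
b = np.zeros(N + 1)
b[N] = 1.0
for i in range(1, N):
    A[i, i + 1] = -p
    A[i, i - 1] = -q
h = np.linalg.solve(A, b)

closed_form = (m - 1.0) / (m ** N - 1.0)  # gambler's-ruin value at i = 1
print(h[1], closed_form)
```

The full solution is \(h(i)=(m^i-1)/(m^N-1)\), which the linear solve reproduces to machine precision.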

Proof of Theorem 1.1

Let \(n\ge 1\) and \(j_{n}:=\lfloor \sqrt{n}\ln ^3(n)\rfloor \). Consider the set of vertices in \({\mathbb {T}}\) of type \(N_{x}^{(j_n)}=1\) and of generation greater than \({1\over 2}\ln ^2(n)\), whose ancestral line contains no other such vertex (of height greater than \({1\over 2}\ln ^2(n)\) and type 1). If this set contains more than n vertices, restrict to the first n vertices of the set visited by the walk. Call \(\mathcal L_{n}\) the set of vertices obtained and \(L_{n}\le n\) the cardinality of this set. We call \(({\tilde{{\mathbb {T}}}}_{i})_{i\le L_{n}}\) the subtrees rooted at \(\mathcal L_{n}\), ordered by the hitting times of their roots by the walk \((X_{k})_{k}\), and for convenience for \(i>L_{n}\) we set \({\tilde{{\mathbb {T}}}}_{i}:= {\mathbb {T}}_{i}\) (where the \({\mathbb {T}}_i\) are i.i.d. copies of \({\mathbb {T}}\), independent of all the random variables introduced so far). Notice that the trees \(({\tilde{{\mathbb {T}}}}_{i})_{i}\) are i.i.d. under \(\mathbb {P}\). We denote by \({\tilde{ e }}_i\) the root of \({\tilde{{\mathbb {T}}}}_i\) for \(i\ge 1\). We call \({\mathcal { Z}}_{n}\) the set of vertices in \({\mathbb {T}}\) which were visited by \((X_k)_{k\le \tau _{{e_*}}^{(j_n)}}\) but which do not belong to any of the \(({\tilde{{\mathbb {T}}}}_{i})_{i\le L_{n}}\), see Fig. 2.

Fig. 2 The tree \({\mathbb {T}}\) (represented up to generation 9), the set \(\mathcal {Z}_n\) and the trees \(\tilde{{\mathbb {T}}}_i\)

By the Kesten–Stigum theorem, \(m^{-n}\sum _{|x|=n} 1\) converges almost surely towards a random variable which is positive on the event of survival of the Galton–Watson tree. By Lemma 5.1 together with the Borel–Cantelli lemma, it follows that \(R_{\tau _{{e_*}}^{(j_n) }}\ge n\) for n large enough, \(\mathbb {P}^*\)-almost surely. Hence, for \(\mathrm{GW}^*\)-a.e. tree \({\mathbb {T}}, \lim _{n\rightarrow \infty } \mathrm{P}_{{\mathbb {T}}}(R_{\tau _{{e_*}}^{(j_n)}}< n)=0\). Similarly, Lemma 5.2 implies that \(\lim _{n\rightarrow \infty } \mathrm{P}_{{\mathbb {T}}}( \max _{x\in {\mathcal { Z}}_{n}} |x| > \ln ^2(n) )=0\) for \(\mathrm{GW}\)-a.e. tree \({\mathbb {T}}\). Finally, observe that, by comparison with a one-dimensional random walk, for any vertex \(x\in {\mathbb {T}}\) and any integer \(k, \mathrm{E}_{{\mathbb {T}}}\left[ N_{x}^{(k)}\right] = km^{-|x|}\). By the Markov inequality, we get that \(\mathrm{P}_{{\mathbb {T}}}(\sum _{|x|\le \ln ^2(n)} N_{x}^{(j_n)} > j_n\ln ^3(n)) \le {1\over \ln ^3(n)}\sum _{k\le \ln ^2(n)} m^{-k}\sum _{|x|=k}1\), which goes to 0 \(\mathrm{GW}\)-a.s. as \(n\rightarrow \infty \). As a result, we can restrict to the event (both for the annealed and quenched convergences in law),

$$\begin{aligned} \mathcal E_{n} :=\left\{ R_{\tau _{{e_*}}^{(j_n)}} \ge n, \sum _{x\in {\mathcal { Z}}_{n}} N_{x}^{(j_n)} \le j_{n}\ln (n)^3,\, \max _{x\in {\mathcal { Z}}_{n}} |x|\le \ln ^2(n)\right\} . \end{aligned}$$
(5.7)
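The Kesten–Stigum input used above is that \(W_n:=m^{-n}\sum _{|x|=n}1\) is a mean-one martingale whose almost sure limit is positive on survival. A small simulation sketch, with an arbitrary illustrative offspring law (\(p_0=p_1=1/4\), \(p_2=1/2\), so \(m=5/4\); none of these values come from the paper):

```python
import random

rng = random.Random(1)

def offspring():
    """Illustrative offspring law: P(0)=1/4, P(1)=1/4, P(2)=1/2, so m = 5/4."""
    u = rng.random()
    return 0 if u < 0.25 else (1 if u < 0.5 else 2)

def z_n(n):
    """Population at generation n of one Galton-Watson tree."""
    pop = 1
    for _ in range(n):
        if pop == 0:
            break
        pop = sum(offspring() for _ in range(pop))
    return pop

m, n, trials = 1.25, 8, 5000
w_mean = sum(z_n(n) for _ in range(trials)) / (trials * m ** n)
print(w_mean)  # Monte Carlo estimate of E[W_n] = 1
```

Since \(\mathbb {E}[W_n]=1\) exactly for every n, the empirical mean concentrates near 1 up to Monte Carlo error.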

We will repeatedly use the following simple consequence of Lemma 2.3 of [13], to which we will refer as (F). If \((T,d)\) and \((T',d')\) are two real trees and \(\varphi :T\rightarrow T'\) is a surjective map that sends the root of T to the root of \(T'\), then the Gromov-Hausdorff distance \(d_{G}(T,T')\) between T and \(T'\) is smaller than \({1\over 2} \sup \{|d(x,y) -d'(\varphi (x),\varphi (y))|,(x,y) \in T^2 \}\).
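A toy illustration of (F), on hypothetical two- and three-vertex trees not taken from the paper: the map below squashes the deeper path onto the shorter one, and the resulting bound on \(d_{G}\) is half the maximal distortion over all pairs.

```python
from itertools import product

# Toy trees given by their graph-distance matrices.
# T  : root - a - b  (path of length 2), vertices 0, 1, 2
# T' : root'- a'     (path of length 1), vertices 0, 1
d  = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
dp = [[0, 1], [1, 0]]
phi = {0: 0, 1: 1, 2: 1}  # surjective, root-preserving: squashes b onto a'

# (F): d_G(T, T') <= (1/2) * sup |d(x,y) - d'(phi(x), phi(y))|
distortion = max(abs(d[x][y] - dp[phi[x]][phi[y]])
                 for x, y in product(range(3), repeat=2))
bound = distortion / 2
print(bound)  # -> 0.5
```

Here the only distorted pairs involve b, giving distortion 1 and hence the bound 1/2.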

On the event \(\mathcal E_{n}\), we have \(\mathcal {R}_{n} \subset \mathcal {R}_{\tau _{{e_*}}^{(j_n)}}\). Let \(\tilde{{\mathbb {F}}}\) be the forest associated to the trees \(({\tilde{{\mathbb {T}}}}_{i})_{i}\). Notice that each tree \({\tilde{{\mathbb {T}}}}_{i}\) is associated with an excursion of the walk from its root \({\tilde{ e }}_i\). We call \(({\tilde{X}}_{k})_{k\ge 0}\) the concatenation of these excursions, so that \(({\tilde{X}}_{ k})_{ k\ge 0}\) restricted to \(({\tilde{{\mathbb {T}}}}_{i})_{i\le L_n}\) is equal to the trajectory of the walk \((X_{k})_{k\ge 0}\) in the trees \(({\tilde{{\mathbb {T}}}}_{i})_{i\le L_{n}}\). Finally, for \(k\ge 1, {\tilde{\mathcal {R}}}_{k}\) denotes the set of vertices \(\{{\tilde{X}}_{\ell },\, \ell \le k\}\). We let \(\tilde{h}({\tilde{X}}_{k})\) be the generation of \({\tilde{X}}_{k}\) inside the subtree \({\tilde{{\mathbb {T}}}}_{i}\) to which it belongs (so for any \(k\ge 1, \tilde{h}({\tilde{X}}_k)=|\tilde{X}_k|-|{\tilde{ e }}_i|\) if \(\tilde{X}_k\in {\tilde{{\mathbb {T}}}}_i\)). Then, it is sufficient to prove the convergence in law

$$\begin{aligned} {1\over \sqrt{\sigma ^2 n}} \left( \tilde{\mathcal {R}}_{n}, \left\{ \tilde{h}\left( {\tilde{X}}_{\lfloor nt\rfloor }\right) \right\} _{t\in [0,1]}\right) \Rightarrow \left( {\mathcal {T}}_{|\mathbf{B}|} , |\mathbf{B}| \right) \end{aligned}$$
(5.8)

both under the annealed probability \(\mathbb {P}^*\) and under the quenched probability \(\mathrm{P}_{{\mathbb {T}}}\). Indeed, let \(\tilde{n}:=\sum _{k\le n} \mathbf{1}_{\{X_{k} \notin {\mathcal { Z}}_{n}\}}\) be the time spent in the forest \(\tilde{{\mathbb {F}}}\) until time n. On the event \(\mathcal E_{n}\), we have \(0\le n-\tilde{n} \le 2j_{n}\ln ^3(n)=o(n)\), and \(\left| |X_{k}|-\tilde{h}({\tilde{X}}_{k})\right| \le \ln ^2(n) + \max _{i\le n-\tilde{n}} |\tilde{h}({\tilde{X}}_{k}) - \tilde{h}({\tilde{X}}_{k-i})|\) which will be \(o(n^{1/2})\) in probability uniformly in k if (5.8) is proved. On the other hand, \(d_{G}(\mathcal {R}_{n},{\tilde{\mathcal {R}}}_{\tilde{n}})\le \ln ^2(n)\) (use (F) with \(\varphi :\mathcal {R}_{n}\rightarrow {\tilde{\mathcal {R}}}_{\tilde{n}}\) being the identity on \(\tilde{{\mathbb {F}}}\), and mapping any vertex of \({\mathcal { Z}}_{n}\) to the root of \({\tilde{\mathcal {R}}}_{\tilde{n}}\)), and \(d_{G}({\tilde{\mathcal {R}}}_{\tilde{n}},{\tilde{\mathcal {R}}}_{n})\le \max _{i\le n-\tilde{n}} |\tilde{h}({\tilde{X}}_{\tilde{n}}) -\tilde{h}({\tilde{X}}_{\tilde{n} +i})|\) (using (F) with \(\varphi :{\tilde{\mathcal {R}}}_{\tilde{n}} \rightarrow {\tilde{\mathcal {R}}}_{n}\) being the identity on \({\tilde{\mathcal {R}}}_{\tilde{n}}\) and mapping the other vertices to \({\tilde{X}}_{\tilde{n}}\)) which will be \(o(n^{1/2})\) in probability after (5.8) is proved.

The annealed case

It is actually enough to prove (5.8) under \(\mathbb {P}^*_{n_{0}}:=\mathbb {P}(\cdot \mid \sum _{|x|=n_{0} } 1 >0)\) for any integer \(n_{0}\). Notice that the trees \(({\tilde{{\mathbb {T}}}}_{i})_{i}\) are still i.i.d. under \(\mathbb {P}^*_{n_{0}}\) as soon as n is such that \({1\over 2} \ln (n)^2 \ge n_{0}\) (this was not true under \(\mathbb {P}^*\)). To a tree \({\tilde{{\mathbb {T}}}}_{i}\), we associate the trees \((\tilde{\mathbf{T}}_{i}, \tilde{\mathbf{T}}_{i}^{(r)}, \tilde{\mathbf{T}}_{i}^{(w)})\) as in Sect. 4. The concatenation of these trees yields the forests \((\tilde{\mathbf{F}}, {\tilde{\mathbf{F}}}^{(r)},{\tilde{\mathbf{F}}}^{(w)})\). Let \(\tilde{H}^{(r)}, \tilde{H}^{(w)}\) be the height functions associated to the forests \({\tilde{\mathbf{F}}}^{(r)}\) and \({\tilde{\mathbf{F}}}^{(w)}\). Recall that by construction we have \(\tilde{H}^{(w)}(k)=\tilde{h}({\tilde{X}}_{k})\) for any \(k\ge 0\). By an abuse of notation, we denote by \({\tilde{\mathbf{F}}}^{(r)}\cap {\tilde{\mathcal {R}}}_{n}\) the forest \({\tilde{\mathbf{F}}}^{(r)}\) restricted to vertices which were in \({\tilde{\mathcal {R}}}_{n}\) before the reduction of Sect. 4. We see that \(d_{G}({\tilde{\mathcal {R}}}_{n}, {\tilde{\mathbf{F}}}^{(r)}\cap {\tilde{\mathcal {R}}}_{n}) \le \tilde{\ell }(n)\), where \(\tilde{\ell }(n)\) is the maximal length of an edge explored by the depth-first search in the forest \({\tilde{\mathbf{F}}}^{(r)}\) before visiting n vertices of type 1. By Lemma 2.1 (ii), it follows that \(d_{G}({\tilde{\mathcal {R}}}_{n}, {\tilde{\mathbf{F}}}^{(r)}\cap {\tilde{\mathcal {R}}}_{n})\) is \(o(\sqrt{n})\) in probability. We want to show the convergence of \({\tilde{\mathbf{F}}}^{(r)}\cap {\tilde{\mathcal {R}}}_{n}\) for the Gromov-Hausdorff topology. This is equivalent to showing the convergence of its contour function, see Lemma 2.3 of [12]. In our case, the convergence of the contour function would be implied by that of the height function of the tree.
Indeed, the argument used in the proof of Theorem 2.4.1 of [11] can be adjusted to height/contour functions of trees with edge lengths, as inequality (2.34) of [11] can be adjusted to such functions and inequality (2.35) remains valid. As a result, we need to show that, under \(\mathbb {P}^*_{n_{0}}\) (with \(\tilde{R}_{n}\) being the cardinality of \({\tilde{\mathcal {R}}}_{n}\)),

$$\begin{aligned} {1\over \sqrt{\sigma ^2 n}} \left( \left\{ \tilde{H}^{(r)}\left( \lfloor \tilde{R}_{n} t\rfloor \right) \right\} _{t\in [0,1]} , \left\{ \tilde{H}^{(w)}\left( {\lfloor nt \rfloor }\right) \right\} _{t\in [0,1]}\right) \Rightarrow \left( |\mathbf{B}| , {|\mathbf{B}|} \right) . \end{aligned}$$

This follows from Proposition 4.1 once we prove that \({\tilde{R}_{n} \over n}\) converges towards \(\frac{b_1}{2}\) in probability. Let \({\tilde{\chi }}^{(w)}(i)\) (resp. \({\tilde{\chi }}^{(r)}(i)\)) be the index in the forest \({\tilde{\mathbf{F}}}^{(w)}\) (resp. \({\tilde{\mathbf{F}}}^{(r)}\)) of the i-th vertex of type 1 for the lexicographical order. Formally, denoting by \(u_{{\tilde{\mathbf{F}}}^{(r)}}(k)\) (resp. \(u_{{\tilde{\mathbf{F}}}^{(w)}}(k)\)) the k-th vertex of \({\tilde{\mathbf{F}}}^{(r)}\) (resp. \({\tilde{\mathbf{F}}}^{(w)}\)) for the lexicographical order, we have

$$\begin{aligned}&{\tilde{\chi }}^{(w)}(1)=1\quad \text {and}\quad {\tilde{\chi }}^{(w)}(i+1)=\min \{ k>\tilde{\chi }^{(w)}(i) \, :\, u_{{\tilde{\mathbf{F}}}^{(w)}}(k) \text { is of type } 1 \}, \\&{\tilde{\chi }}^{(r)}(1)=1\quad \text {and}\quad {\tilde{\chi }}^{(r)}(i+1)=\min \{ k>\tilde{\chi }^{(r)}(i) \, :\, u_{{\tilde{\mathbf{F}}}^{(r)}}(k) \text { is of type } 1 \}. \end{aligned}$$

Let \(i_{n}\) be the number of vertices of type 1 which were visited by \(({\tilde{X}}_{k})_{k}\) before time n. By construction of \(\tilde{\mathbf{F}}^{(w)}\), the process \((u_{{\tilde{\mathbf{F}}}^{(w)}}(k))_k\) is equal to \(({\tilde{X}}_k)_k\), therefore \({\tilde{\chi }}^{(w)}(i_{n})\) and \({\tilde{\chi }}^{(w)}(i_{n}+1)\) bound n from below and above. By Lemma 2.1 (i), this implies that \(i_{n}/n\) converges to \({a_{1}b_{1}\over 2}\) in probability. On the other hand, \(\tilde{R}_{n}\) lies between \({\tilde{\chi }}^{(r)}(i_{n})\) and \({\tilde{\chi }}^{(r)}(i_{n}+1)\), and Lemma 2.1 (i) yields that \({\tilde{R}_{n} \over i_{n}}\) converges to \({1\over a_{1}}\) in probability. Combining the two limits, \({\tilde{R}_{n}\over n}={\tilde{R}_{n}\over i_{n}}\cdot {i_{n}\over n}\) converges to \({1\over a_{1}}\cdot {a_{1}b_{1}\over 2}={b_{1}\over 2}\) in probability, as required. The proof is complete in the annealed case. Observe that the same proof works under \(\mathbb {P}\), which means that (5.8) also holds under \(\mathbb {P}\).

The quenched case

Let \(\Xi _{n}:= {1\over \sqrt{\sigma ^2 n}} ( {\tilde{\mathcal {R}}}_{n}, \{ \tilde{h}({\tilde{X}}_{\lfloor nt\rfloor })\}_{t\in [0,1]})\). Let F be some bounded nonnegative continuous function on the space of real trees times càdlàg functions on [0, 1]. We know by (5.8) (used in the annealed setting under \(\mathbb {P}\)) that \(\mathbb {E}\left[ F(\Xi _{n})\right] \) converges. We want to show that \(\mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n}) \right] \) converges to the same limit for \(\mathrm{GW}\)-a.e. tree \({\mathbb {T}}\). Hence we are going to show that \(\mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n}) \right] \) concentrates around its mean value. To do this, we will show that the variance decreases fast enough. Let \(\mathcal V\) be the space of all finite sets of words in \(\bigcup _{k\ge 0} {\mathbb {N}}^k\). Let \(\mathcal A\) be the set of pairs \((V_{1},V_{2})\in \mathcal V^2\) such that no vertex of \(V_{1}\) is an ancestor of a vertex in \(V_{2}\) and vice versa (it includes all pairs which contain \(\emptyset \)). Recall that by construction if \(L_{n}=0\), then the forest \(\tilde{{\mathbb {F}}}\) on which the random walk \((\tilde{X}_n)_n\) takes place is made up of i.i.d. trees \((\tilde{{\mathbb {T}}}_i)_i\) independent of \({\mathbb {T}}\), so that \(\mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n}) \right] = \mathbb {E}\left[ F(\Xi _{n})\right] \) in that case. We compute that

$$\begin{aligned} E_{\mathrm{GW}}\left[ \mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n})\right] ^2 \right] = \sum _{(V_{1},V_{2}) \in \mathcal V^2 } E_{\mathrm{GW}}\left[ \mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n})\mathbf{1}_{\{ \mathcal L_{n}= V_{1} \}} \right] \mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n})\mathbf{1}_{\{ \mathcal L_{n}= V_{2} \}} \right] \right] . \end{aligned}$$

We divide the sum into two parts, depending on whether \((V_{1},V_{2})\) belongs to \(\mathcal A\) or not. We observe that

$$\begin{aligned}&\sum _{(V_{1},V_{2})\in \mathcal A } E_{\mathrm{GW}}\left[ \mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n})\mathbf{1}_{\{ \mathcal L_{n}= V_{1} \}} \right] \mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n})\mathbf{1}_{\{ \mathcal L_{n}= V_{2} \}} \right] \right] \\&\quad =\sum _{(V_{1},V_{2})\in \mathcal A } E_{\mathrm{GW}}\left[ \mathrm{P}_{{\mathbb {T}}}(\mathcal L_{n}=V_{1})\mathrm{P}_{{\mathbb {T}}}(\mathcal L_{n}=V_{2}) \right] \mathbb {E}\left[ F(\Xi _{n}) \right] ^2 \\&\quad \le \mathbb {E}\left[ F(\Xi _{n})\right] ^2 \end{aligned}$$

by the branching property. For the rest of the sum, since F is bounded by some constant M, we simply write

$$\begin{aligned}&\sum _{(V_{1},V_{2}) \notin \mathcal A } E_{\mathrm{GW}}\left[ \mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n})\mathbf{1}_{\{ \mathcal L_{n}= V_{1} \}} \right] \mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n})\mathbf{1}_{\{ \mathcal L_{n}= V_{2} \}} \right] \right] \\&\qquad \le M^2 \sum _{(V_{1},V_{2}) \notin \mathcal A } E_{\mathrm{GW}}\left[ \mathrm{P}_{{\mathbb {T}}}(\mathcal L_{n}= V_{1})\mathrm{P}_{{\mathbb {T}}}(\mathcal L_{n}= V_{2}) \right] . \end{aligned}$$

We end up with

$$\begin{aligned}&E_{\mathrm{GW}}\left[ \mathrm{E}_{{\mathbb {T}}}\left[ F(\Xi _{n}) \right] ^2 \right] - \mathbb {E}\left[ F(\Xi _{n})\right] ^2 \\&\quad \le M^2 \sum _{(V_{1},V_{2}) \notin \mathcal A } E_{\mathrm{GW}}\left[ \mathrm{P}_{{\mathbb {T}}}\left( \mathcal L_{n}= V_{1}\right) \mathrm{P}_{{\mathbb {T}}}\left( \mathcal L_{n}= V_{2}\right) \right] . \end{aligned}$$

It is enough to show that the right-hand side is summable in n. We have

$$\begin{aligned}&\sum _{(V_{1},V_{2}) \notin \mathcal A } E_{\mathrm{GW}}\left[ \mathrm{P}_{{\mathbb {T}}}(\mathcal L_{n}= V_{1})\mathrm{P}_{{\mathbb {T}}}\left( \mathcal L_{n}= V_{2}\right) \right] \\&\quad = \sum _{V_{1} \in \mathcal V } E_{\mathrm{GW}}\left[ \mathrm{P}_{{\mathbb {T}}}\left( \mathcal L_{n}= V_{1} \right) \mathrm{P}_{{\mathbb {T}}}\left( (\mathcal L_{n}, V_{1} )\notin \mathcal A\right) \right] . \end{aligned}$$

Notice that we can restrict to \(V_{1}\) such that \(\min _{x\in V_{1}} |x| \ge {1\over 2}\ln ^2(n)\) and with cardinality at most n. Let us bound the probability \(\mathrm{P}_{{\mathbb {T}}}( (\mathcal L_{n}, V_{1} )\notin \mathcal A)\) for such a set \(V_{1}\). If \((\mathcal L_{n}, V_{1} )\notin \mathcal A\), then the random walk \((X_{k})_{k}\) has visited, before time \(\tau _{{e_*}}^{(j_{n})}\), the ancestor at generation \({1\over 2} \ln ^2(n)\) of a vertex in \(V_{1}\). For each such ancestor this probability is smaller than \(j_{n} m^{-{1\over 2} \ln ^2(n)}\), and there are at most n such ancestors. Therefore, \(\mathrm{P}_{{\mathbb {T}}}( (\mathcal L_{n}, V_{1} )\notin \mathcal A)\le nj_{n}m^{-{1\over 2} \ln ^2(n)} \) and we deduce that

$$\begin{aligned} \sum _{(V_{1},V_{2}) \notin \mathcal A } E_{\mathrm{GW}}\left[ \mathrm{P}_{{\mathbb {T}}}\left( \mathcal L_{n}= V_{1}\right) \mathrm{P}_{{\mathbb {T}}}\left( \mathcal L_{n}= V_{2}\right) \right] \le nj_{n} m^{-{1\over 2} \ln ^2(n)} \end{aligned}$$
(5.9)

which is indeed summable. \(\square \)

6 Random walks in random environment on a Galton–Watson tree

Let \((V(x))_{x\in {\mathbb {T}}}\) be a branching random walk on \({\mathbb {R}}\) and \({\mathbb {V}}:=({\mathbb {T}}, (V(x))_{x\in {\mathbb {T}}})\). More specifically, we let \(V({e_*})=V( e )=0\). At generation n, and conditionally on \(\sigma \{V(y),\, |y|\le n\}\), the random variables \((V(xi)-V(x),i\le \nu (x))_{|x|=n}\) are supposed to be independent and identically distributed. The common law does not depend on n. Conditionally on \({\mathbb {V}}\), we consider the Markov chain \((X_{n})_{n\ge 0}\) on \({\mathbb {T}}\) such that for any \(x\ne {e_*}\),

$$\begin{aligned} \mathrm{P}_{{\mathbb {V}}}(X_{n+1}= {x_*}\,|\, X_n=x)= & {} {\mathrm{e}^{-V(x)} \over \mathrm{e}^{-V(x)} + \sum _{i=1}^{\nu (x)} \mathrm{e}^{-V(xi)} }, \end{aligned}$$
(6.10)
$$\begin{aligned} \mathrm{P}_{{\mathbb {V}}}(X_{n+1}=xi\,|\, X_n=x)= & {} {\mathrm{e}^{-V(xi)} \over \mathrm{e}^{-V(x)} + \sum _{i=1}^{\nu (x)} \mathrm{e}^{-V(xi)} } \; \; \text{ for } \text{ any } \;\; 1\le i\le \nu (x), \end{aligned}$$
(6.11)

and which is reflected at \({e_*}\). The biased random walk is the particular case \(V(x) = |x| \ln (\lambda )\). We let \(\mathbb {P}\) be the measure \(\mathrm{P}_{{\mathbb {V}}}\) integrated over the law of \({\mathbb {V}}\). They are associated to the expectations \(\mathbb {E}\) and \(\mathrm{E}_{{\mathbb {V}}}\). The law of \({\mathbb {V}}\) will be denoted by \(\mathrm{BW}\), and by \(\mathrm{BW}^*\) when conditioned upon the event that \({\mathbb {T}}\) is infinite. They are associated to the expectations \(E_{\mathrm{BW}}\) and \(E_{\mathrm{BW}^*}\). We introduce

$$\begin{aligned} \psi (t):= E_{\mathrm{BW}} \left[ \sum _{|x|=1} \mathrm{e}^{-tV(x)} \right] . \end{aligned}$$

Lyons and Pemantle [23] showed that the Markov chain is recurrent or transient depending on whether \(\min _{t\in [0,1]} \psi (t)\) is respectively \({\le }1\) or \({>}1\) (moreover it is positive recurrent in the case \(<1\)). We consider the critical case \(\min _{t\in [0,1]} \psi (t)=1\). In the papers [16–18], Hu and Shi proved that when \(\psi '(1)\ge 0\) the random walk is of order \(\ln ^3(n)\), whereas when \(\psi '(1)<0\) it is of order \(n^\nu \), where \(\nu := 1-{1\over \min (\kappa ,2)}\) and \(\kappa :=\inf \{ t>1\,:\, \psi (t)=1\}\in (1,+\infty ]\). In the latter case, Andreoletti and Debs [3] showed that the local time of the root at time n is of order \(n^{1/\min (\kappa ,2)}\), and that the largest generation entirely visited at time n is of order \(\ln (n)\). Then, in [15], Hu showed that the local time actually converges in law after a suitable rescaling.
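In the biased special case \(V(x)=|x|\ln (\lambda )\), each of the (on average) m children contributes \(\lambda ^{-t}\), so \(\psi (t)=m\lambda ^{-t}\), and the Lyons–Pemantle criterion recovers the classical phase transition at \(\lambda =m\). A small numeric sketch (the grid minimization and the parameter values are illustrative choices):

```python
def psi(t, m, lam):
    """psi(t) = E[sum_{|x|=1} e^{-t V(x)}] in the biased case
    V(x) = |x| * ln(lam): each child contributes lam**(-t)."""
    return m * lam ** (-t)

def min_psi(m, lam, grid=10001):
    """Approximate min of psi over t in [0, 1] on a uniform grid."""
    return min(psi(i / (grid - 1), m, lam) for i in range(grid))

m = 2.0
# lam = m   : min psi = 1 (critical case, attained at t = 1)
# lam > m   : min psi < 1 (positive recurrent)
# 1 < lam < m: min psi > 1 (transient)
print(min_psi(m, 2.0), min_psi(m, 3.0), min_psi(m, 1.5))
```

This matches the classical result that the \(\lambda \)-biased walk on a Galton–Watson tree is transient exactly when \(\lambda <m\).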

In the case \(\kappa >2\), the walk is therefore of order \(\sqrt{n}\) and we expect a central limit theorem. Indeed Faraud [14] generalized the result of Peres and Zeitouni [31] and showed a central limit theorem, at least when \(\kappa >5\) in the annealed case, and \(\kappa >8\) in the quenched case. We extend this result to the convergence of the trace of the random walk to the Brownian forest, with the condition \(\kappa >2\). At the present time, no central limit theorem has been shown for \(\kappa \le 2\). We keep the notation \(\mathcal {R}_{n}, {\mathcal {T}}_{|\mathbf{B}| }\) of the introduction.

Theorem 6.1

Assume that \(\psi (1)=1, \psi (2)<1\) and \(\mathbb {E}\Big [ \sum _{|x|=|y|=1,x\ne y} \mathrm{e}^{-V(x)}\mathrm{e}^{-V(y)} \Big ]<\infty \). Let

$$\begin{aligned} \sigma ^2:= (1-\psi (2)) / \mathbb {E}\left[ \sum _{\begin{array}{c} |x|=|y|=1 \\ x\ne y \end{array}} \mathrm{e}^{-V(x)}\mathrm{e}^{-V(y)}\right] . \end{aligned}$$

Under \(\mathbb {P}^*\) (annealed case) and under \(\mathrm{P}_{{\mathbb {V}}}\) for \(\mathrm{BW}^*\)-a.e. \({\mathbb {V}}\) (quenched case), the following joint convergence in law holds as \(n\rightarrow \infty \):

$$\begin{aligned} { 1\over \sqrt{ \sigma ^2 n}} \left( \mathcal {R}_{n},\left\{ |X_{\lfloor nt \rfloor }| \right\} _{t\in [0,1]}\right) \Rightarrow \left( {\mathcal {T}}_{|\mathbf{B}|},|\mathbf{B} | \right) \end{aligned}$$

for the Gromov-Hausdorff topology on the space of real trees and the Skorokhod topology on the space of càdlàg functions.

The proof follows the same lines as that of Theorem 1.1; we adapt each step to the present setting.

6.1 Description of the process of local times

We adapt Sect. 3. Lemma 3.1 still holds. The mean matrix is now given by

$$\begin{aligned} m_{i,j}:=\mathbb {E}\left[ \sum _{|x|=1} \mathbf{1}_{\{N_{x}=j\}} \mid N_{ e }=i\right] = \left( {\begin{array}{c}i+j-1\\ j\end{array}}\right) \mathbb {E}\left[ \sum _{|x|=1} {\mathrm{e }^{-jV(x)} \over (1+\mathrm{e}^{-V(x)})^{i+j}}\right] . \end{aligned}$$
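As a numerical aside on this formula: for a fixed displacement \(v=V(x)\), the weights \(\binom{i+j-1}{j}\mathrm{e}^{-jv}(1+\mathrm{e}^{-v})^{-(i+j)}\) sum to 1 over \(j\ge 0\), since given i visits to the parent, the number of visits transmitted to a child follows a negative-binomial law. A truncated-sum check with illustrative values of (i, v):

```python
import math

def weight(i, j, v):
    """Negative-binomial weight: P(j visits to a child | i visits to its
    parent), for a fixed displacement v of the child."""
    return (math.comb(i + j - 1, j) * math.exp(-j * v)
            / (1.0 + math.exp(-v)) ** (i + j))

# Truncated row sums: each equals 1 up to an exponentially small tail.
sums = [sum(weight(i, j, v) for j in range(200))
        for i, v in [(1, 0.5), (3, -0.2), (5, 1.0)]]
print(sums)
```

The truncation at j = 200 is safe here because the tail decays geometrically with ratio \(\mathrm{e}^{-v}/(1+\mathrm{e}^{-v})<1\).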

We introduce a random variable \(\hat{S}_{1}\) with law given by \(\mathbb {E}[f(\hat{S}_{1})] = \mathbb {E}\Big [\sum _{|x|=1} f(V(x))\mathrm{e}^{-V(x)} \Big ]\). Notice that under the assumptions of Theorem 6.1 we have \(\mathbb {E}\left[ \hat{S}_{1}\right] >0\) (it may be infinite if the mean number of children in \({\mathbb {T}}\) is infinite). We define \((\hat{S}_{k})_{k\ge 1}\) as the random walk with step distribution given by the law of \(\hat{S}_{1}\). Let

$$\begin{aligned} a_{i}:= & {} \mathbb {E}\left[ {\left( \sum _{\ell \ge 1} \mathrm{e}^{-\hat{S}_{\ell }}\right) ^{i-1} \over \left( 1+\sum _{\ell \ge 1} \mathrm{e}^{-\hat{S}_{\ell }}\right) ^{i+1}} \right] / \mathbb {E}\left[ {1\over 1 + \sum _{\ell \ge 1} \mathrm{e}^{-\hat{S}_{\ell }}}\right] ,\\ b_{i}:= & {} i\mathbb {E}\left[ {1\over 1 + \sum _{\ell \ge 1} \mathrm{e}^{-\hat{S}_{\ell }}}\right] . \end{aligned}$$

Lemma 6.2

The vectors \((a_i)_{i\in {\mathbb {N}}^*}\) and \((b_i)_{i\in {\mathbb {N}}^*}\) are the left and right eigenvectors associated to the eigenvalue 1 of the mean matrix \((m_{i,j})_{i,j\ge 1}\). Moreover, they are normalized so that \(\sum _{i} a_{i}=1\) and \(\sum _{i} \pi _{i}=1\) (where \(\pi _{i}:=a_{i}b_{i}\)).

Proof

Simple computations ensure that both normalizations are correct, and that \((b_i)_{i\in {\mathbb {N}}^*}\) is a right eigenvector of the mean matrix. Let \(C_a:=\mathbb {E}\Big [(1+\sum _{\ell \ge 1} \mathrm{e}^{-\hat{S}_\ell })^{-1}\Big ]\) and suppose that the random walk \((\hat{S}_{k})_{k\ge 1}\) is independent of \({\mathbb {V}}\). We have for all \(j\in {\mathbb {N}}^*\),

$$\begin{aligned} C_a\sum _{i\ge 1} a_i m_{i,j}&= \mathbb {E}\left[ \frac{\Bigg (\sum _{\ell \ge 1} \mathrm{e}^{-\hat{S}_{\ell }}\Bigg )^{-1}}{1+\sum _{\ell \ge 1} \mathrm{e}^{-\hat{S}_{\ell }}} \sum _{|x|=1}\sum _{i\ge 1} \left( {\begin{array}{c}i+j-1\\ j\end{array}}\right) \right. \\&\qquad \qquad \left. \Bigg (\frac{\sum _{\ell \ge 1} \mathrm{e}^{-\hat{S}_{\ell }}}{1+\sum _{\ell \ge 1} \mathrm{e}^{-\hat{S}_{\ell }}}\Bigg )^i {\mathrm{e }^{-jV(x)} \over (1+\mathrm{e}^{-V(x)})^{i+j}}\right] \\&=\mathbb {E}\left[ \sum _{|x|=1} \frac{\Bigg (\mathrm{e}^{-V(x)}(1+\sum _{\ell \ge 1}\mathrm{e}^{-\hat{S}_{\ell }})\Bigg )^{j-1}}{\Bigg (1+\mathrm{e}^{-V(x)}(1+\sum _{\ell \ge 1}\mathrm{e}^{-\hat{S}_{\ell }})\Bigg )^{j+1}}\mathrm{e}^{-V(x)}\right] \\&=\mathbb {E}\left[ \frac{\Big (\mathrm{e}^{-\hat{S}_1'}(1+\sum _{\ell \ge 1}\mathrm{e}^{-\hat{S}_{\ell }})\Big )^{j-1}}{\Big (1+\mathrm{e}^{-\hat{S}_1'}(1+\sum _{\ell \ge 1}\mathrm{e}^{-\hat{S}_{\ell }})\Big )^{j+1}}\right] \end{aligned}$$

where \(\hat{S}_1'\) has the same law as \(\hat{S}_1\), and is independent of \((\hat{S}_\ell )_{\ell \ge 1}\). This yields

$$\begin{aligned} C_a\sum _{i\ge 1} a_i m_{i,j}=\mathbb {E}\left[ \frac{\Big (\sum _{\ell \ge 1}\mathrm{e}^{-\hat{S}_{\ell }'}\Big )^{j-1}}{\Big (1+\sum _{\ell \ge 1}\mathrm{e}^{-\hat{S}_{\ell }'}\Big )^{j+1}}\right] =C_a a_j, \end{aligned}$$

where we set \(\hat{S}_\ell ':= \hat{S}_{1}' + \hat{S}_{\ell -1}\) for all \(\ell >1\), and used the fact that, with this definition, \((\hat{S}_\ell ')_{\ell \ge 1}\) has the same law as \((\hat{S}_\ell )_{\ell \ge 1}\). \(\square \)

Now the many-to-one lemma states that for any bounded function \(f:{\mathbb {N}}^n\rightarrow {\mathbb {R}}\), we have

$$\begin{aligned} \mathbb {E}\left[ \sum _{|x|=n,N_{x}\ge 1} f(N_{x_{1}},N_{x_{2}},\ldots ,N_{x_{n-1}},N_{x}) \right] = \mathbb {E}\left[ {1\over \hat{N}_{n}}f(\hat{N}_{1},\hat{N}_2,\ldots ,\hat{N}_{n-1},\hat{N}_{n})\right] \end{aligned}$$

where \((\hat{N}_{i})_{i\ge 0}\) is a Markov chain on \({\mathbb {N}}^*\) starting at 1 and with transition probabilities from i to j given by \(m_{i,j}{b_{j}\over b_{i}}\). We adapt Lemma 3.2.

Lemma 6.3

Let \(\hat{\gamma }_{1}:=\min \{i\ge 1\,:\, \hat{N}_{i} =1\}\). There exists \(r>0\) such that \(\mathbb {E}\left[ \mathrm{e}^{r\hat{\gamma }_{1}}\right] <\infty \).

Proof

We set for all \(i\ge 1, F(i):=i\). A computation leads to

$$\begin{aligned} \sum _{j\ge 1}\hat{p}_{i,j}F(j)=\mathbb {E}\left[ \sum _{|x|=1}\mathrm{e}^{-V(x)} \right] +\mathbb {E}\left[ \sum _{|x|=1}\mathrm{e}^{-2V(x)} \right] (i+1). \end{aligned}$$

Now since \(\psi (2)<1\), there exist \(d<1\) and \(i_0\ge 1\) such that for all \(i>i_0\), \(\sum _{j\ge 1}\hat{p}_{i,j}F(j)<dF(i)\). We conclude as in Lemma 3.2. \(\square \)

6.2 Reduction of trees

We adapt Sect. 4. We define the trees \(\mathbf{T}, \mathbf{T}^{(r)}\) and \(\mathbf{T}^{(w)}\) in the same way as there. Lemma 6.3 ensures that \(\mathbf{T}^{(r)}\) satisfies conditions (ii) and (iii) of Sect. 2. The constants \(\Sigma _{f}, C_{1}\) and \(C_{2}\) in Sect. 2 associated to \(\mathbf{T}^{(r)}\) are given by \(\Sigma _{f}={\eta \over b_{1}\sqrt{a_{1}}}, C_{1}=a_{1}, C_{2} ={1\over a_{1}b_{1}}\), and \(\eta >0\) is given by

$$\begin{aligned} \eta ^2={2\over 1-\psi (2)} \mathbb {E}\left[ \left( 1+\sum _{\ell \ge 1} \mathrm{e}^{-\hat{S}_{\ell }} \right) ^{-1}\right] \mathbb {E}\left[ \sum _{\begin{array}{c} |x|,|y|=1 \\ x\ne y \end{array}} \mathrm{e}^{-V(x)}\mathrm{e}^{-V(y)}\right] \end{aligned}$$

(in the setting and notation of [8], we have for all \(i,j,k\in {\mathbb {N}}^*,\)

$$\begin{aligned} Q_{i,j}^{k}=\left( {\begin{array}{c}i+j+k-1\\ i,j,k-1\end{array}}\right) \mathbb {E}\left[ \sum _{|x|=1,|y|=1,x\ne y} \frac{\mathrm{e}^{-iV(x)}\mathrm{e}^{-jV(y)}}{(1+\mathrm{e}^{-V(x)}+\mathrm{e}^{-V(y)})^{i+j+k}} \right] ). \end{aligned}$$

Similarly, the tree \(\mathbf{T}^{(w)}\) is associated to the constants \(\Sigma _{f}={\eta \over b_{1}\sqrt{a_{1}}}, C_{1}={2\over a_{1}b_{1}}\) and \(C_{2} ={1\over a_{1}b_{1}}\). Proposition 4.1 still holds (with our choice of \(b_{1}\) and \(\sigma \)). The proof remains unchanged.

7 Proof of Theorem 6.1

We adapt Sect. 5. We give here an analogue of Lemma 5.1, less precise but still sufficient for our purpose.

Lemma 7.1

For all k large enough, \(R_{\tau _{{e_*}}^{(k{\lfloor \ln ^{10}(k) \rfloor })}} \ge k^2\), \(\mathrm{BW}^*\)-a.s.

Proof

Define \(G_{k}\) as the set of all vertices \(x\in {\mathbb {T}}\) such that \( \mathrm{e}^{V(x)}>k\ln ^2(k)\) and \(\mathrm{e}^{V(y)}<k\ln ^2(k)\) for any strict ancestor y of x. In other words, \(G_k\) is the set of vertices which are the first on their ancestral line to satisfy \(\mathrm{e}^{V(x)}>k\ln ^2(k)\). First let us collect a few facts about \(G_{k}\). Notice that the \(G_k\) are simple optional lines increasing in k, as defined in [7]. Applying Theorem 6.1 of [7] to \(\sum _{x\in G_k} \mathrm{e}^{-V(x)}\) and using the fact that \(\sum _{|x|=n} \mathrm{e}^{-V(x)}\) converges \(\mathrm{BW}^*\)-a.s. to a positive random variable (see [22]), we get that \(\mathrm{BW}^*\)-a.s., there exists \({\varepsilon }>0\) small enough so that the event

$$\begin{aligned} \mathcal {S}_{\varepsilon }=\left\{ \forall k\ge 1,\, \sum _{x \in G_{k}} \mathrm{e}^{-V(x)}>{\varepsilon }\right\} \end{aligned}$$

holds. Furthermore, for any real number \(b>0\), we have

$$\begin{aligned} \mathbb {P}\left( \exists x\in G_k\, :\, |x|>{\lfloor \ln ^2(k) \rfloor } \right)\le & {} \mathbb {E}\left[ \sum _{|x|={\lfloor \ln ^2(k) \rfloor }} {\mathbf{1}_{\{ {\mathrm{e}^{V(x)}<k{\lfloor \ln ^2(k) \rfloor }} \}}} \right] \\\le & {} (k{\lfloor \ln ^2(k) \rfloor })^{b} \mathbb {E}\left[ \sum _{|x|={\lfloor \ln ^2(k) \rfloor }} \mathrm{e}^{-b V(x)} \right] \\= & {} (k{\lfloor \ln ^2(k) \rfloor })^{b}\psi (b)^{{\lfloor \ln ^2(k) \rfloor }}. \end{aligned}$$

Taking b such that \(\psi (b)<1\), this last quantity is summable in k and the Borel–Cantelli lemma gives that \(\mathrm{BW}\)-a.s., for k large enough, for all \(x\in G_k, |x|<{\lfloor \ln ^2(k) \rfloor }\). Finally, considering the set \({\mathbb {T}}^+\) of vertices \(x\in {\mathbb {T}}\) such that \(V(y)<V(x)\) for any strict ancestor y of x, notice that the tree induced by \(({\mathbb {T}}^+,(V(x))_{x\in {\mathbb {T}}^+})\) is a branching random walk with positive increments along paths in \({\mathbb {T}}^+\). More precisely, setting for any \(x\in {\mathbb {T}}^+, \sigma _x:=V(x)\) and \(\lambda _x=\infty \), it is a C-M-J process as defined in [29] with Malthusian parameter \(\alpha =1\). Applying Theorem 6.3 of [29] to the characteristics \(\phi _x(t):={\mathbf{1}_{\{ {t>0} \}}}\sum _{{{y}_*}=x}{\mathbf{1}_{\{ {\sigma _{y}-\sigma _x >t} \}}}\) and \(\psi _x(t):={\mathbf{1}_{\{ {t>0} \}}}\sum _{{{y}_*}=x}\mathrm{e}^{t-(\sigma _y-\sigma _x)}{\mathbf{1}_{\{ {\sigma _{y}-\sigma _x>t} \}}}\) (conditions 6.1 and 6.2 there being satisfied with \(\beta =0\)), we get that \(\mathrm{BW}^*\)-a.s.

$$\begin{aligned} \frac{ k\ln ^2(k)\sum _{x\in G_k} \mathrm{e}^{-V(x)}}{\sum _{x\in G_k} 1}=\frac{\sum _{x\in {\mathbb {T}}^+} \psi _x( \ln (k{\lfloor \ln ^2(k) \rfloor })-\sigma _x)}{\sum _{x\in {\mathbb {T}}^+} \phi _x(\ln (k{\lfloor \ln ^2(k) \rfloor })-\sigma _x)} \xrightarrow [k\rightarrow \infty ]{} C\in (0;1], \end{aligned}$$

where C is a positive deterministic constant. Let \(G_k':=\{x\in G_k\, :\, \mathrm{e}^{V(x)}< {2\over C} k \ln ^2(k) \}\). We observe that \(k\ln ^2(k)\sum _{x\in G_k} \mathrm{e}^{-V(x)} \le \#G'_{k} + {C\over 2} (\#G_{k}-\#G'_{k})\), which implies that \(\# G'_{k} \ge c {\varepsilon }k \ln ^2(k)\) for k large enough \(\mathrm{BW}^*\)-a.s. on the event \(\mathcal S_{{\varepsilon }}\), where c is any positive constant smaller than \({1\over 2-C}\).
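The last two assertions can be checked as follows. Each \(x\in G_k\) satisfies \(k\ln ^2(k)\mathrm{e}^{-V(x)}\le 1\), while each \(x\in G_k\backslash G_k'\) satisfies \(k\ln ^2(k)\mathrm{e}^{-V(x)}\le C/2\), which gives the inequality. Writing for brevity \(S_k:=k\ln ^2(k)\sum _{x\in G_k}\mathrm{e}^{-V(x)}\) (notation introduced here only for this computation), the convergence above gives \(\#G_k=(1+o(1))S_k/C\), so

$$\begin{aligned} S_k \le \Big (1-\frac{C}{2}\Big )\#G'_{k} + \frac{C}{2}\,\#G_{k} = \Big (1-\frac{C}{2}\Big )\#G'_{k} + \frac{1+o(1)}{2}\,S_k , \end{aligned}$$

hence \(\#G'_{k} \ge \frac{1+o(1)}{2-C}\,S_k\). On \(\mathcal S_{\varepsilon }\) we have \(S_k>{\varepsilon }k\ln ^2(k)\), and the claim follows for any \(c<{1\over 2-C}\).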

Let us now proceed to the proof. We follow the lines of the proof of Lemma 5.1 with the setting \(a={\lfloor \ln ^{10}(k) \rfloor }\). We only need to show that, \(\mathrm{BW}\)-a.s. on the event \(\mathcal S_{{\varepsilon }}\), we have

$$\begin{aligned} \mathrm{P}_{{\mathbb {V}}}(R_{\tau _{{e_*}}^{(1)}} \ge k^2) \ge {1\over k{\lfloor \ln ^5(k) \rfloor }} \end{aligned}$$

for k large enough. We define again \((X_{n}^{(x)})_{n\ge 0}\) as the Markov chain starting at x, and \(\mathcal E_{x,k}\) as the event that the walk \((X_{n}^{(x)})_{n\ge 0}\) visits more than \(k^2\) distinct vertices before hitting \( {x_*}\). By comparison with a one-dimensional random walk, \(\mathrm{P}_{{\mathbb {V}}}(N_{x}^{(1)}\ge 1)\) is now equal to \(\left( 1 +\mathrm{e}^{V(x_{1})} + \cdots + \mathrm{e}^{V(x)}\right) ^{-1}\), where \(x_1,\ldots ,x_{|x|}=x\) denote the vertices on the path from the root to x. If \(x\in G_k'\), this is greater than \(\frac{1}{(2/C)k{\lfloor \ln ^2(k) \rfloor }\times (|x|+1)}\ge \frac{2}{{\lfloor \ln (k) \rfloor }k{\lfloor \ln ^2(k) \rfloor }\times {\lfloor \ln ^2(k) \rfloor } }\) for k large enough. We deduce that \(\mathrm{BW}\)-a.s. on the event \(\mathcal S_{{\varepsilon }}\), for \(k\ge 1\) large enough,
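The one-dimensional comparison here is the classical gambler's-ruin computation: excursions of the walk into subtrees hanging off the path from the root to x return to their starting point, so the trace of the walk on \(\{{e_*},e=x_0,x_1,\ldots ,x_{|x|}=x\}\) is a birth-and-death chain in which the edge \(({y}_*,y)\) carries conductance \(\mathrm{e}^{-V(y)}\), i.e. resistance \(\mathrm{e}^{V(y)}\). Writing \(\mathcal {R}\) for effective resistance (notation introduced only for this computation), and noting that the first term below is \(\mathrm{e}^{V(e)}=1\),

$$\begin{aligned} \mathrm{P}_{{\mathbb {V}}}\big (N_{x}^{(1)}\ge 1\big ) = \mathrm{P}_{{\mathbb {V}}}\big (\tau _{x}<\tau _{{e_*}}\big ) = \frac{\mathcal {R}({e_*}\leftrightarrow e)}{\mathcal {R}({e_*}\leftrightarrow x)} = \frac{\mathrm{e}^{V(e)}}{\sum _{i=0}^{|x|}\mathrm{e}^{V(x_i)}} = \left( 1+\mathrm{e}^{V(x_1)}+\cdots +\mathrm{e}^{V(x)}\right) ^{-1}. \end{aligned}$$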

$$\begin{aligned} \mathrm{P}_{{\mathbb {V}}}(R_{\tau _{{e_*}}^{(1)}} \ge k^2) \ge \frac{2}{k{\lfloor \ln ^5(k) \rfloor }}\mathrm{P}_{{\mathbb {V}}}\left( \bigcup _{ x \in G_k'} \mathcal E_{x,k}\right) . \end{aligned}$$

It thus suffices to show that \(\mathrm{P}_{{\mathbb {V}}}\left( \bigcup _{ x \in G_k'} \mathcal E_{x,k}\right) \ge {1\over 2}\) for k large enough, \(\mathrm{BW}\)-a.s. on the event \(\mathcal S_{{\varepsilon }}\). We showed that \(\#G'_{k} \ge c{\varepsilon }k \ln ^2(k)\) for k large enough \(\mathrm{BW}\)-a.s. on the event \(\mathcal S_{{\varepsilon }}\). Hence, the proof will be complete if we prove that the probabilities

$$\begin{aligned} \mathrm{BW}\left( \# G_k'> c {\varepsilon }k\ln ^2(k)\,\text {and}\, \mathrm{P}_{{\mathbb {V}}}\left( \bigcup _{x \in G'_{k}} \mathcal E_{x,k}\right) <1/2 \right) \end{aligned}$$

are summable in k. This is done by following the final lines of the proof of Lemma 5.1. \(\square \)

We now give the analogue of Lemma 5.2.

Lemma 7.2

There exists a constant \(c>0\) such that for any \(\ell ,k\ge 1\),

$$\begin{aligned} \mathbb {P}\left( \exists |x|\ge \ell \,:\, N_{y}^{(k)} \ge 2,\, \forall \, y\in ]\!] e ,x [\![ \right) \le 2 k^2\mathrm{e}^{-c \ell } . \end{aligned}$$

Proof

The proof is the same as before. The only difference lies in equation (5.6), where we use the upper bound \(\mathrm{P}_{{\mathbb {V}}}(N_{y}^{(1)}\ge 1) \le \mathrm{e}^{-V(y)}\). Then we observe that \(\mathbb {E}[\sum _{|y|=\ell /2} \mathrm{e}^{-2V(y)}]\) is exponentially small in \(\ell \) by our assumptions. \(\square \)

Proof of Theorem 6.1

We adapt the proof of Theorem 1.1. First we prove that, in both the annealed and quenched cases, we can restrict to the event \(\mathcal E_{n}\) defined in Eq. (5.7), where we now take \(j_{n}:= {\lfloor \sqrt{n}\ln ^5(n) \rfloor }\). Here is why we can restrict to the event \(\mathcal E_{n}\): Lemmas 7.1 and 7.2 show that we can restrict to \(\{R_{\tau _{{e_*}}^{(j_n)}} \ge n,\, \max _{x\in {\mathcal { Z}}_{n}} |x|\le \ln ^2(n) \}\). Then, for any vertex \(x\in {\mathbb {T}}\) and any integer \(k\), \(\mathrm{E}_{{\mathbb {V}}}\left[ N_{x}^{(k)}\right] = k\mathrm{e}^{-V(x)}\). Markov's inequality implies that \(\mathrm{P}_{{\mathbb {V}}}(\sum _{|x|\le \ln ^2(n)} N_{x}^{(j_n)} > j_n\ln ^3(n)) \le {1\over \ln ^3(n)}\sum _{k\le \ln ^2(n)}\sum _{|x|=k}\mathrm{e}^{-V(x)}\), which goes to 0 \(\mathrm{BW}\)-a.s. as \(n\rightarrow \infty \): indeed, since \(\sum _{|x|=k} \mathrm{e}^{-V(x)}\) converges a.s. as \(k\rightarrow \infty \), the double sum is a.s. \(O(\ln ^2(n))\).

The proof in the annealed case then follows the same lines, replacing the measures \(\mathrm{GW}\) and \(\mathrm{P}_{{\mathbb {T}}}\) by \(\mathrm{BW}\) and \(\mathrm{P}_{{\mathbb {V}}}\) respectively. The proof in the quenched case is also similar; the only difference lies in Eq. (5.9). The probability to touch a vertex at generation \({1\over 2} \ln ^2(n)\) is smaller than \(\max _{|x|={1\over 2}\ln ^2(n)} \mathrm{e}^{-V(x)}\), hence the upper bound in (5.9) becomes \(nj_n E_{\mathrm{BW}}\big [\max _{|x|=\frac{1}{2}\ln ^2(n)}\mathrm{e}^{-V(x)}\big ]\). Now this quantity is smaller than

$$\begin{aligned}&nj_n E_{\mathrm{BW}}\left[ \left( \sum _{|x|=\frac{1}{2}\ln ^2(n)}\mathrm{e}^{-2V(x)}\right) ^{\frac{1}{2}}\right] \\&\quad \le nj_n E_{\mathrm{BW}}\left[ \sum _{|x|=\frac{1}{2}\ln ^2(n)}\mathrm{e}^{-2V(x)}\right] ^{\frac{1}{2}} =nj_n\psi (2)^{\frac{1}{4}\ln ^2(n)}, \end{aligned}$$

which is summable in n, as \(\psi (2)\in (0;1)\). \(\square \)
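The summability used at the end of the proof can be checked directly: since \(\psi (2)<1\), the factor \(\psi (2)^{\frac{1}{4}\ln ^2(n)}\) decays faster than any power of n. Indeed, with \(j_n={\lfloor \sqrt{n}\ln ^5(n) \rfloor }\),

$$\begin{aligned} nj_n\,\psi (2)^{\frac{1}{4}\ln ^2(n)} \le n^{3/2}\ln ^5(n)\,\exp \Big ({-}\frac{\ln ^2(n)}{4}\ln \frac{1}{\psi (2)}\Big ) = \ln ^5(n)\; n^{\frac{3}{2}-\frac{\ln (n)}{4}\ln \frac{1}{\psi (2)}}, \end{aligned}$$

and the exponent of n tends to \(-\infty \), so the bound is eventually smaller than, say, \(n^{-2}\).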