1 Introduction

In the Euclidean path integral formulation, quantizing gravity translates into randomizing geometry weighted by the Einstein-Hilbert action or some generalization thereof. Since a direct continuum formulation is plagued with many delicate issues, from non-renormalizable ultraviolet divergences to the huge gauge group of diffeomorphism invariance and the impossibility to fully classify geometries in dimension higher than two through complete lists of invariants, the safest road seems to search for generic, sufficiently universal large distance/semi-classical limits of discretized random geometries, using both analytic and numerical tools. In this approach to quantum gravity, space-time is no longer fixed, but sampled from a statistical collection of large discrete objects such as triangulations or their dual graphs [1].

To understand the physical (potentially observable) consequences of such a bold point of view, it is important to study how particles propagate and interact on such statistical collections of random graphs.

Random trees (often known in physics under the name of branched polymers) are the first and most natural examples of random graphs. In the large size/continuum limit, they have good universal properties. Mathematicians have been studying them in detail with combinatorial, analytic and probabilistic tools such as the basic map from trees to Brownian excursions, Fuss-Catalan numbers [2, 3], Galton–Watson processes [4] and so on. There are now fairly detailed rigorous results on their continuum limit [5, 6] and universal critical indices such as their Hausdorff dimension (\(d_H = 2\)), and their spectral dimension (\(d_S = 4/3\)) [7,8,9]. An essential characteristic of one of the simplest classes of infinite random trees considered in the literature [5,6,7,8,9] is the existence of a single infinite one-dimensional spine,Footnote 1 decorated by independent random finite critical Galton–Watson branches (see Fig. 1). Figure 2 tries to give an intuition of zooming towards the large size/continuum limit of random trees.

The more complicated case of two dimensional random geometries and quantum gravity [15] is also relatively well understood. The typical random space here is the now famous Brownian sphere [16,17,18,19] (\(d_H = 4\), \(d_S= 2\)). The main result to remember is that this Brownian sphere, which is the continuum limit of planar q-angulations, themselves dual to the dominant graphs of matrix models with \({\mathrm{Tr}}\,M^q\) interaction [20, 21], is the same [22,23,24] as the Liouville quantum field theory formulation of pure two dimensional quantum gravity, where the Liouville field describes the conformal factor of the random metric [25,26,27]. Planar graphs (more technically planar combinatorial maps) can be thought of as a natural evolution of random trees through the addition of some random labels as in the Cori–Vauquelin–Schaeffer [13] and Bouttier–di-Francesco–Guitter [14] bijections, or through equivalent mating processes [28, 29].

Quantum field theory on random spaces has been developed mostly not on random treesFootnote 2 but on the more complicated two dimensional random geometries for many reasons. Physicists are very interested in conformal field theories (CFT) since they enjoy universal properties as fixed points of the renormalization group. In flat two dimensional space, there exists a rich family of non-trivial CFTs for which exact analytic results can be obtained. When such CFTs are coupled to Liouville gravity, the critical indices of matter are modified in a computable way through the celebrated Knizhnik–Polyakov–Zamolodchikov [32, 33] and David–Distler–Kawai [34, 35] relations. This led during the last 40 years to a flurry of marvelous results, both in theoretical physics and mathematics, that we cannot even roughly sketch here [36]. The link with mainstream string theory is a powerful motivation [25]. Also in such studies the sphere or the \({\mathbb {R}}^2\) plane still provides a fixed background topology. Randomness of space-time is reduced to the familiar Liouville scalar field which represents the fluctuations of the conformal factor of the metric. Clearly this is conceptually less disturbing than a completely random geometric point of view, for which even observables may not be obvious to define, as they have to be attached to features common to almost all objects of the statistical sum. Finally and perhaps most importantly, random trees and branched polymers were considered until rather recently quite trivial and unpromising for quantum gravity. This has changed following two discoveries of the last decade.

Fig. 1. An infinite binary tree with horizontal spine and Galton–Watson branches

Fig. 2. Zooming towards the continuum random tree

Firstly the large N expansion of random tensor models [37,38,39,40] was found [41,42,43]. Melonic graphs dominate at leading order [44]. The generality and robustness of this result have been increasingly confirmed over time [45,46,47,48,49]. Since the dual graphs of random tensors of rank r perform a statistical sum over a huge geometric category including all piecewise linear quasi-manifolds of dimension r, the melonic graphs form a natural entrance door to higher dimensional quantum gravity, a point of view advocated in [50,51,52]. However melons are a strict subset of planar graphs. Equipped with the graph distance, they have as scaling limit the branched polymer phase [53]. This seems a puzzling step backward compared to the two dimensional world of matrices, planar graphs and strings. Ordinary double scaling in tensor models [54,55,56] does not lead out of the branched polymer phase. Of course more interesting geometric phases may hide in more sophisticated multiple-scaling limits of tensor models or as non-trivial fixed points of the renormalization group for tensor field theories [57,58,59,60,61], right now an active research field [62,63,64].

A second independent discovery unexpectedly boosted the excitement about melons, namely the uncovering of the holographic and maximally chaotic properties of the SYK model [65,66,67,68]. These properties point to an interesting gravitational dual, and launched a new avenue of research. However the relationship between quantum gravity and such a simple one dimensional model of condensed matter remains somewhat mysterious. Part of the veil was lifted when the tensor and SYK research lines were related by tensor models à la Gurau-Witten [69,70,71,72] which have the same chaotic properties as SYK but are bona fide quantum theories. They are now seriously considered as providing the first computable toy models of truly quantum black-holes [73,74,75]. The study of tensor models on higher dimensional ordinary spaces and of their possible gravitational dual is also active and promising (e.g. [76,77,78,79]).

Nevertheless it remains an open problem to connect these developments to the initial random geometric motivation of tensor models [81]. This connection could happen through the development of a new “random holography” chapter of the gauge-gravity and AdS/CFT ongoing saga [82, 83].

The present paper is a modest step in this direction. We propose a systematic study of QFT on random trees, since we feel that random trees form the most natural way to randomize the fixed time of SYK-type quantum models. The spine common to all the infinite trees in the random sum allows one to define convenient one dimensional observables for which the translation invariance and Fourier analysis of the SYK-type models remain available. This is reminiscent of the two dimensional case, the Galton–Watson trees along the spine being a one-dimensional analog of the bumps of the Liouville field. We would like to summarize this analogy in the bold statement that pure quantum gravity in dimension 1 or “gravitational time” is simply ordinary time dressed by random lateral trees.

We shall limit ourselves in this paper to perturbative results on Feynman amplitudes for a self-interacting scalar theory. We take as propagator a fractional rescaled Laplacian as in [95,96,97] to put ourselves in the interesting just renormalizable case. Our basic tool is the multiscale analysis of Feynman amplitudes [84, 85], which remains available on random trees since it simply slices the proper timeFootnote 3 of the random path representation of the inverse of the Laplacian.

Combining this slicing with the probabilistic estimates of Barlow and Kumagai [8, 9] we establish basic theorems on power counting, convergence and renormalization of Feynman amplitudes. Our main results, Theorems 3.6 and 4.3 below, use the Barlow–Kumagai technique of “\(\lambda \)-good balls” to prove that the averaged amplitude of any graph without superficially divergent subgraphs is finite and that logarithmically divergent graphs and subgraphs can be renormalized via local counterterms. We think that these results validate the intuition that from the physics perspective random trees indeed behave as an effective space of dimension 4/3. We postpone the more complete analysis of specific models and of their renormalization group flows and non-perturbative or constructive properties to the future.

Finally, let us remark that Barlow and Kumagai obtained heat kernel bounds [8] for the Incipient Infinite Cluster (IIC) on Cayley trees (regular and rooted). This graph contains as subgraphs all clusters connected to the root of given size n, for all \(n\in \mathbb {N}\), when considering the critical percolation on Cayley trees [10]. In the continuum limit it is the Aldous Continuum Random Tree (CRT) [5, 6]. [11] showed that the scaling limit of random walks on Galton–Watson trees is the Brownian motion on the CRT and [12] obtained quenched bounds on heat kernel on the CRT compatible with the ones of [8], with techniques that generalize more easily to the random graphs known as Random Conductance Models (see for instance Ch. 8 of [9] for an overview of results and their proofs).

The paper is structured as follows. In Sect. 2, we introduce the ensemble of random trees that will be of concern as well as the random walk approach to the propagator of the theory. We also recall the multiscale point of view for renormalization towards an infrared fixed point and motivate the rescaling of the Laplacian appropriate for just renormalizable models. After presenting briefly in Sect. 3 the needed results of [8], we prove upper and lower bounds on completely convergent graphs. In Sect. 4, we obtain upper bounds on differences of amplitudes when transporting external legs, which are needed to ensure that counterterms are local. Finally, we discuss in Sect. 5 the setting that we think would stand for an analog of finite temperature field theory in this framework and the description of a model that would naturally serve as a concrete playground for the methods exposed below.

2 Quantum Field Theory on a Graph

2.1 \(\phi ^{q}\) QFT on a graph

For this introductory section we follow [87] (in particular its section 3.3.2). Let us consider a space-time which is a proper connected graph \(\Gamma \), with vertex set \(V_\Gamma \) and edge set \(E_\Gamma \). It can be taken finite or infinite. The word “proper” means that the graph has neither multiedges nor self-loops (often called tadpoles in physics). In the finite case we often omit cardinality symbols such as \(\vert V_\Gamma \vert \), \(\vert E_\Gamma \vert \) when there is no ambiguity. In practice in this paper we shall consider mostly trees, more precisely either finite trees \(\Gamma \) for which \(V_\Gamma = E_\Gamma +1\), or infinite trees in the sense of [7] which can be also interpreted as conditioned percolation clusters or Galton–Watson trees conditioned on non-extinction in the sense of [8]. The main characteristic of such infinite trees is to have a single infinite spine \({\mathcal {S}}(\Gamma ) \subset V(\Gamma )\). This spine is decorated all along by lateral independent Galton–Watson finite critical trees, which we call the branches, see Fig. 1.

On any such graph \(\Gamma \), there is a natural notion of the Laplace operator \({\mathcal {L}}_\Gamma \). We recall that on a directed graph \(\Gamma \) the incidence matrix is the rectangular V by E matrix with indices running over vertices and edges respectively, such that

  • \({\epsilon _\Gamma (v,e)}\) is +1 if e ends at v,

  • \({\epsilon _\Gamma (v,e)}\) is -1 if e starts at v,

  • \({\epsilon _\Gamma (v,e)}\) is 0 otherwise.

The V by V square matrix with entries \(d_v\) on the diagonal is called the degree or coordination matrix \(D_\Gamma \). The adjacency matrix is the symmetric \(V \times V\) matrix \(A_\Gamma \) made of zeroes on the diagonal: \(A_\Gamma (v,v) = 0 \;\; \forall v\in V\), and such that if \(v \ne w\) then \(A_\Gamma (v,w)\) is the number of edges of \(\Gamma \) which have vertices v and w as their ends. Finally the Laplacian matrix of \(\Gamma \) is defined to be \({\mathcal {L}}_\Gamma = D_\Gamma - A_\Gamma \). Its positivity properties stem from the important fact that it is a kind of square of the incidence matrix, namely

$$\begin{aligned} {\mathcal {L}}_\Gamma = \epsilon _\Gamma \cdot \epsilon _\Gamma ^\star . \end{aligned}$$
(1)

Remark that this Laplacian is a positive rather than a negative operator (the sign convention being opposite to the one of differential geometry). Its kernel (the constant functions) has dimension 1 since \(\Gamma \) is connected.
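The factorization (1) and its consequences can be checked numerically on a small example. The sketch below assumes a hypothetical 5-vertex tree with an arbitrary orientation of its edges (neither is taken from the paper) and uses NumPy; it builds the incidence matrix, compares \(\epsilon _\Gamma \cdot \epsilon _\Gamma ^\star \) with \(D_\Gamma - A_\Gamma \), and computes the spectrum.

```python
import numpy as np

# Hypothetical toy tree (not from the paper): 5 vertices, 4 oriented edges (start, end).
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
n_vertices = 5

def incidence_matrix(n_vertices, edges):
    """V-by-E matrix: +1 where the edge ends, -1 where it starts, 0 otherwise."""
    eps = np.zeros((n_vertices, len(edges)))
    for e, (start, end) in enumerate(edges):
        eps[start, e] = -1.0
        eps[end, e] = 1.0
    return eps

def laplacian_from_degrees(n_vertices, edges):
    """Laplacian defined as degree matrix minus adjacency matrix."""
    adjacency = np.zeros((n_vertices, n_vertices))
    for start, end in edges:
        adjacency[start, end] += 1.0
        adjacency[end, start] += 1.0
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

eps = incidence_matrix(n_vertices, edges)
lap = laplacian_from_degrees(n_vertices, edges)
eigenvalues = np.linalg.eigvalsh(lap)  # real spectrum of the symmetric matrix
```

The result does not depend on the chosen orientation, since flipping an edge changes the sign of one column of the incidence matrix and leaves the product unchanged.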

The kernel \(C_\Gamma (x, y) \) of the inverse of this operator is formally given by the sum over random paths \(\omega \) from x to y

$$\begin{aligned} {\mathcal {L}}_\Gamma ^{-1} = C_\Gamma (x, y) =\left[ \sum _n \left( \frac{1}{D_\Gamma }A_\Gamma \right) ^n \frac{1}{D_\Gamma } \right] (x,y)= \sum _{\omega : x \rightarrow y} \; \prod _{v\in \Gamma } \;\biggl [\frac{1}{d_v}\biggr ]^{n_v (\omega )} \end{aligned}$$
(2)

where \(d_v= D_\Gamma (v,v) = {\mathcal {L}}_\Gamma (v,v)\) is the coordination at v and \(n_v (\omega )\) is the number of visits of \(\omega \) at v. We sometimes omit the index \(\Gamma \) when there is no ambiguity.

As we know this series is not convergent without an infrared regulator (this is related to the Laplacian having a constant zero mode). For a finite \(\Gamma \) we can take out this zero mode by fixing a root vertex in the graph and deleting the corresponding line and column in \({\mathcal {L}}_\Gamma \). But it is more symmetric to use the mass regularization. It adds \(m^2 {\mathbb {1}}\) to the Laplacian, where \({\mathbb {1}}\) is the identity operator on \(\Gamma \), with kernel \(\delta (x,y)\). Defining \(C_\Gamma ^m (x, y)\) as the kernel of \(({\mathcal {L}}_\Gamma + m^2 {\mathbb {1}})^{-1}\) we have the convergent path representation

$$\begin{aligned} C_\Gamma ^m (x, y) = \sum _{\omega : x \rightarrow y} \; \prod _{v\in \Gamma } \;\biggl [\frac{1}{d_v + m^2}\biggr ]^{n_v (\omega )} \end{aligned}$$
(3)

and the infrared limit corresponds to \(m \rightarrow 0\).
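As an illustration of the path representation (3), the following sketch sums the walk expansion order by order and compares it with a direct matrix inversion. The 5-vertex tree, the value of \(m^2\) and the truncation at 400 steps are assumptions made for this example only.

```python
import numpy as np

# Hypothetical toy data: a 5-vertex tree and a mass squared m^2 = 0.5.
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
n_vertices, mass_sq = 5, 0.5

A = np.zeros((n_vertices, n_vertices))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
D = np.diag(A.sum(axis=1))
identity = np.eye(n_vertices)

# Direct inverse of the massive Laplacian.
exact = np.linalg.inv(D - A + mass_sq * identity)

# Walk expansion: each visited vertex v contributes a factor 1/(d_v + m^2).
vertex_factor = np.linalg.inv(D + mass_sq * identity)  # diagonal matrix
term = vertex_factor.copy()        # walks of length 0
path_sum = np.zeros((n_vertices, n_vertices))
for _ in range(400):               # truncation: walks of length < 400
    path_sum += term
    term = vertex_factor @ A @ term  # extend every walk by one step
```

Each step multiplies the remaining weight by at most \(d_v/(d_v+m^2) < 1\), so the truncated sum converges geometrically for \(m > 0\); at \(m = 0\) this factor equals 1 and the series diverges, which is the zero-mode problem mentioned above.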

A scalar Bosonic free field theory \(\phi \) on \(\Gamma \) is a function \(\phi : V_\Gamma \rightarrow {\mathbb {R}}\) defined on the vertices of the graph and measured with the Gaussian measure

$$\begin{aligned} d\mu _{C_\Gamma }(\phi ) = \frac{1}{Z_0} e^{-\frac{1}{2} \phi ({\mathcal {L}}_\Gamma + m^2 {\mathbb {1}}) \phi } \prod _{x\in V_\Gamma } d\phi (x) , \end{aligned}$$
(4)

where \(Z_0\) is a normalization constant. It is obviously well-defined as a finite dimensional probability measure for \(m >0\) and \(\Gamma \) finite. The associated infrared divergences appear in the limit \(m \rightarrow 0\) and govern the large distance behavior of the QFT in the limit of infinite graphs \(\Gamma \). The systematic way to study QFT divergences is through a multiscale expansion in the spirit of [84,85,86, 88,89,90,91,92]. No matter whether an ultraviolet or an infrared limit is considered, the renormalization group always flows from ultraviolet to infrared and the same techniques apply in both cases.

The \(\phi ^{q}\) interacting theory is then defined by the formal functional integral [87]:

$$\begin{aligned} d\nu _\Gamma (\phi ) = \frac{1}{Z(\Gamma , \lambda )} e^{- \lambda \sum _{x \in V_\Gamma } \phi ^{q} (x) } d \mu _{C_\Gamma } (\phi ) , \end{aligned}$$
(5)

where the new normalization is

$$\begin{aligned} Z(\Gamma , \lambda )= \int e^{- \lambda \sum _{x \in V_\Gamma } \phi ^{q} (x) } d \mu _{C_\Gamma } (\phi ) = \int d\nu _\Gamma (\phi ) . \end{aligned}$$
(6)

The correlations (Schwinger functions) of the \(\phi ^{q}\) model on \(\Gamma \) are the normalized moments of this measure:

$$\begin{aligned} S_{N} (z_1,\ldots ,z_N ) = \int \phi (z_1)\ldots \phi (z_N) \; d\nu _\Gamma (\phi ) , \end{aligned}$$
(7)

where the \(z_i\) are external positions hence fixed vertices of \(\Gamma \). The case of fixed flat d-dimensional lattice corresponds to \(\Gamma ={{\mathbb {Z}}}^d\). As well known the Schwinger functions expand in the formal series of Feynman graphs

$$\begin{aligned} S_{N} (z_1,\ldots ,z_N ) = \sum _{V=0}^\infty \frac{(- \lambda )^V}{V!} \sum _G A_G (z_1,\ldots ,z_N ), \end{aligned}$$
(8)

where the sum over G runs over Feynman graphs with V internal vertices of valence q and N external leaves of valence 1. Beware not to confuse these Feynman graphs with the “space-time” graph \(\Gamma \) on which the QFT lives. More precisely \(E_G\) is the disjoint union of a set \(I_G\) of internal edges and of a set \(N_G\) of external edges, and for the interaction \(\phi ^{q}\) these Feynman graphs have \(V_G =V\) internal vertices which are regular with total degree q and \(N_G=N\) external leaves of degree 1. Hence \(qV_G= 2E_G+N_G\). If q is even this as usual implies parity rules, namely \(N_G\) has also to be even. We often write simply V, E, N instead of \(V_G\), \(E_G\) and \(N_G\) when there is no ambiguity. In this paper, we do not care about exact combinatoric factors nor about convergence of this series although these are of course important issues treated elsewhere [86]. We also shall consider only connected Feynman graphs G, which occur in the expansion of the connected Schwinger functions.

As usual the treatment of external edges is attached to a choice for the external arguments of the graph. Our typical choice here is to use external edges which all link a q-regular internal vertex to a 1-regular leaf with fixed external positions \(z_1, \ldots , z_N\) in \(\Gamma \). The (unamputated) graph amplitude is then a function of the external arguments obtained by integrating all positions \(x_v\) of internal vertices v of G over our space time, which is \(V(\Gamma )\). Hence

$$\begin{aligned} A_{G}(z_1, \ldots , z_N) = \sum _{\begin{array}{c} x_v \in V_\Gamma \\ v\in V_G \end{array}} \prod _{\ell \in E_G } C_\Gamma ^m (x_{\ell },y_{\ell }) \end{aligned}$$
(9)

where \(x_\ell \) and \(y_\ell \) is our (sloppy, but compact!) notation for the vertex-positions at the two ends of edge \(\ell \).
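To make formula (9) concrete, here is a minimal sketch for one specific Feynman graph, chosen for this example: the \(\phi ^4\) “bubble” with two internal vertices x and y joined by two internal edges, each vertex carrying two external legs. The space-time graph is a hypothetical 5-vertex tree and the function names are illustrative.

```python
import numpy as np

# Hypothetical toy space-time graph: a 5-vertex tree, with mass squared m^2 = 0.5.
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
n_vertices, mass_sq = 5, 0.5

A = np.zeros((n_vertices, n_vertices))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
D = np.diag(A.sum(axis=1))
C = np.linalg.inv(D - A + mass_sq * np.eye(n_vertices))  # massive propagator

def bubble_amplitude(C, z1, z2, z3, z4):
    """Sum over x, y of C(x,z1) C(x,z2) C(x,y)^2 C(y,z3) C(y,z4)."""
    left = C[:, z1] * C[:, z2]    # external legs attached to x
    right = C[:, z3] * C[:, z4]   # external legs attached to y
    return float(left @ (C ** 2) @ right)  # C**2 is the entrywise square

def bubble_amplitude_bruteforce(C, z1, z2, z3, z4):
    """Same amplitude, written as the explicit double sum over positions."""
    total = 0.0
    for x in range(C.shape[0]):
        for y in range(C.shape[0]):
            total += C[x, z1] * C[x, z2] * C[x, y] ** 2 * C[y, z3] * C[y, z4]
    return total
```

For instance `bubble_amplitude(C, 0, 2, 4, 4)` evaluates the amplitude with external positions at vertices 0, 2, 4 and 4. The vectorized form only reorganizes the double sum; it is practical on a finite graph, whereas on an infinite tree the sum over positions is exactly what the power counting has to control.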

We now consider perturbative QFT on random trees, which we denote from now on by T instead of \(\Gamma \). The universality class of random trees [5, 6] is the Gromov–Hausdorff limit of any critical Galton–Watson tree process with fixed branching rate [4], and conditioned on non-extinction. It has a unique infinite spine, decorated with a product of independent Galton–Watson measures for the branches along the spine [7]. We briefly recall the corresponding probability measure, following closely [7], but instead of half-infinite rooted trees with spine labeled by \({{\mathbb {N}}}\) we consider trees with a spine infinite in both directions, hence labeled by \({{\mathbb {Z}}}\).

The order \(\vert T \vert \) of a rooted tree is defined as its number of edges. To a set of non-negative branching weights \(w_i,\,i\in \mathbb N^\star \) is associated the weights generating function \(g(z):= \sum _{i \ge 1} w_i z^{i-1} \) and the finite volume partition function \(Z_n\) on the set \({\mathcal {T}}_n\) of all rooted trees T with root r of order \(\vert T \vert =n\)

$$\begin{aligned} Z_n = \sum _{T \in {\mathcal {T}}_n}\prod _{u\in T {\setminus } r}w_{d_u}, \end{aligned}$$
(10)

where \(d_u\) denotes the degree of the vertex u. The generating function for all \(Z_n\)’s is

$$\begin{aligned} Z(\zeta ) = \sum _{n=1}^\infty Z_n \zeta ^n . \end{aligned}$$
(11)

It satisfies the equation [7]

$$\begin{aligned} Z(\zeta ) =\zeta g(Z (\zeta )). \end{aligned}$$
(12)

Assuming a finite radius of convergence \(\zeta _0\) for Z one defines

$$\begin{aligned} Z_0 = \lim _{\zeta \uparrow \zeta _0}Z(\zeta ). \end{aligned}$$
(13)

The critical Galton–Watson probabilities \(p_i := \zeta _0 w_{i+1}Z_0^{i-1}\) for \(i\in {\mathbb {N}}\) are then normalized: \(\sum _{i=0}^\infty p_i = 1\). We then consider the class of infinite random trees defined by an infinite spine of vertices \(s_k\), \(k\in {\mathbb {Z}}\), plus a collection of \(d_k -2\) finite branches \(T^{(1)}_k, \dots , T^{(d_k -2)}_k\), at each vertex \(s_k\) of the spine (recall the degree of \(s_k\) is indeed \(d_k\)). The set of such infinite trees is called \({\mathcal {T}}_\infty \). It is equipped with a probability measure \(\nu \) that we now describe. This measure is obtained as a limit of measures \(\nu _n\) on finite trees of order n. These measures \(\nu _n\) are defined by identically and independently distributing branches around a spine with measures

$$\begin{aligned} \mu ( T) = Z_0^{-1}\zeta _0^{|T|}\prod _{u \in T {\setminus } r}w_{d_u} = \prod _{u\in T{\setminus } r}p_{d_u -1}. \end{aligned}$$
(14)

Theorem

[7] Viewing \(\nu _n(T) = Z_n^{-1}\prod _{u\in T {\setminus } r}w_{d_u},\quad T\in {\mathcal {T}}_n,\) as a probability measure on \({\mathcal {T}}\) we have

$$\begin{aligned} \nu _n \rightarrow \nu \quad \text {as}\quad n\rightarrow \infty , \end{aligned}$$
(15)

where \(\nu \) is the probability measure on \({\mathcal {T}}\) concentrated on the subset of infinite trees \({\mathcal {T}}_\infty \). Moreover the spectral dimension of generic infinite tree ensembles is \( d_{spec} = 4/3\,\).

From now on we write \({\mathbb {E}}(f)\) for the average according to the measure \(d \nu \) of a function f depending on the tree T, and \({\mathbb {P}}\) for the probability of an event A according to \(d\nu \). Hence \({\mathbb {P}}(A) = {\mathbb {E}}( \chi _A)\) where \(\chi _A\) is the characteristic function for the event A to occur. For simplicity and in order not to lose the reader’s attention in inessential details we shall also restrict ourselves from now on to the case of critical binary Galton–Watson trees. It corresponds to weights \(w_1 = w_3 = 1\), and \(w_i = 0\) for all other values of i. In this case the above formulas simplify. The critical Galton–Watson process corresponds to offspring probabilities \(p_0 = p_2 = \frac{1}{2} \), \(p_i = 0\) for \(i \ne 0, 2\). The generating function for the branching weights is simply \(g(z) = 1 + z^2\) and the generating function for the finite volume trees \(Z(\zeta ) = \sum _{n=1}^\infty Z_n \zeta ^n \) obeys the simple equation \( Z(\zeta ) =\zeta (1 + Z^2 (\zeta ))\), which solves to the Catalan function \(Z= \frac{1 - \sqrt{1 - 4 \zeta ^2}}{2\zeta }\). In the above notations the radius of convergence of this function is \(\zeta _0 = \frac{1}{2}\). Moreover \(Z_0 = \lim _{\zeta \uparrow \zeta _0}Z(\zeta ) =1\) and the independent measure on each branch of our random trees is simply

$$\begin{aligned} \mu (T) = 2^{- \vert T \vert }. \end{aligned}$$
(16)
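The binary case can be checked by expanding the fixed-point equation \(Z(\zeta ) =\zeta (1 + Z^2 (\zeta ))\) order by order. The sketch below (function name and truncation order are choices made for this example) computes the coefficients \(Z_n\), compares them with Catalan numbers and with the closed form, and evaluates a partial sum of the branch measure (16).

```python
from math import comb, sqrt

def partition_coefficients(max_order):
    """Coefficients Z_n of Z(zeta) = zeta * (1 + Z(zeta)^2), for 0 <= n <= max_order."""
    Z = [0] * (max_order + 1)
    for n in range(1, max_order + 1):
        # Z_n is the coefficient of zeta^(n-1) in 1 + Z(zeta)^2.
        convolution = sum(Z[k] * Z[n - 1 - k] for k in range(n))
        Z[n] = (1 if n == 1 else 0) + convolution
    return Z

Z = partition_coefficients(201)

# Only odd orders contribute, and Z_(2k+1) is the k-th Catalan number.
catalan_ok = all(Z[2 * k + 1] == comb(2 * k, k) // (k + 1) for k in range(100))

# Compare the truncated series with the closed form inside the disc of convergence.
zeta = 0.3
series_value = sum(Z[n] * zeta ** n for n in range(1, 202))
closed_form = (1 - sqrt(1 - 4 * zeta ** 2)) / (2 * zeta)

# Partial sum of mu(T) = 2^(-|T|) over all branches with at most 201 edges.
partial_mass = sum(Z[n] * 0.5 ** n for n in range(1, 202))
```

The partial mass approaches 1 only slowly, because the terms decay like \(n^{-3/2}\) at the critical point \(\zeta _0 = \frac{1}{2}\); this heavy tail is the signature of criticality of the branches.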

2.2 Fractional Laplacians

Since the most interesting QFTs (including, in dimension 1, the tensorial theories à la Gurau–Witten) are the ones with just renormalizable power counting, we want to state our result in that case. A time-honored method for that is to raise the ordinary Laplacian to a suitable fractional power \(\alpha \) in the QFT propagator [95,96,97]. We assume from now on that this fractional power obeys \(0< \alpha <1\) and call \(C^\alpha \) the corresponding propagator, i.e. the kernel of \({\mathcal {L}}^{-\alpha }\). It is most conveniently computed using the identity

$$\begin{aligned} {\mathcal {L}}^{-\alpha } = \frac{\sin \pi \alpha }{\pi } \int _0^\infty \frac{2m^{1-2\alpha }}{{\mathcal {L}}+ m^2} dm \end{aligned}$$
(17)

since this “Källén–Lehmann” representation respects the positivity properties of the random path representation of the ordinary Laplacian inverse.
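Identity (17) can be tested on a single positive number x standing for an eigenvalue of \({\mathcal {L}}\). The sketch below is a plain trapezoid rule after the substitution \(m = e^s\); the cutoff `s_max`, the step size and the test values are assumptions of this example, tuned for \(\alpha \) not too close to 0 or 1.

```python
import math

def fractional_inverse_power(x, alpha, s_max=40.0, step=0.01):
    """Estimate (sin(pi*alpha)/pi) * int_0^inf 2 m^(1-2 alpha) / (x + m^2) dm."""
    n_points = int(round(2 * s_max / step)) + 1
    total = 0.0
    for i in range(n_points):
        s = -s_max + i * step
        m = math.exp(s)
        # Integrand in the variable s, including the Jacobian dm = m ds.
        integrand = 2.0 * m ** (2.0 - 2.0 * alpha) / (x + m * m)
        weight = 0.5 if i in (0, n_points - 1) else 1.0  # trapezoid end points
        total += weight * integrand
    return math.sin(math.pi * alpha) / math.pi * total * step

# alpha = 1/3 is the critical value for q = 4 on random trees; x = 2.5 is arbitrary.
estimate = fractional_inverse_power(2.5, 1.0 / 3.0)
expected = 2.5 ** (-1.0 / 3.0)
```

Since the weight \(2m^{1-2\alpha }\) is positive, the fractional propagator is a positive superposition of massive propagators, which is why the random path bounds carry over to \(C^\alpha \).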

In the continuum \({\mathbb {R}}^d\) case, we have the ordinary heat kernel integral representation

$$\begin{aligned} C^\alpha _{{{\mathbb {R}}}^d}(x,y)= & {} \frac{\sin \pi \alpha }{\pi } \int _0^\infty 2m^{1-2\alpha }dm \int _0^\infty e^{-m^{2}t - \frac{\vert x-y \vert ^{2} }{4t }} \frac{d t}{t^{d/2}} . \end{aligned}$$
(18)

On \({{\mathbb {Z}}}^d\) the rescaled kernel of the Laplacian between points x and y is similarly obtained from eq. (17), using the random walk representation:

$$\begin{aligned} C_{{{\mathbb {Z}}}^d}^{\alpha }(x,y) = \frac{\sin \pi \alpha }{\pi } \int _0^\infty 2m^{1-2\alpha } dm\sum _{\omega : x \rightarrow y}\prod _v \;\biggl [\frac{1}{2d + m^2}\biggr ]^{n_v (\omega )} \end{aligned}$$
(19)

where \(n_v (\omega )\) is the number of visits of \(\omega \) at v. Notice that each vertex on \({{\mathbb {Z}}}^d\) has degree 2d.

As remarked above, in the case of a general graph \(\Gamma \) we no longer have translation invariance or Fourier integrals, but we still have the random path expansion, so that

$$\begin{aligned} C_{\Gamma }^{\alpha }(x,y) = \frac{\sin \pi \alpha }{\pi } \int _0^\infty 2m^{1-2\alpha }dm\sum _{\omega : x \rightarrow y}\prod _v \;\biggl [\frac{1}{d_v + m^2}\biggr ]^{n_v (\omega )} \end{aligned}$$
(20)

where the walks \(\omega \) now live on \(\Gamma \) and \(d_v\) is the degree at vertex v.

2.3 The random tree critical power \(\alpha = \frac{2}{3} - \frac{4}{3q}\)

In integer dimension d, standard QFT power counting with propagator \(C^\alpha \) relies on the standard notion of degree of divergence. For a regular Feynman graph of degree q with N external legs, this degree is defined as

$$\begin{aligned} \omega (G) = (d-2 \alpha )E - d (V-1) = (d-2 \alpha )(qV-N)/2 - d (V-1) . \end{aligned}$$
(21)

This power counting is neutral (hence does not depend on V) in the critical or just-renormalizable case

$$\begin{aligned} \alpha =\frac{ (q-2)d }{2q } \end{aligned}$$
(22)

in which case we have

$$\begin{aligned} \omega (G) =d \left( 1- \frac{N}{q}\right) . \end{aligned}$$
(23)

For instance if \(q=d=4\) we recover that the \(\phi ^4_4\) theory with propagator \(p^{-2}\) is critical, and if \(d=1\) we recover the critical index \(\alpha = \frac{1}{2} - \frac{1}{q}\) of the infrared SYK theory with q interacting fermions [65,66,67,68].

As we will show in Sect. 4, a just-renormalizable \(\phi ^q\) theory is obtained by substituting in the above formulas the spectral dimension \(d=4/3\) of random trees, namely

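$$\begin{aligned} \alpha = \frac{2}{3} - \frac{4}{3q}, \quad \omega (G) =\frac{4}{3}\left( 1- \frac{N}{q}\right) , \quad \text {i.e.}\quad \omega (G) = \frac{4-N}{3} \;\; \text {for }\; q=4 . \end{aligned}$$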
(24)

This is not surprising since this spectral dimension is precisely related to the short-distance, long-time behavior of the inverse Laplacian averaged on the random tree. We shall fix from now the fractional power \(\alpha \) to its critical value and write \(C_T\) instead of \(C_T^\alpha \). Nevertheless this simple rule requires justification, which is precisely provided by the next sections.
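The power counting of this subsection is simple enough to verify with exact rational arithmetic. The following sketch (function names are illustrative) evaluates the critical power (22) and the degree of divergence (21), and can be used to confirm that at criticality the degree no longer depends on the number V of vertices.

```python
from fractions import Fraction

def critical_alpha(d, q):
    """Just-renormalizable power alpha = (q - 2) d / (2 q), as in (22)."""
    return Fraction(q - 2) * d / (2 * q)

def divergence_degree(d, alpha, q, V, N):
    """Degree of divergence (21), with E = (q V - N) / 2 internal edges."""
    E = Fraction(q * V - N, 2)
    return (d - 2 * alpha) * E - d * (V - 1)

d_tree = Fraction(4, 3)                 # spectral dimension of random trees
alpha_tree = critical_alpha(d_tree, 4)  # quartic interaction, q = 4

# Degree of divergence for N = 2 and N = 4 external legs, at several orders V.
two_point = [divergence_degree(d_tree, alpha_tree, 4, V, 2) for V in range(1, 6)]
four_point = [divergence_degree(d_tree, alpha_tree, 4, V, 4) for V in range(1, 6)]
```

For \(q=4\) this gives \(\alpha = 1/3\), a degree \(2/3\) for two-point graphs and 0 (logarithmic divergence) for four-point graphs, whatever the value of V.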

2.4 Slicing into scales

The multiscale decomposition of Feynman amplitudes is a systematic tool to establish power counting and study perturbative and constructive renormalization in quantum field theory [84,85,86,87]. It relies on a sharp slicing into a geometrically growing sequence of scales of the Feynman parameter for the propagator of the theory. This parameter is nothing but the time in the random path representation of the Laplacian. The short time behavior of the propagator is unimportant since the graph \(\Gamma \) is an ultraviolet regulator in itself. We are therefore interested in infrared problems, namely the long distance behavior of the theory (in terms of the graph distance). In the usual discrete random walk expansion of the inverse Laplacian, the total time is the length of the path hence an integer. This integer when non trivial cannot be smaller than 1. However the results of [8] are formulated in terms of a continuous-time random walk which should have equivalent infrared properties. In what follows we shall use both points of view.

Definition 2.1

(Time-of-the-Path Slicing). We introduce the infrared parametric slicing of the propagator \(1/({\mathcal {L}}+m^2)\):

$$\begin{aligned} C= & {} \sum _{j=0}^{\infty } C^j ; \quad C^0= {\mathbb {1}}, \nonumber \\ C^j= & {} \sum _{\omega : x \rightarrow y \atop M^{2(j-1)} \le n(\omega ) < M^{2j} } \; \prod _{v\in \Gamma } \;\biggl [\frac{1}{d_v + m^2}\biggr ]^{n_v (\omega )} \quad \forall j \ge 1. \end{aligned}$$
(25)

M is a fixed constant which parametrizes the thickness of a renormalization group slice (the craftsman trademark of [86]). Each propagator \(C^{j}\) indeed corresponds to a theory with both an ultraviolet and an infrared cutoff, which differ by the fixed multiplicative constant \(M^2\). An infrared cutoff on the theory is then obtained by setting a maximal value \(\rho = j_{max}\) for the index j. The covariance with this cutoff is therefore

$$\begin{aligned} C_{\rho } = \sum _{j=0}^{\rho } C^{j} . \end{aligned}$$
(26)
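The slicing of Definition 2.1 can be mimicked on a finite graph by sorting walks according to their length. In the sketch below the tree, the mass, the value \(M=2\) and the cutoff \(\rho = 5\) are arbitrary choices; as a bookkeeping convention of this example, walks of length zero are collected in slice 0.

```python
import numpy as np

# Hypothetical toy data: a 5-vertex tree, m^2 = 0.5, slice thickness M = 2, cutoff rho = 5.
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
n_vertices, mass_sq, M, rho = 5, 0.5, 2, 5

A = np.zeros((n_vertices, n_vertices))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
D = np.diag(A.sum(axis=1))
identity = np.eye(n_vertices)
vertex_factor = np.linalg.inv(D + mass_sq * identity)  # diagonal, entries 1/(d_v + m^2)

def slice_index(length, M):
    """Index j >= 1 with M^(2(j-1)) <= length < M^(2j); length 0 goes to slice 0."""
    if length == 0:
        return 0
    j = 1
    while length >= M ** (2 * j):
        j += 1
    return j

slices = [np.zeros((n_vertices, n_vertices)) for _ in range(rho + 1)]
term = vertex_factor.copy()                 # walks of length 0
for length in range(M ** (2 * rho)):        # all walks of length < M^(2 rho)
    slices[slice_index(length, M)] += term
    term = vertex_factor @ A @ term         # walks with one more step

cutoff_propagator = sum(slices)             # analog of C_rho in (26)
exact = np.linalg.inv(D - A + mass_sq * identity)
```

With a non-zero mass the long walks are exponentially suppressed, so here the cutoff propagator already agrees with the full inverse to machine precision; the interesting regime of the paper is the massless one, where every slice matters.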

In the continuum \({\mathbb {R}}^d\) case we have the ordinary heat kernel representation hence the explicit integral representation

$$\begin{aligned} C^{\alpha ,j}_{{{\mathbb {R}}}^d}(x,y)= & {} \frac{\sin \pi \alpha }{\pi } \int _0^\infty 2m^{1-2\alpha }dm \int _{M^{2j}}^{M^{2(j+1)}} e^{-m^{2}t - \frac{\vert x-y \vert ^{2} }{4t }} \frac{dt}{t^{d/2}} \end{aligned}$$
(27)

from which it is standard to deduce scaling bounds such as

$$\begin{aligned} C_{{{\mathbb {R}}}^d}^{\alpha ,j}(x,y)\le & {} K M^{(2 \alpha -d)j} e^{- c M^{-2j}\vert x -y \vert ^2} \end{aligned}$$
(28)

for some constants K and c. From now on in this paper we use most of the time c as a generic name for any inessential constant (c is therefore the same as the O(1) notation in the constructive field theory literature). We shall also omit from now on inessential constant factors such as \( \frac{\sin \pi \alpha }{\pi } \). In \({{\mathbb {Z}}}^d\) the sliced propagator then reads

$$\begin{aligned} C_{{{\mathbb {Z}}}^d}^{\alpha ,j}(x,y) = \int _0^\infty 2m^{1-2\alpha } dm \sum _{\omega : x \rightarrow y \atop M^{2(j-1)} \le n(\omega ) < M^{2j} } \prod _v \;\biggl [\frac{1}{2d + m^2}\biggr ]^{n_v (\omega )} . \end{aligned}$$
(29)

It can still easily be shown to obey the same bound (28). For a general tree T the sliced decomposition of the propagator then reads

$$\begin{aligned} C_{T}^{\alpha }(x,y)= & {} \sum _{j=0}^{\infty } C_{T}^{\alpha ,j}(x,y) ; \quad C_{T}^{\alpha ,0}= {\mathbb {1}}, \quad \mathrm{and\;\; for }\;\; j \ge 1, \end{aligned}$$
(30)
$$\begin{aligned} C_{T}^{\alpha ,j}(x,y)= & {} \int _0^\infty 2m^{1-2\alpha } dm \sum _{\omega : x \rightarrow y \atop M^{2(j-1)} \le n(\omega ) < M^{2j} } \; \prod _{v\in T} \;\biggl [\frac{1}{d_v + m^2}\biggr ]^{n_v (\omega )}. \end{aligned}$$
(31)

Remark that after n steps a path cannot reach farther than distance n (for the discrete time random walk). In particular we can safely include the function \(\chi _j (x,y)\) in any estimate on \(C_{T}^{\alpha ,j}\), where \(\chi _j (x,y)\) is the characteristic function for \(d(x,y) \le M^{2j}\).Footnote 4 A generic tree T in \({\mathcal {T}}\) has spectral dimension 4/3 so that we should expect for such a tree

$$\begin{aligned} C_{T}^{\alpha ,j}(x,y)\le & {} K M^{\left( 2 \alpha -\frac{4}{3}\right) j} \chi _j (x,y) . \end{aligned}$$
(32)

A fixed tree can nevertheless be non-generic, hence has no a priori well defined dimension d. At the same time, since it always contains an infinite spine which has dimension 1, the propagator on any tree T in \({\mathcal {T}}\) should obey the following bound:

$$\begin{aligned} C_{T}^{\alpha ,j}(x,y)\le & {} K M^{(2 \alpha -1)j}\chi _j (x,y) . \end{aligned}$$
(33)

However we do not need a very precise bound for exceptional trees since, as we will see in the next section, they will be wiped out by small probabilistic factors. In fact a very rough “dimension zero” bound can be obtained for all points x, y on T:

$$\begin{aligned} C_{T}^{\alpha ,j}(x,y)\le & {} K M^{2 \alpha j} \chi _j (x,y). \end{aligned}$$
(34)

Indeed, overcounting the number of paths from x to y in time t as the total number of paths from x in time t leads to this inequality. In the binary tree case each vertex degree is bounded by 3. At a visited vertex v we have \(d_v\) choices for the next random path step so that

$$\begin{aligned} \sum _{\omega : x \rightarrow y \atop M^{2(j-1)} \le n(\omega )< M^{2j} } \; \prod _{v\in T} \;\biggl [\frac{1}{d_v + m^2}\biggr ]^{n_v (\omega )}\le & {} \sum _{M^{2(j-1)} \le n< M^{2j} } \; \left[ \frac{3}{3+m^2}\right] ^{n}\end{aligned}$$
(35)
$$\begin{aligned}\le & {} K \int _{M^{2(j-1)}}^{M^{2j}} dt e^{-ctm^2} \end{aligned}$$
(36)

where K and c are some inessential constants.Footnote 5 Then the naive inequality

$$\begin{aligned} K \int _0^\infty m^{1-2 \alpha }dm \int _{M^{2(j-1)}}^{M^{2j}} dt e^{-ctm^2} \le K' M^{2j\alpha }, \end{aligned}$$
(37)

allows us to conclude.
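The scaling in (37) can be made explicit, since both integrals are elementary: the m-integral is a Gamma integral, \(\int _0^\infty m^{1-2\alpha } e^{-ctm^2} dm = \frac{1}{2}\Gamma (1-\alpha )(ct)^{\alpha -1}\), and the remaining t-integral is a power. The following is a minimal sketch, assuming \(0<\alpha <1\) and \(c>0\); the function names and the default values of M, \(\alpha \) and c are illustrative choices, not taken from the text.

```python
import math

def double_integral(j, M=2.0, alpha=1/3, c=1.0):
    """Closed form of int_0^inf m^(1-2a) dm int_{M^(2j-2)}^{M^(2j)} dt exp(-c t m^2)."""
    # m-integral: (1/2) * Gamma(1 - alpha) * (c t)^(alpha - 1)
    prefactor = 0.5 * math.gamma(1 - alpha) * c ** (alpha - 1)
    # t-integral of t^(alpha - 1) over the slice [M^(2j-2), M^(2j)]
    t_integral = (M ** (2 * j * alpha) - M ** (2 * (j - 1) * alpha)) / alpha
    return prefactor * t_integral

def ratio_to_scaling(j, M=2.0, alpha=1/3, c=1.0):
    """Ratio of the double integral to M^(2 j alpha); it should not depend on j."""
    return double_integral(j, M, alpha, c) / M ** (2 * j * alpha)
```

The ratio is the same for every slice j, which is the content of the bound \(K' M^{2j\alpha }\).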

Yet none of the bounds (32)–(34) is sufficient to establish the correct power counting of Feynman amplitudes averaged over \(T \in {\mathcal {T}}\). We need to combine the multiscale decomposition (the best tool to estimate general Feynman amplitudes on a fixed space) with probabilistic estimates to show that the prefactor \( M^{\left( 2 \alpha -\frac{4}{3}\right) j} \) in (32) is indeed the typical one and that the typical volume factors for the integrals on vertex positions also correspond to those of a space of dimension 4/3.

2.5 The multiscale analysis

Consider a fixed connected Feynman graph G with n internal vertices, all of degree \(q=4\), N external edges and \(L =2n -N/2\) internal edges. There are in fact several possible prescriptions to treat external arguments in a Feynman amplitude [84,85,86], but they are essentially equivalent from the point of view of integrating the product of propagators over inner vertices. A convenient and simple choice is to put all external legs in the most infrared scale, namely the infrared cutoff scale \(\rho \) (similar to a zero external momenta prescription in a massive theory), and to work with amputated amplitudes which no longer depend on the external positions \(z_1, \ldots , z_N\) but only on the position \(x_0\) of a fixed inner root vertex \(v_0\). This means that we drop the N(G) external propagator factors \(C_{T}(x_{v(k)},z_k) \) from \(A_G\) and integrate only over the \(n-1\) positions \(x_v, v \in \{1, \dots , n-1 \}\). In this way we get an amplitude \(A_G^{amp} (x_0)\) which is solely a functionFootnote 6 of \(x_0\). However we should remember that fields and propagators at the external cutoff scale have a canonical dimension which in our case for a field of scale j is \(M^{-j/3}\). To compensate for the missing factors after amputation we multiply this amputated amplitude by \(M^{-\rho N /3}\), and to account for the fixing of the position \(x_0\) we add another global factor \(M^{4\rho /3}\). Hence we define

$$\begin{aligned} {\tilde{A}}_G^{amp} (x_0) := M^{\rho (4-N) /3} \sum _{\begin{array}{c} x_v \in V(T)\\ 1\le v\le n-1 \end{array}} \prod _{\ell \in I(G) } C_{T} (x_\ell , y_\ell ). \end{aligned}$$
(38)

For simplicity, we write now \(A_G\) again, instead of \({\tilde{A}}^{amp}_G\). The decomposition (30) leads to the multiscale representation for a Feynman graph G, which is:

$$\begin{aligned} A_{G}(x_0)= & {} M^{\rho (4-N) /3} \sum _{\mu } A_{G,\mu }(x_0), \end{aligned}$$
(39)
$$\begin{aligned} A_{G,\mu }(x_0)= & {} \sum _{\begin{array}{c} x_v \in V(T)\\ 1\le v\le n-1 \end{array}} \prod _{\ell \in I(G) } C_{T}^{j_\ell } (x_\ell , y_\ell ) . \end{aligned}$$
(40)

\(\mu \) is called a “scale assignment” (or simply “assignment”). It is a list of integers \(\{ j_\ell \}\), one for each internal edge of G, which provides for each internal edge l of G the scale \(j_\ell \) of that edge. \(A_{G,\mu }\) is the amplitude associated to the pair \((G,\mu )\), and (39)–(40) is the multiscale representation of the Feynman amplitude.

We recall that the key notion in the multiscale analysis of a Feynman amplitude is that of “high” subgraphs. In our infrared setting, this means the connected components of \(G_{j}\), the subgraph of G made of all edges \(\ell \) with index \(j_\ell \le j\). These connected components are labeled as \(G_{j,k}\), \(k=1,\ldots ,k(G_j)\), where \(k(G_j)\) denotes the number of connected components of the graph \(G_j\).

A subgraph \(g \subset G\) then has in the assignment \(\mu \) internal and external indices defined as

$$\begin{aligned} i_g(\mu )= & {} \sup \limits _{l \mathrm{\ internal \ edge \ of \ } g} \mu (l), \end{aligned}$$
(41)
$$\begin{aligned} e_g(\mu )= & {} \inf \limits _{l \mathrm{\ external \ edge \ of \ } g} \mu (l) . \end{aligned}$$
(42)

Connected subgraphs verifying the condition

$$\begin{aligned} e_g(\mu ) > i_g(\mu ) \quad \quad \mathrm{(high \ condition)} \end{aligned}$$
(43)

are exactly the high ones. This definition depends on the assignment \(\mu \). For a high subgraph g and any value of j such that \(i_g (\mu ) < j \le e_g(\mu ) \) there exists exactly one value of k such that g is equal to a \(G_{j,k}\). High subgraphs are partially ordered by inclusion and form a forest in the sense of inclusion relations [84,85,86].
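The definitions above can be tried out on a toy scale assignment. The following is a minimal sketch, assuming a graph given as a list of `(u, v, scale)` triples and external legs all sitting at the cutoff scale \(\rho \); the function names and the example graph are hypothetical. It computes the connected components of \(G_j\) and checks the high condition (43) for each of them.

```python
def high_components(edges, j):
    """Connected components of G_j, each returned as a set of edge indices.

    edges: list of (u, v, scale) triples for the internal edges of G.
    """
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    kept = [(i, u, v) for i, (u, v, s) in enumerate(edges) if s <= j]
    for _, u, v in kept:
        parent[find(u)] = find(v)
    components = {}
    for i, u, _ in kept:
        components.setdefault(find(u), set()).add(i)
    return sorted(components.values(), key=min)


def indices(edges, g, rho):
    """Internal index i_g and external index e_g of the edge set g.

    Assumption of this sketch: the external legs of G sit at scale rho.
    """
    vertices = {w for i in g for w in edges[i][:2]}
    internal = max(edges[i][2] for i in g)
    touching = [s for i, (u, v, s) in enumerate(edges)
                if i not in g and (u in vertices or v in vertices)]
    external = min(touching + [rho])
    return internal, external


# Toy assignment: a 4-cycle with scales 1, 3, 1, 2 and cutoff rho = 5.
example_edges = [(0, 1, 1), (1, 2, 3), (2, 3, 1), (0, 3, 2)]
```

In this example \(G_1\) has two components (edges 0 and 2), which merge into a single component at \(j=2\) when edge 3 is added.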

The key estimates then keep only the spatial decay of a \(\mu \)-optimal spanning tree \(\tau (\mu )\) of G, which minimizes \(\sum _{\ell \in \tau (\mu )} j_\ell (\mu )\) (we use the notation \(\tau \) for spanning trees of G in order not to confuse them with the random tree T). The important property of \(\tau (\mu )\) is that it is a spanning tree within each high component \(G_{j,k}\) [84,85,86]. It always exists and can be chosen according to Kruskal's greedy algorithm [93]. It is unique if every edge is in a different slice; otherwise there may be several such trees, in which case one simply picks one of them.
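A minimal sketch of Kruskal's greedy algorithm applied to a scale assignment is given below, assuming vertices labeled `0..num_vertices-1` and edges given as `(u, v, scale)` triples; the function name and the example are hypothetical. Edges are scanned by increasing scale and an edge is kept whenever it joins two different components; ties are broken by edge order, which corresponds to "simply picking one" of the optimal trees.

```python
def kruskal_tree(num_vertices, edges):
    """Indices of the edges of a spanning tree minimizing the sum of scales.

    edges: list of (u, v, scale) triples; multi-edges and self-loops are allowed.
    """
    parent = list(range(num_vertices))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    tree = []
    # sorted() is stable, so edges of equal scale keep their original order
    for i in sorted(range(len(edges)), key=lambda i: edges[i][2]):
        u, v, _ = edges[i]
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # the edge joins two components: keep it
            parent[root_u] = root_v
            tree.append(i)
    return tree


# Toy assignment: a 4-cycle with scales 1, 3, 1, 2.
cycle_edges = [(0, 1, 1), (1, 2, 3), (2, 3, 1), (0, 3, 2)]
optimal_tree = kruskal_tree(4, cycle_edges)
```

Here the tree keeps the two scale-1 edges and the scale-2 edge and discards the scale-3 edge, so its restriction to the edges of scale at most j is a spanning tree of each component of \(G_j\).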

Suppose we could assume bounds similar to the \(\mathbb {R}^d\) case. It would mean that a sliced propagator in the slice \(j_\ell \) would be bounded as

$$\begin{aligned} C_{T}^{j_\ell } (x_\ell , y_\ell ) \simeq K M^{-2j_\ell /3} e^{-M^{-j_\ell } d(x_\ell ,y_\ell ) } \end{aligned}$$
(44)

and that spatial integrals over each \(x_v\) would be really 4/3 dimensional, i.e. cost \(M^{4j_v/3}\) if performed with the decay of a scale \(j_v\) propagator. Picking a Kruskal tree \(\tau (\mu ) \) with a fixed root vertex, and forgetting the spatial decay of all the edges not in \(\tau \), one can then recursively organize integration over the position \(x_v\) of each internal vertex v from the leaves towards the root. This can indeed be done using for each v the spatial decay of the propagator joining v to its unique towards-the-root ancestor a(v) in the Kruskal tree. In this way, calling \(j_v\) the scale of that propagator, we would get as in [84,85,86] an estimate

$$\begin{aligned} |A_{G,\mu }|\le & {} K^{V(G)} M^{-N \rho /3} \prod _{\ell \in I(G)} M^{-2j_\ell /3} \prod _{v \in V(G)} M^{4j_v/3} \end{aligned}$$
(45)
$$\begin{aligned}= & {} K^{V(G)} \prod _{j =1}^\rho \prod _{k=1}^{k(G_j)} M^{\omega (G_{j,k})} \end{aligned}$$
(46)

where the divergence degree of a subgraph \(S \subset G\) is defined as

$$\begin{aligned} \omega (S) = \frac{2}{3}E(S) - \frac{4}{3} (V(S)-1) = \frac{4-N(S)}{3} . \end{aligned}$$
(47)
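The second equality in (47) rests on the counting relation \(4V(S) = 2E(S) + N(S)\) for a graph whose vertices all have degree 4. The following small sketch checks it in exact arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def divergence_degree(num_edges, num_vertices):
    """omega(S) = (2/3) E(S) - (4/3) (V(S) - 1), as in (47)."""
    return Fraction(2, 3) * num_edges - Fraction(4, 3) * (num_vertices - 1)

def external_legs(num_edges, num_vertices, q=4):
    """N(S) from the half-edge count q V = 2 E + N."""
    return q * num_vertices - 2 * num_edges
```

For instance a subgraph with 2 vertices and 2 internal edges has \(N=4\) and \(\omega =0\), while 2 vertices and 3 internal edges give \(N=2\) and \(\omega =2/3\).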

Standard consequences of such bounds are

  • uniform exponential bounds for completely convergent graphs [86].

  • renormalization analysis: when high subgraphs have positive divergence degree we can efficiently replace them by local counterterms, which create a flow for marginal and relevant operators. The differences are remainder terms which become convergent and obey the same bounds as for convergent graphs, provided we use an effective expansion which renormalizes only high subgraphs [84,85,86].

In fact these bounds cannot be true for all particular trees T since they depend on the Galton–Watson branches being typical. In more exceptional cases, for instance for a tree reduced to the spine plus small lateral branches the effective spatial dimension is 1 rather than 4/3. Such exceptional cases become more and more unlikely when we consider larger and larger sections of the spine. Our probabilistic analysis below proves that for the averaged Feynman amplitudes everything happens as in equation (46). To give a meaning to these averaged amplitudes, we fix the position of the root vertex \(x_0\) to lie on the spine of T. Averaging over T restores translation invariance along the spine, so that we have finally to evaluate averaged amplitudes \({\mathbb {E}}(A_G)\) which are simply numbers. It is for these amplitudes that we shall prove in the next sections our main results Theorems 3.6 and 4.3. But we need to introduce first our essential probabilistic tool, namely the \(\lambda \)-good conditions on trees of [8].

3 Probabilistic Estimates

We first recall the probabilistic estimates on random trees from [8] that we are going to use, slightly simplifying some inessential aspects. As mentioned above, [8] mostly considers random paths which are Markovian processes in continuous time, but these are statistically equivalent to the above discrete processes in the interesting long-time infrared limit, as discussed in Remark 5.3 of [8].

For \(x \in T\), we denote by \(B(x,r)\) the ball of T centered on x and containing the points at distance at most r from x, and by \(m(x,r)\) the number of points of T at distance \(1+[r/4]\) from x, where [.] means the integer part. For a subgraph \(A\subset T\), we define the volume \(V(A) = \sum _{v\in A}d_v\) and more concisely \(V(x,r) = V(B(x,r))\). For \((x,y) \in T^2\), we also write \(q_t (x,y)\) (or sometimes \(q_{t,x}(y)\) to emphasize the starting point x) for the sum over random paths in time t. More precisely, given a continuous time random walk Y on T, starting at x at \(t=0\) and jumping from a vertex v to each of its neighbours with probability \(1/d_v\), after waiting at v for an exponentially distributed time of mean 1 (so that the jump times form a Poisson process of rate 1), the heat-kernel reads

$$\begin{aligned} q_t(x,y) = {\mathbb {P}}^x(Y_t = y)/d_y, \end{aligned}$$
(48)

where \({\mathbb {P}}^x(Y_t=y)\) denotes the probability that the random walk Y sits at y at the time t.
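On a small finite tree the kernel (48) can be computed directly. The sketch below assumes that the jump times of Y form a rate-1 Poisson process, so that the number of jumps up to time t is Poisson distributed with mean t and \({\mathbb {P}}^x(Y_t=y) = \sum _n e^{-t}\frac{t^n}{n!} P^n(x,y)\), with P the discrete-time transition matrix. The adjacency-dictionary representation, the truncation `n_max` and the example tree are illustrative choices; the trees of the text are infinite, so this is only a finite toy model.

```python
import math

def heat_kernel(adj, x, t, n_max=200):
    """q_t(x, y) = P^x(Y_t = y) / d_y for every vertex y of a finite tree.

    adj: dict mapping each vertex to the list of its neighbours.
    """
    step_law = {v: 0.0 for v in adj}   # law of the discrete walk after n steps
    step_law[x] = 1.0
    weight = math.exp(-t)              # Poisson(t) probability of exactly n jumps
    prob = {v: 0.0 for v in adj}
    for n in range(n_max + 1):
        for v in adj:
            prob[v] += weight * step_law[v]
        new_law = {v: 0.0 for v in adj}
        for v, p in step_law.items():
            for w in adj[v]:
                new_law[w] += p / len(adj[v])   # jump to a uniform neighbour
        step_law = new_law
        weight *= t / (n + 1)
    return {v: prob[v] / len(adj[v]) for v in adj}


# A star with centre 1 and leaves 0, 2, 3.
example_tree = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
```

Dividing by \(d_y\) makes the kernel symmetric, \(q_t(x,y) = q_t(y,x)\), which is the property used later when \(q_{2t}\) is written as an inner product.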

For \(\lambda \ge 64\), the ball \(B(x,r)\) is said to be \(\lambda \)-good (Definition 2.11 of [8]) if:

$$\begin{aligned} r^2\lambda ^{-2}\le & {} V(x,r) \le r^2\lambda , \end{aligned}$$
(49)
$$\begin{aligned} m(x,r)\le & {} \frac{1}{64} \lambda , \quad V(x, r/\lambda ) \ge r^2 \lambda ^{-4}, \quad V(x, r/\lambda ^2) \ge r^2 \lambda ^{-6}. \end{aligned}$$
(50)

Remark that if \(B(x,r)\) is \(\lambda \)-good for some \(\lambda \), it is \(\lambda '\)-good for all \(\lambda ' > \lambda \). We will also say that a ball \(B(x,r)\) is \(\lambda \)-bad if it is not \(\lambda \)-good.
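The conditions (49)–(50) are easy to test on an explicit finite tree. The following is a minimal sketch that implements the simplified definitions of \(V(x,r)\) and \(m(x,r)\) exactly as stated in this section (not the original definitions of [8]); the function names and the example, a finite segment standing in for a bare spine, are hypothetical.

```python
from collections import deque

def distances_from(adj, x):
    """Graph distances from x to every vertex, by breadth-first search."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def is_lambda_good(adj, x, r, lam):
    """Check conditions (49)-(50) for the ball B(x, r)."""
    dist = distances_from(adj, x)

    def volume(radius):
        # V(x, radius): sum of the degrees of the vertices in B(x, radius)
        return sum(len(adj[v]) for v, d in dist.items() if d <= radius)

    # m(x, r): number of vertices at distance exactly 1 + [r/4] from x
    m = sum(1 for d in dist.values() if d == 1 + int(r // 4))
    return (r ** 2 / lam ** 2 <= volume(r) <= r ** 2 * lam
            and m <= lam / 64
            and volume(r / lam) >= r ** 2 / lam ** 4
            and volume(r / lam ** 2) >= r ** 2 / lam ** 6)


# A segment of 201 vertices, with the ball centred at its midpoint.
segment = {i: [k for k in (i - 1, i + 1) if 0 <= k <= 200] for i in range(201)}
```

On this segment the ball of radius 16 around the midpoint fails the condition on \(m(x,r)\) for \(\lambda =64\), since two vertices sit at distance 5, but it is \(\lambda \)-good for \(\lambda =128\) and remains so for larger \(\lambda \), in line with the monotonicity remark.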

Corollary 2.12 of [8] proves that

$$\begin{aligned} {\mathbb {P}}(B(x,r) \hbox { is not }\lambda -\hbox {good}) \le c_1 e^{-c_2 \lambda }. \end{aligned}$$
(51)

This inequality together with the Borel-Cantelli lemma implies that given r and a real monotonic sequence \(\{\lambda _{l}\}_{l\ge 0}\) with \(\lim _{l \rightarrow \infty } \lambda _{l}= +\infty \), there is, with probability one, a finite \(l_0\) such that \(B(x,r)\) is \(\lambda _{l_0}\)-good (see also the proof of Theorem 1.5 in [8]). In particular

Lemma 3.1

Defining the random variable \(L=\min \{l : B(x,r) \text { is } \lambda _{l}\text {-good}\}\) we have

$$\begin{aligned} {\mathbb {P}}[ L=l] \le c_1 e^{-c_2 \lambda _{l-1}}. \end{aligned}$$
(52)

Proof

This is because the ball \(B(x,r)\) must then be \(\lambda _{l-1}\text {-bad}\). \(\square \)

Besides, the conditions of \(\lambda \)-goodness allow one to bound with the right scaling the random path factor \(q_t (x,y)\) for y not too far from x. More precisely the main part of Theorem 4.6 of [8] reads

Theorem 3.2

Suppose that \(B=B(x,r)\) is \(\lambda \)–good for \(\lambda \ge 64\), and let \(I(\lambda ,r)=[r^3 \lambda ^{-6}, r^3 \lambda ^{-5}]\). Then

  • for any \(K \ge 0\) and any \(y\in T\) with \(d(x,y) \le Kt^{1/3}\)

    $$\begin{aligned} q_{2t}(x,y) \le c\left( 1+{\sqrt{K}}\right) t^{-2/3} \lambda ^{3} \quad \hbox { for } t\in I(\lambda , r) , \end{aligned}$$
    (53)
  • for any \(y\in T\) with \(d(x,y) \le c_2 r \lambda ^{-19}\)

    $$\begin{aligned} q_{2t}(x,y) \ge c t^{-2/3} \lambda ^{-17} \quad \hbox { for } t\in I(\lambda , r). \end{aligned}$$
    (54)

Notice that these bounds are given for \(q_{2t}(x,y)\) but the factor 2 is inessential (it can be gained below by using slightly different values for K) and we omit it from now on for simplicity.

3.1 Warm-up

To translate these theorems into our multiscale setting, we introduce the notation \(I_j = [ M^{2(j-1)}, M^{2j}] \) and we have the infrared equivalent continuous time representationFootnote 7

$$\begin{aligned} C^j_T (x,y) = \int _0^\infty u^{-\alpha } du \int _{I_j} q_t (x,y) e^{-ut } dt = \Gamma (1- \alpha )\int _{I_j} q_t (x,y)t^{\alpha -1} dt . \end{aligned}$$
(55)

This relates our sliced propagator (31) to the kernel \(q_t\) of [8]. We forget from now on the inessential \(\Gamma (1- \alpha )\) factor. In our particular case \(q=4,\alpha = 1/3\), (55) means that we should simply multiply the estimates on \(q_t\) established in [8] by \(c M^{2j/3}\) to obtain similar estimates for \( C^j_T\). However we have also to perform spatial integrations not considered in [8], which complicate the probabilistic analysis. As a warm up, let us therefore begin with a few very simple examples. Recall that we do not carefully track inessential constant factors in what follows, and that we can use the generic letter c for any such constant when it does not lead to confusion.
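The rule "multiply the estimates on \(q_t\) by \(c M^{2j/3}\)" can be checked numerically from the right-hand side of (55). The sketch below drops the factor \(\Gamma (1-\alpha )\) as in the text and integrates \(q_t\, t^{\alpha -1}\) over \(I_j\) with a midpoint rule in the variable \(s=\log t\); the function name, the number of steps and the model kernel \(t^{-2/3}\) (the typical decay appearing in Theorem 3.2) are illustrative assumptions.

```python
import math

def sliced_from_kernel(kernel, j, M=2.0, alpha=1/3, steps=2000):
    """Approximate int_{I_j} kernel(t) t^(alpha-1) dt with I_j = [M^(2j-2), M^(2j)]."""
    lo = math.log(M ** (2 * (j - 1)))
    hi = math.log(M ** (2 * j))
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = math.exp(lo + (i + 0.5) * h)
        total += kernel(t) * t ** alpha * h   # t^(alpha-1) dt = t^alpha ds
    return total

def typical_kernel(t):
    """Model on-diagonal decay t^(-2/3), i.e. spectral dimension 4/3."""
    return t ** (-2.0 / 3.0)
```

With this model kernel the integral equals \(3\,(M^{2/3}-1)\,M^{-2j/3}\): the factor \(M^{-4j/3}\) coming from \(q_t\) on \(I_j\) is multiplied by a factor of order \(M^{2j/3}\), which reproduces the tadpole scaling \(M^{-2j/3}\).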

Lemma 3.3

(Single Integral Upper Bound). There exists some constant c such that

$$\begin{aligned} {\mathbb {E}}\left[ \sum _y C_T^j (x,y)f(x,y) \right] \le c M^{2j/3} , \end{aligned}$$
(56)

for any \(L^1\) function f with \(0\le f(x,y)\le 1,~ \forall x,y\in T\).

Proof

We introduce two indices \(k \in {\mathbb {N}}\), and \(l \in {\mathbb {N}}\) with the condition \(l\ge l_0 := \sup \{M^{2}, 64\}\) and parameters \(\lambda _{k,l}:= k+ l \). We also define radii

$$\begin{aligned} r_{j,k}:= & {} M^{2j/3} k^{5/3} , \end{aligned}$$
(57)
$$\begin{aligned} r_{j,k,l}:= & {} M^{2j/3}(k+ l)^{5/3}, \end{aligned}$$
(58)

and the balls \(B^T_{j,k}\) and \(B^T_{j,k,l}\) centered on x with radius \(r_{j,k}\) and \(r_{j,k,l}\) (we put an upper index T to remind the reader that these sets depend on our random space, namely the tree T). We also define the annuli

$$\begin{aligned} A^T_{j,k} := \{y : d(x,y) \in [r_{j,k},r_{j,k+1}[ \}, \end{aligned}$$
(59)

so that the full tree is the union of the annuli \(A^T_{j,k}\) for \(k \in {\mathbb {N}}\):

$$\begin{aligned} T = \cup _{k \in {\mathbb {N}}} \; A^T_{j,k}. \end{aligned}$$
(60)

Remark that \(A^T_{j,k} \subset B^T_{j,k+1} \subset B^T_{j,k,l}\) for any \(l \in {\mathbb {N}}^\star \). Remark also that with these definitions

$$\begin{aligned} I_j = [M^{2j-2},M^{2j}] \subset I(\lambda _{k,l},r_{j,k,l})= [r_{j,k,l}^3\lambda _{k,l}^{-6}, r_{j,k,l}^3\lambda _{k,l}^{-5}], \end{aligned}$$
(61)

where \(I(\lambda ,r)\) is as in Theorem 3.2, since our condition \(l \ge l_0 \ge M^{2}\) ensures that \(r_{j,k,l}^3\lambda _{k,l}^{-6} \le M^{2j-2}\). Finally defining \(K_k := M^{2/3} (k+1)\) we have

$$\begin{aligned} d(x,y ) \le K_k t^{1/3},\quad \forall t \in I_j,\;\forall y \in A^T_{j,k}. \end{aligned}$$
(62)
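The inclusion (61) only involves \(r_{j,k,l}^3 = M^{2j}(k+l)^5\), so it can be verified in exact rational arithmetic. The following is a small sketch, assuming an integer M; the function names are hypothetical.

```python
from fractions import Fraction

def interval_I(j, k, l, M):
    """Endpoints of I(lambda_{k,l}, r_{j,k,l}) with lambda = k + l, r^3 = M^(2j) lambda^5."""
    lam = k + l
    r_cubed = Fraction(M) ** (2 * j) * lam ** 5
    return r_cubed / lam ** 6, r_cubed / lam ** 5

def slice_is_covered(j, k, l, M):
    """True when I_j = [M^(2j-2), M^(2j)] is contained in I(lambda_{k,l}, r_{j,k,l})."""
    lower, upper = interval_I(j, k, l, M)
    return lower <= Fraction(M) ** (2 * j - 2) and Fraction(M) ** (2 * j) <= upper
```

The upper endpoints coincide, and the lower endpoint is \(M^{2j}/(k+l)\), which is at most \(M^{2j-2}\) exactly when \(k+l \ge M^2\); this is why \(l_0\) is taken at least \(M^2\).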

Since the propagator is pointwise positive we can commute any sum or integral as desired. Taking (60) into account we can organize the sum over y according to the annuli \(A^T_{j,k}\). We first exchange the expectation \({\mathbb {E}}\) with the sum over k. By the Borel-Cantelli argument of the previous section, there exists (almost surely in T) a smallest finite l such that the ball \(B^T_{j,k,l}\) is \(\lambda _{k,l}\)-good. Defining the random variable \(L=\min \{l \ge l_0 : B^T_{j,k,l} \text { is } \lambda _{k,l}\text {-good}\}\), we can partition the expectation according to the different events \(L=l\). We now fix this l so as to evaluate, according to (55)

$$\begin{aligned} {\mathbb {E}}\left[ \sum _y C_T^j (x,y) f(x,y) \right] = \sum _{k=0}^\infty \sum _{l=l_0}^\infty {\mathbb {P}}[L=l] {\mathbb {E}}\vert _{L=l} \Bigl [\sum _{y \in A^T_{jk}} \int _{I_j} dt t^{\alpha -1} q_t (x,y) f(x,y) \Bigr ],\nonumber \\ \end{aligned}$$
(63)

where \({\mathbb {E}}\vert _{A}\) means conditional expectation with respect to the event A. We are now in a position to apply Theorem 3.2 since all hypotheses and conditions are fulfilled (including \(\lambda _{k,l} \ge 64\) since \(l_0 \ge 64\)). We have for some inessential constant c, under the condition \(L=l\)

$$\begin{aligned} q_t (x,y) \le c (1 + \sqrt{K_k} ) M^{-4j/3} \lambda _{k,l}^3 , \quad \forall t \in I_j,\;\forall y \in A^T_{j,k} . \end{aligned}$$
(64)

Hence integrating over \(t \in I_j\)

$$\begin{aligned} C_T^j (x,y) \le c (k+l)^{7/2} M^{-2j/3},\quad \forall y \in A^T_{j,k}, \end{aligned}$$
(65)

for some other inessential constant c. We can now sum over \(y \in A^T_{j,k}\), overestimating the volume of the annulus \(A^T_{j,k}\) by the volume of the \(B^T_{j,k,l}\) ball (the number of vertices it contains), to obtain

$$\begin{aligned} \sum _{y \in A^T_{j,k}} C_T^j (x,y) f(x,y) \le c (k+l)^{7/2} M^{-2j/3} vol(B^T_{j,k,l}) , \end{aligned}$$
(66)

since f is bounded by one. The condition \(L=l\) allows us to control the volume \(vol(B^T_{j,k,l}) \) by the \(\lambda _{k,l}\text {-good}\) condition. More precisely (49)–(50) implies

$$\begin{aligned} {\mathbb {E}}\vert _{L=l} \, [ vol(B^T_{j,k,l}) ] \le r_{j,k,l}^2 \lambda _{k,l} . \end{aligned}$$
(67)

Using Lemma 3.1 we conclude that

$$\begin{aligned} {\mathbb {E}}\left[ \sum _y C_T^j (x,y)f(x,y) \right]&\le c\sum _{k=0}^\infty \sum _{l=l_0}^\infty {\mathbb {P}}[L=l] (k+l)^{7/2}M^{-2j/3} r_{j,k,l}^2 \lambda _{k,l}\nonumber \\&\le c M^{2 j/3}\sum _{k=0}^\infty \sum _{l=l_0}^\infty e^{-c'(k+l)} (k+l)^{47/6} \le c M^{2 j/3} . \end{aligned}$$
(68)

\(\square \)
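The power \(47/6\) of \((k+l)\) in (68) and the convergence of the double sum can be checked as follows. This is a minimal sketch: the value \(c'=1\), the default \(l_0 = 64\) and the cutoffs are illustrative assumptions, since the text does not fix the constants.

```python
from fractions import Fraction
import math

# Powers of (k + l) collected in (68)
propagator_power = Fraction(7, 2)             # from the pointwise bound (65)
volume_power = 2 * Fraction(5, 3) + 1         # from r_{j,k,l}^2 * lambda_{k,l} in (67)
total_power = propagator_power + volume_power

def tail_sum(c_prime=1.0, l0=64, cutoff=2000):
    """Partial sum of exp(-c'(k+l)) (k+l)^(47/6) over k >= 0 and l >= l0."""
    total = 0.0
    for s in range(l0, cutoff):               # s = k + l
        multiplicity = s - l0 + 1             # number of pairs (k, l) with k + l = s
        total += multiplicity * math.exp(-c_prime * s) * s ** float(total_power)
    return total
```

Raising the cutoff does not change the partial sum at double precision, which illustrates that the exponential factor from Lemma 3.1 dominates any fixed power of \(k+l\).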

Corollary 3.4

(Tadpole). There exists some constant c such that

$$\begin{aligned} {\mathbb {E}}\left[ C_T^j (x,x) \right] \le c M^{-2j/3} . \end{aligned}$$
(69)

Proof

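Taking \(f(x,y)= \delta _{xy}\) in the proof of Lemma 3.3 reduces the sum over y to the single term \(y=x\), which lies in the annulus \(k=0\); no volume factor then arises, and summing the pointwise bound (65) against the probabilistic factor of Lemma 3.1 gives (69). \(\square \)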

A lower bound of the same type is somewhat easier, as we do not need to exhaust the full spatial integral but can restrict to a subset, in fact a particular \(\lambda \)-good ball.

Lemma 3.5

(Single Integral Lower Bound).

$$\begin{aligned} {\mathbb {E}}\left[ \sum _y C_T^j (x,y) \right] \ge c M^{2j/3} . \end{aligned}$$
(70)

Proof

We follow the same strategy as for the upper bound but we do not need the index k and the annuli \(A_{j,k}\), since most of the volume is typically in the first annulus, namely the \(k=0\) ball \(B_j\). Restricting the sum over y to this ball is typically enough for a lower bound of the (70) type. So we work at \(k=0\), but we again need probabilistic estimates to tackle the case of an untypical volume of the ball \(B_j\). Therefore we define for \(l\ge l_0 := \sup \{M^{2}, 64\}\), the parameter \(\lambda _l = l\) and the two balls \(B^T_{j,l} = B(x,r_{j,l})\) and \({\tilde{B}}^T_{j,l} = B(x,{\tilde{r}}_{j,l})\subset B^T_{j,l}\) of radii respectively \(r_{j,l}:=M^{2j/3}\lambda _l^{5/3}\) and \({\tilde{r}}_{j,l}:= c_2r_{j,l}\lambda _l^{-19}\) (in order for (54) to apply below). We introduce the random variable

$$\begin{aligned} L=\min \{l \ge l_0 : B^T_{j,l} \text { and } {\tilde{B}}^T_{j,l} \text { are both } \lambda _{l}\text {-good}\}. \end{aligned}$$
(71)

Again, our choice of \(r_{j,l}\) ensures that

$$\begin{aligned} I_j = [M^{2j-2},M^{2j}] \subset I(\lambda _{l},r_{j,l})= [r_{j,l}^3\lambda _{l}^{-6}, r_{j,l}^3\lambda _{l}^{-5}], \end{aligned}$$
(72)

and the summands being positive, we will restrict the sum over y to the smaller ball \({\tilde{B}}^T_{j,l} \subset B^T_{j,l}\), in order for (54) to apply. We get

$$\begin{aligned} {\mathbb {E}}\left[ \sum _y C_T^j (x,y) \right]\ge & {} {\mathbb {P}}[L\le l] {\mathbb {E}}\vert _{L \le l} \ \Bigl [\sum _{y \in {\tilde{B}}^T_{j,l} } \int _{I_j} dt t^{\alpha -1} q_t (x,y) \Bigr ], \quad \forall l,\ \end{aligned}$$
(73)
$$\begin{aligned}\ge & {} c M^{-2j/3} l^{-17} {\mathbb {P}}[L\le l] \;{\mathbb {E}}\vert _{L=l} [ vol( {\tilde{B}}^T_{j,l} ) ],\quad \quad \forall l,\ \end{aligned}$$
(74)
$$\begin{aligned}\ge & {} cM^{2j/3}{\mathbb {P}}[L\le l] l^{-161/3}, \quad \quad \quad \quad \quad \quad \quad \quad \ \forall l, \ \end{aligned}$$
(75)
$$\begin{aligned}\ge & {} cM^{2j/3} . \end{aligned}$$
(76)

Indeed for the last inequality we remark that \(\lim _{l \rightarrow \infty }{\mathbb {P}}[L\le l] = 1\) (by Lemma 3.1) hence \(\sup _{l \ge l_0}{\mathbb {P}}[L\le l]l^{-161/3} \) is a strictly positive constant that we absorb in c. \(\square \)

3.2 Bounds for convergent graphs

In this section we prove our first main result, namely the convergence of Feynman amplitudes of the type (38)–(40) as the infrared cutoff \(\rho \) is lifted. Therefore we consider a fixed completely convergent graph G with n inner vertices and N external lines, hence for which \(N(S) \ge 6\; \forall S \subset G\). In this graph we mark a root vertex \(v_0\) with fixed position \(x_0\), lying on the spine, i.e. common to all trees T. By translation invariance of the infinite spine, the resulting amplitude \(A_G (x_0)\) is in fact independent of \(x_0\) and we have

Theorem 3.6

For a completely superficially convergent graph (i.e. with no 2- or 4-point subgraphs) G of order \(V(G) = n\), the limit \(\lim _{\rho \rightarrow \infty } {\mathbb {E}}( A_G)\) of the averaged amplitude exists and obeys the uniform bound

$$\begin{aligned} {\mathbb {E}}( A_G) \le K^n (n!)^\beta \end{aligned}$$
(77)

where \(\beta =\frac{52}{3}\).Footnote 8

Proof

From the linear decomposition \(A_{G} = \sum _\mu A_{G,\mu }\) it follows that \({\mathbb {E}}(A_G) = \sum _\mu {\mathbb {E}} (A_{G,\mu } )\). As mentioned above we use only the decay of the propagators of an optimal Kruskal tree \(\tau (\mu )\) to perform the spatial integrals over the position of the inner vertices. It means that we first apply Cauchy-Schwarz inequalities to the \(n+1 - N/2\) edges \(\ell \not \in \tau (\mu )\). To be exact, the first Cauchy–Schwarz inequality applies to the Markovian random walk with heat-kernel \(q_{2t}(x,y)\), which can be rewritten as an inner product by the Chapman-Kolmogorov property

$$\begin{aligned} q_{2t}(x,y)= & {} \sum _{z\in V(T)}q_t(x,z)q_t(z,y)= \left\langle q_{t,x} ,q_{t,y}\right\rangle _2 \nonumber \\\le & {} \sqrt{\left\langle q_{t,x}^2\right\rangle _2\left\langle q_{t,y}^2\right\rangle _2} =\sqrt{ q_{2t}(x,x)q_{2t}(y,y)}. \end{aligned}$$
(78)

We refer to [9] for more details and will again use this inner product in Sect. 4. A second Cauchy-Schwarz inequality is then used for the scalar product \((f,g)= \int _0^\infty dt t^{\alpha - 1}f(t) g(t)\) with f standing for \(\sqrt{q_{2t}(x,x)}\) and analogously for g.

Labeling all the corresponding half-edges (not in \(\tau (\mu )\)) as fields \(f= 1, \ldots 2n+2 - N\) and their positions and scale as \(x_f\) and \(j_f\) we have

$$\begin{aligned} \prod _{\ell \not \in \tau (\mu )} C_T^{j_\ell } (x_\ell ,y_\ell )\le & {} c^n \prod _{\ell \not \in \tau (\mu )} \sqrt{C_T^{j_\ell }(x_\ell , x_\ell ) C_T^{j_\ell }(y_\ell , y_\ell ) }\nonumber \\= & {} c^n\prod _{f=1}^{2n+2 - N} [C_T^{j_f}(x_f,x_f)]^{1/2}, \end{aligned}$$
(79)

making use of Eq. (55).

Each inner vertex \(v\in \{1, \ldots n-1\}\) to integrate over is linked to the root by a single path in \(\tau (\mu )\). The first line, \(\ell _v\), in this path relates v to a single ancestor a(v) by an edge \(\ell _v \in \tau (\mu )\). This defines a scale \(j_v := j_{\ell _v}(\mu )\) for the sum over the position \(x_v\).

Taking (79) into account, we write therefore

$$\begin{aligned} {\mathbb {E}}[A_{G,\mu }] \le {\mathbb {E}}\Big [ c^n \sum _{\{x_{v} \}} \prod _{v=1}^{n-1} C_{T}^{j_v} (x_v, x_{a(v)}) \prod _{f =1}^{2n+2 - N} [C_T^{j_f}(x_f,x_f)]^{1/2} \Big ] \; . \end{aligned}$$
(80)

We now apply to the \(n-1\) spatial integrals exactly the same analysis as for the single integral of Lemma 3.3. The main new aspect is that the events of the previous section do not provide independent small factors for each spatial integral. For instance if two positions \(x_v\) and \(x_{v'}\) happen to coincide and the smallest-l \(\lambda _{l}\)-good event occurs for a ball centered at \(x_v\), it automatically implies the \(\lambda _{l-1}\)-bad event for the ball centered at \(x_{v}\) and at \(x_{v'}\), because it is the same event. Therefore in this case we do not get twice the same small associated probabilistic factor of Lemma 3.1. This is why we lose a (presumably spurious) factorial \([n!]^\beta \) in (77).

More precisely we introduce for each \(v \in [1, n-1]\) two integers \(k_v\) and \(l_v \ge l_0\), the radii \(r_{j_v, k_v}\), \(r_{j_v, k_v, l_v}\) and the parameters \(\lambda _{k_v,l_v}\) exactly as before. We also introduce all these variables for every field \(f \in [ 1, \ldots 2n+2 - N ]\) not in \(\tau ( \mu )\). We define again the random variable \(L_v\) for \(v \in [1, n-1]\) as the first integer \(\ge l_0\) such that the ball \(B^T_{j_v,k_v,l_v}\) is \(\lambda _{k_v,l_v}\text {-good}\) and \(L_f\) for \(f \in [1, 2n+2 - N]\) as the first integer \(\ge l_0\) such that the ball \(B^T_{j_f,k_f,l_f}\) is \(\lambda _{k_f,l_f}\text {-good}\). The integrand is then bounded according to Theorem 3.2, leading to

$$\begin{aligned} {\mathbb {E}}[A_{G,\mu }]\le & {} c^n \sum _{\{k_v\},\{l_v\}\atop \{k_f\},\{l_f\}}{\mathbb {P}}(L_v= l_v, L_f = l_f) \Big [ \prod _{v=1}^{n-1} M^{2j_v/3} [k_v+l_v]^{47/6} \nonumber \\&\prod _{f =1}^{2n+2 - N} M^{-j_f/3}[k_f+l_f]^{7/4} \Big ]. \end{aligned}$$
(81)

Now as mentioned already the \(3n +1 -N\) events \(L_v=l_v\) or \(L_f=l_f\) are not independent so we use only the single best probabilistic factor for one of them. It means we define \(m= \sup _{v,f}\{k_v+l_v, k_f+l_f \}\) and use that \({\mathbb {P}}[L_v= l_v, L_f = l_f] \le c' e^{-cm}\) to perform all the sums with the single probabilistic factor \(e^{-cm}\) from (52). Since each index is bounded by m, the big sum

$$\begin{aligned} \sum _{\{k_v \le m\},\{l_v \le m\}\atop \{k_f \le m\},\{l_f \le m\}} \prod _{v=1}^{n-1} [k_v+l_v]^{47/6} \prod _{f =1}^{2n+2 - N} [k_f+l_f]^{7/4} \end{aligned}$$
(82)

is bounded by \(c^n m^{\frac{59}{6} (n-1) + \frac{15}{4} (2n+2 - N) }\) hence by \(c^n m^{\frac{52n}{3}}\). Finally since

$$\begin{aligned} \sum _m e^{-cm} m^{\frac{52n}{3}} \le c^n [n!]^\beta , \quad \beta =\frac{52}{3} , \end{aligned}$$
(83)

we obtain the usual power counting estimate up to this additional factorial factor:

$$\begin{aligned} {\mathbb {E}}[A_{G}] \le c^n [n!]^{\beta } \sum _{\mu }\prod _{v=1}^{n-1} M^{2j_v/3}\prod _{f =1}^{2n+2 - N} M^{-j_f/3}. \end{aligned}$$
(84)
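The exponent counting behind (82)–(83) can be reproduced in exact arithmetic. In the sketch below each of the two sums over an index bounded by m costs one extra power of m, which turns \(47/6\) into \(59/6\) and \(7/4\) into \(15/4\). The second function is an assumption of this sketch rather than a step spelled out in the text: it compares \(\sum _m e^{-cm} m^{p}\) with the Gamma integral \(\Gamma (p+1)/c^{p+1}\) at \(p = 52n/3\) and measures how fast \(\Gamma (\beta n+1)/(n!)^\beta \) grows with n. All function names are hypothetical.

```python
from fractions import Fraction
import math

BETA = Fraction(52, 3)

def sum_exponent(n, N):
    """Power of m bounding the big sum (82) for n vertices and N external legs."""
    per_vertex = Fraction(47, 6) + 2    # sums over k_v <= m and l_v <= m
    per_field = Fraction(7, 4) + 2      # sums over k_f <= m and l_f <= m
    return per_vertex * (n - 1) + per_field * (2 * n + 2 - N)

def log_gamma_ratio_per_vertex(n):
    """(1/n) * log( Gamma(beta n + 1) / (n!)^beta ) with beta = 52/3."""
    beta = float(BETA)
    return (math.lgamma(beta * n + 1) - beta * math.lgamma(n + 1)) / n
```

The coefficient of n in `sum_exponent` is \(59/6 + 15/2 = 52/3\), and the logarithmic ratio per vertex stays bounded, which is consistent with a bound of the form \(c^n [n!]^\beta \).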

From here on we can proceed with the standard infrared analysis of a just renormalizable theory, exactly as in the usual \(\phi ^4_4\) analysis of [84,85,86,87]. Organizing the bound according to the inclusion forest of the high subgraphs \(G_{j,k}\) we rewrite

$$\begin{aligned} \prod _{v=1}^{n-1} M^{2j_v/3}\prod _{f =1}^{2n+2 - N} M^{-j_f/3} = \prod _{j,k} M^{\omega (G_{j,k})} \end{aligned}$$
(85)

with \(\omega (S) = \frac{2}{3}E(S) - \frac{4}{3} (V(S)-1) = \frac{4-N(S)}{3}\) and get therefore the bound

$$\begin{aligned} {\mathbb {E}}[A_{G}] \le c^n [n!]^{\beta } \sum _{\mu }\prod _{j,k} M^{[4-N(G_{j,k})]/3}. \end{aligned}$$
(86)

The sum over \(\mu \) is then performed with the usual strategy of [84,85,86,87]. We extract from the factor \(\prod _{j,k} M^{[4-N(G_{j,k})]/3} \) an independent factor decaying exponentially in the scale difference \(\vert j_f -j_{f'}\vert \) (in our case at least \(M^{-\vert j_f -j_{f'}\vert /54}\)) for each vertex v and each pair of fields \((f,f')\) hooked to v.Footnote 9 We can then easily organize and perform the sum over all scales assigned to all fields, hence over \(\mu \), which results only in yet another \(c^n\) factor. This completes the proof of the theorem. \(\square \)

A lower bound

$$\begin{aligned} {\mathbb {E}}\left[ \sum _y [C_T^j (x,y) ]^2 \right] \ge c \end{aligned}$$
(87)

can be proved exactly like Lemma 3.5 and implies that the elementary one loop 4-point function is truly logarithmically divergent when \(\rho \rightarrow \infty \).

Taken all together the results of this section prove that for the \(\phi ^q\) interaction at \(q=4\) the value \(\alpha = \frac{1}{3}\) is the only one for which the theory can be just renormalizable. Extending to any q can also be done following exactly the same lines and proves that \(\alpha = \frac{2}{3} - \frac{4}{3q}\), as in (24), is the only exponent for which the theory is just renormalizable in the infrared regime.
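The value of \(\alpha \) quoted above can be recovered from a one-line power counting. The sketch below is a heuristic reading, not the proof given in this section: it assumes that each vertex of degree q carries \(q/2\) propagators, each scaling as \(M^{(2\alpha -\frac{4}{3})j}\) as in (32), and one spatial sum scaling as \(M^{4j/3}\), and it asks that the net exponent per vertex vanish. The function names are hypothetical.

```python
from fractions import Fraction

def just_renormalizable_alpha(q):
    """alpha = 2/3 - 4/(3q), the exponent quoted in the text."""
    return Fraction(2, 3) - Fraction(4, 3 * q)

def exponent_per_vertex(alpha, q):
    """Net power of M^j per vertex: q/2 propagators plus one vertex sum."""
    return Fraction(q, 2) * (2 * alpha - Fraction(4, 3)) + Fraction(4, 3)
```

For \(q=4\) this gives \(\alpha = 1/3\); a larger \(\alpha \) makes the exponent per vertex positive and a smaller one makes it negative, so the balance occurs at a single value.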

4 Localization of High Subgraphs

When the graph contains \(N=2\) or \(N=4\) subgraphs, we need to renormalize. According to the Wilsonian strategy, renormalization has to be performed only on high divergent subgraphs, and perturbation theory is then organized into a multi-series in effective constants, one for each scale, all related through a flow equation. This is standard and remains true either for an ultraviolet or for an infrared analysis [86].

Two key facts power the renormalization machinery; their combination allows one to compare efficiently the contribution of a high divergent subgraph with its Taylor expansion around local operators [86, 87]:

  • the quasi-locality (relative to the internal scale \(i_S (\mu )\)) between external vertices of any high subgraph \(S=G_{j,k}\) provided by the Kruskal tree (because it remains a spanning tree when restricted to any high subgraph);

  • the small change in an external propagator of scale \(e_S (\mu ) =j_M\) when one of its arguments is moved by a distance typical of the much smaller internal ultraviolet scale \(i_S (\mu )=j_m\ll j_M\).

Taken together these two facts explain why the contribution of a high subgraph is quasi-local from the point of view of its external scales, hence explain why renormalization by local counterterms works.

However usual tools of ordinary quantum field theory such as translation invariance and momentum space analysis are no longer available on random trees, and we have to find the probabilistic equivalent of the two above facts in our random-tree setting:

  • in our case, the proper time of the path of a propagator at scale j is \(t_j \simeq M^{2j}\) and the ordinary associated distance scale is \(r_j \simeq t_j^{1/3} \simeq M^{2j/3}\). We expect the associated scaled decay between external vertices of any high subgraph \(G_{j,k}\) provided by the Kruskal tree to be true only for typical trees. However we prove below that the techniques used in Lemma 3.3 to sum over y validate this picture;

  • in our case, the small change in an external propagator of scale \(j_M\) should occur when one of its arguments is moved by a distance of order \(r_{j_m} \simeq M^{2j_m/3}\). We shall prove that in this case we gain a small factor \(M^{-(j_M-j_m)/3}\) compared to the ordinary estimate in \(M^{-2j_M/3}\) of (69) for \(C_T^{j_M}\). This requires comparing propagators with different arguments hence some additional work.

Hence, the following analysis justifies the heuristic power counting argument given in Sect. 2.3 and shows that the subtraction of local counterterms indeed allows one to control the diverging amplitudes in this context of random trees (with some additional subtleties in the 2-point function case).

4.1 Warm up

We first explain on a simplified example how to implement these ideas, then give a general result. Our first elementary example studies the effect of a small move of one of the arguments of a sliced propagator \(C^{j}_T (x, y)\). We need to check that, after averaging over T, it leads to a relatively smaller and smaller effect on the sliced propagator as \(j \rightarrow \infty \).

Consider three sites x, y and z on the tree and the difference

$$\begin{aligned} \Delta ^j_T (x;y,z) := \vert C^{j}_T (x, y) - C^{j}_T (x, z)\vert . \end{aligned}$$
(88)

We want to show that when \(d(y,z) \ll r_j = M^{2j/3}\), we gain in the average \({\mathbb {E}}[\Delta ^j_T (x;y,z) ]\) a small factor compared to the ordinary estimate in \(M^{-2j/3}\) for a single propagator without any difference.

This is expressed by the following Lemma.

Lemma 4.1

There exists some constant c such that for any T and any \(t \in I_j\)

$$\begin{aligned} \vert q_t (x,y) - q_t (x,z) \vert \le c M^{-j}\sqrt{d(y,z) q_t (x,x)}. \end{aligned}$$
(89)

Moreover

$$\begin{aligned} {\mathbb {E}}[\Delta ^j_T (x;y,z) ]\le c M^{-2j/3} M^{-j/3 } \sqrt{d(y,z)} . \end{aligned}$$
(90)

This bound is uniform in \(x \in {\mathcal {S}}\) and the factor \( M^{-j/3 } \sqrt{d(y,z)}\) is the gain, provided \(d(y,z) \ll r_j = M^{2j/3}\).
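As a quick consistency check (a worked substitution added here, using only the scales defined above), take \(j = j_M\) and a displacement of the order of the smaller scale, \(d(y,z) \simeq r_{j_m} = M^{2j_m/3}\) with \(j_m \ll j_M\). The gain factor then reads

$$\begin{aligned} M^{-j_M/3 } \sqrt{d(y,z)} \simeq M^{-j_M/3 } M^{j_m/3} = M^{-(j_M-j_m)/3}, \end{aligned}$$

which is the small factor announced at the beginning of this section.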

Proof

We use again results of [8]. With their notations, it is proved in their Lemma 3.1 that

$$\begin{aligned} \vert f(y) - f(z) \vert ^2 \le R_{eff} (y,z) {\mathcal {E}}(f,f) \end{aligned}$$
(91)

where the effective graph resistance \(R_{eff} (y,z)\) in the case of a tree T is nothing but the natural distance \(d(y,z)\) on the tree, and, denoting as earlier by \(\langle f,g\rangle _2 \) the \(L_2(T)\) scalar product \(\sum _{y\in T} f(y)g(y) \),

$$\begin{aligned} {\mathcal {E}}(f,f) := \langle f, {\mathcal {L}}f \rangle _2 \end{aligned}$$
(92)

is the natural positive quadratic form associated to the Laplacian. Applying this estimate to the function \(f_{t,x}\) defined by \(f_{t,x}(y) = q_t(x, y)\) exactly as in the proof of Lemma 4.3 of [8] leads to

$$\begin{aligned} \vert f_{t,x}(y) - f_{t,x}(z) \vert ^2 \le d(y,z)\frac{q_t (x,x)}{t} \end{aligned}$$
(93)

hence to

$$\begin{aligned} \vert f_{t,x}(y) - f_{t,x}(z) \vert \le c M^{-j}\sqrt{d(y,z) q_t (x,x)} \end{aligned}$$
(94)

for any \(t \in I_j\). From there on (90) follows easily by an analysis similar to that of Corollary 3.4. \(\square \)
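The two deterministic inequalities (91) and (93) can be checked numerically on a small finite tree. The following is only a sketch under stated assumptions: it takes \({\mathcal {L}}\) to be the combinatorial Laplacian (degree matrix minus adjacency matrix) and \(q_t = e^{-t{\mathcal {L}}}\), which may differ from the normalization of [8], and it uses a random recursive tree as an illustrative stand-in for T. All function names are hypothetical.

```python
import math
import random
from collections import deque


def random_tree(n, seed=0):
    """Random recursive tree on vertices 0..n-1 (an illustrative stand-in for T)."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for v in range(1, n):
        u = rng.randrange(v)  # attach v to a uniformly chosen earlier vertex
        adj[u].append(v)
        adj[v].append(u)
    return adj


def distances_from(adj, src):
    """Tree distances d(src, .) by breadth-first search."""
    dist = [-1] * len(adj)
    dist[src] = 0
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def laplacian(adj, f):
    """(L f)(u) = deg(u) f(u) - sum_{v ~ u} f(v): the assumed combinatorial Laplacian."""
    return [len(adj[u]) * f[u] - sum(f[v] for v in adj[u]) for u in range(len(adj))]


def heat_kernel_from(adj, x, t):
    """y -> q_t(x, y) = [exp(-t L)](x, y), as a Poisson mixture of powers of P = 1 - L/lam."""
    lam = float(max(len(nb) for nb in adj))  # lam >= max degree keeps P entrywise nonnegative
    vec = [0.0] * len(adj)
    vec[x] = 1.0
    weight = math.exp(-t * lam)  # Poisson weight for k = 0
    out = [weight * c for c in vec]
    k = 0
    while k < t * lam or weight > 1e-17:
        vec = [a - b / lam for a, b in zip(vec, laplacian(adj, vec))]  # vec <- P vec
        k += 1
        weight *= t * lam / k
        out = [o + weight * c for o, c in zip(out, vec)]
    return out


def check_lemma_bounds(n=40, x=0, t=2.0, seed=1):
    """Largest ratios lhs/rhs in (91) and (93) over all pairs y != z; both should be <= 1."""
    adj = random_tree(n, seed)
    f = heat_kernel_from(adj, x, t)
    energy = sum(a * b for a, b in zip(f, laplacian(adj, f)))  # E(f, f) = <f, L f>
    worst_91 = worst_93 = 0.0
    for y in range(n):
        dist = distances_from(adj, y)
        for z in range(n):
            if z != y:
                diff2 = (f[y] - f[z]) ** 2
                worst_91 = max(worst_91, diff2 / (dist[z] * energy))
                worst_93 = max(worst_93, diff2 / (dist[z] * f[x] / t))
    return worst_91, worst_93
```

Calling `check_lemma_bounds()` returns the two worst-case ratios. Under the assumptions above both stay below 1 for any tree and any \(t>0\), since (91) is a Cauchy–Schwarz inequality along the unique path from y to z and (93) only adds the spectral bound \({\mathcal {E}}(f_{t,x},f_{t,x}) \le q_t(x,x)/t\).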

The next Lemma describes a simplified renormalization situation: a single propagator \(C^{j_M}_T (x, y)\) mimics a single external propagator at an “infrared” scale \(j_M\) and another propagator \(C^{j_m}_T (y, z)\) mimics a high subgraph at an “ultraviolet” scale \(j_m \ll j_M\). The important point is to gain a factor \(M^{-(j_M - j_m)/3}\) when comparing the “bare” amplitude

$$\begin{aligned} A^{b}_T (x,z) := \sum _{y\in T} C^{j_M}_T (x, y) C^{j_m}_T (y, z) \end{aligned}$$
(95)

to the “localized” amplitude at z

$$\begin{aligned} A^l_T (x,z):= C^{j_M}_T (x, z) \sum _{y\in T} C^{j_m}_T (y, z) \end{aligned}$$
(96)

in which the argument y has been moved to z in the external propagator \(C^{j_M}_T\). Introducing the averaged “renormalized” amplitude

$$\begin{aligned} {\bar{A}}^{ren}_T(x,z):= {\mathbb {E}}[A^{b}_T (x,z) - A^l_T (x,z)], \end{aligned}$$
(97)

we have

Lemma 4.2

$$\begin{aligned} \vert {\bar{A}}^{ren}_T(x,z)\vert \le c M^{- (j_M - j_m)} . \end{aligned}$$
(98)

This Lemma shows a net gain \(M^{- (j_M - j_m)/3}\) compared with the ordinary estimate \(M^{- 2(j_M - j_m)/3}\) which we would get for \(A^{b}_T \) or \(A^{l}_T \) separately.

Proof

We replace the difference \( C^{j_M}_T (x, y) - C^{j_M}_T (x, z)\) by the bound of Lemma 4.1. Taking out of \({\mathbb {E}}\) the trivial scaling factors

$$\begin{aligned} \vert {\bar{A}}^{ren}_T(x,z)\vert \le c M^{-j_M/3+2j_m/3}{\mathbb {E}}\Big [ \sum _{y \in T} \sqrt{d(y,z)} \sup _{t \in I_{j_M} \atop t' \in I_{j_m}}[\sqrt{q_t (x,x)} q_{t'}(y,z) ] \Big ]. \end{aligned}$$
(99)

We apply the same strategy as in the previous sections: we introduce the radii \(r_{j_m,k_m}\) and \(r_{j_m,k_m,l_m}\) and the corresponding balls and annuli as in the proof of Lemma 3.3 to perform the sum over y using the \(q_{t'}(y,z)\) factor. We also introduce the radii \(r_{j_M,k_M,l_M}\) to tackle the \(\sqrt{q_t (x,x)}\) factor, which up to trivial scaling is exactly similar to a field factor \([ C_T^{j_f}(x_f,x_f)]^{1/2}\) in (79), hence leads to an \(M^{-2j_M/3}\) factor. The \(\sum _{y\in T} \) then costs an \(M^{4j_m/3}\) factor, the \(\sqrt{d(y,z)}\) factor costs an \(M^{j_m/3}\) factor and the \(q_{t'}(y,z)\) factor brings an \(M^{-4j_m/3}\). Gathering these factors leads to the result. \(\square \)
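The final gathering of factors is a simple sum of exponents of M, which can be tabulated mechanically. The snippet below is only a bookkeeping aid (constants and expectations are dropped, and the dictionary keys and function names are hypothetical labels); it uses exact fractions to confirm that the five factors quoted in the proof combine into \(M^{-(j_M-j_m)}\).

```python
from fractions import Fraction


def lemma_4_2_exponents(j_M, j_m):
    """Exponents of M collected in the proof of Lemma 4.2 (bookkeeping only)."""
    return {
        "scaling prefactor in (99)": Fraction(-j_M + 2 * j_m, 3),
        "sqrt(q_t(x,x)) at scale j_M": Fraction(-2 * j_M, 3),
        "sum over y at scale j_m": Fraction(4 * j_m, 3),
        "sqrt(d(y,z)) at scale j_m": Fraction(j_m, 3),
        "q_t'(y,z) at scale j_m": Fraction(-4 * j_m, 3),
    }


def total_exponent(j_M, j_m):
    """Sum of all exponents; the proof claims this equals -(j_M - j_m)."""
    return sum(lemma_4_2_exponents(j_M, j_m).values())
```

For instance `total_exponent(7, 2)` evaluates the sum for \(j_M = 7\), \(j_m = 2\), which can be compared with \(-(j_M - j_m)\).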

4.2 Renormalization of 4-point subgraphs

The 4-point subgraphs \(N(S) =4\) in this theory have \(\omega (S) = \frac{4-N(S)}{3} = 0\), hence are logarithmically divergent. Consider now a graph G which has no 2-point subgraphs, hence with \(N(S) \ge 4\) for any subgraph S. Recall the previous evaluation

$$\begin{aligned} |A_{G,\mu }|\le & {} K^{V(G)} M^{-N \rho /3} \prod _{\ell \in I(G)} M^{-2j_\ell /3} \prod _{v \in V(G)} M^{4j_v/3} \end{aligned}$$
(100)
$$\begin{aligned}= & {} K^{V(G)} \prod _{j =1}^\rho \prod _{k=1}^{k(G_j)} M^{\omega (G_{j,k})} \end{aligned}$$
(101)

of its bare amplitude. When there are 4-point subgraphs this amplitude, which is finite at finite \(\rho \), diverges when \(\rho \rightarrow \infty \) since there is no decay factor between the internal scale \(i_S (\mu ) \) and the external scale...

In the effective series point of view we fix a scale attribution \(\mu \) and renormalization is only performed for the high subgraphs \(G_{j,k}\) with \(N(G_{j,k})=4\). They form a single forest \({\mathcal {F}}_\mu \) for the inclusion relation. Therefore in this setting the famous “overlapping divergences” problem is completely solved from the beginning. Such divergences are simply an artefact of the BPHZ theorem and completely disappear in the effective series organized according to the Wilsonian point of view [86].

In other words, for every 4-point subgraph S we choose a root vertex \(v_S\), with position denoted \(x^S_1\), to which at least one external propagator \(C(z_1, x^S_1)\) of S hooks, and we introduce the localization operator \(\tau _S\), which acts on the other three external propagators C attached to S through the formula

$$\begin{aligned} \tau _S C(z_2, x^S_2)C(z_3, x^S_3)C(z_4, x^S_4) := C(z_2, x^S_1)C(z_3, x^S_1) C(z_4, x^S_1). \end{aligned}$$
(102)

The effectively renormalized amplitude with global infrared cutoff \(\rho \) is then defined as

$$\begin{aligned} A^{eff}_{G,\rho }(x_0):= & {} M^{\rho (4-N) /3} \sum _{\mu } A^{eff}_{G,\rho , \mu }(x_0)\; \; , \end{aligned}$$
(103)
$$\begin{aligned} A^{eff}_{G,\rho , \mu }(x_0):= & {} \prod _{S \in {\mathcal {F}}_\mu } (1 - \tau _S) \prod _{v=1}^{n-1} \sum _{x_v \in V(T)} \prod _{\ell \in I(G) } C_{T}^{j_\ell } (x_\ell , y_\ell ) . \end{aligned}$$
(104)

The result on a given tree still depends on the choice of the root vertex (because there is no longer translation invariance on a fixed given tree). Nevertheless translation invariance is recovered along the spine for the averaged amplitudes and our second main result is:

Theorem 4.3

For a graph G of order \(V(G) = n\) with \(N (G) \ge 4\) and no 2-point subgraph, the averaged effective-renormalized amplitude \({\mathbb {E}}[ A_G^{eff}] =\lim _{\rho \rightarrow \infty } {\mathbb {E}}[ A_{G,\rho }^{eff}]\) is well defined, i.e. the limit \(\rho \rightarrow \infty \) converges, and it obeys the same uniform bound as in the completely convergent case, namely

$$\begin{aligned} {\mathbb {E}}( A_G^{eff}) \le K^n (n!)^\beta . \end{aligned}$$
(105)

Proof

Since the renormalization operators \(1-\tau _S \) are introduced only for the high subgraphs, they always bring, by estimates (89)–(90), a factor \(M^{- (e_S(\mu ) - i_S (\mu ))/3}\).

Exactly like in the previous section, we obtain therefore a bound

$$\begin{aligned} |A^{eff}_{G,\mu }|\le & {} K^{V(G)} M^{-N \rho /3} \prod _{\ell \in I(G)} M^{-2j_\ell /3} \prod _{v \in V(G)} M^{4j_v/3} \prod _{S \in {\mathcal {F}}_\mu } M^{-(e_S (\mu ) - i_S (\mu ))/3} \end{aligned}$$
(106)
$$\begin{aligned}= & {} K^{V(G)} \prod _{j =1}^\rho \prod _{k=1}^{k(G_j)} M^{\omega ^{ren}(G_{j,k})} \end{aligned}$$
(107)

with \(\omega ^{ren}(G_{j,k}) = \omega (G_{j,k}) =\frac{4-N(G_{j,k})}{3} \) if \(N(G_{j,k}) > 4\) and \(\omega ^{ren}(G_{j,k}) =-1/3 \) if \(N(G_{j,k}) = 4\). Therefore \( A^{eff}_{G} = \sum _\mu A^{eff}_{G,\mu } \) can be bounded exactly like \(A_G\), using the same single \(\lambda \)-good condition as for the proof of Theorem 3.6. It therefore obeys the same estimate. \(\square \)
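The role of the renormalized degrees can be illustrated with a few lines of exact arithmetic. This sketch only encodes the two formulas for \(\omega \) and \(\omega ^{ren}\) stated above and a toy geometric sum over a scale gap. It does not reproduce the combinatorics of the actual proof, and the function names and the default value \(M=2\) are illustrative assumptions.

```python
from fractions import Fraction


def omega(n_ext):
    """Divergence degree omega(S) = (4 - N(S))/3 for a subgraph with N(S) external legs."""
    return Fraction(4 - n_ext, 3)


def omega_ren(n_ext):
    """Degree after the (1 - tau_S) subtraction: -1/3 if N = 4, unchanged if N > 4."""
    if n_ext < 4:
        raise ValueError("2-point subgraphs are excluded in this subsection")
    return Fraction(-1, 3) if n_ext == 4 else omega(n_ext)


def scale_gap_sum(n_ext, M=2.0, max_gap=200):
    """Toy partial sum over scale gaps k = 0..max_gap of M^(k * omega_ren(N))."""
    w = float(omega_ren(n_ext))
    return sum(M ** (k * w) for k in range(max_gap + 1))
```

Because every high subgraph now carries a strictly negative exponent, such sums are bounded by the geometric series \(1/(1-M^{-1/3})\) independently of the cutoff, which is the mechanism behind the uniformity in \(\rho \).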

The perturbative theory can be organized in terms of these effective amplitudes provided the bare coupling constant at a vertex v with highest scale \(j^h(v)\) is replaced by an effective constant \(\lambda _{j^h(v)}\).

Remember that in the usual BPHZ renormalized amplitude we must introduce Zimmermann’s forest sum, that is, introduce \(\tau _S \) counterterms also for subgraphs that are not high. Such counterterms cannot be combined efficiently with anything, so they have to be bounded independently, using the cutoff provided by the condition that they are not high. This unavoidably leads to additional factorials which this time are not spurious, as they correspond to the so-called renormalons. These renormalons disappear in the effective series [86], and the problem is exchanged for another question, namely whether the flow of the effective constants remains bounded or not.

4.3 Multiple subtractions

Finally, in the general perturbative series there also occur 2-point subgraphs. For them we need to perform multiple subtractions. In the \(\phi ^q\) theory with \(q=4\) the 2-point function has divergence degree \(\omega = 2/3\), so it is not cured by a single difference as above. We need a kind of systematic analog of an operator product expansion around local or quasi-local operators. In our model the Laplacian is the main actor, replacing the ordinary gradients of fixed-space models. It is also the operator that can be transported easily from one point to another, gaining small factors each time. Therefore, in case our problem requires renormalization beyond strictly local terms (such as wave function renormalization), we now describe a possibly general method to apply.

For any function f we can write the expansion

$$\begin{aligned} f(u) = \overline{ f} (u) + \mathscr {L} f (u) \end{aligned}$$
(108)

where \({\overline{f}}\) is the local average \(\frac{1}{d_u}\sum _{v \sim u} f (v)= \frac{1}{D}Af\) over the neighbors of u, and \(\mathscr {L} := \frac{1}{D}{\mathcal {L}}= {\mathbb {1}}- \frac{1}{D}A\) is the normalized operator that appears in the discretized heat equation on T. Remark indeed that from (2) we deduce

$$\begin{aligned} {[}C_{n+1} -C_{n}] (x,y) = \left[ \left( \frac{1}{D}A-{\mathbb {1}}\right) C_{n} \right] (x,y) = - [\mathscr {L} C_{n}](x,y) \end{aligned}$$
(109)

where \(C_n (x,y)\) is the sum over discrete random walks from x to y in exactly n steps.

Iterating we can define for any fixed \(p \in {\mathbb {N}}\) (where we simply put d for \(d_u\) when there is no ambiguity) an expansion:

$$\begin{aligned} f = {\bar{f}} + \overline{\mathscr {L} f} + \overline{\mathscr {L}^2 f} + \cdots + \overline{\mathscr {L}^p f} + \mathscr {L}^{p+1} f. \end{aligned}$$
(110)
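The expansion (110) is a purely algebraic telescoping identity, so it can be verified exactly on any finite tree. The following minimal sketch does this with rational arithmetic, where the small tree `EDGES`, the test function `F_VALUES` and all function names are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def local_average(adj, g):
    """bar g(u) = (1/d_u) * sum over neighbors v of u of g(v)."""
    return [sum(g[v] for v in adj[u]) / len(adj[u]) for u in range(len(adj))]


def normalized_laplacian(adj, g):
    """(script-L g)(u) = g(u) - bar g(u)."""
    return [a - b for a, b in zip(g, local_average(adj, g))]


def expansion(adj, f, p):
    """Return ([bar f, bar(L f), ..., bar(L^p f)], L^{p+1} f) as in (110)."""
    averages, g = [], list(f)
    for _ in range(p + 1):
        averages.append(local_average(adj, g))
        g = normalized_laplacian(adj, g)
    return averages, g


def reconstruct(adj, f, p):
    """Sum all terms on the right-hand side of (110), vertex by vertex."""
    averages, remainder = expansion(adj, f, p)
    return [sum(col) + r for col, r in zip(zip(*averages), remainder)]


# A small hand-made tree on 7 vertices (hypothetical example), given by its edges.
EDGES = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5), (5, 6)]
ADJ = [[] for _ in range(7)]
for a, b in EDGES:
    ADJ[a].append(b)
    ADJ[b].append(a)

# An arbitrary test function on the vertices, stored as exact rationals.
F_VALUES = [Fraction(k * k + 1) for k in range(7)]
```

With these definitions `reconstruct(ADJ, F_VALUES, p)` returns `F_VALUES` itself for every order p, since each step only uses \(g = {\bar{g}} + \mathscr {L} g\).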

From now on we forget the discretized notations and return to the infrared continuous time notation in which the heat equation reads

$$\begin{aligned} \frac{d}{dt} q_{t} = - {\mathcal {L}}q_t . \end{aligned}$$
(111)

Lemma 4.4

Consider the function \(\psi _x(t)= \langle q_{t,x}, q_{t,x}\rangle _2 = q_{2t}(x,x)\). The functions \(\phi _r = (-1)^r\psi ^{(r)}\), built from its rth time derivatives, are all positive and monotonically decreasing.

Proof

The heat equation (111) means by induction that

$$\begin{aligned} \phi _r =2^r \langle q_{t,x}, {\mathcal {L}}^r q_{t,x}\rangle _2 \ge 0. \end{aligned}$$
(112)

\(\square \)

Corollary 4.5

$$\begin{aligned} \langle q_{t,x}, {\mathcal {L}}^r q_{t,x}\rangle _2 \le c_r q_{c'_rt}(x,x) t^{-r}. \end{aligned}$$
(113)

Proof

For any r since \(\phi _r\) is positive monotone decreasing, we have

$$\begin{aligned} \phi _r (t) \le \frac{2}{t}\int _{\frac{t}{2}}^t \phi _r (s) ds = \frac{2}{t} [\phi _{r-1} \left( \frac{t}{2}\right) - \phi _{r-1} (t) ]\le \frac{2}{t} \phi _{r-1} \left( \frac{t}{2}\right) \end{aligned}$$
(114)

so that (113) follows by induction with \(c_r = 2^{r (r+1)/2}\) and \(c'_r = 2^{1-r}\). \(\square \)
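For the reader's convenience, the induction can be unrolled explicitly (a sketch filling in the intermediate steps, using only (112) and (114)). Applying (114) r times, with the time argument halved at each step, gives

$$\begin{aligned} \phi _r (t) \le \frac{2}{t}\, \phi _{r-1} \left( \frac{t}{2}\right) \le \frac{2}{t}\cdot \frac{2^2}{t}\, \phi _{r-2} \left( \frac{t}{4}\right) \le \cdots \le \Big ( \prod _{k=1}^{r} \frac{2^k}{t} \Big ) \phi _0 \left( \frac{t}{2^r}\right) = 2^{r(r+1)/2}\, t^{-r}\, q_{2^{1-r}t}(x,x), \end{aligned}$$

where the last equality uses \(\phi _0 (s) = \psi _x (s) = q_{2s}(x,x)\). Since (112) gives \(\langle q_{t,x}, {\mathcal {L}}^r q_{t,x}\rangle _2 = 2^{-r}\phi _r (t) \le \phi _r (t)\), the bound (113) holds with the stated constants (the extra factor \(2^{-r}\) is simply not exploited).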

Local transport up to pth order of the function f from point z to y is then defined as

$$\begin{aligned} f(z)= & {} \Big [ {\bar{f}} + \overline{{\mathcal {L}}f} +\overline{{\mathcal {L}}^2 f} + \cdots + \overline{{\mathcal {L}}^p f}\Big ](y) \end{aligned}$$
(115)
$$\begin{aligned}&+ \Delta _{yz}\Big [{\bar{f}} + \overline{{\mathcal {L}}f} + \overline{{\mathcal {L}}^2 f} + \cdots + \overline{{\mathcal {L}}^p f} \Big ] + {\mathcal {L}}^{p+1}f(z) \end{aligned}$$
(116)

where \(\Delta _{yz} g := g(z) - g(y)\). Each difference term is then evaluated in the case \(f=q_{t,x}\) as

$$\begin{aligned} \vert \Delta _{yz}\overline{{\mathcal {L}}^r q_{t,x}}\vert\le & {} \sum _{u\sim y \atop v \sim z} \vert {\mathcal {L}}^r q_{t,x}(u) -{\mathcal {L}}^r q_{t,x} (v) \vert \end{aligned}$$
(117)
$$\begin{aligned}\le & {} c_r \sqrt{d(y,z) {\mathcal {E}}({\mathcal {L}}^r q_{t,x}, {\mathcal {L}}^r q_{t,x}) } \end{aligned}$$
(118)
$$\begin{aligned}\le & {} c_r\sqrt{d(y,z) q_{c'_rt}(x,x)} t^{-r-1/2} \end{aligned}$$
(119)

and the last term \( {\mathcal {L}}^{p+1}f(z)\) is a finite sum of differences of the type \({\mathcal {L}}^{p}_{\cdot }q_{t,x}(z)- {\mathcal {L}}^{p}_{\cdot }q_{t,x}(u)\) for u close to z. It does not need to be transported, since again

$$\begin{aligned} \vert {\mathcal {L}}^{p}q_{t,x}(z)- {\mathcal {L}}^{p}q_{t,x}(u) \vert \le c_p \sqrt{d(z,u)q_{c'_pt}(x,x)} t^{-p-1/2} . \end{aligned}$$
(120)

The constants in these equations may grow very fast with p, but renormalization shall require such bounds only up to a very small order p, typically two.

Applying now the usual probabilistic estimates in the manner of the previous section means that the \(\sqrt{q_{c't}(x,x)}\) averages to a \(c M^{-2j/3}\) factor uniformly for \(t_j \in I_{j}\). Therefore we have the following analogs of Lemma 4.1:

Corollary 4.6

There exists some constant \(c_r\) such that uniformly for \(t_j \in I_{j}\)

$$\begin{aligned} {\mathbb {E}}[\vert \Delta _{yz}\overline{{\mathcal {L}}^r q_{t,x}} \vert ]\le & {} c_r M^{-2j/3} M^{-(2r+1)j } \sqrt{d(y,z)}, \end{aligned}$$
(121)
$$\begin{aligned} {\mathbb {E}}[\vert \Delta _{yz}\overline{{\mathcal {L}}^r C^j_T (x,z)}\vert ]\le & {} c_r M^{-(2r+1)j } \sqrt{d(y,z)}, \end{aligned}$$
(122)
$$\begin{aligned} {\mathbb {E}}[\vert {\mathcal {L}}^{p+1} C^j_T (x,z) \vert ]\le & {} c_p M^{-(2p+1)j } . \end{aligned}$$
(123)

These bounds coincide with those of Lemma 4.1 for \(r=0\) but improve rapidly with r. They should be useful for further renormalization, such as the one of the more divergent 2-point function. In the \(\phi ^4\) model above, since our propagator is a fractional power of the Laplacian, the corresponding “wave function renormalization” is not the standard one of the Laplacian. Moreover, physics is not directly associated to perturbative renormalization but rather to renormalization group flows, which require the computation of beta functions that are model dependent. For all these reasons we shall not push further the study of the scalar \(\phi ^4\) model here. In the next section we include some comments on SYK-type tensor models on random trees, since they were our main motivation for this study.

5 Comments on SYK and Random Trees

Essential features in SYK models are their definition at finite temperature and their holographic and maximal quantum chaotic properties [65,66,67,68].

When the time coordinate takes values on the real line, it is well understood that compactifying this line on a circle of perimeter \(\beta \) allows one to study a field theory at finite temperature \(1/\beta \). Because of the distinctive spine that emerges in our ensemble of random trees, we believe that quantum field theory on random trees at finite temperature should in fact be formulated on a circle dressed by random trees (called random unicycles below). Indeed a compactified spine corresponds to a single cycle.

Unicycles are very mild modifications of trees. Instead of having no cycle they have a single cycle \({\mathcal {C}}(\Gamma ) \) of length \(\ell \). They can therefore be embedded on the sphere as planar graphs with two faces (recall that trees have a single “external” face). Like the spine of random trees, the cycle should be decorated at each of its vertices by independent critical Galton–Watson branches, so that the total number of vertices is n with typically \(n\gg \ell \) (see Fig. 3). The continuum limit of such random unicycles when \(\ell \rightarrow \infty \) should then be defined, like Aldous’ continuum random tree [5, 6], through a Gromov–Hausdorff limit.

As usual, Bosonic fields on such random unicycles should then obey periodic boundary conditions and Fermionic fields antiperiodic ones along the cycle. This study is left to a forthcoming paper.

Fig. 3. This unicycle of length \(\ell =8\) and order \(n=42\) is binary: every vertex has degree either 3 or 1.
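To make the construction concrete, here is a minimal sampler for binary unicycles of the kind shown in Fig. 3. It is a sketch under explicit assumptions: each cycle vertex carries exactly one branch, branches are critical binary Galton–Watson trees (0 or 2 children with probability 1/2 each), the total size is not conditioned to a given n, and a cap `max_vertices` truncates very large branches purely as a safety measure. All names are hypothetical.

```python
import random


def grow_binary_branch(edges, root, next_label, rng, max_vertices):
    """Grow a critical binary Galton-Watson tree below `root`.

    Each vertex gets 0 or 2 children with probability 1/2 each. Once the label
    budget `max_vertices` is exhausted, remaining vertices are left as leaves
    (a safety truncation, not part of the model).
    """
    stack = [root]
    while stack:
        u = stack.pop()
        if rng.random() < 0.5 and next_label + 2 <= max_vertices:
            for _ in range(2):
                edges.append((u, next_label))
                stack.append(next_label)
                next_label += 1
    return next_label


def random_binary_unicycle(cycle_length, seed=0, max_vertices=2000):
    """Return (n, edges): a cycle of length >= 3 with one binary branch per cycle vertex."""
    if cycle_length < 3:
        raise ValueError("a simple cycle needs at least 3 vertices")
    rng = random.Random(seed)
    # Cycle vertices are labelled 0..l-1, branch roots l..2l-1, the rest as they appear.
    edges = [(i, (i + 1) % cycle_length) for i in range(cycle_length)]
    next_label = 2 * cycle_length
    for i in range(cycle_length):
        root = cycle_length + i
        edges.append((i, root))
        next_label = grow_binary_branch(edges, root, next_label, rng, max_vertices)
    return next_label, edges
```

By construction the graph is connected and has as many edges as vertices, so it contains exactly one cycle, and every vertex has degree 3 or 1. Sampling at a fixed order n, as in the figure, would require an additional conditioning step that is not attempted here.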

In practice, we are interested in generalizing one-dimensional tensor models à la SYK to models defined on random trees.

A first step will be to focus on a particular bosonic tensor model, inspired by [80]. The authors of [80] considered a d-dimensional Bosonic tensor model with field \(\psi _{abc}\) that also involved, from the start, a rescaled Laplacian [95,96,97] \(\Delta ^{\zeta }\) with \(\zeta = d/4\), corresponding to the scaling dimension at the IR fixed point. Note that this \(\zeta \) is compatible with our \(\alpha =1/3\) for \(d=4/3\).

The interaction is the most general quartic and \(O(N)^3\) tensor-invariant, hence involves three terms, tetrahedron, double-trace and pillow (see Eq. 7 in [80]). The choice of a non-canonical propagator allows the authors to analyze rigorously the renormalization group flow of the three couplings involved, proving the existence of an infrared fixed point which depends parametrically on the tetrahedral coupling. Taking this coupling small plays the role of the \(\epsilon \) parameter in the Wilson-Fisher analysis.

A simplifying feature of such models is that, unlike in the \(\phi ^q\) models considered above, the renormalization group flow does not include general 2- or 4-point diagrams: only melonic diagrams dominate the large N limit of correlation functions. This peculiarity allows one to close the Schwinger–Dyson equations for the 2n-point functions.

A longer term goal is to investigate whether similar Fermionic models defined on random unicycles could still show in the asymptotic infrared regime approximate reparametrization invariance on the spine and to explore their corresponding holographic properties.

Finally, under a specific conditioning of the branching process, it is possible to force the Galton–Watson tree to have a finite number \(p>1\) of infinite spines [98]. It would be interesting to establish heat-kernel bounds for such trees relying on the techniques of [8], to characterize their scaling limit (since Aldous’ CRT has a single spine), and to determine the renormalization group properties of field theories on such trees.