Abstract
The bootstrap percolation (or threshold model) is a dynamic process modelling the propagation of an epidemic on a graph, where inactive vertices become active if their number of active neighbours reach some threshold. We study an optimization problem related to it, namely the determination of the minimal number of active sites in an initial configuration that leads to the activation of the whole graph under this dynamics, with and without a constraint on the time needed for the complete activation. This problem encompasses in special cases many extremal characteristics of graphs like their independence, decycling or domination number, and can also be seen as a packing problem of repulsive particles. We use the cavity method (including the effects of replica symmetry breaking), an heuristic technique of statistical mechanics many predictions of which have been confirmed rigorously in the recent years. We have obtained in this way several quantitative conjectures on the size of minimal contagious sets in large random regular graphs, the most striking being that 5-regular random graph with a threshold of activation of 3 (resp. 6-regular with threshold 4) have contagious sets containing a fraction \(1/6\) (resp. \(1/4\)) of the total number of vertices. Equivalently these numbers are the minimal fraction of vertices that have to be removed from a 5-regular (resp. 6-regular) random graph to destroy its 3-core. We also investigated Survey Propagation like algorithmic procedures for solving this optimization problem on single instances of random regular graphs.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Models of epidemic spreadings as dynamical processes occurring on a graph appear in various contexts besides epidemiology [15, 23, 34, 42, 63]; for instance social sciences study viral marketing campaigns aimed at propagating new social trends, and in economy it is crucial to understand cascading effects potentially leading to the bankrupt of financial institutions. In these models individual agents are located on the vertices of a graph, and their state (healthy or contaminated for instance) evolve in time according to the state of their neighbours, the edges of the graph representing the contacts between agents that can possibly transmit the illness from one contaminated agent to an healthy one.
There is a great diversity in the details of these models: the dynamics can occur in continuous (asynchronous) or discrete time, according to deterministic or random rules, the state of an agent can be boolean (healthy or contaminated) or describe several levels of contamination, and finally the dynamics can be monotonous or not. To precise this last point, a dynamics is said monotonous if the states of an agent always occur in the same order in time, for instance in the Susceptible-Infected-Recovered (SIR) model the only allowed transitions are S \(\rightarrow \) I and I \(\rightarrow \) R, a Recovered individual being immune forever, whereas in the SIS model an agent can become infected several times in a row. In this paper we will concentrate on a simple monotonous dynamics, that evolve deterministically in discrete time, with inactive (Susceptible) variables becoming active (Infected) when their number of active neighbours reach some threshold, and then remain active for ever. For this reason it is called the threshold model, see [39] for a version introduced in sociology with an underlying complete graph, and [27] for its first appearance in physics under the name of bootstrap percolation (on random regular graphs).
Given one specific dynamical model there are many different questions that can be asked. The first, a priori simplest, issue concerns the time evolution of the system from a random initial condition, taking the initial state of each agent as an independent random variable. For monotonous dynamics a stationary state is reached after some time, and one can wonder whether the epidemic has invaded the whole graph (in other words whether it percolates) in this final state. The probability of this event obviously depends on the fraction of infected vertices in the initial condition, and this may lead to phase transitions for certain class of graphs; see [5, 11, 43] for such a study of the bootstrap percolation on finite-dimensional lattices, and [12, 25, 27, 44–46, 50, 73] for various type of dynamics on random graphs. In particular one finds for the bootstrap percolation on random regular graphs a phase transition at some initial critical density \({\theta _\mathrm{r}}\) (dependent on the degree of the graph and the threshold of activation): with high probability initial conditions with a fraction \(\theta \) of active vertices (without correlations between the sites) are percolating if and only if \(\theta > {\theta _\mathrm{r}}\).
Besides these studies of the “forward” (or “direct”) time evolution, which are somehow simplified by the independence assumption for the initial state variables, one can also formulate more difficult inference and optimization questions. An example of the former type is to infer some information on the initial state given a snapshot of the epidemic after some time evolution [6, 51, 66, 72]; this “inverse problem” is particularly relevant in epidemiology in the search of the “zero patient” who triggered the spreading of an illness. For what regards the latter type of questions, the design of an efficient vaccination campaign can indeed be seen as an optimization problem: find the smallest set of nodes (to minimize the economical and social cost) whose vaccination will prevent the epidemic to reach a given fraction of the population [7]. We shall actually consider in this paper the somehow reverse optimization problem, namely targeting a small set of initially active sites that lead to the largest possible propagation of the contagion. This obviously makes more sense in the perspective of viral marketing, in which it was first considered [47] than in the epidemiological one; the initial adopters of a new product, that can be financially incited to do so, are expected to convince most of their acquaintances and progressively the largest possible part of the population. From this point of view the additional constraint that the propagation should be as fast as possible is also a relevant one.
More precisely, one can define two versions of this optimization problem: (i) given a fixed number of initially active agents, choose them in order to maximize the number of active agents at some fixed later time, or in the final state of the propagation; (ii) find the minimal number of initially active agents such that all the agents are active, again after some time or in the final state. We will concentrate on the latter version of the problem but part of our analysis applies to both. These optimization problems are known to be hard from a (worst-case) computational complexity point of view [28, 35, 47], even to approximate. Exhibiting minimal percolating sets for bootstrap percolation on finite dimensional lattices is relatively easy thanks to their regular structures, but more refined extremal problems are also relevant in this case, see for instance [20, 62]. The understanding of these optimization problems seems less advanced in the case of sparse random graphs. There exist upper and lower bounds on the size of minimal contagious sets [4, 35, 67], some based in particular on the expansion properties of such graphs [31]. One particular case of the optimization problem (when the threshold of activation is equal to the degree of the vertex minus one) is actually equivalent to the decycling number problem of graph theory [19] (also known as minimal Feedback Vertex Set), which was settled rigorously for 3-regular random graphs in [17] (this paper also contains bounds for higher degrees). As this last point unveils the notion of minimal contagious sets is connected in some special cases to many other problems in graph theory; one way to see this connection is to picture the inactive sites of the initial condition as particles to be put on the graph. One wants to pack as many as possible of them (to obtain a contagious set of minimal size), yet they do have some kind of repulsive interactions because of the constraint of complete percolation at a later time. This is particularly clear when the threshold of activation is equal to the degree for all vertices: the problem is then exactly equivalent to the hard-core particle model, also known as independent set or vertex cover.
The strategy we shall follow to determine the minimal size of contagious sets of sparse random graphs will be the same as in [8, 9], namely a reformulation under the form of a statistical mechanics model which can be treated with the so-called cavity method [53–56]. This (heuristic) method yields predictions for any interacting model defined on a sparse random graph; its use in the context of random constraint satisfaction problems led to the discovery of a very rich phenomenology of phase transitions [48, 56], with many of these predictions later confirmed rigorously [1, 3, 13, 30, 32, 57]. Let us emphasise in particular the determination of the maximal size of independent sets of random regular graphs (which as we saw is a problem related to the present one), for which the predictions of the cavity method (see [14] and references therein) have been recently rigorously confirmed (for graphs of large enough but finite degree) in [33]. Another example in the context of graph theory is the study of matchings in random graphs, where the cavity method [75] has also been proved to be correct [26]. The main originality of our contribution with respect to [8, 9] is the use of a more refined version of the cavity method (i.e. incorporating the effects of replica symmetry breaking), and an analytical study of the limit where the time at which the complete activation is required is sent to infinity.
The rest of the article is organized as follows. In Sect. 2 we define precisely the dynamics under study, recall briefly some known results for random initial conditions, formulate the optimization problem and propose various interpretations of it, and for the convenience of the reader we summarize the main results to be obtained in the following. In Sect. 3 we derive the cavity method equations, both at the replica symmetric and one step of replica symmetry breaking level. The solution of these equations for random regular graphs is presented in Sect. 4, which contains the main analytical results of this work. Section 5 is devoted to single sample numerical experiments, where we confront the analytical predictions with the optimized initial configurations obtained with two kind of algorithms (a simple greedy one and a more involved procedure based on message passing). We finally draw our conclusions and present perspectives for future work in Sect. 6. The most technical parts of the computations are deferred to two Appendices.
2 Definitions and Main Results
2.1 Definition of the Dynamics
Let us consider a graph on \(N\) vertices (or sites), \(G=(V,E)\), with the vertices labelled as \(V=\{1,\ldots ,N\}\), and the number of edges denoted \(|E|=M\). The dynamical process under study concerns the evolution of variables \({\sigma }_i^t\) on the vertices, \({\sigma }_i^t=0\) (resp. \(1\)) if the vertex \(i\) is inactive (resp. active) at time \(t\). We shall denote \({\underline{\sigma }}^t=({\sigma }_1^t,\ldots ,{\sigma }_N^t)\) the global configuration at time \(t\). The latter is determined by the initial condition \({\underline{\sigma }}\) at the initial time, \({\underline{\sigma }}^0={\underline{\sigma }}\), and then evolves subsequently in a deterministic and parallel way, in discrete time, according to the rules:
where \({\partial i}\) is the set of neighbours of \(i\) on the graph, and \(l_i\) is a fixed threshold for each vertex; we will also use \(d_i = |{\partial i}|\) to denote the degree of vertex \(i\). The dynamics is monotonous (irreversible), an active site remaining active at all later times, an inactive site \(i\) becoming active if its number of active neighbours at the previous time crosses the threshold \(l_i\). Note that the configuration \({\underline{\sigma }}^t\) at time \(t\) is a deterministic function of the initial condition \({\underline{\sigma }}={\underline{\sigma }}^0\), and that by monotonicity one can define the final configuration \({\underline{\sigma }}^\mathrm{f} = \lim \limits _{t \rightarrow \infty } {\underline{\sigma }}^t\), this stationary configuration being reached in a finite number of steps for all finite graphs.
It turns out that the final configuration \({\underline{\sigma }}^\mathrm{f}\) is also the one reached by a sequential dynamics in which at each time step only one site \(i\) with at least \(l_i\) active neighbours is activated; a moment of thought reveals the independence of the final configuration with respect to the order of the updates. \({\underline{\sigma }}^\mathrm{f}\) is indeed the smallest configuration (considering the partial order \({\underline{\sigma }}\le {\underline{\sigma }}^{\prime }\) if and only if \({\sigma }_i \le {\sigma }^{\prime }_i\) for all vertices) larger than the initial condition \({\underline{\sigma }}\), such that no further site can be activated. It will sometimes be useful in the following to think of this process in a dual way, corresponding to the original presentation of bootstrap percolation in [27], namely to consider that inactive sites are sequentially removed if they have less than a certain number of inactive neighbours. An equivalent definition of \({\underline{\sigma }}^\mathrm{f}\) is thus given by the inactive sites it contains, that form the largest set (with respect to the inclusion partial order) contained in the set of inactive sites of \({\underline{\sigma }}\), and such that in their induced graph the degree of site \(i\) is larger or equal than \(d_i-l_i+1\); they form thus a (generalized inhomogeneous version of the) core of the initially inactive sites.
2.2 Reminder of the Behaviour for Random Initial Conditions on Random Regular Graphs
To put in perspective the optimization problem to be studied in this paper it is instructive to first recall briefly some well-known results for the evolution from a random initial configuration [12, 27]. For the sake of simplicity let us consider \(G\) to be a \(k+1\)-random regular graph (i.e. a graph drawn uniformly at random among all graphs in which every vertex has degree \(k+1\)), with a uniform threshold for activation set to \(l_i=l\) for all vertices. Suppose that the states of the vertices in the initial condition are chosen randomly, independently and identically for each vertex, with a probability \(\theta \) (resp. \(1-\theta \)) for a vertex to be active (resp. inactive). The probability for one vertex \(i_0\) to be active at some time \(t+1\), denoted \(x_{t+1}\), can be computed from the following equation:
Indeed such a vertex was either active in the initial condition, or has seen at least \(l\) of its neighbours activate themselves before time \(t\), and without the participation of \(i_0\). The probability \({\widetilde{x}}_t\) of this last event obeys the recursive equation
with a number of participating neighbours reduced from \(k+1\) to \(k\) as \(i_0\) has to be supposed inactive here. The initial condition for these equations is \(x_0={\widetilde{x}}_0=\theta \). In the limit \(t \rightarrow \infty \) of large times \({\widetilde{x}}_t \rightarrow {\widetilde{x}}_\infty (\theta )\), the smallest fixed-point in \([0,1]\) of the recursion (3). For each \(k\ge 2\) and \(l\) with \(2\le l \le k\) there exists a threshold \({\theta _\mathrm{r}}(k,l)\) such that \({\widetilde{x}}_\infty (\theta )\) is equal to 1 for \(\theta > {\theta _\mathrm{r}}\), strictly smaller than 1 for \(\theta <{\theta _\mathrm{r}}\). From Eq. (2) one realizes that the same statement applies to \(x_\infty (\theta )\), hence \({\theta _\mathrm{r}}\) is the threshold for complete activation (percolation) from a Bernouilli random initial condition with probability \(\theta \) for each active site. Studying more precisely Eq. (3) one realizes that for \(l=k\) the transition is continuous (\(x_\infty (\theta _\mathrm{r}^-)=1\)), with an explicit expression for the threshold, \({\theta _\mathrm{r}}(k,k)=\frac{k-1}{k}\). For \(2\le l \le k-1\) the transition is discontinuous (\(x_\infty (\theta _\mathrm{r}^-)<1\)), the threshold \({\theta _\mathrm{r}}\) is obtained as the solution of the equations:
where \({\widetilde{x}_\mathrm{r}}={\widetilde{x}}_\infty (\theta _\mathrm{r}^-)\) is the value of the fixed-point of (3) at the bifurcation where it disappears discontinuously. For \(l=2\) these equations can be solved explicitly and yield
For generic values of the parameters \(k,l\) there is no explicit expression of \({\theta _\mathrm{r}}\), as (4) are algebraic equations of arbitrary degree; some numerical values of \({\theta _\mathrm{r}}\) will be given in Table 4. For a given value of \(k\) the threshold \({\theta _\mathrm{r}}(k,l)\) is growing with \(l\): if an initial condition leads to complete activation for some parameter \(l\) it will also be activating under the less constrained dynamics with \(l^{\prime }<l\).
The relevant range for the threshold parameter \(l\) in this study of random initial conditions is \(2\le l \le k\). Indeed for \(l=0\) after one step the configuration is completely active regardless of \({\underline{\sigma }}^0\), for \(l=1\) a single active site (per connected component) in the initial configuration is enough to activate the whole graph, hence in these two cases \({\theta _\mathrm{r}}=0\). On the other hand if \(l=k+1\) one has \({\theta _\mathrm{r}}=1\): any pair of adjacent inactive sites in the initial condition will remain inactive for ever, and the number of such pairs is linear in \(N\) as soon as \(\theta <1\).
Note that the recursion equations (2, 3) are exact if the neighbourhood up to distance \(t\) of the vertex \(i_0\) is a regular tree of degree \(k+1\). The limit \(t\rightarrow \infty \) can be taken in this way only if the graph considered is an infinite regular tree. A rigorous proof that this reasoning is in fact correct also for the large size limit of random regular graphs (that converge locally to regular trees) can be found in [12].
2.3 Definition of the Optimization Problem Over Initial Conditions
Let us now come back to a general graph \(G\) with some thresholds \(l_i\) for vertex activation, and consider the minimal fraction of active vertices in an initial configuration that activates the whole graph, i.e.
This corresponds to the minimal size of a contagious (or percolating) set, divided by the total number of vertices. Following [8, 9] it will turn out useful to introduce another parameter \(T\) (a positive integer) in this optimization problem, and impose now that the fully active configuration is reached within this time horizon \(T\):
Obviously for any finite graph \({\theta _\mathrm{min}}(G,\{l_i\},T)\) decreases when \(T\) increases and has \({\theta _\mathrm{min}}(G,\{l_i\})\) as its limit for \(T\rightarrow \infty \). To turn the computation of \({\theta _\mathrm{min}}\) into a form more reminiscent of statistical mechanics problems we shall introduce a probability measure over initial configurations:
where \({\underline{\sigma }}^T\) is as above the configuration obtained after \(T\) steps of the dynamics starting from the configuration \({\underline{\sigma }}={\underline{\sigma }}^0\), the \(\mu \) and \(\epsilon \) are for the time being arbitrary parameters, and the partition function \(Z\) ensures the normalization of this law. The parameter \(\mu \) is a “chemical potential” that controls the fraction of initially active vertices (if \(\epsilon =0\) the measure \(\eta \) reduces to the Bernouilli measure), while \(\epsilon \) is the cost to be paid for each site \(i\) inactive at the final time \(T\). In particular if \(\epsilon =+\infty \) one has
with \({\mathbb {I}}(A)\) is the indicator function of the event \(A\), the measure is thus supported by activating initial configurations (within the time horizon \(T\)). It is then obvious that the knowledge of \(Z\) allows to deduce the sought-for minimal density \({\theta _\mathrm{min}}\), as
Actually one can gain more information from the whole dependency of the partition function on \(\mu \). Suppose indeed that the number of initial configurations with a fraction \(\theta \) of active vertices that activate the whole graph in \(T\) steps is, at the leading exponential order, \(e^{N s(\theta )}\), with an entropy density \(s(\theta )\) of order one with respect to \(N\). Then this entropy density can be computed, in the large \(N\) limit, as a Legendre transform of the logarithm of the partition function. More precisely, defining the free-entropy density \(\phi \) as
the evaluation of the sum over configurations in the definition of \(Z\) via the Laplace method yields in the large \(N\) limit:
hence \(s(\theta )\) can be obtained by an inverse Legendre transform of \(\phi (\mu )\), with \(s(\theta ) = \phi (\mu ) - \mu \, \theta \) and \(\theta = \phi ^{\prime }(\mu )\).
For completeness let us also make a similar statement when \(\epsilon \) is finite, i.e. when one does not impose strictly the constraint of complete activation at time \(T\). Denoting \(s(\theta ,\theta ^{\prime })\) the entropy density of initial configurations that have a fraction \(\theta \) of initially active vertices and that lead after \(T\) steps of evolution to a configuration with a fraction \(\theta ^{\prime }\) of active sites, one has
Varying the parameters \(\mu \) and \(\epsilon \) thus allows to reconstruct the function \(s(\theta ,\theta ^{\prime })\), and hence to solve the optimization problem denoted (i) in the introduction, namely for a fixed value of \(\theta \) find the maximal reachable \(\theta ^{\prime }\). We will mainly concentrate in the following of the paper on the optimization problem denoted (ii) in the introduction, that is imposing the full activation of the graph at time \(T\) (\(\theta ^{\prime }=1\)), which as explained above can be studied via the computation of \(s(\theta )=s(\theta ,\theta ^{\prime }=1)\) from the inverse Legendre transform of the free-entropy with \(\epsilon =+\infty \).
The definitions above were valid for any graph and any choice of the activation thresholds; we shall however be particularly interested in the case of large random regular graphs with uniform thresholds, we thus define
where the average is over uniformly chosen regular graphs of degree \(k+1\) on \(N\) vertices, with the same threshold for activation \(l\) on every vertex. The fact that the limit in the definition of \({\theta _\mathrm{min}}(k,l,T)\) exists could actually be proven rigorously using the method developed in [18], and it is expected that \({\theta _\mathrm{min}}(G,\{l_i=l\},T)\) is self-averaging (i.e. concentrates around its average in the large \(N\) limit). The existence of \({\theta _\mathrm{min}}(k,l)\) might be a more difficult mathematical problem that we shall not discuss further; it is a reasonable conjecture that it coincides with the limit of \({\theta _\mathrm{min}}(k,l,T)\) when \(T\rightarrow \infty \), i.e. that the large size and large time limits commute. We will see in Sect. 4.2.1 one argument in favour of this conjecture. Let us emphasize that \({\theta _\mathrm{min}}(k,l) < {\theta _\mathrm{r}}(k,l)\), with a strict inequality. This is indeed a large-deviation phenomenon: even if most initial configurations with density smaller than \({\theta _\mathrm{r}}\) do not activate the whole graph some very rare ones (with a probability exponentially small in \(N\) in the Bernouilli measure of parameter \(\theta <{\theta _\mathrm{r}}\)) are able to do so. Note also that \({\theta _\mathrm{min}}(k,l)\) is growing with \(l\) at fixed \(k\), for the same reasons as explained above in the discussion of \({\theta _\mathrm{r}}\). The computations of \({\theta _\mathrm{min}}\) we shall present will follow the strategy explained above on an arbitrary graph, namely the computation of a free-entropy density, that we define in the case of random regular graphs as the quenched average over the graph ensemble,
2.4 Equivalence with Other Problems and Bounds
As mentioned in the introduction the problem of minimal contagious sets can be related, for appropriate choices of the threshold parameters \(l_i\), to other standard problems in graph theory.
Consider first the case of an arbitrary graph where the thresholds \(l_i\) are equal to the degrees \(d_i\) for all vertices. An inactive site in the initial configuration will be activated only if it is surrounded by active vertices, and it will do so in a single step. In other words in any percolating initial condition, whatever the time horizon \(T\), the inactive vertices must form an independent set (no two inactive vertices are allowed to be neighbours). For regular random graphs one has thus \({\theta _\mathrm{min}}(k,k+1,T)={\theta _\mathrm{min}}(k,k+1)\) for all \(T\), and this quantity is equal to 1 minus the density of the largest independent sets of a \(k+1\)-regular random graph.
Another correspondance with previously studied models arises when \(T=1\), for any choice of the thresholds \(l_i\). Indeed in this case the vertex \(i\) can be inactive in a percolating initial configuration only if its number of inactive neighbours is smaller than some value (namely, \(\le d_i-l_i\)). These generalized hard-core constraints (repulsion between inactive vertices) correspond exactly to the so-called Biroli–Mézard (BM) model [21, 70] (with the correspondance inactive vertex \(\leftrightarrow \) vertex occupied by a particle in the BM model, and \(d_i-l_i \leftrightarrow \ell _i\) of the BM model). Hence for \(T=1\) the minimal density \({\theta _\mathrm{min}}\) is 1 minus the density of a close packing of the corresponding BM model. Further specializing this \(T=1\) case by setting \(l_i=1\) on each vertex leads to the constraint that every inactive site in a percolating initial configuration has to be adjacent with at least one active site, in other words that the active sites form a dominating set of \(G\). The minimal density \({\theta _\mathrm{min}}\) is thus the domination number (divided by \(N\)) of \(G\).
Consider now the thresholds of activation to be 1 less than the degrees, i.e. \(l_i=d_i-1\) on all vertices, with no constraint on the time of activation (\(T=\infty \)). As explained at the end of Sect. 2.1, the inactive vertices in the final configuration form the 2-core of the inactive ones in the initial configuration. A percolating initial configuration must be such that this 2-core is empty, in other words the subgraph induced by the inactive sites of the initial configuration must be acyclic (a tree or a forest), i.e. the active sites have to form a decycling set [19] (also known as a Feedback Vertex Set), and \(N {\theta _\mathrm{min}}\) is the decycling number of \(G\). This characterization leads to the following bound for every \(k+1\)-regular graph with thresholds \(k\) of activation on every site,
Indeed if \(A\) denotes the number of active vertices in a percolating initial configuration, the \(N-A\) other vertices induces a forest, the number of edges between inactive vertices is thus at most \(N-A-1\). On the other hand this number is at least \(\frac{k+1}{2} N - (k+1)A\) (the first term being the total number of edges, and the number of edges incident to at least one active site being at most \((k+1)A\)). The decycling number of random regular graphs was studied in [17], proving in particular that the bound (16) is actually tight for 3-regular large random graphs, i.e. \({\theta _\mathrm{min}}(2,2)=1/4\), and it was conjectured to be also the case for 4-regular ones (i.e. \({\theta _\mathrm{min}}(3,3)=1/3\)). An asymptotic lowerbound on \({\theta _\mathrm{min}}(k,k)\) for large values of \(k\) was worked out in [41], we will come back on this result in Sect. 4.2.1. Note also that the decycling number of arbitrary sparse random graphs was studied with physics methods in [71, 76].
For general thresholds smaller than the degrees minus one the active sites of a percolating initial configuration must form a “de-coring” set instead of a “de-cycling” set (i.e. their removal has to provoke the disappearance of a \(q\)-core with \(q>2\)). A generalization of the lower bound (16) to any \(k+1\)-regular graph with uniform threshold \(l\) was given in [35], and reads
Its proof goes as follows. Consider the sequential process explained at the end of Sect. 2.1 in which at each time \(t\) a single vertex gets activated, and denote \(E(t)\) the number of edges between active and inactive vertices after \(t\) steps of this process. By definition of the activation rule \(E(t+1)-E(t) \le k+1-2l\). If as above \(A\) denotes the number of active sites in a percolating initial configuration, by definition \(E(N-A)=0\), hence \(E(0)\ge (N-A) (2l-k-1)\). On the other hand \(E(0) \le (k+1) A\), which gives the lower bound (17) on the possible values of \(A\).
We should also mention an upper bound on the minimal sizes of contagious sets derived in [4, 67] for graphs of arbitrary degree distributions, which yields in the case of \(k+1\)-regular graphs:
To conclude this discussion let us mention that the “de-coring” perspective on the minimal contagious set problem is somehow reminiscent (even if not directly equivalent), to the Achlioptas processes [2, 69] (more precisely of their offline version [24]) where one looks for an extremal event avoiding the appearance of an otherwise typical structure (a giant component in the Achlioptas processes, a core in the minimal contagious set case).
2.5 Main Analytical Results
Let us draw here a more detailed plan of the rest of the paper to make its reading easier and faster for someone not interested in the technical details of the statistical mechanics method (who can browse quickly over the next section and jump to the results announced in Sect. 4). In order to compute the minimal density \({\theta _\mathrm{min}}\) of contagious sets we shall rephrase this problem as a statistical mechanics model and apply to it the cavity method. The latter is based on self-consistent assumptions of various degrees of sophistication, parametrized by the so-called level of replica symmetry breaking. We will study the first two levels of this hierarchy, named replica symmetric (RS) and one step of replica symmetry breaking (1RSB). These two approaches will lead to two predictions for \({\theta _\mathrm{min}}\), to be denoted respectively \({\theta _\mathrm{min,0}}(k,l,T)\) and \({\theta _\mathrm{min,1}}(k,l,T)\). From general bounds established in the context of disordered statistical mechanics models (first for the Sherrington-Kirkpatrick model [40, 64, 74] and later for some models on sparse random graphs [36, 37, 65]) it is expected that the different levels of the cavity method provide improving lower bounds on \({\theta _\mathrm{min}}\), namely
Our computation of \({\theta _\mathrm{min,0}}(k,l,T)\) and \({\theta _\mathrm{min,1}}(k,l,T)\) relies on the resolution of a set of roughly \(2T\) algebraic equations on \(2T\) unknowns, explicit numbers will be given in Sect. 4. We managed to perform analytically the \(T\rightarrow \infty \) limit and reduce the determination of \({\theta _\mathrm{min,0}}(k,l)\) and \({\theta _\mathrm{min,1}}(k,l)\) (their limit when \(T\rightarrow \infty \)) to a finite number of equations, that will also be presented along with numerical results in Sect. 4. We found four particular cases in which the predictions of the first two levels of replica symmetry breaking coincide when \(T\rightarrow \infty \), which led us to conjecture that they are the exact ones, namely:
all these cases saturating the lower bounds of (16, 17). The first (resp. second) equality was actually proven (resp. conjectured) in [17]. We have also performed a large degree expansion of the decycling number of random regular graphs, yielding the conjecture
3 Cavity Method Treatment of the Problem
3.1 Factor Graph Representation
As explained in Sect. 2.3 the central quantity to compute is the free-entropy density defined from the partition function normalizing the probability law (8), that for completeness we shall generalize to possibly site dependent chemical potentials \(\mu _i\) and costs for non-activation \(\epsilon _i\):
This expression is not very convenient to handle directly because the variables \({\sigma }_i\) have complicated interactions under this law: \({\sigma }_i^T\) is indeed a function of all variables \({\sigma }_j\) on the vertices \(j\) at distance smaller than \(T\) from \(i\). A way to circumvent this difficulty and to turn the interactions of the model into local ones has been proposed in [8, 9], and we shall follow the same approach here.
Let us first define \(t_i({\underline{\sigma }})\) as the time of activation of site \(i\) in the dynamical process generated by the initial configuration \({\underline{\sigma }}\), i.e. \(t_i({\underline{\sigma }})=\min \{ t : {\sigma }_i^t=1 \}\), with conventionally \(t_i({\underline{\sigma }})=\infty \) if this time is strictly greater than the time horizon \(T\). These variables obey the following equations:
where the function \(f\) is defined as
Here \({\underset{l}{\min }}(t_1,\ldots ,t_n)\) is the \(l\)-th smallest \(t_i\), i.e ordering the arguments as \(t_1 \le t_2 \le \cdots \le t_n\) one has \({\underset{l}{\min }}(t_1,\ldots ,t_n)=t_l\). This translates the dynamic rules (1) in terms of the activation times, a site \(i\) activating at the time following the first time where at least \(l_i\) of its neighbours are active. In the following \(f(0,t_1,\ldots ,t_n;l)\) will be abbreviated in \(f(t_1,\ldots ,t_n;l)\). Reciprocally one can show that if a set of \(\{t_i\}_{i\in V}\) verifies the condition that for all \(i\) either \(t_i=0\) or \(t_i=f(\{t_j\}_{j \in {\partial i}};l_i)\), then they correspond to the activation times for the dynamics started from the initial condition \({\underline{\sigma }}\) such that \({\sigma }_i=1\) if and only if \(t_i=0\). These two descriptions in terms of \(({\sigma }_1,\ldots ,{\sigma }_N)\) and \((t_1,\ldots ,t_N)\) are thus equivalent, yet the great advantage of the latter is that the conditions to enforce among the \(t_i\) are local along the graph, and that they contain in an obvious way the information on \({\sigma }_i^T\) that was lacking to deal with (22).
Finally a last twist on Eq. (22) will be to “duplicate” the activation time \(t_i\) on all edges connecting \(i\) to one of its neighbour \(j\), introducing redundant variables \(t_{ij}\) to be finally constrained to be all equal to \(t_i\). Let us denote \({\underline{t}}\) the collective configurations of all these \(2M\) variables \(t_{ij},t_{ji}\) on each edge \({\langle }i,j{\rangle }\) of the graph, that take values in \(\{0,1,\ldots ,T,\infty \}\), and consider the following probability measure on \(({\underline{\sigma }},{\underline{t}})\):
where the functions \(w_i\) are defined by
The above observations imply that the marginal of \({\underline{\sigma }}\) under \(\eta ({\underline{\sigma }},{\underline{t}})\) is precisely the desired one from Eq. (22), and that in the support of the law the \({\underline{t}}\) are strictly constrained to be the activation times for the dynamics starting from \({\underline{\sigma }}\). This correspondance being one-to-one the partition function is the same in (22) and (25). A portion of the factor graph [49] associated to the probability law (25) is sketched in Fig. 1, with black squares representing the function nodes (interactions) \(w_i\), black circles the variables \({\sigma }_i\), and white circles the variables \(t_{ij},t_{ji}\). One notes that if the original graph \(G\) is a tree (resp. is locally a tree) then the corresponding factor graph is a tree (resp. is locally a tree). This fact was the motivation for the “duplication” of the \(t_i\) variables on the surrounding edge, without it short loops of interactions would still be present in the factor graph.
3.2 Replica Symmetric (RS) Formalism
Let us now explain how the probability law (25) and its associated normalization \(Z\) can be handled within the cavity formalism, first at the simplest, so called Replica Symmetric (RS), level.
3.2.1 Single Sample Equations
If the graph \(G\) were a finite tree the factor graph associated to (25) would be a tree, hence \(Z\) and the marginals of \(\eta \) could be computed exactly via the recursive equations that we are about to write down. If the graph is only locally tree-like these equations are only approximate, they correspond to the (loopy) Belief Propagation equations, valid under some assumptions of long-range correlation decay under the measure \(\eta \). This recursive computation of \(Z\) amounts to introduce on each directed edge \(i\rightarrow j\) of the graph a “message” \(\eta _{i \rightarrow j}(t_{ij},t_{ji})\), which is a normalized probability distribution over a pair of activation times. These messages obey recursion relations of the form \(\eta _{i \rightarrow j} = {\widehat{g}}(\{ \eta _{k \rightarrow i} \}_{k \in {\partial i \setminus j}};l_i,\epsilon _i,\mu _i)\), where the mapping \(\eta ={\widehat{g}}(\eta _1,\ldots ,\eta _k;l,\epsilon ,\mu )\) is given by
Here and in the following unprecised summations over a time index go along \(\{0,\ldots ,T,\infty \}\). The function \({\widehat{z}_\mathrm{iter}}(\eta _1,\ldots ,\eta _k;l,\epsilon ,\mu )\) is defined by normalization, in such a way that \(\sum _{t,t^{\prime }}\eta (t,t^{\prime })=1\).
The knowledge of the messages \(\eta _{i \rightarrow j}\) on all edges of the graph allows to compute the free-entropy density, according to the Bethe formula:
where the second sum runs over the (undirected) edges of the graph, and the local partition functions are
The marginals of the law (25) can also be deduced from the messages, for instance the probability distribution of the activation time \(t_i\) for the vertex \(i\) reads \(\eta (t_i)={\widehat{\eta }_\mathrm{site}}(\{\eta _{j \rightarrow i} \}_{j \in {\partial i}};l_i,\epsilon _i,\mu _i)(t_i)\), where
The probability that the vertex \(i\) is active in the initial condition is then deduced as \(\eta ({\sigma }_i=1)=\eta (t_i=0)\). As explained above in Eq. (13), one can deduce from the above results the entropy density \(s(\theta ,\theta ^{\prime })\) for initial configurations with a fraction \(\theta \) of active sites leading to a fraction \(\theta ^{\prime }\) of active sites after \(T\) steps, taking \(\mu _i=\mu \) and \(\epsilon _i = \epsilon \) for all sites, with
Note that the derivatives of \(\phi \) with respect to \(\mu \) and \(\epsilon \) can be taken only on the explicit dependence in (28), the recursion equations on the messages \(\eta _{i \rightarrow j}\) being precisely the stationarity condition of \(\phi \) with respect to the \(\eta \)’s.
3.2.2 A More Compact Parametrization of the Messages
Each probability distribution \(\eta (t,t^{\prime })\) is a priori described by \((T+2)^2-1\) independent real numbers (the times run over \(T+2\) values, including \(\infty \), and there is a global normalization constraint). We shall see however that a much more compact parametrization is possible, which will be very useful for the further analytical treatment of the model. From now on we shall assume that \(\mu _i=\mu \) and \(\epsilon _i =\epsilon \) for all vertices. To unveil these simplifications let us first rewrite Eq. (27) more explicitly:
where in all the three cases \(t^{\prime }\) can take any value in \(\{0,1,\ldots ,T,\infty \}\). Now the condition “\({\underset{l}{\min }}(t_1,\ldots ,t_k,t^{\prime })=t-1\)” is easily seen to be equivalent to “at least \(l\) of the time arguments are \(\le t-1\) and at most \(l-1\) of them are \(\le t-2\)”. Similarly the condition “\({\underset{l}{\min }}(t_1,\ldots ,t_k,t^{\prime }) \ge T\)” is equivalent to “at most \(l-1\) times are \(\le T-1\)”. This observation allows to rewrite the above equations under the following form:
where the summation in the second (resp. third) line is over the partitions \(I,J,K\) (resp. \(I,J\)) of \(\{1,\ldots ,k\}\). These expressions reveal a first simplification, as already noticed in [8, 9]: among the \((T+2)^2\) elements of \(\eta (t,t^{\prime })\) only \(3T+2\) are distinct. Indeed \(\eta (0,t^{\prime })\) is independent of \(t^{\prime }\), for a given value of \(t \in \{1,\ldots ,T\}\) \(\eta (t,t^{\prime })\) takes at most three distinct values, whether \(t^{\prime }\ge t\), \(t^{\prime }=t-1\), or \(t^{\prime } \le t-2\) and finally \(\eta (\infty ,t^{\prime })\) takes two values whether \(t^{\prime }\le T-1\) or \(t^{\prime } \ge T\). There is however a further simplification to perform: in the right hand sides of the above equations the \(\eta _i\)’s always appear under the form of particular linear combinations. In particular the elements under the diagonal of the matrices \(\eta _i\), i.e. \(\eta _i(t,t^{\prime })\) with \(t\ge t^{\prime }\), always intervene under the form \(\sum _{t \ge t^{\prime }}\eta (t,t^{\prime })\). This allows to reduce further the number of relevant linear combinations of elements of the \(\eta \)’s. A convenient parametrization of the messages \(\eta \) is thus provided by the numbers \(a_t\) for \(t \in \{0,1,\ldots ,T\}\) and \(b_t\) for \(t \in \{1,\ldots ,T\}\), defined by:
One can consistently extend these definitions with \(b_0=0\), and it will be useful to adopt the convention \(e^{-\mu b_{-1}}=0\) in order to simplify some expressions. Let us denote \(h=(a_0,a_1,\ldots ,a_T,b_{T-1},\ldots ,b_1)\) the vector of \(2T\) reals encoding in this way a matrix \(\eta \); \(h\) will be called a cavity field in the following (note that we excluded \(b_T\) which disappears from the final expressions). The recursion relations (36–38) should now be transformed into a relation between cavity fields, i.e. \(h=g(h_1,\ldots ,h_k)\), with \(h_i=(a_0^{(i)},a_1^{(i)},\ldots ,a_T^{(i)},b_{T-1}^{(i)},\ldots ,b_1^{(i)})\). Inserting the definitions (39) into the Eqs. (36–38) leads to the explicit form for \(g\),
where we defined
It can be checked that for \(T=1\) and \(\epsilon =+\infty \) these equations correspond, as they should, to the one of the Biroli–Mézard model (see Eqs. (108, 109) of [70]). One can also express the partial partition functions \({\widehat{z}_\mathrm{site}}\) and \({\widehat{z}_\mathrm{edge}}\) in terms of these fields. It will be more convenient to factor out a common part in the site and edge contributions to the free-entropy. Denoting \(r(\eta )=\sum _t \eta (t,0)\), we define \({z_\mathrm{edge}}\) as:
where the explicit expression is obtained from Eq. (30). Similarly, exploiting Eq. (29), we get for the site term (factoring also a contribution from the chemical potential):
where as above in the summations \(I,J,K\) denotes a partition of \(\{1,\ldots ,k+1\}\).
To summarize the results of this reparametrization, on a given graph one has cavity fields \(h_{i\rightarrow j}\) on each directed edge, obeying the Belief Propagation equations \(h_{i\rightarrow j}=g(\{h_{k \rightarrow i}\}_{k \in {\partial i \setminus j}})\), with the \(g\) defined in Eq. (40), and the Bethe free-entropy density is computed from these cavity fields according to
with \({z_\mathrm{site}}\) and \({z_\mathrm{edge}}\) defined in Eqs. (43) and (42) respectively. Note that the factors \(r\) introduced in the definitions of \({z_\mathrm{site}}\) and \({z_\mathrm{edge}}\) compensate because in the expression of the Bethe free-energy of Eq. (28) the messages on each directed edge appear exactly once in the site term and once in the edge term. The marginals of the law \(\eta ({\underline{\sigma }},{\underline{t}})\) can also be computed from the cavity fields \(h\), in particular from the expression (31) one obtains the marginal of one activation time from the incident cavity fields as
3.2.3 Random (Regular) Graphs
The replica symmetric cavity method, for generic models defined on sparse random graphs, postulates the asymptotic validity of the above computations, exact on finite trees, thanks to the local convergence of random graphs to trees and an assumption of correlation decay at large distance. The order parameter is then a probability distribution over cavity fields, the randomness arising from the fluctuations of the degrees of the vertices in the graph and/or the randomness in the local interactions.
In the case of random regular graphs with no disorder in the coupling the situation is even simpler, as one can look for a “factorized” solution with all cavity fields equal. In particular for the model at hand on a \(k+1\) random regular graph, with the same threshold of activation \(l\) for all vertices, the RS prediction for the typical free-entropy density in the thermodynamic limit defined in Eq. (15) reads
which is easily obtained from (44) noting that \(2M=(k+1)N\) in a regular graph. The field \(h\) is the fixed-point solution of the cavity recursion (40),
The marginal law for the activation time is obtained from (45) by setting all the fields to \(h\), which allows finally to compute the entropy density from the Legendre inverse transform discussed in (13).
We shall discuss the results obtained from this RS prediction in the next Section, more explicit formulas for the RS equation in this case, along with some technical details on their resolution being displayed in the Appendix 1. One can however anticipate that in some regime of parameters the RS hypothesis will be violated. This is for instance known for \(T=1\), \(\epsilon =+\infty \), which corresponds to the Biroli–Mézard model; it was indeed shown in [70] that for large negative values of \(\mu \) the predictions of the RS ansatz are unphysical, the frustration arising from the contradictory constraints of putting as few active vertices in the initial condition as possible while imposing that all vertices become active at a latter time induces long-range correlations between variables that are incompatible with the RS ansatz. This limit \(\mu \rightarrow -\infty \) being the interesting case for the computations of the minimal density of contagious sets, we shall now see how to include the effects of replica symmetry breaking in this model.
3.3 One Step of Replica Symmetry Breaking (1RSB) Formalism
The long-range correlation decay assumption underlying the RS cavity method breaks down for models with too much frustration. In this case one has to picture the configuration space as fractured into pure states, or clusters, that we shall index here by \(\gamma \), such that the correlation decay assumption only holds for the Gibbs–Boltzmann probability law restricted to one pure-state. The partition function restricted to a given pure-state is denoted \(Z_\gamma \), in such a way that \(Z=\sum _\gamma Z_\gamma \). The replica symmetry breaking version of the cavity method then postulates some properties of this decomposition into pure states, which are compatible with the local convergence of the graph under study to a tree. In the first non-trivial version of the RSB formalism, so called one-step RSB (1RSB), one assumes the existence of a complexity function, also called configurational entropy in the context of glasses, \(\Sigma (\phi )\), such that the number of pure states with an internal free-entropy density \(\phi _\gamma = \frac{1}{N} \ln Z_\gamma \) close to some value \(\phi \) is, at the leading exponential order, \(e^{N \Sigma (\phi )}\). The computation of \(\Sigma (\phi )\) is performed via the 1RSB potential with a parameter \(m\) (known as the Parisi breaking parameter), related to \(\Sigma \) through a Legendre transform structure [58]:
The function \(\Sigma (\phi )\) can be reconstructed in a parametric way varying \(m\), with
\(\phi _\mathrm{int}(m)\) denoting the internal free-entropy density of the clusters selected by the corresponding value of \(m\). The value \(m=1\) plays a special role in this approach, as it corresponds a priori to the original computation of the free-entropy density of the model. However a so-called condensation (or Kauzmann) transition can occur, signaled by the vanishing of the complexity \(\Sigma \) associated to \(m=1\). In this case the Gibbs–Boltzmann measure is dominated by a sub-exponential number of pure-states, corresponding to a parameter \(m_\mathrm{s}<1\) with \(\Sigma (m_\mathrm{s})=0\). In the following paragraphs we shall derive the 1RSB equations and the expression of the 1RSB potential for the model under study, before discussing the concrete results for random regular graphs in the next Section.
3.3.1 Single Sample Equations
Let us first discuss the 1RSB formalism with the basic messages represented in terms of the matrices \(\eta (t,t^{\prime })\). In the RS description one had a message \(\eta _{i \rightarrow j}\) on each directed edge of the graph, solution of the recurrence equations \(\eta _{i \rightarrow j} = {\widehat{g}}(\{ \eta _{k \rightarrow i} \}_{k \in {\partial i \setminus j}};l_i,\epsilon ,\mu )\), see Eq. (27). At the 1RSB level one introduces instead a distribution \({\widehat{P}}_{i \rightarrow j}(\eta )\) on each directed edge, the randomness being over the choice of the pure-state \(\gamma \) with a weight proportional to \(Z_\gamma ^m\). These distributions are thus found to obey the recurrence equations \({\widehat{P}}_{i \rightarrow j}={\widehat{G}}[\{ {\widehat{P}}_{k \rightarrow i} \}_{k \in {\partial i \setminus j}}]\), where \({\widehat{P}}={\widehat{G}}({\widehat{P}}_1,\ldots ,{\widehat{P}}_k)\) means
with \({\widehat{g}}\) and \({\widehat{z}_\mathrm{iter}}\) defined in Eq. (27), and \({\widehat{\mathcal{Z}}_\mathrm{iter}}\) normalizes the distribution \({\widehat{P}}\). The 1RSB potential \(\Phi (m)\) defined above is then computed from the solution of these equations, according to
where
are weighted averages, over the pure-states distribution, of the site and edge contributions to the free-entropy defined in (29, 30). Similarly the marginal distribution of an activation time can be computed as a weighted average of the RS expression in the various pure-states, i.e.
Note that the derivative \(\Phi ^{\prime }(m)\), which plays an important role to compute the complexity from Eq. (49), can be taken in (51) on the explicit dependence on \(m\) only, the recursion relations on the \({\widehat{P}}_{i\rightarrow j}\) being the stationarity conditions of (51) with respect to the \({\widehat{P}}\)’s.
As we have seen in the discussion of the RS cavity method the matrices \(\eta \) can be parametrized in a more economic way by the fields \(h\) (vectors of \(2T\) real numbers). The expressions of the 1RSB quantities can also be rewritten using this parametrization. After a few lines of algebra one finds that the potential \(\Phi (m)\) reads
with
the weighted averages of the quantities defined in (42, 43). The field distributions \(P_{i \rightarrow j}(h)\) are solutions of the recurrence equations \(P_{i \rightarrow j}=G(\{P_{k \rightarrow i} \}_{k \in {\partial i \setminus j}})\), where the mapping \(P=G(P_1,\ldots ,P_k)\) is given explicitly by
\({\mathcal{Z}_\mathrm{iter}}\) is a normalizing factor ensuring that the left hand side is a probability distribution, \(g\) is the function defined in Eq. (40), and the reweighting factor reads
the last equality following from Eqs. (36, 39).
3.3.2 Random Regular Graphs
For the reasons explained in the context of the RS ansatz in Sect. 3.2.3 one can look for a simple factorized solution of the 1RSB equations in the case of a \(k+1\) regular random graph with all thresholds of activation equal to \(l\). In this case one has to find a distribution \(P(h)\) solution of
where \(m \in [0,1]\) is the Parisi breaking parameter and the functions \(g\) and \({z_\mathrm{iter}}\) are the ones defined in Eqs. (40, 59). The 1RSB potential is then computed as
with the functions \({z_\mathrm{site}},{z_\mathrm{edge}}\) defined in Eqs. (42, 43). As already mentioned above \(\Phi ^{\prime }(m)\) can be computed by taking into account only the explicit dependence on \(m\) of (61).
3.4 “Energetic” 1RSB Formalism
Even within the simplified case of the factorized ansatz for regular graphs the 1RSB equations are relatively complicated, as they involve the resolution of a distributional equation on \(P(h)\). However we are ultimately interested in a particular limit for the computation of the minimal density of contagious sets, namely the case where \(\epsilon =+\infty \) (to take into account only the fully activating configurations), and in the limit \(\mu \rightarrow -\infty \) (to select the initial configurations with the minimal number of active sites). It turns out that a simplified version of the 1RSB formalism can be devised in this case, corresponding to the “energetic” version of the 1RSB cavity method, first developed in [55, 56], see in particular Sect. 5 of [70] for such a treatment of the related Biroli–Mézard model. This simplified treatment amounts to take simultaneously the limit \(m\rightarrow 0\) and \(\mu \rightarrow - \infty \), with a fixed finite value of a new parameter \(y=-\mu m\). To explain the meaning of this limit let us rewrite more explicitly the expression of the 1RSB potential of Eq. (48) in the case \(\epsilon =+\infty \), introducing the complexity \(\Sigma (s,\theta )\) counting the (exponential) number of clusters containing a number of order \(e^{N s}\) of activating initial configurations with a fraction \(\theta \) of active sites, hence with a free-entropy density \(\phi =\mu \theta +s\):
In the limit \(m\rightarrow 0\), \(\mu \rightarrow - \infty \) with \(y=-\mu m\) this function becomes
The “energetic” complexity \(\Sigma _\mathrm{e}(\theta )\) can thus be computed via an inverse Legendre transform of the potential \(\Phi _\mathrm{e}(y)\),
As we shall see the “energetic” 1RSB cavity equations leading to the computations of \(\Phi _\mathrm{e}(y)\) are much simpler than the initial 1RSB ones at finite values of \(\mu \) and \(m\). The price to pay for this simplification is the loss of information on the entropy of the clusters when going from \(\Sigma (s,\theta )\) to \(\Sigma _\mathrm{e}(\theta )\). However this is not a problem for the determination of \({\theta _\mathrm{min}}\): its estimate at the 1RSB level, to be denoted \({\theta _\mathrm{min,1}}\), is the smallest value of \(\theta \) with \(\Sigma _\mathrm{e}(\theta ) \ge 0\). Indeed the least dense activating configurations have to be in some pure states, whatever their entropy.
3.4.1 Simplification of the Cavity Field Recursion (Warning Propagation Equations)
We want to simplify the Eq. (40) giving \(h=g(h_1,\ldots ,h_k)\) with \(\epsilon =+\infty \) and in the limit \(\mu \rightarrow -\infty \). First let us make some remarks, valid when \(\epsilon =+\infty \) for any value of \(\mu \). From the definition (39) of the fields \(b_t\), or from their expressions in (40), it is obvious that
One can also notice that for \(\epsilon =+\infty \) one has, for any \(\mu \), the equality \(a_T=b_T\): this appears both from the definition (39) of the fields, as \(\eta (\infty ,t)=0\) when \(\epsilon =+\infty \), and from the recursion relations (40), the last term in the first line of (40) disappearing when \(\epsilon =+\infty \). To continue the above chain of inequalities let us first compute from (40)
where \(I,J\) forms a partition of \(\{1,\ldots ,k\}\). This shows that \(e^{-\mu a_{T-1}} \ge e^{-\mu a_T} = e^{-\mu b_T}\), because in the right-hand side \(e^{-\mu b_T^{(i)}} \ge e^{- \mu b_{T-1}^{(i)}}\). These inequalities can then be continued by recurrence, as for \(t \in \{0,\ldots ,T-2\}\) one obtains from (40)
hence
and for any \(\mu \le 0\):
Let us now take the limit \(\mu \rightarrow -\infty \) in the Eq. (40), assuming that \(a_t\) and \(b_t\) have finite limits. Treating (40) at the leading exponential order one obtains
where
Now from the inequalities (69) it appears that \(\mathcal{S}_t \le 1\), hence that the \(a\)’s and \(b\)’s belong to the interval \([0,1]\). It is however natural to assume that they are integers, as in the limit \(\mu \rightarrow -\infty \) they can be interpreted as differences between number of particles in constrained groundstate configurations (see [55, 70] for more details). Within this ansatz the \(a\)’s and \(b\)’s can only be equal to \(0\) or \(1\); using in addition the inequalities (69) one realizes that the fields \(h\) can only take \(2T+1\) possible values, that we shall call \(A_t\) for \(t \in \{0,1,\ldots ,T-1\}\) and \(B_t\) for \(t\in \{0,1,\ldots ,T\}\). These are defined as follows; \(A_t\) denotes the case where \(a_0=\cdots =a_t=1\), all the other \(a\)’s and \(b\)’s vanishing. For \(t\in \{2,\ldots ,T\}\), \(B_t\) means that \(b_1=\cdots =b_{t-1}=0\), all the other \(a\)’s and \(b\)’s being equal to 1. Finally \(B_1\) corresponds to the case where all \(a\)’s and \(b\)’s are equal to 1, and \(B_0\) to the case where they all vanish. Note that one can consistently extend these definitions to \(A_T=B_T\), as by definition \(a_T=b_T\).
It remains to determine the value of \(h=g(h_1,\ldots ,h_k)\) in this \(\mu \rightarrow -\infty \) limit, when all the fields \(h_1,\ldots ,h_k\) belong to the set \(\{A_0,A_1,\ldots ,A_{T-1},A_T=B_T,B_{T-1},\ldots ,B_1,B_0 \}\) of “hard fields”, or Warning Propagation messages. Some algebra, sketched in Appendix 1, leads to:
where \(t_1,\ldots ,t_n \in \{0,\ldots ,T-1\}\) and \(t_{n+1},\ldots ,t_k \in \{0,\ldots ,T\}\). We assumed conventionally that \({\underset{l}{\min }}(t_1,\ldots ,t_{l-1})=\infty \).
The Eq. (73) can be given a very intuitive interpretation. The messages \(h \in \{A_0,\ldots ,A_{T-1},B_0,\ldots ,B_T\}\) can be interpreted as “warnings” sent from one vertex of the graph to one of its neighbours, with the following meanings. A vertex \(i\) sends a message \(h_{i \rightarrow j}=B_t\) to one of its neighbour \(j\) to say: “if \(j\) is kept inactive at all times the configuration of \(i\) and of its sub-tree (the one rooted at \(i\) and excluding \(j\)) leads to complete activation of the sub-tree within the time horizon \(T\), and \(i\) activates itself at time \(t\)”. In particular \(h_{i \rightarrow j}=B_0\) means that \(i\) is activated in the initial configuration. On the contrary \(i\) sends the message \(h_{i \rightarrow j}=A_t\) to \(j\) to express: “the complete activation of \(i\) and its sub-tree requires that \(j\) becomes activated at time \(t\)”. The rules of Eq. (73) for the combination of these messages are then obtained by finding the configuration compatible with them, containing the minimal number of active sites in the initial configuration (because of the \(\mu \rightarrow -\infty \) limit):
-
if strictly less than \(l-1\) incoming messages are of the type \(B_{t_i}\), with \(t_i \in \{0,\ldots ,T-1\}\), the central site \(i\) will never have more than \(l\) active neighbours (even with the participation of the receiving site \(j\)) if it is initially inactive, hence the only way for \(i\) to be active at time \(T\) is to be active in the initial configuration, which implies \(h_{i \rightarrow j}=B_0\).
-
if at least \(l\) of the incoming messages are of the type \(B_{t_i}\), with \(t_i \in \{0,\ldots ,T-1\}\), say \((B_{t_1},\ldots ,B_{t_n})\), the central site \(i\) will become active at time \(t=1+{\underset{l}{\min }}(t_1,\ldots ,t_n)\), without the “help” of the activation of the site \(j\) receiver of the message. This situation thus leads to a message of type \(B_t\), at the condition that all other incoming messages of type \(\{A_0,\ldots ,A_T\}\) do not require the activation of the central site \(i\) at a time strictly earlier than \(t=1+{\underset{l}{\min }}(t_1,\ldots ,t_n)\).
-
the participation of the activation of the receiving site \(j\) is required at some time \(t\) when the above condition is not fulfilled, i.e. when the incoming messages \((A_{t_{n+1}},\ldots ,A_{t_k})\) require the activation of the central site at some time \(t_\mathrm{act}=\min (t_{n+1}, \ldots ,t_k) < 1+ {\underset{l}{\min }}(t_1,\ldots ,t_n)\). This mechanism is possible if at time \(t_\mathrm{act} -1\) already \(l-1\) of the neighbours sending messages of type \(B\) are active, i.e. it requires \({\underset{l-1}{\min }}(t_1,\ldots ,t_n)\le t_\mathrm{act} -1\). The “help” needed from the receiving site is that it is active at some time before \(t_\mathrm{act} -1\); in the limit \(\mu \rightarrow -\infty \) the least dense configurations, and thus the least stringent constraint on the time of activation is privileged, hence the message sent in this case is \(h_{i\rightarrow j}= A_{t_\mathrm{act} -1}\).
-
all cases not fulfilling one of the conditions above require that \(i\) is active in the initial configuration to be active at time \(T\), hence the message sent is \(h_{i\rightarrow j}= B_0\).
3.4.2 Energetic 1RSB Single Sample Equations
Within this ansatz the 1RSB distributions \(P(h)\) greatly simplify, as they are supported on the discrete set \(h \in \{A_0,A_1,\ldots ,A_{T-1},A_T=B_T,B_{T-1},\ldots ,B_1,B_0 \}\). We shall denote \(p_t\) the weight in \(P(h)\) of the event \(h=A_t\), and similarly \(q_t\) for \(h=B_t\) (with again the convention \(p_T=q_T\) to simplify notations), i.e.
The 1RSB recursion relation (58) now reduces to a recursion between these finite-dimensional vectors of probabilities; inserting the definition (74) in the right hand side of (58) and exploiting the combination rule (73) between hard fields, one obtains the following limit for the recursion relation \(P=G[P_1,\ldots ,P_k]\):
the reweighting term of Eq. (59) becoming indeed \({z_\mathrm{iter}}(h_1,\ldots ,h_k)^m = e^{y a_0(h_1,\ldots ,h_k)}\), hence the factor \(e^y\) multiplying the probabilities of all warnings except \(B_0\); this is indeed the only case where an active site has to be inserted in the initial configuration.
To compute the 1RSB potential we have to study the limit of the contribution of site and edge terms in the limit \(\mu \rightarrow -\infty \), \(m\rightarrow 0\). We have from Eq. (43)
which can be simplified following the same reasoning than the one which led to (73). This yields
where \(I,J,K\) is a partition of \(\{1,\ldots ,k+1\}\). This expression can be interpreted intuitively in terms of the warnings defined above; the factor multiplying \((e^y-1)\) is indeed the probability of complete activation, at time \(t \in \{1,\ldots ,T\}\), for an initially empty site receiving messages \((h_1,\ldots ,h_{k+1})\) from its neighbours, with their respective distributions \(P_1,\ldots ,P_{k+1}\). As a matter of fact, for its activation to occur at time \(t\) at least \(l\) neighbours must have activated without any help from the central site at time \(t-1\), no more than \(l-1\) must be active at time \(t-2\) (otherwise the activation time would be strictly less than \(t\)), and the neighbours sending messages of type \(A_{t^{\prime }}\) should not require activation at a time \(t^{\prime }<t\).
For the edge term we obtain from Eq. (42)
hence
One can interpret the factor multiplying \((1-e^{-y})\) as the probability of complete activation when two messages \((h_1,h_2)\) drawn with the probabilities \(P_1,P_2\) are sent in the two opposite directions of an edge.
Let us summarize the main findings of this subsection. In the limit \(\mu \rightarrow -\infty \), \(m\rightarrow 0\) with \(y = -\mu m\) the 1RSB formalism simplifies in the following way. The cavity field distributions \(P_{i \rightarrow j}(h)\) have now a discrete support with \(2T\) possible values, each of them is thus described by a (normalized) vector of \(2T\) probabilities denoted \(\{p_t,q_t\}\). These vectors are solutions of recurrence equations of the form \(P_{i\rightarrow j}=G(\{P_{k\rightarrow i} \}_{k\in {\partial i \setminus j}})\), the mapping \(G\) being defined in Eq. (75). The energetic limit of the 1RSB potential is then computed as
with the expression of \({\mathcal{Z}_\mathrm{site}}\) and \({\mathcal{Z}_\mathrm{edge}}\) given in Eqs. (77, 79). This expression of \(\Phi _\mathrm{e}\) is variational, its derivative with respect to \(y\) (which is needed in the computation of the inverse Legendre transform in (64)) can be taken on the explicit dependence only.
3.4.3 Random Regular Graphs
For the reasons already exposed in the context of the RS and of the full 1RSB cavity formalism a factorized solution of the energetic 1RSB equations can be searched for when dealing with random \(k+1\) regular graphs with a constant threshold of activation \(l\). One has thus a single vector of probabilities \(P=(\{p_t,q_t\})\), fixed-point solution of Eq. (75), from which the energetic 1RSB potential is obtained as
with \({\mathcal{Z}_\mathrm{site}}\) and \({\mathcal{Z}_\mathrm{edge}}\) defined in Eqs. (77, 79).
4 Results of the Cavity Method for Random Regular Graphs
We shall present now the results of the resolution of the cavity equations for random regular graphs of degree \(k+1\), with an activation threshold equal to \(l\) for all vertices. In all this discussion it will be understood that \(\epsilon =+\infty \), i.e. we only consider initial configurations that activate the whole graph in \(T\) steps. We will first present in Sect. 4.1 the results for finite values of \(T\), which are qualitatively the same for all values of \(k,l\) and \(T\); the behaviour of the replica symmetric cavity method are first displayed, then we turn to the effects of replica symmetry breaking, in particular in the “energetic” limit to compute the minimal density of initially active sites in activating configurations. In a second part (Sect. 4.2) we shall discuss the limit \(T\rightarrow \infty \), in which some further analytical computations can be performed. In this case several qualitatively distinct phenomena emerge, depending on the values of \(k\) and \(l\).
4.1 Finite \(T\) Results
4.1.1 Replica Symmetric Formalism
The technical details of the resolution of the RS equation \(h=g(h,\ldots ,h)\), where \(g\) is given in Eq. (40), and of the computation of the free-entropy density, are deferred to the Appendix 1. From a numerical point of view it is an easy task, as it corresponds essentially to the resolution of a set of \(2T\) equations on \(2T\) unknowns. Let us discuss the numerical results obtained in this way. On the left panel of Fig. 2 we display the curve \(\theta (\mu )\) of the average fraction of initially active sites as a function of the chemical potential \(\mu \); the curve is for \(k=l=2\) and \(T=3\), the qualitative features are independent of these precise values. This function is increasing as it should, and reaches a finite limit when \(\mu \rightarrow -\infty \), that would be the candidate value for \({\theta _\mathrm{min}}\) if the RS computation was correct in this limit. This however cannot be true, as revealed from the computation of the entropy, displayed in the right panel of Fig. 2: for \(\mu < \mu _{s=0}\) the RS entropy becomes negative, which is a certain indication of the inadequacy of the RS theory in this regime. In Fig. 3 we display the results for the entropy \(s(\theta )\) of the number of configurations with a fraction \(\theta \) of initially active sites, for the regime of positive entropies where the RS prediction cannot be ruled out at once (for the cases \(k=l=2\) and \(k=3\), \(l=2\)). For increasing values of \(T\) these curves converge to a limit, this will be further discussed in Sect. 4.2.1. The numerical values of the chemical potential and of the fraction of active sites at the point of entropy cancellation, which would be the best guess from the RS computation of the value of \({\theta _\mathrm{min}}\), denoted respectively \(\mu _{s=0}\) and \({\theta _\mathrm{min,0}}\), can be found for various values of \(T\) in the Tables 1, 2, and 3 for the cases \(k=l=2\), \(k=l=3\) and \(k=3\), \(l=2\) respectively. For \(T=1\) they reproduce, as they should, the results of the Biroli–Mézard model given in [70].
4.1.2 1RSB Results
As we have seen above the hypothesis underlying the RS computation must go wrong when \(\mu \) is decreased towards \(-\infty \), as the entropy computed within the RS scheme becomes negative for \(\mu < \mu _{s=0}\); a 1RSB computation is thus required to investigate the limit \(\mu \rightarrow -\infty \) and hence the properties of the least dense activating initial conditions, in particular their density \({\theta _\mathrm{min}}\).
We have thus solved numerically the 1RSB equations (60) using population dynamics methods [54], i.e. representing \(P(h)\) as a weighted sample of fields \(h_i\). This method has become fairly standard and we shall not give more details on the procedure, see for instance [53, 54] for detailed presentations. In the particularly important \(m=1\) case we used a version of this procedure, inspired by the tree reconstruction problem, that allows to get rid of the reweighting terms in (60) and is thus much more precise and efficient numerically, see [52, 61] for more technical details.
The results of these investigations follow the usual pattern encountered in constraint satisfaction problems [48]: for large enough values of \(\mu \) (i.e. for dense enough initial configurations) there is no non-trivial solution of the 1RSB equation with \(m=1\); decreasing \(\mu \) a non-trivial solution appears discontinuously at a threshold \({\mu _\mathrm{d}}\) (the “dynamic” transition). Its complexity (or configurational entropy) \(\Sigma \) is positive in an interval \(\mu \in [{\mu _\mathrm{c}},{\mu _\mathrm{d}}]\), which thus corresponds to a “dynamic 1RSB phase” with an exponential number of clusters contributing to the Gibbs measure, see Fig. 4 for an illustration in the case \(T=1\). The numerical values of \({\mu _\mathrm{d}}\) and \({\mu _\mathrm{c}}\) (as well as the associated densities of initially active sites \({\theta _\mathrm{d}}\) and \({\theta _\mathrm{c}}\)), can be found for several values of \(T\) in the Tables 1, 2, and 3. For the values of \(\mu \) in the interval \([{\mu _\mathrm{c}},{\mu _\mathrm{d}}]\) the thermodynamic predictions of the RS computations are correct. Note that in all the cases we investigated (\(k=2,3\), \(2\le l \le k\) and \(T\le 5\)) we always found a discontinuous transition with \({\mu _\mathrm{c}}< {\mu _\mathrm{d}}\); we cannot rule out the possibility that for other values of the parameters the replica symmetry breaking transition is continuous with \({\mu _\mathrm{c}}={\mu _\mathrm{d}}\) (as happens for instance in the independent set problem at low degrees [14]).
Lowering further the chemical potential, i.e. in the regime \(\mu <{\mu _\mathrm{c}}\), the complexity at \(m=1\) becomes negative. This is thus a true replica symmetry breaking phase with only a sub-exponential number of clusters contributing to the Gibbs measure; \({\mu _\mathrm{c}}\) corresponds to the “condensation” transition. In this phase the thermodynamic properties of the model differ from the RS prediction and are given by the properties of the clusters selected by the static value of the Parisi parameter, \({m_\mathrm{s}}(\mu )\), for which the complexity vanishes. This value can be determined by computing the complexity as a function of \(m\), for a fixed value of \(\mu \), see left panel of Fig. 5 for an example.
To compute the minimal density \({\theta _\mathrm{min}}(T)\) one has to take the limit \(\mu \rightarrow -\infty \); we have introduced above in Sect. 3.4 a simplifying ansatz in this limit, assuming in particular a finite value of \(-\mu m\). To check the consistency of this ansatz we solved the complete 1RSB equations for \(T=1\) and several values of \(\mu \) large and negative. The Parisi parameter \({m_\mathrm{s}}\) is plotted as a function of \(-1/\mu \) in the right panel of Fig. 5; in the limit \(\mu \rightarrow -\infty \) one obtains indeed a linear behaviour, corresponding to a finite limit of \(-\mu {m_\mathrm{s}}\).
4.1.3 Energetic 1RSB Results
We turn now to the results obtained via the energetic 1RSB cavity method, i.e. taking simultaneously the limits \(\mu \rightarrow -\infty \) and \(m \rightarrow 0\) with a finite value for \(y=-\mu m\). The equations to solve in this case amounts to find the fixed point of Eq. (75), from which one obtains the 1RSB potential (81) and the energetic complexity \(\Sigma _\mathrm{e}(\theta )\) from the Legendre transform structure explained in (64), as a parametric plot varying the parameter \(y\). The computational complexity of this problem is drastically reduced compared to the complete 1RSB equations: as in the RS case one has a set of (roughly) \(2T\) equations on \(2T\) real unknowns, instead of an equation on a probability distribution of fields. More technical details on the procedure to solve these equations can be found in Appendix 1.
Figure 6 displays the energetic complexity \(\Sigma _\mathrm{e}(\theta )\) for a few values of \(T\), in the cases \(k=l=2\) and \(k=3\), \(l=2\). The expert reader will notice that we restricted the range of \(y\) used in this plot to the so-called physical branch, in such a way that \(\Sigma _\mathrm{e}\) is a concave function of \(\theta \). The most important characteristics of these curves are the values of \({\theta _\mathrm{min,1}}\) where the complexity vanishes, and the corresponding values \({y_\mathrm{s}}\) of the parameter \(y\); these are reported for several values of \(T\) in the last columns of the Tables 1, 2, and 3. Indeed \({\theta _\mathrm{min,1}}\) is the 1RSB prediction for \({\theta _\mathrm{min}}\), as it corresponds to the smallest density of active sites in initial configurations belonging to clusters with a non-negative complexity. For \(T=1\) these values can be successfully cross-checked with the results of the Biroli–Mézard model [70], and the parameter \({y_\mathrm{s}}\) agrees with the fit of \(-\mu {m_\mathrm{s}}(\mu )\) in the limit \(\mu \rightarrow -\infty \) obtained from the full 1RSB equations (cf. right panel of Fig. 5).
4.2 The Large \(T\) Limit
The limit case \(T\rightarrow \infty \) is particularly interesting as it corresponds to the original influence maximization problem with no constraint on the time taken to activate the whole graph. This limit can be performed analytically for the RS and energetic 1RSB formalism; the technical details of these computations can be found in Appendix section “The Large \(T\) Limit”, we present here the results of these analytical simplifications. It turns out that the case \(k=l\) is qualitatively different from the case \(k>l\), we shall thus divide this section according to this distinction.
4.2.1 The Case \(k=l\)
Let us first recall that when \(k=l\) the dynamics from a random initial configuration of density \(\theta \) has a continuous transition at \({\theta _\mathrm{r}}(k,k)=\frac{k-1}{k}\) (see Sect. 2.2); we also saw in Sect. 2.4 that minimal contagious sets (with no constraint on the activation time) correspond to minimal decycling sets, which led to the bound \({\theta _\mathrm{min}}(k,k) \ge \frac{k-1}{2 k} =\frac{{\theta _\mathrm{r}}(k,k)}{2}\). In the rest of this subsection we shall for simplicity abbreviate \({\theta _\mathrm{r}}(k,k)\) by \({\theta _\mathrm{r}}\).
As suggested by the left panel of Fig. 3 in the case \(k=l=2\), the RS entropy \(s(\theta )\) converges to a limit curve when \(T\rightarrow \infty \). This limit curve can actually be computed analytically for all \(k\); we defer the details of the computation to Appendix section “Asymptotics for \(l=k\)” and only state here the properties of this limit curve. For \(\theta \ge {\theta _\mathrm{r}}\) it coincides with the binary entropy function \(-\theta \ln \theta - (1-\theta ) \ln (1-\theta )\); this is a posteriori obvious. Indeed by definition of \({\theta _\mathrm{r}}\) typical configurations in this density range do activate the whole graph, hence the number of activating initial configurations coincide (at the leading exponential order) with the total number of configurations of this density. A non-trivial portion of the limit curve arises in the density range \([{\theta _\mathrm{r}}/2,{\theta _\mathrm{r}}]\), where it is given by
This function has the same value and the same first derivative than the binary entropy function in \({\theta _\mathrm{r}}\), while at the lower limit \({\theta _\mathrm{r}}/2\) of its range of definition it has an infinite derivative with a finite value
The parametric plot of \(s(\theta )\) also contains a vertical segment for \(\theta = {\theta _\mathrm{r}}/2\), from \(-\infty \) to the value given in (83).
The complexity \(\Sigma _\mathrm{e}(\theta )\) of the energetic 1RSB formalism also converges to a limit curve when \(T\rightarrow \infty \), as shown in Fig. 6 and obtained analytically in Appendix section “Asymptotics for \(l=k\)”. This limit curve has the same vertical segment in \({\theta _\mathrm{r}}/2\) from \(-\infty \) to the value (83); the non-trivial part of the curve is given in a parametrized form as follows:
where \({\widetilde{\lambda }}\) is the positive parameter along the curve, the Parisi parameter
is the slope of the tangent to the curve \(\Sigma _\mathrm{e}(\theta )\), and
When \({\widetilde{\lambda }}\rightarrow 0^+\) this part of the curve connects with the vertical segment in \({\theta _\mathrm{r}}/2\). The large values of \({\widetilde{\lambda }}\) yield a non-concave branch of \(\Sigma _\mathrm{e}\) that has to be discarded.
Depending on the value of \(k\) qualitatively different behaviours emerge from the analysis of the RS entropy and 1RSB energetic complexity:
-
For \(k=l=2\) the entropy of the endpoint in \({\theta _\mathrm{r}}/2\) given in (83) is strictly positive (it is equal to \((\ln 2)/2\)); moreover the energetic complexity curve converges, in the \(T\rightarrow \infty \) limit, to a vertical segment (the non-trivial part parametrized by \({\widetilde{\lambda }}\) is convex and has thus to be discarded). This leads to the conclusion that \({\theta _\mathrm{min}}={\theta _\mathrm{r}}/2=1/4\) in this case, saturating the lowerbound of (16), and recovering the rigorous result of [17] on the decycling number of 3-regular graphs. This is a reassuring evidence in favour of the validity of the approach, in particular on the interversion of the \(T\rightarrow \infty \) and \(N\rightarrow \infty \) limit. It would be an even more challenging computation to determine the limit of \({\theta _\mathrm{d}}\) and \({\theta _\mathrm{c}}\) as \(T\) diverges; we are however tempted to conjecture that they both go to \(1/4\) and that the effects of replica symmetry breaking are irrelevant in this limit. A numerical argument in favour of this conjecture will be presented in Sect. 5, where it is shown that a simple greedy algorithm is able to find contagious sets of these densities. Assuming this is true, the expression (82) would give for \(k=2\) the typical (quenched) entropy of the decycling sets of 3-regular random graphs in their non-trivial regime of densities \([1/4,1/2]\). Note that the coincidence of the RS entropy and 1RSB energetic complexity at \({\theta _\mathrm{min}}\) is reminiscent of the phenomenology discussed for the matching problem in [75], which might suggest that the minimal density activating configurations are at a large Hamming distance in configuration space one from the other.
-
For \(k=l=3\) the expression (83) of the entropy in \({\theta _\mathrm{r}}/2\) is still positive (equal to \(\ln 3 - (4/3) \ln 2\)), hence the endpoint of the non-trivial part of both the RS entropy and the 1RSB complexity curves occurs in \({\theta _\mathrm{min,0}}={\theta _\mathrm{min,1}}={\theta _\mathrm{r}}/2=1/3\), saturating again the bound (16). This leads to the conclusion that \({\theta _\mathrm{min}}=1/3\) in this case, as was also conjectured in [17]. However, at variance with the previous case, the energetic complexity curve has a non-trivial part for \(\theta > {\theta _\mathrm{min}}\), as shown in the left panel of Fig. 7. We thus expect that the limits of \({\theta _\mathrm{d}}\) and \({\theta _\mathrm{c}}\) when \(T\rightarrow \infty \) are strictly greater than \(1/3\), hence that simple algorithms would have difficulties to find the minimal contagious sets (see Sect. 5 for a numerical check of this statement), and that the RS entropy (82) is incorrect for some regime of densities close to \(1/3\).
-
Finally when \(k=l\ge 4\) the entropy in (83) is negative, the cancellation of \(s\) occurs at a value \({\theta _\mathrm{min,0}}\) strictly between \({\theta _\mathrm{r}}/2\) and \({\theta _\mathrm{r}}\), see the right panel of Fig. 7. The energetic complexity vanishes on its non-trivial part parametrized by \({\widetilde{\lambda }}\), at a value \({\theta _\mathrm{min,1}}\) slightly larger than \({\theta _\mathrm{min,0}}\), see Table 4 for some numerical values. Whether \({\theta _\mathrm{min,1}}\) should be taken as a conjectured exact value for \({\theta _\mathrm{min}}\) or simply as a lowerbound is dubious and might depend on the value of \(k\). Indeed one should test the stability of the 1RSB ansatz against further levels of replica symmetry breaking. This computation is in principle doable along the lines of [59, 60, 70], but has not been performed yet. It is however relatively easy to set up an asymptotic expansion at large \(k\) of the thresholds \({\theta _\mathrm{min,0}}\) and \({\theta _\mathrm{min,1}}\) from the expressions (82, 84). One finds that the first terms of the expansion are equal at the RS and 1RSB level, it is thus natural to conjecture that they are indeed the correct expansion of \({\theta _\mathrm{min}}\), namely
$$\begin{aligned} {\theta _\mathrm{min}}(k,k) = 1- \frac{2 \ln k}{k} - \frac{2}{k} + O\left( \frac{1}{k \ln k} \right) \ . \end{aligned}$$(89)This conjecture is in agreement with the rigorous lowerbound proven in [41],
$$\begin{aligned} {\theta _\mathrm{min}}(k,k) \ge 1- \frac{2 \ln k}{k} - \frac{4-2\ln 2}{k} + o\left( \frac{1}{k} \right) \ . \end{aligned}$$(90)It can also be compared with the asymptotic expansion in the case \(l=k+1\) [38] where the inactive vertices have to form an independent set of the graph:
$$\begin{aligned} {\theta _\mathrm{min}}(k,k+1) = 1- \frac{2 \ln k}{k} + \frac{2 \ln \ln k}{k} + \frac{2 \ln 2 -2}{k} + o\left( \frac{1}{k} \right) \ . \end{aligned}$$(91)The third term of this expansion is of a larger order; indeed the condition imposed on the graph induced by the inactive vertices is much more stringent when \(l=k+1\) (it has to be made of isolated vertices) with respect to the case \(l=k\) (it only has to be acyclic).
Let us mention at this point that \({\theta _\mathrm{min}}(T)\), the minimal density of initial configuration percolating within \(T\) steps of the dynamics, reaches its asymptotic value \({\theta _\mathrm{min}}\) as \(T\rightarrow \infty \) with different finite \(T\) corrections in the various cases listed above. The analysis of Appendix section “Asymptotics for \(l=k\)” shows that for \(k=l=2\) (resp. \(k=l=3\)) these corrections are of order \(2^{-T}\) (resp. \(3^{-T}\)), which is in agreement with a numerical fit of the data in Table 1 (resp. Table 2). On the contrary for \(k=l\ge 4\) these corrections are only polynomially small in \(T\).
Finally, we could also compute analytically the distribution of activation times, within the RS formalism, for the initial configurations with a non-trivial density \(\theta \) of active vertices in the interval \([{\theta _\mathrm{r}}/2,{\theta _\mathrm{r}}]\). Their cumulative distribution function \(P_t=\eta (t_i\le t)\) obtained from the marginals of the law (25) reads in the \(T\rightarrow \infty \) limit with \(t\) kept fixed:
where \(w_t\) is a series defined recursively by
Examples of this cumulative distribution are displayed in Fig. 8. As explained above the predictions of the RS cavity method are not expected to be correct for \(\theta < {\theta _\mathrm{c}}\); in the particular case \(k=l=2\) we however expect this result to be true down to \(\theta ={\theta _\mathrm{min}}=1/4\). Note that \(P_t\) goes to \(1\) when \(t\rightarrow \infty \), in other words in the limit \(T\rightarrow \infty \) the support of the distribution of activation times does not scale with \(T\) and remains of order 1. One can also check that when \(\theta ={\theta _\mathrm{r}}\), the prediction \(P_t\) of (92) coincides, as it should, with the distribution of activation times for random initial conditions of density \({\theta _\mathrm{r}}\) given in Eq. (2); to see this one can notice that \(w_t\) is equal to the series \({\widetilde{x}}_t\) defined in Eq. (3) for the study of random initial conditions, when \(k=l\) and \(\theta ={\theta _\mathrm{r}}\). At the lower limit of the interval of density, \(\theta ={\theta _\mathrm{r}}/2\), one obtains instead a simple expression,
A straightforward analysis of (92, 93) reveals that for all \(\theta < {\theta _\mathrm{r}}\) the cumulative distribution \(P_t\) reaches 1 with corrections of order \(1/t\), in other words the probability \(P_t-P_{t-1}\) that a vertex activates precisely at time \(t\) has a power-law tail with exponent \(-2\). On the contrary the random initial conditions of density \({\theta _\mathrm{r}}\) have \(1-P_t\) of order \(1/t^2\), hence the exponent of the tail is \(-3\); random initial conditions with \(\theta >{\theta _\mathrm{r}}\) have instead an exponentially decaying tail for their distribution of activation times.
4.2.2 The Case \(k>l\)
We shall now turn to a description of the limit as \(T\rightarrow \infty \) of the RS and energetic 1RSB results when \(k>l\), with again the technical details relegated in the Appendix section “Asymptotics for \(l<k\)”. The RS entropy \(s(\theta )\) coincides with the binary entropy function for \(\theta \ge {\theta _\mathrm{r}}\), for exactly the same reasons as explained above in the case \(k=l\) (here and in the rest of this subsection we denote \({\theta _\mathrm{r}}\) the threshold \({\theta _\mathrm{r}}(k,l)\)). The non-trivial part of \(s(\theta )\) and \(\Sigma _\mathrm{e}(\theta )\) are obtained in a parametric way, with unfortunately rather long expressions that we shall now progressively describe. We keep implicit below the dependency of all quantities on \(k\) and \(l\) when there is no risk of confusion.
This parametrization is given in terms of a real \(\lambda \) in the range \(]0,\lambda _\mathrm{r}]\), where this upper limit is expressed in terms of the threshold \({\theta _\mathrm{r}}\) for activation from a random initial condition as \(\lambda _\mathrm{r}=(1-{\theta _\mathrm{r}})\theta _\mathrm{r}^{k-1}\). We need first to introduce some auxiliary functions \({\widehat{u}}(\lambda )\), \({\widehat{v}}(\lambda )\), \(u_*(\lambda )\) and \(v_*(\lambda )\). The first two are given explicitly as
where we recall that \({\widetilde{x}_\mathrm{r}}\) is the fixed-point of Eq. (3) at the bifurcation \({\theta _\mathrm{r}}\), see also (4). The last one, \(v_*(\lambda )\), is defined as the smallest positive solution of
then \(u_*(\lambda )\) can be deduced as the solution of
One can check that \(u_*(\lambda ) \ge {\widehat{u}}(\lambda ) \ge {\widehat{v}}(\lambda ) \ge v_*(\lambda )\) on the interval \(\lambda \in ]0,\lambda _\mathrm{r}]\), and that \(u_*={\widehat{u}}=1/{\theta _\mathrm{r}}\) and \(v_*={\widehat{v}}={\widetilde{x}_\mathrm{r}}/{\theta _\mathrm{r}}\) in \(\lambda =\lambda _\mathrm{r}\). We then define two functions \({F_\mathrm{site}}(\lambda )\) and \({F_\mathrm{edge}}(\lambda )\) through
where for clarity we kept implicit the \(\lambda \) dependency of \({\widehat{u}},{\widehat{v}},u_*\) and \(v_*\), and we introduced
We can finally give the parametric form of the RS entropy \(s(\theta )\):
where \(\mu (\lambda )\) is the opposite of the derivative of \(s(\theta )\) in the point \(\theta (\lambda )\). Thanks to the values \({\widehat{u}},{\widehat{v}},u_*\) and \(v_*\) assume in \(\lambda _\mathrm{r}\) this curve joins the binary entropy function in \({\theta _\mathrm{r}}\) with a continuous slope.
Similarly the 1RSB entropic complexity \(\Sigma _\mathrm{e}(\theta )\) is obtained parametrically as
with \(y(\lambda )\) giving the slope of the tangent of \(\Sigma _\mathrm{e}(\theta )\) in the point \(\theta (\lambda )\).
An example of the limit for the RS entropy can be found in the right panel of Fig. 3 for \(k=3\), \(l=2\), along with some finite \(T\) curves, and a similar plot for the energetic complexity is displayed in the right panel of Fig. 6. The entropy and energetic complexity for this case in the limit are compared in Fig. 9. The values \({\theta _\mathrm{min,0}}\) and \({\theta _\mathrm{min,1}}\) where \(s(\theta )\) and \(\Sigma _\mathrm{e}(\theta )\) vanish are easily determined numerically from the above representation, and are collected in Table 4 for various values of \(k\) and \(l\). For most of the cases one finds \({\theta _\mathrm{min,1}}\) to be slightly larger than \({\theta _\mathrm{min,0}}\); as explained above the exactness of this 1RSB prediction has still to be assessed from a computation of the stability with respect to further replica symmetry breaking.
There are however two special cases which stand on a different footing, namely \((k,l)=(4,3)\) and \((k,l)=(5,4)\). Indeed in these two cases one has the same phenomenology than for \(k=l=3\), namely a coincidence of \({\theta _\mathrm{min,0}}\) and \({\theta _\mathrm{min,1}}\) due to a vertical segment in the curves \(s(\theta )\) and \(\Sigma _\mathrm{e}(\theta )\) extending to positive values. This phenomenon can be understood by studying the limit \(\lambda \rightarrow 0\) of the above representation of these curves. After some algebra one finds indeed that for \(k<2l-1\),
the limiting value for \(\theta \) being valid both for the RS (101) and 1RSB (102) expressions. It turns out that for \(k=4\), \(l=3\) and \(k=5,l=4\), the latter expression for the entropy \(s\) and complexity \(\Sigma _\mathrm{e}\) is strictly positive, hence the simple predictions \(1/6\) and \(1/4\) for \({\theta _\mathrm{min}}\) in these two cases respectively, that saturate the lowerbound of (17). We did not find any other values of \(k,l\) that produce the same phenomenon.
Finally the distribution of activation times in the RS formalism exhibits a very different pattern with respect to the case \(k=l\) (see Fig. 10 for an illustration). As a matter of fact, in the limit \(T\rightarrow \infty \) the activation times \(t\) of the vertices have to be divided in three categories, each of them comprising a finite fraction of the \(N\) vertices: (i) \(t=O(1)\) (ii) \(t=O(T)\) (iii) \(t=T-O(1)\). The category (ii) of vertices can be described by a scaling function for the cumulative distribution, \(P(s)=P_{t=sT}\), with \(s\in ]0,1[\) a reduced time. One has \(P(s=0^+)>0\) and \(1-P(s=1^-)>0\), these two numbers representing the fractions of vertices of type (i) and (iii) respectively. They can be computed following the techniques of the Appendix section “Asymptotics for \(l<k\)”, yielding for initial configurations with a fraction \(\theta (\lambda ) < {\theta _\mathrm{r}}\) of active vertices:
5 Algorithmic Results
We shall present in this Section the results of numerical experiments performed on finite size random regular graphs, for which we have constructed explicitly some activating initial configurations. We have used two strategies to do so, one based on a simple greedy heuristic, the other inspired by the results of the cavity method. Both of them build iteratively a percolating initial configuration, starting from the configuration with all vertices inactive, and adding one active vertex at a time (another route would be to start from the all active configuration and sequentially reduce the number of active vertices, but we did not investigate this alternative strategy). We shall denote \(\tau \) the number of addition steps performed by the algorithm, and \({\underline{\sigma }}(\tau )\) the initial configuration considered at this point (that contains by definition \(\tau \) active vertices). The configuration denoted \({\underline{\sigma }}^T(\tau )\) (resp. \({\underline{\sigma }}^\mathrm{f}(\tau )\)) is thus the configuration obtained after \(T\) (resp. an infinite) number of steps of the dynamics defined in (1) from the initial configuration \({\underline{\sigma }}(\tau )\); we will denote \(|{\underline{\sigma }}^T(\tau )|\) the number of active vertices in this configuration. The algorithm stops when this number reaches \(N\), as \({\underline{\sigma }}(\tau )\) is then the first percolating initial configurations encountered. The difference in the two algorithms to be presented below lies in the rule used to choose which additional active vertex to add in the initial configuration in a step \(\tau \rightarrow \tau +1\).
5.1 A Greedy Algorithm
Let us first consider the case of a finite time horizon \(T\), i.e. the problem of finding an initial configuration \({\underline{\sigma }}\) with \({\underline{\sigma }}^T\) the fully active configuration and \({\underline{\sigma }}\) containing the smallest possible number of active vertices. The simplest strategy is to choose at each time step \(\tau \rightarrow \tau +1\) the inactive vertex of \({\underline{\sigma }}(\tau )\) whose activation leads to the largest possible value of \(|{\underline{\sigma }}^T(\tau +1)|\), and stop at the first time \(\tau \) such that \({\underline{\sigma }}^T(\tau )\) is the fully active configuration. This can be immediately generalized to the case \(T=\infty \) by including at each time step the vertex whose activation increases most \(|{\underline{\sigma }}^\mathrm{f}(\tau +1)|\); this version of the greedy procedure was actually a tool in the rigorous bounds on \({\theta _\mathrm{min}}\) for graphs with good expansion properties of [31]. If several vertices lead to the same increase the ties can be broken arbitrarily. The time complexity of the greedy algorithm is a priori cubic in the number \(N\) of vertices: a linear number of steps \(\tau \rightarrow \tau +1\) have to be performed before finding a percolating initial configuration. For each of these steps a number of order \(N\) of candidate new configurations \({\underline{\sigma }}(\tau +1)\) have to be considered, the computation of \({\underline{\sigma }}^T(\tau +1)\) requiring itself a linear number of operations for each configuration. It is however easy to reduce significantly this complexity when \(T=\infty \). As explained at the end of Sect. 2.1, in this case the final configuration of the dynamical process can be obtained sequentially, regardless of the order of the activations. By monotonicity the configuration \({\underline{\sigma }}^\mathrm{f}(\tau +1)\) can be computed by adding one active vertex to \({\underline{\sigma }}^\mathrm{f}(\tau )\) (instead of \({\underline{\sigma }}(\tau )\)) and determining the number (of order 1) of additional activations that can be triggered by this addition. This reduces the total complexity to a quadratic scaling with \(N\).
In Fig. 11 we plot the fraction of active vertices in the configuration \({\underline{\sigma }}^T(\tau )\) as a function of the density \(\tau /N\) of the active vertices in the initial configuration obtained after \(\tau \) steps of this greedy procedure; when the curve reaches 1 we have thus obtained an initial configuration that percolates within \(T\) steps (note that the part of the curve for smaller \(\tau \) corresponds to the alternative optimization problem labelled (i) in the introduction). The density of the contagious sets reached in this way are summarized in Table 5; as expected these densities are strictly greater than the prediction \({\theta _\mathrm{min,1}}\) of the 1RSB cavity method, and also than the ones reached by more involved message-passing algorithms (see the discussion in next subsection).
One can clearly see a qualitative difference between the cases \(k=l\) and \(k>l\) in the two panels of Fig. 11: in the latter case as \(T\) gets larger the last active vertices added in the initial configuration before finding a percolating one provoke a very steep increase in the final size of the activated set. As said above the greedy procedure can easily be generalized to \(T=\infty \); the density of the smallest contagious sets constructed in this way are presented in Table 6 for various values of \(k\) and \(l\). As these results demonstrate the greedy algorithm is able, in all cases we investigated, to find contagious sets with a density strictly smaller than \({\theta _\mathrm{r}}\), the density above which typical uncorrelated configurations are percolating. However in general the density reached by this simple procedure is strictly greater than the prediction \({\theta _\mathrm{min,1}}\) of the cavity method for their minimal size; this is in agreement with the interpretation of the replica symmetry breaking creating metastable states that trap simple local search procedures and prevent them from reaching global optima of the cost function landscape in which the search moves. The only exception is the case \(k=l=2\), for which the minimal density \(1/4\) (corresponding to the decycling number of 3-regular random graphs [17]) is actually reached by the greedy procedure; this result is in line with the analysis of Sect. 4.2.1, which revealed a disappearance of the RSB phase in the large \(T\) limit for this peculiar case.
Further information on the minimal contagious sets produced by the greedy algorithm with \(T=\infty \) can be obtained from the distribution of the activation times of the vertices they induce, which are plotted in Fig. 12. Of course as the graphs under study are finite the support of these distributions is bounded; in all cases we investigated we found that the time to reach total activation from these initial configurations scales logarithmically with the number of vertices of the graph (see also Fig. 13 for a comparison between two different sizes of the graph). The qualitative difference between the cases \(k=l\) and \(k>l\) expected from the discussion of the \(T\rightarrow \infty \) limit of Sect. 4.2 is indeed apparent on these curves; in the latter case a finite fraction of the vertices are activated at the very end of the dynamical process. However the activation time distributions induced by the configurations produced by the greedy algorithm are not in quantitative agreement with the RS analytical predictions (with a value of \(T\) and \(\theta \) chosen to fit the numerical ones). A possible explanation for this discrepancy is that the greedy algorithm is a very “out-of-equilibrium” algorithm, hence the configurations it reaches are not the typical ones of the “equilibrium” measure (8).
5.2 Survey Propagation
The second algorithmic procedure we investigated is based on the insight provided by the statistical mechanics analysis on the structure of the configuration space of the problem; it corresponds indeed to the Survey Propagation algorithm introduced in [56] for the analysis of random satisfiability problem (and more precisely to its variant introduced in [16] for the energy minimization in the unsatisfiable phase of such problems). An idealized thought experiment for the construction of minimal contagious sets would be to sequentially assign the values of the \({\sigma }_i\) according to their marginal probabilities in the law (8), with \(\epsilon =+\infty \) and \(\mu =-\infty \); the exact determination of such marginals is in general a very hard computational tasks, and in practice one has to content oneself with approximations provided for instance by message passing procedures. This is the road we have followed here, by implementing the single-sample energetic 1RSB equations (75), i.e. assigning to each directed edge \(i\rightarrow j\) of the graph under study a vector \(P_{i\rightarrow j}\) of \(2T\) probabilities. At each step \(\tau \) of the algorithm the Eq. (75) are iterated several times to look for a global solution of these equations; the presence of \(\tau \) active (decimated) vertices in the current configuration \({\underline{\sigma }}(\tau )\) is implemented as a boundary condition in these equations, easily seen to be \(P_{i\rightarrow j}(h)=\delta (h-B_0)\) for the outgoing messages from an activated vertex \(i\). The information contained in such a solution of the 1RSB equations can be a priori exploited in several ways; we chose to compute, for each vertex \(i\) not yet activated, the quantity
i.e. the contribution of the site \(i\) to the derivative of the potential \(\Phi _\mathrm{e}\) given in Eq. (80). This number measures indeed the tendency of \(i\) to be active in all configurations belonging to the clusters considered in the energetic 1RSB formalism. Accordingly we choose the vertex \(i\) with the largest value of \(W_i\) to be the new active vertice to be added to \({\underline{\sigma }}(\tau )\) in order to form \({\underline{\sigma }}(\tau +1)\). For simplicity we fixed the value of \(y\) in the whole procedure to the value \(y_\mathrm{s}\) determined analytically, that leads to a vanishing complexity before the decimation; we also tried to recompute this value of \(y\) during the course of the decimation but did not obtain significant improvement of the performances in the cases considered.
The values of the density of the percolating initial configurations we managed to construct in this way are presented in Table 5 for the two cases \(k=l=2\) and \(k=3\), \(l=2\), for several (relatively small) values of \(T\). The results are better than the simple greedy algorithm, and in most of the cases also than the maxsum replica-symmetric algorithm [8–10], but in some cases deviate significantly from the prediction \({\theta _\mathrm{min,1}}\) for the density of minimal contagious sets. An analytical understanding of the performances of such decimation procedures is actually a challenging open problem (see [29, 68] for partial results in the simpler case of the Belief-Propagation guided decimation). We did not study much larger values of \(T\) because we faced in this case convergence issues for the iterations of the Eqs. (75), that a simple damping did not seem to alleviate efficiently. A pragmatic, even if not completely satisfactory, position we adopted for the results at \(T\ge 4\) for the case \(k=3\), \(l=2\), was to ignore somehow the convergence problems, stopping the iterations of (75) after a time fixed beforehand, and computing the value of \(W_i\) from these unconverged messages. As Table 5 demonstrates this attitude is not unreasonable as the densities reached are still better than the one of the greedy algorithm (yet can get worse than the maxsum procedure [8–10]).
6 Conclusions and Perspectives
In this paper we have continued the study initiated in [8, 9] of the minimal contagious sets for the bootstrap percolation (or threshold model) dynamics on random graphs. We have shown the importance of taking into account the phenomenon of replica symmetry breaking in the determination of the minimal density \({\theta _\mathrm{min}}\) of active vertices in percolating initial conditions, and could simplify analytically the equations determining \({\theta _\mathrm{min}}\) in the limit \(T\rightarrow \infty \) where the constraint on the time to reach a complete activation of the graph disappears. Reformulating the problem as the minimal number of vertices to be removed in a graph in order to destroy some specific subgraphs (its cycles or more generically its \(q\)-core) we recovered a previously known result for the decycling number of 3-regular random graphs [17] as well as a conjecture for 4-regular ones [17], and proposed new quantitative conjectures for the sizes of the minimal “de-coring” sets for all pairs of degree of the graph and minimal degree of the targeted core. These take a particularly simple rational form for the removal of the 3-core in 5- and 6- regular random graphs.
Let us sketch now some possible directions for future study. A first project would be to test the stability of the 1RSB ansatz we used to compute \({\theta _\mathrm{min,1}}\), to assess for which values of \((k,l)\) this number should be expected to be the exact value \({\theta _\mathrm{min}}\) and not only a lowerbound. This computation should be doable following the techniques of [59, 60, 70] for all finite \(T\), and might even be simplified in the large \(T\) limit. By analogy with the independent set problem which is a marginal case of the problem investigated here one could surmise to find that the 1RSB ansatz is stable for large enough values of the degree \(k\) (and maybe also of the threshold \(l\)). This is also the regime where one can hope to see a mathematically rigorous proof of these predictions, as recently obtained for the independent sets in [33]. Asymptotic expansions of \({\theta _\mathrm{min,0}}(k,l)\) and \({\theta _\mathrm{min,1}}(k,l)\) in the large \(k\) limit for \(k>l\) should also be performed, considering either \(l\) fixed in this limit, \(l\) proportional to \(k\), or \(k-l\) fixed.
For the sake of concreteness and simplicity we presented explicit results only for regular random graphs, however we gave the intermediate equations of the RS and 1RSB cavity method under a form that can be directly applied to any sparse random graph ensembles with arbitrary prescribed degree distribution, and possibly fluctuating thresholds for activation. The latter could naturally be correlated with the degree of the vertices, triggering for instance the activation if the fraction of active neighbours reaches some fixed proportion (instead of a fixed number). It would be interesting to see how the results presented here are qualitatively modified by the local fluctuations in the graph structure, which would be particularly severe in the case of power-law tails in the degree distribution.
We also concentrated exclusively in this paper on the problem of optimizing the number of initially active vertices, imposing that all vertices are active at a later time. The variant of this problem where one puts a constraint on the maximal number of active vertices allowed in the initial configuration and try to maximize the level of activation at a later time is also relevant, in particular for applications to real-world situations. At the RS level we have sketched how to do this by controlling the parameter \(\epsilon \) (the cost to be paid for finally inactive vertices) that we kept arbitrary in the first steps of the computations, a systematic study and the inclusion of the effects of replica symmetry breaking remains to be done.
Finally we believe that the message passing procedure inspired by the energetic 1RSB equations presented in Sect. 5.2 would be worth investigated further. One should try to study (and cure) the convergence issues that arise for larger values of \(T\), maybe changing the way the information provided by the messages is used. One could in particular exploit them in a softer way by implementing a reinforcement technique [8, 9] instead of a direct decimation. A more extensive comparison with the maxsum message passing procedure studied in [8, 9] could also be interesting.
References
Achlioptas, D., Coja-Oghlan, A.: Algorithmic barriers from phase transitions. In: IEEE 49th Annual IEEE Symposium on Foundations of Computer Science, 2008. FOCS’08, pp. 793–802. IEEE (2008)
Achlioptas, D., D’Souza, R.M., Spencer, J.: Explosive percolation in random networks. Science 323(5920), 1453–1455 (2009)
Achlioptas, D., Ricci-Tersenghi, F.: On the solution-space geometry of random constraint satisfaction problems. In: Proceedings of the 38th Annual ACM Symposium on Theory of Computing (2006)
Ackerman, E., Ben-Zwi, O., Wolfovitz, G.: Combinatorial model and bounds for target set selection. Theor. Comput. Sci. 411(44–46), 4017–4022 (2010)
Aizenman, M., Lebowitz, J.L.: Metastability effects in bootstrap percolation. J. Phys. A 21(19), 3801 (1988)
Altarelli, F., Braunstein, A., Dall’Asta, L., Lage-Castellanos, A., Zecchina, R.: Bayesian inference of epidemics on networks via belief propagation. Phys. Rev. Lett. 112, 118701 (2014)
Altarelli, F., Braunstein, A., Dall’Asta, L., Wakeling, R.: Containing epidemic outbreaks by message-passing techniques. Phys. Rev. X 4, 021024 (2014)
Altarelli, F., Braunstein, A., Dall’Asta, L., Zecchina, R.: Large deviations of cascade processes on graphs. Phys. Rev. E 87, 062115 (2013)
Altarelli, F., Braunstein, A., Dall’Asta, L., Zecchina, R.: Optimizing spread dynamics on graphs by message passing. J. Stat. Mech. 2013(09), P09011 (2013)
Altarelli, F., Braunstein, A., Dall’Asta, L., Zecchina, R.: Private communication (2014)
Balogh, J., Bollobás, B., Duminil-Copin, H., Morris, R.: The sharp threshold for bootstrap percolation in all dimensions. Trans. Am. Math. Soc. 364(5), 2667–2701 (2012)
Balogh, J., Pittel, B.G.: Bootstrap percolation on the random regular graph. Random Struct. Algorithms 30(1–2), 257–286 (2007)
Bapst, V., Coja-Oghlan, A., Hetterich, S., Rassmann, F., Vilenchik, D.: The Condensation Phase Transition in Random Graph Coloring. arXiv:1404.5513 (2014)
Barbier, J., Krzakala, F., Zdeborova, L., Zhang, P.: The hard-core model on random graphs revisited. J. Phys. Conf. Ser. 473(1), 012021 (2013)
Barrat, A., Barthelemy, M., Vespignani, A.: Dynamical Processes on Complex Networks. Cambridge University Press, Cambridge (2008)
Battaglia, D., Kolář, M., Zecchina, R.: Minimizing energy below the glass thresholds. Phys. Rev. E 70, 036107 (2004)
Bau, S., Wormald, N.C., Zhou, S.: Decycling numbers of random regular graphs. Random Struct. Algorithms 21(3–4), 397–413 (2002)
Bayati, M., Gamarnik, D., Tetali, P.: Combinatorial approach to the interpolation method and scaling limits in sparse random graphs. Ann. Probab. 41(6), 4080–4115 (2013)
Beineke, L.W., Vandell, R.C.: Decycling graphs. J. Graph Theory 25(1), 59–77 (1997)
Benevides, F., Przykucki, M.: Maximum Percolation Time in Two-Dimensional Bootstrap Percolation. arXiv (2013)
Biroli, G., Mézard, M.: Lattice glass models. Phys. Rev. Lett. 88, 025501 (2001)
Biskup, M., Schonmann, R.: Metastable behavior for bootstrap percolation on regular trees. J. Stat. Phys. 136(4), 667–676 (2009)
Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.-U.: Complex networks: structure and dynamics. Phys. Rep. 424(4), 175–308 (2006)
Bohman, T., Frieze, A., Wormald, N.C.: Avoidance of a giant component in half the edge set of a random graph. Random Struct. Algorithms 25(4), 432–449 (2004)
Bohman, T., Picollelli, M.: Sir epidemics on random graphs with a fixed degree sequence. Random Struct. Algorithms 41(2), 179–214 (2012)
Bordenave, C., Lelarge, M., Salez, J.: Matchings on infinite graphs. Probab. Theory Relat. Fields 157(1–2), 183–208 (2013)
Chalupa, J., Leath, P.L., Reich, G.R.: Bootstrap percolation on a bethe lattice. J. Phys. C Solid State Phys. 12(1), L31 (1979)
Chen, N.: On the approximability of influence in social networks. In: Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’08, pp. 1029–1037 (2008)
Coja-Oghlan, A.: On belief propagation guided decimation for random k-sat. In: Proceedings of 22nd SODA, p. 957 (2011)
Coja-Oghlan, A.: The Asymptotic \(k\)-sat Threshold. arXiv:1310.2728 (2013)
Coja-Oghlan, A., Feige, U., Krivelevich, M., Reichman, D.: Contagious Sets in Expanders. arXiv:1306.2465 (2013)
Daudé, H., Mora, T., Mézard, M., Zecchina, R.: Pairs of sat assignments and clustering in random boolean formulae. Theor. Comput. Sci. 393, 260–279 (2008)
Ding, J., Sly, A., Sun, N.: Maximum independent sets on random regular graphs. arXiv:1310.4787 (2013)
Dorogovtsev, S.N., Mendes, J.F.F.: Evolution of networks. Adv. Phys. 51(4), 1079–1187 (2002)
Dreyer Jr, P.A., Roberts, F.S.: Irreversible \(k\)-threshold processes: graph-theoretical threshold models of the spread of disease and of opinion. Discrete Appl. Math. 157(7), 1615–1627 (2009)
Franz, S., Leone, M.: Replica bounds for optimization problems and diluted spin systems. J. Stat. Phys. 111(3–4), 535–564 (2003)
Franz, S., Leone, M., Toninelli, F.L.: Replica bounds for diluted non-poissonian spin systems. J. Phys. A Math. Gen. 36, 10967–10985 (2003)
Frieze, A., Luczak, T.: On the independence and chromatic numbers of random regular graphs. J. Comb. Theory Ser B 54(1), 123–132 (1992)
Granovetter, M.: Threshold models of collective behavior. Am. J. Sociol. 83, 1420–1443 (1978)
Guerra, F.: Broken replica symmetry bounds in the mean field spin glass model. Commun. Math. Phys. 233(1), 1–12 (2003)
Haxell, P., Pikhurko, O., Thomason, A.: Maximum acyclic and fragmented sets in regular graphs. J. Graph Theory 57(2), 149–156 (2008)
Hethcote, H.W.: The mathematics of infectious diseases. SIAM Rev. 42, 599 (2000)
Holroyd, A.E.: Sharp metastability threshold for two-dimensional bootstrap percolation. Probab. Theory Relat. Fields 125(2), 195–224 (2003)
Janson, S., Luczak, M., Windridge, P.: Law of large numbers for the sir epidemic on a random graph with given degrees. arXiv:1308.5493 (2013)
Janson, S., Luczak, T., Turova, T., Vallier, T.: Bootstrap percolation on the random graph \(g_{n, p}\). Ann. Appl. Probab. 22(5), 1989–2047 (2012)
Karrer, B., Newman, M.E.J.: Message passing approach for general epidemic models. Phys. Rev. E 82, 016101 (2010)
Kempe, D., Kleinberg, J., Tardos, E.: Maximizing the spread of influence through a social network. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’03, pp. 137–146 (2003)
Krzakala, F., Montanari, A., Ricci-Tersenghi, F., Semerjian, G., Zdeborová, L.: Gibbs states and the set of solutions of random constraint satisfaction problems. Proc. Natl. Acad. Sci. 104(25), 10318–10323 (2007)
Kschischang, F., Frey, B., Loeliger, H.: Factor graphs and the sum-product algorithm. IEEE Trans. Inf. Theory 47(2), 498 (2001)
Lelarge, M.: Diffusion and cascading behavior in random networks. Games Econ. Behav. 75(2), 752–775 (2012)
Lokhov, A.Y., Mézard, M., Ohta, H., Zdeborova, L.: Inferring the Origin of an Epidemic with Dynamic Message-Passing Algorithm. arXiv:1303.5315 (2013)
Mézard, M., Montanari, A.: Reconstruction on trees and spin glass transition. J. Stat. Phys. 124(6), 1317–1350 (2006)
Mézard, M., Montanari, A.: Information, Physics and Computation. Oxford University Press, Oxford (2009)
Mézard, M., Parisi, G.: The Bethe lattice spin glass revisited. Eur. Phys. J. B. 20, 217 (2001)
Mézard, M., Parisi, G.: The cavity method at zero temperature. J. Stat. Phys. 111(1–2), 1 (2003)
Mézard, M., Zecchina, R.: Random \(k\)-satisfiability problem: from an analytic solution to an efficient algorithm. Phys. Rev. E 66(5), 056126 (2002)
Molloy, M.: The freezing threshold for k-colourings of a random graph. In: Proceedings of the 44th Symposium on Theory of Computing, p. 921. ACM (2012)
Monasson, R.: Structural glass transition and the entropy of the metastable states. Phys. Rev. Lett. 75, 2847–2850 (1995)
Montanari, A., Parisi, G., Ricci-Tersenghi, F.: Instability of one-step replica-symmetry-broken phase in satisfiability problems. J. Phys. A 37(6), 2073 (2004)
Montanari, A., Ricci-Tersenghi, F.: On the nature of the low-temperature phase in discontinuous mean-field spin glasses. Eur. Phys. J. B. 33(3), 339–346 (2003)
Montanari, A., Ricci-Tersenghi, F., Semerjian, G.: Clusters of solutions and replica symmetry breaking in random \(k\)-satisfiability. J. Stat. Mech. 2008(04), P04004 (2008)
Morris, R.: Minimal percolating sets in bootstrap percolation. Electron. J. Comb. 16(1), R2 (2009)
Newman, M.: The structure and function of complex networks. SIAM Rev. 45(2), 167–256 (2003)
Panchenko, D.: The Sherrington–Kirkpatrick Model. Springer, New york (2013)
Panchenko, D., Talagrand, M.: Bounds for diluted mean-fields spin glass models. Probab. Theory Relat. Fields 130(3), 319–336 (2004)
Pinto, P.C., Thiran, P., Vetterli, M.: Locating the source of diffusion in large-scale networks. Phys. Rev. Lett. 109, 068702 (2012)
Reichman, D.: New bounds for contagious sets. Discrete Math. 312(10), 1812–1814 (2012)
Ricci-Tersenghi, F., Semerjian, G.: On the cavity method for decimated random constraint satisfaction problems and the analysis of belief propagation guided decimation algorithms. J. Stat. Mech. 2009(09), P09001 (2009)
Riordan, O., Warnke, L.: Achlioptas process phase transitions are continuous. Ann. Appl. Probab. 22(4), 1450–1464 (2012)
Rivoire, O., Biroli, G., Martin, O., Mézard, M.: Glass models on Bethe lattices. Eur. Phys. J. B. 37, 55 (2004)
Qin, S.-M., Zhou, H.-J.: Solving the Undirected Feedback Vertex Set Problem by Local Search. arXiv:1405.0446 (2014)
Shah, D., Zaman, T.: Rumors in a network: who’s the culprit? IEEE Trans. Inf. Theory 57(8), 5163–5181 (2011)
Shrestha, M., Moore, C.: Message-passing approach for threshold models of behavior in networks. Phys. Rev. E 89, 022805 (2014)
Talagrand, M.: The parisi formula. Ann. Math. 163, 221 (2006)
Zdeborová, L., Mézard, M.: The number of matchings in random graphs. J. Stat. Mech. 2006(05), P05003 (2006)
Zhou, H.-J.: Spin glass approach to the feedback vertex set problem. Eur. Phys. J. B. 86(11), 1–9 (2013)
Acknowledgments
We warmly thank Fabrizio Altarelli, Victor Bapst, Alfredo Braunstein, Amin Coja-Oghlan, Luca Dall’Asta, Svante Janson, Marc Lelarge and Riccardo Zecchina for useful discussions, and in particular FA, AB, LDA and RZ for sharing with us the unpublished numerical results [10] on their maxsum algorithm, and SJ for a useful correspondence and for pointing out the reference [17]. The authors acknowledge the support of the French Agence Nationale de la Recherche (ANR) under reference ANR-11-JS02-005-01 (GAP project) and of the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme FP7/2007-2013/ under REA Grant Agreement No 290038.
Author information
Authors and Affiliations
Corresponding author
Appendices
Appendix 1: The Limit \(\mu \rightarrow -\infty \) of the Fields Recursion
We justify here the Eq. (73) for the recursion \(h=g(h_1,\ldots ,h_k)\) between “hard fields” \(h_i\in \{A_0,A_1,\ldots ,A_{T-1},A_T=B_T,B_{T-1},\ldots ,B_1,B_0 \}\). We can first notice that in Eqs. (70, 71) the (constrained) maximum over the partitions \(I,J,K\) of \(\mathcal{S}_t\) is always reached for \(|I|+|J|\) and \(|I|\) as small as possible (because \(a_t^{(i)} \ge b_{t-1}^{(i)} \ge b_{t-2}^{(i)} \)), which allows to rewrite
where \(J,K\) forms a partition of \(\{1,\ldots ,k\}\). In addition one realizes that
which by logical negation leads to
Combining these logical rules leads after a short reasoning to
and
Considering the various possible cases leading to a field of type \(A_t\) or \(B_t\) yields finally (73).
Appendix 2: Technical Details on the Resolution of the Factorized RS and Energetic 1RSB Equations
We shall present in this Appendix the details of the RS and energetic 1RSB cavity equations in the particular case of random \(k+1\) regular graphs with an uniform threshold \(l\) of activations. It turns out that despite their different interpretations these two version of the cavity method can be treated in an unified way. We thus begin by introducing this common formulation, then we unveil the simplifications that arise in the case \(l=k\), before finally discussing the limit \(T\rightarrow \infty \), both in the case \(l=k\) and \(l<k\).
1.1 Common Formulation
1.1.1 RS Cavity Method
Consider the fixed-point RS equation \(h=g(h,\ldots ,h)\), with \(g\) defined in Eq. (40); alternatively we saw in Eqs. (66, 67) an expression for the differences \(e^{-\mu a_t} - e^{-\mu a_{t+1}}\). Setting \(h_i=h\) in the right-hand sides of these equations, and using the identity
for any function \(f\) of a partition \(I,J,K\), allows to show the equivalence of the fixed-point equation on \(h=(a_0,\ldots ,a_T,b_{T-1},\ldots ,b_1)\) with:
These equations are valid for \(t\in \{0,\ldots ,T-1\}\), with the boundary conditions \(e^{-\mu b_{-1}}=0\), \(b_0=1\), \(a_T=b_T\), \(a_{T+1}=b_{T-1}\). The thermodynamic quantities can also be simplified in this factorized case, the site contribution to the RS free-entropy reading from Eq. (43):
while the edge contribution of Eq. (42) becomes
Let us introduce some new notations and define a change of parameters on the unknowns \(a_t,b_t\), as \(u_t=e^{-\mu a_t}\), \(v_t=e^{-\mu b_t}\). We also define a new parameter \(\lambda \), with \(\lambda =e^{-\mu + \mu k a_0}\). In terms of these new quantities the above set of equations becomes
with \(v_{-1}=0\), \(v_0=1\), \(u_T=v_T\), \(u_{T+1}=v_{T-1}\), and
In other words the \(u\)’s and \(v\)’s are solutions of a set of polynomial equations, and as such should be viewed as a function of \(\lambda \) and \(T\) (and of course of \(k\) and \(l\)). They also obey, on top of the boundary conditions, the inequalities \(u_0 \ge u_1 \ge \cdots \ge u_T = v_T \ge v_{T-1} \ge \cdots v_1 \ge v_0=1\). The chemical potential \(\mu \) has disappeared from this set of equations, but actually it is now implicitly a function of \(\lambda \) and \(T\), as from the definition of \(\lambda \) one recovers \(\mu \) with \(\mu = -\ln (\lambda u_0^k)\).
For future use we emphasize here an identity between the derivatives of \(D\) and \(S\) and introduce a new function \(C(u,v)\):
Let us also rewrite the thermodynamic quantities in terms of these new variables. The expressions (117) and (118) become
where we introduced the two functions
We emphasize here the dependency on \(\lambda \) and \(T\), which was kept implicit in the \(u_t\) and \(v_t\)’s. One has then the final expressions of all RS thermodynamic quantities as:
One can also express the probability distribution of the activation times in terms of these new variables. Denoting \(P_t\) the cumulative distribution, i.e. the probability that the activation time of one vertex is smaller or equal than \(t\), one has from Eq. (45):
where we defined
One can check that, as it should, \(P_0=\theta \) the fraction of initially active sites (summations over empty sets being equal to zero by convention), and \(P_T=1\) (as \(\epsilon =+\infty \) all vertices are active at the final time).
1.1.2 Energetic 1RSB Cavity Method
We now turn to a similar study of the energetic 1RSB equations in the factorized case, namely the determination of the normalized vector of probabilities \(P=(p_0,\ldots ,p_{T-1},q_T,\ldots ,q_0)\), solution of the fixed-point equation \(P=G(P,\ldots ,P)\), with the mapping \(G\) defined in Eq. (75).
Let us first note that in general the normalization \(Z[P_1,\ldots ,P_k]\) of (75) can be expressed in terms of \(q_0\),
This remark allows to rewrite the fixed-point equation \(P=G(P,\ldots ,P)\) as
where in the first line \(t\in \{0,\ldots ,T-1\}\) and in the second \(t\in \{1,\ldots ,T\}\). These two sets of equations are supplemented by the normalization condition \(q_0+\cdots +q_T+p_{T-1}+\cdots +p_0=1\).
The site and edge contributions of the energetic 1RSB potential, defined in (77, 79), become in the factorized case:
Now let us change variables and trade the unknowns \(p_t,q_t\) for some variables \(u_t\), \(v_t\), and the parameter \(y\) for some parameter \(\lambda \), according to
Inserting these definitions in the above equations one realizes that the quantities \(u_t\) and \(v_t\) are solutions of exactly the same set of Eqs. (119, 120) defined in the RS case, and obey the same boundary conditions and inequalities. From the solution of these equations, for a given value of the parameter \(\lambda \), one recovers the parameter \(y\) noting that by the normalization condition one has \(u_0=1/q_0\), hence \(y = \ln (\lambda u_0^k - u_0 +1)\). The expressions of \({\mathcal{Z}_\mathrm{site}}\) and \({\mathcal{Z}_\mathrm{edge}}\) within this parametrization are easily obtained from the above equations and read:
with the same functions \({F_\mathrm{site}}\) and \({F_\mathrm{edge}}\) defined in Eqs. (124, 125) for the RS case. One has finally an expression for the thermodynamic quantities of the energetic 1RSB formalism as
where \(\theta \) is here the opposite of the derivative of \(\Phi _\mathrm{e}\) with respect to \(y\), which after a short computation reads
1.1.3 Simplifications for \(l=k\)
In the case \(l=k\) further simplifications arise. Indeed the function \(S(u,v)\) defined in (121) is in this case independent of \(u\), and the Eqs. (119, 120) can be rewritten as:
This set of equations is particularly simple to solve, and admits a single solution for each value of \(\lambda \). One can indeed compute by recurrence the value of the \(v_t\) for increasing values of \(t\) from \(0\) to \(T\), then deduce the value of \(u_{T-1}\), and finally by a downward recurrence the values of \(u_t\) for \(t\) from \(T-2\) to \(0\). The thermodynamic observables are then deduced from (126) in the RS case or (132) in the energetic 1RSB case, where the site contributions can be simplified from (124), yielding
These simplifications can also be performed for the function (128) giving the distribution of activation times, which reads in the case \(k=l\):
1.1.4 Numerical Resolution for \(l<k\)
In the case \(l<k\) we did not find a simple change of variables on the unknowns \(u_t,v_t\) that would put the system of Eqs. (119, 120) in the triangular form that appeared naturally when \(k=l\) and led to a direct resolution by successive substitutions. We therefore resorted to the Newton-Raphson iterative method for solving (119, 120), taking care of choosing a good initial condition for the iterations to be convergent. This guess on the solution was provided by analytical asymptotic expansions, either in the limit \(\lambda \rightarrow 0\) or with \(T\rightarrow \infty \) (see next paragraph). Depending on the values of \(\lambda \) and \(T\) we found either 0, 1 or 2 relevant solutions of (119, 120), but this multi valuedness has no physical meaning and comes only from the arbitrary choice of the parametrization in terms of \(\lambda \). Indeed there is a single solution for each value of the chemical potential \(\mu \) (or \(y\) in the energetic 1RSB formalism).
1.2 The Large \(T\) Limit
In the rest of this Appendix we shall justify analytically the claims made in Sects. 4.2.1 and 4.2.2 on the behaviour of the RS and energetic 1RSB solutions as \(T\) goes to infinity.
1.2.1 The Trivial Solution
As anticipated in Sect. 4, in the large \(T\) limit the portion of the curve \(s(\theta )\) corresponding to \(\theta >{\theta _\mathrm{r}}\) should coincide with the entropy \(-\theta \ln \theta - (1-\theta ) \ln (1-\theta )\) counting all configurations with a fraction \(\theta \) of initially active sites, as such configurations are typically activating (see the reminder on random initial configurations of Sect. 2.2). Let us see how to prove this statement. A moment of thought, considering for instance the form of the RS equations at \(\epsilon =0\), reveals that this situation should correspond to a solution of (119, 120) with \(u_t={\widetilde{u}}\), independently of \(t\). This ansatz is indeed consistent with Eq. (119), and with this substitution Eq. (120) becomes
This last equation is a simple recursion on the \(v\)’s, with the initial value \(v_0=1\). For the boundary condition \(u_T=v_T\), \(u_{T+1}=v_{T-1}\) to be asymptotically (when \(T\rightarrow \infty \)) verified one has to impose the values of \({\widetilde{u}}\) and \(\lambda \) such that the \(v_t\) solution of (140) converge to \({\widetilde{u}}\) when \(t \rightarrow \infty \), in other words that the smallest fixed point solution \(v \ge 1\) of \(v=1+S({\widetilde{u}},v)\) is precisely equal to \({\widetilde{u}}\). The condition \({\widetilde{u}}=1+S({\widetilde{u}},{\widetilde{u}})\) imposes the following relationship between \({\widetilde{u}}\) and \(\lambda \), \({\widetilde{u}}=1+\lambda {\widetilde{u}}^k\). Using this condition one can then rewrite (140) as
Comparing this equation with (3) one realizes that by definition of \({\theta _\mathrm{r}}\), all the values of \({\widetilde{u}}\) in the interval \([1,1/{\theta _\mathrm{r}}[\) are such that the condition \(v_t \rightarrow {\widetilde{u}}\) is fulfilled (with the value of \(\lambda \) fixed by \({\widetilde{u}}=1+\lambda {\widetilde{u}}^k\)). Let us now compute the RS thermodynamic quantities associated with this solution. As the \(u_t\) are independent of \(t\) the summation in Eq. (124) can be performed with a telescopic identity, and yields after a short computation \({F_\mathrm{site}}={\widetilde{u}}-1\). Similarly one sees easily from (125) that \({F_\mathrm{edge}}={\widetilde{u}}\) for this solution. This gives indeed the function \(s(\theta )=-\theta \ln \theta - (1-\theta ) \ln (1-\theta )\) for \(\theta > {\theta _\mathrm{r}}\) upon replacing in the expression of the RS thermodynamic potential (cf. Eq. (126)). In addition the cumulative distribution \(P_t\) of activation times defined in Eq. (127) coincides on this solution with the series \(x_t\) of Eq. (2) obtained as the activation time cumulative distribution of a random initial condition.
In the following we shall describe the non-trivial part of the resolution of the RS and energetic 1RSB equations in the large \(T\) limit, i.e. in the RS case the part of the curve \(s(\theta )\) for \(\theta < {\theta _\mathrm{r}}\). The cases \(l=k\) and \(l<k\) are technically rather different, we shall thus divide the discussion according to this distinction.
1.2.2 Asymptotics for \(l=k\)
As explained in Sect. 1 in the case \(l=k\) the equations on \(v_t\) decouple, these quantities become independent of \(T\) and are solutions of the recurrence \(v_{t+1}=1+\lambda v_t^k\). A straightforward study of this equation (see Fig. 14 for an illustration) reveals the existence of a critical value \(\lambda _\mathrm{c}\) such that \(v_t\) converges to a finite value when \(t\rightarrow \infty \) if \(\lambda \le \lambda _\mathrm{c}\), while it diverges when \(\lambda > \lambda _\mathrm{c}\). This critical parameter and the associated fixed-point \(v_\mathrm{c}\) of the recurrence are solution of the equations:
which are easily solved and yield \(\lambda _\mathrm{c}=\frac{(k-1)^{k-1}}{k^k}\), \(v_\mathrm{c}=\frac{k}{k-1}\).
The case \(\lambda < \lambda _\mathrm{c}\) corresponds actually to the trivial solution already discussed above, let us thus consider the alternative situation, \(\lambda > \lambda _\mathrm{c}\). The divergence of \(v_t\) is then actually very steep, with a double exponential form. Indeed when \(v_t \gg 1\) the recurrence becomes approximately \(v_{t+1} \approx \lambda v_t^k\), which reveals that \((\ln \ln v_t)/t\) converges to \(\ln k\). As \(u_0 \ge v_T\) one also has a divergence of \(u_0\) with \(T\) in this regime; from (126) (resp. (132)) this implies that the chemical potential \(\mu \) of the RS formalism (resp. the parameter \(y\) of the energetic 1RSB one) go to \(-\infty \) (resp. \(+\infty \)), i.e. that the parametric curve \(s(\theta )\) (resp. \(\Sigma _\mathrm{e}(\theta )\)) has a vertical tangent in this regime. Furthermore we shall prove now that the corresponding density \(\theta \) of initially active sites converges to \((k-1)/(2k)\) (both in the RS and energetic 1RSB cases), hence this branch corresponds to a vertical segment. This is actually a consequence of the following statement on the behaviour of the functions \({F_\mathrm{site}}\) and \({F_\mathrm{edge}}\) of Eqs. (138, 125):
as can be easily deduced from the expressions of \(\theta \) given in (123, 126) and (133), along with the divergence of \(u_0\) in the latter case. To prove the claim of Eq. (143), let us first note that, iterating (137), one obtains
where we used (135) to go from the first to the second line. We can thus write
where we introduced the sequence \(\alpha _t\) (note its independence on \(T\)) as
We also have, in terms of this series,
Using these relations, along with the representation \(u_0 = v_T + \sum _{t=0}^{T-1} (u_t -u_{t+1})\), allows to rewrite the definition of (125) as:
The sum in the denominator can be transformed by noting that, from the definition of \(\alpha _t\), \(\alpha _t /v_t = \alpha _t - \alpha _{t-1}\). This yields
Notice now that \(\alpha _t\) has a finite limit when \(t \rightarrow \infty \), thanks to the divergence of \(v_t\) (for the limit of \(\alpha _t\) to exists it is actually enough that \(v_t \gg t\)). Hence the summations in the above equation converge when \(T\rightarrow \infty \) thanks to the exponentially decaying factor \(1/k^t\), and all other terms in the numerator and denominator are neglectible in this limit. This proves the limit \(2k/(k-1)\) for \({F_\mathrm{edge}}\) (one could also compute the main correction, of order \(k^{-T}\), from this expression). The statement on \({F_\mathrm{site}}\) is proved with similar manipulations, that brings from (138) to the expression (exact for all \(T\)),
As above the limit \(T\rightarrow \infty \) can now be taken safely, the converging summations being the only non-vanishing terms of the numerator and denominator, hence the convergence of \({F_\mathrm{site}}\) to \((k+1)/(k-1)\), with corrections of order \(k^{-T}\). These corrections actually contribute to the non-trivial dependence on \(\lambda \) of \(s\) and \(\Sigma _\mathrm{e}\) (which are both finite) in this regime; we did not push their determination further, and merely observe here that their order \(k^{-T}\) explains the statement on the finite \(T\) corrections to \({\theta _\mathrm{min}}\) for \(k=l=2\) and \(k=l=3\) made in Sect. 4.2.1.
We have just seen that in the \(T \rightarrow \infty \) limit the cases \(\lambda < \lambda _\mathrm{c}\) and \(\lambda > \lambda _\mathrm{c}\) describe, respectively, the trivial branch \(\theta > {\theta _\mathrm{r}}\) of the RS entropy and its vertical segment at \({\theta _\mathrm{r}}/2\). To describe the range \([{\theta _\mathrm{r}}/2,{\theta _\mathrm{r}}]\) of non-trivial densities of initially active sites one has thus to investigate a regime where \(\lambda \) is in a \(T\)-dependent scaling window around \(\lambda _\mathrm{c}\).
Let us denote \({\widetilde{v}}_t\) the solution of the recursion right at the critical point, i.e. \({\widetilde{v}}_{t+1}= 1 + \lambda _\mathrm{c}{\widetilde{v}}_t^k\), with \({\widetilde{v}}_0=1\). This series converges to \(v_\mathrm{c}\), with an asymptotic behaviour which is easily found to be
Now if \(\lambda = \lambda _\mathrm{c} + \delta \), with an infinitesimal positive value of \(\delta \), the solution \(v_t\) of the recursion \(v_{t+1}=1+\lambda v_t^k\) spends a time of order \(\delta ^{-1/2}\) around the avoided fixed-point \(v_\mathrm{c}\) before crossing over to the doubly exponentially growing regime investigated above (this is a general feature of such recursive equations in the neighbourhood of a bifurcation, see for instance [22]). It is thus natural to investigate the scaling window parametrized by \({\widehat{\lambda }}\) as
the numerical prefactor and the square on \({\widehat{\lambda }}\) being chosen to simplify the following expressions. One can then look for a solution of the recurrence equation under the form \(v_t = v_\mathrm{c} + \frac{1}{T} V(t/T)\), with \(V(s)\) a scaling function. Expanding at the leading order in \(T\) one obtains a differential equation on \(V\),
The latter can be integrated into
the constant in the solution of the differential equation being obtained by a matching argument between the regime \(s\rightarrow 0\) and the large \(t\) asymptotics of the critical series \({\widetilde{v}}_t\) given in (152). Note that this form is only valid for \({\widehat{\lambda }}< 1\), otherwise one enters the regime where \(v_T\) diverges with \(T\). One can furthermore assume a similar scaling ansatz for the \(u_t\), introducing a scaling function \(U(s)\) under the form \(u_t = v_\mathrm{c} + U(t/T)\). Inserting these forms in Eq. (137) yields a differential equation on \(U\),
which is integrated in
with \(A\) and \(B\) two constants of integration. These can be fixed by imposing the boundary conditions \(u_T=v_T\) and \(u_{T+1}=v_{T-1}\), which translates here in \(U(1)=V(1)/T\) and \(U^{\prime }(1)=-V^{\prime }(1)/T\). Solving these equations yield \(A\) and \(B\); considering in particular \(u_0=v_\mathrm{c} + U(0)\) one obtains, at the leading order in a large \(T\) expansion,
One realizes at this point that for any fixed \({\widehat{\lambda }}< 1\), the limit of \(u_0\) coincides with \(v_\mathrm{c}\), in other words we are describing in this regime the end of the trivial branch, with \(\theta \approx {\theta _\mathrm{r}}\). To describe the non-trivial regime of densities \([{\theta _\mathrm{r}}/2,{\theta _\mathrm{r}}]\) one has thus to further refine the scaling window, taking now \({\widehat{\lambda }}\) approaching \(1\) in a \(T\)-dependent way. The inspection of (158) reveals that the correct scaling that allows to obtain a non-trivial limit of \(u_0\) corresponds to \({\widehat{\lambda }}= 1 - O(T^{-1/4})\). We shall thus set
with \({\widetilde{\lambda }}>0\) the new parameter describing this scale, the numerical prefactor being chosen for convenience. After a short computation one obtains the limit as \(T \rightarrow \infty \) of the thermodynamic quantities in this scaling regime of \(\lambda \) as
the last two expressions being obtained by inserting the scaling ansatz on \(u_t\) and \(v_t\) in the definitions (125, 138); at the lowest order one can actually replace the \(v_t\)’s by \(v_\mathrm{c}\) there. This yields a parametric representation of the thermodynamic quantities of the RS (resp. energetic 1RSB) formalism in terms of \({\widetilde{\lambda }}\), by inserting these last results in Eq. (126) (resp. (132, 133)). In the RS case one can check that \({\widetilde{\lambda }}\rightarrow 0\) corresponds to \(\theta \rightarrow {\theta _\mathrm{r}}/2\), while \({\widetilde{\lambda }}\rightarrow \infty \) yields \(\theta \rightarrow {\theta _\mathrm{r}}\), hence this scaling regime allows to cover the desired range \([{\theta _\mathrm{r}}/2,{\theta _\mathrm{r}}]\) for the densities of initially active sites. It is furthermore possible to invert the relation \(\theta ({\widetilde{\lambda }})\), which yields finally the formula (82) announced in the main text for the entropy of activating initial configurations of density in the non-trivial interval \([{\theta _\mathrm{r}}/2,{\theta _\mathrm{r}}]\). In the energetic 1RSB case this last step does not seem possible and the final result (84) is presented in a form parametrized by \({\widetilde{\lambda }}\). We did not embark in a systematic study of the finite \(T\) corrections in this regime, it is however clear that they are polynomially small in \(T\), which justifies the statement made in Sect. 4.2.1 on the corrections to \({\theta _\mathrm{min}}(T)\) for \(k=l\ge 4\).
Let us finally justify the results presented at the end of Sect. 4.2.1 on the distribution of activation times. Assuming a finite value of \(t\), the expression of (139) becomes in the regime parametrized by \({\widetilde{\lambda }}\):
the last summation in (139) yielding a subdominant correction of order \(1/T\). Note that \({F_\mathrm{site}}({\widetilde{\lambda }},t)\) tends to \({F_\mathrm{site}}({\widetilde{\lambda }})\) as \(t\rightarrow \infty \), which means that the support of the distribution of the activation times does not scale with \(T\) in this regime. The expression (92) for the cumulative distribution of activation times follows then easily from its generic definition given in Eq. (127), upon expressing all the quantities depending on \({\widetilde{\lambda }}\) as a function of the corresponding \(\theta \). In the main text we introduced for clarity the series \(w_t = {\theta _\mathrm{r}}{\widetilde{v}}_t\), to allow for an easier comparison with the distribution of activation times from a random initial condition.
1.2.3 Asymptotics for \(l<k\)
Let us now discuss the solution of the set of Eqs. (119, 120) in the limit \(T\rightarrow \infty \), in the case \(l<k\), and justify the statements made in Sect. 4.2.2; as we shall see their behaviour and the method of study is qualitatively different compared to the case \(l=k\).
We shall first rephrase Eqs. (119, 120) as a single recursive equation, by introducing a four-dimensional vector \(w_t\) defined by
The recursive equations (119, 120) on the \(u_t\)’s and \(v_t\)’s become a single recursion on \(w_t\), of the form \(w_{t+1} = R(w_t)\) where the function \(R\) is given by
The function \(S\) was defined in (121), while \(E(u,u_+,v)\) is given implicitly as \(D(E(u,u_+,v),v)=D(u_+,v)+u_+-u\), with the function \(D\) of (121). Inverting this relation one obtains an explicit expression of \(E\):
We have thus a representation of the time evolution of \(w\) as the flow of a discrete dynamical system in a four-dimensional space. The boundary conditions on the \(u_t\)’s and \(v_t\)’s translate into conditions on the allowed values of \(w_0\) and \(w_T\). The former must indeed lie in the two-dimensional manifold with \(v=1\) and \(v_-=0\), while the latter is restricted to the two-dimensional manifold defined by \(u=v\) and \(u_+=v_-\). When \(T\rightarrow \infty \), for a fixed value of \(\lambda \), the solution \(w_t\) of the recursion \(w_{t+1}=R(w_t)\) must find a way to go infinitely slowly from the first manifold at \(t=0\) to the second one at \(t=T\rightarrow \infty \). It must in consequence remains as close as possible to the fixed points of the evolution map \(R\).
The study of the equation \(w=R(w)\) is very simple and shows that these fixed points span the two-dimensional subspace with \(u=u_+\), \(v=v_-\). One can then compute the Jacobian matrix of \(R\) on such a fixed-point, and realizes that this matrix has two eigenvalues equal to 1 (corresponding to the invariance of the fixed-point subspace under \(u\rightarrow u+\delta u\) and \(v \rightarrow v+\delta v\)), and two eigenvalues \(C(u,v)\) and \(1/C(u,v)\), where \(C\) is the function defined in (122). All the fixed points have thus an unstable direction, except the one-dimensional set of fixed points obeying the further condition \(C(u,v)=1\), which constitutes a line of marginal fixed points. In the \(T\rightarrow \infty \) limit the solution \(w_t\) is thus expected to remain close to this line, otherwise the flow along the unstable directions forbid to go from one boundary manifold at \(t=0\) to the other one at \(t=T \gg 1\). This analysis is corroborated by the numerical results presented in Fig. 15, where we show the solution \(u_t,v_t\) determined numerically for some large but finite value of \(T\). In particular the right panel demonstrate that for most values of \(t\) (i.e. excluding both \(t\) finite and \(T-t\) finite in the large \(T\) limit), the couple \((u_t,v_t)\) falls on the marginal fixed-point line \(C(u,v)=1\).
More precisely, the solution \(u_t,v_t\) can be described in the large \(T\) limit by two scaling functions \(U(s)\) and \(V(s)\), function of a rescaled time \(s=t/T \in ]0,1[\), such that at the leading order,
Inserting this ansatz in the Eqs. (119, 120), one realizes that the condition \(C(U(s),V(s))=1\), that we obtained intuitively above, is indeed precisely what is needed to enforce (119, 120) at the leading order in the large \(T\) limit. Note that the explicit dependency of \(U\) and \(V\) on \(s\) can be determined from the sub-dominant corrections in this limit; however we shall not need it in what follows. It will indeed be enough to compute the value of \(U\) and \(V\) for \(t\) small and \(t\) close to \(T\), i.e. for \(s\) around 0 and 1. As revealed by the numerical data presented in Fig. 15, the matching between the scaling regime described by the functions \(U,V\) (i.e. for \(s\) strictly between 0 and 1) and the boundary conditions at \(t=0\) and \(t=T\) affects the series \(v_t\) but not \(u_t\). In other words, for \(t\) finite while \(T\rightarrow \infty \) one has \(u_t\rightarrow u_*=U(0)\) independently of \(t\), where \(u_*\) is some (\(\lambda \) dependent) constant still to be determined, while \(v_t\) converges to the solution of the recursion \(v_{t+1}=v_t+S(u_*,v_t)-S(u_*,v_{t-1})\) obtained from (120) by replacing \(u_t\) by its limit \(u_*\). Equivalently one has in this regime \(v_{t+1}=1+S(u_*,v_t)\). When \(t\rightarrow \infty \) (after the large \(T\) limit) this series \(v_t\) converges to \(v_*=V(0)\), the smallest fixed-point solution of this recursion on \(v\); for this behaviour to match the beginning of the scaling regime (i.e. \(s\rightarrow 0\)) one must impose simultaneously
The first equation allows to express \(u_*\) as a function of \(v_*\); replacing in the second one leads to the single equation on \(v_*\) given in Eq. (96), while (97) is nothing but an explicit version of the condition \(C(u_*,v_*)= 1\). A similar reasoning in the regime \(T-t\) finite reveals that \(U(1)={\widehat{u}}\) and \(V(1)={\widehat{v}}\) have to obey
It is easy to check that the expressions of \({\widehat{u}}\) and \({\widehat{v}}\) given in (95) are indeed solutions of these two equations, using the equations on \({\theta _\mathrm{r}}\) and \({\widetilde{x}_\mathrm{r}}\) of Eq. (4). By definition for \(\lambda \in ]0,\lambda _\mathrm{r}]\) one has \(u_* \ge {\widehat{u}}\ge {\widehat{v}}\ge v_*\), see Fig. 16 for a representation of the solution of the Eqs. (166, 167) as a function of \(\lambda \). In \(\lambda _\mathrm{r}\), where one recovers the trivial solution studied in Appendix section “The Trivial Solution”, one has \(u_*={\widehat{u}}=1/{\theta _\mathrm{r}}\) and \(v_*={\widehat{v}}={\widetilde{x}_\mathrm{r}}/{\theta _\mathrm{r}}\).
Let us now deduce the value of \({F_\mathrm{site}}\) and \({F_\mathrm{edge}}\) in the large \(T\) limit from the above characterization of the behaviour of the \(u_t\)’s and \(v_t\)’s. From Eq. (125) one has in this limit
the matching regimes of \(t\) finite and \(T-t\) finite having neglectible contributions to the summation. The integral above can be computed even if we have not determined the time-dependency of the scaling functions \(U(s)\) and \(V(s)\): using \(\mathrm{d}s \, U^{\prime }(s) = \mathrm{d}u\) and the condition \(C(U(s),V(s))=1\), one has
where \(u(v)\) (resp. \(v(u)\)) is the solution of \(C(u(v),v)=1\) (resp. \(C(u,v(u))=1\)). The equation \(C(u(v),v)=1\) can be explicitly solved into
This allows to compute the integral in (169) and to obtain (99).
We shall now compute similarly the limit of \({F_\mathrm{site}}\) that was defined in Eq. (124). In that equation we shall exploit the fact that \(u_t-u_{t+1}\) is of order \(1/T\) to perform the approximation
Within this approximation the first term leads to a telescopic summation, we then get
As \(u_T=v_{T-1}+O(1/T)\) in the first summation only the term \(p=k+1\) survives; the second term can be rearranged as above in terms of integrals of the scaling functions, namely
Inserting the expression of \(u(v)\) given in Eq. (170) yields easily to the value of \({F_\mathrm{site}}\) written in (98). The parametric representations of \(s(\theta )\) and \(\Sigma _\mathrm{e}(\theta )\) given in Sect. 4.2.2 are then direct consequences of Eqs. (126, 132, 133).
For what concerns the distribution of activation times, one has in the regime \(t=sT\) with \(s\in ]0,1[\) the following limit for the function \({F_\mathrm{site}}\) defined in (128):
Studying the limit \(s\rightarrow 0+\) and \(s\rightarrow 1^-\) of this expression leads to the expressions (104) for the fraction of vertices which activate at the very beginning and at the very end of the process.
Rights and permissions
About this article
Cite this article
Guggiola, A., Semerjian, G. Minimal Contagious Sets in Random Regular Graphs. J Stat Phys 158, 300–358 (2015). https://doi.org/10.1007/s10955-014-1136-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10955-014-1136-2