1 Introduction

1.1 Context

Personalized PageRank is an algorithm that assigns a score to the vertices of a network, indicating their importance. The algorithm is one of the great successes of network science. It has been used in a wide range of applications that include ranking of websites (Page et al. 1999; Ding et al. 2003; Haveliwala 2003), clustering of similar objects (Chung 2009; Andersen et al. 2006; Tabrizi et al. 2013), classification of the vertices of a network (Avrachenkov et al. 2012; Merkurjev et al. 2018; Dostal et al. 2014), detection of anomalous events (Fontugne et al. 2019; Yoon et al. 2019; Yao et al. 2012), or recommender systems (Al Janabi and Kadiam 2019; Zhang et al. 2009; Nguyen et al. 2015), to name a few. Additionally, personalized PageRank possesses numerous properties (Langville and Meyer 2004; Ipsen and Wills 2006; Brezinski and Redivo-Zaglia 2006; Pretto 2002) that have recently inspired, in the semi-supervised learning context, a large family of PageRank generalizations that make it possible to tackle challenging settings such as signed graphs (Bautista et al. 2019), anomalous diffusion processes (De Nigris et al. 2017), infinite dimensional spaces (Mai and Couillet 2017), Sobolev spaces (Zhou and Belkin 2011), or time-graph dual spaces (Girault et al. 2014). However, despite the success of all these PageRank-based algorithms, they still pose problems from the algorithmic point of view. In particular, they do not adapt well to networks that evolve over time, a situation that we aim to address in this work.

1.2 Related works

While personalized PageRank is formally defined as the solution of a fixed point equation (see Section 2.1), it can be equivalently interpreted as the stationary distribution of a random walk diffusion process with restart. In this process, random walkers are placed on a subset of vertices of the network according to some initial distribution, given by the so-called personalization vector, and are then diffused according to the following rule: at each step, a random walker either transitions to an adjacent vertex with probability \(\alpha\) or restarts to its initial position with probability \((1 - \alpha )\) (Haveliwala 2003; Page et al. 1999). This interpretation of personalized PageRank has been exploited by numerous works to develop fast algorithms for computing PageRank (Haveliwala 1999; Kamvar et al. 2004; Fujiwara et al. 2012; Bahmani et al. 2011; Maehara et al. 2014; Avrachenkov et al. 2007; Andersen et al. 2006; Berkhin 2006).

However, in cases where the graph evolves over time, this stationary distribution drifts and needs to be updated. While the algorithms above make it possible to quickly recompute a personalized PageRank vector, proceeding in this way becomes impractical for very large networks that constantly evolve, such as the web graph, which has been reported to have \(60\times 10^{12}\) nodes and to evolve at a rate of \(600\times 10^3\) new pages created every second (Ohsaka et al. 2015). Additionally, several works (Ipsen and Wills 2006; Brezinski and Redivo-Zaglia 2006; Pretto 2002) have studied the impact of graph perturbations on PageRank vectors, showing that (i) the magnitude of the change to a PageRank vector is upper bounded by the size of the perturbation (the magnitude of the change to the graph transition matrix); and (ii) the PageRank scores only change significantly for the nodes that are close to the perturbed area. This implies that if a perturbation is small, then it is wasteful to recompute the PageRank vector from scratch. A better alternative consists in only updating the scores of nodes close to the perturbation.

In the literature, this challenge is commonly referred to as local PageRank updating, and there are three reference methods. Firstly, Bahmani et al. (2010) propose a Monte Carlo approach where multiple random walkers are run to estimate the entries of the PageRank vector based on how frequently walkers visit nodes. Secondly, the work of Yoon et al. (2018) exploits the random walk with restart interpretation of personalized PageRank to show that the stationary distribution of this walk can be updated by running a local diffusion process in the affected area; Yoon et al. (2018) then run this local process in a distributed fashion via the power method. Thirdly, the work of Ohsaka et al. (2015) proposes a centralized updating algorithm based on the use of a residual and an approximation vector, sequentially pushing mass from the residual vector into the approximation vector using Gauss-Southwell update rules. While these works have been the subject of deep theoretical studies (Zhang et al. 2016) and successfully applied in practice (Yoon et al. 2020), they still suffer from two main limitations: (i) they have slow convergence rates, particularly when trying to attain very small approximation errors; and (ii) they fully rely on the random walk interpretation of PageRank and thus cannot be used to update the novel generalizations of PageRank used in semi-supervised learning (Bautista et al. 2019; De Nigris et al. 2017; Mai and Couillet 2017; Zhou and Belkin 2011; Girault et al. 2014), which rely on more complex dynamical processes.

1.3 Goals, contributions and outline

In this work, we address the two limitations listed above. We propose a novel local updating algorithm based on Chebyshev polynomials. These polynomials have already been used to quickly approximate personalized PageRank vectors from scratch (Bautista Ruiz 2019), showing faster convergence than Gauss-Southwell and power-iteration-based methods (the building blocks of Ohsaka et al. (2015) and Yoon et al. (2018), respectively). However, Chebyshev polynomials have not yet been considered in the updating setting because they lack the ability to set an initial guess, which all the updating algorithms described above rely upon. The core of our proposal is thus a novel updating equation that is tailored to the large family of PageRank methods mentioned above. Concretely, it allows us to (i) use a previous personalized PageRank vector as an initial guess in the Chebyshev context; (ii) cast the updating challenge as the task of running a local diffusion process that we can efficiently compute (i.e. with just a few message exchanges) via the Chebyshev polynomials; and (iii) update any of the recent generalized formulations of PageRank (Bautista et al. 2019; De Nigris et al. 2017; Mai and Couillet 2017; Zhou and Belkin 2011; Girault et al. 2014), among which classical personalized PageRank arises as a special case.

The paper is organized as follows: Section 2 sets definitions and reviews local updating methods. Section 3 presents Chebyshev polynomials and introduces the proposed algorithm. Section 4 numerically evaluates the algorithm. Section 5 concludes the work.

2 Definitions and state-of-the-art

2.1 Definitions

Let \({\mathcal {G}} ({\mathcal {V}}, {\mathcal {E}}, W)\) be an undirected weighted graph where W refers to the adjacency matrix, \({\mathcal {E}}\) is the set of edges and \({\mathcal {V}}\) is the set of vertices. By \(D = diag(d_1, \cdots , d_N)\) we denote the diagonal matrix of degrees, where \(d_i = \sum _{j}W_{ij}\). The matrices \(L = D - W\) and \(P = D^{-1}W\) refer to the combinatorial Laplacian and random walk transition matrices of \({\mathcal {G}}\), respectively. The notation \(u \sim v\) implies that u is adjacent to v. Additionally, we denote the personalized PageRank vector of \({\mathcal {G}}\) by \(\text {pr}_{\alpha } (y)\). Recalling the interpretation of PageRank as a random walk process with restart, the parameter \(\alpha\) refers to the restarting probability of the random walkers (also called teleportation parameter) and y is a probability vector with the initial distribution of the random walkers. The personalized PageRank vector is thus defined as the solution to the following fixed point equation

$$\begin{aligned} \text {pr}_{\alpha } (y) = (1-\alpha ) y + \alpha P^T \text {pr}_{\alpha }(y). \end{aligned}$$
(1)

Let \(\widetilde{{\mathcal {G}}}(\widetilde{{\mathcal {V}}}, \widetilde{{\mathcal {E}}}, {\widetilde{W}} )\) denote an undirected weighted graph which is an evolved or perturbed version of \({\mathcal {G}}\). In analogy to the definitions for \({\mathcal {G}}\), the matrices \({\widetilde{D}}\), \({\widetilde{L}}\) and \({\widetilde{P}}\) refer to the degree, Laplacian and random walk transition matrices of \(\widetilde{{\mathcal {G}}}\), respectively. Additionally, we let \({\widetilde{\text {pr}}}_{\alpha }(y)\) be the personalized PageRank vector of \(\widetilde{{\mathcal {G}}}\) with restarting probability \(\alpha\) and initial distribution y. In this paper, we address the problem of estimating \({\widetilde{\text {pr}}}_{\alpha }(y)\) from \(\text {pr}_{\alpha }(y)\), which can be seen as the problem of updating \(\text {pr}_{\alpha }(y)\) to obtain the PageRank of the evolved network (under the same \(\alpha\) and y). To keep matrix sizes consistent, we model nodes joining or leaving the network as isolated nodes that become connected or disconnected. By definition, isolated nodes correspond to zero rows and columns in the graph matrices, and their degree and inverse degree are set to zero.
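The following minimal sketch (ours, not part of the original paper) illustrates these definitions with numpy: it builds \(D\), \(P\) and \(L\) from an adjacency matrix W, keeps zero inverse degrees for isolated nodes, and obtains \(\text {pr}_{\alpha }(y)\) by directly solving the linear system behind Eq. (1). The dense solve and the function names are ours, for illustration on small graphs only.

```python
import numpy as np

def graph_matrices(W):
    """Degree, transition and Laplacian matrices of an undirected weighted graph;
    isolated nodes get zero degree and zero inverse degree."""
    d = W.sum(axis=1)
    d_inv = np.divide(1.0, d, out=np.zeros_like(d), where=d > 0)
    D = np.diag(d)
    P = np.diag(d_inv) @ W          # random walk transition matrix D^{-1} W
    L = D - W                       # combinatorial Laplacian
    return D, P, L

def pagerank_direct(W, y, alpha):
    """Personalized PageRank from Eq. (1): (I - alpha P^T) pr = (1 - alpha) y."""
    _, P, _ = graph_matrices(W)
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * P.T, (1 - alpha) * y)

# Toy example: a 4-node path graph, walkers started at node 0.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
y = np.array([1.0, 0.0, 0.0, 0.0])
print(pagerank_direct(W, y, alpha=0.5))
```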

2.2 State-of-the-art approaches for local PageRank updating

In this section, we briefly review the state-of-the-art methods of Yoon et al. (2018) and Ohsaka et al. (2015) for local PageRank updating. We focus on these two approaches because they are deterministic methods that can update a PageRank vector up to any desired accuracy; hence, they directly compare to our proposed algorithm that we describe in Section 3.

Random Walk with Restart (RWR) (Yoon et al. 2018). This work is rooted in the power iteration method, which guarantees that the recursive formula

$$\begin{aligned} {\widetilde{p}}^{(t)} = (1-\alpha ) y + \alpha {\widetilde{P}}^T {\widetilde{p}}^{(t-1)} \end{aligned}$$
(2)

converges to \({\widetilde{\text {pr}}}_\alpha (y)\) when \(t \rightarrow \infty\). Since it is guaranteed that a PageRank vector does not substantially change under small perturbations, the authors of Yoon et al. (2018) exploit the fact that Eq. (2) allows an initial guess \({\widetilde{p}}^{(0)}\) to be set, which they choose as \({\widetilde{p}}^{(0)} = \text {pr}_\alpha (y)\). This procedure of using \(\text {pr}_\alpha (y)\) as an initial guess to compute \({\widetilde{\text {pr}}}_\alpha (y)\) is known in the literature as a warm restart. Clearly, the warm restart sets the trajectory of the recursive equation (2) very close to \({\widetilde{\text {pr}}}_\alpha (y)\) from the start, thus helping to drastically reduce the number of iterations needed to obtain very precise approximations. Notice however that, even though this reduces the number of iterations, the initial guess \({\widetilde{p}}^{(0)}\) is a dense vector (most entries are nonzero). As a result, all the vertices in the graph must be involved in the computation of the new PageRank vector, even though only the nodes in the perturbed area need to be updated. The authors of Yoon et al. (2018) address this issue by showing that (2), under a warm restart, can be rewritten as:

$$\begin{aligned} {\widetilde{\text {pr}}}_\alpha (y) = \text {pr}_\alpha (y) + \frac{1}{(1-\alpha )} {\widetilde{\text {pr}}}_{\alpha }(r), \end{aligned}$$
(3)

where \(r = \alpha [ {\widetilde{P}}^T - P^T ] \text {pr}_{\alpha }(y)\). Eq. (3) shows that updating an existing PageRank vector amounts to computing another PageRank vector with an initial seed r that is completely localized (nonzero only) in the 1-hop vicinity of the perturbed nodes. The vector \({\widetilde{\text {pr}}}_{\alpha }(r)\) can then be computed simply by setting \(y = r\) and \({\widetilde{p}}^{(0)} = r\) in (2), showing that running the recursion for a few iterations is essentially equivalent to locally diffusing r to the affected nodes and updating their values.
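As an illustration only (our sketch, not the authors' reference implementation), the RWR update of Eq. (3) can be prototyped with numpy as follows: the localized residual r is diffused on the evolved graph with the power iteration of Eq. (2), and the result is added to the previous PageRank vector. The function names and the fixed iteration count are our own choices.

```python
import numpy as np

def power_iteration_pr(P_T, y, alpha, p0, n_iter=50):
    """Iterate Eq. (2): p <- (1 - alpha) y + alpha P^T p, from initial guess p0."""
    p = p0.copy()
    for _ in range(n_iter):
        p = (1 - alpha) * y + alpha * (P_T @ p)
    return p

def rwr_update(pr_old, P_T_old, P_T_new, alpha, n_iter=50):
    """Warm-restart update of Eq. (3): pr_new = pr_old + pr_new(r) / (1 - alpha),
    where r is nonzero only around the perturbed nodes."""
    r = alpha * (P_T_new - P_T_old) @ pr_old
    pr_of_r = power_iteration_pr(P_T_new, r, alpha, p0=r, n_iter=n_iter)
    return pr_old + pr_of_r / (1 - alpha)
```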

Push method (Ohsaka et al. 2015). This work is rooted in the Gauss-Southwell recursive formula which uses two vectors to approximate \({\widetilde{\text {pr}}}_\alpha (y)\): an approximation vector \({\widetilde{p}}\) and a residual vector \({\widetilde{r}}\) coding the difference between \({\widetilde{\text {pr}}}_{\alpha }(y)\) and \({\widetilde{p}}\). The method starts with an initial guess \({\widetilde{p}}^{(0)}\) and then, at iteration t, transfers mass from \({\widetilde{r}}^{(t -1)}\) into \({\widetilde{p}}^{(t)}\) using a set of update equations which ensure that the following equation holds

$$\begin{aligned} {\widetilde{\text {pr}}}_\alpha (y) = {\widetilde{p}}^{(t)} + \frac{1}{1- \alpha } {\widetilde{\text {pr}}}_\alpha ({\widetilde{r}}^{(t)}). \end{aligned}$$
(4)

Clearly, minimizing the entries of \({\widetilde{r}}\) implies that \({\widetilde{p}}\) converges to \({\widetilde{\text {pr}}}_\alpha (y)\). To attain this, if at iteration t the largest entry of \({\widetilde{r}}^{(t)}\) corresponds to vertex u, then the state of the algorithm at iteration \(t+1\) is determined by the following set of update equations (\(\delta _u\) is the indicator vector of vertex u):

$$\begin{aligned} {\widetilde{p}}^{(t+1)}&= {\widetilde{p}}^{(t)} + {\widetilde{r}}^{(t)}_u \delta _u. \end{aligned}$$
(5)
$$\begin{aligned} {\widetilde{r}}^{(t+1)}&= {\widetilde{r}}^{(t)} - {\widetilde{r}}^{(t)}_u \delta _u + \alpha {\widetilde{P}}^T {\widetilde{r}}^{(t)}_u \delta _u. \end{aligned}$$
(6)

This procedure is repeated until all entries of the residual fall below some user-defined threshold. The authors of Ohsaka et al. (2015) show that Eq. (4) still holds when setting \({\widetilde{p}}^{(0)} = \text {pr}_{\alpha }(y)\) and \({\widetilde{r}}^{(0)} = \alpha [ {\widetilde{P}}^T - P^T ]\text {pr}_{\alpha }(y)\). While \({\widetilde{r}}^{(0)}\) is similar to the residual of the RWR method, the push method uses it differently: the locality of \({\widetilde{r}}^{(0)}\) implies that only a few entries can surpass the tolerance threshold, meaning that only a few push operations are needed to drive them below the threshold again and obtain a good approximation of the evolved PageRank vector.
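A minimal sketch (ours) of this warm-restarted push procedure follows: it applies the update rules (5)-(6), pushing the residual entry of largest magnitude until all residuals fall below a user-defined threshold. The variable names, the threshold eps and the iteration cap are our own choices.

```python
import numpy as np

def push_update(pr_old, P_T_old, P_T_new, alpha, eps=1e-10, max_pushes=100000):
    """Gauss-Southwell push updating with the warm restart of Ohsaka et al. (2015):
    repeatedly push the largest-magnitude residual entry into the approximation vector."""
    p = pr_old.copy()                                   # approximation vector
    r = alpha * (P_T_new - P_T_old) @ pr_old            # residual vector (localized)
    for _ in range(max_pushes):
        u = np.argmax(np.abs(r))                        # residuals may be negative after a perturbation
        mass = r[u]
        if abs(mass) < eps:
            break
        p[u] += mass                                    # Eq. (5)
        r[u] -= mass                                    # Eq. (6), first two terms
        r += alpha * mass * P_T_new[:, u]               # Eq. (6), diffusion term
    return p
```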

3 Proposed method

3.1 PageRank computation via Chebyshev polynomials

In this subsection, we present the Chebyshev polynomials, which are a general technique to approximate matrix functions and form the basis of our algorithm derived in Section 3.2. As discussed in the introduction, the two main drawbacks of current local PageRank updating algorithms are that (i) they are slow to converge; and (ii) they are not adapted to update the novel generalizations of personalized PageRank which rely on more complex dynamical processes than random walks. Notably, the Chebyshev polynomials carry the potential to address these limitations because (i) they can be used to approximate the effect of general diffusion operators, thus covering the recent generalizations of PageRank in the literature; and (ii) they have been shown to converge faster than methods based on power-iteration and Gauss-Southwell rules when computing standard personalized PageRank from scratch (Bautista Ruiz 2019).

Algorithm 1 (figure)

In the context of signal processing on graphs, the Chebyshev polynomials were introduced in Shuman et al. (2011) as a mean to approximate functions of a graph matrix, achieving considerable success in the contexts of graph signal filtering (Cheng et al. 2019; Tseng and Lee 2021; Tian et al. 2014) and graph neural networks (Defferrard et al. 2016; Yan et al. 2021). They operate as follows: let \({\mathcal {R}}\) denote a graph matrix and \(h({\mathcal {R}})\) be a function of it, which is defined as the function being applied to its eigenvalues (spectrum). Then, Shuman et al. (2011) shows that \(h({\mathcal {R}})\) can be approximated by means of the truncated series:

$$\begin{aligned} h({\mathcal {R}}) \approx \frac{1}{2}c_0 + \sum _{t = 1}^K c_t {\bar{T}}_t({\mathcal {R}}), \end{aligned}$$
(7)

where

$$\begin{aligned} {\bar{T}}_t({\mathcal {R}}) = {\left\{ \begin{array}{ll} {\mathbb {I}}, &{} t = 0 \\ \frac{{\mathcal {R}} - \phi {\mathbb {I}}}{\phi }, &{} t = 1 \\ 2\left( \frac{{\mathcal {R}} - \phi {\mathbb {I}}}{\phi }\right) {\bar{T}}_{t-1}({\mathcal {R}}) - {\bar{T}}_{t-2}({\mathcal {R}}), &{} t \ge 2 \end{array}\right. } \end{aligned}$$
(8)

\(\phi = \lambda _{max}/2\), \(c_t = \frac{2}{\pi } \int _0^{\pi } \cos (t \theta )\, h(\phi (\cos (\theta )+1) )\, d\theta\), and \(\lambda _{max}\) is an upper bound on the spectrum of \({\mathcal {R}}\). Eq. (7) is known as the Chebyshev polynomial approximation of \(h({\mathcal {R}})\). The strength of the Chebyshev approximation lies in its coefficients \(c_t\): they are optimal in the \(\infty\)-norm sense (Tremblay et al. 2018), meaning that for a given computational budget K they approximate \(h({\mathcal {R}})\) with the least \(\infty\)-norm error. In contrast, the coefficients of power iteration are ruled by the spectral gap of \({\mathcal {R}}\), resulting in a polynomial that requires a large K for an equally good approximation (thus, it converges slowly). While Gauss-Southwell is not a polynomial expression, its convergence speed is inversely proportional to the target error, making it too costly when very small errors are required. Another asset of (7) is that it allows the product of \(h({\mathcal {R}})\) with a vector y to be approximated in a distributed fashion. To see this, recall that y can be interpreted as a signal that lives on the vertices of the graph encoded by \({\mathcal {R}}\). Thus, if nodes are given communication and computation capabilities, each node can compute its own value of the matrix-vector product \(h({\mathcal {R}}) y\) by transmitting and receiving messages (with the values of y) to and from its neighbors. This distributed algorithm is detailed in Algorithm 1.
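To make Eqs. (7)-(8) concrete, the following sketch (ours; a centralized stand-in for the message-passing procedure of Algorithm 1) approximates \(h({\mathcal {R}})y\) with numpy: the coefficients \(c_t\) are obtained by numerical quadrature and the Chebyshev recurrence is applied directly to the vector y, so each term costs one matrix-vector product. The function name and the quadrature resolution are our choices.

```python
import numpy as np

def chebyshev_apply(R, y, h, lam_max, K):
    """Approximate h(R) @ y with the order-K Chebyshev expansion of Eqs. (7)-(8).
    Assumes K >= 1 and that lam_max upper-bounds the spectrum of R."""
    phi = lam_max / 2.0
    # c_t = (2/pi) * integral_0^pi cos(t*theta) h(phi*(cos(theta)+1)) dtheta,
    # evaluated here with a simple rectangle-rule quadrature.
    theta = np.linspace(0.0, np.pi, 2000)
    h_vals = h(phi * (np.cos(theta) + 1.0))
    dtheta = theta[1] - theta[0]
    c = [(2.0 / np.pi) * np.sum(np.cos(t * theta) * h_vals) * dtheta for t in range(K + 1)]
    # Chebyshev recurrence of Eq. (8), applied directly to the vector y.
    T_prev = y                                   # T_0(R) y
    T_curr = (R @ y - phi * y) / phi             # T_1(R) y
    out = 0.5 * c[0] * T_prev + c[1] * T_curr
    for t in range(2, K + 1):
        T_next = 2.0 * (R @ T_curr - phi * T_curr) / phi - T_prev
        out = out + c[t] * T_next
        T_prev, T_curr = T_curr, T_next
    return out
```

For instance, calling this routine with the filter h = lambda lam: mu / (lam + mu) and the random walk Laplacian recovers the from-scratch PageRank computation discussed in Section 3.2.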

It is important to stress that, even though Bautista Ruiz (2019) shows that Chebyshev polynomials can converge to personalized PageRank vectors significantly faster than power iteration and Gauss-Southwell methods when computing from scratch, they have not been considered in the updating scenario because they do not offer the possibility of setting an initial guess.

3.2 Local PageRank updating via Chebyshev polynomials

Table 1 Possible choices of the reference operator \({\mathcal {R}}\) reported in the semi-supervised learning literature

In this subsection, we detail our main contribution: an algorithm based on Chebyshev polynomials that allows to locally update a large family of PageRank formulations. We start by noticing that classical personalized PageRank and its novel generalizations used in semi-supervised learning (Bautista et al. 2019; De Nigris et al. 2017; Mai and Couillet 2017; Zhou and Belkin 2011; Girault et al. 2014) can all be framed under one general equation where the only difference among methods lies in the matrix encoding the graph. To show this, let us introduce the change of variable \(\alpha = \frac{1}{\mu + 1}\). From Eq. (1), it is easy to see that personalized PageRank can be rewritten as the solution to: \(LD^{-1}\text {pr}_{\mu }(y) + \mu \text {pr}_{\mu }(y) = \mu y\), where \(LD^{-1} = ({\mathbb {I}} - P^T)\) is the so-called random walk Laplacian. This expression, which is a discrete partial differential equation, implies that classical PageRank can alternatively be interpreted as the equilibrium state of a dynamical process in which the dynamics are ruled by the operator \(LD^{-1}\). Notably, several of the novel generalizations of personalized PageRank admit the same interpretation, allowing their solutions to be expressed in the following general form:

$$\begin{aligned} {\mathcal {R}}\text {pr}_{\mu }(y) + \mu \text {pr}_{\mu }(y) = \mu y, \end{aligned}$$
(9)

where \({\mathcal {R}}\) denotes a generalized reference operator associated with the graph \({\mathcal {G}}\). Thus, the only difference among several personalized PageRank generalizations is the choice of the operator \({\mathcal {R}}\). In Table 1, we list some of the possible choices of \({\mathcal {R}}\) and the methods associated with them.
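For completeness, the algebraic step behind the rewriting of Eq. (1) into the form of Eq. (9) with \({\mathcal {R}} = LD^{-1}\) is the following (our expansion of the argument above, using \(\alpha = \frac{1}{\mu +1}\) and hence \(1-\alpha = \frac{\mu }{\mu +1}\)):

$$\begin{aligned} \left( {\mathbb {I}} - \tfrac{1}{\mu +1} P^T \right) \text {pr}_{\mu }(y)&= \tfrac{\mu }{\mu +1}\, y \\ \left( (\mu +1) {\mathbb {I}} - P^T \right) \text {pr}_{\mu }(y)&= \mu \, y \\ ({\mathbb {I}} - P^T)\, \text {pr}_{\mu }(y) + \mu \, \text {pr}_{\mu }(y)&= \mu \, y, \end{aligned}$$

which is Eq. (9) with \({\mathcal {R}} = LD^{-1} = {\mathbb {I}} - P^T\).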

Clearly, the advantage of Eq. (9) is that any algorithm that we derive based on \({\mathcal {R}}\) automatically covers a large family of PageRank methods. Therefore, the updating algorithm we propose in this work is an algorithm to update the solution of Eq. (9). To derive our updating algorithm, we start by noticing that the solution of Eq. (9), for an evolved graph \(\widetilde{{\mathcal {G}}}\), can be expressed as a matrix function of \(\widetilde{{\mathcal {R}}}\) in the following way

$$\begin{aligned} {\widetilde{pr}}_{\mu }(y) = \mu \left( \widetilde{{\mathcal {R}}} + \mu {\mathbb {I}}\right) ^{-1}y, \end{aligned}$$
(10)

where the function is thus given by

$$\begin{aligned} h(\widetilde{{\mathcal {R}}}) = \mu \left( \widetilde{{\mathcal {R}}} + \mu {\mathbb {I}}\right) ^{-1}. \end{aligned}$$
(11)

While Eq. (10) can be leveraged to efficiently compute \({\widetilde{pr}}_{\mu }(y)\) from scratch (for instance via the Chebyshev polynomials), it is not useful in an updating scenario because it does not allow an initial guess to be set. Therefore, our first goal is to derive a recursive equation that converges to (9), which we can then use to set \(\text {pr}_{\mu }(y)\) as an initial guess. We stress that the natural approach of developing Eq. (10) in its geometric series is not useful for our purposes, as such an approach results in a recursive expression \({\widetilde{p}}^{(t)} = y - \frac{1}{\mu } \widetilde{{\mathcal {R}}} {\widetilde{p}}^{(t-1)}\) that only converges to \({\widetilde{\text {pr}}}_{\mu }(y)\) when \(\mu > \lambda _{max}\). To obtain a recursive equation that converges for all \(\mu > 0\), hence for all \(\alpha \in (0, 1)\), we perform two changes: firstly, we transform the operator \(\widetilde{{\mathcal {R}}}\) into an equivalent operator \(\widetilde{{\mathcal {S}}}\) whose eigenvalues are normalized to the range \([-1, 1]\); secondly, we apply the same transformation to \(h(\widetilde{{\mathcal {R}}})\) in order to find its equivalent \(h(\widetilde{{\mathcal {S}}})\). The operator \(\widetilde{{\mathcal {S}}}\) is simply obtained by means of the following transformation:

$$\begin{aligned} \widetilde{{\mathcal {S}}} = \left( 2/\lambda _{max}\right) \widetilde{{\mathcal {R}}} - {\mathbb {I}}. \end{aligned}$$
(12)

Now, let us refer to the diagonal matrices of eigenvalues of \(\widetilde{{\mathcal {R}}}\) and \(\widetilde{{\mathcal {S}}}\) by \(\Lambda\) and s, respectively. Then, by expressing (11) in terms of \(\Lambda\) and applying the same transformation to \(\Lambda\) as in (12), we obtain the expression for h(s), and hence \(h(\widetilde{{\mathcal {S}}})\), as follows:

$$\begin{aligned} h(\Lambda )&= \frac{\mu }{\Lambda + \mu } \end{aligned}$$
(13)
$$\begin{aligned}&= \frac{\mu }{ \left( \lambda _{max}/2\right) (s + 1) + \mu } \end{aligned}$$
(14)
$$\begin{aligned}&= \frac{2 \mu / \lambda _{max} }{s + 1 + \left( 2\mu / \lambda _{max}\right) } = h(s), \end{aligned}$$
(15)

where, for the sake of clarity, we have expressed matrix inversion in the form of division. The normalized spectrum of \(\widetilde{{\mathcal {S}}}\) allows us to then develop the geometric series of \(h(\widetilde{{\mathcal {S}}})\) to obtain the following recursive expression that converges for all \(\mu > 0\):

$$\begin{aligned} h(\widetilde{{\mathcal {S}}}) = \left( \frac{2\mu }{2\mu + \lambda _{max}} \right) \sum _{t = 0}^{\infty } \left( { - \frac{\lambda _{max}}{2\mu + \lambda _{max}}} \right) ^t \widetilde{{\mathcal {S}}}^t \end{aligned}$$
(16)
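For a scalar eigenvalue s, this series follows from Eq. (15) by factoring the denominator and expanding the resulting geometric series (our expansion of the step above):

$$\begin{aligned} h(s) = \frac{2\mu /\lambda _{max}}{s + 1 + 2\mu /\lambda _{max}} = \frac{2\mu }{2\mu + \lambda _{max}} \cdot \frac{1}{1 + \frac{\lambda _{max}}{2\mu + \lambda _{max}}\, s} = \frac{2\mu }{2\mu + \lambda _{max}} \sum _{t=0}^{\infty } \left( - \frac{\lambda _{max}}{2\mu + \lambda _{max}} \right) ^t s^t, \end{aligned}$$

with convergence guaranteed since \(|s| \le 1\) and \(\lambda _{max}/(2\mu + \lambda _{max}) < 1\) for all \(\mu > 0\).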

Eq. (16) highlights our restriction to undirected graphs: if the spectrum is complex, then the recursion is not guaranteed to converge. We thus leave the extension of (16) to directed graphs as future work. Finally, by setting

$$\begin{aligned} \rho = \frac{2\mu }{2\mu + \lambda _{max}}, \end{aligned}$$
(17)
$$\begin{aligned} \psi = - \frac{\lambda _{max}}{2\mu + \lambda _{max}}, \end{aligned}$$
(18)

and applying (16) to y, we obtain the following recursive equation

$$\begin{aligned} {\widetilde{p}}^{(t)} = \rho y + \psi \widetilde{{\mathcal {S}}} {\widetilde{p}}^{(t-1)} \end{aligned}$$
(19)

that converges to \({\widetilde{\text {pr}}}_{\mu }(y)\) as \(t \rightarrow \infty\). Clearly, Eq. (19) allows us to set \({\widetilde{p}}^{(0)} = \text {pr}_{\mu }(y)\) and drive the trajectory of the recursion close to the exact evolved PageRank, reducing the number of iterations towards convergence.
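As a minimal illustration (ours, with hypothetical variable names), the warm-restarted recursion (19) reads as follows, where S_new stands for \(\widetilde{{\mathcal {S}}}\) and pr_old for the previous PageRank vector \(\text {pr}_{\mu }(y)\):

```python
import numpy as np

def warm_restart_recursion(S_new, y, mu, lam_max, pr_old, n_iter=50):
    """Iterate Eq. (19), p <- rho*y + psi*S_new p, starting from the previous
    PageRank vector as initial guess (warm restart)."""
    rho = 2 * mu / (2 * mu + lam_max)        # Eq. (17)
    psi = -lam_max / (2 * mu + lam_max)      # Eq. (18)
    p = pr_old.copy()
    for _ in range(n_iter):
        p = rho * y + psi * (S_new @ p)
    return p
```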

However, Eq. (19) is not fully satisfactory. Firstly, using it to update a PageRank vector involves sending messages across the entire network, since \({\widetilde{p}}^{(0)} = \text {pr}_{\mu }(y)\) is dense, even though the update mostly takes place in the perturbed graph region. Secondly, it converges at the speed of power iteration, which is slow. To address these issues, we extend the result of Yoon et al. (2018) in Eq. (3) to a large family of PageRank methods by means of the following Lemma. For notational clarity, we denote the convergent state of (19) by \({\widetilde{\text {pr}}}_{\rho , \psi }(y) = {\widetilde{p}}^{(\infty )}\).

Lemma 1

Given a fixed set of coefficients \(\rho\), \(\psi\), and initial condition y, we have that

$$\begin{aligned} {\widetilde{\text {pr}}}_{\rho , \psi }(y) = \text {pr}_{\rho , \psi }(y) + \frac{1}{\rho } {\widetilde{\text {pr}}}_{\rho , \psi }(r) \end{aligned}$$
(20)

where

$$\begin{aligned} r = \psi \left[ \widetilde{{\mathcal {S}}} - {\mathcal {S}}\right] \text {pr}_{\rho , \psi }(y) \end{aligned}$$
(21)

Proof

We start with a warm restart in recursion (19) as follows:

$$\begin{aligned} {\widetilde{p}}^{(1)}&= \rho y + \psi \widetilde{ {\mathcal {S}} } {\widetilde{p}}^{(0)} \end{aligned}$$
(22)
$$\begin{aligned}&= \rho y + \psi \widetilde{ {\mathcal {S}} } \text {pr}_{\rho , \psi }(y) \end{aligned}$$
(23)
$$\begin{aligned}&= \text {pr}_{\rho , \psi }(y) - \psi {\mathcal {S}} \text {pr}_{\rho , \psi }(y) + \psi \widetilde{ {\mathcal {S}} } \text {pr}_{\rho , \psi }(y) \end{aligned}$$
(24)
$$\begin{aligned}&=\text {pr}_{\rho , \psi }(y) + \psi \left[ \widetilde{ {\mathcal {S}} } - {\mathcal {S}} \right] \text {pr}_{\rho , \psi }(y) \end{aligned}$$
(25)
$$\begin{aligned}&=\text {pr}_{\rho , \psi }(y) + r \end{aligned}$$
(26)

Then, for the second iteration, we have

$$\begin{aligned} {\widetilde{p}}^{(2)}&= \rho y + \psi \widetilde{ {\mathcal {S}} } {\widetilde{p}}^{(1)} \end{aligned}$$
(27)
$$\begin{aligned}&= \rho y + \psi \widetilde{ {\mathcal {S}} } \left( \text {pr}_{\rho , \psi }(y) + r \right) \end{aligned}$$
(28)
$$\begin{aligned}&= \text {pr}_{\rho , \psi }(y) - \psi {\mathcal {S}} \text {pr}_{\rho , \psi }(y) + \psi \widetilde{ {\mathcal {S}} } \left( \text {pr}_{\rho , \psi }(y) + r \right) \end{aligned}$$
(29)
$$\begin{aligned}&= \text {pr}_{\rho , \psi }(y) + \psi \left[ \widetilde{ {\mathcal {S}} } - {\mathcal {S}} \right] \text {pr}_{\rho , \psi }(y) + \psi \widetilde{ {\mathcal {S}} } r \end{aligned}$$
(30)
$$\begin{aligned}&= \text {pr}_{\rho , \psi }(y) + r + \psi \widetilde{ {\mathcal {S}} } r \end{aligned}$$
(31)
$$\begin{aligned}&= \text {pr}_{\rho , \psi }(y) + \sum _{t = 0}^{1} \psi ^t \widetilde{ {\mathcal {S}} }^t r \end{aligned}$$
(32)

By successive applications of this procedure, we have that

$$\begin{aligned} {\widetilde{p}}^{(\infty )}&= \text {pr}_{\rho , \psi }(y) + \sum _{t = 0}^{\infty } \psi ^t \widetilde{ {\mathcal {S}} }^t r \end{aligned}$$
(33)
$$\begin{aligned}&= \text {pr}_{\rho , \psi }(y) + \frac{1}{\rho }{\widetilde{pr}}_{\rho , \psi }(r) \end{aligned}$$
(34)

\(\square\)

Lemma 1 has several implications. Firstly, it states that updating a PageRank vector amounts to computing another PageRank vector with an initial distribution r that is completely localized (nonzero only) in the 1-hop vicinity of the nodes that changed between \({\mathcal {G}}\) and \(\widetilde{{\mathcal {G}}}\). Secondly, it shows that the size of the update is upper bounded by the size of the perturbation. That is, it shows that \(\Vert {\widetilde{\text {pr}}}_{\rho , \psi }(y) - \text {pr}_{\rho , \psi }(y) \Vert\) is upper bounded by \(\Vert {\widetilde{\text {pr}}}_{\rho , \psi }(r) \Vert\) and thus by \(\Vert \widetilde{{\mathcal {S}}} - {\mathcal {S}}\Vert\). Thirdly, since computing \({\widetilde{\text {pr}}}_{\rho , \psi }(r)\) involves diffusing r through the graph, the locality of r implies that only a few messages among the nodes in the affected region are enough to update the affected nodes. Fourthly, it makes clear that it is not necessary to use the slow recursive equation (19) to perform the update. Instead, \({\widetilde{\text {pr}}}_{\rho , \psi }(r)\) can be computed more efficiently by means of the Chebyshev polynomials. Therefore, our proposed algorithm consists in leveraging Eq. (20) and approximating \({\widetilde{\text {pr}}}_{\rho , \psi }(r)\) by means of Chebyshev polynomials. It is summarized in Algorithm 2.

Algorithm 2 (figure)
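As a rough illustration of the complete procedure (our sketch, not the authors' reference implementation of Algorithm 2), the following code instantiates Lemma 1 for classical PageRank, i.e. \({\mathcal {R}} = LD^{-1}\) with \(\lambda _{max} = 2\), and reuses the hypothetical chebyshev_apply routine sketched in Section 3.1 to diffuse the localized residual:

```python
import numpy as np

def rw_laplacian(W):
    """R = L D^{-1} = I - W D^{-1} for an undirected graph; isolated nodes keep zero rows."""
    d = W.sum(axis=1)
    d_inv = np.divide(1.0, d, out=np.zeros_like(d), where=d > 0)
    return np.diag(d * d_inv) - W @ np.diag(d_inv)

def update_pagerank(W_old, W_new, pr_old, mu, K=15):
    """Local update of Eq. (20): pr_new = pr_old + (1/rho) * pr_new(r),
    where pr_new(r) is approximated with the Chebyshev sketch of Section 3.1."""
    lam_max = 2.0                                  # upper bound on the spectrum of L D^{-1}
    rho = 2 * mu / (2 * mu + lam_max)              # Eq. (17)
    psi = -lam_max / (2 * mu + lam_max)            # Eq. (18)
    R_old, R_new = rw_laplacian(W_old), rw_laplacian(W_new)
    # Eq. (21): r = psi (S_new - S_old) pr_old, with S = (2/lam_max) R - I (Eq. 12),
    # hence S_new - S_old = (2/lam_max)(R_new - R_old); r is nonzero only near the change.
    r = psi * (2.0 / lam_max) * ((R_new - R_old) @ pr_old)
    h = lambda lam: mu / (lam + mu)                # filter of Eq. (11)
    pr_of_r = chebyshev_apply(R_new, r, h, lam_max, K)
    return pr_old + pr_of_r / rho                  # Eq. (20)
```

For example, with \(\mu = 1\) (i.e. \(\alpha = 0.5\)) and pr_old computed on the old graph, a call such as pr_new = update_pagerank(W_old, W_new, pr_old, mu=1.0, K=15) returns the updated vector; only the entries near the perturbation change appreciably.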

We conclude the section by highlighting that the generality of the Chebyshev polynomials and our derivations pave the way to exploring updating algorithms for other centrality metrics, such as the recently proposed \(\Psi\)-score measuring influence in social networks (Giovanidis et al. 2021). Yet, this calls for a deep study of the spectral properties of such centrality metrics, which goes beyond the scope of this work. We thus leave the extension of our results to other centrality measures as future work.

4 Numerical evaluation

Goals. In this section, we evaluate the performance of the proposed algorithm. In particular, our goals are as follows: (i) to assess the performance gains obtained by the algorithm with respect to computing PageRank from scratch using the Chebyshev polynomials; (ii) to demonstrate that the proposed algorithm can be used to update both standard and generalized PageRank vectors; (iii) to assess how the performance of the algorithm degrades as perturbations grow in size; (iv) to compare the proposed algorithm with the state-of-the-art alternatives; and (v) to evaluate the performance of the algorithm in a tracking scenario where a PageRank vector needs to be updated during a long period of time.

Metrics. We assess performance in terms of the number of messages that need to be exchanged in order to approximate the evolved PageRank vector within a specified relative error (in the \(\ell _2\)-norm sense). Since each node can update its PageRank score in constant time once it receives its neighbors' values, the complexity of our algorithm is governed by the number of messages transmitted and the time it takes to transmit and receive them. Of these, only the number of message exchanges is determined by our algorithm, since the transmission time depends on the technology on which the algorithm is implemented. We thus consider the number of message exchanges a better metric than the routine's running time.

Data. We perform our experiments on the Tech-AS-Topology temporal network (Rossi and Ahmed 2015), which is a real-world network from an autonomous system with 34.8K nodes and 171.4K edges organized in 32.8K graph snapshots. This network contains both sparse and dense regions, meaning that perturbations can affect small or large regions. For the experiments, we pre-process the data by turning the snapshots into undirected graphs, resulting in a total of 215.4K timestamped edges. The first graph snapshot in the sequence contains 32K nodes and 111.6K edges. Then, in successive snapshots, new edges attach to the network, either linking nodes already present or connecting new nodes joining the graph. The dataset only contains edge additions; therefore, we simulate an edge deletion setting (see Experiment 4) by reversing the time axis: we consider the originally last snapshot as the new first one and the originally first snapshot as the new last one. From this perspective, a new snapshot causes edges to disappear or nodes to leave.

4.1 Experiment 1

Fig. 1 Experiment 1. Performance of proposed algorithm vs a computation from scratch. We assess approximation error as larger computational budgets are given to the algorithms. We update both standard PageRank and a generalized PageRank used in semi-supervised learning (Bautista et al. 2019)

Fig. 2 Experiment 2. Sensitivity of proposed algorithm to perturbations. We assess the number of messages required to update within an error of \(1\times 10^{-13}\) as the number of added edges grows. The alternative from scratch (same conditions) is used as reference

In our first experiment, we address goals (i) and (ii). For this, we fix a small graph perturbation. Then, as we vary the allowed number of messages, we measure how well Algorithm 2 approximates the true PageRank of the evolved graph. For comparison purposes, we apply the same test to a computation from scratch using Algorithm 1. To show that the proposed algorithm can update generalized PageRank formulations as well as standard PageRank, we employ it to update standard PageRank vectors and the recently proposed \(L^\gamma\)-PageRank vectors from Bautista et al. (2019). For this experiment, we use the first snapshot from the Tech-AS-Topology network as the initial graph. Then, we use the second snapshot of the network as the perturbation: it contains 120 new edges and 1 new node joining the graph. We choose a vertex at random and use its indicator function as the initial distribution y (a common setting in local graph clustering). We use \(\alpha = 0.5\), measure relative error in the \(\ell _2\) sense, and repeat the experiment for 20 realizations of y.

Results of Experiment 1 are displayed in Figure 1. Figure 1a shows the result of updating standard PageRank, while Figure 1b depicts the result of updating the generalized \(L^\gamma\)-PageRank (Bautista et al. 2019). They show that the proposed algorithm successfully updates both standard and generalized PageRank vectors. In both cases, the proposed algorithm offers significant approximation improvements compared to computing from scratch. We additionally verify that our updating algorithm converges at the same rate as the Chebyshev polynomials from scratch. This resolves a tradeoff that needs to be made with current updating algorithms: they are a good option if only a few iterations are allowed, but their slow convergence makes them worse than Chebyshev polynomials from scratch if several iterations are needed (Bautista Ruiz 2019). Lastly, we notice that the error bars (standard error) are negligible, indicating that the performance of the algorithm is insensitive to the choice of initial seed y and, consequently, to the particular PageRank vector being updated.

4.2 Experiment 2

In our second experiment, we address goal (iii). For this, we fix a target approximation error. Then, as we vary the perturbation size, we measure the number of messages required by Algorithm 2 to approximate the true PageRank of the evolved graph within the specified error. For comparison purposes, we apply the same test to a computation from scratch using Algorithm 1. Since larger perturbations imply larger updates, our proposed algorithm should be highly sensitive to the size of perturbations. On the other hand, a computation from scratch should only increase its number of messages proportionally to \(\vert \widetilde{{\mathcal {E}}} \vert - \vert {\mathcal {E}} \vert\). Therefore, we aim to empirically identify the point where the required update is so large that our updating algorithm offers no benefit over a computation from scratch. For this experiment, we use the first snapshot from the Tech-AS-Topology network as the initial graph. Then, we control the size of the perturbation by aggregating an increasingly larger number of subsequent snapshots. We set y as the indicator function of a random vertex and update its associated standard PageRank vector. We use \(\alpha = 0.5\) and set the error at \(1\times 10^{-13}\).

Results of Experiment 2 are displayed in Figure 2. For small perturbations, the number of messages needed by our algorithm to attain the desired error is small. This number increases as the perturbation grows in size, reaching a point where the updating algorithm does not provide any advantage with respect to a computation from scratch. For the Tech-AS-Topology network, this operational limit occurs for a perturbation of around 4000 new edges, which corresponds to roughly \(3\%\) of the edges from the initial graph. This confirms that our updating algorithm should preferably be used when the changes in the graph are small.

4.3 Experiment 3

In our third experiment, we address goal (iv). For this, we fix a graph perturbation. Then, as we vary the target approximation error, we measure the number of message exchanges that our proposed Algorithm 2 and the state-of-the-art alternatives (Yoon et al. 2018; Ohsaka et al. 2015) require to approximate the true PageRank of the evolved graph within the specified error. We stress that our algorithm and the RWR one (Yoon et al. 2018) are distributed and can thus be assessed in terms of transmitted messages. Yet, the push algorithm (Ohsaka et al. 2015) is centralized and normally studied in terms of push operations rather than messages. We notice that the complexity of each push operation is dominated by a matrix-vector multiplication that can be interpreted as transmitted messages; thus, we track this quantity for the push method. For this experiment, we use the first snapshot from the Tech-AS-Topology network as the initial graph. Then, we use the second snapshot of the network as the perturbation. We set y as the indicator function of a random vertex and update its associated standard PageRank vector. We use \(\alpha = 0.5\).

Results of Experiment 3 are displayed in Fig. 3. The proposed algorithm outperforms state-of-the-art alternatives for local PageRank updating, being able to attain any desired relative approximation error with significantly fewer messages. Indeed, for approximation errors in the order of \(10^{-14}\), which are the best approximations we can obtain using Python's float64 data types, the proposed algorithm requires roughly \(35\%\) fewer messages than the second best method, RWR. We notice that the push algorithm is not competitive for very precise approximations, as the number of message operations it requires quickly becomes large.

4.4 Experiment 4

Fig. 3 Experiment 3. Comparison of proposed algorithm with state-of-the-art alternatives. We assess the number of messages required to update a PageRank vector within an increasingly smaller approximation error

Fig. 4 Experiment 4. Performance of proposed algorithm in a tracking setting. We update a PageRank vector during a long period of time using \(K = 15\) fixed communication rounds. We measure error with respect to the exact PageRank of each snapshot. The alternative from scratch (same conditions) is used as reference. We consider both edge arrival and edge removal settings

In our fourth experiment, we address goal (v). For this, we fix a number of communication rounds (K). Then, as new graph snapshots arrive, we estimate the PageRank vector of the current snapshot by updating the PageRank vector estimated for the previous snapshot using Algorithm 2. To demonstrate that our algorithm handles edge additions and deletions equally well, we run the experiment in both settings. Since the data only contain edge additions, we simulate deletions by reversing the time axis, meaning that we start from the evolved network and run backwards to the original one. For comparison purposes, we estimate the exact PageRank of the current snapshot via a computation from scratch using Algorithm 1 under the same number of communication rounds (K). We stress that this is an extremely challenging task for the updating algorithm because the vector to update is no longer the exact PageRank vector of the previous snapshot but an approximation of it, meaning that errors accumulate over time. Therefore, we aim to determine empirically whether our method can maintain a PageRank vector over a long time horizon or whether it soon becomes worse than the approximation obtained by the method from scratch. For this experiment, we use the aggregated first 100 snapshots from the Tech-AS-Topology network as the initial graph. Then, we track the standard PageRank vector during the following 1000 snapshots (reversed for edge removals). Only the initial graph is given its exact PageRank vector. We set y as the indicator function of a random vertex, \(K = 15\) and \(\alpha = 0.5\).

Results of Experiment 4 are displayed in Figure 4. The upper panel shows the relative error between the tracked vectors and the true PageRank of each snapshot when edge additions are considered, while the bottom panel depicts the same quantities for the case of edge deletions. The size of the perturbation in each new snapshot is shown as additional information in both panels. In both cases, the proposed algorithm is able to effectively track the PageRank vector during the entire time horizon. For early times, the updating algorithm returns extremely precise approximations: up to five orders of magnitude improvement with respect to a computation from scratch with the same computational budget. Then, errors steadily accumulate. However, they accumulate at a sufficiently slow rate that, during the 1000 snapshots, the tracked vector remains at least two orders of magnitude closer to the exact PageRank than the alternative from scratch.

5 Conclusion

We proposed a Chebyshev polynomial-based distributed algorithm to locally update personalized PageRank vectors. We showed that the proposed algorithm converges faster than state-of-the-art alternatives, bringing us closer to the goal of effortlessly maintaining PageRank vectors in real-world networks. Additionally, the algorithm can be used to update more general formulations of PageRank recently proposed in the context of semi-supervised learning. These improvements were made possible by a novel updating equation that encompasses a large family of PageRank formulations and makes it straightforward to employ Chebyshev polynomials to locally solve the updating challenge. Numerical evaluations showed that the proposed algorithm is an effective tool for tracking PageRank vectors over long periods of time when changes in the graph are small. An interesting prospective work would be to extend these results to directed graphs, whose operators may have non-real eigenvalues, and to the case in which the PageRank initial distribution and teleportation parameters also evolve over time.