1 Introduction

Social networks are an integral part of social analysis, because they play an important role in the spread of, e.g., information, innovation, or purchase decisions. A social network is defined as a graph with actors (or groups of actors) corresponding to nodes, and arcs corresponding to interactions between actors. Interactions may represent different concepts such as friendship, mentor-apprentice, one- or two-way communication, and so on. Recent years have witnessed growing interest in the definition and study of mathematical models to represent the propagation of influence – broadly defined – in a social network, as well as in the identification of the actors that can play an important role in facilitating such propagation. This paper concerns the identification of such actors.

The influence maximization problem is defined on a graph with an associated diffusion process that models the spread of influence on the graph. A node is defined as activated if it is affected by the diffusion process. A subset of the nodes are selected as seeds, and their role is to initialize the diffusion process. Influence propagates in a breadth-first manner starting from the seeds. Several rules can be used to model the activation of a node. A commonly used model associates an activation threshold to each node, and a nonnegative weight to each arc representing the strength of the interaction; this paper uses such a model. The condition under which a node is activated by its neighbors is often described by an activation function, several types of which are discussed in the literature. The Influence Maximization Problem (IMP) is defined as the problem of identifying a subset of nodes of a given cardinality that maximizes the number of nodes activated at the end of the influence propagation process.

Literature review The idea of identifying the set of nodes that maximizes influence on a network dates back to Domingos and Richardson [9] and Richardson and Domingos [23]. Several variants of the IMP have been presented in the literature; we refer to the recent surveys Banerjee et al. [2], Kempe et al. [17] and Li et al. [19] for an extensive analysis of these variants.

The majority of the literature models the diffusion of influence on the graph using a threshold model or a cascade model, see e.g., Kempe et al. [16]. In the threshold model, a node becomes active if and only if a function of the weights on arcs incoming from activated neighbors is larger than the node threshold. In the cascade model, a node becomes active if at least one of its neighbors is activated. We further distinguish between deterministic, stochastic and robust models. In deterministic models, the graph parameters (i.e., weights and thresholds) are given and immutable. In stochastic models, some of them are random variables and we optimize the expected number of activated nodes. In robust models, some parameters are uncertain and we optimize the worst case over a given uncertainty set (several comprehensive surveys on robust optimization are available in the literature, see e.g., Ben-Tal and Nemirovski [4], Bertsimas et al. [6], Buchheim and Kurtz [7], Li et al. [20]). The distinction between deterministic, stochastic, and robust models is crucial, because the stochastic version of the problem leads to a monotone submodular maximization problem under reasonable assumptions Mossel and Roch [22]. Hence, it admits an efficient \((1 - \frac{1}{e})\)-approximation using a greedy algorithm, see e.g., Kempe et al. [16], Mossel and Roch [22]. The deterministic and robust versions are not known to admit such an approximation in general, and they tend to be much harder to solve in practice (but see Chen et al. [8] for an approximation algorithm for a robust version of IMP that holds under some conditions, and still requires submodularity).

Kempe et al. [16, 17] study the greedy approach for the threshold and cascade models in the stochastic setting. Wu and Küçükyavuz [26] propose an exact cutting plane approach for the same class of models, using strong optimality cuts that exploit submodularity. Chen et al. [8], He and Kempe [15] present a greedy algorithm for a robust version of the cascade model, optimizing a measure of regret regarding the set of chosen seeds.

Among the variants of IMP, we mention the Target Set Selection Problem (TSSP) Ackerman et al. [1], the (Generalized) Least Cost Influence Problem (GLCIP) Fischetti et al. [12], Gunnec [14], and the Technology Diffusion Problem (TDP) Goldberg and Liu [13], Könemann et al. [18]. The TSSP looks for the minimum-cost set of seed nodes that guarantees activation of a given percentage of the total number of nodes. The TSSP and the IMP are in some sense two different formulations for the same problem Ackerman et al. [1]: in TSSP, the total number of activated nodes is a constraint and the number of seed nodes is the objective function, while for IMP it is the other way around. GLCIP is a generalization of TSSP that allows incentives that decrease node activation thresholds at a cost Fischetti et al. [12]. Both Ackerman et al. [1] and Fischetti et al. [12] use integer programming formulations of exponential size. In the TDP, a node activates if it is adjacent to a connected component of active nodes of size at least equal to its threshold. The goal is to find a seed set whose initial activation would trigger a cascade activating the entire graph. Goldberg and Liu [13] propose an \(O(rl \log (n))\)-approximation algorithm, where r is the diameter of the given graph, and l is the number of distinct thresholds used in the instance. This result is improved in Könemann et al. [18] to an \(O(\min \{r, l\} \log (n))\)-approximation algorithm; to the best of our knowledge, exact algorithms for TDP are not discussed in the literature.

Contributions of this paper We present an exact algorithm for the deterministic and robust IMP assuming a linear threshold model. The algorithm that we propose is based on a mixed-integer linear program (MILP) and Branch-and-Cut. The model of uncertainty for the robust IMP is akin to the \(\varGamma \)-robustness of Bertsimas and Sim [5]: we assume that the node activation thresholds and the arc weights are allowed to vary within a certain range, but the total amount of deviation from the nominal problem data is limited by two robustness parameters (one for node thresholds, one for arc weights); our goal is to choose seeds so as to optimize the total influence on the graph, assuming the worst-case realization of the problem data allowed by the given robustness parameters. It is known that the influence maximization problem is NP-hard Kempe et al. [17]; we show that computing the total influence under our robust model, given the set of seeds, is NP-hard as well. We study properties of the robust solutions, formalizing the intuition that the price of robustness is higher for changes in the node thresholds than for changes in the arc weights. To the best of our knowledge, this is the first time that an exact algorithm for a robust version of IMP is proposed in the literature. Furthermore, even the non-robust version of the MILP used in this paper is novel. Our algorithm for the robust IMP is not simply the application of the ideas of Bertsimas and Sim [5] to the non-robust model: indeed, for reasons that will become apparent after discussing the mathematical model for IMP in more detail, it is not clear how to apply the procedure of Bertsimas and Sim [5] to our model. We therefore propose a full Branch-and-Cut algorithm that computes, at each node, a valid bound for the robust problem, and progressively refines such bound.

The MILP that we propose for IMP originates from a bilevel formulation of the problem, where the inner problem (a linear problem with an exponential number of constraints and a provably integer optimum) is dualized, leading to a quadratic problem with binary and linear variables. This formulation is linearized with the use of indicator constraints. The final model contains an exponential number of variables. We show that they can be generated within a column generation framework. The number of variables depends on the density of the graph and the arc weights. To make the problem robust, we use valid dual bounds that are progressively refined using the solution of a sub-MILP.

We test the proposed Branch-and-Cut algorithm on a set of instances comprising social network graphs taken from Fischetti et al. [12]. We show that our integer programming formulation for the non-robust model is competitive with the exact algorithm of Fischetti et al. [12] for the related GLCIP problem, and we are able to solve the robust IMP to optimality for instances of similar size (\(\le 100\) nodes). Furthermore, we implement a heuristic Branch-Cut-and-Price algorithm that works with a subset of the columns (i.e., decision variables). We apply this heuristic to larger graphs (up to 5000 nodes), which are out of reach for our implementation of the exact algorithm because the set of all columns is prohibitively large. We compare the heuristic Branch-Cut-and-Price to a 2-opt heuristic, and show that it consistently finds significantly better solutions within the same amount of time.

The rest of this paper is organized as follows. Section 2 formally describes the robust IMP. Section 3 discusses the computational complexity of the problem and some properties of the robust solution. Section 4 presents a new formulation for the deterministic version of the IMP. Section 5 extends the formulation to the robust case, proposing a Branch-Cut-and-Price algorithm for its solution. Section 6 gives implementation details about the algorithm presented in the previous section and introduces a heuristic version of the Branch-Cut-and-Price. Section 7 provides a numerical evaluation of the proposed methodology and Sect. 8 concludes the paper.

2 Problem formulation

To formulate the problem of maximizing the influence on a graph \(G = (V,E)\), we start by considering the problem of computing the amount of influence spread once the activation seeds are given. Assume w.l.o.g. that \(V = \{1,\dots ,n\}\), and denote by \(\delta ^-(j)\) and \(\delta ^+(j)\) the instar and outstar of node j. In the rest of this paper, we will use \(y \in \{0,1\}^{n}\) as the incidence vector of the seeds.

Definition 1

Given seeds y, a vector of nonnegative node thresholds t, and nonnegative arc weights w, we define the set of active nodes as the set returned by Algorithm 1, and the corresponding influence as its cardinality.

Notice that the main influence diffusion loop is repeated n times, but clearly it can be stopped at iteration \(k < n\) if no node is added to \(A_k\). A summary of this notation, as well as all the notation used in this paper, can be found in “Appendix A”.

Algorithm 1 InfluenceSpread\(({\bar{y}}, t, w)\)
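Since the pseudocode of Algorithm 1 is not reproduced here, the following Python sketch (our own illustrative rendering; the function name and data structures are assumptions, not the authors' implementation) shows the breadth-first linear threshold diffusion just defined:

```python
def influence_spread(seeds, t, w, n):
    """Linear threshold diffusion over a graph with nodes 1..n.

    seeds: set of seed nodes; t: node thresholds (dict node -> float);
    w: arc weights (dict (i, j) -> float); returns the active set A_n.
    """
    active = set(seeds)                 # the seeds initialize the process
    for _ in range(n):                  # at most n rounds are ever needed
        newly_active = set()
        for j in range(1, n + 1):
            if j in active:
                continue
            # total weight on arcs entering j from currently active nodes
            incoming = sum(w.get((i, j), 0.0) for i in active)
            if incoming >= t[j]:        # linear threshold activation rule
                newly_active.add(j)
        if not newly_active:            # early stop: no node was added
            break
        active |= newly_active
    return active

# Path 1 -> 2 -> 3 with unit weights and thresholds: seeding node 1 activates all.
assert influence_spread({1}, {1: 1.0, 2: 1.0, 3: 1.0},
                        {(1, 2): 1.0, (2, 3): 1.0}, 3) == {1, 2, 3}
```

The early-exit test implements the observation made below: the main loop can stop as soon as an iteration adds no new node.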

The activation function used in Algorithm 1 is known as the linear threshold model. If the node activation thresholds t and arc weights w are given as input, the model is deterministic. This paper studies a robust counterpart of the linear threshold model, in which the activation threshold \(t_j\) of each node j can deviate by some fraction \(\varDelta _N\) from its nominal value, and the total amount of threshold variations is upper bounded by a given number \(B_N\). Similarly, each arc weight \(w_{ij}\) can deviate by some fraction \(\varDelta _A\) from its nominal value, and the total amount of weight variations is upper bounded by \(B_A\). Because we want to optimize over a worst-case scenario, the activation thresholds can only increase and the arc weights can only decrease. Throughout the rest of the paper, we assume w.l.o.g. that \(\varDelta _N \in [0, \infty )\) and \(\varDelta _A \in [0, 1]\). Given a vector of seeds \({\bar{y}} \in \{0,1\}^{n}\), the total amount of influence that spreads on the graph under this robust setting is the optimum of the following problem:

$$\begin{aligned} \left. \begin{array}{rrcl} \text {RI}_{x, \theta , \varphi }({\bar{y}}) := &{} \min _{x,\theta ,\varphi } \sum _{j \in V} x_j \\ \forall j \in V &{} \sum _{i \in \delta ^-(j)} (w_{ij} x_i - \varphi _{ij}) -\theta _j + \epsilon - M x_j &{}\le &{} t_j \\ \forall j \in V &{} x_j &{}\ge &{} {\bar{y}}_j \\ &{} \sum _{j \in V} \theta _j &{}\le &{} B_{N}\\ \forall j \in V &{} 0 \le \theta _j &{}\le &{} \varDelta _{N} t_j \\ &{} \sum _{i,j \in V, i\ne j} \varphi _{ij} &{}\le &{} B_{A}\\ \forall i,j \in V, i\ne j &{} 0 \le \varphi _{ij} &{}\le &{} \varDelta _{A} w_{ij} x_i \\ \forall j \in V &{} x_j \in \{0,1\}, \end{array} \right\} \end{aligned}$$
(1)

where a tight big-M coefficient can be computed as \(\sum _{i \in \delta ^-(j)} w_{ij} + \epsilon \). In the above formulation, \(\epsilon \) is a small enough value ensuring that if \(x_j = 0\) then \(\sum _{i \in \delta ^-(j)} w_{ij} x_{i} < t_j\). For example, if the node thresholds and arc weights are decimal numbers with d digits of precision after the decimal point, one can pick \(\epsilon = 10^{-d}\), because the difference between the two sides of the inequality is at least \(10^{-d}\). Notice that this effectively imposes a lower bound on the minimum node threshold change. We use \(10^{-d}/2\) in our numerical tests. (In our numerical experiments, the integrality tolerance used by the solver is several orders of magnitude smaller than this value.)

For each node j, we use variable \(\theta _j\) to denote the increase of node threshold \(t_j\), and for each arc (ij), we use variable \(\varphi _{ij}\) to represent the decrease of weight \(w_{ij}\). The constraints \(\sum _{j \in V} \theta _j \le B_{N}, 0 \le \theta _j \le \varDelta _{N} t_j, \; \forall j \in V\) and \(\sum _{i,j \in V, i \ne j} \varphi _{ij} \le B_{A}, 0 \le \varphi _{ij} \le \varDelta _{A} w_{ij} x_i, \; \forall i,j \in V, i \ne j\) define a polyhedral uncertainty set in a manner similar to Bertsimas and Sim [5]. We need the binary variable \(x_i\) in the upper bound for \(\varphi _{ij}\) because otherwise, due to the first constraint in (1), the activation of a node could be influenced by an arc coming from an inactive neighbor.

Proposition 1

The optimum of \(\text {RI}_{x, \theta , \varphi }({\bar{y}})\) is the total influence spread on G from seeds \({\bar{y}}\) under our robustness model, and the variables \(x_j\) indicate which nodes are active at the end of the influence propagation process.

Proof

It suffices to show that for a given choice of the threshold change \(\theta _j\) and arc weight change \(\varphi _{ij}\), problem \(\text {RI}_{x,\theta ,\varphi }({\bar{y}})\) computes the total influence spread as if we had applied InfluenceSpread\(({\bar{y}}, t + \theta , w - \varphi )\); the result then follows because we are minimizing over \(\theta \) and \(\varphi \), thereby yielding the influence in the worst-case realization of uncertainty.

Notice that \(\text {RI}_{x,\theta ,\varphi }({\bar{y}})\) is a minimization problem and each \(x_j\) is lower bounded by two quantities only: \({\bar{y}}_j\), and \(\frac{\sum _{i \in \delta ^-(j)} (w_{ij} x_{i} - \varphi _{ij}) - t_j - \theta _j + \epsilon }{\sum _{i \in \delta ^-(j)} w_{ij} + \epsilon }\). The latter quantity is \(> 0\) if and only if \(\sum _{i \in \delta ^-(j)} (w_{ij} x_{i} - \varphi _{ij}) \ge t_j + \theta _j\). Notice that, because of the constraints \(\varphi _{ij} \le \varDelta _{A} w_{ij} x_i\), this expression is equivalent to \(\sum _{i \in \delta ^-(j)} (w_{ij} - \varphi _{ij}) x_{i} \ge t_j + \theta _j\), implying that \(x_j = 1\) if and only if its linear activation rule (with weights \(w - \varphi \) and threshold \(t + \theta \)) is triggered by its neighbors. If we apply InfluenceSpread\(({\bar{y}}, t + \theta , w - \varphi )\), it is easy to see by induction over the main loop that for each \(j \in A_k\) there is an implied lower bound \(x_j \ge 1\) (recall that \(x_j\) is binary), and for all nodes \(\not \in A_n\) there is no such implied lower bound. It follows that in the optimal solution \(x_j = 1\) if and only if \(j \in A_n\).\(\square \)

The above formulation does not suffer from self-activating loops (which require additional caution in some other formulations, see e.g., Fischetti et al. [12]), and it has n binary variables only. Problem \(\text {RI}_{x, \theta , \varphi }({\bar{y}})\) is stated as a MILP; Sect. 3.1 shows that computing its value is NP-hard, hence using a MILP formulation is justified. Because of Proposition 1, the robust IMP with q activation seeds can be solved as the following bilevel optimization problem:

$$\begin{aligned} \left. \begin{array}{rrcl} \max _y &{} \text {RI}_{x, \theta , \varphi }(y) &{} &{}\\ &{} \sum _{j \in V} y_j &{}=&{} q \\ \forall j \in V &{} y_j &{}\in &{} \{0,1\}. \end{array} \right\} \end{aligned}$$
(R-IMP)

Notice that if we fix \(\theta = 0\) and \(\varphi =0\), solving \(\text {RI}_{x, \theta =0, \varphi =0}({\bar{y}})\) yields \(x_j = 1\) exactly for the nodes returned by InfluenceSpread\(({\bar{y}}, t, w)\). It is easy to show by counterexample that \(\text {RI}_{x, \theta , \varphi }(y)\), taken as a set function of the incidence vector y, is not submodular even for fixed \(\theta \) and \(\varphi \). The robust approach of Bertsimas and Sim [5] is difficult to apply here because we do not even have a single-level formulation for the problem. It may be possible to overcome this difficulty with a different formulation with an increased number of variables: in particular, a time-expanded graph with \(n^2\) nodes to represent the n iterations of Algorithm 1 would in principle lead to a single-level formulation, which might allow different strategies to make the problem robust. Our approach keeps a model with n binary variables: this is the subject of Sects. 4 and 5.
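To make the non-submodularity claim concrete, here is a small counterexample of our own construction (not taken from the paper), using a plain Python implementation of the deterministic spread, i.e., with \(\theta = 0\) and \(\varphi = 0\):

```python
def spread(seeds, t, w, nodes):
    # Deterministic linear threshold diffusion (our own compact sketch of Algorithm 1).
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for j in nodes:
            if j not in active and sum(w.get((i, j), 0.0) for i in active) >= t[j]:
                active.add(j)
                changed = True
    return len(active)

# Nodes 1 and 2 both point to node 3 with weight 1; node 3 has threshold 2,
# so it activates only when both of its in-neighbors are active.
nodes = [1, 2, 3]
t = {1: 1.0, 2: 1.0, 3: 2.0}
w = {(1, 3): 1.0, (2, 3): 1.0}
f = lambda seeds: spread(seeds, t, w, nodes)

# Submodularity would require f(S + v) - f(S) >= f(T + v) - f(T) for S subset of T.
marg_empty = f({2}) - f(set())     # adding seed 2 to the empty set:    1 - 0 = 1
marg_super = f({1, 2}) - f({1})    # adding seed 2 to the superset {1}: 3 - 1 = 2
assert marg_empty < marg_super     # the inequality is violated: not submodular
```

Adding node 2 to the larger seed set yields a larger marginal gain, which is exactly the opposite of the diminishing-returns property that the greedy \((1-\frac{1}{e})\)-approximation relies on.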

3 Properties of the robust problem

Before developing an algorithm for (R-IMP), we study some of its properties. More specifically, we show that the inner problem \(\text {RI}_{x,\theta ,\varphi }({\bar{y}})\) is NP-hard given \({\bar{y}}\). We then turn our attention to properties of the robust solution under different uncertainty sets, formalizing the natural intuition that the price of robustness is higher for node threshold uncertainty than it is for arc weight uncertainty.

3.1 Hardness

We define the decision version of \(\text {RI}_{x,\theta ,\varphi }(y)\) as follows:

Robust Influence. Given a graph G with node thresholds t and arc weights w, a seed vector \({\bar{y}}\), robustness parameters \(B_N, \varDelta _N, B_A, \varDelta _A\), and an integer k: is the optimum of \(\text {RI}_{x, \theta , \varphi }({\bar{y}})\) at most k?

Theorem 1

The problem “Robust Influence” is NP-complete, even if \(B_N = 0\) or \(B_A = 0\).

Proof

Membership in NP is straightforward: a polynomial-size certificate for the answer can be constructed by providing the optimal \(\theta ^{{\bar{y}}}\) and \(\varphi ^{{\bar{y}}}\). The answer can be verified by running the polynomial-time algorithm InfluenceSpread\(({\bar{y}}, t + \theta , w - \varphi )\). To show hardness, we provide a reduction from “Exact Cover by 3-Sets.”

Exact Cover by 3-Sets. Given a ground set X with \(|X| = 3q\) and a collection \({\mathcal {C}}\) of 3-element subsets of X, decide whether there exists a subcollection \(C \subseteq {\mathcal {C}}\) such that every element of X belongs to exactly one member of C.

We first examine the case \(B_A = 0\); the case \(B_N = 0\) is similar. We construct an instance of Robust Influence as follows; an example is given in Fig. 1. The graph G has \(|{\mathcal {C}}| + 3q + 1\) nodes arranged in three layers, with arcs always directed toward subsequent layers. The first layer contains only a special node, selected as the only seed node for the influence spread. The special node is connected by an arc of weight 2 to each of the \(|{\mathcal {C}}|\) nodes in the second layer, which represent the subsets in the collection \({\mathcal {C}}\). The nodes in the second layer are connected to the |X| nodes in the third layer, which represent the elements of X: the node corresponding to \(S \in {\mathcal {C}}\) is connected to the nodes representing the three elements of X that it contains. All the arcs between the second and third layer have weight 1. All node thresholds in the second layer are 1, and all node thresholds in the third layer are equal to the respective indegree. We set \(B_N = q(1+\epsilon )\), \(k = |{\mathcal {C}}| - q + 1\), \(\varDelta _N = 2, \varDelta _A =1\).

Suppose there exists an exact cover C of X. Then we can set \(\theta _j = 1 + \epsilon \) for all \(j \in C\); in other words, we increase by \(1 + \epsilon \) the threshold of all nodes in the graph corresponding to the subsets in \({\mathcal {C}}\) that yield an exact cover of X. By doing so, these nodes cannot be activated. In the third layer, a node is active if and only if all its entering arcs come from active nodes. But because C is an exact cover, exactly one arc from an inactive node enters each node in the third layer. Thus, all nodes in the third layer are inactive. It follows that the total influence spread is \(|{\mathcal {C}}| - q + 1\): there is one active node in the first layer (the seed), and \(|{\mathcal {C}}| - q\) active nodes in the second layer.

Conversely, suppose the influence spread is \(|{\mathcal {C}}| - q + 1\). Notice that the node in the first layer (chosen as the seed) and at least \(|{\mathcal {C}}| - q\) nodes in the second layer are always active, because the budget \(B_N\) allows us to deactivate at most q nodes in the second layer. Hence, these must be the only active nodes. But then all nodes in the third layer must be adjacent to an inactive node in the second layer. It follows that we found an exact cover C of X.

The proof for \(B_N = 0, B_A > 0\) is similar: we simply set \(B_N = 0, B_A = q(1+\epsilon )\). In this case, we can ensure nodes in the second layer are inactive by decreasing the weight of the corresponding entering arc by \((1+\epsilon )\).\(\square \)

Fig. 1
Example of the reduction from “Exact Cover by 3-Sets.” The ground set is \(X = \{a, b, c, d, e, f\}\), \(q = 2\), \({\mathcal {C}}= \{ S_1, S_2, S_3, S_4 \}\) with \(S_1 = \{a, b, e\}, S_2 = \{b, c, d\}, S_3 = \{c, d, f\}, S_4 = \{d, e, f\}\). Node thresholds are indicated next to each node, and arc weights that are not shown in the figure are 1
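The reduction can be replayed on the instance of Fig. 1. The snippet below (our own sketch, with hypothetical names; the diffusion routine is a compact rendering of Algorithm 1) builds the three-layer graph and checks that the threshold increase corresponding to the exact cover \(C = \{S_1, S_3\}\) limits the spread to \(|{\mathcal {C}}| - q + 1 = 3\):

```python
def spread(seeds, t, w, nodes):
    # Linear threshold diffusion (our own compact sketch of Algorithm 1).
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for j in nodes:
            if j not in active and sum(w.get((i, j), 0.0) for i in active) >= t[j]:
                active.add(j)
                changed = True
    return active

eps = 0.5  # any epsilon in (0, 1) works for this illustration
subsets = {"S1": {"a", "b", "e"}, "S2": {"b", "c", "d"},
           "S3": {"c", "d", "f"}, "S4": {"d", "e", "f"}}
elements = sorted({x for S in subsets.values() for x in S})
nodes = ["seed"] + list(subsets) + elements

# Layer 1 -> 2: arcs of weight 2 from the seed to every subset node.
w = {("seed", S): 2.0 for S in subsets}
# Layer 2 -> 3: arcs of weight 1 from each subset node to its three elements.
for S, members in subsets.items():
    for x in members:
        w[(S, x)] = 1.0

# Thresholds: 1 for subset nodes, indegree for element nodes.
t = {"seed": 1.0}
t.update({S: 1.0 for S in subsets})
t.update({x: float(sum(1 for S in subsets.values() if x in S)) for x in elements})

# Exact cover C = {S1, S3}: raising those two thresholds by 1 + eps (within
# Delta_N = 2 and budget B_N = q(1 + eps)) prevents their activation.
t_worst = dict(t)
for S in ("S1", "S3"):
    t_worst[S] += 1.0 + eps

# Influence spread is |C| - q + 1 = 4 - 2 + 1 = 3: the seed plus S2 and S4.
assert spread({"seed"}, t_worst, w, nodes) == {"seed", "S2", "S4"}
```

With the nominal thresholds the whole graph activates; under the worst case derived from the exact cover, every third-layer node keeps one inactive in-neighbor and stays inactive.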

3.2 Sensitivity analysis

Given a set of robustness parameters \(P = \{B_N, \varDelta _N, B_A, \varDelta _A\}\), we denote by \(\text {RI}^{P}_{x, \theta , \varphi }(y)\) the problem \(\text {RI}_{x, \theta , \varphi }(y)\) with those parameters. In this section we study the impact of P on the optimal value.

We start by noting that if the upper bounds on the relative deviations \(\varDelta _N, \varDelta _A\) are nonrestrictive, the price of robustness for a single unit of robustness budget on the node thresholds is larger than on arc weights.

Proposition 2

Let \(P^1 = \{B^1_N, \varDelta ^1_N=+\infty , B^1_A, \varDelta ^1_A=1\}\), \(P^2 = \{B^2_N, \varDelta ^2_N=+\infty , B^2_A, \varDelta ^2_A=1\}\). We have the following implication:

$$\begin{aligned} \left\{ \begin{array}{c} B^1_N + B^1_A = B^2_N + B^2_A\\ B^1_N \le B^2_N \end{array} \right. \quad \Longrightarrow \quad \text {RI}^{P^1}_{x, \theta , \varphi }(y) \ge \text {RI}^{P^2}_{x, \theta , \varphi }(y). \end{aligned}$$

Proof

We show that for any feasible solution of \(\text {RI}^{P^1}_{x, \theta , \varphi }(y)\), we can construct a feasible solution for \(\text {RI}^{P^2}_{x, \theta , \varphi }(y)\) with the same objective function value. This immediately implies the claim. Let \((x^1,\theta ^1, \varphi ^1)\) be feasible for \(\text {RI}^{P^1}_{x, \theta , \varphi }(y)\). We construct \((x^2,\theta ^2, \varphi ^2)\) feasible for \(\text {RI}^{P^2}_{x, \theta , \varphi }(y)\) as follows:

$$\begin{aligned} x^2 =x^1;~~ \varphi ^2 = \frac{B^2_A}{B^1_A} \varphi ^1;~~ \theta ^2_j = \theta ^1_j + \frac{B^1_A - B^2_A}{B^1_A} \sum _{i \in \delta ^-(j)} \varphi ^1_{ij},~ \forall j \in V. \end{aligned}$$
(2)

It is immediate to check that the two solutions have the same objective function value, and that the point satisfies the budget constraints as well as \(0 \le \theta ^2_j\), \(0 \le \varphi ^2_{ij}\) and \(\sum _{i \in \delta ^-(j)} (w_{ij} x^2_i - \varphi ^2_{ij}) -\theta ^2_j + \epsilon - (\sum _{i \in \delta ^-(j)} w_{ij} + \epsilon ) x^2_j \le t_j\).\(\square \)

To prove Proposition 2 we had to assume \(\varDelta _N = \infty \) and \(\varDelta _A = 1\). If these conditions are not satisfied, the argument fails because the point constructed in the proof for \(P^2\) may not satisfy the constraints \(\theta \le \varDelta _N t\) and \(\varphi \le \varDelta _A w x\) in the definition of \(\text {RI}_{x, \theta , \varphi }(y)\). The next two results give sufficient conditions on \(\varDelta _N\) and \(\varDelta _A\) to be able to perform similar constructions. While the statements are technical, we will discuss some more intuitive special cases at the end of this section.

Theorem 2

Let \(P^1 = \{B^1_N, \varDelta ^1_N, B^1_A, \varDelta ^1_A\}\), \(P^2 = \{B^2_N, \varDelta ^2_N, B^2_A, \varDelta ^2_A\}\). We have the following implication:

$$\begin{aligned} \left\{ \begin{array}{c} \varDelta ^2_N \ge \varDelta ^1_N + \frac{B^1_A - B^2_A}{B^1_A} \varDelta ^1_A \max _{j \in V}\{\frac{\sum _{i \in \delta ^-(j)} w_{ij}}{t_j}\}\\ B^1_N + B^1_A = B^2_N + B^2_A\\ B^1_N \le B^2_N \end{array} \right. \quad \Longrightarrow \quad \text {RI}^{P^1}_{x, \theta , \varphi }(y) \ge \text {RI}^{P^2}_{x, \theta , \varphi }(y). \end{aligned}$$

Proof

As for Proposition 2, we show that for any feasible solution of \(\text {RI}^{P^1}_{x, \theta , \varphi }(y)\) we can construct a feasible solution for \(\text {RI}^{P^2}_{x, \theta , \varphi }(y)\) with the same objective function value. For a feasible solution \((x^1, \theta ^1, \varphi ^1)\), define \((x^2, \theta ^2, \varphi ^2)\) as in Eq. (2). We have to show that the upper bounding constraints in \(\text {RI}^{P^2}_{x, \theta , \varphi }(y)\) hold for the proposed solution \((x^2, \theta ^2, \varphi ^2)\). We have:

$$\begin{aligned} \sum _{j \in V} \theta ^2_j = \sum _{j \in V} \theta ^1_j + \frac{B^1_A - B^2_A}{B^1_A} \sum _{j \in V}\sum _{i \in \delta ^-(j)} \varphi ^1_{ij} \le B^1_N + \frac{B^1_A - B^2_A}{B^1_A} {B^1_A} = B^2_N \end{aligned}$$

and

$$\begin{aligned} \sum _{i,j \in V, i\ne j} \varphi ^2_{ij} = \frac{B^2_A}{B^1_A} \sum _{i,j \in V, i\ne j} \varphi ^1_{ij} \le \frac{B^2_A}{B^1_A} B^1_A = B^2_A. \end{aligned}$$

Finally, since \((x^1,\theta ^1, \varphi ^1)\) is feasible, we have:

$$\begin{aligned} \frac{\theta ^2_j}{t_j} = \frac{\theta ^1_j}{t_j} + \frac{B^1_A - B^2_A}{B^1_A} \sum _{i \in \delta ^-(j)} \frac{\varphi ^1_{ij}}{t_j} \le \varDelta ^1_N + \frac{B^1_A - B^2_A}{B^1_A} \sum _{i \in \delta ^-(j)} \frac{\varDelta ^1_A w_{ij} x^1_i}{t_j} \le \varDelta ^2_N. \end{aligned}$$

Thus, \(\theta ^2\) satisfies the constraints \(\theta ^2_j \le \varDelta ^2_N t_j\). \(\square \)

Theorem 3

Let \(P^1 = \{B^1_N, \varDelta ^1_N, B^1_A, \varDelta ^1_A\}\), \(P^2 = \{B^2_N, \varDelta ^2_N, B^2_A, \varDelta ^2_A\}\). We have the following implication:

$$\begin{aligned} \left\{ \begin{array}{c} \varDelta ^2_N \ge \varDelta ^1_N + \frac{B^1_A - B^2_A}{B^1_A} \frac{1}{1-\varDelta ^1_A}\\ B^1_N + B^1_A = B^2_N + B^2_A\\ B^1_N \le B^2_N \end{array} \right. \quad \Longrightarrow \quad \text {RI}^{P^1}_{x, \theta , \varphi }(y) \ge \text {RI}^{P^2}_{x, \theta , \varphi }(y). \end{aligned}$$

Proof

We use the same approach and notation as in the proof of Theorem 2. Construct the following solution \((x^2,\theta ^2, \varphi ^2)\) for \(\text {RI}^{P^2}_{x, \theta , \varphi }(y)\):

$$\begin{aligned} x^2 =x^1;~~ \varphi ^2 = \frac{B^2_A}{B^1_A} \varphi ^1;~~ \theta ^2_j = \left\{ \begin{array}{ll} \theta ^1_j + \frac{B^1_A - B^2_A}{B^1_A} \sum _{i \in \delta ^-(j)} \varphi ^1_{ij} &{}\quad \text {if } x^1_j=0\\ 0 &{}\quad \text {if } x^1_j=1 \end{array}\right. ,~ \forall j \in V\;. \end{aligned}$$

It is immediate to check that the two solutions have the same objective function value, and that all the constraints of \(\text {RI}^{P^2}_{x, \theta , \varphi }(y)\) except for \(\theta ^2_j \le \varDelta ^2_N t_j\) are respected. Since \((x^1,\theta ^1, \varphi ^1)\) is feasible, we have \(\varphi ^1_{ij} \le \varDelta ^1_A w_{ij} x^1_i \le \varDelta ^1_A w_{ij}\), which implies:

$$\begin{aligned} (1-\varDelta ^1_A)\sum _{i \in \delta ^-(j)} w_{ij} \le \sum _{i \in \delta ^-(j)} (w_{ij} - \varphi ^1_{ij}). \end{aligned}$$

Notice that if node j is not active (i.e., \(x^2_j=x^1_j=0\)), we must have \(\sum _{i \in \delta ^-(j)} (w_{ij} - \varphi ^1_{ij}) < t_j\), and therefore:

$$\begin{aligned} \frac{t_j}{(1-\varDelta ^1_A)} \ge \sum _{i \in \delta ^-(j)} w_{ij} \ge \sum _{i \in \delta ^-(j)} \varphi ^1_{ij}, \end{aligned}$$

where we used the fact that \(\varphi ^1_{ij} \le \varDelta ^1_A w_{ij} x^1_i\le \varDelta ^1_A w_{ij}\) and \(0 \le \varDelta _A \le 1\). Finally, this implies:

$$\begin{aligned} \frac{\theta ^2_j}{t_j} = \frac{\theta ^1_j}{t_j} + \frac{B^1_A - B^2_A}{B^1_A} \sum _{i \in \delta ^-(j)} \frac{\varphi ^1_{ij}}{t_j} \le \varDelta ^1_N + \frac{B^1_A - B^2_A}{B^1_A} \sum _{i \in \delta ^-(j)} \frac{\varphi ^1_{ij}}{t_j} \le \varDelta ^2_N. \end{aligned}$$

\(\square \)

The sufficient conditions given in Theorems 2 and 3 can be unintuitive to grasp. We now give two special cases (one for each of the two sufficient conditions) that consider either \(B_N = 0\) or \(B_A = 0\). These special cases are easier to parse and still capture the essence of the results: as long as the maximum deviation parameters satisfy certain conditions, the solution to the robust problem with a given node threshold robustness budget is worse (i.e., more conservative) than the solution obtained when the same robustness budget is assigned to arc weight robustness.

Corollary 1

Let \(P^A = \{0, 0, B_A, \varDelta _A\}\), \(P^N = \{B_N, \varDelta _N, 0, 0\}\). We have the following implication:

$$\begin{aligned} \left\{ \begin{array}{c} \varDelta _N \ge \varDelta _A \max _{j \in V}\left\{ \frac{\sum _{i \in \delta ^-(j)} w_{ij}}{t_j}\right\} \\ B_N \ge B_A \end{array} \right. \quad \Longrightarrow \quad \text {RI}^{P^A}_{x, \theta , \varphi }(y) \ge \text {RI}^{P^N}_{x, \theta , \varphi }(y). \end{aligned}$$

Corollary 2

Let \(P^A = \{0, 0, B_A, \varDelta _A\}\), \(P^N = \{B_N, \varDelta _N, 0, 0\}\). We have the following implication:

$$\begin{aligned} \left\{ \begin{array}{c} \varDelta _N \ge \frac{1}{1-\varDelta _A}\\ B_N \ge B_A \end{array} \right. \quad \Longrightarrow \quad \text {RI}^{P^A}_{x, \theta , \varphi }(y) \ge \text {RI}^{P^N}_{x, \theta , \varphi }(y). \end{aligned}$$

4 Activation set formulation for the non-robust problem

Our first step toward solving (R-IMP) is a formulation for the non-robust counterpart of the problem. In this section we therefore assume that \(\theta = 0\) and \(\varphi = 0\) for simplicity. The node activation threshold at node j is then \(t_j\) and the arc weights are given by the vector w. The formulation that we use relies on the following definition:

Definition 2

For all \(j \in V\), we define the corresponding collection of minimal activation sets \({\mathcal {C}}_j\) as:

$$\begin{aligned} {\mathcal {C}}_j = \left\{ S \subseteq \delta ^-(j) : \sum _{i \in S} w_{ij} \ge t_j, S \text { is minimal }\right\} \end{aligned}$$
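For small in-degrees, the definition above can be applied directly by brute force. The following sketch (our own, purely illustrative: the enumeration is exponential in the in-degree, whereas the paper generates these sets on the fly via column generation) enumerates \({\mathcal {C}}_j\) for one node:

```python
from itertools import combinations

def minimal_activation_sets(in_neighbors, w, t):
    """Brute-force enumeration of the collection C_j for a single node j.

    in_neighbors: list of nodes in delta^-(j); w: dict i -> w_ij; t: threshold t_j.
    Since weights are nonnegative, a set S with total weight >= t is minimal
    exactly when dropping any single element brings the total below t.
    """
    C_j = []
    for r in range(1, len(in_neighbors) + 1):
        for S in combinations(in_neighbors, r):
            total = sum(w[i] for i in S)
            if total >= t and all(total - w[i] < t for i in S):
                C_j.append(frozenset(S))
    return C_j

# Example: in-neighbors 1, 2, 3 with weights 0.6, 0.5, 0.2 and threshold 0.7.
sets = minimal_activation_sets([1, 2, 3], {1: 0.6, 2: 0.5, 3: 0.2}, 0.7)
# Every pair reaches the threshold and is minimal; the full triple is not.
assert set(sets) == {frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3})}
```

The single-element minimality test is sufficient because, with nonnegative weights, any proper subset of S is contained in some \(S \setminus \{i\}\) and therefore has no larger total weight.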

Clearly, a node j becomes active if and only if it is a seed or there exists \(S \in {\mathcal {C}}_j\) such that all nodes in S are active. The concept of minimal activation set was first introduced in the recent paper Fischetti et al. [12]. We developed the idea independently and we use our simpler definition, but it is easy to verify that the definition above corresponds to the minimal influencing set of Fischetti et al. [12] in the context of the linear threshold model and no incentives. We can reformulate \(\text {RI}_{x, \theta =0, \varphi =0}({\bar{y}})\) (see Eq. (1)) using the collection of minimal activation sets.

$$\begin{aligned} \left. \begin{array}{rrcl} \text {AS}_x({\bar{y}}) := &{} \min \sum _{j \in V} x_j \\ \forall j \in V, \forall S \in {\mathcal {C}}_j &{} \sum _{i \in S} x_i - x_j &{}\le &{} |S| - 1 \\ \forall j \in V &{} x_j &{}\ge &{} {\bar{y}}_j. \end{array} \right\} \end{aligned}$$
(3)

Proposition 3

If \({\bar{y}}\) is a 0–1 vector, the optimal solution \(x^*\) to \(\text {AS}_x({\bar{y}})\) is integer, and \(x^*_j = 1\) if and only if \(j \in A_n\) as returned by InfluenceSpread\(({\bar{y}}, t, w)\).

Proof

Every \(x_j\) is lower bounded by \({\bar{y}}_j\) and by \(\sum _{i \in S} x_i - |S| + 1\) for some subset of nodes S adjacent to node j.

We first show by induction for \(k=1,\dots ,n\) that for every node \(j \in A_k\) in InfluenceSpread\(({\bar{y}}, t, w)\), we have an implied lower bound \(x_j \ge 1\) in \(\text {AS}_x({\bar{y}})\).

For \(k = 1\) the claim is obvious because of the constraints \(x_j \ge {\bar{y}}_j\). To go from \(k-1\) to k, notice that if node j is added to \(A_k\) at step k of InfluenceSpread\(({\bar{y}}, t, w)\), it must be that \(\sum _{i \in \delta ^-(j) : i \in A_{k-1}} w_{ij} \ge t_j\). By the induction hypothesis for all \(i \in A_{k-1}\) we have \(x_i \ge 1\), hence \(\sum _{i \in \delta ^-(j)} w_{ij} x_i \ge t_j\). By definition of minimal activation set, there must exist some \(S \in {\mathcal {C}}_j\), say \({\bar{S}}\), such that \({\bar{S}} \subseteq A_{k-1}\). Then the corresponding constraint \(\sum _{i \in {\bar{S}}} x_i - x_j \le |{\bar{S}}| - 1\) in the formulation \(\text {AS}_x({\bar{y}})\) reads \(|{\bar{S}}| - x_j \le |{\bar{S}}| - 1\), implying \(x_j \ge 1\).

Finally, for every \(j \not \in A_n\), all the constraints \(\sum _{i \in S} x_i - x_j \le |S| - 1\) are slack because there does not exist \(S \in {\mathcal {C}}_j, S \subseteq A_k\) for some k. Hence, the implied lower bound for \(x_j\) is 0. Since we are minimizing \(\sum _{j \in V} x_j\), at the optimum \(x_j = 1\) if and only if \(j \in A_n\).\(\square \)
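The routine InfluenceSpread\(({\bar{y}}, t, w)\) referred to in Proposition 3 is not reproduced here; under the linear threshold model it amounts to the following fixpoint iteration. This is a sketch with an assumed dict-based encoding of the graph, not the paper's code.

```python
def influence_spread(y_bar, t, w, n):
    """Linear-threshold propagation: starting from the seeds in y_bar,
    repeatedly activate every node j whose active in-neighbors carry
    total arc weight at least t[j]; return the final active set A_n.

    y_bar: dict node -> 0/1 seed indicator
    t:     dict node -> activation threshold t_j
    w:     dict (i, j) -> weight w_ij of arc (i, j)
    n:     number of nodes, labeled 0..n-1
    """
    active = {j for j in range(n) if y_bar.get(j, 0) == 1}
    changed = True
    while changed:
        changed = False
        for j in range(n):
            if j in active:
                continue
            incoming = sum(wt for (i, k), wt in w.items()
                           if k == j and i in active)
            if incoming >= t[j]:
                active.add(j)
                changed = True
    return active
```

Since a node, once activated, stays active, the loop performs at most n passes, matching the index range \(k = 1, \dots , n\) used in the proof.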

We can use \(\text {AS}_x({\bar{y}})\) to obtain a single-level linear formulation for the restriction of (OPT) in which \(\theta =0\) and \(\varphi = 0\). More specifically, we consider the following problem:

[Problem (IMP-\(\theta 0\)-\(\varphi 0\)): \(\max \{\text {AS}_x(y) : \sum _{j \in V} y_j = q, \; y \in \{0,1\}^n\}\); displayed as an image in the original.]

We first take the dual of the inner problem \(\text {AS}_x({\bar{y}})\) for a fixed \({\bar{y}}\). The dual is:

$$\begin{aligned} \left. \begin{array}{rrcl} \max _{\pi , \mu } &{}\quad \sum _{j \in V} \sum _{S \in {\mathcal {C}}_j} (|S|-1) \pi _{j,S} + \sum _{j \in V} \mu _j {\bar{y}}_j \\ \forall j \in V &{}\quad \sum _{k \in \delta ^+(j)} \sum _{S \in {\mathcal {C}}_k : j \in S} \pi _{k,S} - \sum _{S \in {\mathcal {C}}_j} \pi _{j,S} + \mu _j &{}\le &{} 1 \\ \forall j \in V, \forall S \in {\mathcal {C}}_j &{}\quad \pi _{j,S} &{}\le &{} 0 \\ \forall j \in V &{}\quad \mu _j &{}\ge &{} 0. \end{array} \right\} \end{aligned}$$

The solution of this problem has value equal to that of its primal problem and therefore, by Proposition 3, to InfluenceSpread\(({\bar{y}}, t, w)\) whenever \({\bar{y}} \in \{0,1\}^n\). It follows that a valid formulation for (IMP-\(\theta 0\)-\(\varphi 0\)) is the following:

$$\begin{aligned} \left. \begin{array}{r@{\quad }rcl} \max _{\pi , \mu , y} &{} \sum _{j \in V} \sum _{S \in {\mathcal {C}}_j} (|S|-1) \pi _{j,S} + \sum _{j \in V} \mu _j y_j \\ \forall j \in V &{} \sum _{k \in \delta ^+(j)} \sum _{S \in {\mathcal {C}}_k : j \in S} \pi _{k,S} - \sum _{S \in {\mathcal {C}}_j} \pi _{j,S} + \mu _j &{}\le &{} 1 \\ &{} \sum _{j \in V} y_j &{}=&{} q \\ \forall j \in V, \forall S \in {\mathcal {C}}_j &{} \pi _{j,S} &{}\le &{} 0 \\ \forall j \in V &{} \mu _j &{}\ge &{} 0 \\ \forall j \in V &{} y_j &{}\in &{} \{0,1\}. \end{array} \right\} \end{aligned}$$

This is a quadratic problem, but we can easily reformulate it as a linear problem with indicator constraints. To do so, we simply notice that whenever \(y_j = 0\), \(\mu _j\) has no contribution in the objective function; since \(\mu _j\) appears in a single constraint and increasing \(\mu _j\) reduces the feasible region for the remaining variables, there exists an optimal solution in which \(y_j = 0\) implies \(\mu _j = 0\). Thus, we obtain the following formulation for (IMP-\(\theta 0\)-\(\varphi 0\)):

[Formulation (DUAL-\(\theta 0\)-\(\varphi 0\)): the MILP above with each bilinear term \(\mu _j y_j\) replaced by \(\mu _j\) and the indicator constraints \(y_j = 0 \Rightarrow \mu _j = 0\) added; displayed as an image in the original.]

The advantage with respect to (R-IMP) is of course that we now have a single-level problem, rather than bilevel. To achieve this result we had to fix \(\theta = 0\) and \(\varphi =0\). This restriction will be lifted in the next section.

An alternative approach to deal with the bilinear terms is to linearize them via McCormick envelopes (see McCormick [21]), which are exact when y is binary and \(\mu \) is bounded. It is also possible to explicitly model the logical implication \(y_j = 0 \Rightarrow \mu _j = 0\) with big-M constraints. With these last two approaches it is necessary to find valid upper bounds for the \(\mu \) variables, since these upper bounds appear explicitly in the linearizations. Unfortunately, the required upper bounds are large: in “Appendix B” we show that n is not a valid bound, and numerically we found cases where even \(n^2\) was not enough. Hence, explicit computation of upper bounds for \(\mu \) seems difficult and would in any case lead to weak formulations. We note, however, that in a Branch-and-Bound framework the upper bounds could be iteratively tightened, see, e.g., Belotti et al. [3].
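To illustrate the McCormick alternative: with y binary and \(0 \le \mu \le U\), the envelope of \(z = \mu y\) collapses to the standard product linearization \(z \le Uy\), \(z \le \mu \), \(z \ge \mu - U(1-y)\), \(z \ge 0\), which is exact whenever y is binary. The helper below is our own illustration (U is an assumed valid upper bound, which, as noted above, may be hard to obtain in practice); it returns the interval that the four inequalities allow for z.

```python
def mccormick_product(mu, y, U):
    """Interval allowed for z = mu * y by the McCormick inequalities
    z <= U*y, z <= mu, z >= mu - U*(1 - y), z >= 0, assuming
    0 <= mu <= U and y in {0, 1}.  For binary y the interval collapses
    to the single point mu * y, i.e., the linearization is exact."""
    lo = max(0.0, mu - U * (1 - y))
    hi = min(U * y, mu)
    return lo, hi
```

At \(y = 1\) the inequalities force \(z = \mu \), and at \(y = 0\) they force \(z = 0\); for fractional y (in the LP relaxation) the interval has positive width, which is where the weakness of large U values shows up.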

We prove an additional result that we could not successfully exploit from an empirical point of view, but may be interesting for future research.

Proposition 4

For every \({\bar{y}} \in \{0,1\}^n\), the polyhedron corresponding to the LP obtained by fixing \(y = {\bar{y}}\) in (DUAL-\(\theta 0\)-\(\varphi 0\)) is an integral polyhedron.

Proof

We show that for a given 0-1 vector \({\bar{y}}\), the remaining system in (DUAL-\(\theta 0\)-\(\varphi 0\)) is total dual integral. This implies that it defines an integral polyhedron (Edmonds and Giles [10]).

The discussion in this section shows that the dual of (DUAL-\(\theta 0\)-\(\varphi 0\)) for fixed \(y = {\bar{y}}\) is the problem \(\text {AS}_x({\bar{y}})\) defined in (3). To show total dual integrality of the desired system, we need to show that for any integer value of the r.h.s. of the first set of constraints in \(\text {AS}_x({\bar{y}})\), either \(\text {AS}_x({\bar{y}})\) is infeasible, or it has an optimal solution that is integer.

Let b be a given vector of integer r.h.s. values for the first set of constraints, which are indexed by \(j \in V, S \in {\mathcal {C}}_j\). First, notice that if \(b_{j,S} < 0\) for any jS, the problem is infeasible; hence, we only need to consider the case \(b \ge 0\). We show how to construct an integer optimal solution.

Define \(x^0 := {\bar{y}}\). Apply the following algorithm: for \(k=1,\dots ,n\), (i) set \(x^k_j \leftarrow 0 \, \forall j\); (ii) for \(j \in V, S \in {\mathcal {C}}_j\), set \(x^k_j \leftarrow \max \{x^k_j, \sum _{i \in S} x^{k-1}_i - b_{j,S}\}\). It is clear that this defines an integral vector \(x^n\). We now show that this solution is optimal. Let \(x^*\) be an optimal solution for the problem with r.h.s. b. We first show by induction that \(x^k \le x^*\). For \(k = 0\) this is obvious as \(x \ge {\bar{y}} = x^0\) is among the constraints. Assume \(x^{k-1} \le x^*\) and suppose \(x^k_h > x^*_h\) for some h. Since \(x^k_h\) is initially 0, it must be that for some S, \(x^k_h\) is set to \(\sum _{i \in S} x^{k-1}_i - b_{h,S} > x^*_h\) for the first time. But \(\sum _{i \in S} x^{k-1}_i - b_{h,S} \le \sum _{i \in S} x^*_i - b_{h,S} \le x^*_h\), because \(x^*\) satisfies the constraints; this is a contradiction. It follows that \(x^k \le x^*\) for all \(k=1,\dots ,n\). It is easy to check that \(x^n\) is feasible by construction, and therefore it must be optimal.\(\square \)

The proof of Proposition 4 shows that the LP obtained from (DUAL-\(\theta 0\)-\(\varphi 0\)) for integral y is total dual integral. Hence, variables \(\pi , \mu \) are automatically integer when y is integer. However, empirically we did not observe any advantage by imposing \(\pi , \mu \) integer in the MILP solver used in our numerical experiments. It is an open question whether this can be exploited in some solution method.

5 Branch-Cut-and-Price for robust influence maximization

For the robust version of the problem it is no longer sufficient to consider only the case \(\theta = 0\) and \(\varphi =0\). In fact, we would like to optimize over \(\theta \) and \(\varphi \) as in the original formulation (R-IMP), but it is not obvious how to use the dualization trick described in the previous section when \(\theta \) and \(\varphi \) are not fixed. This is because the activation sets may change with \(\theta \) and \(\varphi \), since \(\theta \) directly affects the node activation thresholds, while \(\varphi \) affects the arc weights. To overcome this difficulty, we propose to work with a modification of (DUAL-\(\theta 0\)-\(\varphi 0\)) that includes dual \(\pi _{j,S}\) variables for all the activation sets that may be minimal for any of the possible values of \(\theta \) and \(\varphi \). We choose an objective function that provides a valid upper (dual) bound for any choice of seed nodes \({\bar{y}}\), and that encodes the objective function for a specific choice of \(\theta \) and \(\varphi \). This objective function is iteratively changed in the course of Branch-and-Cut to tighten the upper bound. At the same time, we keep track of primal solutions so that at termination we have the vector \({\bar{y}}\) attaining the maximum possible \(\text {RI}_{x, \theta , \varphi }({\bar{y}})\), i.e., the maximum influence spread when simultaneously minimizing over \(\theta \) and \(\varphi \).

It will be convenient to introduce the following sets.

Definition 3

For all \(j \in V\) and robustness parameters \(\varDelta _N\) and \(\varDelta _A\), we define the corresponding extended collection of minimal activation sets \({\mathcal {C}}^e_j\) as:

$$\begin{aligned} {\mathcal {C}}^e_j= & {} \left\{ S \subseteq \delta ^-(j) : \exists ~\theta _j \in [0, \varDelta _{N} t_j], \varphi _{ij} \in [0, \varDelta _{A} w_{ij}] \text { such that }\right. \\&\left. \sum _{i \in S} (w_{ij} - \varphi _{ij}) \ge t_j + \theta _j, S \text { is minimal}\right\} \end{aligned}$$

In other words, \({\mathcal {C}}^e_j\) contains all subsets of \(\delta ^-(j)\) that are a minimal activation set for some value of the random data (i.e., \(\theta \) and \(\varphi \)) within the uncertainty set. By definition, \({\mathcal {C}}_j \subseteq {\mathcal {C}}^e_j\). Moreover, it is easy to check that \({\mathcal {C}}^e_j\) coincides with \({\mathcal {C}}_j\) whenever \(\varDelta _N=0\) and \(\varDelta _A=0\). An algorithm to compute \({\mathcal {C}}^e_j\) is given in Algorithm 2. For a given node j, the algorithm starts with an empty set. It recursively adds nodes from \(\delta ^-(j)\), starting from the one connected by the arc of maximum weight, and checking whether each new set is an activation set. The algorithm stops when adding new nodes would make the activation set non-minimal for every possible value of the node threshold and of the arc weights. Indeed, minimality is guaranteed by the fact that nodes are added in decreasing order of the corresponding arc weights.
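This generation scheme can be sketched as follows. This is our own reconstruction, not the paper's RecGenExt, and the closed-form membership test is our derivation from Definition 3: sliding the perturbed weights continuously over their boxes, the thresholds for which S is a minimal activation set form an interval, which yields two scalar checks.

```python
def in_extended_collection(weights, t_j, delta_N, delta_A):
    """Membership test for C^e_j (Definition 3).  S, given by the
    weights of its arcs into j, belongs to C^e_j iff some admissible
    threshold T in [t_j, (1+delta_N)*t_j] and perturbed weights
    w'_i in [(1-delta_A)*w_i, w_i] satisfy sum(w') >= T (activation)
    and sum(w') - min(w') < T (minimality).  Sliding w' continuously,
    the achievable thresholds form the interval
    ((1-delta_A)*(sum(w)-min(w)), sum(w)]  (our derivation),
    so membership reduces to the two checks below."""
    total, smallest = sum(weights), min(weights)
    return (total >= t_j
            and (1 - delta_A) * (total - smallest) < (1 + delta_N) * t_j)

def extended_minimal_sets(in_arcs, t_j, delta_N, delta_A):
    """Enumerate C^e_j in the spirit of Algorithm 2: add in-neighbors
    in decreasing weight order, pruning supersets that cannot be
    minimal for any admissible data (in_arcs: dict neighbor -> w_ij)."""
    arcs = sorted(in_arcs, key=lambda i: -in_arcs[i])
    out = []

    def rec(pos, chosen):
        if chosen:
            ws = [in_arcs[i] for i in chosen]
            if in_extended_collection(ws, t_j, delta_N, delta_A):
                out.append(frozenset(chosen))
            # adding smaller-weight nodes never decreases sum - min, so
            # once minimality fails for all admissible data we can stop
            if (1 - delta_A) * (sum(ws) - min(ws)) >= (1 + delta_N) * t_j:
                return
        for k in range(pos, len(arcs)):
            rec(k + 1, chosen + [arcs[k]])

    rec(0, [])
    return out
```

With \(\varDelta _N = \varDelta _A = 0\) the routine returns exactly \({\mathcal {C}}_j\); enlarging the uncertainty set only adds sets, consistent with \({\mathcal {C}}_j \subseteq {\mathcal {C}}^e_j\).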

We then define a family of \(2^n\) possible objective functions, each of which defines a problem equivalent to \(\text {RI}_{x, \theta = {\bar{\theta }}, \varphi = {\bar{\varphi }}}(y)\) for a fixed choice of \({\bar{\theta }}\) and \({\bar{\varphi }}\). In practice the optimization considers one objective function at a time, but this can be locally changed at certain nodes of the Branch-and-Bound tree as long as we ensure that pruning decisions based on the dual bound are correct. This will be explained in Sect. 5.1.

Note that the number of sets in \({\mathcal {C}}_j^e\) grows exponentially with the maximum degree of a node in the graph. In the next subsection, we show that our approach based on row generation and replacement is correct under the assumption that all the sets in \({\mathcal {C}}_j^e\) are known. In practice, to deal with large graphs we propose a column generation approach, discussed in Sects. 5.2 and 6.

[Algorithm 2: RecGenExt, the recursive computation of \({\mathcal {C}}^e_j\) described above; displayed as an image in the original.]

5.1 Row generation and replacement

We now discuss the solution of (R-IMP) via Branch-and-Cut assuming that the sets in \({\mathcal {C}}_j^e\) are known explicitly. To do so, we introduce some additional notation.

Definition 4

For every \({\bar{y}} \in \{0,1\}^n\), let \(\theta ^{{\bar{y}}}, \varphi ^{{\bar{y}}}\) be the optimal values of \(\theta , \varphi \) for problem \(\text {RI}_{x, \theta , \varphi }({\bar{y}})\). We define the corresponding collection of seed-dependent minimal activation sets \({\mathcal {C}}_j^{{\bar{y}}}\) as:

$$\begin{aligned} {\mathcal {C}}_j^{{\bar{y}}} = \left\{ S \subseteq \delta ^-(j) : \sum _{i \in S} (w_{ij} - \varphi _{ij}^{{\bar{y}}}) \ge t_j + \theta ^{{\bar{y}}}_j, S \text { is minimal}\right\} . \end{aligned}$$

In other words, \({\mathcal {C}}_j^{{\bar{y}}}\) is the collection of activation sets that are minimal for a given \(\theta ^{{\bar{y}}},\varphi ^{{\bar{y}}}\). Notice that \({\mathcal {C}}_j^{{\bar{y}}} \subseteq {\mathcal {C}}_j^e\) for all \({\bar{y}}\).
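Given the worst-case data \(\theta ^{{\bar{y}}}, \varphi ^{{\bar{y}}}\), the collection \({\mathcal {C}}_j^{{\bar{y}}}\) can be obtained by filtering \({\mathcal {C}}^e_j\) with the test of Definition 4. A sketch (the dict-based encoding is ours; since a single node j is considered, \(\varphi \) is indexed by the in-neighbor only):

```python
def seed_dependent_sets(extended_sets, w, t_j, theta_j, phi):
    """Filter the extended collection C^e_j down to C_j^{y-bar}: keep
    the sets that activate j and are minimal for the specific data
    theta^{y-bar}, phi^{y-bar} (Definition 4).

    extended_sets: iterable of frozensets from C^e_j
    w:             dict in-neighbor i -> nominal weight w_ij
    theta_j:       worst-case threshold increase at j
    phi:           dict in-neighbor i -> worst-case weight decrease
    """
    result = []
    for S in extended_sets:
        eff = [w[i] - phi.get(i, 0.0) for i in S]  # perturbed weights
        if (sum(eff) >= t_j + theta_j
                and sum(eff) - min(eff) < t_j + theta_j):
            result.append(S)
    return result
```

This filter makes the inclusion \({\mathcal {C}}_j^{{\bar{y}}} \subseteq {\mathcal {C}}_j^e\) explicit: every surviving set was already a member of the extended collection.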

We show that (R-IMP) can be solved via the solution of an exponential number of problems of the form (DUAL-\(\theta 0\)-\(\varphi 0\)), with different objective functions. Furthermore, this can in principle be solved using just one Branch-and-Bound tree. Consider the following problem, where \({\bar{y}} \in \{0,1\}^n\).

[Problem (R-IMP-\({\bar{y}}\)): maximize \(z\) subject to the row bounding \(z\) by the dual objective induced by \({\bar{y}}\), the dual-feasibility constraints over the extended collections \({\mathcal {C}}^e_j\), the indicator constraints \(y_j = 0 \Rightarrow \mu _j = 0\), and \(\sum _{j \in V} y_j = q\); displayed as an image in the original.]

Theorem 4

For any \({\bar{y}} \in \{0,1\}^n\), the value of (R-IMP-\({\bar{y}}\)) at every integer solution \({\tilde{y}}\) is an upper bound to \(\text {RI}_{x, \theta , \varphi }({\tilde{y}})\) [Eq. (1)], and if \({\tilde{y}} = {\bar{y}}\), then the value of (R-IMP-\({\bar{y}}\)) is exactly \(\text {RI}_{x, \theta , \varphi }({\bar{y}})\).

Proof

Consider the following LP, obtained by fixing \(y = {\tilde{y}}\) in (R-IMP-\({\bar{y}}\)) and eliminating the redundant variable z:

$$\begin{aligned} \left. \begin{array}{r@{\quad }rcl} \max _{\pi , \mu } &{} \sum _{j \in V} \sum _{S \in {\mathcal {C}}_j^{{\bar{y}}} } (|S|-1) \pi _{j,S} + \\ &{}\sum _{j \in V} \sum _{S \in {\mathcal {C}}^e_j {\setminus } {\mathcal {C}}_j^{{\bar{y}}} } |S| \pi _{j,S} + \sum _{j \in V} \mu _j &{} &{} \\ \forall j \in V &{} \sum _{k \in \delta ^+(j)} \sum _{S \in {\mathcal {C}}^e_k: j \in S} \pi _{k,S} - \sum _{S \in {\mathcal {C}}^e_j} \pi _{j,S} + \mu _j &{}\le &{} 1 \\ \forall j \in V, {\tilde{y}}_j = 0 &{} \mu _j &{}=&{} 0 \\ \forall j \in V, \forall S \in {\mathcal {C}}^e_j &{} \pi _{j,S} &{}\le &{} 0 \\ \forall j \in V &{} \mu _j &{}\ge &{} 0. \end{array} \right\} \end{aligned}$$
(4)

This problem is feasible as the all-zero solution is feasible. Using the reverse of the transformations discussed in Sect. 4, we can show that the dual of such an LP is equivalent to the following problem:

$$\begin{aligned} \left. \begin{array}{r@{\quad }rcl} \min &{} \sum _{j \in V} x_j &{} &{} \\ \forall j \in V, \forall S \in {\mathcal {C}}_j^{{\bar{y}}} &{} \sum _{i \in S} x_i - x_j &{}\le &{} |S| - 1 \ \\ \forall j \in V, \forall S \in {\mathcal {C}}^e_j {\setminus } {\mathcal {C}}_j^{{\bar{y}}} &{} \sum _{i \in S} x_i - x_j &{}\le &{} |S| \\ \forall j \in V &{} x_j &{}\ge &{} {\tilde{y}}_j \\ \forall j \in V &{} x_j &{}\le &{} 1 \\ \forall j \in V &{} x_j \ge 0. \end{array} \right\} \end{aligned}$$
(5)

The constraints with r.h.s. value |S| are redundant and can be dropped. As a result, with the same argument used for Proposition 3, the optimum value of (5) and therefore (4) is equal to \(\text {RI}_{x, \theta =\theta ^{{\bar{y}}}, \varphi =\varphi ^{{\bar{y}}}}({\tilde{y}})\). The statement then follows: this value is an upper bound for \(\text {RI}_{x, \theta , \varphi }({\tilde{y}})\) (because RI is a minimization problem and \(\theta =\theta ^{{\bar{y}}}, \varphi =\varphi ^{{\bar{y}}}\) is just a feasible solution), and is equal to \(\text {RI}_{x, \theta , \varphi }({\bar{y}})\) when evaluated at \({\tilde{y}} = {\bar{y}}\) because the optimal \(\theta , \varphi \) are \(\theta ^{{\bar{y}}}, \varphi ^{{\bar{y}}}\) by definition.\(\square \)

Theorem 4 shows a way to obtain valid upper bounds. We can exploit this in the following algorithm for (R-IMP). First, generate the extended collection \({\mathcal {C}}^e_j\) of activation sets for all \(j\in V\), using Algorithm 2. It is easy to see that for a given \({\bar{y}}\), each activation set \(S \in {\mathcal {C}}^e_j\) is valid in the range \([t_j, \min \{(1 + \varDelta _N) t_j, \sum _{i \in S} (w_{ij} - \varphi ^{{\bar{y}}}_{ij})\}]\), and it is minimal if the threshold is strictly greater than \(\sum _{i \in S} (w_{ij} - \varphi _{ij}^{{\bar{y}}}) - \min _{i \in S} \{w_{ij} - \varphi _{ij}^{{\bar{y}}}\}\). Then, implement problem (R-IMP-\({\bar{y}}\)) within a Branch-and-Bound solver for an arbitrary initial vector \({\bar{y}} \in \{0,1\}^n\). At every node of the Branch-and-Bound tree we do the following:

  • If the solution to the node LP is not integer, decide if \({\bar{y}}\) used in the first row of (R-IMP-\({\bar{y}}\)) to define the objective function should be updated, with an arbitrary decision rule.

  • If the solution to the node LP is integer, say, \({\tilde{y}}\), then:

    • If \({\bar{y}} = {\tilde{y}}\) for that node, accept the solution;

    • Otherwise, update \({\bar{y}} = {\tilde{y}}\) for that node and its potential children, and put the node back in the list of open nodes. (The solution \({\tilde{y}}\) is thus not accepted.)

Correctness of this algorithm follows by Theorem 4 and standard Branch-and-Bound arguments: we just need to observe that pruning decisions are correct, because an integer solution \({\tilde{y}}\) is only accepted if the corresponding LP has value \(\text {RI}_{x, \theta , \varphi }({\tilde{y}})\), and at every node we have a valid upper bound. There are several valid strategies for choosing when to replace the objective function that depends on \({\bar{y}}\). For example, it is reasonable to replace \({\bar{y}}\) every few levels of the Branch-and-Bound tree with a \({\bar{y}}\) (corresponding to a choice of \(\theta , \varphi \)) that improves the dual bound. This can be computed by solving a problem of the form \(\text {RI}_{x, \theta , \varphi }({\hat{y}})\) where \({\hat{y}}\) is, e.g., the current vector of upper bounds on the y variables: the cardinality constraint on y can be temporarily relaxed, since the goal is simply to obtain a choice of \(\theta , \varphi \) that can be used to compute the objective function. In practice, our implementation uses IBM ILOG CPLEX, and CPLEX does not allow changing the objective function in the course of Branch-and-Bound. Thus, we implement a variant of the above algorithm, based on restarts. This is described in Sect. 6.

Fig. 2  Graph discussed in the example related to Theorem 4. Node labels are indicated inside each node, node thresholds are indicated above and to the right of each node, and arc weights are indicated next to each arc

As every distinct \({\bar{y}}\) yields a different first row in (R-IMP-\({\bar{y}}\)), one may be tempted to keep all these constraints in the formulation, so that the objective function z is upper bounded by the tightest such constraint. However, the following example shows that this would break Theorem 4, in the sense that the linear programming relaxation of the problem would no longer yield valid dual bounds. Consider the graph depicted in Fig. 2. We assume that \(B_N = 0, B_A = 1, q = 2\). The extended collections of minimal activation sets for the three nodes that can be activated by other nodes are:

$$\begin{aligned} {\mathcal {C}}_3^e&= \{S_1 = \{1, 2\}\} \\ {\mathcal {C}}_4^e&= \{S_2 = \{1, 2\}, S_3 = \{1, 3\}, S_4 = \{2, 3\}, S_5 = \{1, 2, 3\}\} \\ {\mathcal {C}}_5^e&= \{S_6 = \{4\}\} \end{aligned}$$

We consider three different values of \({\bar{y}}\) (the other possibilities are irrelevant for the purposes of this example): (1, 1, 0, 0, 0), (0, 1, 1, 0, 0), (1, 0, 1, 0, 0). For brevity, in the following we will write these solutions as 5-digit binary strings. For each \({\bar{y}}\) there are multiple optimal solutions to the problem \(\text {RI}_{x, \theta , \varphi }({\bar{y}})\). In our example, we assume that the optimal solutions have the following values of \(\varphi \) (\(\theta _j\) is always 0 since \(B_N = 0\)) for each value of \({\bar{y}}\):

$$\begin{aligned} {\bar{y}} = 11000&~~\longmapsto ~~\varphi _{14} = 1 \\ {\bar{y}} = 01100&~~\longmapsto ~~\varphi _{24} = 1 \\ {\bar{y}} = 10100&~~\longmapsto ~~\varphi _{23} = 1. \end{aligned}$$

Notice that the activation sets \(S_2, S_3\) are no longer valid activation sets for node 4 when \(\varphi _{14} = 1\), as the sum of the corresponding arc weights does not exceed the threshold of node 4. Similarly, \(\varphi _{24} = 1\) disables activation set \(S_4\), and \(\varphi _{23} = 1\) disables activation set \(S_1\). Problem (R-IMP-\({\bar{y}}\)) with all the constraints corresponding to the three \({\bar{y}}\) indicated above reads:

$$\begin{aligned} \left. \begin{array}{r@{\quad }rcl} \max _{\pi , \mu , y} &{} z &{} &{} \\ ({\bar{y}} = 11000) &{} \pi _{S_1} + 2\pi _{S_2} + 2\pi _{S_3} + \pi _{S_4} + 2\pi _{S_5} + \sum _{j=1,\dots ,5} \mu _j &{}\ge &{} z \\ ({\bar{y}} = 01100) &{} \pi _{S_1} + \pi _{S_2} + \pi _{S_3} + 2\pi _{S_4} + 2\pi _{S_5} + \sum _{j=1,\dots ,5} \mu _j &{}\ge &{} z \\ ({\bar{y}} = 10100) &{} 2\pi _{S_1} + \pi _{S_2} + \pi _{S_3} + \pi _{S_4} + 2\pi _{S_5} + \sum _{j=1,\dots ,5} \mu _j &{}\ge &{} z \\ &{} \pi _{S_1} + \pi _{S_2} + \pi _{S_3} + \pi _{S_5} + \mu _1 &{}\le &{} 1\\ &{} \pi _{S_1} + \pi _{S_2} + \pi _{S_4} + \pi _{S_5} + \mu _2 &{}\le &{} 1\\ &{} \pi _{S_3} + \pi _{S_4} + \pi _{S_5} - \pi _{S_1} + \mu _3 &{}\le &{} 1\\ &{} \pi _{S_6} - \pi _{S_2} - \pi _{S_3} - \pi _{S_4} - \pi _{S_5} + \mu _4 &{}\le &{} 1\\ &{} - \pi _{S_6} + \mu _5 &{}\le &{} 1\\ \forall j=1,\dots ,5 &{} \sum _{k \in \delta ^+(j)} \sum _{S \in {\mathcal {C}}^e_k: j \in S} \pi _S - \sum _{S \in {\mathcal {C}}^e_j} \pi _S + \mu _j &{}\le &{} 1 \\ \forall j=1,\dots ,5 &{} y_j = 0 \Rightarrow \mu _j &{}=&{} 0 \\ &{} \sum _{j = 1,\dots ,5} y_j &{}=&{} 2 \\ \forall j=1,\dots ,6 &{} \pi _{S_j} &{}\le &{} 0 \\ \forall j=1,\dots ,5 &{} \mu _j &{}\ge &{} 0 \\ \forall j=1,\dots ,5 &{} y_j &{}\in &{} \{0,1\}. \end{array} \right\} \end{aligned}$$
(6)

To see where trouble might arise, consider the LP obtained by fixing \(y = 11000\). We take its dual as it makes the situation easier to understand:

$$\begin{aligned} \left. \begin{array}{r@{\quad }rcl} \min &{} x_1 + x_2 + x_3 + x_4 + x_5 &{} &{} \\ &{}x_1 + x_2 - x_3 - \gamma _1 - \gamma _2 -2 \gamma _3 &{}\le &{} 0 \\ &{}x_1 + x_2 - x_4 -2 \gamma _1 - \gamma _2 - \gamma _3 &{}\le &{} 0 \\ &{}x_1 + x_3 - x_4 -2 \gamma _1 - \gamma _2 - \gamma _3 &{}\le &{} 0 \\ &{}x_2 + x_3 - x_4 - \gamma _1 -2 \gamma _2 - \gamma _3 &{}\le &{} 0 \\ &{}x_1 + x_2 + x_3 - x_4 - 2\gamma _1 - 2\gamma _2 - 2\gamma _3 &{}\le &{} 0 \\ &{}x_4 - x_5 &{}\le &{} 0 \\ &{}x_1 -\gamma _1 - \gamma _2 - \gamma _3 &{}\ge &{} 0 \\ &{}x_2 -\gamma _1 - \gamma _2 - \gamma _3 &{}\ge &{} 0 \\ &{}\gamma _1 + \gamma _2 + \gamma _3 &{}\ge &{} 1 \\ &{} x, \gamma &{}\ge &{} 0. \end{array} \right\} \end{aligned}$$
(7)

In the dual (7), \(\gamma \) are the variables corresponding to the first three constraints in (6). The dual has a form similar to (5), as expected, and the constraints correspond to activation sets. However, due to the presence of multiple \({\bar{y}}\) constraints in (6), the r.h.s. vector of (7) is 0, and the problem is allowed to “choose” a r.h.s. vector using the \(\gamma \) variables. In contrast, the r.h.s. vector of (5) is fixed and corresponds to having a single nonzero \(\gamma \) variable equal to 1. Because of this, the value of (7) can be lower than \(\text {RI}_{x, \theta , \varphi }({\bar{y}})\), yielding an invalid bound (cf. Theorem 4). In this example, (7) has optimal objective function value 3.5, and the optimal variables are (we report only nonzeroes):

$$\begin{aligned} x_1 = 1, \; x_2 = 1, \; x_3 = 0.5, \; x_4 = 0.5, \; x_5 = 0.5, \; \gamma _1 = 0.5, \; \gamma _3 = 0.5. \end{aligned}$$

This is not a valid upper bound, because \(\text {RI}_{x, \theta , \varphi }(11000)\) has value 4, as can be easily checked: only node 3 is inactive at the optimum.

5.2 Column generation

As remarked in the preceding discussion, \({\mathcal {C}}^e_j\) may have exponentially large cardinality. However, its elements and the related variables can be generated via column generation. We describe the pricing problem for the LP relaxation of (R-IMP-\({\bar{y}}\)). Notice that since we have indicator constraints in (R-IMP-\({\bar{y}}\)), it is not immediately apparent how to obtain the LP relaxation. For all practical purposes, however, the indicator constraints can be treated as if they were big-M constraints that activate/deactivate the corresponding rows. In the specific case of (R-IMP-\({\bar{y}}\)), we can simply assume that if a variable \(y_j\) is fixed to 0 at the current Branch-and-Bound node, then \(\mu _j = 0\), and otherwise \(\mu _j \ge 0\). Thus, at any node of the Branch-and-Bound tree, we have access to the corresponding LP relaxation.

Since the variables \(\pi _{j,S}\) are defined for each activation set S of node j, we can study column generation by considering a single node \(j \in V\): the same reasoning applies to all nodes of the graph. To generate \(\pi _{j,S}\), we need to determine its activation set S. Call x the dual variables for the first set of constraints of (R-IMP-\({\bar{y}}\)) (i.e., the constraints for \(j \in V\), rather than the constraint used to bound the objective function variable z; we use x as these constraints correspond to the variables in (3)). The reduced cost of \(\pi _{j,S}\) in \({\mathcal {C}}^e_j\) is:

$$\begin{aligned} x_j - \sum _{i \in S} x_i - {\left\{ \begin{array}{ll}(|S|-1) &{} \text {if } S \in {\mathcal {C}}_j^{{\bar{y}}} \\ |S| &{} \text {if } S \in {\mathcal {C}}^e_j {\setminus } {\mathcal {C}}^{{\bar{y}}}_j \end{array}\right. } \end{aligned}$$

Since we are maximizing and \(\pi _{j,S} \le 0\), a variable may enter the basis if its reduced cost is negative. Therefore, to find if an entering column belonging to \({\mathcal {C}}^e_j\) exists, we should solve:

$$\begin{aligned} \max _{S \in {\mathcal {C}}^e_j} \sum _{i \in S} x_i - x_j + (|S|-1 + \mathbb {1}_{S \in {\mathcal {C}}^e_j {\setminus } {\mathcal {C}}_j^{{\bar{y}}}}). \end{aligned}$$
(8)

Here, \(\mathbb {1}_{f}\) is 1 if f is true and 0 otherwise. If the objective value is strictly positive, then the corresponding column \(\pi _{j,S}\), \(S \in {\mathcal {C}}^e_j\), should enter the LP. Problem (8) can be formulated as a MILP. We first rewrite the objective function as:

$$\begin{aligned} \sum _{i \in S} (x_i + 1) - x_j - 1 + \mathbb {1}_{S \in {\mathcal {C}}^e_j {\setminus } {\mathcal {C}}_j^{{\bar{y}}}}. \end{aligned}$$

Our aim is to determine \(S \in {\mathcal {C}}^e_j\), which is a subset of the nodes in \(\delta ^-(j)\). We introduce binary variables \(\psi _i, i \in \delta ^-(j)\) to select the nodes that will be included in S. We also need to determine if \(S \in C^e_j {\setminus } C_j^{{\bar{y}}}\); we use a binary variable \(\beta \) for this purpose. Furthermore, we only want to generate minimal activation sets: this is not necessary for correctness, but drastically reduces the set of columns. To model the minimality constraint, we use a decision variable \(\alpha = \min \nolimits _{i\in \delta ^-(j): \psi _i = 1} w_{ij} \) that takes the value of the smallest arc weight within the chosen activation set. We then obtain the following pricing problem for node \(j \in V\), where we use \({\bar{w}}_j = \max \nolimits _{i\in \delta ^-(j)}w_{ij}\):

[Pricing problem (PRICE-j); displayed as an image in the original.]

When solving the pricing problem, one has to be careful in dealing with degeneracy. If the solution to the LP is degenerate, there can be columns with positive reduced cost that do not change the LP solution. This could lead to a situation in which we keep generating the same set of columns because the dual costs do not change, thereby entering an endless loop. To avoid this problem, we simply keep track of all the generated columns and make sure that we do not generate them again by adding no-good cuts in the pricing problem. For example, if \({\bar{S}}\in C_j^e\) is an activation set generated by the pricing problem for node \(j \in V\), then we include the no-good cut \(\sum _{i \not \in {\bar{S}}} \psi _i + \sum _{i \in {\bar{S}}} (1-\psi _i) \ge 1\) in all subsequent pricing problems solved for the same node \(j \in V\). We note that, from a computational point of view, no-good cuts tend to be weak in practice, and their main purpose in our scheme is to avoid cycling, rather than improving the strength of the formulation for the pricing problem. Finally, we remark that to check if all necessary columns have been generated, one has to verify that the value of (PRICE-j) is nonpositive for all \(j \in V\).
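The effect of a no-good cut can be sketched by evaluating its left-hand side: the cut \(\sum _{i \not \in {\bar{S}}} \psi _i + \sum _{i \in {\bar{S}}} (1-\psi _i) \ge 1\) is violated only by the 0-1 point \(\psi \) equal to the indicator vector of \({\bar{S}}\), so it excludes exactly that previously generated column. The helper below is our own illustration.

```python
def no_good_cut_lhs(S_bar, psi):
    """Left-hand side of the no-good cut
        sum_{i not in S_bar} psi_i + sum_{i in S_bar} (1 - psi_i) >= 1.

    S_bar: previously generated activation set (set of in-neighbors)
    psi:   dict in-neighbor -> 0/1 value of the selection variable

    The l.h.s. counts the positions where psi differs from the
    indicator vector of S_bar; it is 0 iff psi encodes S_bar itself,
    so the cut >= 1 excludes exactly that one point.
    """
    return (sum(v for i, v in psi.items() if i not in S_bar)
            + sum(1 - psi[i] for i in S_bar))
```

This also makes plain why such cuts are weak: each one removes a single point of the pricing feasible region, which is enough to prevent cycling but does little to strengthen the formulation.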

The implementation of the algorithms described in this paper is based on the commercial software CPLEX, which does not allow adding columns at nodes of the Branch-and-Bound tree. Implementing an exact Branch-Cut-and-Price would require a different software framework. We decided instead to implement a heuristic version of column generation, still relying on CPLEX. This is described in the next section.

6 Implementation of the algorithms

Sections 5.1 and 5.2 describe exact algorithms that would require fine control over the Branch-and-Cut process: an implementation of those algorithms must be able to change rows of the node LPs, and add columns after solving the pricing problems. These capabilities are not available when using commercial MILP solvers. Nonetheless, we implemented an exact algorithm, following Sect. 5.1, under the assumption that all columns (i.e., activation sets) can be generated in preprocessing, as well as a heuristic Branch-Cut-and-Price that periodically performs a pricing step, but does not offer optimality guarantees. The implementation is described here.

We start with the implementation of the exact algorithm under the assumption that all sets in \({\mathcal {C}}_j^e\) are computed in a preprocessing step. We use this implementation on graphs of smaller size; it is called Exact in Sect. 7. It works as follows.

  1. [S1] Arbitrarily choose an initial \({\bar{y}}\) to set the first row of (R-IMP-\({\bar{y}}\)). Keep a lower bound L on the optimum; initially, \(L \leftarrow 0\).

  2. [S2] Solve (R-IMP-\({\bar{y}}\)) in a Branch-and-Bound framework, using a callback to intercept any integer solution \({\tilde{y}}\). At every integer solution \({\tilde{y}}\), do:

    • Solve \(\text {RI}_{x, \theta , \varphi }({\tilde{y}})\).

    • If \(\text {RI}_{x, \theta , \varphi }({\tilde{y}}) > L\), save \({\tilde{y}}\), update \(L \leftarrow \text {RI}_{x, \theta , \varphi }({\tilde{y}})\), update \({\bar{y}} \leftarrow {\tilde{y}}\) used to determine the first row of (R-IMP-\({\bar{y}}\)), and restart, i.e., go back to [S2].

    • If \(\text {RI}_{x, \theta , \varphi }({\tilde{y}}) < L\), reject the incumbent and continue.

    • Else, if the value of the current incumbent matches the value of \(\text {RI}_{x, \theta , \varphi }({\tilde{y}})\), then accept the incumbent and continue; otherwise, reject the incumbent and continue.

Correctness of this algorithm is a direct consequence of the discussion in Sect. 5.1: at every restart, either the Branch-and-Bound finds a better solution, or it terminates proving optimality of the best known solution. In practice, we allow an exception to the rule that forces a restart: if \(\text {RI}_{x, \theta , \varphi }({\tilde{y}}) > L\), but we have processed more than \(10^6\) nodes in the Branch-and-Bound tree, or we have processed at least \(10^5\) nodes and the dual bound has improved compared to the root node, we do not restart. Instead, we continue the Branch-and-Bound until termination. The rationale is that we have made enough progress in the search, and restarting would be too detrimental.
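The accept/reject/restart decision taken at each integer solution can be summarized as follows; this is a sketch of the rule above (ignoring the restart exception), not the actual CPLEX callback.

```python
def incumbent_action(ri_value, incumbent_value, L):
    """Decision at an integer solution y-tilde in Exact (Sect. 6).

    ri_value:        RI_{x,theta,phi}(y-tilde), the true robust spread
    incumbent_value: objective value of the incumbent in (R-IMP-y-bar)
    L:               best known lower bound on the optimum
    """
    if ri_value > L:
        return "restart"  # better solution: update L and y-bar, go to [S2]
    if ri_value < L:
        return "reject"   # incumbent dominated by the best known solution
    # ri_value == L: accept only if the node's value is exact for y-tilde
    return "accept" if incumbent_value == ri_value else "reject"
```

An incumbent is thus accepted only when its value in (R-IMP-\({\bar{y}}\)) coincides with the true robust spread, which is exactly the condition needed for the pruning argument of Sect. 5.1.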

Next, we describe the heuristic Branch-Cut-and-Price that we use on large graphs; this is called HeurCG in Sect. 7. The algorithm uses Exact as a subroutine. It requires parameters \(\texttt {nic}\) (number of initial columns), \(\texttt {mpi}\) (maximum number of “priced” nodes per iteration), \(\texttt {mci}\) (maximum number of column generation subproblems per iteration), \(\texttt {mcr}\) (maximum number of columns added per round), and it works as follows.

  • Generate an initial set of columns C, with \(|C|=\texttt {nic}\). To generate the initial set of columns we apply RecGenExt to each node \(j \in V\) with an upper bound k on the cardinality of the activation sets: we start with \(k = 2\), and iteratively increase k until we have generated at least \(\texttt {nic}\) columns, keeping all columns generated this way (so the final count may exceed \(\texttt {nic}\)).

  • Repeat the following until the time limit is reached:

    • Launch Exact using columns C, until it either finds a better solution \({\tilde{y}}\) and is about to restart, or terminates proving optimality of a solution \({\tilde{y}}\).

    • Perform a pricing step on the LP relaxation of (R-IMP-\({\bar{y}}\)), fixing \(y = {\tilde{y}}\) and using \({\bar{y}} = {\tilde{y}}\). A pricing step consists of several pricing iterations, where each iteration solves (PRICE-j) for \(\texttt {mpi}\times |V|\) randomly chosen nodes \(j \in V\). The pricing step repeats pricing iterations until: (i) the bound of the LP matches \(\text {RI}_{x, \theta , \varphi }({\tilde{y}})\), or (ii) we have performed \(\texttt {mci} \times |V|\) pricing iterations, or (iii) the number of columns generated in the current pricing step exceeds \(\texttt {mcr} \times |V|\).

    • Let \(C'\) be the newly generated columns. Set \(C \leftarrow C \cup C'\).

In our implementation we use the following values: \(\texttt {nic}=2500\), \(\texttt {mpi}=0.025\), \(\texttt {mci}=2\) and \(\texttt {mcr}=2\). This choice is motivated by the empirical analysis discussed in “Appendix C”.
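The overall structure of HeurCG can be sketched as follows (a minimal skeleton under stated assumptions: the four callables `rec_gen_ext`, `run_exact`, `price_node` and `lp_bound` are hypothetical interfaces standing in for RecGenExt, the Exact subroutine, the (PRICE-j) subproblem and the LP bound of (R-IMP-\({\bar{y}}\)), respectively):

```python
import random
import time

def heur_cg(V, rec_gen_ext, run_exact, price_node, lp_bound,
            time_limit, nic=2500, mpi=0.025, mci=2, mcr=2):
    """Heuristic Branch-Cut-and-Price skeleton.

    rec_gen_ext(j, k) -> columns for node j, activation sets of size <= k
    run_exact(C)      -> (incumbent value, incumbent y~) using columns C
    price_node(j, y)  -> new columns from (PRICE-j) at the current duals
    lp_bound(C, y)    -> LP bound of (R-IMP-ybar) with y fixed
    """
    # Initial columns: grow the cardinality bound k until >= nic columns.
    C, k = set(), 2
    while len(C) < nic:
        for j in V:
            C |= set(rec_gen_ext(j, k))
        k += 1
    deadline = time.monotonic() + time_limit
    while time.monotonic() < deadline:
        # Run Exact until it is about to restart or proves optimality.
        value, y = run_exact(C)
        new_cols, iters = set(), 0
        # Pricing iterations: stop when the LP bound matches the
        # incumbent value, after mci*|V| iterations, or after more than
        # mcr*|V| new columns.
        while (lp_bound(C | new_cols, y) > value
               and iters < mci * len(V)
               and len(new_cols) <= mcr * len(V)):
            for j in random.sample(list(V), max(1, round(mpi * len(V)))):
                new_cols |= set(price_node(j, y))
            iters += 1
        C |= new_cols
    return C
```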

7 Computational results

We test our approach on a large collection of graphs taken from the literature. All the proposed algorithms are written in Python using IBM ILOG CPLEX 12.8.0 as MILP solver. The code is available via GitHub, see [24]. We activate the numerical emphasis setting because of some difficulties faced during our numerical evaluation (on some graphs, model (DUAL-\(\theta 0\)-\(\varphi 0\)) can have solutions with very large values). Related to this, we remark that reformulating the indicator constraints with big-M constraints is likely to fail: we tried a similar approach, but the big-M values required for validity are too large in practice. All the experiments reported in this section are executed on a homogeneous cluster equipped with Xeon E7-4850 processors (2.00 GHz, 64 GB RAM).

We consider graphs belonging to the class SW (“small world”), described in Fischetti et al. [12], Watts and Strogatz [25]. We use two different test sets:

  • Small graphs.

    Number of nodes \(n \in \{50, 75, 100\}\), average node degree \(k\in \{4, 8\}\) for \(n=50\) and \(k\in \{8,12\}\) for \(n=75,100\), and rewiring probability \(b\in \{0.1, 0.3\}\). For each combination of settings, we used 5 random instances.

  • Large graphs.

    Number of nodes \(n \in \{2000, 5000\}\), average node degree \(k\in \{12, 16\}\) and rewiring probability \(b\in \{0.1, 0.3\}\). For each combination of settings, we used 5 random instances.

Therefore, we have a total of 60 small instances and 40 large instances. These graphs, as well as some additional instances, are freely available via GitHub at https://github.com/sartorg/robinmax [24]. We also tested a directed version of the Erdős–Rényi random graph of Erdős and Rényi [11]: the conclusions of the study are similar and are therefore not reported.
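The instance counts follow directly from the parameter grid above:

```python
# Small graphs: 3 values of n, and for each n, 2 values of k,
# 2 rewiring probabilities b, 5 random instances per combination.
small = 3 * 2 * 2 * 5
# Large graphs: 2 values of n, 2 of k, 2 of b, 5 random instances.
large = 2 * 2 * 2 * 5
print(small, large)  # -> 60 40
```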

Table 1 Exact algorithm on Small graphs with \(n=50\), averaged across instances of same average node degree k
Table 2 Exact algorithm on Small graphs with \(n=75\), averaged across instances of same average node degree k
Table 3 Exact algorithm on Small graphs with \(n=100\), averaged across instances of same average node degree k

We solve the robust and non-robust IMP on each graph using different sets of parameters and algorithms. For the Small Graphs, the number of seeds is chosen as a fraction \(s \in \{0.05, 0.10, 0.15\}\) of n, rounded up to an integer. The robustness budgets \(B_N\) and \(B_A\) are chosen from \(\{0, 1, 2\}\), where the case \(B_N = B_A = 0\) corresponds to the non-robust case (i.e., the deterministic linear threshold model). For the Large Graphs we use \(s \in \{0.10, 0.15\}\) and \(B_N, B_A\in \{0, 25, 50\}\). The maximum deviations \(\varDelta _N, \varDelta _A\) are always equal to 0.1.

Tables 1, 2 and 3 illustrate the results obtained on Small Graphs with Exact, where all extended collections of activation sets are generated in a preprocessing phase with Algorithm 2. The tables report the average computation time, the average gap, the fraction of instances solved to optimality and the average number of activation sets generated for each combination of parameters. In our tests CPLEX generates very few cuts – mostly implied bounds, and almost no cuts of other types. This may be related to the numerical difficulties mentioned above. The robust version seems harder than the non-robust counterpart, but at least for smaller instances the computation times are similar: for \(n=50\), whenever we can solve all deterministic instances, we can solve the corresponding robust problems as well. The difficulty of the instances increases rapidly with the number of nodes and with the number of seeds; this is not surprising and agrees with similar findings presented in Fischetti et al. [12]. The number of activation sets depends strongly on k; already for \(n=100, k=12\) there are several thousand activation sets, indicating that column generation is necessary to scale to larger instances. The fact that we can solve to optimality only around \(15\%\) of the instances with \(n=100\) shows that this problem is very difficult even at modest sizes.

On Large Graphs, we compare HeurCG with a multistart 2-opt heuristic. 2-opt works as follows: it randomly chooses a set of seed nodes \(S \subset V\) of the desired cardinality; then, it explores a neighborhood centered at S, defined as \((S {\setminus } \{u\}) \cup \{v\}\), where \(u \in S\) and \(v \in V {\setminus } S\), trying all possible such u and v. If an improving solution is found in the neighborhood, it accepts the new solution, and repeats the neighborhood search from the new S. If there is no better solution, it restarts by generating a new random S. This is repeated until the time limit is hit. The results reported here are for a 2-opt in which the neighborhood is always fully explored before moving to a new solution, so as to choose the largest improvement; we also tested an alternative version in which an improving move is accepted immediately after it is discovered, but its performance was slightly worse.
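The multistart 2-opt can be sketched as follows (a minimal sketch: `influence` stands for the black-box evaluation of a seed set, which in our setting is the robust influence value; its interface is a hypothetical stand-in):

```python
import random
import time

def two_opt(V, influence, s, time_limit, rng=random):
    """Multistart 2-opt for seed selection, best-improvement variant:
    the whole swap neighborhood is scanned before moving."""
    best_set, best_val = None, float('-inf')
    deadline = time.monotonic() + time_limit
    while time.monotonic() < deadline:
        S = set(rng.sample(list(V), s))          # random restart
        val = influence(S)
        improved = True
        while improved and time.monotonic() < deadline:
            improved = False
            # Scan all swaps (S \ {u}) | {v} and keep the best one.
            best_swap, best_swap_val = None, val
            for u in list(S):
                for v in set(V) - S:
                    cand = (S - {u}) | {v}
                    cand_val = influence(cand)
                    if cand_val > best_swap_val:
                        best_swap, best_swap_val = cand, cand_val
            if best_swap is not None:
                S, val, improved = best_swap, best_swap_val, True
        if val > best_val:
            best_set, best_val = set(S), val
    return best_set, best_val
```

With a toy additive influence oracle, e.g. `two_opt(range(6), lambda S: sum(S), s=2, time_limit=0.1)`, the search converges to the two largest-index nodes.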

Table 4 reports results obtained with the two heuristics. It shows the average value of the best incumbent found within a one-hour time limit for each combination of parameters. HeurCG clearly outperforms 2-opt in this metric: it finds a better solution on all 720 instances, and is thus significantly more effective. In fact, 2-opt typically finds a solution that attains influence only marginally larger than the number of seed nodes, while HeurCG achieves nontrivial improvements over that value.

Looking at the relation between the values of the best solutions and the robustness parameters, we see that the price of robustness is quite high: protecting against even a small amount of uncertainty significantly decreases the total number of activated nodes. We analyze this in more detail in the next subsection.

Table 4 Heuristic Comparison: best incumbent for SW graphs with \(n=2000, 5000\), averaged across instances of same average node degree k
Fig. 3
figure 3

The maximum values of \(\theta \) and \(\varphi \) are reported in (a) and (c). The number of nonzeros of \(\theta \) and \(\varphi \) are reported in (b) and (d). In each subfigure, the value of \(B_N\) increases from top to bottom and the value of \(B_A\) increases from left to right

7.1 The price of robustness: a case study

Our results in Sect. 3.2 indicate that the price to pay to protect the solution from changes in the node thresholds should be at least as high as that for changes in the arc weights. We performed a detailed study to verify the empirical behavior on our test set. We looked at a few of these cases; here we report only one, as the conclusions are similar, although we did not verify whether the entire set of test instances exhibits the same pattern.

We study a SW graph with the following characteristics: \(n=50\), \(k=4\), \(b=0.1\). We examine how the robustness budget affects the value of the optimum. We choose a range of values for \(B_N\) and \(B_A\) from 0 to 5, with an increment of 0.5, for a total of 121 possible combinations. We set \(\varDelta _A=1\) and \(\varDelta _N=\frac{B_N}{\min _j{t_j}}\), so that no upper bounds on the values of \(\theta \) and \(\varphi \) are imposed.

In Fig. 3 we report the maximum value of the \(\theta \) and \(\varphi \) variables, i.e., the \(\ell _\infty \)-norm of the corresponding vector, as well as the number of nonzero \(\theta \) and \(\varphi \) variables at the optimum. Each subtable shows the values of \(\Vert \theta \Vert _{\infty }\), \(\Vert \varphi \Vert _{\infty }\), # nonzero \(\theta \) and # nonzero \(\varphi \) for increasing values of \(B_A\) from left to right and for increasing values of \(B_N\) from top to bottom.

The values of \(\Vert \theta \Vert _{\infty }\) increase as \(B_N\) increases. In the majority of the cases \(\Vert \theta \Vert _{\infty }\) is equal to \(B_N\), and a similar observation holds for \(\Vert \varphi \Vert _{\infty }\) and \(B_A\). As a consequence, there are typically few nonzero \(\theta \) and \(\varphi \), indicating that the worst-case realization of uncertainty is represented by a situation in which very few nodes/arcs are severely affected. In Fig. 4 we show the change in the optimal value as \(B_N\) and \(B_A\) increase. The figure shows that a small amount of robustness is sufficient to significantly decrease the amount of influence attained. The change in the optimal values is remarkably symmetric in \(B_N\) and \(B_A\).

Fig. 4
figure 4

Optimal value of the problem for different values of \(B_N\) and \(B_A\)

8 Conclusions

We present an exact Branch-and-Cut algorithm and a heuristic Branch-Cut-and-Price algorithm for the influence maximization problem and its robust counterpart. The algorithms for the robust version of the problem can handle uncertainty in the arc weights and in the node thresholds. To solve the deterministic version of the problem we introduce a new formulation originating from a bilevel problem, exploiting the dual of some (integral) subproblems. The formulation is extended to the robust case with a row generation and replacement scheme, within a Branch-and-Cut framework. Since the number of decision variables in our formulation is exponential, we discuss a column generation scheme, and implement a heuristic Branch-Cut-and-Price. Numerical experiments show that we are able to solve to optimality the robust and the deterministic versions of the problem on small graphs; instances with 100 or more nodes are already challenging. However, we test our heuristic on graphs with up to 5000 nodes and find that it significantly outperforms a simple 2-opt heuristic.