1 Introduction

We address the capacity reservation problem, the problem of reserving link capacity in a network in order to support multiple possible flow patterns. We are given a description of the network and a collection of traffic demand scenarios, which specify the amount of flow to be routed from some sources to some sinks. We must reserve enough link capacity so that for any of these scenarios, there exist flows that route the traffic demand while using no more than the reserved capacity on each link. Subject to this requirement, we seek to minimize a linear reservation cost. This problem is also referred to as the network design problem.

The capacity reservation problem can be expressed as a linear program (LP), a fact first recognized by Gomory and Hu [1]. For moderate problem sizes, it can be solved using generic LP solvers. However, problems with a large number of scenarios (and especially those that do not fit in a single computer’s memory) are beyond the reach of such solvers.

We present an iterative method for solving the capacity reservation problem based on the alternating direction method of multipliers (ADMM). Each iteration of the algorithm involves solving, in parallel, a single minimum-cost flow problem for each scenario. This means that our method can easily exploit parallel computing architectures and can scale to enormous problem instances that do not fit in the memory of a single computer. Unlike general ADMM, our method maintains a feasible point at each iteration (and therefore an upper bound on the problem value). We can also compute a lower bound at modest cost, which can be used to bound the suboptimality of the current iterate, allowing for non-heuristic termination of the algorithm.

Previous work The optimal reservation problem has a long history and was first proposed by Gomory and Hu [1], who consider the case of a finite source set and a single commodity. In their extension to the multicommodity case, they propose a decomposition method based on duality [2]. They also note that the structure of the capacity reservation problem makes decomposition techniques attractive. Labbé et al. [3] propose two decomposition techniques, one of which is based on Lagrangian duality and foreshadows the method we propose here. Other decomposition approaches are based on cutting plane methods, which can be interpreted as using only a small subset of the source set and adding additional source vectors as necessary. Examples of this approach include the work of Minoux et al. [4] and Petrou et al. [5].

Several other source set descriptions are given in the literature, but are typically intractable. For example, when the source set is given by a polyhedron (described by a set of linear inequalities), the capacity reservation problem is known to be NP-hard [6]. However, for this and other cases, such as ellipsoidal and conic source sets, several effective heuristics exist. One tractable approach is to restrict the flows to be an affine function of the demands. This approach was first proposed by Ben-Ameur and Kerivin [7, 8] and is called static routing, oblivious routing, dynamic routing, or affine routing, depending on the form of the affine function used. For a summary, see [9]. Such a heuristic can be viewed as an application of affinely adjustable robust optimization and has been studied extensively in robust optimization; see [10].

In the case of a finite source set, the capacity reservation problem can be reduced to a convex optimization problem (indeed, a linear program). These problems are tractable, and mature software exists that can be used to solve them [11]. Our approach to the capacity reservation problem is based on a particular algorithm for solving convex optimization problems, called the alternating direction method of multipliers (ADMM) [12, 13], which is well-suited for solving large convex problems on distributed computers.

Outline We start by describing the capacity reservation problem in Sect. 2 and a simple heuristic for solving it. In Sect. 3, we provide a duality-based method for obtaining lower bounds on the problem value. In Sect. 4, we illustrate the heuristic, as well as the lower bounds, on two illustrative examples. In Sect. 5, we give an ADMM-based distributed algorithm for solving the capacity reservation problem. We conclude with a numerical example in Sect. 6.

2 Capacity Reservation Problem

We consider a single-commodity flow on a network. The network is modeled as a directed connected graph with n nodes and m edges, described by its incidence matrix \(A\in {\mathbb {R}}^{n\times m}\),

$$\begin{aligned} A_{ij} = \left\{ \begin{array}{rl} 1, &{} \text{ if } \text{ edge } j \text{ points } \text{ to } \text{ node } i\\ -1, &{} \text{ if } \text{ edge } j \text{ points } \text{ from } \text{ node } i\\ 0\phantom {,} &{} \text{ otherwise. } \end{array}\right. \end{aligned}$$

We let \(f\in {\mathbb {R}}^m\) denote the vector of edge flows, which we assume are nonnegative and limited by a given edge capacity, i.e., \(0\le f\le c\), where the inequalities are interpreted elementwise, and \(c\in {\mathbb {R}}^m\) is the vector of edge capacities. We let \(s \in {\mathbb {R}}^n\) denote the vector of sources, i.e., flows that enter (or leave, when negative) the node. Flow conservation at each node is expressed as \(Af+s=0\) (This implies that \({\mathbf {1}}^Ts=0\)). We say that f is a feasible flow for the source s if it satisfies \(Af+s=0\) and \(0\le f \le c\). For a given source vector s, finding a feasible flow (or determining that none exists) reduces to solving a linear program (LP).
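As a small concrete illustration (a toy instance of our own, not taken from the development above), the feasibility check can be posed directly to an LP solver; here we use scipy.optimize.linprog with a zero objective:

```python
import numpy as np
from scipy.optimize import linprog

# Toy directed graph: 3 nodes, 3 edges (1->2, 2->3, 1->3).
# Column j of A has +1 at the head of edge j and -1 at its tail.
A = np.array([[-1.0,  0.0, -1.0],
              [ 1.0, -1.0,  0.0],
              [ 0.0,  1.0,  1.0]])
c = np.ones(3)                    # edge capacities
s = np.array([1.0, 0.0, -1.0])    # one unit enters node 1 and leaves node 3

# Feasibility reduces to finding f with A f = -s and 0 <= f <= c;
# with a zero objective, linprog simply reports whether such an f exists.
res = linprog(np.zeros(3), A_eq=A, b_eq=-s, bounds=[(0.0, ci) for ci in c])
print(res.status)  # 0 means a feasible flow was found
```

A general-purpose LP solver suffices here; at larger scales, a dedicated min-cost flow solver would exploit the network structure.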

We consider a set of source vectors \({\mathcal {S}}\subset {\mathbb {R}}^n\), which we call the source set. A flow policy is a function \({\mathcal {F}}:{\mathcal {S}} \rightarrow {\mathbb {R}}^m\). We say that a flow policy is feasible if for each \(s\in {\mathcal {S}}\), \({\mathcal {F}}(s)\) is a feasible flow for s. Roughly speaking, a flow policy gives us a feasible flow for each possible source. Our focus is on the selection of a flow policy, given the network (i.e., A and c) and a description of the source set \({\mathcal {S}}\).

A reservation (or reserved capacity) is a vector \(r\in {\mathbb {R}}^m\) that satisfies \({0 \le r \le c}\). We say that the reservation r supports a flow policy \({\mathcal {F}}\) if for every source vector \(s\in {\mathcal {S}}\), we have \({\mathcal {F}}(s) \le r\). Roughly speaking, we reserve a capacity \(r_j\) on edge j; then, for any possible source s, we can find a feasible flow f that satisfies \(f_j \le r_j\), i.e., it does not use more than the reserved capacity on each edge. The cost of the reservation r is \(p^Tr\), where \(p\in {\mathbb {R}}^m\), \(p\ge 0\) is the vector of edge reservation prices. The optimal capacity reservation (CR) problem is to find a flow policy \({\mathcal {F}}\) and a reservation r that supports it, while minimizing the reservation cost.

The CR problem can be written as

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} p^T r \\ \text{ subject } \text{ to } &{} A{\mathcal {F}}(s) + s = 0, \quad 0 \le {\mathcal {F}}(s) \le r, \quad \forall s\in {\mathcal {S}} \\ &{} r \le c. \end{array} \end{aligned}$$
(1)

The variables are the reservation \(r\in {\mathbb {R}}^m\) and the flow policy \({\mathcal {F}}:{\mathcal {S}} \rightarrow {\mathbb {R}}^m\). The data are A, \({\mathcal {S}}\), c, and p. This problem is convex, but when \({\mathcal {S}}\) is infinite, it is infinite dimensional (since the variables include a function on an infinite set) and contains a semi-infinite constraint, i.e., linear inequalities indexed by an infinite set. We let \(J^\star \) denote the optimal value of the CR problem.

We will focus on the special case when \({\mathcal {S}}\) is finite, \({\mathcal {S}} = \{s^{(1)},\dots ,s^{(K)}\}\). We can think of \(s^{(k)}\) as the source vector in the kth scenario. The CR problem (1) is tractable in this case, indeed, an LP:

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} p^T r \\ \text{ subject } \text{ to } &{} Af^{(k)} + s^{(k)} = 0, \quad 0 \le f^{(k)} \le r, \quad k = 1,\dots ,K,\\ &{} r \le c. \end{array} \end{aligned}$$
(2)

The variables are the reservations \(r\in {\mathbb {R}}^m\) and the scenario flow vectors \(f^{(k)}\in {\mathbb {R}}^m\) for each scenario k. This LP has \(m(K+1)\) scalar variables, nK linear equality constraints, and \(m(2K+1)\) linear inequality constraints. In the remainder of the paper, we assume the problem data are such that this problem is feasible, which occurs if and only if for each scenario there is a feasible flow (This can be determined by solving K independent LPs).
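For moderate sizes, the LP (2) can be assembled and handed to a generic solver by stacking the variables as \(x = (r, f^{(1)},\dots ,f^{(K)})\). The following sketch (ours, using a dense formulation that is only sensible at small scale) does this with scipy.optimize.linprog, on a toy instance of our own:

```python
import numpy as np
from scipy.optimize import linprog

def solve_cr(A, S, c, p):
    """Solve the CR problem (2) as one LP over x = (r, f^(1), ..., f^(K)).

    A: n-by-m incidence matrix, S: n-by-K matrix of scenario sources,
    c: edge capacities, p: reservation prices.
    Returns (r, F), with the scenario flows as the columns of F."""
    n, m = A.shape
    K = S.shape[1]
    nv = m * (K + 1)
    cost = np.concatenate([p, np.zeros(m * K)])     # objective p^T r
    # Equality constraints A f^(k) = -s^(k), one block per scenario.
    A_eq = np.zeros((n * K, nv))
    for k in range(K):
        A_eq[n * k:n * (k + 1), m * (k + 1):m * (k + 2)] = A
    b_eq = -S.T.reshape(-1)
    # Inequality constraints f^(k) - r <= 0.
    A_ub = np.zeros((m * K, nv))
    for k in range(K):
        A_ub[m * k:m * (k + 1), :m] = -np.eye(m)
        A_ub[m * k:m * (k + 1), m * (k + 1):m * (k + 2)] = np.eye(m)
    b_ub = np.zeros(m * K)
    # Bounds 0 <= r <= c; the flows satisfy f <= r <= c implicitly.
    bounds = [(0.0, ci) for ci in c] + [(0.0, None)] * (m * K)
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds)
    assert res.status == 0, "problem is infeasible"
    r = res.x[:m]
    F = res.x[m:].reshape(K, m).T
    return r, F

# Toy instance: 3 nodes, edges 1->2, 2->3, 1->3, and two scenarios.
A = np.array([[-1.0, 0.0, -1.0], [1.0, -1.0, 0.0], [0.0, 1.0, 1.0]])
S = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
c, p = np.ones(3), np.ones(3)
r, F = solve_cr(A, S, c, p)
print(p @ r)  # optimal reservation cost; 2.0 for this instance
```

The dense constraint matrices here grow as \(mK\times m(K+1)\), which is exactly why the distributed method of Sect. 5 becomes attractive for large K.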

We note that a solution to (2) is also a solution for the problem (1) with the (infinite) source set \({\mathcal {S}} = {\mathbf{Conv}}\{s^{(1)},\ldots ,s^{(K)}\}\), where \({\mathbf{Conv}}\) denotes the convex hull. In other words, a reservation that is optimal for (2) is also optimal for the CR problem in which the finite source set \({\mathcal {S}}\) is replaced with its convex hull. To see this, we note that the optimal value of (2) is a lower bound on the optimal value of (1) with \({\mathcal {S}}\) the convex hull, since the feasible set of the former includes the feasible set of the latter. But we can extend a solution of the LP (2) to a feasible point for the CR problem (1) with the same objective, which therefore is optimal. To create such an extension, we define \({\mathcal {F}}(s)\) for any \(s \in {\mathcal {S}}\). We define \(\theta (s)\) as the unique least-Euclidean-norm vector with \(\theta (s) \ge 0\), \({\mathbf {1}}^T\theta (s) =1\) (with \({\mathbf {1}}\) the vector with all entries one) that satisfies \(s = \sum _k \theta _k(s) s^{(k)}\) (These are barycentric coordinates). We then define \({\mathcal {F}}(s) = \sum _{k=1}^K \theta _k(s) f^{(k)}\).

We note for future use another form of the CR problem (2). We eliminate r using \(r = \max _k f^{(k)}\), where the max is understood to apply elementwise. Then, we obtain the problem

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} p^T \max _k f^{(k)} \\ \text{ subject } \text{ to } &{} Af^{(k)} + s^{(k)} = 0, \quad 0 \le f^{(k)} \le c, \quad k = 1,\ldots ,K, \end{array} \end{aligned}$$
(3)

where the variables are \(f^{(k)}\), \(k=1, \ldots , K\). This is a convex optimization problem.

We will also express the CR problem using matrices, as

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} p^T \max (F) \\ \text{ subject } \text{ to } &{} AF + S = 0, \\ &{} 0 \le F \le C \end{array} \end{aligned}$$
(4)

with variable \(F\in {\mathbb {R}}^{m\times K}\). The columns of F are the flow vectors \(f^{(1)},\ldots ,f^{(K)}\). The columns of S are the source vectors \(s^{(1)}, \ldots , s^{(K)}\), and the columns of C are all the capacity vector c, i.e., \(C = c{\mathbf {1}}^T\), where \({\mathbf {1}}\) is the vector with all entries one. The inequalities above are interpreted elementwise, and the \(\max \) of a matrix is taken over the rows, i.e., it is the vector containing the largest element in each row.

Heuristic flow policy We mention a simple heuristic for the reservation problem that foreshadows our method. We choose each scenario flow vector \(f^{(k)}\) as a solution to the capacitated minimum-cost flow problem

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} p^T f \\ \text{ subject } \text{ to } &{} Af + s^{(k)} = 0, \quad 0 \le f \le c, \end{array} \end{aligned}$$
(5)

with variable \(f\in {\mathbb {R}}^m\). These flow vectors are evidently feasible for (3), and therefore the associated objective value \(J^\mathrm {heur}\) is an upper bound on the optimal value of the problem, i.e.,

$$\begin{aligned} J^\star \le J^\mathrm {heur} = p^T \max _k f^{(k)}. \end{aligned}$$

Note that finding the flow vectors using this heuristic involves solving K smaller LPs independently, instead of the one large LP (2). This heuristic flow policy greedily minimizes the cost for each source separately, but it does not coordinate the flows for the different sources to reduce the reservation cost.
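The heuristic amounts to K independent min-cost flow LPs. A sketch (ours), again on a toy 3-node instance of our own:

```python
import numpy as np
from scipy.optimize import linprog

def heuristic_policy(A, S, c, p):
    """Heuristic flow policy: solve the min-cost flow LP (5) independently
    for each scenario.  Returns (F, J_heur), where column k of F is f^(k)
    and J_heur = p^T max_k f^(k) is the resulting upper bound on J*."""
    m, K = A.shape[1], S.shape[1]
    F = np.zeros((m, K))
    for k in range(K):
        res = linprog(p, A_eq=A, b_eq=-S[:, k],
                      bounds=[(0.0, ci) for ci in c])
        assert res.status == 0, "no feasible flow for scenario %d" % k
        F[:, k] = res.x
    return F, p @ F.max(axis=1)

# Toy instance (ours): 3 nodes, edges 1->2, 2->3, 1->3, two scenarios.
A = np.array([[-1.0, 0.0, -1.0], [1.0, -1.0, 0.0], [0.0, 1.0, 1.0]])
S = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
c, p = np.ones(3), np.ones(3)
F, J_heur = heuristic_policy(A, S, c, p)
print(J_heur)  # upper bound on the optimal reservation cost
```

On this toy instance the heuristic happens to be optimal; Sect. 4.2 constructs instances where it is worse by nearly a factor of K.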

We will now show that we also have

$$\begin{aligned} J^\mathrm {heur}/K \le J^\star , \end{aligned}$$

i.e., \(J^\mathrm {heur}/K\) is a lower bound on \(J^\star \). This shows that the heuristic policy is K-suboptimal, that is, it achieves an objective value within a factor of K of the optimal objective value. We first observe that for any nonnegative vectors \(f^{(k)}\), we have

$$\begin{aligned} (1/K) \sum _{k=1}^K p^T f^{(k)} = p^T \left( (1/K)\sum _{k=1}^K f^{(k)}\right) \le p^T \max _{k} f^{(k)}, \end{aligned}$$
(6)

which follows from the fact that the maximum of a set of K numbers is at least as large as their mean. Now minimize both sides over the feasible set of (2); the inequality is preserved, so the minimum of the left-hand side is at most \(J^\star \). The left-hand side is separable, and its minimum is attained by the heuristic flows, with value \((1/K)\sum _{k=1}^K p^T f^{(k)}\). Since these flows are nonnegative, \(\sum _{k=1}^K f^{(k)} \ge \max _k f^{(k)}\) holds elementwise, so this value is at least \(p^T \max _k f^{(k)}/K = J^\mathrm {heur}/K\), which gives the bound.

In Sect. 4.2, we give an example showing that this bound is tight, i.e., the heuristic flow policy can indeed produce an objective value (nearly) K times the optimal value.

3 Lower Bounds and Optimality Conditions

The heuristic flow policy described above provides an upper and a lower bound on \(J^\star \). These bounds are computationally appealing because they involve solving K small LPs independently. Here, we extend the basic heuristic method to one in which the flows for the different scenarios are computed using different prices. This gives a parametrized family of lower and upper bounds on \(J^\star \). Using ideas based on linear programming duality, we can show that this family is tight, i.e., there is a choice of prices for each scenario that yields lower and upper bounds equal to \(J^\star \). This duality idea will form the basis for our distributed solution of (2), as described in Sect. 5.

Scenario prices We consider nonnegative scenario pricing vectors \(\pi ^{(k)}\in {\mathbb {R}}^m\), for \(k=1,\ldots , K\), satisfying \(\sum _{k=1}^K \pi ^{(k)} = p\).

Then, we have

$$\begin{aligned} p^T \max _k f^{(k)} \ge \sum _{k=1}^K \pi ^{(k)T} f^{(k)}. \end{aligned}$$
(7)

It follows that the optimal value of the LP

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} \sum _{k=1}^K \pi ^{(k)T} f^{(k)} \\ \text{ subject } \text{ to } &{} Af^{(k)} + s^{(k)} = 0, \quad 0 \le f^{(k)} \le c, \quad k = 1,\ldots ,K \end{array} \end{aligned}$$
(8)

is a lower bound on \(J^\star \). This problem is separable; it can be solved by solving K small LPs (capacitated flow problems) independently. In other words, to obtain the bound, we first decompose the reservation prices on each edge into scenario prices. Then, for each scenario k, the flow \(f^{(k)}\) is chosen (independently of the other scenarios) to be a minimum-cost flow according to price vector \(\pi ^{(k)}\). The scenario price vectors can be interpreted as a decomposition of the reservation price vector, i.e., the price of reservation along each edge can be decomposed into a collection of flow prices for that edge, with one for each scenario. Finally, we note that this lower bounding property can also be derived as a special case of linear programming duality.
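A sketch (ours) of this lower-bound computation on a toy instance; the particular scenario price matrices below are our own choices, with rows summing to p as required:

```python
import numpy as np
from scipy.optimize import linprog

def lower_bound(A, S, c, Pi):
    """Evaluate the lower bound (8): for each scenario k, solve a min-cost
    flow with price vector pi^(k) (column k of Pi) and sum the optimal
    values.  Pi must be nonnegative with rows summing to p."""
    K = S.shape[1]
    total = 0.0
    for k in range(K):
        res = linprog(Pi[:, k], A_eq=A, b_eq=-S[:, k],
                      bounds=[(0.0, ci) for ci in c])
        assert res.status == 0
        total += res.fun
    return total

# Toy instance (ours): 3 nodes, edges 1->2, 2->3, 1->3, two scenarios.
A = np.array([[-1.0, 0.0, -1.0], [1.0, -1.0, 0.0], [0.0, 1.0, 1.0]])
S = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
c, p = np.ones(3), np.ones(3)

lb_uniform = lower_bound(A, S, c, np.column_stack([p / 2, p / 2]))
Pi_better = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])  # rows sum to p
lb_better = lower_bound(A, S, c, Pi_better)
print(lb_uniform, lb_better)  # 1.0 and 2.0; the latter matches J* here
```

This illustrates the tightness discussed below: uniform prices reproduce the heuristic bound, while a better price split closes the gap entirely on this instance.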

Note that the inequality (6) is a special case of (7), obtained with scenario prices \(\pi ^{(k)} = (1/K)p\). In fact, with these scenario prices, the heuristic subproblem (5) is also just a special case of (8), i.e., the family of lower bounds obtained by solving (8) with different scenario pricing vectors includes the single lower bound obtained using the heuristic method.

Given a set of scenario prices, the flows found as above are feasible for the CR problem. Therefore, \(p^T \max _k f^{(k)}\) is an upper bound on \(J^\star \).

Tightness of the bound and optimality conditions There exist some scenario prices for which the optimal value \(J^\star \) of the CR problem (2) and its relaxation (8) are the same. This is a consequence of linear programming duality; a (partial) dual of (2) is the problem of determining the scenario pricing vectors that maximize the optimal value of (8). In addition, given such scenario prices, any optimal flow policy for (2) is also optimal for (8) with these scenario prices.

We can formalize this notion as follows. A flow policy \(f^{(k)}\) is optimal if and only if there exists a collection of scenario prices \(\pi ^{(k)}\) such that the following conditions hold.

  1.

    The vectors \(\pi ^{(k)}\) must be valid scenario prices, i.e.,

    $$\begin{aligned} \sum _{k=1}^K \pi ^{(k)}_j = p_j, \quad \pi ^{(k)}_j \ge 0, \quad j=1,\ldots ,m, \quad k=1,\ldots ,K. \end{aligned}$$
    (9)
  2.

    The inequality (7) must be tight:

    $$\begin{aligned} p^T \max _k f^{(k)} = \sum _{k=1}^K \pi ^{(k)T} f^{(k)}. \end{aligned}$$
  3.

    The flow vectors \(f^{(k)}\), for \(k=1,\ldots ,K\), must be optimal for the minimum-cost flow problem (5), with the price vector p replaced by the scenario price vector \(\pi ^{(k)}\).

Condition 2 has the interpretation that the optimal reservation cost can be written as a sum of K cost terms corresponding to the scenarios, i.e., the reservation cost can be allocated to the K scenarios. This can be made the basis for a payment scheme, in which scenario k is charged with a fraction \(\pi ^{(k)T} f^{(k)}\) of the total reservation cost. Note that because the scenario prices that show optimality of a collection of scenario flow vectors are not necessarily unique, these cost allocations may not be unique.

Condition 2 also implies the complementarity condition

$$\begin{aligned} \pi _j^{(k)} \big (f^{(k)}_j - r_j\big ) = 0 \end{aligned}$$

for all k and j. In other words, a positive scenario price on an edge implies that the corresponding scenario edge flow is equal to the optimal reservation, i.e., this scenario edge flow must contribute to the reservation cost. Similarly, if a scenario edge flow is not equal to that edge’s reservation, then the scenario price must be zero for that edge.

Condition 3 means that, given the scenario prices, the flow vector for each scenario is a minimum-cost flow using the scenario price vector associated with that scenario. We can interpret the price vectors as weighting the contribution of a scenario edge flow to the total reservation cost. In the case that \(\pi _j^{(k)}=0\), the flow over edge j under scenario k does not contribute to the total reservation cost. If \(\pi _j^{(k)}>0\), then the flow over edge j under scenario k contributes to the total reservation cost, and from before, we have \(f^{(k)}_j = r_j.\) Additionally, if \(\pi ^{(k)} = 0\) for some k, then scenario k is irrelevant, i.e., removing it has no effect on the reservation cost. If instead \(\pi ^{(k)} = p\), we conclude that scenario k is the only relevant scenario.

4 Examples

4.1 Illustrative Example

Here, we consider a simple example with \(n = 5\) nodes, \(m = 10\) edges, and \(K = 8\) scenarios. The graph was generated randomly, from a uniform distribution on all connected graphs with 5 nodes and 10 edges, and is shown in Fig. 1. We use price vector \(p = {\mathbf {1}}\) and capacity vector \(c = {\mathbf {1}}\). The 8 scenario source vectors were chosen according to \(s^{(k)} = -Az^{(k)}\), where each element of \(z^{(k)}\) was randomly chosen from \(\{0,1/3,2/3,1\}\). This guarantees that there is a feasible flow for each scenario.
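The construction \(s^{(k)} = -Az^{(k)}\) guarantees feasibility because \(z^{(k)}\) is itself a feasible flow for \(s^{(k)}\). A minimal check (with a toy incidence matrix of our own, not the randomly generated graph of this example):

```python
import numpy as np

# If s = -A z with 0 <= z <= c, then f = z is a feasible flow for s:
# A f + s = A z - A z = 0, and 0 <= f <= c.
rng = np.random.default_rng(0)
A = np.array([[-1.0, 0.0, -1.0], [1.0, -1.0, 0.0], [0.0, 1.0, 1.0]])
c = np.ones(3)
z = rng.choice([0.0, 1 / 3, 2 / 3, 1.0], size=3)
s = -A @ z
residual = A @ z + s   # flow conservation holds exactly by construction
```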

Fig. 1
figure 1

Graph of the illustrative example

The optimal reservation cost is 6.0, and the objective of the heuristic policy is 7.6 (The lower bound from the heuristic policy is 2.3). The optimal and heuristic flow policies are shown in Fig. 2. The upper plot shows the optimal policy, and the lower plot shows the heuristic policy. For each plot, the bars show the flow policy; the 10 groups are the edges, and the 8 bars are the edge flows under each scenario. The line above each group of bars is the reservation for that edge.

Optimal scenario prices are given in Table 1. First, we confirm that the row sums are indeed p. The fact that only some scenario prices are positive for each edge indicates that no scenario is relevant for every edge, i.e., there is no single scenario that is being planned for. However, for every scenario, the scenario prices are positive for at least one edge, which means that every scenario is potentially relevant to the optimal reservation.

Fig. 2
figure 2

Optimal and naive flow policies

Table 1 Scenario prices \(\pi ^{(1)},\dots ,\pi ^{(8)}\). Column k in the table is the scenario price vector \(\pi ^{(k)}\)

4.2 K-Suboptimal Example

In Sect. 2, we showed that the heuristic method produces an objective value not more than K times the optimal value. The previous example showed that the heuristic approach is not, in fact, optimal. Here, we give an example showing that this bound is in fact tight, i.e., for every positive integer K, there is an instance of (2) for which the optimal flow policy outperforms the heuristic solution by a factor of (nearly) K.

We consider a graph with \(n = 2a+1\) nodes, for some positive integer a, with the nodes arranged in three layers, with a, a, and 1 nodes, respectively (We show the graph for \(a=3\) in Fig. 3). There is an edge connecting every node in the first layer with every node in the second layer, i.e., for every \(i = 1,\dots ,a\) and every \(j = a+1,\dots , 2a\), the edge \((i,j)\) is in the graph. The price of this edge is \(\epsilon \) if \(j=i+a\), and \(2\epsilon \) otherwise. Similarly, every node in the second layer connects to the last node, i.e., for every \(i=a+1,\dots ,2a\), the edge \((i,2a+1)\) is in the graph. The price for this edge is 1.

Fig. 3
figure 3

Graph example for \(a=3\). The labels give the reservation price for that edge

We consider \(K = a\) scenarios. In each scenario, one unit of flow is injected into a single node in the first layer and must be routed to the last node. In other words, the source vector \(s^{(k)}\) is all zeros, except for the kth element, which is 1, and the last element, which is \(-1\).

A feasible (but not necessarily optimal) point for the CR problem (2) is to route all flow equally through node \(a+1\), the first node in the second layer. We must then reserve one unit of capacity for edge \((a+1, 2a+1)\) and one unit of capacity from each node in the first layer to node \(a+1\), i.e., for edges \((i,a+1)\), for \(i=1,\dots , a\). The total reservation cost is \((2a - 1)\epsilon +1 = (2K - 1)\epsilon +1\).

We now consider the (unique) heuristic solution. The solution to (5) under scenario k is to route all flow from node k to node \(a+k\), and finally to node \(2a+1\). Because no edges support flow under any two different scenarios, the heuristic therefore achieves a reservation cost of \(K(\epsilon +1)\).

As \(\epsilon \rightarrow 0\), the ratio of the costs of the feasible reservation for (2) and the heuristic reservation approaches K. Because the feasible reservation provides an upper bound on the optimal reservation cost, the ratio between the optimal reservation and the heuristic reservation also approaches K.
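This construction is easy to verify numerically. The sketch below (ours) builds the layered graph with 0-indexed nodes, runs the heuristic of Sect. 2 via scipy.optimize.linprog, and compares its cost with the feasible reservation described above:

```python
import numpy as np
from scipy.optimize import linprog

def layered_instance(a, eps):
    """The K-suboptimal construction with K = a: layers of a, a, and 1
    nodes; edge (i, a+j) costs eps if j == i and 2*eps otherwise, and
    each edge from the second layer to the sink costs 1."""
    n = 2 * a + 1
    edges, prices = [], []
    for i in range(a):
        for j in range(a):
            edges.append((i, a + j))
            prices.append(eps if j == i else 2 * eps)
    for j in range(a):
        edges.append((a + j, 2 * a))
        prices.append(1.0)
    A = np.zeros((n, len(edges)))
    for e, (u, v) in enumerate(edges):
        A[u, e], A[v, e] = -1.0, 1.0
    S = np.zeros((n, a))
    for k in range(a):
        S[k, k], S[2 * a, k] = 1.0, -1.0
    return A, S, np.ones(len(edges)), np.array(prices)

a, eps = 4, 1e-3
A, S, c, p = layered_instance(a, eps)
F = np.zeros((A.shape[1], a))
for k in range(a):        # heuristic: one min-cost flow per scenario
    res = linprog(p, A_eq=A, b_eq=-S[:, k], bounds=[(0.0, ci) for ci in c])
    F[:, k] = res.x
J_heur = p @ F.max(axis=1)        # equals K * (eps + 1)
J_feas = (2 * a - 1) * eps + 1    # routing all scenarios via node a+1
print(J_heur / J_feas)            # approaches K = a as eps -> 0
```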

5 Distributed Solution Method

In this section, we provide a parallelizable algorithm, based on the alternating direction method of multipliers (ADMM), to solve (2) (For details on ADMM, see [13]). We first express the problem (4) in the consensus form

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} p^T \max ({\tilde{F}}) + g(F) \\ \text{ subject } \text{ to } &{} F = {\tilde{F}} \end{array} \end{aligned}$$
(10)

with variables \(F\in {\mathbb {R}}^{m\times K}\) and \({\tilde{F}}\in {\mathbb {R}}^{m\times K}\). The function g is the indicator function for feasible flows:

$$\begin{aligned} g(F) = \left\{ \begin{array}{cl} 0, &{} \text{ if } AF + S = 0 \text{ and } 0 \le F \le C \\ \infty \phantom {,} &{} \text{ otherwise. } \end{array}\right. \end{aligned}$$

In the consensus problem (10), we have replicated the flow matrix variable and added a consensus constraint, i.e., the requirement that the two variables representing flow policy must agree.

The augmented Lagrangian for problem (10) is

$$\begin{aligned} L(F,{\tilde{F}}, \varPi ) = p^T \max ({\tilde{F}}) + g(F) + \mathbf{Tr}\,\varPi ^T (F - {\tilde{F}}) + (\rho /2) \Vert F - {\tilde{F}} \Vert _F^2, \end{aligned}$$

where \(\Vert \cdot \Vert _F\) is the Frobenius norm and \(\rho > 0\) is a parameter. Here, \(\varPi \in {\mathbb {R}}^{m\times K}\) is the dual variable.

ADMM The ADMM algorithm (with over-relaxation) for (10) is given below. First, we initialize the iterates \({\tilde{F}}(0)\) and \(\varPi (0)\). Then, we carry out the following steps:

$$\begin{aligned} F(l+1)&= \mathop {\hbox {argmin}}\limits _{F} L\big (F, {\tilde{F}}(l), \varPi (l)\big ) \\ {\tilde{F}}(l+1)&= \mathop {\hbox {argmin}}\limits _{{\tilde{F}}} L\big (\alpha F(l+1) + (1-\alpha ) {\tilde{F}}(l), {\tilde{F}}, \varPi (l)\big ) \\ \varPi (l+1)&= \varPi (l) + \rho \big (\alpha F(l+1) + (1-\alpha ) {\tilde{F}}(l) - {\tilde{F}}(l+1)\big ), \end{aligned}$$

where \(\alpha \) is an algorithm parameter in (0, 2); the argument l is the iteration number. For reasons that will become clear, we call these three steps the flow policy update, the reservation update, and the price update, respectively.

Convergence Given our assumption that (2) is feasible, we have \(F(l)\rightarrow F^\star \), \({\tilde{F}}(l)\rightarrow F^\star \), and \(\varPi (l) \rightarrow \varPi ^\star \), where \((F^\star ,\varPi ^\star )\) is a primal–dual solution to (4). This follows from standard results on ADMM [13].

Flow policy update Minimizing the Lagrangian over F requires minimizing a quadratic function over the feasible set of (4). This can be interpreted as the solution of a minimum-cost flow problem for each of the K scenarios. More specifically, column k of the matrix \(F(l+1)\) is the unique solution to the quadratic minimum-cost flow problem:

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} \pi ^{(k)T} f + (\rho /2)\Vert f - \tilde{f}^{(k)}\Vert ^2 \\ \text{ subject } \text{ to } &{} Af + s^{(k)} = 0 \\ &{} 0 \le f \le c. \end{array} \end{aligned}$$
(11)

Here, \({\tilde{f}}^{(k)}\) and \(\pi ^{(k)}\) are the kth columns of the matrices \({\tilde{F}}(l)\) and \(\varPi (l)\), respectively. These K flow problems can be solved in parallel to update the K columns of the iterate \(F(l+1)\). Note that the objective can be interpreted as a sum of the edge costs, using the estimate of the scenario prices, plus a quadratic regularization term that penalizes deviation of the flow vector from the previous flow values \({\tilde{f}}^{(k)}\).

Reservation update Minimizing the Lagrangian over \({\tilde{F}}\) requires minimizing the sum of a quadratic function of \({\tilde{F}}\) plus a positive weighted sum of its row-wise maxima. This step decouples into m parallel steps. More specifically, the jth row of \({\tilde{F}}(l+1)\) is the unique solution to

$$\begin{aligned} \begin{array}{ll} \text{ minimize } &{} p_j \max \big (\tilde{f}\big ) - \pi _j^T \tilde{f} + (\rho /2) \left\| \tilde{f} - \big (\alpha f_j + (1-\alpha ) {\tilde{f}}_j\big ) \right\| ^2, \\ \end{array} \end{aligned}$$
(12)

where \({\tilde{f}}\) is the variable, and \(f_j\), \({\tilde{f}}_j\), and \(\pi _j\) are the jth rows of \(F(l+1)\), \({\tilde{F}}(l)\), and \(\varPi (l)\), respectively (The scalar \(p_j\) is the jth element of the vector p). We interpret the \(\max \) of a vector to be the largest element of that vector. This step can be interpreted as an implicit update of the reservation vector. The first two terms are the difference between the two sides of (7). The problem can therefore be interpreted as finding flows for each edge that minimize the looseness of this bound while not deviating much from a combination of the previous flow values \(f_j\) and \({\tilde{f}}_j\) (Recall that at a primal–dual solution to (2), this inequality becomes tight). There is a simple solution for each of these subproblems, given in “Appendix A”; the computational complexity for each subproblem scales like \(K\log K\).
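Appendix A gives the authors' derivation of the exact \(O(K\log K)\) solution; as an independent sketch (ours), the same subproblem can be solved via the Moreau decomposition. Completing the square turns (12) into evaluating the proximal operator of \((p_j/\rho )\max (\cdot )\) at \(v = w + \pi _j/\rho \), where \(w = \alpha f_j + (1-\alpha ){\tilde{f}}_j\); this prox equals v minus the Euclidean projection of v onto the scaled simplex \(\{u\ge 0,\ {\mathbf {1}}^Tu = p_j/\rho \}\), computable by sorting:

```python
import numpy as np

def project_simplex(v, z):
    """Euclidean projection of v onto {u : u >= 0, sum(u) = z}, z > 0,
    by the standard sort-and-threshold method, O(K log K)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - z
    ks = np.arange(1, v.size + 1)
    ok = u - css / ks > 0          # find the largest k with a positive gap
    tau = css[ok][-1] / ks[ok][-1]
    return np.maximum(v - tau, 0.0)

def reservation_update(f_row, ft_row, pi_row, p_j, rho, alpha):
    """Row j of the reservation update: minimize over x
    p_j * max(x) - pi_row^T x + (rho/2) * ||x - w||^2,
    via the prox of (p_j/rho)*max evaluated at v = w + pi_row/rho."""
    w = alpha * f_row + (1.0 - alpha) * ft_row
    v = w + pi_row / rho
    return v - project_simplex(v, p_j / rho)

x = reservation_update(np.array([2.0, 0.0]), np.zeros(2),
                       np.array([0.5, 0.5]), p_j=1.0, rho=1.0, alpha=1.0)
print(x)  # (1.5, 0.5); the (sub)gradient p_j*e1 - pi + rho*(x - w) vanishes
```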

Price update In the third step, the dual variable \(\varPi (l+1)\) is updated.

Iterate properties Note that the iterates F(l), for \(l = 1,2\dots \), are feasible for (4), i.e., the columns of F(l) are always feasible flow vectors. This is a result of the simple fact that each column is the solution of the quadratic minimum-cost flow problem (11). This means that \(U(l) = p^T\max F(l)\) is an upper bound on the optimal problem value. Furthermore, because F(l) converges to a solution of (4), this upper bound U(l) converges to \(J^\star \) (though not necessarily monotonically).

It can be shown that \(\varPi (l)\ge 0\) and \(\varPi (l){\mathbf {1}}= p\) for \(l=1,2,\dots \), i.e., the columns of \(\varPi (l)\) are feasible scenario price vectors satisfying optimality condition 1 of Sect. 3. This is proven in “Appendix B.” We can use this fact to obtain a lower bound on the problem value, by computing the optimal value of (8), where the scenario pricing vectors \(\pi ^{(k)}\) are given by the columns of \(\varPi (l)\), which is a lower bound on the optimal value of (4). We call this bound L(l). Because \(\varPi (l)\) converges to an optimal dual variable of (10), L(l) converges to \(J^\star \) (though it need not converge monotonically).

Additionally, the optimality condition 2 is satisfied by iterates \({\tilde{F}}(l+1)\) and \(\varPi (l+1)\), for \(l=1,2,\dots \), if we take \(f^{(k)}\) and \(\pi ^{(k)}\) to be the columns \({\tilde{F}}(l+1)\) and \(\varPi (l+1)\), respectively. This is shown in “Appendix B.”

Stopping criterion Several reasonable stopping criteria are known for ADMM. One standard stopping criterion is that the norms of the primal residual \(\Vert F(l) - {\tilde{F}}(l)\Vert \) and the dual residual \(\rho \Vert {\tilde{F}}(l) - {\tilde{F}}(l-1)\Vert \) are small (For a discussion, see [13, §3.3.1]).

In our case, we can use the upper and lower bounds U(l) and L(l) to bound the suboptimality of the current feasible point F(l). More specifically, we stop when

$$\begin{aligned} U(l)-L(l) \le \epsilon ^\mathrm {rel} L(l) \end{aligned}$$
(13)

which guarantees a relative error not exceeding \(\epsilon ^\mathrm {rel}\). Note that computing L(l) requires solving K small LPs, which has roughly the same computational cost as one ADMM iteration; it may therefore be beneficial to check this stopping criterion only occasionally, instead of after every iteration.

Choice of parameter \(\rho \) Although our algorithm converges for any positive \(\rho \) and any \(\alpha \in (0,2)\), the values of these parameters strongly impact the number of iterations required. In many numerical experiments, we have found that choosing \(\alpha = 1.8\) works well. The choice of the parameter \(\rho \) is more critical. Our numerical experiments suggest choosing \(\rho \) as

$$\begin{aligned} \rho = \mu \frac{ {\mathbf {1}}^T p}{\max ({\mathbf {1}}^T F^\mathrm {heur})}, \end{aligned}$$
(14)

where \(F^\mathrm {heur}\) is a solution to the heuristic problem described in Sect. 2 and \(\mu \) is a positive constant. With this choice of \(\rho \), the primal iterates F(l) and \({\tilde{F}}(l)\) are invariant under positive scaling of p and scale with S and c (Similarly, the dual iterates \(\varPi (l)\) scale with p, but are invariant to scaling of S and c). Thus, the choice (14) renders the algorithm invariant to scaling of p, and also to scaling of S and c. Numerical experiments suggest that values of \(\mu \) in the range between 0.01 and 0.1 work well in practice. For details, see “Appendix C.”

Initialization Convergence of F(l), \({\tilde{F}}(l)\), and \(\varPi (l)\) to optimal values is guaranteed for any initial choice of \({\tilde{F}}(0)\) and \(\varPi (0)\). However, we propose initializing \({\tilde{F}}(0)\) as a solution to the heuristic method described in Sect. 2, and \(\varPi (0)\) as \((1/K)p{\mathbf {1}}^T\) (In this case, we have \(F(1) = {\tilde{F}}(0)\), which means the initial feasible point achieves an objective within a factor of K of optimal).

6 Numerical Examples

We consider some numerical examples with varying problem sizes. For each problem size, i.e., for each specific choice of n, m, and K, the graph is randomly chosen from a uniform distribution on all graphs with n nodes and m edges. The elements of the price vector p are uniformly randomly distributed over the interval [0, 1] and \(c = {\mathbf {1}}\). The elements of the source vectors for scenario k are generated according to \(s^{(k)} = -A z^{(k)}\), where the elements of \(z^{(k)}\) are uniformly randomly distributed over the interval [0, 1].
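A random instance of this form can be generated as in the following simplified sketch. Two caveats: the paper samples uniformly over all graphs with n nodes and m edges, whereas below the edges are drawn independently, and the incidence convention (\(-1\) at the tail, \(+1\) at the head) is our assumption:

```python
import numpy as np

def random_instance(n, m, K, seed=0):
    """Sample a random capacity reservation instance in the spirit of Sect. 6.

    Returns the node-edge incidence matrix A (n x m), prices p, capacities c,
    and the n x K source matrix S with columns s^(k) = -A z^(k), z^(k) uniform
    on [0, 1]^m, so every scenario is routable by construction.
    """
    rng = np.random.default_rng(seed)
    tails = rng.integers(0, n, size=m)
    heads = (tails + 1 + rng.integers(0, n - 1, size=m)) % n  # no self-loops
    A = np.zeros((n, m))
    A[tails, np.arange(m)] -= 1.0  # flow leaves the tail of each edge
    A[heads, np.arange(m)] += 1.0  # and enters the head
    p = rng.uniform(0.0, 1.0, size=m)  # prices uniform on [0, 1]
    c = np.ones(m)                     # c = 1
    Z = rng.uniform(0.0, 1.0, size=(m, K))
    S = -A @ Z                         # one source vector per scenario
    return A, p, c, S
```

Since each column of A sums to zero, each source vector automatically satisfies total flow conservation (\({\mathbf {1}}^T s^{(k)} = 0\)).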

6.1 Detailed Results for \(n=2000\), \(m=5000\), \(K=1000\)

Problem instance We first generate an example with \(n = 2000\) nodes, \(m = 5000\) edges, and \(K = 1000\) scenarios, for which the total number of (flow) variables is 5 million. The optimal value is 18053, and the objective obtained by the heuristic flow policy is 24042. The initial lower bound is 5635, so the initial relative gap is \((U(0)-L(0))/L(0)=3.26\).

Convergence We use \(\alpha = 1.8\), with \(\rho \) chosen according to (14) with \(\mu = 0.05\). We terminate when (13) is satisfied with \(\epsilon ^\mathrm {rel} = 0.01\), i.e., when we can guarantee that the flow policy is no more than \(1\%\) suboptimal (in the sense defined below). We check the stopping criterion at every iteration; however, we only update the lower bound (i.e., evaluate L(l)) when l is a multiple of 10.

The algorithm terminates in 95 iterations. The convergence is shown in Fig. 4. The upper plot shows the convergence of the upper and lower bounds to the optimal problem value, and the lower plot shows the relative gap \({(U(l) - L(l))/L(l)}\) and the relative suboptimality \((U(l) - J^\star )/J^\star \) (computed after the algorithm has converged to very high accuracy). (Note that both the upper and the lower bounds were updated to the best values to date, i.e., they were non-increasing and non-decreasing, respectively; the lower bound was only updated once every 10 iterations.) Our algorithm guarantees that we terminate with a policy no more than \(1\%\) suboptimal; in fact, the final policy is around \(0.4\%\) suboptimal.

Fig. 4

Top: upper and lower bounds U(l) and L(l) on the optimal value. Bottom: estimated relative gap \((U(l) - L(l))/L(l)\) and relative suboptimality \((U(l) - J^\star )/J^\star \)

Timing results We implemented our method in Python, using the multiprocessing package. The flow update subproblems were solved using the commercial interior-point solver Mosek [14], interfaced using CVXPY, a Python-based modeling language for convex optimization [15]. The reservation update subproblems were solved using the simple method given in “Appendix A.” (We also ran the simulation using the open-source solver ECOS [16]; however, ECOS is single-threaded and unable to solve the full problem directly. Mosek is able to solve the full problem directly, which allows us to compare its solve time with that of our distributed method.)

We used a Linux computer with 4 Intel Xeon E5-4620 CPUs, with 8 cores per CPU and 2 virtual cores per physical core. The flow update and reservation update subproblems were solved in parallel with 64 hyperthreads. Mosek was also configured to use 64 hyperthreads.

The compute times are listed in Table 2. The first line gives the average time to solve a single-flow update subproblem (11). The second line gives the total time taken to solve the problem using our distributed method (which took 95 iterations). The third line gives the time taken to solve the full CR problem directly. For this simulation, we terminated Mosek when the relative surrogate duality gap in the interior-point method reached 0.01 (Although this is a similar stopping criterion to the one used for our method, it is not equivalent to, nor a sufficient condition for, having a relative gap less than 0.02).

Table 2 Timing results for the numerical example

For this example, our distributed method is about 7 times faster than Mosek. This timing advantage will of course scale with the number of cores or threads, since each iteration of our method is trivially parallelizable, whereas the sparse matrix factorization that dominates each iteration of an interior-point method is not. We therefore expect this gap in solve times to widen considerably as the problem is scaled up.

6.2 Timing Experiments

Here, we describe several experiments done to show the scalability of our algorithm and compare it with that of Mosek. We note that the problem size here was limited by the memory of a single machine. However, by distributing the computation over several machines, or by using disk storage, our solution method could be scaled up further.

Varying problem size To capture how our method scales with problem size, we generated several problem instances, with the number of scenarios K varying between 10 and 1000. The number of nodes was 2K, and the number of edges was 5K. The graph and the scenario flows were generated as described above, and the parameters for both our method and Mosek were identical to those in the previous numerical example.

The solution time of our method, compared with that of Mosek, is shown in Fig. 5. We see that for small problem sizes, Mosek outperforms our method, but as the problem dimensions increase, our method begins to outperform Mosek. Overall, however, the methods perform similarly.

Fig. 5

Solve time as a function of graph size

Varying number of scenarios In order to show how our method scales with the number of scenarios, we generated several problem instances in which the graph size was fixed at \(n=2000\) and \(m=5000\), and the number of scenarios varied from \(K=10\) to \(K=600\). The graph and the scenario flows were generated as described above, and the parameters for both our method and Mosek were identical to those in the first numerical example.

The solution time of our method, compared with that of Mosek, is shown in Fig. 6. Mirroring our earlier results, Mosek outperforms our method for a small number of scenarios, but as the number of scenarios increases, our method begins to outperform Mosek.

Fig. 6

Solve time as a function of number of scenarios

7 Extensions

Capacitated nodes In our formulation, only the links have capacity limits. However, in many practical cases, nodes may also have capacity limits, with a corresponding reservation cost.

Given a problem with a capacitated node, we show how to formulate an equivalent problem with only capacitated edges. Consider a capacitated node i with capacity a. We define a new graph, with node i replaced by two nodes, \(i_1\) and \(i_2\); all edges previously entering i now enter \(i_1\), and all edges previously leaving i now leave \(i_2\). We then connect \(i_1\) to \(i_2\) with a single new edge, with capacity a and edge reservation price 0.
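This node-splitting transformation can be sketched as follows. The data layout (edges as (tail, head, capacity) triples, node \(i_1\) keeping index i, and the fresh node \(i_2\) taking index n) is our convention, not the paper's:

```python
def split_capacitated_node(edges, prices, i, a, n):
    """Replace capacitated node i (capacity a) by an equivalent edge.

    edges: list of (u, v, capacity) triples; prices: per-edge reservation
    prices; n: current number of nodes. Node i keeps its index for incoming
    edges (i1 = i); a fresh node i2 = n receives the outgoing edges; and a
    new edge (i1, i2) with capacity a and reservation price 0 joins them.
    """
    i2 = n
    # edges previously leaving i now leave i2; edges entering i still enter i1 = i
    new_edges = [(u if u != i else i2, v, cap) for (u, v, cap) in edges]
    new_edges.append((i, i2, a))       # the splitting edge carries the node flow
    new_prices = list(prices) + [0.0]  # reservation price 0 for the new edge
    return new_edges, new_prices, n + 1
```

Any flow through node i in the original graph must now traverse the new edge \((i_1, i_2)\), so the edge capacity a enforces the node capacity exactly, at zero reservation cost.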

Multicommodity network We can extend (2) to handle T different types of commodities. The multicommodity capacity reservation problem is

$$\begin{aligned} \begin{array}{lll} \text{ minimize } &{} p^T r \\ \text{ subject } \text{ to } &{} Af^{(k,t)} + s^{(k,t)} = 0, \quad 0 \le f^{(k,t)} &{} \quad k = 1,\dots ,K, \quad t=1,\dots ,T \\ &{} \sum _{t=1}^T f^{(k,t)} \le r, &{} \quad k = 1,\dots ,K, \\ &{} r \le c. \end{array} \end{aligned}$$
(15)

The variables are the reservations \(r\in {\mathbb {R}}^m\) and \(f^{(k,t)}\in {\mathbb {R}}^m\) for \(k=1,\dots ,K\) and \(t=1,\dots ,T\). The vector \(s^{(k,t)}\) is the source vector associated with type t under scenario k. We note that the scenario prices of Sect. 3 extend immediately to the multicommodity case.
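For small instances, (15) can be assembled and solved directly as one LP. The sketch below uses SciPy's HiGHS solver rather than the paper's distributed method, and the variable ordering \([r, f^{(1,1)}, f^{(1,2)}, \dots ]\) and function name are our conventions:

```python
import numpy as np
from scipy.optimize import linprog

def solve_multicommodity_cr(A, S, p, c):
    """Solve the multicommodity capacity reservation problem (15) as one LP.

    A: n x m incidence matrix; S[k][t]: source vector for scenario k, type t;
    p: edge reservation prices; c: edge capacities.
    """
    n, m = A.shape
    K, T = len(S), len(S[0])
    nvar = m + K * T * m  # r, then one flow vector per (k, t)

    def f_slice(k, t):
        return slice(m + (k * T + t) * m, m + (k * T + t + 1) * m)

    # equality constraints: A f^(k,t) = -s^(k,t)  (i.e., A f^(k,t) + s^(k,t) = 0)
    A_eq = np.zeros((K * T * n, nvar))
    b_eq = np.zeros(K * T * n)
    for k in range(K):
        for t in range(T):
            rows = slice((k * T + t) * n, (k * T + t + 1) * n)
            A_eq[rows, f_slice(k, t)] = A
            b_eq[rows] = -np.asarray(S[k][t])

    # inequality constraints: sum_t f^(k,t) - r <= 0, for each scenario k
    A_ub = np.zeros((K * m, nvar))
    for k in range(K):
        rows = slice(k * m, (k + 1) * m)
        A_ub[rows, :m] = -np.eye(m)
        for t in range(T):
            A_ub[rows, f_slice(k, t)] = np.eye(m)

    cost = np.concatenate([np.asarray(p), np.zeros(K * T * m)])
    bounds = [(0, ci) for ci in c] + [(0, None)] * (K * T * m)  # 0<=r<=c, f>=0
    return linprog(cost, A_ub=A_ub, b_ub=np.zeros(K * m),
                   A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
```

For instance, on a two-node graph with one edge and T = 2 commodities routing 0.3 and 0.4 units across it, the optimal reservation is r = 0.7. This dense formulation is only for illustration; its size grows as \(KTm\), which is exactly what motivates the distributed method.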

8 Conclusion

In this paper, we addressed the capacity reservation problem by expressing it as a structured linear program (LP) and developing a scalable ADMM-based iterative method for solving it. Possible future work includes extensions that handle nonlinear reservation costs, or that allow flow costs in addition to reservation costs. Another extension is to replace our set-based uncertainty description with a probabilistic one, and to replace the flow constraint \({\mathcal {F}}(s) \le r\) with a corresponding probabilistic constraint.