1 Introduction

We study the Flow Edge-Monitor Problem (FlowMntrs, for short), where the objective is to find k edges in an undirected graph G=(V,E) with an unknown circulation ψ, so that if we place k flow monitors on these edges to measure the flow along them, we will maximize the total number of edges for which the value and direction of ψ is uniquely determined by the flow conservation property. Intuitively, the objective is to maximize the number of bridge edges in the subgraph induced by edges not covered by monitors. (For a more rigorous definition of the problem, see Sect. 2.)

Consider, for example, the graph and the monitors shown in Fig. 1. In this example we have k=4 monitors represented by rectangles attached to edges, with measured flow values and directions shown inside. Thus we have ψ(2,3)=4, ψ(3,8)=2, ψ(6,4)=7 and ψ(1,2)=1. From the flow conservation property, we can then determine that ψ(3,5)=2, ψ(8,6)=2, ψ(7,5)=3 and ψ(5,6)=5. Thus with 4 monitors we can determine flow values on 8 edges.

Fig. 1
figure 1

A graph with 4 monitors

Our Results

We first show that the FlowMntrs problem is NP-hard. This proof appears in Sect. 3. Next, in Sect. 4, we study polynomial-time approximation algorithms. We introduce an algorithm called σ-Greedy that, in each step, places up to σ monitors in such a way that the number of edges with known flow is maximized. We then prove that 1-Greedy is a 3-approximation algorithm and that 2-Greedy is a 2-approximation algorithm. In both cases, our analysis is tight. In fact, our approximation results are stronger, as they apply to the weighted case, where the input graph has weights on edges, and the objective is to maximize the total weight of the edges with known flow.

The running times of these two algorithms are O(k(m+n)) and O(km 2(m+n)), respectively, where n=|V| and m=|E|. In fact, as we explain in Sect. 4.1, it is possible to modify 1-Greedy to improve its running time to O(m+n+klogn) without increasing the approximation ratio, although this modification does not generalize easily to larger values of σ.

Overview of Main Ideas

The analysis of the greedy algorithm is based on several ideas. The first idea is to show that the problem can be reduced in linear time to the case of weighted 3-edge-connected graphs. This process involves “gluing” together some connected components, removing bridges, and replacing some sets of edges by single weighted edges. For example, if a graph has an induced path of length 5, this path can be replaced by one edge of weight 5. The complete reduction is described in Sect. 2.

The assumption that the input graph is 3-edge-connected implies that the optimum solution contains at most 3k edges. This leads to a simple analysis of 1-Greedy, given in Sect. 4.1, since we then only need to show that 1-Greedy’s gain is not less than the total weight of the k heaviest edges.

The analysis of 2-Greedy, however, is much more subtle, since it is not sufficient to focus only on the number of edges (in the weighted 3-edge-connected graph obtained from the reduction). Very roughly, depending on the structure of the graph, 2-Greedy may collect either few heavy edges or many light ones, yet still we can show that in all cases its overall gain amortizes to at least half of the optimum. The idea is to divide the edges of the graph into small sets called bundles and charge the algorithm’s gain to the weights of these bundles. This analysis is the most technical part of the paper and is given in Sect. 4.2.

Related Work

A closely related problem was studied by Gu and Jia [5] who considered a traffic flow network with directed edges. They observed that mn+1 monitors are necessary to determine the flow on all edges of a strongly connected graph, and that this bound can be achieved by placing flow monitors on edges in the complement of a spanning tree. (The same bound applies to connected undirected graphs.) Khuller et al. [7] studied an optimization problem where pressure meters may be placed on nodes of a flow network. An edge whose both endpoints have a pressure meter will have the flow determined using the pressure difference, and other edges may have the flow determined via flow conservation property. The goal is to compute the minimum number of meters needed to determine the flow on every edge in the network. They showed that this problem is NP-hard and MAX-SNP-hard, and that a local-search based algorithm achieves 2-approximation. For planar graphs, they have a polynomial-time approximation scheme. The model in [7] differs from ours in that it assumes that the flow satisfies Kirchhoff’s current and voltage laws, while we only assume the current law (that is, the flow preservation property). This distinction is reflected in different choices of “meters”: vertex meters in [7] and edge monitors in our paper. Recall that, as explained above, minimizing the number of edge monitors needed to determine the flow on all edges is trivial, providing a further justification for our choice of the objective function.

The FlowMntrs problem is also related to the classical k-cut and multi-way cut problems [1, 4, 8, 10], where the goal is to find a minimum-weight set of edges that partitions the graph into k connected components. One way to view our monitor problem is that we want to maximize the number of connected components obtained from removing the monitor edges and the resulting bridge edges.

2 Preliminaries

We now give formal definitions. Let G=(V,E) be an undirected graph. Throughout the paper, we use n=|V| to denote the number of vertices in G and m=|E| to be the number of edges. We will typically use letters u,v,x,y,…, possibly with indices, to denote vertices, and a,b,e,f,… to denote edges. If an edge e has endpoints x,y, we write e={x,y}. We allow multiple edges and loops in G, so the endpoints do not uniquely define an edge: if e={x,y} and f={x,y}, it is not necessarily true that e=f.

A circulation on G is a function ψ that assigns a flow value and a direction to any edge in E. (We use the terms “circulation” and “flow” interchangeably, slightly abusing the terminology.) Denoting by ψ(u,v) the flow on e={u,v} from u to v, we require that ψ satisfies the following two conditions (i) ψ is anti-symmetric, that is ψ(u,v)=−ψ(v,u) for each edge {u,v}, and (ii) ψ satisfies the flow conservation property, that is ∑{u,v}∈E ψ(u,v)=0 for each vertex v.

A bridge in G is an edge whose removal increases the number of connected components of G. Let Br(G) be the set of bridges in G.

Suppose that some circulation ψ is given for all edges in some set ME, and not for other edges. We have the following observation:

Observation 1

For {u,v}∈EM, ψ(u,v) is uniquely determined from the flow preservation property if and only if {u,v}∈Br(GM).

We can now define the gain of M to be gain(G,M)=|MBr(GM)|, that is, the total number of edges for which the flow can be determined if we place monitors on the edges in M. We will refer to the edges in M as monitor edges, while the bridge edges in Br(GM) will be called extra edges. If G is understood from context, we will write simply gain(M) instead of gain(G,M).

The Flow Edge-Monitor Problem (FlowMntrs) can now be defined formally as follows: given a graph G=(V,E) and an integer k>0, find a set ME with |M|≤k that maximizes gain(G,M).

The Weighted Case

We consider the extension of FlowMntrs to weighted graphs, where each edge e has a non-negative weight w(e) assigned to it, and the task is to maximize the weighted gain. More precisely, if M are the monitor edges, then the formula for the (weighted) gain is gain(M)=∑ eMB w(e), for B=Br(GM). We will denote this problem by WFlowMntrs.

Throughout the paper, we denote by M some arbitrary, but fixed, optimal monitor edge set. Let B =Br(GM ) be the set of extra edges corresponding to M . Then the optimal gain is gain (G,k)=w(M B ).

Simplifying Assumptions

We make some assumptions about the input graph G that will simplify the proofs. First, if km, then we can simply take M=E and this will be an optimal solution to WFlowMntrs. Therefore, without loss of generality, throughout the paper we will assume that m>k.

The total flow across any cut of G must be 0. In particular, the flow value on any bridge is 0, so we can assume that G does not have any bridges. Further, if G is not connected, we can do this: take any two vertices u, v from different connected components and contract them into one vertex. This operation does not affect the solution to WFlowMntrs. This follows from Observation 1, because, denoting by G′ the graph after the contraction, for any set M of monitor edges, GM and G′−M have exactly the same set of bridges. By repeating these contractions enough many times, we can transform G into a connected graph.

Summarizing, we conclude that, without loss of generality, we can assume that G is connected and does not have any bridges. In other words, G is 2-edge-connected. (Recall that, for an integer c≥1, a graph H is called c-edge-connected, if H is connected and it remains connected after removing any set of at most c−1 edges from H.)

Next, we claim that we can in fact restrict our attention to 3-edge-connected graphs. To justify it, we show that any weighted 2-edge-connected graph G=(V,E) can be converted in linear time into a 3-edge-connected weighted graph G′=(V′,E′) such that:

  1. (i)

    gain (G,k)=gain (G′,k), and

  2. (ii)

    If M′⊆E′ is a set of k monitor edges in G′, then in linear time one can find a set ME of k monitor edges in G with gain(G,M)=gain(G′,M′).

We now show the construction of G′. A 2-cut is a pair of edges {e,e′} whose removal disconnects G. Write ee′ if {e,e′} is a 2-cut. It is known, and quite easy to show, that relation “≃” is an equivalence relation on E. The equivalence classes of ≃ are called edge groups.

Suppose that G has an edge group F with |F|=q, for q≥2, and let H 1,…,H q be the connected components of GF. Then F={e 1,…,e q }, where, for each i, e i ={u i ,v i }, u i H i and v i H i+1 (for i=q we assume q+1≡1). For i=1,…,q−1, contract edge e i so that vertices u i and v i become one vertex, and then assign to edge e q ={u q ,v q } weight \(\sum_{i=1}^{q} w(e_{i})\). We will refer to e q as the deputy edge for F. Figure 2 illustrates the construction.

Fig. 2
figure 2

Contracting edge groups

Let G′=(V′,E′) be the resulting weighted graph. By the construction, G′ is 3-edge-connected. All edge groups can be computed in linear time (see, [9], for example), so the whole transformation can be done in linear time as well.

It remains to show that G′ satisfies conditions (i) and (ii). If M is any monitor set, and if M has two or more monitors in the same edge group, we can remove one of these monitors without decreasing the gain of M. Further, for any monitor edge e of M, we can replace e by the deputy edge of the edge group containing e, without changing the gain. This implies that, without loss of generality, we can assume that the optimal monitor set M in G contains only deputy edges. These edges remain in G′ and the gain of M in G′ will be exactly the same as its gain in G. This shows the “≤” inequality in (i). The “≥” inequality follows from the fact that any monitor set in G′ consists only of deputy edges from G, so all of them are in different edge groups. The same argument implies (ii) as well.

Summarizing, we have shown in this section that, without loss of generality, we can assume that the input graph G has m>k edges and is 3-edge-connected.

The Kernel Graph

Consider an input graph G=(V,E) and a monitor edge set M, and let B=Br(GM). The kernel graph associated with G and M is defined as the weighted graph G M =(V M ,E M ), where V M is the set of connected components of GMB, and E M is determined as follows: For any edge eMB, where e={u,v}, let u′ and v′ be the connected components of GMB that contain, respectively, u and v. Then we add edge {u′,v′} to E M . The weights are preserved, that is w({u′,v′})=w({u,v}). We will say that this edge {u′,v′} represents {u,v} or corresponds to {u,v}. In fact, we will often identify {u,v} with {u′,v′}, treating them as the same object. We point out that in general G M is a multigraph, as it may have multiple edges and loops (even when G does not). However, since G is 3-edge-connected, it is easy to see that so is G M .

Figure 3 shows the kernel graph corresponding to the graph and the monitor set in the example from Fig. 1 (all edge weights are 1).

Fig. 3
figure 3

The kernel graph for the example in Fig. 1. The loop in vertex {1,2,4,7} represents edge {2,1}

Note that we have |E M |≤k+|V M |−1. This can be derived directly from the definitions: The edges in G M that represent extra edges are the bridges in G M and therefore they form a forest in G M . This (and the fact that G M is connected) implies that the number of extra edges is at most |V M |−1, and the inequality follows.

In the paper, we will use the concept of kernel graphs only with respect to some optimal monitor set. Let M be some arbitrary, but fixed, optimal monitor edge set. To simplify notation, we will write G =(V ,E ) for the kernel graph associated with M , that is \(G^{\ast}= G_{M^{\ast}}\), \(V^{\ast}= V_{M^{\ast}}\) and \(E^{\ast}= E_{M^{\ast}}\). In this notation, we have gain (G,k)=w(E ). In the analysis of our algorithms, we will be comparing the weights of edges collected by the algorithm against the edges in the kernel graph G .

3 Proof of NP-Hardness of FlowMntrs

We show that the FlowMntrs is NP-hard (even in the unweighted case), via a reduction from the Clique problem. We start with a simple lemma.

Lemma 2

Let a 1,a 2,…,a s be s positive integers such that \(\sum_{i=1}^{s} a_{i} = n\), for a fixed integer n. Then \(\sum_{i=1}^{s} \binom{a_{i}}{2}\) is maximized if and only if a j =ns+1 for some j and a i =1 for all ij.

Above, we assume that \(\binom{1}{2}= 1(1-0)/2 = 0\).

Proof

By routine algebra, one can verify that for 2≤ab we have

$$ \binom{a}{2} + \binom{b}{2} < \binom{a-1}{2} + \binom{b+1}{2}. $$
(1)

Without loss of generality, assume a 1=max i a i . If a i >1, for any i>1, by the inequality above, we can change a 1a 1+1 and a i a i −1, increasing the value of \(\sum_{i=1}^{s} \binom{a_{i}}{2}\). By repeating this argument, we obtain that an optimum is achieved for a 1=ns+1 and a 2=a 3=⋯=a s =1. That all optima have this form (up to a permutation of the a i ’s) follows from the fact that inequality (1) is strict. □

Theorem 3

FlowMntrs is NP-hard.

Proof

In the Clique problem, given an undirected graph G=(V,E) and an integer q>0, we wish to determine if G has a clique of size at least q. Clique is well-known to be NP-complete (see [3]). We show how to reduce Clique, in polynomial-time, to DecFlowMntrs, the decision version of FlowMntrs, defined as follows: Given a graph G=(V,E) and two integers, k,l>0, is there a set M of k edges in G for which |Br(GM)|≥l?

The reduction is simple. Suppose we have an instance G=(V,E),q of Clique. Without loss of generality, we can assume that G is connected and q≥3. Let n=|V| and m=|E|. We map this instance into an instance G,k,l of DecFlowMntrs, where \(k = m-\binom{q}{2} - l\) and l=nq. This clearly takes polynomial time. Thus, to complete the proof, it is sufficient to prove the following claim:

(∗):

G has a clique of size q iff G has a set M of k edges for which |Br(GM)|≥l.

We now prove (∗). The main idea is that, by the choice of parameters k and l, the monitors and extra edges in the solution of the instance of DecFlowMntrs must be exactly the edges outside the size-q clique of G.

(⇒) Suppose that G has a clique C of size q. Let G′ be the graph obtained by contracting C into a single vertex and let T be a spanning tree of G′. We then take M to be the set of edges of G′ outside T. Thus the edges in T will be the bridges of GM. Since G′ has nq+1 vertices, T has l=nq edges, and M has \(m-\binom{q}{2} - l = k\) edges.

(⇐) Suppose there is a set M of k monitor edges that yields a set B of l′ extra edges, where ll′≤n−1. We show that G has a clique of size q.

Let s be the number of connected components of GMB, and denote by a 1,a 2,…,a s the cardinalities of these components (numbers of vertices). Since |B|=l′, we have sl′+1. Also, \(\sum_{i=1}^{s} a_{i} = n\) and \(\sum_{i=1}^{s} \binom{a_{i}}{2} + k + l' \ge m\). Therefore, using Lemma 2, and the choice of k and l, we have

(2)
(3)
(4)

By routine calculus, the function \(f(x) = \frac{1}{2}(n-x)(n-x-1) + x\) is decreasing in interval [0,n−1], and therefore the above derivation implies that l′≤l, so we can conclude that l′=l. This, in turn, implies that all inequalities in this derivation are in fact equalities. Since (2) is an equality, we have s−1=l′=l=nq. Then, since (3) is an equality, Lemma 2 implies that a j =q for some j and a i =1 for all ij. Finally, (4) can be an equality only if all the connected components are cliques. In particular, we obtain that the jth component is a clique of size q. □

4 Algorithm σ-Greedy

Fix some integer constant σ≥1. Let G=(V,E) be the input graph with n=|V|, m=|E|, and with weights on edges. As justified in Sect. 2, we will assume that m>k and that G is 3-edge-connected.

Algorithm σ-Greedy that we study in this section works in ⌈k/σ⌉ steps and returns a set of k monitor edges. In each step, it assigns σ monitors to a set P of σ edges that maximizes the gain in this step, that is, the total weight of the monitor edges in P and the bridges in GP. These edges are then removed from G, and the process is repeated. A more rigorous description is given in Fig. 4, which also deals with special cases when the number of monitors or edges left in the graph is less than σ.

Fig. 4
figure 4

Pseudo-code for Algorithm σ-Greedy. Y t represents the edges collected by the algorithm in step t, with PY t being the set of monitor edges and Y t P the set of extra edges. M t represents all monitor edges collected up to step t and X t represents all edges collected up to step t

Note that each step of the algorithm runs in time O(m σ(n+m)), by trying all possible combinations of σ edges in the remaining graph G t−1 to find P. Hence, Algorithm σ-Greedy runs in time O(km σ(n+m)/σ). For σ=1, it is possible to modify 1-Greedy to reduce the running time to O(m+n+klogn), as explained in the next section.

4.1 Analysis of 1-Greedy

In this section we consider the case σ=1. Algorithm 1-Greedy at each step chooses an edge whose removal maximizes the gain and places a monitor on this edge. This edge and its corresponding bridges are removed from the graph. This process is repeated k times. We show that this algorithm has approximation ratio 3.

Analysis

Fix the value of k, and some optimal solution M of k monitor edges, and let G =(V ,E ) be the corresponding kernel graph. To avoid clutter, we will identify each edge in E with its corresponding edge in E, thus thinking of E as a subset of E. For example, when we say that the algorithm collected some eE , we mean that it collected the edge in E represented by e.

Recall that gain (G,k)=w(E ), where w(E ) is the sum of weights of the edges in E . Thus we need to show that 1-Greedy’s gain is at least \(\frac{1}{3}w({E^{\ast}})\).

Let e i , i=1,2,…,k, be the k heaviest edges in E , ordered by weight, that is w(e 1)≥w(e 2)≥⋯≥w(e k ). First, we claim that for each t=0,1,…,k, we have

$$ w(X_t) \ge \sum_{i=1}^t w(e_i), $$
(5)

where X t denotes the set of edges collected by the algorithm in the first t steps.

The proof of (5) is by a straightforward induction. It is vacuously true for t=0. Suppose (5) holds for t′=t−1; we will then show that it also holds for t. If X t contains all edges e 1,…,e t , then (5) holds. Otherwise, choose any j, 1≤jt, for which e j X t . By induction, \(w(X_{t-1})\ge\sum_{i=1}^{t-1} w(e_{i})\), and e j is available to 1-Greedy in step t, so its gain in step t is at least w(e j ). Therefore \(w(X_{t})\ge w(X_{t-1}) + w(e_{j}) \ge w(X_{t-1}) + w(e_{t}) \ge\sum_{i=1}^{t} w(e_{i})\), completing the proof of (5).

Since G is 3-edge-connected, each vertex in G has degree at least 3, so \(|{E^{\ast}}| \ge\frac{3}{2} |{V^{\ast}}|\), which implies that \(k \ge|{E^{\ast}}| - |{V^{\ast}}| + 1 > \frac{1}{3} |{E^{\ast}}|\). Thus, from (5), we obtain that the gain of 1-Greedy is \(w(X_{k}) \ge\sum_{i=1}^{k} w(e_{i}) \ge\frac{1}{3}w({E^{\ast}})\). Summarizing, we obtain:

Theorem 4

Algorithm  1-Greedy is a polynomial-time 3-approximation algorithm for the Weighted Flow Edge-Monitor Problem, WFlowMntrs.

With a somewhat more careful analysis, one can show that the approximation ratio of 1-Greedy is actually 3(1−1/k), which matches our lower bound example below.

A Tight-Bound Example

We now present an example showing that our analysis of 1-Greedy is tight. Graph G consists of one connected component with 2k−2 vertices, in which each vertex has degree 3 and each edge has weight 1, and the other connected component that has only two vertices connected by k+2 edges each of weight 1+ϵ. Figure 5 shows the construction for k=5.

Fig. 5
figure 5

Lower bound example for 1-Greedy, with k=5

1-Greedy will be collecting edges from the 2-vertex component on the left, ending up with k edges and total gain (1+ϵ)k. The optimum solution is to put k monitors in the cubic component on the right, thus gaining all 3k−3 edges from this component. For ϵ→0, the approximation ratio tends to 3(1−1/k).

The Running Time

As explained in the previous section, the running time of 1-Greedy is at most O(km(m+n)): we have at most k steps, and in each step, for each edge, in O(m+n)-time compute the gain of removing this edge. We can reduce the running time of each step by computing, in linear time (see, for example, [9]), a partition of the edge set into edge groups. We choose an edge group with maximum total weight and remove any edge from this group. The time per step is O(m+n), giving us an O(k(m+n))-time implementation of 1-Greedy.

However, as the analysis above shows, instead of following 1-Greedy, we can instead choose the k heaviest edges and still accomplish approximation ratio 3. This modified algorithm will then run in time O(m+n+klogn) by, say, using a heap to extract the k heaviest edges. Unfortunately, this modification does not seem to extend to Algorithm σ-Greedy for σ≥2.

4.2 Analysis of 2-Greedy

We now consider σ=2. At each step, Algorithm 2-Greedy selects two edges that maximize the gain. Ties are broken arbitrarily. We place monitors on these two edges, and then remove them from G, as well as the resulting bridges. The process continues for \(\lfloor\frac{1}{2}k\rfloor\) steps. Exceptional situations, when k is odd, or we run out of edges, etc., are handled as in Fig. 4. In this section we show that 2-Greedy is a 2-approximation algorithm.

Idea of the Analysis

Let G=(V,E) be the input graph with weights on edges. As before, we will assume that m>k and that G is 3-edge-connected. We can further assume that 2-Greedy never runs out of edges, for otherwise it computes an optimal solution. We fix some optimal solution M , and let G =(V ,E ) be the corresponding kernel graph with ν=|V | vertices and μ=|E | edges. Recall that gain (G,k)=w(E ); thus we need to show that 2-Greedy’s gain is at least \(\frac {1}{2}w({E^{\ast}})\).

Our proof for 1-Greedy was based on the observation that \(k\ge\frac{1}{3}|{E^{\ast}}|\); thus we only needed to show that 1-Greedy’s gain is at least the total weight of the k heaviest edges in E . This is not sufficient for 2-Greedy. To see this, consider a simple situation where G is unweighted with all vertices of degree 3, the extra edges form a tree in G , and k is even. Then \(\mu= \frac{3}{2}\nu\) and μ=ν+k−1, so μ≈3k. Therefore we need to show that in this case 2-Greedy collects at least \(\frac{3}{2}k\) edges. For the unweighted graph, this is not hard to show: pick a set J of \(\frac{1}{2}k\) independent vertices in G . (Such J exists, by the assumption that all vertices in G have degree 3.) For each vertex v in J, at each step of 2-Greedy, either the three edges of v have already been collected, or they can be collected in this step by placing monitors on two of them. (It is also possible that one edge out of v has already been collected, but in this case the remaining two can be collected in this step as well.) Therefore the gain of 2-Greedy in \(\frac{1}{2}k\) steps will be at least the number of edges out of J, that is \(\frac{3}{2}k\).

If G is weighted, the argument above is not sufficient, since the edges incident to vertices in J may have small weights. Further, it may happen that 2-Greedy will collect only k edges in total, if they are heavy enough, and in this case we need to argue that their total weight is at least \(\frac{1}{2}w({E^{\ast}})\), even though k may be only about \(\frac{1}{3}|{E^{\ast}}|\).

The proof consists of two parts. First, we introduce the concept of a bundle. Intuitively, a bundle is a set of edges that can be collected with at most two monitors, although our definition is more restrictive and will be given shortly. We show that G contains a set T of at most \(\lfloor\frac{1}{2}k\rfloor\) disjoint bundles with total weight at least \(\frac{1}{2}w({E^{\ast}})\). In the second part of the proof we show that the gain of 2-Greedy is at least the total weight of T.

Analysis

For any vertex in G of degree 3, the set of three edges incident on this vertex is called a tripod. A set β of edges is called a bundle if β is either a tripod or it consists of at most two edges. Clearly, all edges of a bundle can be collected with at most 2 monitors. If T is a set of bundles, by E T we denote the set of edges in T, that is E T =⋃ βT β. (We will extend the definition of bundles later in the proof of Lemma 6.)

Lemma 5

For k≥2, there exists a set T of at mostk/2⌋ disjoint bundles such that \(w(E_{T})\ge \frac{1}{2}w({E^{\ast}})\).

Proof

First, we construct a collection Z of bundles that contains all tripods in G and such that each edge appears in exactly two bundles in Z. To this end, we create two copies of each edge. If e={x,y} is an edge, then by e x and e y we will denote these two copies of e. Let F be the set of all these edge copies.

For each vertex v of G of degree 3, if e, f, g are the edges incident to v, add the tripod β={e v,f v,g v} to Z and remove e v, f v and g v from F. Let F′ be the set of remaining edges in F.

We now want to partition F′ into pairs, with each pair becoming a bundle. Assume first that |F′| is even; the argument for the general case is essentially the same but slightly more cumbersome because of the parity issue, and it will be explained later.

Group the edges in F′ arbitrarily into pairs. As long as any of these pairs has two copies of the same edge, say {e x,e y}, do this: take any other pair {f u,g v} (where possibly f=g) and replace these two pairs by pairs {e x,f u} and {e y,g v}. Eventually each pair will contain different edges. All these pairs become bundles that are now added to Z. This completes the construction of Z.

To construct T, we now greedily extract from Z non-overlapping bundles with maximum weight. More specifically, we start with T=∅, and repeat the following step as long as |Z|≥4: choose a bundle β in Z with maximum w(β), add it to T, and then remove from Z exactly four bundles: β, all other bundles in Z that intersect β, and possible a few more additional bundles so that the total is four. This is possible, because, by the definition of Z, each bundle in Z intersects at most three other bundles in Z. If, at the end, |Z|≠∅, then we add to T the bundle β in Z with maximum w(β) and remove all remaining bundles (at most three) from Z.

We now have our set T of bundles, and it remains to show that it has the desired properties. Obviously, all bundles in T are disjoint. In each step of the construction of T, when we add a bundle β to T, we reduce w(E Z ) by at most 4w(β), so \(w(E_{T}) \ge\frac{1}{4} w(E_{Z}) = \frac{1}{2}w({E^{\ast}})\), as needed.

It remains to show that \(|T|\le\lfloor\frac{1}{2}k\rfloor\). Let ν d be the number of vertices of degree d in G . All vertices in G have degree at least 3, thus

$$2\mu=\sum_{d\geq3} d\cdot\nu_d \geq 3\nu_3 + 4(\nu-\nu_3) = 4\nu-\nu_3. $$

We also have μν+k−1, which, together with the inequality above yields 2μν 3≤4k−4. In Z, we have exactly ν 3 tripods and \(\frac{1}{2}(2\mu- 3\nu_{3}) = \mu- \frac{3}{2}\nu_{3}\) pairs, so we obtain \(|Z| = \mu- \frac{1}{2}\nu_{3} \le 2k-2\). Hence \(|T| \leq\lceil(2k-2)/ 4\rceil = \lceil(k-1)/2\rceil \le\lfloor\frac{1}{2}k\rfloor\), as needed.

Now we deal with the case when |F′| is odd. We execute the same process for pairing edges, but in this case we will end up with one unpaired edge that will become a bundle by itself. The proof that \(w(E_{T}) \ge\frac{1}{2}w({E^{\ast}})\) remains valid. We still need to show that \(|T| \leq\lfloor\frac{1}{2}k\rfloor\). In Z, we have ν 3 tripods, \(\frac{1}{2}(2\mu-3\nu_{3}-1)\) pairs, and one singleton; hence \(|Z| = \nu_{3} + \frac{1}{2}(2\mu-3\nu_{3}-1) + 1 = \frac {1}{2}(2\mu-\nu_{3}+1)\). As before, we have 2μν 3≤4k−4, but now 2μν 3=(2μ−3ν 3)+2ν 3=|F′|+2ν 3 is odd, which implies that we in fact have 2μν 3+1≤4k−4. This implies |Z|≤2k−2, and the bound |T|≤⌊k/2⌋ follows, as before. □

Lemma 6

Let k≥2, and let T be the set of bundles constructed in Lemma 5. Then w(X k/2⌋)≥w(E T ); thus 2-Greedy ’s gain is at least w(E T ).

Proof

Let =⌊k/2⌋. To prove the lemma, we show that there is a partition of E T into disjoint sets B 1,B 2,…,B (some possibly empty) such that w(B t )≤w(Y t ), for t=1,2,…,, where Y t is the set of edges collected by 2-Greedy in step t. This is clearly sufficient to establish the theorem.

The idea of the proof is to proceed one step at a time, for t=1,2,…,, starting with T′=T and at each step reducing the number of bundles in T′. The invariant will be that at each step t, all bundles in T′ will be available to 2-Greedy for collection. At step t, we eliminate from these bundles all edges collected by the algorithm, and rearrange some bundles, if necessary, in such a way that the number of bundles in T′ strictly decreases.

To implement the above idea we need to extend the definition of bundles. If β={e,f,g} is a tripod in G and 2-Greedy collects e at some step t′<t while f,g remain uncollected until step t, then the pair β′={f,g} is called a bipod at step t. If βE is a set of edges not collected in the first t−1 steps, then β is called a bundle at step t if either (i) β is a tripod in G , or (ii) β=β 1β 2, where each β i is either an edge or a bipod or ∅. (We will omit phrase “at time t” whenever t is understood from context.) Clearly, this definition extends the one given earlier in this section. However, it is still true that 2-Greedy can collect all edges from a bundle in a single step (that is, with two monitors).

Now we describe the construction of the sets B t . Initially, T′=T. Suppose that, for some t≥1, we have already constructed B 1,…,B t−1 and modified T′ accordingly. Consider now step t of 2-Greedy. We distinguish three cases.

Case 1: Y t intersects at most one bundle in T′. If Y t intersects one bundle, let β be this bundle; otherwise, let β be any bundle. Remove β from T and set B t =β.

Case 2: Y t contains a bundle in T′. In this case we simply set B t =Y t E T and we update T′ as follows: remove from T′ all bundles that are contained in Y t , and for each other bundle β in T′ that intersects Y t , remove from β the edges in Y t (that is, ββY t ).

Case 3: Y t intersects at least two bundles in T′ but it does not contain any. We let B t =Y t E T. To update T′, we proceed in two steps. First, choose any two bundles β 1,β 2 in T′ intersected by Y t , and replace them by β 1β 2Y t . Then, for each other bundle β in T′ intersected by Y t , remove from β all edges in Y t .

Note that Cases 2 and 3 are not mutually exclusive. If, at some steps, both of these cases apply, any of them can be chosen arbitrarily.

We claim that this procedure is correct, in the sense that at each step t all elements of T′ are bundles. To this end, we make two observations. If β={e,f,g} is a tripod at time t, then 2-Greedy can either collect one edge from β or all three, but it cannot collect just two. If one edge is collected, the remaining edges form a bipod. Also, if {e,f} is a bipod at time t, then 2-Greedy either collects both edges e,f or none. If we remove any edges from a bundle, it obviously remains a bundle. Another type of update occurs in Case 3 where we replace β 1 and β 2 by β 1β 2Y t . Here, by the case condition, each β i Y t is either an edge or a bipod, and thus β 1β 2Y t is a correct bundle.

Next, we estimate the weight of the sets B t . In Cases 2 and 3 we have B t Y t , so clearly w(Y t )≥w(B t ). Due to the fact that it takes no more than two monitors to collect all edges in a bundle and 2-Greedy chooses Y t , w(Y t ) is at least as large as the weight of any bundle in T′ at time t; hence w(Y t )≥w(B t ) holds for Case 1 as well.

Finally, note that we have no more than bundles in T′ to start with and the total number of bundles in T′ strictly decreases in each step. Therefore after steps we will have T′=∅, and thus B 1,B 2,…,B is a partition of E T , completing the proof. □

In the case for k=1, 2-Greedy actually gets an optimal solution, and Lemma 5 and 6 combined imply that for k≥2 the gain of 2-Greedy is at least half of the optimum. Thus, summarizing, we obtain our main result.

Theorem 7

Algorithm  2-Greedy is a polynomial-time 2-approximation algorithm for the Weighted Flow Edge-Monitor Problem, WFlowMntrs.

A Tight-Bound Example

Our analysis of 2-Greedy is tight. The example is essentially the same as the one for 1-Greedy, illustrated in Fig. 5, except that the edges in the 2-vertex component on the left side have now weights 1.5+ϵ. 2-Greedy will be collecting edges from this 2-vertex component, so its total gain will be (1.5+ϵ)k, while the optimum gain is 3k−3. For ϵ→0 and k→∞, the ratio tends to 2.

5 Final Comments

The most intriguing open question is what is the approximation ratio of σ-Greedy in the limit for σ→∞. We can show that this limit is not lower than 1.5, and we conjecture that 1.5 is indeed the correct answer.

A natural question to ask is whether our results can be extended to directed graphs. It is not difficult to show that this is indeed true; both the NP-hardness proof and 2-approximation can be adapted to that case.

Another question is whether Algorithm σ-Greedy can be implemented faster—in time that is a polynomial function with exponent independent of σ. Even considerably speeding-up 2-Greedy would be of some interest, although this would probably require efficiently maintaining information about 3-edge-connected components in the presence of edge deletions and is a research topic of its own. To the best of our knowledge, the fastest fully dynamic algorithm for 3-edge-connectivity takes time O(n 2/3) per update and query [2] (see also [6]) and it is unclear whether this algorithm is sufficient for our purpose, since in addition to simply removing edges, we also need to be able to find the edges that optimize the gain at each step.

Another intriguing direction to pursue would be to study the extension of our problem to arbitrary linear systems of equations. In that context, we can put k “monitors” on k variables of the system to measure their values. The objective is to maximize the number of variables whose values can be uniquely deduced from the monitored variables.