1 Introduction

Assume that we are given a finite set U. Then a function \(f :2^U \rightarrow {\mathbb {R}}\) is said to be submodular if

$$\begin{aligned} f(X) + f(Y) \ge f(X \cup Y) + f(X \cap Y) \end{aligned}$$

for every pair of subsets X, Y of U, where \({\mathbb {R}}\) is the set of real numbers. Submodular functions play an important role in many fields, e.g., combinatorial optimization, machine learning, and game theory. One of the most fundamental problems related to submodular functions is the submodular function minimization problem. In this problem, we are given a submodular function \(f :2^U \rightarrow {\mathbb {R}}\), and the goal is to find a subset X of U minimizing f(X) among all subsets of U, i.e., to find a minimizer of f. It is known [5, 6, 8, 20] that this problem can be solved in polynomial time (we assume the oracle model).
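As a concrete illustration (a minimal sketch with toy data, not an instance from the paper), the cut function of a small graph is a classical submodular function, and both the inequality above and the minimization problem can be checked by brute force when U is small:

```python
from itertools import combinations

# Toy ground set and a classical submodular function: the cut function of a
# small path graph (all names here are illustrative, not from the paper).
U = [0, 1, 2]
edges = [(0, 1), (1, 2)]

def cut(X):
    """Number of edges with exactly one endpoint in X (submodular)."""
    S = set(X)
    return sum((u in S) != (v in S) for u, v in edges)

def subsets(ground):
    for r in range(len(ground) + 1):
        for X in combinations(ground, r):
            yield X

# Verify submodularity: f(X) + f(Y) >= f(X | Y) + f(X & Y) for all X, Y.
for X in subsets(U):
    for Y in subsets(U):
        XS, YS = set(X), set(Y)
        assert cut(X) + cut(Y) >= cut(tuple(XS | YS)) + cut(tuple(XS & YS))

# Brute-force submodular function minimization (exponential time; the
# polynomial-time oracle-model algorithms are the ones cited in the text).
minimizer = min(subsets(U), key=cut)
print(cut(minimizer))  # 0, attained by the empty set
```

The brute-force loop is only for intuition; the cited algorithms achieve the same in polynomial time given a value oracle for f.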

In this paper, we consider constrained variants of the submodular function minimization problem. Such variants have been extensively studied in various fields [4, 7, 9, 10, 11, 12, 13, 14, 15, 21, 23]. For example, Iwata and Nagano [9] considered the submodular function minimization problem with vertex covering constraints, set covering constraints, and edge covering constraints, and gave approximability and inapproximability results. Goel, Karande, Tripathi, and Wang [4] considered the vertex cover problem, the shortest path problem, the perfect matching problem, and the minimum spanning tree problem with a monotone submodular cost function. Svitkina and Fleischer [21] also considered several optimization problems with a submodular cost function. In particular, Svitkina and Fleischer [21] proved that for the submodular function minimization problem with a cardinality lower bound, there does not exist a polynomial-time \(o(\sqrt{n / \ln n})\)-approximation algorithm. Iyer and Bilmes [10] and Kamiyama [14] considered the submodular function minimization problem with submodular set covering constraints. Furthermore, Jegelka and Bilmes [13] considered the submodular function minimization problem with cut constraints. Koufogiannakis and Young [15] considered the monotone submodular function minimization problem with general covering constraints. Hochbaum [7] considered the submodular minimization problem with linear constraints having at most two variables per inequality. Zhang and Vorobeychik [23] considered the submodular function minimization problem with routing constraints.

In this paper, we consider the non-negative submodular function minimization problem with covering type linear constraints. Assume that there exist m linear constraints, and we denote by \(\varDelta _i\) the number of non-zero coefficients in the ith constraint. Furthermore, we assume that \(\varDelta _1 \ge \varDelta _2 \ge \cdots \ge \varDelta _m\). For this problem, Koufogiannakis and Young [15] proposed a polynomial-time \(\varDelta _1\)-approximation algorithm. In this paper, we propose a new polynomial-time primal-dual approximation algorithm based on the approximation algorithm of Takazawa and Mizuno [22] for the covering integer program with \(\{0,1\}\)-variables and the approximation algorithm of Iwata and Nagano [9] for the submodular function minimization problem with set covering constraints. The approximation ratio of our algorithm is

$$\begin{aligned} \max \{\varDelta _2, \min \{\varDelta _1, 1 + \varPi \}\}, \end{aligned}$$

where \(\varPi \) is the maximum size of an inseparable component of the input submodular function, called its dependency (see the next section for the formal definition). It is not difficult to see that the approximation ratio of our algorithm is at most \(\varDelta _1\). Furthermore, if \(\varPi \) is small (i.e., the input submodular function is close to a linear function) and \(\varDelta _2\) is also small, then the approximation ratio of our algorithm improves on that of the algorithm of Koufogiannakis and Young [15]. For example, in the minimum knapsack problem with a forcing graph (see, e.g., [22] for its formal definition), \(\varDelta _1\) is large, but \(\varDelta _2\) is small.

2 Preliminaries

We denote by \({\mathbb {R}}\) and \({\mathbb {R}}_+\) the sets of real numbers and non-negative real numbers, respectively. For each finite set U, each vector v in \({\mathbb {R}}^U\), and each subset X of U, we define \(v(X) {:=} \sum _{u \in X}v(u)\).

Throughout this paper, we are given finite sets N and \(M = \{1,2,\ldots , m\}\) such that \(m \ge 2\), and a non-negative submodular function \(\rho :2^N \rightarrow {\mathbb {R}}_+\) such that \(\rho (\emptyset ) = 0\). We assume that for every subset X of N, we can compute \(\rho (X)\) in time bounded by a polynomial in |N|. Furthermore, we are given vectors a in \({\mathbb {R}}_+^{M \times N}\) and b in \({\mathbb {R}}_+^M\). For each subset X of N, we define the vector \(\chi _X\) in \(\{0,1\}^N\) by

$$\begin{aligned} \chi _X(j) {:=} {\left\{ \begin{array}{ll} 1 &{} \text{ if } j \in X \\ 0 &{} \text{ if } j \in N \setminus X. \end{array}\right. } \end{aligned}$$

Then we consider the following problem SCIP.

$$\begin{aligned} \begin{array}{cll} \text{ Minimize } &{}\quad \rho (X) \\ \text{ subject } \text{ to } &{}\quad \displaystyle {\sum _{j \in N}a(i,j) \chi _X(j) \ge b(i)} &{} \quad (i \in M) \\ &{}\quad X \subseteq N. \end{array} \end{aligned}$$

Without loss of generality, we assume that for every element i in M,

$$\begin{aligned} \sum _{j \in N}a(i,j) \ge b(i). \end{aligned}$$
(1)

Otherwise, there does not exist a feasible solution of SCIP.
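The setup so far can be sketched on a hypothetical toy instance (all data below is illustrative, not from the paper): the covering constraints of SCIP and assumption (1) can be checked directly.

```python
# Illustrative SCIP instance: N = {0,1,2}, M = {1,2}, coefficients a(i,j)
# and demands b(i). All numbers are toy assumptions.
N = [0, 1, 2]
a = {(1, 0): 2.0, (1, 1): 1.0, (1, 2): 0.0,
     (2, 0): 0.0, (2, 1): 1.0, (2, 2): 1.0}
b = {1: 2.0, 2: 1.0}
M = sorted(b)

# Assumption (1): sum_j a(i,j) >= b(i) for every i; otherwise SCIP is infeasible.
assert all(sum(a[i, j] for j in N) >= b[i] for i in M)

def feasible(X):
    """Does chi_X satisfy every covering constraint of SCIP?"""
    return all(sum(a[i, j] for j in X) >= b[i] for i in M)

print(feasible({0, 1}))  # True: constraint 1 gets 3 >= 2, constraint 2 gets 1 >= 1
print(feasible({2}))     # False: constraint 1 gets 0 < 2
```

Here \(\varDelta _1 = 2 = \varDelta _2\), since each constraint has two non-zero coefficients.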

For each element i in M, we define \(\varDelta _i\) as the number of elements j in N such that \(a(i,j) \ne 0\). Without loss of generality, we assume that \(\varDelta _1 \ge \varDelta _2 \ge \cdots \ge \varDelta _m\).

A subset X of N is said to be separable if there exists a non-empty proper subset Y of X such that

$$\begin{aligned} \rho (X) = \rho (Y) + \rho (X \setminus Y). \end{aligned}$$

Furthermore, a subset X of N is said to be inseparable if X is not separable. It is known [1, Proposition 4.4] that N can be uniquely partitioned into non-empty subsets \(I_1,I_2,\ldots ,I_{\delta }\) satisfying the following conditions, and that this partition can be computed in polynomial time by using the algorithm of Queyranne [19]. (For completeness, we give an algorithm for computing \(I_1,I_2,\ldots ,I_{\delta }\) in Sect. 5.)

  1.

    \(I_p\) is inseparable for every integer p in \(\{1,2,\ldots ,\delta \}\).

  2.

    For every subset X of N,

    $$\begin{aligned} \rho (X) = \rho (X \cap I_1) + \rho (X \cap I_2) + \cdots + \rho (X \cap I_{\delta }). \end{aligned}$$

Define

$$\begin{aligned} \varPi {:=} \max \{|I_1|, |I_2|, \ldots , |I_{\delta }|\}, \end{aligned}$$

and we call \(\varPi \) the dependency of \(\rho \). In this paper, we propose a polynomial-time approximation algorithm for SCIP whose approximation ratio is

$$\begin{aligned} \max \{\varDelta _2, \min \{\varDelta _1, 1 + \varPi \}\}. \end{aligned}$$

For SCIP, Koufogiannakis and Young [15] proved that if \(\rho \) is monotone, i.e., \(\rho (X) \le \rho (Y)\) for every pair of subsets X, Y of N such that \(X \subseteq Y\), then there exists a \(\varDelta _1\)-approximation algorithm. (See [9, p.675] for the monotonicity of an objective function.) Iwata and Nagano [9] considered the case where \(a(i,j) \in \{0,1\}\) and \(b(i) = 1\) for every element i in M and every element j in N, and proposed a \(\varDelta _1\)-approximation algorithm. Notice that if there exists a vector c in \({\mathbb {R}}_+^N\) such that \(\rho (X) = c(X)\) holds for every subset X of N, then the dependency \(\varPi \) is equal to 1. Thus, if we assume that \(\varDelta _2 \ge 2\), then the approximation ratio of our algorithm is \(\varDelta _2\). This implies that our result can be regarded as a generalization of the \(\varDelta _2\)-approximation algorithm of Takazawa and Mizuno [22] for the covering integer program with \(\{0,1\}\)-variables.
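For small ground sets, the inseparable partition \(I_1,I_2,\ldots ,I_{\delta }\) and the dependency \(\varPi \) can be computed by brute force (exponential time; the function below is an illustrative toy submodular function, not from the paper, and Queyranne's algorithm [19] gives the polynomial-time method):

```python
from itertools import combinations

def proper_subsets(X):
    X = sorted(X)
    for r in range(1, len(X)):
        for Y in combinations(X, r):
            yield set(Y)

def split(X, rho):
    """Return (Y, X \\ Y) for some separating Y, or None if X is inseparable."""
    for Y in proper_subsets(X):
        if abs(rho(X) - rho(Y) - rho(X - Y)) < 1e-9:
            return Y, X - Y
    return None

def components(N, rho):
    """Greedily refine {N} into inseparable parts I_1, ..., I_delta."""
    parts, result = [set(N)], []
    while parts:
        X = parts.pop()
        s = split(X, rho)
        if s is None:
            result.append(X)
        else:
            parts.extend(s)
    return result

def rho(X):
    """Toy submodular function: concave in |X & {0,1}| (coupling elements 0
    and 1), plus a modular term for element 2."""
    g = {0: 0, 1: 2, 2: 3}
    return g[len(X & {0, 1})] + len(X & {2})

I = components([0, 1, 2], rho)
Pi = max(len(p) for p in I)
print(sorted(map(sorted, I)), Pi)  # [[0, 1], [2]] 2
```

Since the concave part couples elements 0 and 1, the dependency is \(\varPi = 2\); for a modular (linear) function every component is a singleton and \(\varPi = 1\), matching the remark above.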

3 Algorithm

For proposing an approximation algorithm for SCIP, we need to introduce a linear programming relaxation of SCIP. This approach was proposed by Iwata and Nagano [9] for the submodular function minimization problem with set covering constraints.

We first define the function \({\widehat{\rho }} :{\mathbb {R}}_+^N \rightarrow {\mathbb {R}}_+\) called the Lovász extension of \(\rho \) [16]. Assume that we are given a vector v in \({\mathbb {R}}^N_+\), and let \({\hat{v}}_1> {\hat{v}}_2> \cdots > {\hat{v}}_s\) be the distinct values taken by v, i.e., \(\{{\hat{v}}_1,{\hat{v}}_2,\ldots ,{\hat{v}}_s\} = \{v(j) \mid j \in N\}\). Then for each integer p in \(\{1,2,\ldots ,s\}\), we define \(N_p\) by

$$\begin{aligned} N_p {:=} \{j \in N \mid v(j) \ge {\hat{v}}_p\}. \end{aligned}$$

Then we define \({\widehat{\rho }}(v)\) by

$$\begin{aligned} {\widehat{\rho }}(v) {:=} \sum _{p =1}^{s}\left( {\hat{v}}_p - {\hat{v}}_{p + 1}\right) \rho (N_p), \end{aligned}$$

where we define \({\hat{v}}_{s + 1} {:=} 0\). It is known [3] that

$$\begin{aligned} {\widehat{\rho }}(v) = \max _{z \in \mathrm{P}(\rho )} \sum _{j \in N} v(j) z(j), \end{aligned}$$
(2)

where we define \(\mathrm{P}(\rho )\) by

$$\begin{aligned} \mathrm{P}(\rho ) {:=} \{z \in {\mathbb {R}}^N \mid z(X) \le \rho (X) \text{ for } \text{ every } \text{ subset } X \text{ of } N\}. \end{aligned}$$
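The weighted-sum definition of the Lovász extension translates directly into code (a minimal sketch; the function rho below is an illustrative toy submodular function, not from the paper):

```python
N = [0, 1, 2]

def rho(X):
    """Illustrative toy submodular function (an assumption, not from the paper)."""
    g = {0: 0, 1: 2, 2: 3}
    X = set(X)
    return g[len(X & {0, 1})] + len(X & {2})

def lovasz_extension(v, N, rho):
    """hat_rho(v) = sum_{p=1}^{s} (v_p - v_{p+1}) * rho(N_p), with v_{s+1} := 0."""
    vals = sorted({v[j] for j in N}, reverse=True)  # distinct values v_1 > ... > v_s
    vals.append(0.0)                                # v_{s+1} := 0
    total = 0.0
    for p in range(len(vals) - 1):
        N_p = {j for j in N if v[j] >= vals[p]}     # threshold set at level v_p
        total += (vals[p] - vals[p + 1]) * rho(N_p)
    return total

# On 0/1 vectors the extension recovers rho: hat_rho(chi_X) = rho(X).
print(lovasz_extension({0: 1.0, 1: 1.0, 2: 0.0}, N, rho))   # 3.0 = rho({0,1})
# A fractional vector interpolates between threshold sets.
print(lovasz_extension({0: 0.5, 1: 0.25, 2: 0.0}, N, rho))  # 1.25
```

The second call expands to \(0.25\,\rho (\{0\}) + 0.25\,\rho (\{0,1\}) = 0.25 \cdot 2 + 0.25 \cdot 3 = 1.25\), illustrating the telescoping sum in the definition.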

By considering the dual problem of (2), we can see that for every vector v in \({\mathbb {R}}_+^N\), \({\widehat{\rho }}(v)\) is equal to the optimal objective value of the following problem (see, e.g., [9]).

$$\begin{aligned} \begin{array}{cll} \text{ Minimize } &{} \quad \displaystyle {\sum _{X \subseteq N}\rho (X)\xi (X)}\\ \text{ subject } \text{ to } &{} \quad \displaystyle {\sum _{X \subseteq N :j \in X}\xi (X) = v(j)} &{} \quad (j \in N) \\ &{} \quad \xi \in {\mathbb {R}}^{2^N}_+. \end{array} \end{aligned}$$
(3)

It is not difficult to see that for every subset X of N, \(\rho (X) = {\widehat{\rho }}(\chi _X)\). Thus, SCIP is equivalent to the following problem.

$$\begin{aligned} \begin{array}{cll} \text{ Minimize } &{} \quad {\widehat{\rho }}(x) \\ \text{ subject } \text{ to } &{} \quad \displaystyle {\sum _{j \in N}a(i,j) x(j) \ge b(i)} &{} \quad (i \in M) \\ &{} \quad x \in \{0,1\}^N. \end{array} \end{aligned}$$
(4)

Define the vectors \({\overline{a}}\) in \({\mathbb {R}}^{M \times N \times 2^N}_+\) and \({\overline{b}}\) in \({\mathbb {R}}^{M \times 2^N}_+\) by

$$\begin{aligned} \begin{aligned}&{\overline{b}}(i,A) {:=} \max \Big \{0, b(i) - \sum _{j \in A} a(i,j)\Big \},\\&{\overline{a}}(i,j,A) {:=} \min \{a(i,j), {\overline{b}}(i,A)\}. \end{aligned} \end{aligned}$$
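These knapsack-cover-style coefficients can be sketched on a hypothetical single-constraint instance (toy numbers, assumed for illustration): \({\overline{b}}(i,A)\) is the residual demand once the elements of A are taken, and \({\overline{a}}(i,j,A)\) truncates each coefficient at that residual.

```python
# Toy instance (an assumption, not from the paper); notation follows the text.
N = [0, 1, 2]
a = {(1, 0): 3.0, (1, 1): 2.0, (1, 2): 2.0}
b = {1: 4.0}

def b_bar(i, A):
    """Residual demand of constraint i once the elements of A are taken."""
    return max(0.0, b[i] - sum(a[i, j] for j in A))

def a_bar(i, j, A):
    """Coefficient a(i,j) truncated at the residual demand."""
    return min(a[i, j], b_bar(i, A))

print(b_bar(1, set()), a_bar(1, 0, set()))  # 4.0 3.0
print(b_bar(1, {0}), a_bar(1, 1, {0}))      # 1.0 1.0  (truncation kicks in)
print(b_bar(1, {0, 1}))                     # 0.0
```

The truncation is what makes the constraints of (5) stronger than those of (4) for fractional x, while Theorem 1 shows they coincide on \(\{0,1\}^N\).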

Then we consider the following problem.

$$\begin{aligned} \begin{array}{cll} \text{ Minimize } &{} \quad {\widehat{\rho }}(x) \\ \text{ subject } \text{ to } &{} \quad \displaystyle {\sum _{j \in N \setminus A}{\overline{a}}(i,j,A) x(j) \ge {\overline{b}}(i,A)} &{} \quad (i \in M, A \subseteq N) \\ &{} \quad x \in \{0,1\}^N. \end{array} \end{aligned}$$
(5)

The constraints of (5) are based on the results of [1, 2]. It is known [1, 2] that for every vector x in \(\{0,1\}^N\), x is a feasible solution of the problem (4) if and only if x is a feasible solution of the problem (5). We give the proof of this statement for completeness.

Theorem 1

For every vector x in \(\{0,1\}^N\), x is a feasible solution of the problem (4) if and only if x is a feasible solution of the problem (5).

Proof

Let us fix a vector x in \(\{0,1\}^N\) and an element i in M. Assume that x is a feasible solution of the problem (4). Let A be a subset of N. If there exists an element \(j^{*}\) in \(N \setminus A\) such that \(x(j^{*}) = 1\) and \(a(i,j^{*}) \ge {\overline{b}}(i,A)\), then since \({\overline{a}}(i,j,A) \ge 0\) for every element j in N,

$$\begin{aligned} \sum _{j \in N \setminus A}{\overline{a}}(i,j,A) x(j) \ge {\overline{a}}(i,j^{*},A) = {\overline{b}}(i,A). \end{aligned}$$

Assume that \(a(i,j) < {\overline{b}}(i,A)\) for every element j in \(N \setminus A\) such that \(x(j) = 1\). Since \({\overline{a}}(i,j,A) \ge 0\) for every element j in N,

$$\begin{aligned} \sum _{j \in N \setminus A}{\overline{a}}(i,j,A) x(j) \ge 0. \end{aligned}$$

Furthermore, since

$$\begin{aligned} \sum _{j \in N}a(i,j) x(j) \ge b(i), \end{aligned}$$

we have

$$\begin{aligned} \begin{aligned} \sum _{j \in N \setminus A}{\overline{a}}(i,j,A) x(j)&=\sum _{j \in N \setminus A}a(i,j) x(j) \\&\ge b(i) - \sum _{j \in A}a(i,j)x(j) \\&\ge b(i) - \sum _{j \in A}a(i,j). \end{aligned} \end{aligned}$$

This implies that x is a feasible solution of the problem (5).

Assume that x is a feasible solution of the problem (5). Then we have

$$\begin{aligned} \sum _{j \in N}a(i,j)x(j) \ge \sum _{j \in N}{\overline{a}}(i,j,\emptyset )x(j) \ge {\overline{b}}(i,\emptyset ) \ge b(i). \end{aligned}$$

This implies that x is a feasible solution of the problem (4). \(\square \)
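Theorem 1 can be sanity-checked by brute force on a toy instance (all data below is an illustrative assumption; the loop compares the feasible sets of problems (4) and (5) over every subset):

```python
from itertools import combinations

# Toy instance satisfying assumption (1); not from the paper.
N = [0, 1, 2]
a = {(1, 0): 3.0, (1, 1): 2.0, (1, 2): 2.0,
     (2, 0): 0.0, (2, 1): 1.0, (2, 2): 1.0}
b = {1: 4.0, 2: 1.0}
M = [1, 2]

def b_bar(i, A):
    return max(0.0, b[i] - sum(a[i, j] for j in A))

def a_bar(i, j, A):
    return min(a[i, j], b_bar(i, A))

def subsets(ground):
    return [set(c) for r in range(len(ground) + 1) for c in combinations(ground, r)]

def feas4(X):
    """Covering constraints of problem (4)."""
    return all(sum(a[i, j] for j in X) >= b[i] for i in M)

def feas5(X):
    """Knapsack-cover constraints of problem (5), over every A subseteq N."""
    return all(sum(a_bar(i, j, A) for j in X - A) >= b_bar(i, A)
               for i in M for A in subsets(N))

assert all(feas4(X) == feas5(X) for X in subsets(N))
print("Theorem 1 holds on this instance")
```

Of course the assertion only spot-checks one instance; the proof above covers the general case.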

We consider the following relaxation problem RP of the problem (5). Notice that Theorem 1 implies that RP is a relaxation problem of the problem (4).

$$\begin{aligned} \begin{array}{cll} \text{ Minimize } &{} \quad {\widehat{\rho }}(x) \\ \text{ subject } \text{ to } &{} \quad \displaystyle {\sum _{j \in N \setminus A}{\overline{a}}(i,j,A) x(j) \ge {\overline{b}}(i,A)} &{} \quad (i \in M, A \subseteq N) \\ &{} \quad x \in {\mathbb {R}}_+^N. \end{array} \end{aligned}$$

Since for every vector v in \({\mathbb {R}}_+^N\), \({\widehat{\rho }}(v)\) is equal to the optimal objective value of the problem (3), the optimal objective value of RP is equal to that of the following problem LP.

$$\begin{aligned} \begin{array}{cll} \text{ Minimize } &{} \quad \displaystyle {\sum _{X \subseteq N} \rho (X) \xi (X)} \\ \text{ subject } \text{ to } &{} \quad \displaystyle {\sum _{j \in N \setminus A}{\overline{a}}(i,j,A) x(j) \ge {\overline{b}}(i,A)} &{} \quad (i \in M, A \subseteq N) \\ &{} \quad \displaystyle {\sum _{X \subseteq N :j \in X} \xi (X) = x(j)} &{}\quad (j \in N) \\ &{}\quad (x,\xi ) \in {\mathbb {R}}^N \times {\mathbb {R}}_+^{2^N}. \end{array} \end{aligned}$$

Notice that we omit the redundant non-negativity constraint on x. Then the dual problem of LP can be described as follows.

$$\begin{aligned} \begin{array}{cll} \text{ Maximize } &{} \quad \displaystyle {\sum _{i \in M}\sum _{A \subseteq N} {\overline{b}}(i,A) y(i,A)} \\ \text{ subject } \text{ to } &{} \quad \displaystyle {\sum _{i \in M}\sum _{A \subseteq N :j \notin A} {\overline{a}}(i,j,A)y(i,A) = z(j)} &{} \quad (j \in N) \\ &{} \quad (y,z) \in {\mathbb {R}}_+^{M \times 2^N} \times \mathrm{P}(\rho ). \end{array} \end{aligned}$$

We call this problem DLP.

Let z be a vector in \(\mathrm{P}(\rho )\). Define the function \(\rho - z :2^N \rightarrow {\mathbb {R}}_+\) by \((\rho - z)(X) {:=} \rho (X) - z(X)\). Then \(\rho - z\) is submodular, and \(\min _{X \subseteq N} (\rho - z)(X) = (\rho - z)(\emptyset ) = 0\). Furthermore, it is not difficult to see from the submodularity of \(\rho - z\) that for every pair of minimizers X, Y of \(\rho - z\), \(X \cup Y\) is a minimizer of \(\rho - z\). Thus, there exists a unique maximal subset X of N such that \(\rho (X) = z(X)\).
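On a small ground set, this maximal tight set can be found by brute force as the union of all tight sets (exponential time; a polynomial-time method is known, as noted in the analysis; the function rho is the same illustrative toy function used earlier):

```python
from itertools import combinations

N = [0, 1, 2]

def rho(X):
    """Illustrative toy submodular function (an assumption, not from the paper)."""
    g = {0: 0, 1: 2, 2: 3}
    return g[len(X & {0, 1})] + len(X & {2})

def subsets(ground):
    return [set(c) for r in range(len(ground) + 1) for c in combinations(ground, r)]

def in_P(z):
    """z in P(rho): z(X) <= rho(X) for every subset X of N."""
    return all(sum(z[j] for j in X) <= rho(X) + 1e-9 for X in subsets(N))

def maximal_tight_set(z):
    """Tight sets (z(X) = rho(X)) are closed under union, so their union is
    the unique maximal tight set."""
    tight = [X for X in subsets(N) if abs(sum(z[j] for j in X) - rho(X)) < 1e-9]
    return set().union(*tight)

z = {0: 2.0, 1: 1.0, 2: 0.5}
assert in_P(z)
print(sorted(maximal_tight_set(z)))  # [0, 1]: z({0,1}) = 3 = rho({0,1})
```

This is exactly the computation Algorithm 1 performs when it updates \(S_{t+1}\) from \(z_{t+1}\).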

We are now ready to propose our algorithm, called Algorithm 1. This algorithm is based on the approximation algorithm of Takazawa and Mizuno [22] for the covering integer program with \(\{0,1\}\)-variables. For each element i in M and each subset S of N, we define a vector \(g_{i,S}\) in \({\mathbb {R}}^N_+\) by

$$\begin{aligned} g_{i,S}(j) {:=} {\left\{ \begin{array}{ll} {\overline{a}}(i,j,S) &{} \text{ if } j \in N \setminus S \\ 0 &{} \text{ if } j \in S. \end{array}\right. } \end{aligned}$$

Then Algorithm 1 can be described as follows. Notice that \(y_1,y_2,\ldots ,y_T\) are needed only for the analysis of Algorithm 1.

[Algorithm 1: pseudocode given as a figure in the original, omitted here]

The following lemmas imply that Algorithm 1 is well-defined and halts in finite time.

Lemma 1

Assume that we are given an element i in M and a subset S of N such that \({\overline{b}}(i,S) > 0\). Then there exists an element j in \(N \setminus S\) such that \({\overline{a}}(i,j,S) > 0\). Furthermore, there exists a subset X of N such that \(g_{i,S}(X) \ne 0\).

Proof

The second statement follows from the first statement. To prove the first statement, suppose for contradiction that \({\overline{a}}(i,j,S) = 0\) for every element j in \(N \setminus S\) (notice that \({\overline{a}}(i,j,S) \ge 0\)). Then for every element j in \(N \setminus S\), since \({\overline{b}}(i, S) > 0\), the definition of \({\overline{a}}(i,j,S)\) implies that \(a(i,j) = 0\). Thus, we have

$$\begin{aligned} b(i) > \sum _{j \in S}a(i,j) = \sum _{j \in N}a(i,j), \end{aligned}$$

where the strict inequality follows from the fact that \({\overline{b}}(i,S) > 0\). This contradicts (1). \(\square \)

Lemma 2

Assume that we are given an element i in M, a subset S of N, and a vector z in \(\mathrm{P}(\rho )\) such that \({\overline{b}}(i,S) > 0\). Furthermore, we assume that S is the unique maximal subset of N such that \(\rho (S) = z(S)\). If we define

$$\begin{aligned} \alpha {:=} \min _{X \subseteq N:g_{i,S}(X) \ne 0} \frac{\rho (X) - z(X)}{g_{i,S}(X)} \end{aligned}$$

and \(z^{\prime } {:=} z + \alpha \cdot g_{i,S}\), then we have

(1):

\(z^{\prime } \in \mathrm{P}(\rho )\).

Furthermore, we define \(S^{\prime }\) as the maximal subset of N such that \(\rho (S^{\prime }) = z^{\prime }(S^{\prime })\). Then we have

(2):

\(S \subsetneq S^{\prime }\).

Proof

We first prove (1). For every subset X of N such that \(g_{i,S}(X) = 0\), we have \(z^{\prime }(X) = z(X) \le \rho (X)\). Furthermore, for every subset X of N such that \(g_{i,S}(X) \ne 0\),

$$\begin{aligned} z^{\prime }(X) = z(X) + \alpha \cdot g_{i,S}(X) \le z(X) + \frac{\rho (X) - z(X)}{g_{i,S}(X)} \cdot g_{i,S}(X) = \rho (X). \end{aligned}$$

This completes the proof.

Next we prove (2). Since \(z^{\prime }(j) = z(j)\) for every element j in S, \(\rho (S) = z^{\prime }(S)\). The maximality of \(S^{\prime }\) implies that \(S \subseteq S^{\prime }\). Let Z be a subset of N such that \(g_{i,S}(Z) \ne 0\) and

$$\begin{aligned} \alpha = \frac{\rho (Z) - z(Z)}{g_{i,S}(Z)}. \end{aligned}$$

Then \(\rho (Z) = z^{\prime }(Z)\). The maximality of \(S^{\prime }\) implies that \(Z \subseteq S^{\prime }\) holds. Furthermore, since \(g_{i,S}(Z) \ne 0\), we have \(Z \not \subseteq S\), which implies that \(S\subsetneq S^{\prime }\). This completes the proof. \(\square \)
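One raising step of Lemma 2 can be simulated by brute force on a toy instance (all data below is an illustrative assumption): starting from \(z = 0\) with tight set \(S = \emptyset \), we compute \(\alpha \), raise z along \(g_{1,S}\), and observe that \(z^{\prime } \in \mathrm{P}(\rho )\) and that the maximal tight set strictly grows.

```python
from itertools import combinations

# Toy data (assumptions, not from the paper).
N = [0, 1, 2]
a = {(1, 0): 3.0, (1, 1): 2.0, (1, 2): 2.0}
b = {1: 4.0}

def rho(X):
    g = {0: 0, 1: 2, 2: 3}
    return g[len(X & {0, 1})] + len(X & {2})

def subsets(ground):
    return [set(c) for r in range(len(ground) + 1) for c in combinations(ground, r)]

def b_bar(i, A):
    return max(0.0, b[i] - sum(a[i, j] for j in A))

def g_vec(i, S):
    """The vector g_{i,S}: truncated coefficients outside S, zero on S."""
    return {j: (min(a[i, j], b_bar(i, S)) if j not in S else 0.0) for j in N}

def maximal_tight_set(z):
    tight = [X for X in subsets(N) if abs(sum(z[j] for j in X) - rho(X)) < 1e-9]
    return set().union(*tight)

z = {j: 0.0 for j in N}
S = maximal_tight_set(z)          # set(): rho is positive on nonempty sets
g = g_vec(1, S)
alpha = min((rho(X) - sum(z[j] for j in X)) / sum(g[j] for j in X)
            for X in subsets(N) if sum(g[j] for j in X) != 0)
z_new = {j: z[j] + alpha * g[j] for j in N}
S_new = maximal_tight_set(z_new)
assert all(sum(z_new[j] for j in X) <= rho(X) + 1e-9 for X in subsets(N))  # (1)
assert S < S_new                                                           # (2)
print(alpha, sorted(S_new))  # 0.5 [2]
```

Here the minimum ratio is attained by \(X = \{2\}\), so after the step \(\{2\}\) becomes tight, matching Lemma 2(2).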

Notice that since \(Q_{\delta } = S_T\) and \({\overline{b}}(1,S_T) = 0\), \(\beta \) is well-defined.

4 Analysis

In this section, we analyze properties of Algorithm 1.

We first prove that Algorithm 1 is a polynomial-time algorithm. It follows from Lemma 2(2) that T is at most \(|N| + 1\). It is known [18] that \(\alpha _t\) can be computed in polynomial time. Furthermore, it is known (see, e.g., [17, Note 10.11]) that we can find the unique maximal subset \(S_{t+1}\) of N such that \(\rho (S_{t+1}) = z_{t+1}(S_{t+1})\) in polynomial time. These imply that Algorithm 1 is a polynomial-time algorithm.

Next we evaluate the approximation ratio.

Lemma 3

For every integer t in \(\{1,2,\ldots ,T\}\), \((y_t,z_t)\) is a feasible solution of DLP.

Proof

We prove this lemma by induction on t. If \(t = 1\), then this lemma follows from the fact that \(\rho (X) \ge 0\) for every subset X of N. Assume that this lemma holds for \(t = k\) (\(k \ge 1\)), and consider the case \(t = k + 1\). Assume that \(t(r+1) < k + 1 \le t(r)\) for an integer r in \(\{1,2,\ldots ,m\}\), where we define \(t(m+1) {:=} 0\). Since \(\alpha _k \ge 0\) follows from \(z_k \in \mathrm{P}(\rho )\), we have \(y_{k+1} \in {\mathbb {R}}_+^{M \times 2^N}\). Furthermore, Lemma 2(1) implies that \(z_{k+1} \in \mathrm{P}(\rho )\). For every element j in \(S_k\), \(z_{k+1}(j) = z_k(j)\) and

$$\begin{aligned} \sum _{i \in M}\sum _{A \subseteq N :j \notin A} {\overline{a}}(i,j,A)y_{k+1}(i,A) = \sum _{i \in M}\sum _{A \subseteq N :j \notin A} {\overline{a}}(i,j,A)y_{k}(i,A). \end{aligned}$$

For every element j in \(N \setminus S_k\), \(z_{k+1}(j) - z_{k}(j) = {\overline{a}}(r,j,S_k) \cdot \alpha _k\) and

$$\begin{aligned} \begin{aligned}&\sum _{i \in M}\sum _{A \subseteq N :j \notin A} {\overline{a}}(i,j,A)y_{k+1}(i,A) - \sum _{i \in M}\sum _{A \subseteq N :j \notin A} {\overline{a}}(i,j,A)y_{k}(i,A) \\&\quad = {\overline{a}}(r,j,S_k) \cdot (y_{k+1}(r,S_k) - y_k(r,S_k)) \\&\quad = {\overline{a}}(r,j,S_k) \cdot \alpha _k. \end{aligned} \end{aligned}$$

This completes the proof. \(\square \)

Lemma 4

The vector \(\chi _Q\) is a feasible solution of the problem (4), i.e., Q is a feasible solution of SCIP.

Proof

Let i be an element in M. Define a subset X of N by

$$\begin{aligned} X {:=} {\left\{ \begin{array}{ll} S_{t(i)} &{} \text{ if } i \ne 1 \\ Q &{} \text{ if } i = 1. \end{array}\right. } \end{aligned}$$

Since \({\overline{b}}(i,X) = 0\), we have

$$\begin{aligned} b(i) \le \sum _{j \in X}a(i,j). \end{aligned}$$

Thus, since \(a(i,j) \ge 0\) for every element j in N and \(X \subseteq Q\), we have

$$\begin{aligned} b(i) \le \sum _{j \in X}a(i,j) \le \sum _{j \in Q}a(i,j). \end{aligned}$$

This implies that \(\chi _Q\) is a feasible solution of the problem (4). This completes the proof. \(\square \)

Lemma 5

We have \(\rho (Q) = z_T(Q)\).

Proof

If \(t(1) = t(2)\), then \(Q = S_T\), and thus this lemma follows from \(z_T(S_T) = \rho (S_T)\). In what follows, we assume that \(t(1) \ne t(2)\). Since \(z_T \in \mathrm{P}(\rho )\),

$$\begin{aligned} z_T(S_{T-1} \cap I_p) \le \rho (S_{T-1} \cap I_p) \end{aligned}$$

for every integer p in \(\{1,2,\ldots ,\delta \}\). Since \(z_{T-1}(S_{T-1}) = \rho (S_{T-1})\) and \(z_T(j) = z_{T-1}(j)\) for every element j in \(S_{T-1}\),

$$\begin{aligned} \begin{aligned} z_T(S_{T-1})&= z_{T-1}(S_{T-1}) \\&= \rho (S_{T-1}) \\&= \rho (S_{T-1} \cap I_1) + \rho (S_{T-1} \cap I_2) + \cdots + \rho (S_{T-1} \cap I_{\delta }) \\&\ge z_T(S_{T-1} \cap I_1) + z_T(S_{T-1} \cap I_2) + \cdots + z_T(S_{T-1} \cap I_{\delta }) \\&= z_T(S_{T-1}). \end{aligned} \end{aligned}$$

This implies that we have

$$\begin{aligned} z_T(S_{T-1} \cap I_p) = \rho (S_{T-1} \cap I_p) \end{aligned}$$

for every integer p in \(\{1,2,\ldots ,\delta \}\). In the same way, we can prove that

$$\begin{aligned} z_T(S_T \cap I_p) = \rho (S_T \cap I_p) \end{aligned}$$

for every integer p in \(\{1,2,\ldots ,\delta \}\). Thus, since

$$\begin{aligned} Q = (S_{T} \cap I_1) \cup \cdots \cup (S_{T} \cap I_{\beta }) \cup (S_{T-1} \cap I_{\beta + 1}) \cup \cdots \cup (S_{T-1} \cap I_{\delta }), \end{aligned}$$

we have

$$\begin{aligned} \begin{aligned} \rho (Q)&= \rho (S_{T} \cap I_1) + \cdots + \rho (S_{T} \cap I_{\beta }) + \rho (S_{T-1} \cap I_{\beta + 1}) + \cdots + \rho (S_{T-1} \cap I_{\delta }) \\&= z_T(S_{T} \cap I_1) + \cdots + z_T(S_{T} \cap I_{\beta }) + z_T(S_{T-1} \cap I_{\beta + 1}) + \cdots \\&\quad + z_T(S_{T-1} \cap I_{\delta }) \\&= z_T(Q). \end{aligned} \end{aligned}$$

This completes the proof. \(\square \)

Theorem 2

Algorithm 1 is an approximation algorithm for SCIP whose approximation ratio is \(\max \{\varDelta _2, \min \{\varDelta _1, 1 + \varPi \}\}\).

Proof

Lemma 4 implies that Algorithm 1 is an approximation algorithm for SCIP. Let OPT be the optimal objective value of SCIP. Lemma 3 implies that

$$\begin{aligned} \sum _{i \in M}\sum _{A \subseteq N} {\overline{b}}(i,A)y_T(i,A) \le \mathsf{OPT}. \end{aligned}$$
(6)

Furthermore, Lemma 5 implies that

$$\begin{aligned} \begin{aligned} \rho (Q) = z_T(Q)&= \sum _{j \in Q}\sum _{i \in M}\sum _{A \subseteq N :j \notin A} {\overline{a}}(i,j,A)y_T(i,A) \\&= \sum _{i \in M}\sum _{A \subseteq N}\sum _{j \in Q\setminus A} {\overline{a}}(i,j,A)y_T(i,A). \end{aligned} \end{aligned}$$
(7)

Let i be an element in M. Then we have

$$\begin{aligned} \begin{aligned} \sum _{A \subseteq N}\sum _{j \in Q\setminus A} {\overline{a}}(i,j,A)y_T(i,A)&= \sum _{A \subseteq N}\sum _{j \in Q \setminus A :a(i,j) \ne 0} {\overline{a}}(i,j,A)y_T(i,A)\\&\le \sum _{A \subseteq N}\sum _{j \in Q \setminus A :a(i,j) \ne 0} {\overline{b}}(i,A)y_T(i,A) \\&\le \varDelta _i \cdot \sum _{A \subseteq N} {\overline{b}}(i,A)y_T(i,A). \end{aligned} \end{aligned}$$
(8)

Assume that \(t(1) \ne t(2)\). Define \(Q_0 {:=} S_{T-1}\). For every subset A of \(S_{T-1}\),

$$\begin{aligned} \begin{aligned} \sum _{j \in Q_{\beta -1}\setminus A} {\overline{a}}(1,j,A)&\le \sum _{j \in Q_{\beta -1}\setminus A} a(1,j) \\&= \sum _{j \in Q_{\beta -1}} a(1,j) - \sum _{j \in A} a(1,j) \ \ \ \ \ \ \text{(by } A \subseteq Q_{\beta -1}\text{) } \\&< b(1) - \sum _{j \in A} a(1,j) \\&\le {\overline{b}}(1,A), \end{aligned} \end{aligned}$$
(9)

where the strict inequality follows from the definition of \(\beta \) (i.e., \({\overline{b}}(1,Q_{\beta -1}) > 0\)). Furthermore, the definition of Algorithm 1 and Lemma 2(2) imply that for every subset A of N, if \(y_T(1,A) > 0\), then \(A \subseteq S_{T-1}\). Thus,

$$\begin{aligned} \begin{aligned}&\sum _{A \subseteq N}\sum _{j \in Q\setminus A} {\overline{a}}(1,j,A)y_T(1,A) \\&\quad = \sum _{A \subseteq S_{T-1}}\sum _{j \in Q\setminus A} {\overline{a}}(1,j,A)y_T(1,A)\\&\quad = \sum _{A \subseteq S_{T-1}}y_T(1,A)\sum _{j \in Q\setminus A} {\overline{a}}(1,j,A)\\&\quad = \sum _{A \subseteq S_{T-1}}y_T(1,A) \Big \{\sum _{j \in Q_{\beta -1}\setminus A} {\overline{a}}(1,j,A) + \sum _{j \in Q \setminus Q_{\beta -1}}{\overline{a}}(1,j,A)\Big \} \\&\quad \le \sum _{A \subseteq S_{T-1}}y_T(1,A) \Big \{{\overline{b}}(1,A) + \sum _{j \in Q \setminus Q_{\beta -1}}{\overline{b}}(1,A)\Big \} \qquad (\text{ by } (9)) \\&\quad \le \sum _{A \subseteq S_{T-1}}y_T(1,A) \Big \{{\overline{b}}(1,A) + \varPi \cdot {\overline{b}}(1,A)\Big \} \qquad (\text{ by } |I_{\beta }| \le \varPi ) \\&\quad = (1 + \varPi ) \cdot \sum _{A \subseteq S_{T-1}}{\overline{b}}(1,A)y_T(1,A)\\&\quad = (1 + \varPi ) \cdot \sum _{A \subseteq N}{\overline{b}}(1,A)y_T(1,A). \end{aligned} \end{aligned}$$
(10)

Notice that if \(t(1) = t(2)\), then \(y_T(1,A) = 0\) for every subset A of N. Thus, (6), (7), (8), and (10) imply that

$$\begin{aligned} \begin{aligned}&\rho (Q) = \sum _{i \in M}\sum _{A \subseteq N}\sum _{j \in Q\setminus A} {\overline{a}}(i,j,A)y_T(i,A)\\&\quad = \sum _{A \subseteq N}\sum _{j \in Q\setminus A} {\overline{a}}(1,j,A)y_T(1,A) + \sum _{i \in M \setminus \{1\}}\sum _{A \subseteq N}\sum _{j \in Q\setminus A} {\overline{a}}(i,j,A)y_T(i,A) \\&\quad \le \min \{\varDelta _1, 1 + \varPi \} \cdot \sum _{A \subseteq N}{\overline{b}}(1,A)y_T(1,A) + \varDelta _2 \cdot \sum _{i \in M \setminus \{1\}}\sum _{A \subseteq N} {\overline{b}}(i,A)y_T(i,A) \\&\quad \le \max \{\varDelta _2, \min \{\varDelta _1, 1 + \varPi \}\} \cdot \sum _{i \in M}\sum _{A \subseteq N} {\overline{b}}(i,A)y_T(i,A) \\&\quad \le \max \{\varDelta _2, \min \{\varDelta _1, 1 + \varPi \}\} \cdot \mathsf{OPT}. \end{aligned} \end{aligned}$$

This completes the proof. \(\square \)

5 An Algorithm for Computing \(I_1,I_2,\ldots ,I_{\delta }\)

It is known [1, Proposition 4.4] that we can compute \(I_1,I_2,\ldots ,I_{\delta }\) by greedily splitting a separable subset in the current partition. Formally speaking, we can compute \(I_1,I_2,\ldots ,I_{\delta }\) by using Algorithm 2.

[Algorithm 2: pseudocode given as a figure in the original, omitted here]

For proving that \(I_1,I_2,\ldots ,I_{\delta }\) can be computed in polynomial time, it suffices to prove that the following problem can be solved in polynomial time.

Input:

A subset X of N.

Task:

Decide whether there exists a non-empty proper subset Y of X such that \(\rho (X) = \rho (Y) + \rho (X \setminus Y)\). If there exists such a subset Y, then find Y.

Define \({\overline{\rho }} :2^X \rightarrow {\mathbb {R}}\) by

$$\begin{aligned} {\overline{\rho }}(Y) {:=} \rho (Y) + \rho (X \setminus Y) - \rho (X). \end{aligned}$$

Then it is not difficult to see that for every subset Y of X, we can compute \({\overline{\rho }}(Y)\) in time bounded by a polynomial in |N|. Furthermore, \({\overline{\rho }}(\emptyset ) = {\overline{\rho }}(X) = 0\) and \({\overline{\rho }}(Y) = {\overline{\rho }}(X \setminus Y)\) for every subset Y of X. For each pair of subsets Y, Z of X,

$$\begin{aligned} \begin{aligned} {\overline{\rho }}(Y) + {\overline{\rho }}(Z)&= \rho (Y) + \rho (X \setminus Y) - \rho (X) + \rho (Z) + \rho (X \setminus Z) - \rho (X) \\&\ge \rho (Y \cup Z) + \rho (Y \cap Z) + \rho (X \setminus (Y \cap Z)) \\&\quad + \rho (X \setminus (Y\cup Z)) - 2 \rho (X) \\&= \rho (Y \cup Z) + \rho (X \setminus (Y \cup Z)) - \rho (X)\\&\quad + \rho (Y \cap Z) + \rho (X \setminus (Y \cap Z)) - \rho (X) \\&= {\overline{\rho }}(Y \cup Z) + {\overline{\rho }}(Y \cap Z). \end{aligned} \end{aligned}$$

That is, \({\overline{\rho }}\) is a submodular function. For every subset Y of X,

$$\begin{aligned} 2 {\overline{\rho }}(Y) = {\overline{\rho }}(Y) + {\overline{\rho }}(X \setminus Y) \ge {\overline{\rho }}(X) + {\overline{\rho }}(\emptyset ) = 0. \end{aligned}$$

Thus, there exists a non-empty proper subset Y of X such that \(\rho (X) = \rho (Y) + \rho (X \setminus Y)\) if and only if there exists a minimizer Y of \({\overline{\rho }}\) such that \(Y \ne \emptyset , X\). It is known [19] that we can find a non-empty proper subset Y of X minimizing \({\overline{\rho }}(Y)\) among all non-empty proper subsets of X in polynomial time. Let \(Y^{*}\) be a non-empty proper subset of X minimizing \({\overline{\rho }}(Y^{*})\) among all non-empty proper subsets of X. If \({\overline{\rho }}(Y^{*}) >0\) holds, then there does not exist a non-empty proper subset Y of X such that \(\rho (X) = \rho (Y) + \rho (X \setminus Y)\). Otherwise, \(Y^{*}\) is a solution of the above problem. This completes the proof.
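The decision problem above can be sketched by brute-force minimization of \({\overline{\rho }}\) over non-empty proper subsets (exponential time; Queyranne's algorithm [19] gives the polynomial-time version; the function rho is the same illustrative toy function used earlier):

```python
from itertools import combinations

def rho(X):
    """Toy submodular function: couples elements 0 and 1; element 2 is modular."""
    g = {0: 0, 1: 2, 2: 3}
    return g[len(X & {0, 1})] + len(X & {2})

def separate(X, rho):
    """Minimize rho_bar(Y) = rho(Y) + rho(X \\ Y) - rho(X) over non-empty proper
    subsets Y of X. Return a separating Y, or None when X is inseparable
    (i.e., the minimum is strictly positive)."""
    X = set(X)
    best = None
    for r in range(1, len(X)):
        for Y in combinations(sorted(X), r):
            Y = set(Y)
            val = rho(Y) + rho(X - Y) - rho(X)
            if best is None or val < best[0]:
                best = (val, Y)
    if best is None or best[0] > 1e-9:
        return None  # rho_bar(Y*) > 0: no separating subset exists
    return best[1]

print(separate({0, 1, 2}, rho))  # a separator exists, e.g. {2}
print(separate({0, 1}, rho))     # None: {0, 1} is inseparable
```

Calling separate repeatedly on the parts of the current partition until every part returns None yields the components \(I_1,I_2,\ldots ,I_{\delta }\), as in Algorithm 2.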