1 Introduction

Submodular function maximization has drawn much attention for both its practical and theoretical interest [5, 6, 11, 13]. For a given set V, the function \(f: 2^V\rightarrow \mathbb {R}\) is said to be submodular if \(f(X)+f(Y)\ge f(X\cap Y)+f(X\cup Y)\) for all \(X, Y\subseteq V\). A set function f is called monotone if \(f(X)\le f(Y)\) for all \(X \subseteq Y \subseteq V\), and it is said to be normalized when \(f(\emptyset )=0\). The well-known greedy algorithm achieves a constant-factor approximation ratio of \(1-1/e\) for submodular maximization subject to a cardinality constraint [10]. The bounds can be improved if one makes further assumptions on the submodular functions. For example, the curvature of a submodular function f in [3] is defined as

$$k_f=1-\min _{v\in V}\frac{f(V)-f(V\backslash \{v\})}{f(v)},$$

Noting that the curvature is computable with a linear number of function oracle calls, the greedy algorithm achieves a \(\frac{1}{k_f}(1-e^{-k_f})\) guarantee under a cardinality constraint [3].
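To make the linear-time computation concrete, here is a minimal Python sketch: beyond \(f(V)\), it queries only \(f(V\setminus \{v\})\) and \(f(\{v\})\) for each element. The oracle interface and the coverage example are our own illustration, not from [3].

```python
def curvature(f, V):
    """k_f = 1 - min_v (f(V) - f(V \\ {v})) / f({v}).

    f: value oracle mapping a frozenset to a real number;
    assumes f({v}) > 0 for every v (monotone, normalized f).
    Uses O(|V|) oracle calls.
    """
    V = frozenset(V)
    fV = f(V)
    return 1 - min((fV - f(V - {v})) / f(frozenset({v})) for v in V)

# Coverage function f(S) = |union of the sets indexed by S|.
sets = {1: {"a", "b"}, 2: {"b", "c"}}
f = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
print(curvature(f, {1, 2}))  # 0.5
```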

The greedy algorithm is a simple and effective technique for solving many optimization problems. However, the ground set is often so large that even the well-known greedy algorithm is not efficient enough. One solution is to use some training functions to reduce the ground set; to this end, Balkanski et al. [1] introduced the two-stage submodular maximization problem. For a ground set V and a constant k, the objective is to obtain a set \(S\subseteq V\) of size at most k and m subsets \(T_1, T_2, \ldots , T_m\) in \(\mathcal {I}(S)\) that maximize the following

$$\frac{1}{m} \sum \limits ^{m}\limits _{i=1}\max \limits _{T\in \mathcal {I}(S)}f_{i}(T),$$

where \(f_i: 2^V\rightarrow \mathbb {R}_{+}\) is submodular for \(i=1,\ldots , m\), and \(\mathcal {I}(S)\) is a constraint set over the reduced ground set \(S\subseteq V\).

Related work has been conducted on two-stage submodular maximization. For a cardinality constraint, that is, \(\mathcal {I}(S)=\{T :|T|\le k\}\), Balkanski et al. [1] used a continuous optimization method to design an approximation algorithm whose ratio asymptotically approaches \(1-1/e\) when k is large enough. When k is small, they obtained a local search algorithm with approximation ratio close to 1/2 [1]. In addition, Mitrovic et al. [9] considered two-stage submodular maximization with a cardinality constraint in streaming and distributed settings. For a matroid constraint, Stan et al. [12] obtained a new local-search based algorithm with approximation ratio \(1-1/e^2\).

On the other hand, in many practical applications, including experimental design and sparse Gaussian processes [8], the objective function is in general not submodular, and the results for submodular optimization problems no longer hold. To solve such optimization problems, an important research method is to introduce parameters that describe the characteristics of non-submodular functions, such as the submodularity ratio, the curvature, and the generic submodularity ratio, and then to design algorithms for the problems and analyze their performance in terms of these parameters. Given a ground set V and a nondecreasing set function \(f: 2^V\rightarrow \mathbb {R}\), the generic submodularity ratio of f is the largest scalar \(\gamma \) such that \(f(v|X) \ge \gamma f(v|Y)\) for any \(X\subseteq Y \subseteq V\) and any \(v \in V \setminus Y\); it quantifies how close a nonnegative nondecreasing set function is to being submodular [4]. A function is called \(\gamma \)-submodular if its generic submodularity ratio is \(\gamma \). A natural curvature notion can also be introduced for non-submodular functions. We recall that the curvature of a nonnegative set function [2] is the smallest scalar \(\alpha \) such that for all \(S, T\subseteq V\) and \(i\in S\setminus T\),

$$f(i|(S\setminus \{i\}) \cup T) \ge (1-\alpha )f(i|S \setminus \{i\}).$$
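Both parameters can be checked by brute force on a tiny ground set. The sketch below (helper names are our own) enumerates all subsets, so it is exponential in \(|V|\) and intended only as a sanity check of the definitions, not as an algorithm.

```python
from itertools import chain, combinations

def subsets(V):
    V = list(V)
    return map(frozenset,
               chain.from_iterable(combinations(V, r) for r in range(len(V) + 1)))

def marginal(f, v, S):
    return f(S | {v}) - f(S)  # f(v | S)

def generic_submodularity_ratio(f, V):
    """Largest gamma with f(v|X) >= gamma * f(v|Y) for all X <= Y, v not in Y."""
    gamma = 1.0
    for Y in subsets(V):
        for X in subsets(Y):
            for v in set(V) - set(Y):
                if marginal(f, v, Y) > 0:
                    gamma = min(gamma, marginal(f, v, X) / marginal(f, v, Y))
    return gamma

def curvature_alpha(f, V):
    """Smallest alpha with f(i | (S\\{i}) u T) >= (1 - alpha) * f(i | S\\{i})."""
    alpha = 0.0
    for S in subsets(V):
        for T in subsets(V):
            for i in set(S) - set(T):
                base = marginal(f, i, S - {i})
                if base > 0:
                    alpha = max(alpha, 1 - marginal(f, i, (S - {i}) | T) / base)
    return alpha
```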

Based on the above motivation, we discuss the two-stage \(\gamma \)-submodular maximization problem under a matroid constraint. Our main contribution is an approximation algorithm with a constant approximation ratio expressed in terms of the curvature and the generic submodularity ratio. The rest of the paper is organized as follows. In Sect. 2, we present some technical preliminaries, including notation and relevant known results. In Sect. 3, we give an approximation algorithm along with its analysis. Some concluding remarks are presented in Sect. 4.

2 Preliminaries

Firstly, we recall the following known concepts for submodular functions, supermodular functions, and modular functions.

Definition 1

For a given set V, the function \(f: 2^V\rightarrow \mathbb {R}\) is called submodular if \(f(X)+f(Y)\ge f(X\cap Y)+f(X\cup Y)\) for \(\forall X, Y\subseteq V\).

An equivalent definition is that the function \(f: 2^V\rightarrow \mathbb {R}\) is submodular if \(f(e|S)\ge f(e|T)\) for all \(S\subseteq T \subseteq V\) and \(e \in V \setminus T\), where \(f(e|S)=f(S\cup \{e\})-f(S)\).

Definition 2

For a given set V, the function \(f: 2^V\rightarrow \mathbb {R}\) is called supermodular if \(f(X)+f(Y)\le f(X\cap Y)+f(X\cup Y)\) for all \(X, Y\subseteq V\).

Furthermore, we define the concept of modular functions.

Definition 3

For a given set V, the function \(f: 2^V\rightarrow \mathbb {R}\) is called modular if \(f(X)+f(Y)= f(X\cap Y)+f(X\cup Y)\) for \(\forall X, Y\subseteq V\).

Next, we formally state the two-stage \(\gamma \)-submodular maximization problem. For a ground set V and m nonnegative, monotone and normalized \(\gamma \)-submodular functions \(f_1,\ldots ,f_m\), which are drawn from some unknown distribution, our aim is to select a set \(S\subseteq V\) of size at most k and m subsets \(T_1,\ldots ,T_m\) in \(\mathcal {I}(S)\) that maximize the following

$$\begin{aligned} \frac{1}{m} \sum \limits ^{m}\limits _{i=1}\max \limits _{T\in \mathcal {I}(S)}f_{i}(T), \end{aligned}$$
(2.1)

where \(\mathcal {I}(S)\) is a constraint set over \(S\subseteq V\). In our paper, the set \(\mathcal {I}(S)\) corresponds to a matroid constraint.

Definition 4

For a given set S and \(\mathcal {I} \subseteq 2^S\), a matroid \(\mathcal {M}=(S,\mathcal {I})\) satisfies three properties: (1) \(\emptyset \in \mathcal {I}\); (2) if \(P \subseteq Q \in \mathcal {I} \), then \(P\in \mathcal {I}\); (3) if \(P,Q \in \mathcal {I} \) and \(|P| < |Q|\), then there exists \(q\in Q\setminus P\) such that \(P\cup \{q\} \in \mathcal {I}\).
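In the algorithms below, a matroid enters only through an independence oracle for \(\mathcal {I}(S)\). As one concrete instantiation (our own illustration, not part of the model above), here is a partition matroid oracle; a cardinality constraint \(|T|\le k\) is the special case with a single part.

```python
class PartitionMatroid:
    """T is independent iff it has at most capacity[p] elements in each part p."""

    def __init__(self, part_of, capacity):
        self.part_of = part_of    # element -> part label
        self.capacity = capacity  # part label -> budget

    def is_independent(self, T):
        counts = {}
        for e in T:
            p = self.part_of[e]
            counts[p] = counts.get(p, 0) + 1
            if counts[p] > self.capacity[p]:
                return False
        return True

# Cardinality constraint |T| <= 3 as a one-part partition matroid:
M = PartitionMatroid(part_of={e: 0 for e in range(10)}, capacity={0: 3})
print(M.is_independent({1, 2}), M.is_independent({1, 2, 3, 4}))  # True False
```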

The mapping below, given in [7], is a very useful tool for studying the matroid constraint.

Proposition 1

Let \(\mathcal {M}_i=(S,\mathcal {I}_i)\) be a matroid for \(i\in \{1,\ldots , p\}\). For any \(X,Y \in \mathcal {I}_i\), there is a mapping \(\pi _{i}: Y\setminus X \rightarrow (X\setminus Y)\cup \{\emptyset \}\) satisfying the following three properties: (1) \((X \setminus \pi _{i}(y)) \cup \{y\}\in \mathcal {I}_i\) for all \(y \in Y \setminus X\); (2) \(|\pi _{i}^{-1}(x)| \le 1\) for all \(x \in X\setminus Y\); (3) letting \(X_{y}=\{\pi _{1}(y), \dots , \pi _{p}(y)\}\), we have \((X \setminus X_{y}) \cup \{y\}\in \cap _{i=1}^{p} \mathcal {I}_i\) for all \(y \in Y\setminus X\).

Next, we recall a useful result for \(\gamma \)-submodular functions from [4].

Proposition 2

Let f be a \(\gamma \)-submodular function. Then for all \(S\subseteq T\subseteq V\),

$$f (T) \le f(S)+\frac{1}{\gamma }\sum \limits _{v\in T\setminus S}f(v|S).$$

To maximize the above objective function, we decompose each \(f_i\) into a modular part \(\ell _i\) and a residual part \(g_i\), where \(\alpha \) is the curvature:

$$\begin{aligned} \ell _{i}(T) &= (1-\alpha )\sum \limits _{v\in T}f_i(v),\\ g_{i}(T) &= f_{i}(T)-\ell _{i}(T), \end{aligned}$$

where the function \(\ell _i\) is modular and the function \(g_i\) is a monotone and normalized \(\gamma \)-submodular function, because \(f_i\) is monotone and normalized.
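A minimal sketch of this decomposition, assuming a value oracle for each \(f_i\) over frozensets (the helper name is our own):

```python
def make_decomposition(f, alpha):
    """Split f into a modular part ell and a residual g with f = g + ell,
    where ell(T) = (1 - alpha) * sum_{v in T} f({v}) and alpha is the curvature."""
    def ell(T):
        return (1 - alpha) * sum(f(frozenset({v})) for v in T)
    def g(T):
        return f(T) - ell(T)
    return g, ell
```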

Problem (2.1) then turns into the following problem: obtain a set \(S\subseteq V\) of size at most k and m subsets \(T_1, T_2, \ldots , T_m\) in \(\mathcal {I}(S)\) that maximize

$$\begin{aligned} \frac{1}{m}\sum \limits ^{m}\limits _{i=1}\max \limits _{T\in \mathcal {I}(S)}(g_{i}(T)+\ell _{i}(T)). \end{aligned}$$

For convenience, we introduce the following notation. In terms of the function \(g_{i}\), define the marginal gain of adding an element x to the set \(T_{i}^{j}\) as

$$\Delta ^{g}_{i}(x,T_{i}^{j})=g_{i}(\{x\}\cup T_{i}^{j})-g_{i}(T_{i}^{j}).$$

Similarly, the gain of replacing y with x in terms of the set \( T_{i}^{j}\) is represented by

$$\nabla ^{g}_{i}(x,y,T_{i}^{j})=g_{i}(\{x\}\cup T_{i}^{j}\setminus \{y\})-g_{i}(T_{i}^{j}).$$

Furthermore, for the functions \(g_i\) and \(\ell _i\), we define

$$\begin{aligned} \Delta _{i}(x,T_{i}^{j}) &= \left( 1-\frac{\gamma +\frac{1}{\gamma }}{k}\right) ^{k-j}\Delta ^{g}_{i}(x,T_{i}^{j})+\left( 1-\frac{1}{k}\right) ^{k-j}\ell _{i}(x),\\ \nabla _{i}(x,y,T_{i}^{j}) &= \left( 1-\frac{\gamma +\frac{1}{\gamma }}{k}\right) ^{k-j}\nabla ^{g}_{i}(x,y,T_{i}^{j})+\left( 1-\frac{1}{k}\right) ^{k-j}(\ell _{i}(x)-\ell _{i}(y)). \end{aligned}$$

The set of elements of \(T_{i}^{j}\) whose replacement by x does not violate the matroid constraint is defined as

$$\begin{aligned} \mathcal {I}(x,T_{i}^{j})=\{y\in T_{i}^{j}:T_{i}^{j}\cup \{x\}\setminus \{ y\} \in \mathcal {I}(S)\}. \end{aligned}$$

Based on the notations of \(\Delta _{i}(x,T_{i}^{j})\) and \(\nabla _{i}(x,y,T_{i}^{j})\), we denote the replacement gain of x in terms of \(T_{i}^{j}\) by

$$\nabla _{i}(x,T_{i}^{j})= {\left\{ \begin{array}{ll} \Delta _{i}(x,T_{i}^{j}) &\text {if } T_{i}^{j}\cup \{x\}\in {\mathcal {I}}(S),\\ \max \{0,\max _{y\in {\mathcal {I}}(x,T_{i}^{j})}\nabla _{i}(x,y,T_{i}^{j})\} &\text {otherwise.} \end{array}\right. }$$

In addition, we define

$$\textsf {Rep}_{i}(x,T_{i}^{j})= {\left\{ \begin{array}{ll} \emptyset &\text {if } T_{i}^{j}\cup \{x\}\in {\mathcal {I}}(S),\\ \arg \max _{y\in {\mathcal {I}}(x,T_{i}^{j})}\nabla _{i}(x,y,T_{i}^{j}) &\text {otherwise.} \end{array}\right. }$$
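The sketch below translates these definitions directly into code, reusing the (g, ell) pair returned by make_decomposition; here j is the current round, k the total number of rounds, and is_independent an independence oracle for \(\mathcal {I}(S)\) (all names are our own).

```python
def delta_g(g, x, T):
    return g(T | {x}) - g(T)                 # Delta^g_i(x, T)

def nabla_g(g, x, y, T):
    return g((T | {x}) - {y}) - g(T)         # nabla^g_i(x, y, T)

def distorted_gain(g, ell, x, T, j, k, gamma):
    c = gamma + 1.0 / gamma                  # Delta_i(x, T)
    return ((1 - c / k) ** (k - j) * delta_g(g, x, T)
            + (1 - 1.0 / k) ** (k - j) * ell(frozenset({x})))

def distorted_swap_gain(g, ell, x, y, T, j, k, gamma):
    c = gamma + 1.0 / gamma                  # nabla_i(x, y, T)
    return ((1 - c / k) ** (k - j) * nabla_g(g, x, y, T)
            + (1 - 1.0 / k) ** (k - j) * (ell(frozenset({x})) - ell(frozenset({y}))))

def replacement_gain(g, ell, x, T, j, k, gamma, is_independent):
    """Returns (nabla_i(x, T), swap element or None).

    The second component plays the role of Rep_i(x, T); we return it only
    when the corresponding swap gain is positive, since only then is the
    swap applied by the algorithm.
    """
    if is_independent(T | {x}):
        return distorted_gain(g, ell, x, T, j, k, gamma), None
    candidates = [y for y in T if is_independent((T | {x}) - {y})]  # I(x, T)
    if not candidates:
        return 0.0, None
    y = max(candidates,
            key=lambda y: distorted_swap_gain(g, ell, x, y, T, j, k, gamma))
    gain = distorted_swap_gain(g, ell, x, y, T, j, k, gamma)
    return (gain, y) if gain > 0 else (0.0, None)
```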

3 Problem (2.1) Under a Matroid Constraint

In this section, we discuss Problem (2.1) in the case where \(\mathcal {I}(S)\) is a matroid constraint. A replacement greedy algorithm is presented in Sect. 3.1, and its approximation ratio is analyzed in Sect. 3.2.

3.1 Algorithm

Our replacement greedy algorithm starts with \(S^{0}=\emptyset \) and runs for k rounds. In each round, a new element is added to a current solution set if this does not violate the matroid constraint, or it replaces some element of the current set when doing so increases the value of the (distorted) objective function.

[Algorithm 1: the replacement greedy algorithm (pseudocode figure)]
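Since the pseudocode figure is not reproduced here, the following Python sketch (reusing the helpers above) captures the structure we describe; it is a simplification under our own interface assumptions, not a verbatim transcription of Algorithm 1. In round j it picks the element with the largest total distorted replacement gain, adds it to S, and applies the profitable additions or swaps to each \(T_i\).

```python
def replacement_greedy(V, fs, k, gamma, alpha, is_independent):
    """fs: list of the m value oracles f_i; assumes k <= |V|."""
    gs_ells = [make_decomposition(f, alpha) for f in fs]
    S = set()
    Ts = [frozenset() for _ in fs]
    for j in range(1, k + 1):
        def total_gain(x):
            return sum(replacement_gain(g, ell, x, T, j, k, gamma, is_independent)[0]
                       for (g, ell), T in zip(gs_ells, Ts))
        t = max(set(V) - S, key=total_gain)   # t^j
        S.add(t)
        for i, ((g, ell), T) in enumerate(zip(gs_ells, Ts)):
            gain, y = replacement_gain(g, ell, t, T, j, k, gamma, is_independent)
            if gain > 0:                      # add t, swapping out y if needed
                Ts[i] = (T | {t}) - ({y} if y is not None else set())
    return S, Ts
```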

3.2 Theoretical Analysis

We analyze the performance guarantee of Algorithm 1, which relies on the following distorted objective function:

$$\Phi _{j}(S^{j})= \sum _{i=1}^{m}\left( \left( 1-\frac{\gamma +\frac{1}{\gamma }}{k}\right) ^{k-j}g_{i}(T_{i}^{j})+\left( 1-\frac{1}{k}\right) ^{k-j}\ell _{i}(T_{i}^{j})\right) .$$
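In code, this potential is a direct translation of the formula above (a sketch reusing the (g, ell) pairs and the sets \(T_i^{j}\) from the algorithm sketch):

```python
def distorted_objective(gs_ells, Ts, j, k, gamma):
    """Phi_j(S^j) for the current sets Ts = [T_1^j, ..., T_m^j]."""
    c = gamma + 1.0 / gamma
    return sum((1 - c / k) ** (k - j) * g(T) + (1 - 1.0 / k) ** (k - j) * ell(T)
               for (g, ell), T in zip(gs_ells, Ts))
```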

Lemma 1

In each iteration of Algorithm 1,

$$\begin{aligned} &\Phi _{j}(S^{j})-\Phi _{j-1}(S^{j-1})\\ &= \sum _{i=1}^{m}\left( \nabla _i(t^{j},T_{i}^{j-1})+\frac{\gamma +\frac{1}{\gamma }}{k}\left( 1-\frac{\gamma +\frac{1}{\gamma }}{k}\right) ^{k-j}g_{i}(T_{i}^{j-1})+\frac{1}{k}\left( 1-\frac{1}{k}\right) ^{k-j}\ell _{i}(T_{i}^{j-1})\right) . \end{aligned}$$

For the term \(\sum _{i=1}^{m}\nabla _i(t^{j},T_{i}^{j-1})\) on the right-hand side in Lemma 1, we give a lower bound in the following.

Lemma 2

If the element \(t^{j}\in V\) is added into the current set \(S^{j-1}\), then

$$ \sum _{i=1}^{m}\nabla _{i}\left( t^{j},T_{i}^{j-1}\right) \ge \frac{1}{k}\sum _{i=1}^{m}\sum \limits _{t\in T_{i}^{*}\setminus T_{i}^{j-1}}\nabla _{i}(t,T_{i}^{j-1}), $$

where \(S^{*}=\mathop {\arg \max }\limits _{S\subseteq V,|S|\le k} \sum \limits ^{m}\limits _{i=1}\max \limits _{T\in \mathcal {I}(S)}f_{i}(T)\), \(T_{i}^{*}=\arg \max _{A\in \mathcal {I}(S^{*})}f_{i}(A)\).

The following lemma is crucial to analyze the approximation ratio of Algorithm 1.

Lemma 3

For \(j=1,2,\ldots ,k\), we have

$$\begin{aligned} &\sum _{i=1}^{m}\nabla _{i}(t^{j},T_{i}^{j-1})\\ &\ge \frac{1}{k}\left( 1-\frac{\gamma +\frac{1}{\gamma }}{k}\right) ^{k-j}\sum _{i=1}^{m}\left( \gamma (1-\alpha )g_{i}(T_{i}^{*})-\left( \gamma +\frac{1}{\gamma }\right) g_{i}(T_{i}^{j-1})\right) \\ &\quad +\frac{1}{k}\left( 1-\frac{1}{k}\right) ^{k-j}\sum _{i=1}^{m}\left( \ell _i(T_{i}^{*})-\ell _{i}(T_{i}^{j-1})\right) . \end{aligned}$$

Combining Lemmas 1 and 3, we prove the following theorem.

Theorem 1

Algorithm 1 returns a set \(S^{k}\) of size k such that

$$\begin{aligned} \sum _{i=1}^{m}\left( g_{i}(T_{i}^{k})+\ell _{i}(T_{i}^{k})\right) \ge \frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-e^{-(\gamma +\frac{1}{\gamma })}\right) \sum _{i=1}^{m}g_{i}(T_{i}^{*})+\left( 1-e^{-1}\right) \sum _{i=1}^{m}\ell _{i}(T_{i}^{*}). \end{aligned}$$

Proof

By the definition of the function \(\Phi \), we obtain

$$\begin{aligned} \Phi _{0}(S^{0}) &= 0,\\ \Phi _{k}(S^{k}) &= \sum _{i=1}^{m}\left( \left( 1-\frac{\gamma +\frac{1}{\gamma }}{k}\right) ^{k-k}g_{i}(T_{i}^{k})+\left( 1-\frac{1}{k}\right) ^{k-k}\ell _{i}(T_{i}^{k})\right) \\ &= \sum _{i=1}^{m}\left( g_{i}(T_{i}^{k})+\ell _{i}(T_{i}^{k})\right) . \end{aligned}$$

Using Lemmas 1 and 3, we have

$$\begin{aligned} &\Phi _{j}(S^{j})-\Phi _{j-1}(S^{j-1})\\ &= \sum _{i=1}^{m}\left( \nabla _i(t^{j},T_{i}^{j-1})+\frac{\gamma +\frac{1}{\gamma }}{k}\left( 1-\frac{\gamma +\frac{1}{\gamma }}{k}\right) ^{k-j}g_{i}(T_{i}^{j-1})+\frac{1}{k}\left( 1-\frac{1}{k}\right) ^{k-j}\ell _{i}(T_{i}^{j-1})\right) . \end{aligned}$$

Finally,

$$\begin{aligned} &\sum _{i=1}^{m}\left( g_{i}(T_{i}^{k})+\ell _{i}(T_{i}^{k})\right) \\ &= \sum _{j=1}^{k}\left( \Phi _{j}(S^{j})-\Phi _{j-1}(S^{j-1})\right) \\ &\ge \sum _{j=1}^{k}\left( \frac{\gamma }{k}\left( 1-\frac{\gamma +\frac{1}{\gamma }}{k}\right) ^{k-j}\sum _{i=1}^{m}g_{i}(T_{i}^{*})+\frac{1}{k}\left( 1-\frac{1}{k}\right) ^{k-j}\sum _{i=1}^{m}\ell _i(T_{i}^{*})\right) \\ &\ge \frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-e^{-(\gamma +\frac{1}{\gamma })}\right) \sum _{i=1}^{m}g_{i}(T_{i}^{*})+\left( 1-e^{-1}\right) \sum _{i=1}^{m}\ell _{i}(T_{i}^{*}). \end{aligned}$$
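The last inequality uses the geometric-sum identity, applied with \(c=\gamma +\frac{1}{\gamma }\) for the \(g\)-term and \(c=1\) for the \(\ell \)-term:

$$\sum _{j=1}^{k}\frac{1}{k}\left( 1-\frac{c}{k}\right) ^{k-j}=\frac{1}{c}\left( 1-\left( 1-\frac{c}{k}\right) ^{k}\right) \ge \frac{1}{c}\left( 1-e^{-c}\right) ,$$

where the final bound follows from \((1-c/k)^{k}\le e^{-c}\).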

The curvature is a very useful assumption for obtaining the following result.

Theorem 2

There exists an algorithm returning a set \(S^{k}\) of size k such that

$$\begin{aligned} F(S^{k}) \ge \left( \frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-(1-\alpha )\gamma \right) (1-e^{-(\gamma +\frac{1}{\gamma })})+(1-\alpha )\gamma (1-e^{-1})\right) OPT, \end{aligned}$$

where \(F(S^{k})=\sum \limits ^{m}\limits _{i=1}\max \limits _{T\in \mathcal {I}(S^{k})}f_{i}(T)\) and OPT is the optimal value.

Proof

It follows from the definition of \(\ell _i(T)\) and Proposition 2 that

$$\begin{aligned} \ell _{i}(T) = (1-\alpha )\sum \limits _{v\in T}f_i(v)\ge (1-\alpha )\gamma f_i(T). \end{aligned}$$

Furthermore,

$$\begin{aligned} &\sum _{i=1}^{m}f_{i}(T_{i}^{k})=\sum _{i=1}^{m}\left( g_{i}(T_{i}^{k})+\ell _{i}(T_{i}^{k})\right) \\ &\ge \frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-e^{-(\gamma +\frac{1}{\gamma })}\right) \sum _{i=1}^{m}g_{i}(T_{i}^{*})+\left( 1-e^{-1}\right) \sum _{i=1}^{m}\ell _{i}(T_{i}^{*})\\ &= \frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-e^{-(\gamma +\frac{1}{\gamma })}\right) \sum _{i=1}^{m}\left( f_{i}(T_{i}^{*})-\ell _{i}(T_{i}^{*})\right) +(1-e^{-1})\sum _{i=1}^{m}\ell _{i}(T_{i}^{*})\\ &= \frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-e^{-(\gamma +\frac{1}{\gamma })}\right) \sum _{i=1}^{m}f_{i}(T_{i}^{*})+\left( (1-e^{-1})-\frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-e^{-(\gamma +\frac{1}{\gamma })}\right) \right) \sum _{i=1}^{m}\ell _{i}(T_{i}^{*})\\ &\ge \frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-e^{-(\gamma +\frac{1}{\gamma })}\right) \sum _{i=1}^{m}f_{i}(T_{i}^{*})+(1-\alpha )\gamma \left( (1-e^{-1})-\frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-e^{-(\gamma +\frac{1}{\gamma })}\right) \right) \sum _{i=1}^{m}f_{i}(T_{i}^{*})\\ &\ge \left( \frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-(1-\alpha )\gamma \right) (1-e^{-(\gamma +\frac{1}{\gamma })})+(1-\alpha )\gamma (1-e^{-1})\right) \sum _{i=1}^{m}f_{i}(T_{i}^{*}). \end{aligned}$$

Finally, we obtain that

$$\begin{aligned} F(S^{k}) \ge \left( \frac{\gamma }{\gamma +\frac{1}{\gamma }}\left( 1-(1-\alpha )\gamma \right) (1-e^{-(\gamma +\frac{1}{\gamma })})+(1-\alpha )\gamma (1-e^{-1})\right) OPT. \end{aligned}$$
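As a quick numerical sanity check (the function name is our own), the guarantee of Theorem 2 can be evaluated for particular parameter values; for instance, with \(\gamma =1\) and \(\alpha =0\) the bound specializes to \(1-e^{-1}\).

```python
import math

def guarantee(gamma, alpha):
    """The approximation ratio of Theorem 2."""
    c = gamma + 1.0 / gamma
    return (gamma / c * (1 - (1 - alpha) * gamma) * (1 - math.exp(-c))
            + (1 - alpha) * gamma * (1 - math.exp(-1)))

print(guarantee(1.0, 0.0))  # 0.6321... = 1 - 1/e
print(guarantee(1.0, 1.0))  # 0.4323... = (1 - e^{-2}) / 2
```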

4 Conclusion

The objective functions in many practical applications are in general not submodular. To solve these optimization problems, an important research method is to introduce parameters that describe the characteristics of the non-submodular functions, such as the submodularity ratio, curvature, and supermodular degree, and then to design algorithms for the problems and analyze their performance in terms of these parameters. On the other hand, it is well known that submodular maximization problems can be approximately solved by greedy algorithms, but the regular greedy algorithm does not directly extend to the two-stage non-submodular setting. To avoid this limitation, we propose combining the distorted objective function with the greedy algorithm, an approach that has the potential to be applicable to other optimization problems.