1 Introduction

\({\textsc {set }}\, b\text {-}{\textsc {multicover}}\) is a basic covering problem in combinatorial optimization. It is convenient to formulate the problem in terms of hypergraphs. A hypergraph \({\mathcal {H}}=(V, {\mathcal {E}})\) consists of a finite set \(V\) and a set \({\mathcal {E}}\) of subsets of \(V\). We call the elements of \(V\) vertices and the elements of \({\mathcal {E}}\) (hyper-)edges. Further let \(n := |V|, m := |{\mathcal {E}}|\). Let \(l\) be the maximum edge size and let \({\varDelta }\) be the maximum vertex degree, where the degree of a vertex is the number of edges containing that vertex.

For \(b\in {\mathbb {N}}\) a set \(b\)-multicover in \({\mathcal {H}}\) is a set of edges \(C \subseteq {\mathcal {E}}\) such that every vertex in \(V\) belongs to at least \(b\) edges in \(C\). \({\textsc {set }}\, b\text {-}{\textsc {multicover}}\) is the problem of finding a set \(b\)-multicover of minimum cardinality. For \(b=1\) we have the well-known set cover problem. We denote by Opt the cardinality of a minimum set \(b\)-multicover, implicitly assuming the dependence of Opt on \(b\).
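To make the definition concrete, the following small Python check (our own helper, not part of the formal development) tests whether a chosen set of edges is a set \(b\)-multicover:

```python
from collections import Counter

def is_b_multicover(vertices, edges, chosen, b):
    """Return True iff every vertex lies in at least b of the chosen edges."""
    coverage = Counter(v for j in chosen for v in edges[j])
    return all(coverage[v] >= b for v in vertices)

# Triangle hypergraph: every vertex has degree 2, so b = 2 forces all edges.
V = [0, 1, 2]
E = [{0, 1}, {1, 2}, {0, 2}]
print(is_b_multicover(V, E, {0, 1, 2}, 2))  # True
print(is_b_multicover(V, E, {0, 1}, 2))     # False: vertex 0 is covered only once
```

Here \(n = 3\), \(m = 3\), \(l = 2\) and \({\varDelta } = 2\), so for \(b = 2\) the only feasible cover is \({\mathcal {E}}\) itself.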

1.1 Previous Work

While the set cover problem (\(b = 1\)) has been intensively explored over decades [2, 4, 6, 12, 13, 16, 17], relatively little is known for \(b \ge 2\). In the following we give a brief overview of the known approximability results. We say that a polynomial-time algorithm \(A\) achieves an approximation ratio \(\alpha \ge 1\), or is an \(\alpha \)-approximation, if for all instances \(I\) it holds that the set \(b\)-multicover \(A(I)\) returned by \(A\) satisfies \(\left|A(I)\right|/{\mathrm {Opt}} \le \alpha \). With a greedy algorithm an approximation ratio of \(1+\ln (l)\) can be obtained for constant \(l\) in \({\mathcal {O}}(nmb)\) time [15, 20]. This result was improved by Berman et al. [3], who presented a randomized algorithm that yields an expected approximation ratio of \((1+o(1))\ln \big (\frac{l}{b-1}\big )\) when \(\frac{l}{b}\) is large enough, and an expected approximation ratio of \(1+2 \sqrt{\frac{l}{b}}\) when \(\frac{l}{b}\) is small. Let \(\delta := {\varDelta }-b+1\). An approximation ratio of \(\delta \) has been achieved in \({\mathcal {O}}(m \cdot \max \{n,m\})\) time by Hall and Hochbaum [11]. The first approximation below \(\delta \) has been achieved with randomized rounding by Peleg et al. [18], who showed an approximation ratio of \(\delta \cdot \big (1-\big (\frac{c}{n}\big )^\frac{1}{\delta }\big )\), where \(c>0\) is a constant bounded from below roughly by \(2^{-2^{50}}\). The essential point of this \(\delta \cdot \big (1-\big (\frac{c}{n}\big )^\frac{1}{\delta }\big )\)-approximation is that it depends on the instance size \(n\) and tends to \(\delta \) as \(n\) grows. The authors conjectured that unless \({\mathcal {P}}={\mathcal {NP}}\), there does not exist a polynomial time approximation algorithm with a constant approximation ratio smaller than \(\delta \).
For \(b=1\) it follows from the work of Khot and Regev [19] that for any arbitrary constant \(\epsilon > 0\) an approximation ratio of \(\delta - \epsilon = {\varDelta }- \epsilon \) cannot be achieved in polynomial time under the unique games conjecture.

So, the challenge is to give an algorithm for \({\textsc {set }}\, b\text {-}{\textsc {multicover}}\), \(b \ge 2\), with a worst-case ratio of the form \(\delta \alpha \), for a constant \(\alpha < 1\), or to prove that no such algorithm exists, unless some complexity classes collapse. To the best of our knowledge no such results have been proved since 1997.

1.2 Our Results

We present a hybrid randomized algorithm, combining LP-based randomized rounding and a greedy repairing step that is applied if the randomized solution is infeasible. Such a hybrid approach is frequently used in practice. The analysis of such an algorithm is a theoretical challenge and has been successfully carried out for, e.g. maximum graph bisection [8], maximum graph partitioning problems [7, 14], the vertex cover and partial vertex cover problem in graphs and hypergraphs [5, 6, 9, 12]. In Theorem 4 we show that our algorithm achieves for hypergraphs with \(l={\mathcal {O}}\left( (nb)^{\frac{1}{5}}\right) \) an approximation ratio of \(\delta \cdot \left( 1-\frac{11 ({\varDelta }- b)}{72l}\right) \) with constant probability. Using a different technique for the analysis we prove the same approximation ratio, again with constant probability, for hypergraphs with \(l={\mathcal {O}}\left( n^{\frac{1}{4}}\right) \) (Theorem 5). Depending on the value of \(b\), each result outperforms the other for certain ranges of \(l\). While Theorem 4 covers a wider range for \(l\) if \(b = {\varOmega }(n^\frac{1}{4})\), Theorem 5 is the better result for smaller values of \(b\). Together we achieve the approximation ratio stated above for hypergraphs with \(l={\mathcal {O}}\left( \max \left\{ (nb)^{\frac{1}{5}}, n^\frac{1}{4}\right\} \right) \).

Note that in both cases our results improve over the ratio of Peleg et al. [18]: for any \(l\) satisfying \(l={\mathcal {O}} \left( ({\varDelta }-b)\cdot n^{\frac{1}{\delta }}\right) \) our ratio is at most the ratio of Peleg et al., and it improves as \(l\) decreases. In the important case of \(l\) being a constant, our ratio is \(\delta \cdot (1- c)\), where \(c \in (0,1)\) is a constant independent of \(n\).

This shows that the above stated non-approximability conjecture does not hold for hypergraphs with constant \(l\), including the class of \(l\)-uniform hypergraphs (with constant \(l\)). The conjecture might be true in general, but by our result any reduction proving it must use edge sizes that depend on \(n\).

In conclusion, the crucial algorithmic point of this paper is the identification of the maximum edge size \(l\) as a complexity-theoretic relevant hypergraph parameter governing the approximability of \({\textsc {set }}\, b\text {-}{\textsc {multicover}}\) and leading to approximations below \(\delta \) in worst case.

The methods developed in this paper go beyond the well-known analysis of randomized rounding with Chernoff bounds for sums of independent random variables. The challenge is to combine the different subroutines of the hybrid algorithm in the analysis, as the repairing step depends on the randomized cover generation, modeled by sums of dependent random variables. For example, we invoke the powerful bounded differences inequality based on the Azuma-Hoeffding martingale bound. Its particular effect is the identification of local hypergraph parameters important for the analysis; in our setting this is the maximum edge size \(l\).

The paper is organized as follows: in Sect. 2 we give definitions and probabilistic tools. In Sect. 3 we present our hybrid randomized algorithm for \({\textsc {set }}\, b\text {-}{\textsc {multicover}}\). In Sects. 4 and 5 we analyze the algorithm in two different ways which together provide the proof for our stated approximation ratio result. Finally in Sect. 6 we sketch some future work.

2 Definitions and Tools

Let \({\mathcal {H}}=(V,{\mathcal {E}})\) be a hypergraph, with \(V\) and \({\mathcal {E}}\) being its sets of vertices and hyperedges. For every vertex \(v \in V\) we define the vertex degree of \(v\) as \(d(v):= |\{E \in {\mathcal {E}} \mathrel {|} v \in E \}|\). The maximum vertex degree is \({\varDelta }:=\max _{v \in V} d(v) \). Let \(l\) denote the maximum cardinality of a hyperedge from \({\mathcal {E}}\). For a set \(X\subseteq V\) we denote by \({\varGamma }(X):=\{E\in {\mathcal {E}} \mathrel {|} X\cap E\ne \emptyset \}\) the set of edges incident to \(X\). It is convenient to order the vertices and edges, i.e., \(V=\{v_1,\ldots ,v_n\}\) and \({\mathcal {E}}=\{E_1,\ldots ,E_m\}\), and to identify the vertices and edges with their indices.

\({\textsc {set }}\, b\text {-}{\textsc {multicover}}\):

Let \({\mathcal {H}}=(V,\,{\mathcal {E}})\) be a hypergraph and \(b \in {\mathbb {N}}\). We call \(C \subseteq {\mathcal {E}}\) a set \(b\)-multicover if no vertex \(i\in V\) is contained in fewer than \(b\) hyperedges of \(C\). \({\textsc {set }}\, b\text {-}{\textsc {multicover}}\) is the problem of finding a set \(b\)-multicover with minimum cardinality.

For the one-sided deviation we will use the Chebyshev–Cantelli inequality:

Theorem 1

(see [1]) Let \(X\) be a non-negative random variable with finite mean \({\mathbb {E}}(X)\) and variance Var\((X)\). Then for any \(a>0\) we have

$$\begin{aligned} \Pr (X\ge {\mathbb {E}}(X)+a)\le \frac{\mathrm{Var} (X)}{\mathrm{Var}(X)+a^2}. \end{aligned}$$
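As a quick numerical illustration (our own toy experiment, not from the paper), one can compare the Cantelli bound with the empirical upper tail of a binomial random variable:

```python
import random

rng = random.Random(42)
n, p, a = 100, 0.5, 10
mean, var = n * p, n * p * (1 - p)   # E(X) = 50, Var(X) = 25

trials = 10_000
hits = sum(
    sum(rng.random() < p for _ in range(n)) >= mean + a
    for _ in range(trials)
)
bound = var / (var + a * a)          # Cantelli bound: 25 / 125 = 0.2
print(hits / trials <= bound)        # True: the binomial tail is far below 0.2
```

The bound is crude (the true tail probability is below 0.03 here), but it needs only the variance, which is what the alternative analysis in Sect. 5 exploits.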

A further useful concentration result is the bounded differences inequality:

Theorem 2

(See [10]) Let \(X= (X_1,X_2,\ldots ,X_n)\) be a family of independent random variables with \(X_k\) taking values in a set \(A_k\) for each \(k\). Suppose that the real-valued function \(f\) defined on \(A_1 \times \cdots \times A_n\) satisfies \(|f(x)-f(x')| \le c_k\) for every pair of vectors \(x\) and \(x'\) that differ only in the \(k\)th coordinate. Then for any \(t > 0\)

$$\begin{aligned} \Pr \left[ f(X)\le {\mathbb {E}}(f(X))-t\right] \le \exp {\left( \frac{-2t^{2}}{\sum _{k=1}^{n} c_{k}^{2}}\right) }. \end{aligned}$$
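As a toy illustration of how this inequality is applied later (our own numerical experiment; the instance and parameters are invented), take \(f\) to be the number of vertices covered at least \(b\) times after independent edge picks, so that \(c_k = |E_k|\):

```python
import math
import random

# Toy instance: 4 vertices, 6 edges; each edge is picked independently with prob p.
V = range(4)
E = [{0, 1}, {1, 2}, {2, 3}, {0, 3}, {0, 2}, {1, 3}]
p, b = 0.7, 2

def W(picked):
    """Number of vertices covered by at least b picked edges."""
    return sum(sum(v in E[j] for j in picked) >= b for v in V)

# Exact E(W) by enumerating all 2^6 pick patterns.
EW = sum(
    W([j for j in range(6) if bits >> j & 1])
    * p ** bin(bits).count("1") * (1 - p) ** (6 - bin(bits).count("1"))
    for bits in range(64)
)

# Changing one pick X_k changes W by at most c_k = |E_k|.
t = 2.0
bound = math.exp(-2 * t * t / sum(len(e) ** 2 for e in E))  # exp(-1/3)

rng = random.Random(0)
trials = 20_000
hits = sum(W([j for j in range(6) if rng.random() < p]) <= EW - t
           for _ in range(trials))
print(hits / trials <= bound)   # True: the empirical lower tail respects the bound
```

Note that \(W\) is a sum of dependent indicators (one per vertex), yet the bound only needs the independence of the underlying edge picks.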

The following estimate on the variance of a sum of dependent random variables can be proved as in the book of Alon and Spencer:

Lemma 1

(see [1]) Let \(X\) be the sum of \(n\) \(0/1\) random variables, i.e. \(X=X_1+\ldots +X_n\). Let \({\varLambda }\) be the set of all unordered pairs \(i,j\in \{1,\ldots ,n\}, \, i\ne j\) such that \(X_i\) and \(X_j\) are not independent. Let \(\gamma = \sum _{\{i,j\}\in {\varLambda }} {\mathbb {E}}(X_iX_j)\), then it holds that \(\mathrm{Var} (X)\le {\mathbb {E}}(X)+2\gamma \).

For a sum of independent random variables we will use the large deviation inequality due to Angluin and Valiant:

Theorem 3

(see [10]) Let \(X_1,\ldots , X_n\) be independent \(\{0,1\}\)-random variables. Let \(X= \sum _{i=1}^n X_i\). For every \(\beta >0\) it holds that

$$\begin{aligned} \Pr [X\ge (1+\beta )\cdot {\mathbb {E}}(X)]\le \exp {\left( - \frac{\beta ^2 {\mathbb {E}}(X)}{3}\right) }. \end{aligned}$$

3 The Randomized Algorithm

Let \({\mathcal {H}}=(V, {\mathcal {E}})\) be a hypergraph and let \((a_{ij})_{(i,j)\in [n]\times [m]}\) be its incidence matrix, where \(a_{ij} = 1\) if vertex \(i\) is contained in edge \(j\), and \(a_{ij} = 0\) otherwise. An integer, linear programming formulation of \({\textsc {set }}\, b\text {-}{\textsc {multicover}}\) is the following:

$$\begin{aligned} ({b-\mathrm{ILP}})&\min \sum _{j=1}^m x_j\\&\sum _{j=1}^m a_{ij}x_j \ge b \quad \text{ for } \text{ all } i \in [n]\\&x_j \in \{0,1\}\quad \text{ for } \text{ all } j \in [m]. \end{aligned}$$

Its linear programming relaxation, denoted by \(b\)-LP, is given by relaxing the integrality constraints to \(x_j\in [0,1]\) for all \(j\in [m]\). We already defined \({\mathrm {Opt}}\) as the value of an optimal solution for \(b\)-ILP. Let \(\mathrm {Opt}^*\) be the value of an optimal solution to \(b\)-LP. Clearly, \({\mathrm {Opt}}^* \le {\mathrm {Opt}}\). Let \(x^*=(x_1^*,\ldots ,x_m^*)\) be an optimal solution of \(b\)-LP, so \({\mathrm {Opt}}^*=\sum _{j=1}^m x^*_j\).
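For intuition, Opt can be computed exactly on tiny instances by enumerating edge subsets in order of increasing size (a brute-force illustration of ours; it is of course exponential and only meant for cross-checking examples):

```python
from itertools import combinations

def opt_b_multicover(n, edges, b):
    """Size of a minimum set b-multicover, by exhaustive search (tiny instances only)."""
    m = len(edges)
    for k in range(m + 1):
        for C in combinations(range(m), k):
            cnt = [0] * n
            for j in C:
                for v in edges[j]:
                    cnt[v] += 1
            if all(c >= b for c in cnt):
                return k
    return None  # infeasible: some vertex has degree < b

edges = [{0, 1}, {1, 2}, {0, 2}]
print(opt_b_multicover(3, edges, 1))  # 2: two edges already cover every vertex once
print(opt_b_multicover(3, edges, 2))  # 3: every vertex has degree 2, so b = 2 needs all edges
```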

Algorithm 1 (SET \(b\)-MULTICOVER): Input: a hypergraph \({\mathcal {H}}=(V,{\mathcal {E}})\), \(b \in {\mathbb {N}}\) and a parameter \(\epsilon \in (0,\frac{1}{4}]\); set \(\lambda := (1-\epsilon )\delta \).

1. \(C := \emptyset \).
2. Compute an optimal solution \(x^*\) of \(b\)-LP and remove all edges \(E_j\) with \(x^*_j = 0\).
3. Add to \(C\) all edges of \(S_{\ge } := \{E_j \mathrel {|} x^*_j \ge \frac{1}{\lambda }\}\).
4. For every \(E_j \in S_{<} := \{E_j \mathrel {|} 0 < x^*_j < \frac{1}{\lambda }\}\), add \(E_j\) to \(C\) independently with probability \(\lambda x^*_j\) (randomized rounding).
5. While some vertex is covered by fewer than \(b\) edges of \(C\), add to \(C\) an edge containing it (greedy repairing).
6. Return \(C\).

Let us briefly explain the ingredients of the algorithm SET b-MULTICOVER.

We start with an empty set \(C\), which we extend to a feasible cover. First we solve the LP relaxation \(b\)-LP in polynomial time, e.g. with an interior point method. Next, we remove the edges whose LP variable has value zero, because randomized rounding would never take them into the set \(b\)-multicover. Then randomized rounding is performed, followed by a greedy repairing step if the rounded solution is infeasible, so the resulting cover is always feasible.
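The rounding-and-repairing scheme can be sketched in Python as follows (a simplified illustration with our own naming; the fractional optimum \(x^*\) is assumed to be given, e.g. by an external LP solver, and every vertex is assumed to have degree at least \(b\)):

```python
import random

def hybrid_cover(n, edges, x_star, b, eps, rng):
    """LP-guided randomized rounding followed by greedy repairing (sketch)."""
    delta_max = max(sum(v in E for E in edges) for v in range(n))  # max degree
    lam = (1 - eps) * (delta_max - b + 1)          # lambda = (1 - eps) * delta
    # S_>= : edges with large LP value are taken deterministically.
    C = {j for j, x in enumerate(x_star) if x >= 1 / lam}
    # S_<  : remaining edges are picked independently with probability lam * x_j.
    for j, x in enumerate(x_star):
        if 0 < x < 1 / lam and rng.random() < lam * x:
            C.add(j)
    # Greedy repairing: augment until every vertex is covered b times.
    cnt = [sum(v in edges[j] for j in C) for v in range(n)]
    for v in range(n):
        for j, E in enumerate(edges):
            if cnt[v] >= b:
                break
            if v in E and j not in C:
                C.add(j)
                for u in E:
                    cnt[u] += 1
    return C

rng = random.Random(7)
edges = [{0, 1}, {1, 2}, {0, 2}, {0, 1, 2}]
x_star = [2/3, 2/3, 2/3, 2/3]   # fractionally covers every vertex twice
C = hybrid_cover(3, edges, x_star, b=2, eps=0.25, rng=rng)
# C is a feasible set 2-multicover regardless of the coin flips.
```

The repairing loop guarantees feasibility unconditionally; the analysis in Sects. 4 and 5 shows that with constant probability it adds only few edges.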

4 The First Analysis

Let \(X_{1},\ldots ,X_{m}\) be \(\{0,1\}\)-random variables defined as follows:

$$\begin{aligned} X_j= {\left\{ \begin{array}{ll} 1 &{}\quad \text {if the edge}\, E_j \, \text {was picked into the cover before repairing}\\ 0 &{}\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

Note that the \(X_1,\ldots ,X_m\) are independent for a given \(x^* \in [0,1]^m\). For all \(i\in [n] \) we define the \(\{0,1\}\)- random variables \(Z_{i}\) as follows:

$$\begin{aligned} Z_{i}= {\left\{ \begin{array}{ll} 1 &{}\quad \text {if the vertex}~ v_{i}~\text {is covered by at least}~b~\text {edges before repairing}\\ 0 &{}\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

Then \(Y:=\sum _{j=1}^mX_j\) is the cardinality of the cover after randomized rounding and \(W:=\sum _{i=1}^{n}Z_{i}\) is the number of vertices covered at least \(b\) times after this step.

The following lemma shows that all vertices are almost covered after step 4 of Algorithm 1.

Lemma 2

Let \(0 < \epsilon \le \frac{1}{4}\) and \(b,d,{\varDelta }\in {\mathbb {N}}\) with \(2\le b\le d-1 \le {\varDelta }-1\). Let \(\lambda = (1-\epsilon )\delta \) and let \(x_j \in [0,1]\), \(j \in [d]\), such that \(\sum _{j=1}^{d} x_j \ge b\). Then at least \(b -1\) of the \(x_j\) fulfill the inequality \(x_j \ge \frac{1}{\lambda }\).

Proof

Suppose that there exist at most \(b-2\) values \(x_j\) that fulfill the inequality. W.l.o.g. let \(x_{b-1}, \ldots , x_d\) be the variables which do not fulfill it. Then we have

$$\begin{aligned} \sum _{j=1}^{d}x_j&\le b-2 + \sum _{j=b-1}^{d}x_j < b-2 + \frac{d-b+2}{\lambda }\\&= b-2 + \underbrace{\frac{d-b+2}{{\varDelta }-b+1}}_{\le \frac{3}{2}} \cdot \frac{1}{1-\epsilon } \le b -2 + \frac{3}{2} \cdot \frac{4}{3} = b. \end{aligned}$$

This contradicts \(\sum _{j=1}^{d}x_j \ge b\). \(\square \)

Let us consider a fixed vertex \(v\in V\) with \(d = deg(v) > b\). By Lemma 2 there exist at least \(b-1\) edges in \(S_{\ge }\) that contain \(v\). Thus at most one additional edge per vertex is needed to obtain a feasible cover. We can therefore bound the number of additional edges taken into the cover in the repairing step by \(n-W\) and obtain in total

$$\begin{aligned} \left|C\right| \le Y+n-W. \end{aligned}$$
(1)

For the computation of the expectation of \(W\) we need the following lemma (see Lemma 2.2, [18]).

Lemma 3

For all \(n\in \mathbb {N}\), \(\alpha >0\) and \(x_1,\ldots ,x_n, z\in [0,1]\) with \(\sum _{i=1}^nx_i\ge z\) and \(\alpha x_i<1\) for all \(i\in [n]\), we have \(\prod _{i=1}^n(1-\alpha x_i)\le (1-\alpha \frac{z}{n})^n\), and this bound is tight.
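The inequality is easy to sanity-check numerically (our own quick test, not part of the proof):

```python
import random

rng = random.Random(1)
n, alpha = 5, 0.9
for _ in range(1000):
    x = [rng.random() for _ in range(n)]   # x_i in [0,1), so alpha * x_i < 1
    z = min(sum(x), 1.0)                   # any z in [0,1] with z <= sum(x)
    prod = 1.0
    for xi in x:
        prod *= 1 - alpha * xi
    assert prod <= (1 - alpha * z / n) ** n + 1e-9
print("bound verified on 1000 random instances")
```

Tightness is attained when all \(x_i\) are equal to \(\frac{z}{n}\); e.g. for \(n=2\), \(\alpha =0.9\), \(x_1=x_2=0.5\), \(z=1\) both sides equal \(0.3025\).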

The next lemma is an adaption of Lemma 2 in [6] to \({\textsc {set }}\, b\text {-}{\textsc {multicover}}\).

Lemma 4

Let \(l\) and \({\varDelta }\) be the maximum size of an edge resp. the maximum vertex degree, not necessarily constants. Let \(\epsilon > 0\) and \(\lambda = (1-\epsilon )\delta \) as in Algorithm 1.

\({\mathrm {(i)}}\) :

\(\mathbb {E}(W)\ge (1-\epsilon ^{2})n\).

\({\mathrm {(ii)}}\) :

\({\mathrm {Opt}}^{*}\ge \frac{nb}{l}\).

\({\mathrm {(iii)}}\) :

If \(x^{*}_{j}> 0\) for all \(j\in [m]\), then \({\mathrm {Opt}}^{*}\ge \frac{mb}{{\varDelta }}\).

\({\mathrm {(iv)}}\) :

If \(\lambda \ge 1\), then \({\mathrm {Opt}}^{*}\le {\mathbb {E}}(Y)\le \lambda {\mathrm {Opt}}^{*}\).

Proof

  1. (i)

    Let \(i \in [n]\), \(r = d(i) - b + 1\) and \({z_i}^* := \sum _{E_j \in ({\varGamma }(v_i) \setminus S_\ge )} x_j^*\). If \(\left|S_\ge \cap {\varGamma }(v_i)\right| \ge b\), then \(\Pr (Z_{i}=0)=0\). Otherwise we get by Lemma 2 that \(\left|S_\ge \cap {\varGamma }(v_i)\right| = b-1\) and \(z_i^* \ge 1\); a computation based on Lemma 3 then yields \(\Pr (Z_{i}=0) \le \epsilon ^2\).

    So we get for the expectation of \(W\)

    $$\begin{aligned} \mathbb {E}(W)&= \sum _{i=1}^n \Pr (Z_i = 1) = \sum _{i=1}^n (1 - \Pr (Z_i = 0))\\&\ge \sum _{i=1}^n \big (1 - \epsilon ^2\big )\\&= \big (1-\epsilon ^2\big )n. \end{aligned}$$
  2. (ii)

    Using the ILP constraints we have

    $$\begin{aligned} nb \le \sum _{i=1}^n \sum _{E_j \in {\varGamma }(i)} x_j^* = \sum _{j=1}^m \left|E_j\right| x_j^* \le l \cdot \sum _{j=1}^m x_j^* = l \cdot \mathrm {Opt}^*. \end{aligned}$$
  3. (iii)

    Let us consider the dual LP of \(b\)-LP:

    $$\begin{aligned} \text {(D)}\qquad \max \sum _{i = 1}^n y_i b \\ \sum _{i \in E}y_i \le 1&\text { for all } E \in {\mathcal {E}}, \\ y_i \in [0,1]&\text { for all } i \in [n]. \end{aligned}$$

    Let \((y_i^*)_{i \in [n]}\) be an optimal solution for (D) with value \({\mathrm {Opt}}^{*}(D) = {\mathrm {Opt}}^*\) (duality theorem). By complementary slackness if \(x_j^* > 0\), then \(\sum _{i \in E_j} y_i^* = 1\). Therefore

    $$\begin{aligned} mb = \sum _{E \in {\mathcal {E}}} b \overset{\sum _{i \in E} y_i^* = 1}{=} \sum _{E \in {\mathcal {E}}} \sum _{i \in E} b y_i^* = \sum _{i \in V} d(i)by_i^* \le {\varDelta }\sum _{i \in V} b y_i^* = {\varDelta }\cdot {\mathrm {Opt}}^*. \end{aligned}$$
  4. (iv)

    For \(S\subseteq {\mathcal {E}}\) we set \({\mathrm {Opt}}^{*}(S):= \sum _{E_j \in S}x^{*}_{j}\). By using the LP relaxation and the definition of the sets \(S_{\ge }\) and \(S_{<}\), and since \(\lambda \ge 1\), we get

    $$\begin{aligned} {\mathrm {Opt}}^{*} \le \underbrace{|S_{\ge }|+ \lambda {\mathrm {Opt}}^{*}(S_{<})}_{=\mathbb {E}(Y)} \le \underbrace{\lambda {\mathrm {Opt}}^{*}(S_{\ge })}_{\ge |S_{\ge }| }+ \lambda {\mathrm {Opt}}^{*}(S_{<}) = \lambda {\mathrm {Opt}}^{*}. \end{aligned}$$

\(\square \)

We wish to get an approximation ratio strictly better than \(\delta \). Before we continue let us rule out the case where already a trivial solution leads to a good approximation.

Lemma 5

If there exists a constant \(c > 1\) with \(\frac{\delta \cdot {\mathrm {Opt}}^*}{c \cdot bn} > 1\), then it is trivial to compute a set \(b\)-multicover of size at most \(bn\), which is a \(\frac{\delta }{c}\)-approximation.

Proof

It is always possible to choose for every vertex \(b\) arbitrary edges containing it, because the minimum vertex degree is at least \(b\). Hence we need at most \(bn\) edges to cover all vertices. Since \(bn < \frac{\delta }{c} \cdot {\mathrm {Opt}}^*\), we have a \(\frac{\delta }{c}\)-approximation, which is strictly better than a \(\delta \)-approximation by a constant factor. \(\square \)
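A minimal sketch of this trivial cover (our own naming; it assumes, as in the proof, that every vertex has degree at least \(b\)):

```python
def trivial_b_cover(n, edges, b):
    """Pick b arbitrary incident edges per vertex; uses at most b * n edges overall."""
    C = set()
    for v in range(n):
        incident = [j for j, E in enumerate(edges) if v in E]
        C.update(incident[:b])   # degree >= b guarantees b incident edges exist
    return C

edges = [{0, 1}, {1, 2}, {0, 2}, {0, 1, 2}]
C = trivial_b_cover(3, edges, 2)
print(len(C) <= 2 * 3)   # True; here C even consists of only 3 edges
```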

By Lemma 5 we may restrict ourselves to the case where the above trivial situation does not hold. Therefore, we assume from now on

$$\begin{aligned} 1 > \frac{\delta \cdot \mathrm {Opt}^*}{c \cdot bn} \overset{\text {Lem.4} (ii)}{\ge } \frac{\delta }{c \cdot l}. \end{aligned}$$
(2)

We will use (2) to guarantee non-trivial approximations in the following theorems. For example, in the next theorem, (2) with \(c= 72/11\) gives \(\frac{11({\varDelta }- b)}{72l} \in [0,1)\), so \(1 - \frac{11({\varDelta }- b)}{72l} > 0\).

Theorem 4

Let \(b\in {\mathbb {N}}_{\ge 2}\) and let \({\mathcal {H}}\) be a hypergraph with maximum vertex degree \({\varDelta }\) and maximum edge size \(l\), \(3 \le l \le \underbrace{2^{-1/4} \cdot 3^{-1}}_{\approx 0.28} \cdot (nb)^\frac{1}{5}\).

Algorithm 1 returns a set \(b\)-multicover of size

$$\begin{aligned} \left|C\right| \le \left( 1 - \frac{11({\varDelta }- b)}{72l}\right) \delta \cdot \mathrm {Opt}^* \end{aligned}$$

with probability at least \(1-(\exp (-2)+\exp (-6)) \approx 0.862\) in polynomial time.

Proof

We choose

$$\begin{aligned} \epsilon := \frac{\delta \cdot \mathrm {Opt}^* (1 + \beta )}{6 b n} \quad \text {with} \quad \beta = \frac{\sqrt{2} l}{\sqrt{n b}}. \end{aligned}$$
(3)

For \(l < \frac{(n b)^\frac{1}{2}}{4 \sqrt{2}}\) we have \(\beta < \frac{1}{4}\). By the assumption on \(l\) in Theorem 4 this upper bound on \(l\) is satisfied. Further let us introduce the constants \(c_1:=\frac{45}{8}\) and \(c_2:=\frac{6}{c_1} > 1\). We can assume \(\frac{\delta \cdot \mathrm {Opt}^*}{c_2 \cdot bn} \le 1\) (Lemma 5). Now

$$\begin{aligned} \epsilon = \frac{\delta \cdot \mathrm {Opt}^* (1+\beta )}{6 bn} = \frac{\delta \cdot \mathrm {Opt}^* (1+\beta )}{c_1 c_2 bn} \le \frac{1+\beta }{c_1} < \frac{5}{4 c_1} = \frac{4}{18} < \frac{1}{4}. \end{aligned}$$
(4)

\({\textsc {set }}\, b\text {-}{\textsc {multicover}}\) is trivial or even unsolvable for \({\varDelta }\le b\). So we may assume \({\varDelta }\ge b + 1\), hence \(\lambda = (1-\epsilon )\delta \ge (1-\epsilon )2 > (1-\frac{4}{18})2 > 1\), which is important for using Lemma 4 later on.

First we focus on two statements that will help us bounding \(\left|C\right|\) with a constant probability.

Claim 1

$$\begin{aligned} \Pr \left( W \le n \big (1-\epsilon ^2\big ) - \sqrt{\delta \cdot \sum _{k=1}^m \left|E_k\right|^2}\right) \le \exp (-6) \end{aligned}$$
(5)

Proof of Claim 1

Consider the function \(f(X_1,\ldots ,X_m) := \sum _{i=1}^n Z_i\). Then we have for any two vectors \(x=(x_1,\ldots ,x_k,\ldots ,x_m)\) and \(x'=(x_1,\ldots ,x_k',\ldots ,x_m)\) that only differ in the \(k\)-th coordinate

$$\begin{aligned} \left|f(x_1,\ldots ,x_k,\ldots ,x_m) - f(x_1,\ldots ,x_k',\ldots ,x_m)\right| \le \left|E_k\right| \end{aligned}$$

for all \(k \in [m]\). Therefore we can use Theorem 2 with \(c_k = \left|E_k\right|\) for all \(k \in [m]\) and \(t = \sqrt{\delta \cdot \sum _{k=1}^m \left|E_k\right|^2}\) and obtain

$$\begin{aligned} \Pr \left( W \le n \big (1-\epsilon ^2\big ) - \sqrt{\delta \cdot \sum _{k=1}^m \left|E_k\right|^2}\right)&\le \Pr \left( W \le \mathbb {E}(W) - \sqrt{\delta \cdot \sum _{k=1}^m \left|E_k\right|^2}\right) \\&\le \exp \left( \frac{-2 \delta \cdot \sum _{k=1}^m \left|E_k\right|^2}{\sum _{k=1}^m \left|E_k\right|^2} \right) \\&=\exp (-2\delta )\\&\le \exp (-6). \end{aligned}$$

Claim 2

For \(\beta = \frac{\sqrt{2} l}{\sqrt{nb}}\) it holds that

$$\begin{aligned} \Pr \left( Y \ge \lambda (1 + \beta ) {\mathrm {Opt}}^* \right) \le \exp (-2) \end{aligned}$$
(6)

Proof of Claim 2

To show the following inequalities we invoke Lemma 4 and Theorem 3:

$$\begin{aligned} \Pr \left( Y \ge \lambda (1 + \beta ) \mathrm {Opt}^* \right)&\overset{\mathrm{Lem 4 (iv)}}{\le }\Pr (Y \ge \mathbb {E}(Y)(1+\beta ))\\&\quad \overset{\mathrm{Th 3 }}{\le }\exp \left( - \frac{\beta ^2\mathbb {E}(Y)}{3}\right) \\&\quad \overset{\mathrm{Lem 4 (iv)}}{\le }\exp \left( -\frac{\beta ^2 \mathrm {Opt}^*}{3}\right) \\&\quad \overset{\mathrm{Lem 4 (ii)}}{\le }\exp \left( -\frac{n b \beta ^2}{3 l}\right) \\&\quad \overset{\mathrm{Def. \beta }}{=} \exp \left( - \frac{2l}{3}\right) \\&\quad \overset{l \ge 3}{\le } \exp (-2). \end{aligned}$$

With Claims 1 and 2 we have with probability at least \(1-(\exp (-2)+\exp (-6)) \approx 0.862\):

$$\begin{aligned} \left|C\right|&\overset{\hbox {(1)}}{\le } Y + n - W\\&\le \underbrace{\lambda (1 + \beta )\mathrm {Opt}^* + n \epsilon ^2}_{(*)} + \underbrace{\sqrt{\delta \cdot \sum _{k=1}^m \left|E_k\right|^2}}_{(**)}. \end{aligned}$$

Next we bound \((*)\) and \((**)\) separately and combine their bounds later.

We start with \((*)\):

$$\begin{aligned}&(*) = \delta (1 + \beta )(1 - \epsilon )\mathrm {Opt}^* + n \epsilon ^2\\&\quad \overset{\mathrm{Def. }\epsilon }{=} {\left( (1 + \beta )(1 - \epsilon ) + \frac{\delta \cdot {\mathrm {Opt}}^* (1 + \beta )^2}{36 n b^2}\right) }\delta \cdot {\mathrm {Opt}}^*\\&\quad \overset{\mathrm{Def. }\epsilon }{=}\left( (1+\beta )\left( 1 - \frac{\delta \cdot {\mathrm {Opt}}^*(1 + \beta )}{6 n b}\right) + \frac{\delta \cdot {\mathrm {Opt}}^* (1 + \beta )^2}{36 n b^2}\right) \delta \cdot {\mathrm {Opt}}^*\\&\quad = \left( (1+\beta ) - \frac{\delta \cdot {\mathrm {Opt}}^*(1 + \beta )^2}{6 n b} + \frac{\delta \cdot {\mathrm {Opt}}^* (1 + \beta )^2}{36 n b^2}\right) \delta \cdot {\mathrm {Opt}}^*\\&\quad =\left( 1 - \left( \frac{(6b-1)\delta (1+\beta )^2{\mathrm {Opt}}^*}{36n b^2} - \beta \right) \right) \delta \cdot {\mathrm {Opt}}^*\\&\quad \overset{\mathrm{Lem.4 (ii)}}{\le } \left( 1 - \frac{(6b-1)\delta (1+\beta )^2-36\beta l b}{36lb}\right) \delta \cdot \mathrm {Opt}^*\\&\quad =\left( 1 - \frac{(6-\frac{1}{b})\delta (1+\beta )^2-36\beta l}{36l}\right) \delta \cdot {\mathrm {Opt}}^*\\&\quad \overset{\mathrm{Def. \beta }}{=} \left( 1 - \frac{\delta }{36l} \left( (6-\frac{1}{b})(1+\beta )^2 - \frac{36 \sqrt{2}l^2}{\delta \sqrt{nb}} \right) \right) \delta \cdot \mathrm {Opt}^*\\&\quad \overset{b \ge 2}{\le }\left( 1 - \frac{\delta }{36l} \left( \frac{11}{2}(1+\beta )^2- \frac{36 \sqrt{2}l^2}{\delta \sqrt{nb}}\right) \right) \delta \cdot {\mathrm {Opt}}^*\\&\quad \overset{\mathrm{bound on } l}{\le }\left( 1 - \frac{\delta }{36l} \left( \frac{11}{2}(1+\beta )^2 - \frac{8}{2\delta } \right) \right) \delta \cdot \mathrm {Opt}^*\\&\quad \le \left( 1 - \frac{\delta }{36l} \left( \frac{11}{2} - \frac{8}{2\delta } \right) \right) \delta \cdot \mathrm {Opt}^*\\&\quad =\left( 1 - \frac{11\delta -8}{72l} \right) \delta \cdot {\mathrm {Opt}}^*. \end{aligned}$$

Now we continue with \((**)\). Note that

$$\begin{aligned} \frac{b}{{\varDelta }}({\varDelta }- b + 1) = \left( \frac{{\varDelta }- b}{{\varDelta }}\right) b + \frac{b}{{\varDelta }} \ge \frac{{\varDelta }- b}{{\varDelta }} + \frac{b}{{\varDelta }} = 1. \end{aligned}$$

From this we can immediately deduce

$$\begin{aligned} \frac{m}{\delta } \overset{\mathrm{Def. \delta }}{=} \frac{m}{{\varDelta }- b + 1} \le \frac{m b}{{\varDelta }}. \end{aligned}$$
(7)

We make use of this in the following calculation:

$$\begin{aligned} \,(**)&= \sqrt{\delta \cdot \sum _{k=1}^m \left|E_k\right|^2}\\&\le \sqrt{\delta m l^2} = l \sqrt{m} \frac{\delta }{\sqrt{\delta }}\\&\overset{\hbox {(7)}}{\le } l \sqrt{\frac{mb}{{\varDelta }}} \delta \\&\overset{\mathrm{Lem.4 (iii)}}{\le } l \sqrt{\mathrm {Opt}^*} \cdot \delta \\&= \frac{l^{\frac{5}{2}}}{l \sqrt{l}} \sqrt{\mathrm {Opt}^*} \cdot \delta \\&\overset{\mathrm{bound on l}}{\le } \frac{3 \sqrt{nb} \sqrt{\mathrm {Opt^*}}}{72 \cdot l \sqrt{l}} \delta \\&\overset{\mathrm{Lem.4 (ii)}}{\le } \frac{3}{72l} \delta \cdot \mathrm {Opt}^*. \end{aligned}$$

Together we get

$$\begin{aligned} \left|C\right|&\le (*) + (**)\\&\le \left( 1 - \frac{11\delta - 8}{72l} \right) \delta \cdot \mathrm {Opt}^* + \frac{3}{72 l} \delta \cdot \mathrm {Opt}^*\\&= \left( 1 - \frac{11\delta - 11}{72l} \right) \delta \cdot \mathrm {Opt}^*\\&= \left( 1 - \frac{11({\varDelta }- b)}{72l} \right) \delta \cdot \mathrm {Opt}^*. \end{aligned}$$

\(\square \)

5 An Alternative Analysis

In this section we estimate the expected size of the set \(b\)-multicover after repairing. This is a non-trivial new aspect, because we have to simultaneously couple the outcomes of randomized rounding and repairing in the expectation. After estimating the expectation, we will use concentration inequalities, leading to the same approximation ratio we accomplished in the last section, but for hypergraphs with \(l = {\mathcal {O}}(n^\frac{1}{4})\). Together with Theorem 4 we achieve the claimed ratio for \(l \in {\mathcal {O}}\left( \max \{(nb)^\frac{1}{5},n^\frac{1}{4}\}\right) \) as desired.

First we compute the expectation and the variance of the cardinality of the set \(b\)-multicover returned by Algorithm 1. We keep in mind that we do not consider instances where a trivial cover with \(nb\) edges yields a good approximation (Lemma 5).

Lemma 6

Let \({\mathcal {H}}\) be a hypergraph with maximum vertex degree \({\varDelta }\) and maximum edge size \(l \ge 3\). Let \(\epsilon =\frac{\delta \cdot \mathrm {Opt}^{*}}{6bn}\). Then Algorithm 1 returns a set \(b\)-multicover \(C\) with

$$\begin{aligned} {\mathbb {E}}(|C|)\le \delta \cdot \left( 1 - \frac{(6-1/b)\delta }{36l}\right) {\mathrm {Opt}}^*. \end{aligned}$$

Proof

By Lemma 5 we can assume that

$$\begin{aligned} \frac{\delta {\mathrm {Opt}}^*}{6bn}\in [0,\frac{1}{4}]. \end{aligned}$$

Therefore it holds that \(\lambda =\delta \left( 1-\frac{\delta { \mathrm {Opt}}^*}{6bn}\right) \ge \frac{3}{4}\delta >1\).

Now we can bound the expectation by

$$\begin{aligned} {\mathbb {E}}(|C|)&\overset{\hbox {(1)}}{\le } {\mathbb {E}}(Y) + n - {\mathbb {E}}(W) \overset{\mathrm{Lem.4}}{\le } \lambda {\mathrm {Opt}}^* + \epsilon ^2 n = \delta (1-\epsilon ){\mathrm {Opt}}^* + \epsilon ^2 n\\&\overset{\mathrm{Def. }\epsilon }{=} \left( 1 - \frac{\delta {\mathrm {Opt}}^*}{6bn} + \frac{\delta {\mathrm {Opt}}^*}{36b^2n}\right) \delta \cdot {\mathrm {Opt}}^* = \left( 1 - \frac{(6b-1)\delta {\mathrm {Opt}}^*}{36b^2n}\right) \delta \cdot {\mathrm {Opt}}^*\\&\overset{\mathrm{Lem.4 (ii)}}{\le } \left( 1 - \frac{(6-1/b)\delta }{36l}\right) \delta \cdot {\mathrm {Opt}}^*. \end{aligned}$$

\(\square \)

The variance is estimated as follows.

Lemma 7

\(\mathrm{Var}(|C|)\le \delta l \mathbb {E}(|C|).\)

Proof

For \(j \in [m]\) let \(\hat{X}_j = 1\) if and only if the edge \(E_j\) is contained in the set \(b\)-multicover after repairing. Thus, it holds that \(\left|C\right| = \sum _{j=1}^m \hat{X}_j\). Further let \({\varLambda }, \gamma \) be defined as in Lemma 1. As \(\hat{X}_j = 1\) for all \(j \in S_{\ge }\), \(\hat{X}_j\) is a non-trivial random variable only for \(j \in S_<\), and we may restrict ourselves to such \(j\). For any pair of edges \(E_i, E_j\) with \(i,j \in S_<\) we have

$$\begin{aligned} \mathbb {E}(\hat{X}_i \hat{X}_j)&\le \min \{\Pr (\hat{X}_i = 1), \Pr (\hat{X}_j = 1)\}\\&\le \frac{\Pr (\hat{X}_i = 1) + \Pr (\hat{X}_j = 1)}{2}. \nonumber \end{aligned}$$
(8)

According to Lemma 2, there exist for every vertex at least \(b-1\) edges from \(S_\ge \) containing it. Consequently, the maximum vertex degree in the subhypergraph induced by \(S_<\), which is considered for randomized rounding and repairing, is at most \({\varDelta }- (b - 1) = \delta \). So for a fixed edge \(E_i \in S_<\)

$$\begin{aligned} \sum _{j \in E_i} \left|{\varGamma }(\{v_j\})\cap S_<\right| - 1 \le \sum _{j \in E_i} (\delta - 1) \le l (\delta - 1) \end{aligned}$$

random variables \(\hat{X}_j\) depend on \(\hat{X}_i\). Now

$$\begin{aligned} \gamma&=\sum _{\{E_i,E_j\} \in {\varLambda }} \mathbb {E}(\hat{X}_i \hat{X}_j) \overset{\hbox {(8)}}{\le } \sum _{\{E_i,E_j\} \in {\varLambda }} \frac{\Pr (\hat{X}_i = 1) + \Pr (\hat{X}_j = 1)}{2}\\&<\sum _{i=1}^m \frac{\sum _{j \in E_i}\left|{\varGamma }(\{v_j\}) \cap S_<\right| - 1}{2}\Pr (\hat{X}_i=1) \le \frac{l(\delta -1)}{2}\mathbb {E}(|C|). \end{aligned}$$

By Lemma 1 we have:

$$\begin{aligned} \mathrm{Var}(|C|) = \mathrm{Var}(|C \setminus S_\ge |) \le \mathbb {E}(|C \setminus S_\ge |) + l(\delta - 1) \mathbb {E}(|C|) \le l\delta \mathbb {E}(|C|). \end{aligned}$$

\(\square \)

Finally we use the statements about expectation and variance to obtain our second main result. Note that due to Lemma 5 and (2), \(\frac{11({\varDelta }-b)}{72l} \in [0,1]\), so the approximation factor in the following theorem is non-negative.

Theorem 5

Let \(b\in \mathbb {N},~b\ge 2\), let \({\mathcal {H}}\) be a hypergraph with maximum edge size \(l \le \sqrt{\frac{11}{72}}n^{\frac{1}{4}}\) and maximum vertex degree \({\varDelta }\) and let \(\epsilon =\frac{\delta \cdot \mathrm {Opt}^{*}}{6bn}\). Algorithm 1 returns a set \(b\)-multicover \(C\) of size

$$\begin{aligned} |C|\le \delta \left( 1-\frac{11({\varDelta }-b)}{72l}\right) \mathrm {Opt}^* \end{aligned}$$

with probability at least \(1- \frac{1}{1+ b} \ge 2/3\) in polynomial time.

Proof

Let \({\mathcal {A}}\) be the event \(|C| > \delta \left( 1- \frac{11 ({\varDelta }-b)}{72l}\right) \mathrm {Opt}^*\). To prove the claimed upper bound for \(\left|C\right|\), we estimate the concentration of \(|C|\) around its expectation. By Lemma 6 and \(b \ge 2\), the event \({\mathcal {A}}\) implies \(|C| > {\mathbb {E}}(|C|)+\frac{11\delta \mathrm {Opt}^*}{72l}\), so Theorem 1 gives

$$\begin{aligned} \Pr ({\mathcal {A}}) \le \Pr \left( |C|\ge {\mathbb {E}}(|C|)+\frac{11\delta \mathrm {Opt}^*}{72l}\right) \le \frac{\mathrm{Var}(|C|)}{\mathrm{Var}(|C|)+\left( \frac{11\delta \mathrm {Opt}^*}{72l}\right) ^{2}}. \end{aligned}$$

We may continue,

$$\begin{aligned} \frac{\left( \frac{11\delta \mathrm {Opt}^*}{72l}\right) ^{2}}{\mathrm{Var}(|C|)}&\overset{\text {Lem 7}}{\ge }&\left( \delta \frac{121(\mathrm {Opt}^*)^{2}}{(72)^{2}l^{3}\mathbb {E}(|C|)}\right)&\overset{\mathbb {E}(|C|)\le \delta \mathrm {Opt}^*}{\ge }&\frac{121 \cdot \mathrm {Opt}^*}{(72)^{2}l^{3}}\overset{\text {Lem.4} (ii)}{\ge }\frac{121 b n}{(72)^{2}l^{4}}, \end{aligned}$$

Since \(l\le \sqrt{\frac{11}{72}}n^\frac{1}{4}\), we have \(\frac{121 b n}{(72)^{2}l^{4}}\ge b\).

Therefore we get    \(\Pr \left( |C|\ge \delta \left( 1- \frac{11 ({\varDelta }-b)}{72l}\right) \mathrm {Opt}^*\right) \le \frac{1}{1 + b} \le \frac{1}{3}\). \(\square \)

6 Future Work

An interesting problem is the derandomization of our algorithms, where the challenge comes from the fact that we have a hybrid algorithm. Furthermore, attempts should be made to prove (or disprove) the non-approximability conjecture of Peleg et al. [18]. Our work indicates that hypergraphs where \(l\) is a function of \(n\) would be helpful for this task. Another interesting problem is the partial set multicover, where only a subset of vertices needs to be covered.