1 Introduction

Let \(\Sigma = (\mathsf {X}, \mathcal {R})\) be a finite range space where \(\mathsf {X}\) is a finite set of objects and \(\mathcal {R}\) is a family of subsets of \(\mathsf {X}\) called ranges. Set \(n = |\mathsf {X}|, m = |\mathcal {R}|\), and \(N = n+m\). A subset \(\mathsf {H}\subseteq \mathsf {X}\) is called a hitting set of \(\Sigma \) if \(\mathsf {H}\) intersects every range of \(\mathcal {R}\), and a subset \(\mathcal {C}\subseteq \mathcal {R}\) is called a set cover of \(\Sigma \) if the union of ranges in \(\mathcal {C}\) is equal to \(\mathsf {X}\). It is well known that a set cover of \(\Sigma \) is a hitting set of the dual range space \(\Sigma ^{\perp }\) (see Sect. 2 for the definition of \(\Sigma ^{\perp }\)). The hitting-set (resp. set-cover) problem is to find the smallest hitting set (resp. set cover) of \(\Sigma \). Both problems are well known to be NP-complete [27] and have been extensively studied. Let \(\kappa := \kappa (\Sigma )\) and \(\chi := \chi (\Sigma )\) denote the size of an optimal hitting set and set cover, respectively, of \(\Sigma \). Sometimes we also use \(\textsc {OPT}\) to denote either \(\kappa \) or \(\chi \); the meaning will be clear from the context.

In this paper, we are primarily interested in the hitting-set and set-cover problems for geometric range spaces, where \(\mathsf {X}\) is a finite set of points in \(\mathbb {R}^d\) and \(\mathcal {R}\) is a finite family of simply-shaped regions chosen from some infinite class (e.g., rectangles, balls, simplices, halfspaces). In this case, the ranges are \(\mathsf {X}\cap R\) for \(R \in \mathcal {R}\). With a slight abuse of notation, we denote the range space \(\Sigma = (\mathsf {X}, \{\mathsf {X}\cap R \mid R \in \mathcal {R}\})\) as \((\mathsf {X}, \mathcal {R})\). The hitting-set and set-cover problems are NP-complete even for simple geometric range spaces, e.g., when \(\mathcal {R}\) is a set of unit disks or unit squares in \(\mathbb {R}^2\) [25]. This has led to the development of polynomial-time approximation algorithms for these problems. Traditionally, the focus has been on developing polynomial-time algorithms with the smallest possible approximation ratio. Many recent applications (e.g., in sensor networks, database systems, computer vision) call for repeated computation of hitting sets and set covers. For such applications, it is desirable to have near-linear-time algorithms for these problems even if it means sacrificing a little on the approximation ratio. Motivated by these applications, we study the problem of computing, in near-linear time, near-optimal hitting sets and set covers for geometric range spaces.

Related work The well-known greedy algorithm gives a polynomial-time \(O(\log n)\)-approximation for a hitting set or set cover [21]. The known lower bound results imply that this is the best one can hope for, within a constant factor, assuming certain conjectures in complexity theory [24]. However, by exploiting the underlying geometry, polynomial-time algorithms with better approximation factors can be obtained for many geometric range spaces; see [6, 18, 20, 39] and the references therein for various results of this kind. These algorithms employ and adapt a wide range of novel techniques, including the usage of \(\varepsilon \)-nets [30].

Given a parameter \(\varepsilon \in (0,1]\), an \(\varepsilon \)-net for a range space \(\Sigma = (\mathsf {X}, \mathcal {R})\) is a subset \(\mathsf {Q}\subseteq \mathsf {X}\) that intersects every range \(R \in \mathcal {R}\) whose size is at least \(\varepsilon |\mathsf {X}|\). In other words, \(\mathsf {Q}\) is a hitting set for all the “heavy” ranges. Haussler and Welzl [30] proved that a range space of VC-dimension \(\delta \) has an \(\varepsilon \)-net of size \(O( (\delta /\varepsilon ) \log (\delta /\varepsilon ))\). The bounds on the size of \(\varepsilon \)-nets have been improved for several geometric range spaces. For example, \(O(1/\varepsilon )\)-size \(\varepsilon \)-nets exist when \(\mathsf {X}\) is a set of points and \(\mathcal {R}\) is a set of halfspaces in \(\mathbb {R}^2\) or \(\mathbb {R}^3\), pseudo-disks in \(\mathbb {R}^2\), or translates of a fixed convex polytope in \(\mathbb {R}^3\) [33, 36, 42]. Aronov et al. [6] gave an \(O(n \log ^{d-1} n)\)-expected-time randomized algorithm to construct an \(\varepsilon \)-net of size \(O( (1/\varepsilon ) \log \log (1/\varepsilon ))\) for the case when \(\mathcal {R}\) is a set of axis-parallel d-rectangles for \(d = 2, 3\). Bus et al. [11] further improved the upper bound to \(13.4/\varepsilon \) for disks in the plane, based on a construction using Delaunay triangulations. Recent results of Pach and Tardos [40] and Kupavskii et al. [32] show that if \(\mathcal {R}\) is a set of axis-parallel d-rectangles, then the size of an \(\varepsilon \)-net in the worst case is \(\Omega \bigl (\frac{1}{\varepsilon } \log \log \frac{1}{\varepsilon }\bigr )\) for \(d = 2, 3\) and \(\Omega \bigl (\frac{1}{\varepsilon } \log \frac{1}{\varepsilon }\bigr )\) for \(d \ge 4\). See also [4, 23] for other results on \(\varepsilon \)-nets.

Building on an iterative-reweighting technique by Clarkson [18], Brönnimann and Goodrich [10] showed that if an \(\varepsilon \)-net of a range space \(\Sigma \) of size \(O( (1/\varepsilon ) g(1/\varepsilon ))\) can be computed in polynomial time, where \(g(\cdot )\) is a monotonically-nondecreasing sublinear function, then a hitting set of size \(O(\kappa g(\kappa ))\) can also be computed in polynomial time. Since a set cover of \(\Sigma \) is a hitting set of the dual range space of \(\Sigma \), their algorithm also extends to set cover. If \(\Sigma \) has finite VC-dimension, then the dual range space also has finite VC-dimension, and thus a hitting set (resp. set cover) of \(\Sigma \) of size \(O(\kappa \log \kappa )\) (resp. \(O(\chi \log \chi )\)) can be computed in polynomial time. The size of the hitting set reduces to \(O(\kappa )\) if \(\Sigma \) admits an \(\varepsilon \)-net of size \(O(1/\varepsilon )\) (e.g., when \(\mathcal {R}\) is a set of halfspaces in \(\mathbb {R}^3\) or a set of pseudo-disks in \(\mathbb {R}^2\)), and to \(O(\kappa \log \log \kappa )\) if \(\mathcal {R}\) is a set of rectangles in \(\mathbb {R}^2\) or \(\mathbb {R}^3\). The Brönnimann–Goodrich algorithm can be viewed as first solving the LP relaxation of the hitting-set problem and then obtaining an approximate hitting set from the fractional LP solution via an \(\varepsilon \)-net. A more general LP solver can also be used to obtain the same results [34]. See [6, 20, 44] for improved approximation ratios for set covers of geometric range spaces.

The algorithms by Clarkson [18] and Brönnimann–Goodrich [10] are instances of the so-called multiplicative weight (MW) method. The MW method, which has been repeatedly discovered, goes back to the 1950s and has been used in numerous fields including machine learning, linear programming, semidefinite programming, graph algorithms, game theory, on-line algorithms, and computational geometry. In computational geometry, besides hitting-set/set-cover algorithms, the MW method has been used for linear programming [19, 31], constructing spanning trees of low stabbing number [16], and constructing partition trees for range searching [35]. See the survey by Arora et al. [8] for a comprehensive review of this method.

Approximation algorithms that do not rely on \(\varepsilon \)-nets are also known for hitting set and set cover. For instance, the local-search technique by Mustafa and Ray [39] computes \((1+\varepsilon )\)-approximate hitting sets for halfspaces in \(\mathbb {R}^3\) and pseudo-disks in \(\mathbb {R}^2\) in \(N^{O(\varepsilon ^{-2})}\) time. Mustafa et al. [38] developed a quasi-polynomial-time algorithm for the case where the halfspaces and pseudo-disks are weighted.

Both the greedy and the Brönnimann–Goodrich algorithms work in \(O(\kappa \log N)\) stages. A straightforward implementation of both algorithms takes O(mn) time per stage, which can be improved to \(O( N{\mathrm {polylog}}(N))\) in some cases using geometric data structures. Thus these algorithms run in time \(\Omega ( N\kappa )\).

Agarwal et al. [3] studied near-linear algorithms for the hitting-set problem. They proposed a variant of the greedy algorithm that performs \(O(\log n)\) stages and chooses, in near-linear time, \(O(\kappa \phi (\kappa ))\) points in each stage for the hitting set if the union of any subset \(\mathcal {F}\subseteq \mathcal {R}\) has complexity \(O(|\mathcal {F}| \phi (|\mathcal {F}|))\), where \(\phi (\cdot )\) is a sublinear function. They also presented an efficient implementation of the Brönnimann–Goodrich algorithm for the special case when \(\mathcal {R}\) is a set of d-rectangles for \(d = 2,3\). Their algorithm is rather complex and computes a hitting set of size \(O(\kappa \log \log \kappa )\) in \(O( (N+ \kappa ^{d+1}) \mathrm {polylog}(n))\) time. Not only is this algorithm inefficient for large values of \(\kappa \), but it also does not extend to other range spaces.

Our results We present two simple algorithms, both based on the MW method, for computing a small-size hitting set or set cover of a range space \(\Sigma \) of finite VC-dimension. They first compute a near-optimal fractional solution using the MW method and then transform this fractional solution to an integral solution using \(\varepsilon \)-nets (e.g., as in [34]). As mentioned earlier, a set cover of \(\Sigma \) is a hitting set of the dual range space \(\Sigma ^{\perp }\), so we only describe our algorithms in terms of computing a hitting set.

The first algorithm, described in Sect. 3, is a simpler but more efficient variant of the Brönnimann–Goodrich algorithm [10]. It divides the \(O(\kappa \log n)\) stages of their algorithm into \(O(\log n)\) rounds so that each range is processed only once in each round, and it computes an \(\varepsilon \)-net twice, instead of \(O(\kappa \log n)\) times as in [10]. Assuming that an \(\varepsilon \)-net of \(\Sigma \) can be computed in \(O(n {\mathrm {polylog}}(n))\) time and range queries on \(\Sigma \) can be answered in \({\mathrm {polylog}}(n)\) time, the expected running time of the algorithm is \(O( N{\mathrm {polylog}}(N))\). See Theorem 3.2 for a more precise bound.

The second algorithm, described in Sect. 4, is a Monte Carlo algorithm that computes a small-size hitting set or set cover with high probability (i.e., with probability at least \(1-1/N^{\Omega (1)}\)). It can be viewed as solving a two-player zero-sum game. In this game, one player Alice has the points in \(\mathsf {X}\) as her pure strategies and the other player Bob has the ranges in \(\mathcal {R}\) as his pure strategies. If Alice plays \(x \in \mathsf {X}\) and Bob plays \(R \in \mathcal {R}\), then Bob pays Alice 1 if \(x \in R\) and 0 otherwise. An optimal mixed strategy for Alice gives an optimal fractional solution to the hitting-set problem, from which one can use an \(\varepsilon \)-net to obtain a hitting set of small size. Using the MW method, the algorithm computes near-optimal mixed strategies for both Alice and Bob, which gives a near-optimal fractional solution to the hitting-set problem. We remark that the MW method has been used to solve zero-sum games approximately [26, 28], but our algorithm is somewhat different, tailored to computing a hitting set.

We next describe in Sect. 5 the consequences of these algorithms for a number of geometric range spaces. The results are presented in terms of the second algorithm and thus achieve the approximation ratios with high probability. Using the first algorithm, we can obtain Las Vegas algorithms for these problems that are guaranteed to achieve these approximation ratios, possibly paying an additional logarithmic factor in the running time. See Table 1 for a summary of the results.

Previously, either no near-linear algorithms with a guaranteed approximation ratio were known for these problems, or the known near-linear algorithms attained a worse approximation ratio. For example, the algorithm by Agarwal et al. [3] computes only an \(O(\log n)\)-approximate hitting set for disks in \(O(N\log ^3 N)\) time, and it does not extend to computing a set cover. We remark that our algorithms rely on standard range-searching data structures. It might be possible to improve the \(\mathrm {polylog}\) factors in the running time by optimizing these data structures, but we feel it is not worth the effort.

Table 1 Summary of results for different geometric range spaces using the second algorithm, so these bounds hold with probability at least \(1-\frac{1}{N^{\Omega (1)}}\)

Finally, in Sect. 6, we extend the second algorithm to compute, in near-linear time, an O(1)-approximate (discrete) maximum independent set for disks. In this problem, we are given a discrete range space \(\Sigma = (\mathsf {X}, \mathcal {R})\), where \(\mathsf {X}\) is a set of points and \(\mathcal {R}\) a set of disks in \(\mathbb {R}^2\), and the goal is to compute the largest subset \(\mathcal {I}\) of disks such that no point of \(\mathsf {X}\) is contained in more than one disk of \(\mathcal {I}\). We first use our second algorithm to compute a fractional solution to this problem, and then round the fractional solution to an integral one using the algorithm by Chan and Har-Peled [13]. (As stated in [13], their rounding algorithm employs an existing LP solver and thus runs in quadratic time.)

2 Preliminaries

Range space and \(\varvec{\varepsilon }\)-nets Let \(\Sigma = (\mathsf {X}, \mathcal {R})\) be a finite range space, as defined above. The dual range space of \(\Sigma \), denoted by \(\Sigma ^{\perp }= (\mathsf {X}^{\perp }, \mathcal {R}^{\perp })\), is defined as follows. There is an object in \(\mathsf {X}^{\perp }\) for each range \(R \in \mathcal {R}\), and \(\mathcal {R}^{\perp }\) contains a range \(R_x\) for each point \(x \in \mathsf {X}\), namely, \(R_x = \{ R \in \mathcal {R}\mid R \ni x \}\). As mentioned above, a set cover of \(\Sigma \) is a hitting set of \(\Sigma ^{\perp }\) and vice versa. It is also known that if \(\Sigma \) has finite VC-dimension, then so does \(\Sigma ^{\perp }\) [29].
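As a concrete illustration (ours, not from the paper), the following Python sketch builds \(\Sigma ^{\perp }\) from an explicitly given range space; the representation and the function name dual_range_space are ours.

```python
def dual_range_space(X, ranges):
    """X: iterable of hashable points; ranges: list of sets of points.
    Returns (X_dual, R_dual): X_dual indexes the ranges of Sigma, and
    R_dual maps each point x to R_x = { indices of ranges containing x }."""
    X_dual = list(range(len(ranges)))      # one dual object per range of Sigma
    R_dual = {x: {i for i, R in enumerate(ranges) if x in R} for x in X}
    return X_dual, R_dual

# Example: point 1 lies in ranges 0 and 1, so R_dual[1] == {0, 1}; a set
# cover of Sigma corresponds to a hitting set of this dual range space.
X = [0, 1, 2]
ranges = [{0, 1}, {1, 2}, {2}]
X_dual, R_dual = dual_range_space(X, ranges)
assert R_dual[1] == {0, 1}
```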

A hitting set \(\mathsf {H}\subseteq \mathsf {X}\) is called c-approximate, for \(c \ge 1\), if \(|\mathsf {H}| \le c \kappa (\Sigma )\). Similarly, a set cover \(\mathcal {C}\subseteq \mathcal {R}\) is called c-approximate if \(|\mathcal {C}| \le c \chi (\Sigma )\).

Given a weight function \(w:\mathsf {X}\rightarrow \mathbb {R}_{\ge 0}\), for a subset \(A \subseteq \mathsf {X}\), we use w(A) to denote the total weight of the points in A. Given w and a parameter \(\varepsilon \in (0, 1]\), we call a range \(R \in \mathcal {R}\) \(\varepsilon \)-heavy if \(w(R) \ge \varepsilon w(\mathsf {X})\) and \(\varepsilon \)-light otherwise. A subset \(\mathsf {Q}\subseteq \mathsf {X}\) is called an \(\varepsilon \)-net with respect to the weight function w if \(\mathsf {Q}\cap R \ne \varnothing \) for every \(\varepsilon \)-heavy range R of \(\mathcal {R}\).
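These weighted notions are easy to state operationally. The following small Python sketch (illustrative only; the helper names are ours) tests \(\varepsilon \)-heaviness and verifies the \(\varepsilon \)-net property for explicitly given ranges.

```python
def is_heavy(R, w, eps):
    """R: set of points; w: dict mapping point -> nonnegative weight."""
    return sum(w[x] for x in R) >= eps * sum(w.values())

def is_eps_net(Q, ranges, w, eps):
    """Q is an eps-net iff it hits every eps-heavy range."""
    return all(Q & R for R in ranges if is_heavy(R, w, eps))

w = {0: 1.0, 1: 3.0, 2: 1.0}
ranges = [{0, 1}, {2}]
assert is_eps_net({1}, ranges, w, eps=0.5)   # {0,1} is the only 0.5-heavy range
```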

Hitting-set algorithm We first briefly describe the main idea behind the algorithm of Brönnimann–Goodrich [10] (and Clarkson [18]), which uses a framework similar to that of other, more general multiplicative-weight algorithms such as the Plotkin–Shmoys–Tardos algorithm [41] for solving packing and covering LPs.

Let k be an integer such that \(k/2 < \kappa \le k\). Initialize the weight of each point to 1, i.e., \(w(x) = 1\) for all \(x \in \mathsf {X}\), and repeat the following weight-doubling step until every range is \(\tfrac{1}{2k}\)-heavy: pick a \(\tfrac{1}{2k}\)-light range \(R \in \mathcal {R}\) and double the weight of every point in R.

When the process stops, return a \(\tfrac{1}{2k}\)-net \(\mathsf {Q}\) of \(\Sigma \) with respect to the final weights. Since each range in \(\Sigma \) is \(\tfrac{1}{2k}\)-heavy, \(\mathsf {Q}\) is a hitting set of \(\Sigma \). Hence, if a \(\tfrac{1}{2k}\)-net of size O(kg(k)) can be computed efficiently, the above algorithm computes a hitting set of size \(O(\kappa g(\kappa ))\).
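For concreteness, here is a naive Python sketch of this loop (ours, not the paper's); it scans for a light range in linear time per step, whereas the implementations discussed later use data structures, and compute_net is an assumed black box returning a \(\tfrac{1}{2k}\)-net with respect to the current weights.

```python
import math

def bronnimann_goodrich(X, ranges, k, compute_net):
    w = {x: 1.0 for x in X}
    steps = 0
    max_steps = math.ceil(4 * k * math.log2(max(len(X) / k, 2)))  # mu_k of Lemma 2.1
    while True:
        light = next((R for R in ranges
                      if sum(w[x] for x in R) < sum(w.values()) / (2 * k)), None)
        if light is None:                 # every range is (1/2k)-heavy: done
            return compute_net(X, ranges, w, 1.0 / (2 * k))
        for x in light:                   # weight-doubling step on a light range
            w[x] *= 2.0
        steps += 1
        if steps > max_steps:             # by Lemma 2.1: no hitting set of size <= k
            return None
```

The following lemma, by now a well-known argument (see e.g. [8, 10, 16, 18, 19]), is the key to the performance of the algorithm. For completeness, we also give the proof here.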

Lemma 2.1

If \(\Sigma \) has a hitting set of size at most k, then the algorithm performs at most \(\mu _k := 4k \log (n/k)\) weight-doubling steps. The final weight of \(\mathsf {X}\) is at most \(n^4/k^3\).

Proof

Let \(\mathsf {H}\) be a hitting set of size k. Since each weight-doubling step is performed on a \(\tfrac{1}{2k}\)-light range R, the total weight of \(\mathsf {X}\) increases by a factor of at most \(\bigl (1 + \tfrac{1}{2k}\bigr )\). Thus, after z weight-doubling steps,

$$\begin{aligned} w(\mathsf {X}) \le n\biggl (1+\frac{1}{2k}\biggr )^z \le n \exp \biggl (\frac{z}{2k}\biggr ). \end{aligned}$$

On the other hand, the range R on which a weight-doubling step is performed satisfies \(\mathsf {H}\cap R \ne \varnothing \), so at least one point of \(\mathsf {H}\) gets its weight doubled in that step. Suppose each element \(h \in \mathsf {H}\) has its weight doubled \(z_h\) times after z weight-doubling steps; then \(\sum _{h \in \mathsf {H}} z_h \ge z\), and by convexity of the exponential,

$$\begin{aligned} w(\mathsf {H}) = \sum _{h \in \mathsf {H}} 2^{z_h} \ge k 2^{z/k}. \end{aligned}$$

Since \(w(\mathsf {H}) \le w(\mathsf {X})\), we get

$$\begin{aligned} k 2^{\frac{z}{k}} \le n e^{\frac{z}{2k}} \le n 2^{\frac{3z}{4k}}, \end{aligned}$$

from which \(z \le \mu _k = 4k \log (n/k)\) follows (here \(\log \) denotes the base-2 logarithm). Substituting \(z \le \mu _k\) into \(w(\mathsf {X}) \le n 2^{3z/(4k)}\) yields the bound \(n^4/k^3\) on the final weight of \(\mathsf {X}\). \(\square \)

By Lemma 2.1, if the algorithm does not terminate within \(\mu _k\) steps, we can conclude that \(\Sigma \) does not have a hitting set of size at most k. An exponential search is used to guess the value of k such that \(k/2 < \kappa \le k\).

In each step, the Brönnimann–Goodrich algorithm computes a \(\tfrac{1}{2k}\)-net \(\mathsf {H}\) and identifies a range that does not intersect \(\mathsf {H}\) as a light range. However, verifying whether \(\mathsf {H}\) hits all ranges takes \(\Omega (N)\) time, and thus the algorithm can take \(\Omega (N^2)\) time in the worst case; see the original paper for details. In the next section, we group every O(k) weight-doubling steps into one round, and find light ranges for these steps together by looking at each range in \(\mathcal {R}\) only once. This optimization makes it possible to finish all weight-doubling steps in \(\widetilde{O}(N)\) time for many geometric instances.

3 Hitting Set in Rounds

Algorithm As in [10], assume we have an integer k such that \(\kappa \in (k/2, k]\); we perform an exponential search to find such a value of k. The algorithm is similar to [10] except that it works in rounds. Initially, it sets \(w(x) = 1\) for all \(x \in \mathsf {X}\). Each round of the algorithm performs at most 2k weight-doubling steps, as follows. It processes each range \(R \in \mathcal {R}\) one by one. If R is \(\tfrac{1}{2k}\)-light, it doubles the weights of all points in R. This weight-doubling step on R is performed repeatedly until R becomes \(\tfrac{1}{2k}\)-heavy or 2k weight-doubling steps have been performed in the current round. Once R becomes \(\tfrac{1}{2k}\)-heavy, it is not processed again in the current round, even though it may become \(\tfrac{1}{2k}\)-light again later in the current round while other ranges are being processed.

If 2k weight-doubling steps have been performed in the current round, the algorithm aborts the current round and moves to the next round. On the other hand, if all ranges have been processed with fewer than 2k weight-doubling steps, the algorithm stops and returns a \(\tfrac{1}{2ke}\)-net \(\mathsf {Q}\) of \(\Sigma \) as a hitting set of \(\Sigma \).
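In outline, a single round can be sketched as follows (again naive, with explicit weight sums standing in for the data structures introduced below):

```python
def one_round(X, ranges, w, k):
    steps = 0
    for R in ranges:                      # each range is processed once per round
        while sum(w[x] for x in R) < sum(w.values()) / (2 * k):
            for x in R:                   # weight-doubling step on R
                w[x] *= 2.0
            steps += 1
            if steps == 2 * k:            # abort the round, move to the next one
                return False
        # R is now (1/2k)-heavy; it is not revisited in this round even if
        # later doublings make it light again.
    return True                           # all ranges processed: algorithm may stop

# The caller repeats one_round until it returns True and then returns a
# (1/2ke)-net with respect to the final weights (justified by Lemma 3.1).
```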

By Lemma 2.1, if \(\Sigma \) has a hitting set of size at most k, then the algorithm performs at most \(\mu _k = 4k \log (n/k)\) weight-doubling steps. Since each except the last round performs 2k weight-doubling steps, the number of rounds is at most \(\frac{\mu _k}{2k} + 1 = 2\log (n/k)+1\).

Correctness The correctness of the algorithm follows from the following lemma.

Lemma 3.1

All ranges are \(\tfrac{1}{2ke}\)-heavy when the algorithm terminates.

Proof

Suppose the algorithm stops in round i. Let \(W_i\) be the total weight \(w(\mathsf {X})\) in the beginning of round i, and let \(W_f\) be the total weight when the algorithm stops. Since at most 2k weight-doubling steps are performed in round i and each of them increases the total weight by a factor of at most \(1+\tfrac{1}{2k}\),

$$\begin{aligned} W_f \le \biggl (1+\frac{1}{2k}\biggr )^{2k} W_i \le e W_i. \end{aligned}$$

After a range R has been processed in round i, it is \(\tfrac{1}{2k}\)-heavy with respect to the current weight of \(\mathsf {X}\), which is at least \(W_i\). Therefore

$$\begin{aligned} w(R) \ge \frac{1}{2k}\, W_i \ge \frac{1}{2ke}\, W_f, \end{aligned}$$

implying that R is \(\tfrac{1}{2ke}\)-heavy. Since the algorithm terminates in round i only after all ranges have been processed in that round, all ranges are \(\tfrac{1}{2ke}\)-heavy. \(\square \)

If an \(\varepsilon \)-net of \(\Sigma \) of size \(O\bigl (\tfrac{1}{\varepsilon } g\bigl (\tfrac{1}{\varepsilon }\bigr )\bigr )\) can be computed, then by Lemma 3.1, the above algorithm returns a hitting set of size \(O(\kappa g(\kappa ))\).

Running time To expedite the weight-doubling step, we perform the following preprocessing step before running the algorithm: compute a \(\tfrac{1}{2k}\)-net \(\mathsf {Q}_0\) of \(\Sigma \), and set \(\mathsf {X}= \mathsf {X}\setminus \mathsf {Q}_0\) and \(\mathcal {R}= \{R \mid R \cap \mathsf {Q}_0 = \varnothing \}\). After this preprocessing, each range in \(\mathcal {R}\) contains at most \(\tfrac{n}{2k}\) points. The algorithm returns \(\mathsf {Q}_0 \cup \mathsf {Q}\) as a hitting set of \(\Sigma \).

We assume the existence of the following three procedures:

  1. (P1)

    A net finder that can compute an \(\varepsilon \)-net of size \(O\bigl (\tfrac{1}{\varepsilon } g\bigl (\tfrac{1}{\varepsilon }\bigr )\bigr )\) in \(\varphi (N)\) time.

  2. (P2)

    A range-counting data structure that, given a range R, returns yes if R is \(\tfrac{1}{2k}\)-light and no otherwise, and that can also update the weight of a point. Let \(\tau (n) = \Omega (\log n)\) be the time taken by each of these two operations. We assume that the data structure can be built in \(O(n\tau (n))\) time.

  3. (P3)

    A range-reporting data structure that reports all s points of a range in \(O(\tau (n)+s)\) time, and that can be constructed in \(O(n\tau (n))\) time.

Using (P2) and (P3), a range \(R \in \mathcal {R}\) can be processed in each round as follows. First, using (P2) we check in \(\tau (n)\) time whether R is \(\tfrac{1}{2k}\)-light. If the answer is yes, we use (P3) to report all points of R in time \(\tau (n) + |R| \le \tau (n) + \tfrac{n}{2k}\). Recall that at most 2k weight-doubling steps are performed in each round, so O(n) points are reported in a round. For each reported point, it takes \(\tau (n)\) time to double its weight in the range-counting data structure. We thus conclude that a round takes \(O( N\tau (n) )\) time. Summing over all \(2 \log (n/k) + 1\) rounds, the total time spent by the algorithm, including the time spent in the preprocessing phase, is \(O( N\tau (n) \log n + \varphi (N))\). We repeat the algorithm \(O(\log \kappa )\) times to perform the exponential search.
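To make the interplay of (P2) and (P3) concrete, the following Python sketch instantiates both procedures for the toy case where \(\mathsf {X}\) is a set of points on the real line and each range is an interval; all names are ours. A Fenwick tree gives \(\tau (n) = O(\log n)\) weight queries and updates, and sorted order gives \(O(\log n + s)\) reporting.

```python
import bisect

class IntervalRangeStructure:
    def __init__(self, pts):               # pts must be sorted
        self.pts, self.n = pts, len(pts)
        self.tree = [0.0] * (self.n + 1)    # Fenwick (binary indexed) tree
        self.w = [0.0] * self.n
        for i in range(self.n):
            self.add(i, 1.0)                # initial weight 1 per point
    def add(self, i, delta):                # add delta to the weight of pts[i]
        self.w[i] += delta
        j = i + 1
        while j <= self.n:
            self.tree[j] += delta
            j += j & (-j)
    def _prefix(self, j):                   # total weight of pts[0..j-1]
        s = 0.0
        while j > 0:
            s += self.tree[j]
            j -= j & (-j)
        return s
    def weight(self, lo, hi):               # (P2): w(R) for the interval [lo, hi]
        a = bisect.bisect_left(self.pts, lo)
        b = bisect.bisect_right(self.pts, hi)
        return self._prefix(b) - self._prefix(a)
    def report(self, lo, hi):               # (P3): indices of the points in [lo, hi]
        a = bisect.bisect_left(self.pts, lo)
        b = bisect.bisect_right(self.pts, hi)
        return range(a, b)

# Doubling the weight of point i is ds.add(i, ds.w[i]); a range R = (lo, hi)
# is (1/2k)-light iff ds.weight(lo, hi) < ds._prefix(ds.n) / (2 * k).
```

Putting everything together, we obtain the following.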

Theorem 3.2

Let \(\Sigma = (\mathsf {X}, \mathcal {R})\) be a finite range space with \(|\mathsf {X}| + |\mathcal {R}| = N\). If procedures (P1), (P2) and (P3) exist for \(\Sigma \), then an \(O(g(\kappa ))\)-approximate hitting set of \(\Sigma \) can be computed in \(O( ( N\tau (N) \log N+ \varphi (N) ) \log \kappa )\) time.

Recalling that a set cover of \(\Sigma \) is a hitting set of the dual range space \(\Sigma ^{\perp }\), we obtain the following.

Theorem 3.3

Let \(\Sigma = (\mathsf {X}, \mathcal {R})\) be a finite range space with \(|\mathsf {X}| + |\mathcal {R}| = N\). If procedures (P1), (P2) and (P3) exist for \(\Sigma ^{\perp }\), then an \(O(g(\chi ))\)-approximate set cover of \(\Sigma \) can be computed in \(O( (N\tau (N) \log N+ \varphi (N)) \log \chi )\) time.

Remarks

(i) An approximate range-counting data structure can be used in (P2), namely, a data structure that always returns yes if R is \(\tfrac{1}{4k}\)-light and always returns no if R is \(\tfrac{1}{2k}\)-heavy. It may return either answer if \(w(R) \in \bigl [\tfrac{1}{4k}w(\mathsf {X}), \tfrac{1}{2k}w(\mathsf {X})\bigr ]\). So if the data structure returns yes, then R must be \(\tfrac{1}{2k}\)-light; on the other hand, if it returns no, then R must be \(\tfrac{1}{4k}\)-heavy. A weight-doubling step is performed if the procedure returns yes. The same argument as in Lemma 3.1 implies that each range is \(\tfrac{1}{4ke}\)-heavy when the algorithm terminates, after \(O(\log (n/k))\) rounds. Hence, the size of the hitting set is affected by a factor of at most 2. This is convenient because faster data structures are known for approximate range counting for some range spaces (e.g., halfspace range counting).

(ii) By Lemma 2.1, the final weight of \(\mathsf {X}\) is bounded by \(O(n^4/k^3)\), so \(O(\log n)\) bits suffice for maintaining the weight of each point during the algorithm.

4 Hitting Set as 2-Player Game

Algorithm We now describe the second algorithm for computing a hitting set. As earlier, suppose we have an integer k such that \(k/2 < \kappa \le k\); we perform a backward exponential search to find such a value of k as described later. The algorithm now maintains weights on both \(\mathsf {X}\) and \(\mathcal {R}\), and it also works in rounds. For \(i \ge 1\), let \(\pi ^i:\mathsf {X}\rightarrow \mathbb {R}_{\ge 0}\) and \(\omega ^i:\mathcal {R}\rightarrow \mathbb {R}_{\ge 0}\) be the weights of points and ranges, respectively, in the beginning of round i. We also use \(\pi ^i(\mathsf {X})\) and \(\omega ^i(\mathcal {R})\) to denote the total weight of all points in \(\mathsf {X}\) and all ranges in \(\mathcal {R}\), respectively. Initially, \(\pi ^1(x) = 1\) for all \(x \in \mathsf {X}\) and \(\omega ^1(R) = 1\) for all \(R \in \mathcal {R}\). Let \(\Pi ^i\) (resp. \(\Omega ^i\)) denote the probability distribution induced by \(\pi ^i\) (resp. \(\omega ^i\)), i.e.,

$$\begin{aligned} \Pi ^i = \biggl \langle \frac{\pi ^i(x_j)}{\pi ^i(\mathsf {X})} \ \big | \ 1 \le j \le n \biggr \rangle , ~ ~ \Omega ^i = \biggl \langle \frac{\omega ^i(R_j)}{\omega ^i(\mathcal {R})} \ \big | \ 1 \le j \le m \biggr \rangle . \end{aligned}$$

We set

$$\begin{aligned} \mu := \frac{2}{\ln 2}\, k \ln (m^2 n). \end{aligned}$$

The algorithm performs \(\mu \) rounds of the following two steps: for \(1 \le i \le \mu \),

  1. (i)

    Sample a point \(\bar{x}_i \in \mathsf {X}\) from the distribution \(\Pi ^i\) and a range \(\overline{R}_i \in \mathcal {R}\) from the distribution \(\Omega ^i\).

  2. (ii)

    For each point \(x \in \overline{R}_i\), double its weight, i.e., \(\pi ^{i+1}(x) = 2 \pi ^{i}(x)\), and for each range R that contains \(\bar{x}_i\), halve its weight, i.e., \(\omega ^{i+1}(R) = \omega ^i(R)/2\).

Let \(\widetilde{\mathsf {X}}= \langle \bar{x}_1, \ldots , \bar{x}_{\mu } \rangle \) be the multi-set of points chosen by the algorithm. Let \(\widetilde{\Pi }\) be the distribution on \(\mathsf {X}\) induced by \(\widetilde{\mathsf {X}}\), i.e., if a point \(x \in \mathsf {X}\) appears \(\mu _x\) times in \(\widetilde{\mathsf {X}}\), then set \(\Pr (x \sim \widetilde{\Pi }) = \mu _x/\mu \). We compute a \(\tfrac{1}{8k}\)-net \(\mathsf {Q}\) of \(\Sigma \) with respect to \(\widetilde{\Pi }\). If \(\mathsf {Q}\) is a hitting set of \(\Sigma \), we return \(\mathsf {Q}\); otherwise, we repeat the above algorithm.
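The rounds translate directly into the following Python sketch (ours); ranges are sets of points, net (an \(\varepsilon \)-net finder with respect to a distribution) and is_hitting_set are assumed black boxes, and the scan that halves range weights is naive here (the fast implementation below replaces it with procedure (P4)).

```python
import math, random

def mw_game_hitting_set(X, ranges, k, net, is_hitting_set):
    m, n = len(ranges), len(X)
    mu = math.ceil((2 / math.log(2)) * k * math.log(m * m * n))
    pi = {x: 1.0 for x in X}               # weights pi^i on points
    om = [1.0] * m                         # weights omega^i on ranges
    sample = []
    for _ in range(mu):
        x = random.choices(list(pi), weights=list(pi.values()))[0]  # x-bar_i
        j = random.choices(range(m), weights=om)[0]                 # R-bar_i
        sample.append(x)
        for y in ranges[j]:                # step (ii): double points of R-bar_i
            pi[y] *= 2.0
        for jj in range(m):                # ... and halve ranges containing x-bar_i
            if x in ranges[jj]:
                om[jj] *= 0.5
    freq = {x: sample.count(x) / mu for x in set(sample)}  # distribution Pi-tilde
    Q = net(freq, 1.0 / (8 * k))           # (1/8k)-net with respect to Pi-tilde
    return Q if is_hitting_set(Q, ranges) else None        # on None, repeat
```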

Intuitively, the weights of points that lie in many ranges are doubled more frequently, so such points, which are good candidates for the hitting set, are more likely to be chosen; on the other hand, the weights of ranges that contain few points are halved less frequently, so these ranges keep being sampled, which ensures that the points contained in them gain weight.

Correctness We view the hitting-set problem for \(\Sigma \) as a two-player zero-sum game, and the above algorithm computes a near-optimal mixed strategy for each of the two players. More precisely, let Alice and Bob be two players who play the following game: Alice chooses a point \(x \in \mathsf {X}\), Bob chooses a range \(R \in \mathcal {R}\) and Bob pays I(xR) to Alice, where

$$\begin{aligned} I(x,R) = \bigg \{ \begin{array}{ll} 1 &{} \text {if } x \in R, \\ 0 &{} \text {if } x \notin R. \end{array} \end{aligned}$$

The algorithm can be interpreted as Alice and Bob playing the game for \(\mu \) rounds. Initially, both players have uniform distributions over their pure strategies. In each round, Alice samples a point x from the current distribution over points and Bob samples a range R from the current distribution over ranges. According to the definition of the game, points in R are good for Alice because had she played one of them, she would have won 1 dollar. Therefore, Alice doubles the weights of points in R so that she can choose them more often in the future. Similarly, ranges containing x are bad for Bob, thus he halves the weights of those ranges.

For a probability distribution \(\Pi \) over \(\mathsf {X}\) and for a range \(R\in \mathcal {R}\), let \(I(\Pi , R)\) denote the expected payoff to Alice if Bob chooses R, i.e.,

$$\begin{aligned} I(\Pi , R) = \sum _{x \in \mathsf {X}} \,\Pr (x\sim \Pi )\, I(x, R). \end{aligned}$$

Similarly, we define \(I(x, \Omega )\) for a point \(x \in \mathsf {X}\) and a distribution \(\Omega \) over \(\mathcal {R}\). Let \(\lambda ^*\) be the value of the above game. By the min-max theorem [37],

$$\begin{aligned} \lambda ^* = \max _{\Pi } \min _{R \in \mathcal {R}} I(\Pi , R) = \min _{\Omega } \max _{x \in \mathsf {X}} I(x, \Omega ), \end{aligned}$$
(1)

where \(\Pi \), \(\Omega \) are probability distributions over \(\mathsf {X}\) and \(\mathcal {R}\), respectively.

Let \(\mathsf {H}^* \subseteq \mathsf {X}\) be an optimal hitting set of \(\Sigma \) of size \(\kappa \), and let \(\Pi _{\mathsf {H}^*}\) be the distribution that assigns probability \(1/\kappa \) to each \(x \in \mathsf {H}^*\) and 0 to every other point. Then

$$\begin{aligned} \min _{R \in \mathcal {R}} I(\Pi _{\mathsf {H}^*}, R) \ge \frac{1}{\kappa } \end{aligned}$$

because \(R\cap \mathsf {H}^* \ne \varnothing \) for all \(R \in \mathcal {R}\). Hence \(\lambda ^* \ge 1/\kappa \ge 1/k\). Let

$$\begin{aligned} \Pi ^*= {\mathop {\hbox {arg max}}\limits _{\Pi }} \min _{R \in \mathcal {R}} I(\Pi , R) \end{aligned}$$

be the optimal (mixed) strategy for Alice. If we can compute \(\Pi ^*\), then we can simply return a \((1/k)\)-net \(\mathsf {Q}\) of \(\Sigma \) under the distribution \(\Pi ^*\): \(\mathsf {Q}\) is a hitting set of \(\Sigma \) because the weight of any \(R\in \mathcal {R}\) under \(\Pi ^*\) is at least \(\min _{R \in \mathcal {R}} I(\Pi ^*, R) = \lambda ^* \ge 1/k\). We show that \(\widetilde{\Pi }\) is an approximation of \(\Pi ^*\), in the sense that with constant probability,

$$\begin{aligned} \min _{R \in \mathcal {R}} I(\widetilde{\Pi }, R) \ge \frac{1}{8k}, \end{aligned}$$

and thus a \(\tfrac{1}{8k}\)-net of \(\Sigma \) under \(\widetilde{\Pi }\) is a hitting set of \(\Sigma \).

We begin by proving two lemmas, which follow from standard arguments for the MW method. The first one states that the total expected payoff to Alice over \(\mu \) rounds is not much less than the payoff she would get by the best pure strategy.

Lemma 4.1

For every \(x \in \mathsf {X}\),

$$\begin{aligned} \displaystyle \sum _{t=1}^{\mu } I( \Pi ^t, \overline{R}_t) \ge -\ln n + \ln 2 \sum _{t=1}^{\mu } I(x, \overline{R}_t). \end{aligned}$$

Proof

Let \(\pi ^t(\mathsf {X})\) be the total weight of \(\mathsf {X}\) in the beginning of round t. Then by construction,

$$\begin{aligned} \pi ^{t+1}(\mathsf {X})&= \sum _{x \in \mathsf {X}} \pi ^t(x) (1+ I(x, \overline{R}_t)) \nonumber \\&= \pi ^t(\mathsf {X}) \biggl (1 + \sum _{x \in \mathsf {X}} \frac{\pi ^t(x)}{\pi ^t(\mathsf {X})}\, I(x, \overline{R}_t)\biggr ) \nonumber \\&= \pi ^t(\mathsf {X}) (1 + I(\Pi ^t, \overline{R}_t)) \le \pi ^t(\mathsf {X}) \exp (I(\Pi ^t, \overline{R}_t)) \nonumber \\&\le n \exp \biggl (\sum _{i=1}^t I(\Pi ^i, \overline{R}_i)\biggr ). \end{aligned}$$
(2)

The last inequality follows because \(\pi ^1(\mathsf {X}) = n\). However, for every \(x \in \mathsf {X}\),

$$\begin{aligned} \pi ^{\mu +1} (\mathsf {X}) \ge \pi ^{\mu +1}(x) = \exp \biggl (\ln 2 \sum _{t=1}^{\mu } I(x, \overline{R}_t)\biggr ). \end{aligned}$$
(3)

The lemma follows from (2) and (3). \(\square \)

Since Lemma 4.1 holds for every x, it also holds for the optimal mixed strategy \(\Pi ^*\).

Corollary 4.2

$$\begin{aligned} \sum _{t=1}^{\mu } I( \Pi ^t, \overline{R}_t) \ge -\ln n + \ln 2 \sum _{t=1}^{\mu } I(\Pi ^*, \overline{R}_t). \end{aligned}$$

A similar argument proves the following lemma, which states that the expected cost to Bob is not much worse than that by his best pure strategy.

Lemma 4.3

For every \(R \in \mathcal {R}\),

$$\begin{aligned} \displaystyle \sum _{t=1}^{\mu } I( \bar{x}_t, \Omega ^t) \le 2\ln m + \ln 4 \sum _{t=1}^{\mu } I(\bar{x}_t, R). \end{aligned}$$

Proof

Let \(\omega ^t(\mathcal {R})\) be the total weight of \(\mathcal {R}\) in the beginning of round t. Then by construction,

$$\begin{aligned} \omega ^{t+1}(\mathcal {R})&= \sum _{R \in \mathcal {R}} \omega ^t(R) \biggl (1- \frac{1}{2}\, I(\bar{x}_t, R)\biggr ) \nonumber \\&= \omega ^t(\mathcal {R}) \biggl (1 - \frac{1}{2} \sum _{R \in \mathcal {R}} \frac{\omega ^t(R)}{\omega ^t(\mathcal {R})}\, I(\bar{x}_t, R)\biggr ) \nonumber \\&= \omega ^t(\mathcal {R}) \biggl (1 - \frac{1}{2}\, I(\bar{x}_t, \Omega ^t)\biggr ) \le \omega ^t(\mathcal {R}) \exp \biggl (-\frac{1}{2}\, I(\bar{x}_t, \Omega ^t)\biggr ) \nonumber \\&\le m \exp \biggl (-\frac{1}{2} \sum _{i=1}^t I(\bar{x}_i, \Omega ^i)\biggr ). \end{aligned}$$
(4)

The last inequality follows because \(\omega ^1(\mathcal {R}) = m\). However, for every \(R \in \mathcal {R}\),

$$\begin{aligned} \omega ^{\mu +1} (\mathcal {R}) \ge \omega ^{\mu +1}(R) = \exp \biggl (-\ln 2 \sum _{t=1}^{\mu } I(\bar{x}_t, R)\biggr ). \end{aligned}$$
(5)

The lemma follows from (4) and (5). \(\square \)

The following lemma follows from the fact that the game Alice and Bob play is a zero-sum game, but we sketch a proof for completeness.

Lemma 4.4

$$\begin{aligned} \displaystyle \mathsf {E}\biggl [\, \sum _{t=1}^{\mu } I(\Pi ^t, \overline{R}_t) \biggr ] = \mathsf {E}\biggl [\, \sum _{t=1}^{\mu } I(\bar{x}_t, \Omega ^t) \biggr ], \end{aligned}$$

where the expectation is taken over the random sequence of points and ranges chosen in all rounds.

Proof

For \(t \ge 1\), let \(\mathcal {S}^t\) be the set of all point-range-pair sequences of length t, i.e.,

$$\begin{aligned} \mathcal {S}^t = \{ \langle (x_1, R_1), \ldots , (x_t, R_t)\rangle \ | \ x_i \in \mathsf {X}, R_i \in \mathcal {R}, 1\le i \le t\}. \end{aligned}$$

If we fix a sequence \(S \in \mathcal {S}^{t-1}\), then the distributions \(\Pi ^t, \Omega ^t\) are fixed, which we denote by \(\Pi ^t_{|S}, \Omega ^t_{|S}\).

$$\begin{aligned} \mathsf {E}[I(\Pi ^t, \overline{R}_t)]&= \sum _{S\in \mathcal {S}^{t-1}} \Pr (S) \sum _{R_t \in \mathcal {R}} \Pr (R_t | S)\, I(\Pi ^t_{|S}, R_t) \\&= \sum _{S\in \mathcal {S}^{t-1}} \Pr (S) \sum _{\begin{array}{c} x_t \in \mathsf {X}\\ R_t \in \mathcal {R} \end{array}} \Pr (x_t, R_t | S)\, I(x_t, R_t) \\&= \sum _{S\in \mathcal {S}^{t-1}} \Pr (S) \sum _{x_t \in \mathsf {X}} \Pr (x_t | S)\, I(x_t, \Omega ^t_{|S})\\&= \mathsf {E}[I(\bar{x}_t, \Omega ^t)]. \end{aligned}$$

The lemma now follows from linearity of expectation. \(\square \)

We are now ready to prove that the distribution \(\widetilde{\Pi }\) on \(\mathsf {X}\) implied by \(\widetilde{\mathsf {X}}\) is close to \(\Pi ^*\). For a range \(R \in \mathcal {R}\), let

$$\begin{aligned} \lambda _R = \frac{1}{\mu } \sum _{t=1}^{\mu } I(\bar{x}_t, R) = I(\widetilde{\Pi }, R) \end{aligned}$$

and

$$\begin{aligned} \tilde{\lambda }= \min _{R \in \mathcal {R}} \lambda _R. \end{aligned}$$

Note that \(\lambda _R\), for every \(R \in \mathcal {R}\), and \(\tilde{\lambda }\) are random variables.

Lemma 4.5

$$\begin{aligned} \mathrm{(i) }\quad&\mathsf {E}[\tilde{\lambda }] \ge \frac{\lambda ^*}{4}\ge \frac{1}{4k}, \text { and} \\ \mathrm{(ii) }\quad&\Pr [\tilde{\lambda }> \mathsf {E}[\tilde{\lambda }]/2] \ge \frac{1}{7}. \end{aligned}$$

Proof

Using Lemmas 4.3 and 4.4 and Corollary 4.2,

$$\begin{aligned} \mathsf {E}[\tilde{\lambda }]&\ge \frac{1}{\mu \ln 4} \,\mathsf {E}\biggl [\, \sum _{t=1}^{\mu } I(\bar{x}_t, \Omega ^t) \biggr ] - \frac{\ln m}{\mu \ln 2} ~ ~ ~ ~\text { (By Lemma } 4.3)\\&= \frac{1}{\mu \ln 4}\, \mathsf {E}\biggl [\, \sum _{t=1}^{\mu } I(\Pi ^t, \overline{R}_t) \biggr ] - \frac{\ln m}{\mu \ln 2} ~ ~ ~ ~\text { (By Lemma } 4.4)\\&\ge \frac{1}{\mu \ln 4}\, \mathsf {E}\biggl [\, \ln 2 \sum _{t=1}^{\mu } I(\Pi ^*, \overline{R}_t) - \ln n \biggr ] - \frac{\ln m}{\mu \ln 2}\\&\qquad \qquad \qquad \text{(By } \text{ Corollary }~4.2)\\&= \frac{1}{2\mu } \,\mathsf {E}\biggl [\, \sum _{t=1}^{\mu } I(\Pi ^*, \overline{R}_t) \biggr ] - \frac{1}{\mu \ln 4} \ln (m^2 n). \end{aligned}$$

However,

$$\begin{aligned} I(\Pi ^*, \overline{R}_t) \ge \min _{R \in \mathcal {R}} I(\Pi ^*, R) = \lambda ^* \end{aligned}$$

and

$$\begin{aligned} \mu = \frac{2}{\ln 2}\, k \ln (m^2 n). \end{aligned}$$

Therefore,

$$\begin{aligned} \mathsf {E}[\tilde{\lambda }] \ge \frac{\lambda ^*}{2} - \frac{1}{4k} \ge \frac{1}{4k} \end{aligned}$$

because \(\lambda ^* \ge 1/k\). This proves part (i) of the lemma.

We now prove (ii). By (1),

$$\begin{aligned} \tilde{\lambda }= \min _{R\in \mathcal {R}} \lambda _R = \min _{R \in \mathcal {R}} I(\widetilde{\Pi }, R) \le \lambda ^*. \end{aligned}$$

Also, according to part (i),

$$\begin{aligned} \mathsf {E}[\tilde{\lambda }] \ge \frac{\lambda ^*}{4}. \end{aligned}$$

Let \(\mathsf {p}= \Pr [ \tilde{\lambda }\le \frac{1}{2}\, \mathsf {E}[\tilde{\lambda }]]\). Then

$$\begin{aligned} \mathsf {E}[\tilde{\lambda }] \le \frac{\mathsf {p}}{2}\, \mathsf {E}[\tilde{\lambda }] + \lambda ^* (1-\mathsf {p}) \le \frac{\mathsf {p}}{2} \,\mathsf {E}[\tilde{\lambda }] + 4 \mathsf {E}[\tilde{\lambda }] (1-\mathsf {p}), \end{aligned}$$

and, dividing by \(\mathsf {E}[\tilde{\lambda }]\) and rearranging, we obtain \(1 \le \mathsf {p}/2 + 4(1-\mathsf {p})\), i.e., \(\mathsf {p}\le 6/7\). This proves part (ii) of the lemma. \(\square \)

Lemma 4.5 implies that, with constant probability, \(\tilde{\lambda }> \tfrac{1}{8k}\), i.e., every range is \(\tfrac{1}{8k}\)-heavy with respect to \(\widetilde{\Pi }\). Thus, the algorithm indeed returns a hitting set of \(\Sigma \).

Fast implementation and running time We perform two preprocessing steps to expedite the algorithm. In the first preprocessing step, as in Sect. 3, we compute a \(\tfrac{1}{k}\)-net \(\mathsf {Q}_0\) of \(\Sigma \), set \(\mathsf {X}= \mathsf {X}\setminus \mathsf {Q}_0\) and \(\mathcal {R}= \{ R \mid R\cap \mathsf {Q}_0 = \varnothing \}\), i.e., remove the ranges hit by \(\mathsf {Q}_0\). In the second preprocessing step, we choose a set \(\mathsf {Q}_1 \subseteq \mathsf {X}\) of O(k) points so that any point in \(\mathsf {X}\setminus \mathsf {Q}_1\) lies in at most m / k ranges of \(\{ R \mid R\cap \mathsf {Q}_1 = \varnothing \}\). Since \(\Sigma \) has a hitting set of size at most k, such a set \(\mathsf {Q}_1\) always exists. We now set \(\mathsf {X}= \mathsf {X}\setminus \mathsf {Q}_1\), \(\mathcal {R}= \{ R \mid R \cap \mathsf {Q}_1 = \varnothing \}\). After the preprocessing, each point of \(\mathsf {X}\) lies in at most m / k ranges, and each range of \(\mathcal {R}\) contains at most n / k points. The algorithm returns \(\mathsf {Q}_0 \cup \mathsf {Q}_1 \cup \mathsf {Q}\) as the hitting set of \(\Sigma \).

For this algorithm, we need the net finder of (P1), the range-reporting data structure of (P3) and two other procedures:

  1. (P4)

    A dual range-reporting data structure that reports all s ranges containing a query point in \(O(\tau (m)+s)\) time, and that can be constructed in \(O(m\tau (m))\) time.

  2. (P5)

    An algorithm that performs the second preprocessing step in \(\varphi (N)\) time.

Finally, we build a balanced binary search tree \(\mathcal {T}_{\mathsf {X}}\) on \(\mathsf {X}\) and another one \(\mathcal {T}_{\mathcal {R}}\) on \(\mathcal {R}\) so that the weights of a point in \(\mathsf {X}\) and of a range in \(\mathcal {R}\) can be updated in \(O(\log n)\) and \(O(\log m)\) time, respectively, and so that a random element of \(\mathsf {X}\) or \(\mathcal {R}\) can be chosen within the same time; see e.g. [3].
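A self-contained sketch of such a structure (a complete binary tree over the weights, i.e., a tournament tree rather than the balanced BST of [3], but with the same \(O(\log n)\) update and sampling cost):

```python
import random

class WeightedSampler:
    def __init__(self, weights):
        self.n = len(weights)
        self.size = 1
        while self.size < self.n:
            self.size *= 2
        self.tree = [0.0] * (2 * self.size)   # internal nodes store subtree sums
        for i, w in enumerate(weights):
            self.update(i, w)
    def update(self, i, new_weight):          # set the weight of element i
        j = i + self.size
        self.tree[j] = new_weight
        j //= 2
        while j:
            self.tree[j] = self.tree[2 * j] + self.tree[2 * j + 1]
            j //= 2
    def sample(self):                         # draw i with probability w_i / w(all)
        r = random.random() * self.tree[1]
        j = 1
        while j < self.size:                  # descend toward the sampled leaf
            if r < self.tree[2 * j]:
                j = 2 * j
            else:
                r -= self.tree[2 * j]
                j = 2 * j + 1
        return j - self.size

# Doubling the weight of element i: s.update(i, 2 * s.tree[i + s.size]).
```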

In round i, we report all points of \(\overline{R}_i\) in \(O(\tau (n) + n/k)\) time using (P3) and update the weights of these points in the binary tree \(\mathcal {T}_{\mathsf {X}}\) in \(O\bigl (\tfrac{n}{k} \log n\bigr )\) time. Similarly we report the ranges of \(\mathcal {R}\) that contain \(\bar{x}_i\) in \(O(\tau (m) + m/k)\) time using (P4) and update their weights in \(\mathcal {T}_{\mathcal {R}}\) in \(O\bigl (\tfrac{m}{k} \log m\bigr )\) time. Hence, round i takes \(O(\tau (N) + k^{-1} N\log (N))\) time. Summing over all \(\mu \) rounds, adding the preprocessing time, the algorithm takes \(O(\varphi (N) + k \log (N) \tau (N) + N\log ^2 (N)) = O(\varphi (N) + N\log (N) \tau (N))\) time. Since the algorithm succeeds with probability at least 1/7, it is repeated O(1) expected times.

We perform backward exponential search on the value of k as follows. Initially, we set \(k = n\). Given the current value of k, we run the algorithm \(\Theta (\log N)\) times. If the algorithm returns a hitting set \(\mathsf {H}\), we remember \(\mathsf {H}\), halve the value of k, and continue. Otherwise, we stop and return the smallest hitting set remembered. At most \(\log n\) values of k are examined during the backward exponential search. If \(k \ge \kappa \), the expected number of times the algorithm is run for a fixed value of k is O(1) because each iteration returns a valid hitting set with constant probability. If \(k < c\kappa /g(\kappa )\) for a suitable constant c, then \(O(k g(k)) < \kappa \), so no hitting set of that size exists, and the backward search terminates after the \(\Theta (\log N)\) iterations. For \(k \in [c\kappa /g(\kappa ), \kappa )\), a hitting set may be found at the end of the \(\Theta (\log N)\) iterations in the worst case. Thus, the algorithm is repeated \(O(\log (N) \log (g(\kappa )))\) times in the worst case, for this range of k.
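In outline (ours; run_once(k) denotes one execution of the Monte Carlo algorithm for a fixed k, returning a hitting set or None, and the trial count stands in for the \(\Theta (\log N)\) repetitions):

```python
import math

def backward_search(n, N, run_once):
    trials = 4 * max(1, math.ceil(math.log2(N)))
    k, best = n, None
    while k >= 1:
        found = None
        for _ in range(trials):
            found = run_once(k)
            if found is not None:
                break
        if found is None:                  # no hitting set found for this k: stop
            return best
        if best is None or len(found) < len(best):
            best = found                   # remember the smallest hitting set
        k //= 2
    return best
```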

Unlike in Sect. 3, the weights of points can become quite large and those of ranges quite small, so the algebraic complexity of the algorithm can be large if we store all weights explicitly. One possible remedy is to periodically scale down all the weights by a factor of 2 and set the minimum weight to 1, i.e., remove the points whose weights drop below 1. If we keep the ratio of the maximum and minimum weights polynomial (e.g., \(N^4\)), the approximation of the small weights has negligible impact on the correctness analysis. We conclude the following.

Theorem 4.6

Let \(\Sigma = (\mathsf {X}, \mathcal {R})\) be a finite range space with \(|\mathsf {X}| + |\mathcal {R}| = N\).

  1. (i)

If procedures (P1) and (P3)–(P5) exist for \(\Sigma \), then an \(O(g(\kappa ))\)-approximate hitting set of \(\Sigma \) can be computed in expected time \(O( (\varphi (N) + N\tau (N) \log N)\log N\log (g(\kappa )))\), with probability at least \(1-\frac{1}{N^{\Omega (1)}}\).

  2. (ii)

If procedures (P1) and (P3)–(P5) exist for \(\Sigma ^{\perp }\), then an \(O(g(\chi ))\)-approximate set cover of \(\Sigma \) can be computed in expected time \(O( (\varphi (N) + N\tau (N) \log N)\log N\log (g(\chi )))\), with probability at least \(1-\frac{1}{N^{\Omega (1)}}\).

Remarks

(i) The algorithm described here can be viewed as a “kinder and gentler” version of the greedy algorithm. A point lying in many “uncovered” ranges is likely to have higher weight and thus a higher probability of being chosen. Instead of removing the ranges containing a chosen point, we simply halve their weights. By the time the algorithm stops, the weight of every range is very small.

(ii) It is worth contrasting this algorithm with the previous one. This one is simpler and does not require a dynamic data structure for range counting to verify whether a range is light. Also, this algorithm is slightly faster for some geometric instances with one less logarithmic factor in the running time. However, it requires a range-reporting data structure for both \(\Sigma \) and \(\Sigma ^{\perp }\), and an additional preprocessing step to compute \(\mathsf {Q}_1\) so that no point lies in more than m / k ranges.

(iii) This algorithm falls under the primal-dual framework for solving linear programs (LPs). A very similar approach was used by Koufogiannakis and Young [31] to give a near-linear-time PTAS for fractional packing and covering LPs. Their running time depends on the number of non-zero elements in the constraint matrix, which can be quadratic for geometric covering problems. We achieve near-linear running time using geometric data structures and careful preprocessing.

5 Fast Algorithms for Geometric Instances

In this section, we show that the algorithms described in Sects. 3 and 4 yield near-linear hitting-set and set-cover algorithms for a number of geometric range spaces, for which the required procedures (P1)–(P5) can be performed efficiently. Let \(\mathsf {X}\) be a set of n points in \(\mathbb {R}^d\) and \(\mathcal {R}\) a set of m geometric shapes such as rectangles, balls, and simplices. As mentioned in the introduction, we will use \(R \in \mathcal {R}\) to denote a shape as well as the subset \(\mathsf {X}\cap R\). It will be clear from the context which of the two we are referring to.

We mainly describe the implementations of the second algorithm, which require procedures (P1) and (P3)–(P5) and give the desired approximation ratios with high probability. One can also use the first algorithm, which may add a logarithmic factor in the running time in some cases, but it always returns a hitting set or set cover with the approximation ratio stated in the corresponding theorems.

Rectangles in 2D and 3D Let \(\mathcal {R}\) be a set of m axis-parallel rectangles in \(\mathbb {R}^d\) for \(d=2,3\). Aronov et al. [6] have shown that an \(\varepsilon \)-net of \(\Sigma \) of size \(O( (1/\varepsilon ) \log \log (1/\varepsilon ))\) can be constructed in \(O(n \log ^{d-1} n)\) randomized expected time, thus \(g(\kappa ) = \log \log (\kappa )\) and \(\varphi (N) = O(N\log ^{d-1} N)\) in procedure (P1).

For the range-reporting data structure in (P3), we construct a d-dimensional range tree on \(\mathsf {X}\) in \(O( n \log ^{d-1} n)\) time, after which a range-reporting query can be answered in \(O(\log ^{d-1} n + s)\) time, where s is the output size. For (P4), we need a dual range-reporting data structure that reports all rectangles of \(\mathcal {R}\) containing a query point. This can be done in \(O(\log ^{d-1} m + s)\) time, where s is the output size, after \(O(m \log ^{d-1} m)\) preprocessing [15]. Hence, \(\tau (N) = O(\log ^{d-1} N)\).

Finally, we also need an algorithm for (P5) that computes a set \(\mathsf {Q}_1\) of O(k) points so that no point in \(\mathsf {X}\setminus \mathsf {Q}_1\) lies in more than m / k rectangles of \(\{ R\in \mathcal {R}\mid R\cap \mathsf {Q}_1 = \varnothing \}\). This can be accomplished using a d-dimensional segment tree on \(\mathcal {R}\), which can be constructed in \(O(m \log ^d m)\) time. For each point \(p \in \mathsf {X}\), the segment tree is queried in \(O(\log ^d m)\) time to check whether p lies in more than m / k rectangles of \(\mathcal {R}\). If so, we add p to \(\mathsf {Q}_1\) and delete all the rectangles of \(\mathcal {R}\) that contain p from the segment tree. Since each point added to \(\mathsf {Q}_1\) removes more than m / k rectangles, at most k points are added, and the time spent is \(\varphi (N) = O( N\log ^d N)\).
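The following naive Python sketch mirrors this computation with exact stabbing counts in place of the segment tree, so it runs in quadratic rather than near-linear time; it is for exposition only.

```python
def compute_Q1_rectangles(points, rects, k):
    """points: list of (x, y); rects: list of (xlo, xhi, ylo, yhi)."""
    def inside(p, r):
        return r[0] <= p[0] <= r[1] and r[2] <= p[1] <= r[3]
    live, Q1 = list(rects), []
    for p in points:
        stabbed = [r for r in live if inside(p, r)]
        if len(stabbed) > len(rects) / k:     # p is too deep: add it to Q1 ...
            Q1.append(p)
            live = [r for r in live if not inside(p, r)]  # ... and drop its rectangles
    return Q1, live   # every remaining point lies in at most m/k live rectangles
```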

Plugging the bounds of \(\tau (\cdot ), g(\cdot )\) and \(\varphi (\cdot )\) in Theorem 4.6, we obtain the following.

Theorem 5.1

Let \(\mathsf {X}\) be a set of n points in \(\mathbb {R}^d\) and \(\mathcal {R}\) a set of m rectangles in \(\mathbb {R}^d\), for \(d = 2, 3\), with \(|\mathsf {X}| + |\mathcal {R}| = N\). An \(O(\log \log \kappa )\)-approximate hitting set of \((\mathsf {X}, \mathcal {R})\) can be computed in \(O( N\log ^{d+1} N\log \log \log \kappa )\) expected time, with probability at least \(1-\frac{1}{N^{\Omega (1)}}\).

For the set-cover problem, the \(\varepsilon \)-net of \(\Sigma ^{\perp }\) required in (P1) can be computed simply by choosing a random sample of size \(O\bigl (\tfrac{1}{\varepsilon } \log \tfrac{1}{\varepsilon }\bigr )\), which is an \(\varepsilon \)-net with probability at least 1/2 [30]. Pach and Tardos [40] showed that the size of an \(\varepsilon \)-net for \(\Sigma ^{\perp }\), even for \(d = 2\), is \(\Omega \bigl (\frac{1}{\varepsilon } \log \frac{1}{\varepsilon }\bigr )\) in the worst case. The procedures (P3) and (P4) remain the same as (P4) and (P3) in the hitting-set case. The set \(\mathsf {Q}_1 \subseteq \mathcal {R}\) in procedure (P5) can be computed in a similar way, except that a d-dimensional range tree, instead of a segment tree, is used. The running time is \(\varphi (N) = O(N\log ^d N)\). Hence, we obtain the following.

Theorem 5.2

Let \(\mathsf {X}\) be a set of points in \(\mathbb {R}^d\) and \(\mathcal {R}\) a set of rectangles in \(\mathbb {R}^d\), for \(d = 2, 3\), with \(|\mathsf {X}| + |\mathcal {R}| = N\). An \(O(\log \chi )\)-approximate set cover of \((\mathsf {X}, \mathcal {R})\) can be computed in \(O( N\log ^{d+1} N\log \log \chi )\) expected time, with probability at least \(1-\frac{1}{N^{\Omega (1)}}\).

Halfspaces in 3D and disks in 2D Let \(\mathsf {X}\) be a set of n points and \(\mathcal {R}\) a set of m halfspaces in \(\mathbb {R}^3\). Building on the algorithm by Matoušek [36], Chan and Tsakalidis [14] have described an \(O\bigl (n \log \frac{1}{\varepsilon }\bigr )\)-time deterministic algorithm to compute an \(\varepsilon \)-net of size \(O(1/\varepsilon )\) for the range space \(\Sigma = (\mathsf {X}, \mathcal {R})\). Therefore \(g(\cdot ) = 1\) and \(\varphi (N) = O(N\log N)\) in (P1).

Using the data structure by Afshani and Chan [1], the halfspace range-reporting query in (P3) can be answered in \(O(\log n + s)\) time, after \(O(n \log n)\) expected-time preprocessing. By duality transform, the dual range-reporting query in (P4) can be answered using the same data structure in \(O(\log m + s)\) time, after \(O(m \log m)\) preprocessing. Hence \(\tau (n) = O(\log n)\).

We describe an algorithm for computing \(\mathsf {Q}_1\) in the preprocessing step of the second algorithm. The algorithm is based on constructing shallow cuttings of a set of hyperplanes, as defined by Matoušek [36]. Given parameters \(\ell , r \in [1, n]\), an \(\ell \)-shallow \(\frac{1}{r}\)-cutting of a collection \(\mathcal {H}\) of n hyperplanes in \(\mathbb {R}^d\) is a set of interior-disjoint simplices such that the interior of every simplex intersects at most \(\frac{n}{r}\) hyperplanes of \(\mathcal {H}\), and the union of the simplices covers all points that lie above \(\ell \) or fewer hyperplanes of \(\mathcal {H}\).

The algorithm works in \(\lceil \log k \rceil \) rounds. At the beginning of round i, we have a set \(\mathsf {X}_i \subseteq \mathsf {X}\) of points and a subset \(\mathcal {R}_i \subseteq \mathcal {R}\) of halfspaces. Initially, in round 0, \(\mathsf {X}_0 = \mathsf {X}\) and \(\mathcal {R}_0 = \mathcal {R}\). The maximum depth of a point in \(\mathsf {X}_i\) is at most \(\ell _i = |\mathcal {R}_{i-1}|/2^i\) (with the convention \(\mathcal {R}_{-1} = \mathcal {R}\)). We set \(r_i = 2^{i+1}\) and compute an \(\ell _i\)-shallow \((1/r_i)\)-cutting \(\Xi _i\) of the halfspaces of \(\mathcal {R}_i\); \(|\Xi _i| = O(2^i)\). Since the maximum depth of a point of \(\mathsf {X}_i\) is \(\ell _i\), each point of \(\mathsf {X}_i\) lies in a simplex of \(\Xi _i\). For each simplex \(\Delta \in \Xi _i\), if \(\mathsf {X}_i \cap \Delta \ne \varnothing \), we choose one point from \(\Delta \) and add it to \(\mathsf {Q}_1\). Let \(P_i\) be the set of points chosen in round i, and let \(\overline{\mathcal {R}}_i = \{ R \mid R\cap P_i \ne \varnothing \}\). We set \(\mathsf {X}_{i+1} = \mathsf {X}_i \setminus P_i\) and \(\mathcal {R}_{i+1} = \mathcal {R}_i \setminus \overline{\mathcal {R}}_i\). Since each simplex \(\Delta \in \Xi _i\) intersects the boundary of at most \(|\mathcal {R}_i|/2^{i+1}\) halfspaces, which are the only possible halfspaces remaining in \(\mathcal {R}_{i+1}\) that can cover points in \(\Delta \), the maximum depth of a point in \(\mathsf {X}_{i+1}\) is at most \(|\mathcal {R}_i|/2^{i+1} \le m/ 2^{i+1}\) in \(\mathcal {R}_{i+1}\).

By construction, \(|\mathsf {Q}_1| = \sum _i O(2^i) = O(k)\). We now analyze the running time of the above algorithm. Using the algorithm by Chan and Tsakalidis [14], \(\Xi _i\) can be computed in \(O(|\mathcal {R}_i| \log r_i) = O(m\cdot i)\) time. A property of their algorithm is that each cell in \(\Xi _i\) is a semi-bounded vertical prism that is unbounded from below and whose boundary consists of vertical walls and a top triangle. Furthermore, the upper boundary of these prisms forms an xy-monotone triangulated surface. The xy-projection of the upper boundary is a triangulation \(\Xi _i^*\) of \(\mathbb {R}^2\). By preprocessing \(\Xi _i^*\) for planar point-location queries (see e.g. [22]), in time \(O(r_i \log r_i)\), we can determine in \(O(\log r_i) = O(i)\) time whether a query point \(q \in \mathbb {R}^3\) lies in a prism of \(\Xi _i\), and also the prism of \(\Xi _i\) that contains q if the answer is yes. Thus, \(P_i\) can be computed by going through every point \(q \in \mathsf {X}_i\) and adding q to \(P_i\) if it is the first point visited in some simplex of \(\Xi _i\). The time for computing \(P_i\) is \(O(m \cdot i + n \cdot i)\). Finally, \(\overline{\mathcal {R}}_i\) is computed using \(|\mathcal {R}_i|\) range-emptiness queries on \(P_i\), each taking \(O(\log |\mathcal {R}_i|)\) time. Therefore, summing over all \(\lceil \log k \rceil \) rounds, the expected time for computing \(\mathsf {Q}_1\) is \(O(N\log ^2 N)\).
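In outline, the rounds can be sketched as follows (ours); shallow_cutting and locate are assumed black boxes standing in for the Chan–Tsakalidis construction and the point-location structure, and halfspace membership is written with Python's in operator.

```python
import math

def compute_Q1_halfspaces(X, R, k, shallow_cutting, locate):
    X_i, R_i, Q1 = list(X), list(R), []
    for i in range(math.ceil(math.log2(k))):
        ell = len(R_i) / 2 ** i                           # depth bound ell_i
        cells = shallow_cutting(R_i, ell, 2 ** (i + 1))   # ell-shallow (1/r_i)-cutting
        P_i, seen = [], set()
        for p in X_i:
            c = locate(cells, p)           # cell of the cutting containing p
            if c is not None and c not in seen:
                seen.add(c)                # first point seen in this cell
                P_i.append(p)
        Q1.extend(P_i)
        hit = set(P_i)
        X_i = [p for p in X_i if p not in hit]
        R_i = [h for h in R_i if not any(q in h for q in hit)]  # drop hit halfspaces
    return Q1
```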

Lemma 5.3

A set \(\mathsf {Q}_1\) of size O(k) required by \(\mathrm{(P5)}\) can be computed in \(O(N\log ^2 N)\) time.

Hence \(\varphi (N) = O( N\log ^2 N)\) in (P5). Using the duality transform, the set-cover problem in the primal space becomes the hitting-set problem in the dual space. Plugging the bounds in Theorem 4.6, we obtain the following.

Theorem 5.4

Let \(\mathsf {X}\) be a set of points in \(\mathbb {R}^3\) and \(\mathcal {R}\) a set of halfspaces in \(\mathbb {R}^3\), with \(|\mathsf {X}| + |\mathcal {R}| = N\). An O(1)-approximate hitting set or set cover of \((\mathsf {X}, \mathcal {R})\) can be computed in \(O( N\log ^3 N)\) expected time, with probability at least \(1-\frac{1}{N^{\Omega (1)}}\).

By a standard lifting transform [22], we have the following result for disks in \(\mathbb {R}^2\).

Corollary 5.5

Let \(\mathsf {X}\) be a set of points in \(\mathbb {R}^2\) and \(\mathcal {R}\) a set of disks in \(\mathbb {R}^2\), with \(|\mathsf {X}| + |\mathcal {R}| = N\). An O(1)-approximate hitting set or set cover of \((\mathsf {X}, \mathcal {R})\) can be computed in \(O( N\log ^3 N)\) expected time, with probability at least \(1-\frac{1}{N^{\Omega (1)}}\).

Fat triangles Let \(\mathsf {X}\subseteq \mathbb {R}^2\) be a set of n points and \(\mathcal {R}\) a set of \(\alpha \)-fat triangles in \(\mathbb {R}^2\), for some constant \(\alpha \ge 1\), i.e., the aspect ratio of every triangle in \(\mathcal {R}\) is at most \(\alpha \). It is well known that \(\Sigma = (\mathsf {X}, \mathcal {R})\) has constant VC-dimension and by the \(\varepsilon \)-net theorem [30], an \(\varepsilon \)-net of \(\Sigma \) in procedure (P1) can be obtained by randomly sampling \(O\bigl (\tfrac{1}{\varepsilon } \log \tfrac{1}{\varepsilon }\bigr )\) points from \(\mathsf {X}\). A range-reporting query (P3) for \(\Sigma \) can be answered in \(O(\log ^3 n + s)\) time, after \(O(n \log ^3 n)\) preprocessing [43, Sect. 4.5], and a dual range-reporting query (P4) for \(\Sigma \) can also be answered in \(O(\log ^3 m +s)\) time [2]. The set \(\mathsf {Q}_1\) in (P5) for \(\Sigma \) can be computed using the same method as that for 3D halfspaces described in Lemma 5.3. We use the results in [3, Lems. 2.1 and 2.2] for computing the shallow cutting and point location data structure, and the running time is \(O(N\log ^3N\log ^* N)\) using the improved bound on the union of fat triangles [5]. Thus, \(g(1/\varepsilon ) = \log (1/\varepsilon )\), \(\tau (N) = O(\log ^3 N)\), \(\varphi (N) = O(N\log ^3N\log ^*N)\), and we obtain the following.

Theorem 5.6

Let \(\mathsf {X}\) be a set of points in \(\mathbb {R}^2\) and \(\mathcal {R}\) a set of \(\alpha \)-fat triangles, with \(|\mathsf {X}| + |\mathcal {R}| = N\). An \(O(\log \kappa )\)-approximate hitting set of \((\mathsf {X}, \mathcal {R})\) can be computed in \(O(N\log ^5 N\log \log \kappa )\) expected time, with probability at least \(1-\frac{1}{N^{\Omega (1)}}\).

For the set cover problem, an \(\varepsilon \)-net is computed by randomly sampling \(O\bigl (\tfrac{1}{\varepsilon } \log \tfrac{1}{\varepsilon }\bigr )\) triangles from \(\mathcal {R}\), since the dual range space \(\Sigma ^{\perp }\) also has constant VC-dimension.
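In both the primal and the dual case, the \(\varepsilon \)-net computation is thus plain uniform sampling; a minimal sketch, where the constant C is an illustrative placeholder that must grow with the VC-dimension for the \(\varepsilon \)-net theorem [30] to apply:

```python
import math
import random

def random_eps_net(objects, eps, C=8):
    # Sample O((1/eps) log (1/eps)) objects uniformly at random; by the
    # eps-net theorem this is an eps-net with constant probability.
    size = math.ceil((C / eps) * math.log(2.0 / eps))
    return random.sample(objects, min(size, len(objects)))
```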

Next, we describe an algorithm for computing \(\mathsf {Q}_1 \subseteq \mathcal {R}\) for \(\Sigma ^{\perp }\), i.e., each range in \(\mathcal {R}\setminus \mathsf {Q}_1\) contains at most \(n/k\) points of \(\mathsf {X}\setminus \bigcup _{R \in \mathsf {Q}_1} R\). \(\mathsf {Q}_1\) is constructed in \(\lceil \log _2 k \rceil \) steps. At the beginning of step i, we have a set \(\mathcal {R}_i \subseteq \mathcal {R}\) of triangles and a set \(\mathsf {X}_i \subseteq \mathsf {X}\) of points such that for any \(\Delta \in \mathcal {R}_i\), \(|\Delta \cap \mathsf {X}_i| \le \frac{n}{2^{i-1}}\). Initially, for \(i = 1\), \(\mathsf {X}_1 = \mathsf {X}\) and \(\mathcal {R}_1 = \mathcal {R}\). Step i chooses a subset \(\mathsf {T}_i \subseteq \mathcal {R}_i\) of \(O(2^i)\) triangles such that each triangle of \(\mathsf {T}_i\) contains \(\Omega \bigl (\frac{n}{2^i}\bigr )\) points of \(\mathsf {X}_i\), and no triangle of \(\mathcal {R}_{i+1} = \mathcal {R}_i \setminus \mathsf {T}_i\) contains more than \(\frac{n}{2^i}\) points of \(\mathsf {X}_{i+1} = \mathsf {X}_i \setminus \bigcup _{\Delta \in \mathsf {T}_i} \Delta \). We construct a (9/8)-approximate range-counting data structure on \(\mathsf {X}_i\) with respect to triangles in \(\mathcal {R}\). That is, for a triangle \(\Delta \), it returns a value \(\mu _{\Delta }\) such that

$$\begin{aligned} |\mathsf {X}_i \cap \Delta | \le \mu _{\Delta } \le \frac{9}{8}\, |\mathsf {X}_i \cap \Delta |. \end{aligned}$$

Such a data structure can be built in \(O(n \log ^3 n)\) time by plugging the range-emptiness data structure [43] as a black box into the data structure of [7, Sect. 5], and it can answer a query in \(O(\log ^4 n)\) time. We also preprocess \(\mathsf {X}_i\) for range-reporting queries with respect to \(\mathcal {R}\); this data structure can also be built in \(O(n \log ^3 n)\) time and answers a query in \(O(\log ^3 n + t)\) time, where t is the output size.

We also maintain a set \(\mathsf {Y}_i \subset \mathsf {X}_i\) in step i. Initially \(\mathsf {Y}_i = \varnothing \), and points are added to \(\mathsf {Y}_i\) as step i progresses. We maintain a (9/8)-approximate range-counting data structure on \(\mathsf {Y}_i\) with respect to \(\mathcal {R}\). Using the so-called logarithmic method [9] on the static range-counting data structure mentioned above, a point can be inserted in \(O(\log ^4 n)\) amortized time and a query can be answered in \(O(\log ^5 n)\) time.
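For concreteness, a generic sketch of the logarithmic method follows, assuming the static structure is a black box exposing build(points) and count(range). Since every level answers within a factor of 9/8 of its true count, the sum over the levels is again a (9/8)-approximation.

```python
class LogMethod:
    def __init__(self, build_static):
        self.build_static = build_static  # black-box static structure factory
        self.levels = []                  # levels[j]: None or (points, structure)

    def insert(self, p):
        # Binary-counter cascade: level j holds exactly 2^j points when occupied;
        # merging occupied levels gives O(log n) amortized rebuilds per insertion.
        carry, j = [p], 0
        while j < len(self.levels) and self.levels[j] is not None:
            carry += self.levels[j][0]
            self.levels[j] = None
            j += 1
        if j == len(self.levels):
            self.levels.append(None)
        self.levels[j] = (carry, self.build_static(carry))

    def count(self, query_range):
        # Per-level multiplicative errors are preserved under summation.
        return sum(lvl[1].count(query_range)
                   for lvl in self.levels if lvl is not None)
```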

We are now ready to describe step i. We process the triangles of \(\mathcal {R}_i\) one by one. For each \(\Delta \in \mathcal {R}_i\), we query the two range-counting data structures with \(\Delta \); suppose they return \(\mu _{\Delta }\) and \(\nu _{\Delta }\) as (9/8)-approximate values of \(|\Delta \cap \mathsf {X}_i|\) and \(|\Delta \cap \mathsf {Y}_i|\), respectively. If \(\mu _{\Delta } - \nu _{\Delta } \ge \frac{n}{2^{i+1}}\), we add \(\Delta \) to \(\mathsf {T}_i\), report \(\Delta \cap \mathsf {X}_i\), and insert each point of \(\Delta \cap \mathsf {X}_i\) into \(\mathsf {Y}_i\). (Note that a point p may be reported multiple times, once for each triangle of \(\mathsf {T}_i\) containing it; we keep only one copy of p in \(\mathsf {Y}_i\).)
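A minimal sketch of step i is given below; X_count, Y_count, and X_report are hypothetical handles to the two (9/8)-approximate counting structures and the reporting structure (Y_count being insertion-only, e.g. via the logarithmic-method sketch above), and a hash set suppresses duplicate insertions.

```python
def step_i(triangles_i, X_count, X_report, Y_count, n, i):
    threshold = n / 2 ** (i + 1)
    T_i, rest, covered = [], [], set()
    for tri in triangles_i:
        mu = X_count.approx_count(tri)        # (9/8)-approx. of |tri ∩ X_i|
        nu = Y_count.approx_count(tri)        # (9/8)-approx. of |tri ∩ Y_i|
        if mu - nu >= threshold:
            T_i.append(tri)
            for p in X_report.report(tri):    # report tri ∩ X_i
                if p not in covered:          # keep one copy of p in Y_i
                    covered.add(p)
                    Y_count.insert(p)
        else:
            rest.append(tri)
    # rest is R_{i+1}; X_{i+1} is X_i minus the covered points.
    return T_i, rest, covered
```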

After \(\lceil \log _2 k \rceil \) steps, we return \(\bigcup _i \mathsf {T}_i\) as \(\mathsf {Q}_1\). We now prove the correctness of the procedure and analyze its running time.

Fix a triangle \(\Delta \in \mathcal {R}_i\). In what follows, we refer to \(\mathsf {Y}_i\) as the set just before \(\Delta \) was processed. We observe that \(\mu _{\Delta } - \nu _{\Delta } \ge \frac{n}{2^{i+1}}\) implies that

$$\begin{aligned} |\Delta \cap (\mathsf {X}_i \setminus \mathsf {Y}_i)| \ge \frac{n}{2^{i+1}} - \frac{1}{8}\, |\Delta \cap \mathsf {X}_i|. \end{aligned}$$

By the invariant maintained by the algorithm, \(|\Delta \cap \mathsf {X}_i| \le \frac{n}{2^{i-1}}\). Therefore

$$\begin{aligned} |\Delta \cap (\mathsf {X}_i \setminus \mathsf {Y}_i)| \ge \frac{n}{2^{i+1}} - \frac{n}{2^{i+2}} = \frac{n}{2^{i+2}}. \end{aligned}$$

Hence \(\Delta \) contains at least \(\frac{n}{2^{i+2}}\) “new” points of \(\mathsf {X}_i\), i.e., points not already in \(\mathsf {Y}_i\). All of these points are inserted into \(\mathsf {Y}_i\) when \(\Delta \) is added to \(\mathsf {T}_i\), and a point is never deleted from \(\mathsf {Y}_i\). Since \(|\mathsf {Y}_i| \le n\) at all times, at most \(2^{i+2}\) triangles can each contribute \(\frac{n}{2^{i+2}}\) new points, so \(|\mathsf {T}_i| = O(2^i)\). Furthermore,

$$\begin{aligned} |\Delta \cap \mathsf {X}_i| \le \frac{n}{2^{i-1}} \le 8 |\Delta \cap (\mathsf {X}_i \setminus \mathsf {Y}_i)|. \end{aligned}$$

Summing over all triangles of \(\mathsf {T}_i\), \(\sum _{\Delta \in \mathsf {T}_i} |\Delta \cap \mathsf {X}_i| = O(n)\). The total time spent in step i on reporting the sets \(\Delta \cap \mathsf {X}_i\) and inserting their points into the dynamic data structure is thus \(O(n \log ^4 n)\). For each triangle, we spend \(O(\log ^5 n)\) time computing \(\mu _{\Delta }\) and \(\nu _{\Delta }\). Summing over all triangles of \(\mathcal {R}_i\) and over all \(i \le \lceil \log _2 k \rceil \), the total time spent in computing \(\mathsf {Q}_1\) is \(O(N\log ^6 N)\).

Finally, if \(\Delta \notin \mathsf {T}_i\), i.e., \(\mu _{\Delta } - \nu _{\Delta } < \frac{n}{2^{i+1}}\), then

$$\begin{aligned} |\Delta \cap \mathsf {X}_{i+1}| \le |\Delta \cap (\mathsf {X}_i \setminus \mathsf {Y}_i)| \le \frac{n}{2^{i+1}} + \frac{1}{8} \,|\mathsf {Y}_i \cap \Delta | \le \frac{n}{2^{i+1}} + \frac{1}{8} \cdot \frac{n}{2^{i-1}} < \frac{n}{2^i}. \end{aligned}$$

Therefore, \(|\Delta \cap \mathsf {X}_{i+1}| \le \frac{n}{2^i}\), for all \(\Delta \in \mathcal {R}_{i+1}\), as desired.

Hence, after \(u = \lceil \log _2 k \rceil \) steps,

$$\begin{aligned} |\Delta \cap \mathsf {X}_{u+1}| \le \frac{n}{2^u} \le \frac{n}{k}. \end{aligned}$$

Putting everything together, we conclude the following:

Theorem 5.7

Let \(\mathsf {X}\) be a set of points in \(\mathbb {R}^2\) and \(\mathcal {R}\) a set of \(\alpha \)-fat triangles, with \(|\mathsf {X}| + |\mathcal {R}| = N\). An \(O(\log \chi )\)-approximate set cover of \((\mathsf {X}, \mathcal {R})\) can be computed in \(O(N\log ^7 N\log \log \chi )\) expected time, with probability at least \(1-\frac{1}{N^{\Omega (1)}}\).

Remarks

Better bounds on the sizes of primal and dual \(\varepsilon \)-nets exist for fat triangles [6]. We use the simple random-sampling approach to compute an \(\varepsilon \)-net since it can be done in near-linear time. With a careful implementation, the \(\varepsilon \)-net construction algorithms of [6] might also run in near-linear time, in which case the approximation ratios of our hitting-set and set-cover algorithms for fat triangles would improve accordingly.

6 Independent Set for Disks

Given a finite range space \(\Sigma = (\mathsf {X}, \mathcal {R})\), an independent set is a subset \(\mathcal {I}\subseteq \mathcal {R}\) of ranges such that for any \(R_1, R_2 \in \mathcal {I}\), \(R_1 \cap R_2 \cap \mathsf {X}= \varnothing \), i.e., any point of \(\mathsf {X}\) lies in at most one range of \(\mathcal {I}\). The goal is to find a maximum-size independent set (MIS).

The main observation is that an optimal mixed strategy for Bob in the second algorithm gives a fractional solution for the independent-set problem. We therefore proceed as follows. Suppose we have an integer k such that \(\delta /2 < k \le \delta \), where \(\delta \) is the size of a maximum independent set. We set \(\mu = 3 k \log N\) and run the second algorithm for \(\mu \) rounds. Let \(\mathcal {F}= \langle \overline{R}_1, \ldots , \overline{R}_{\mu } \rangle \) be the sequence of ranges chosen by Bob, and let \(\widetilde{\Omega }\) be the distribution on \(\mathcal {R}\) induced by \(\mathcal {F}\), i.e., \(\Pr (R \sim \widetilde{\Omega })\) is proportional to the number of times R appears in \(\mathcal {F}\). Using Lemmas 4.1, 4.3 and 4.4, we can prove that

$$\begin{aligned} \mathsf {E}\biggl [\max _{x \in \mathsf {X}} I(x, \widetilde{\Omega })\biggr ] \le \frac{3}{k}. \end{aligned}$$

By Markov's inequality, with probability at least 1/2,

$$\begin{aligned} \max _{x \in \mathsf {X}} I(x, \widetilde{\Omega }) \le \frac{6}{k}. \end{aligned}$$

Let \(y_i\) be the variable corresponding to the range \(R_i \in \mathcal {R}\). Setting \(y_i = \tfrac{k}{6} \Pr (R_i \sim \widetilde{\Omega })\), with probability at least 1/2 we obtain a fractional solution for the independent-set problem with objective value \(\sum _i y_i = k/6\). Next, we convert the fractional solution into an integral one using the approach of Chan and Har-Peled [13]: we process the ranges of \(\mathcal {R}\) in an arbitrary order and maintain an independent set \(\mathcal {I}\) of ranges. A range \(R_i\) is processed as follows: if \(R_i\) does not contain any of the points lying in a range of the current \(\mathcal {I}\), then \(R_i\) is added to \(\mathcal {I}\) with probability \(y_i\); otherwise, it is discarded.
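The pipeline above admits a short sketch (hypothetical names; `points_in` stands in for the disk-emptiness machinery described below, and disks are assumed hashable, e.g. (a, b, r) tuples):

```python
import random
from collections import Counter

def round_fractional(F, disks, k, points_in):
    # Fractional values y_R = (k/6) * Pr(R ~ Omega_tilde) from Bob's choices F.
    freq = Counter(F)
    y = {R: (k / 6) * freq[R] / len(F) for R in disks}

    # Chan-Har-Peled rounding: keep a conflict-free disk with probability y_R.
    independent, covered = [], set()      # covered = points in a range of I
    for R in disks:                       # arbitrary processing order
        pts = set(points_in(R))
        if pts & covered:                 # R conflicts with the current I
            continue
        if random.random() < y[R]:
            independent.append(R)
            covered |= pts
    return independent
```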

Chan and Har-Peled [13] have shown that if \(\mathcal {R}\) is a set of disks in \(\mathbb {R}^2\) and the fractional solution has value \(\nu \), then the above rounding scheme returns an independent set of size \(\Omega (\nu )\) with constant probability. Hence, the set \(\mathcal {I}\) has size \(\Omega (k)\) with constant probability.

We perform the same preprocessing as in Sect. 4 to compute a set \(\mathsf {Q}\) of at most \(2k/5\) points such that in the range space \(\bar{\Sigma }= (\mathsf {X}\setminus \mathsf {Q}, \{ R \in \mathcal {R}\mid R \cap \mathsf {Q}= \varnothing \})\), each range of \(\bar{\Sigma }\) contains \(O(n/k)\) points and each point of \(\mathsf {X}\setminus \mathsf {Q}\) lies in \(O(m/k)\) ranges. Furthermore, if \(\Sigma \) has an independent set of size \(\delta \), then \(\bar{\Sigma }\) has one of size at least \(3\delta /5\): each point of \(\mathsf {Q}\) lies in at most one range of an independent set, so removing the at most \(2k/5 \le 2\delta /5\) points of \(\mathsf {Q}\) eliminates at most \(2\delta /5\) ranges from it. Hence, we run the above algorithm on \(\bar{\Sigma }\), and the total time for computing the fractional solution, including preprocessing, is \(O(N\log ^3 N)\).

To convert the fractional solution into an integral one, a dynamic insertion-only disk-emptiness data structure \(\mathcal {D}\) is used, which can be implemented by using a static disk-emptiness data structure [2] with the logarithmic method [9]. Initially, \(\mathcal {D}\) contains no point. For each range \(R_i\), we perform a disk-emptiness query in \(\mathcal {D}\) in \(O(\log ^2 n)\) time to determine whether \(R_i\) can be added to \(\mathcal {I}\); if it is added, every point in \(R_i\) is inserted into \(\mathcal {D}\) in \(O(\log ^2 n)\) amortized time. The total running time is \(O(N\log ^2 N)\). Hence, we obtain the following.

Theorem 6.1

Let \(\mathsf {X}\) be a set of points in \(\mathbb {R}^2\) and \(\mathcal {R}\) a set of disks in \(\mathbb {R}^2\), with \(|\mathsf {X}| + |\mathcal {R}| = N\). An O(1)-approximate independent set of \((\mathsf {X}, \mathcal {R})\) can be computed in \(O(N\log ^3 N)\) expected time, with probability at least \(1-\frac{1}{N^{\Omega (1)}}\).

7 Conclusion

In this paper, we presented two simple, efficient algorithms, based on the MW method, to compute small-size hitting sets and set covers for range spaces with finite VC-dimension. The first algorithm is Las Vegas and the second one is Monte Carlo. They yield near-linear time algorithms for many geometric range spaces. Our second algorithm can be used to compute an O(1)-approximate solution for the (discrete) maximum independent set problem for disks in near-linear time.

The two algorithms require slightly different data structures, so depending on the application, one algorithm can be more efficient than the other. One advantage of the second algorithm is its simplicity. In a distributed setting where two parties want to collaboratively compute a near-optimal hitting set (or, more generally, solve a zero-sum game), the second algorithm only requires the two parties to exchange the randomly sampled point/range in each iteration. If each point/range can be represented with O(1) bits, then the second algorithm needs \(O(\textsc {OPT}\log N)\) bits of communication in total. On the other hand, using the first algorithm, one party eventually needs to send all the ranges to the other party to determine whether all ranges are \(\varepsilon \)-heavy, so \(\Omega (N)\) bits of communication are required. We also remark that both algorithms can achieve a \((1+\varepsilon )\)-approximation of the fractional solution by updating the weights by a factor of \((1 + \varepsilon )\) or \((1-\varepsilon )\) in each step (see e.g. [31]). However, since the final rounding step based on \(\varepsilon \)-nets already loses a constant factor, we chose to double or halve the weights in each step.

We conclude by mentioning several open problems:

  (i) Can our approach be extended to compute a multi-cover of range spaces [17]?

  (ii) Is there an \(O(\log \log \chi )\)-approximation algorithm for computing a set cover when \(\mathcal {R}\) is a set of rectangles in \(\mathbb {R}^2\)?

  (iii) Let \(\mathsf {X}\) be a set of points and \(\mathcal {R}\) a set of unit squares in \(\mathbb {R}^2\). Can a small-size hitting set of \(\mathcal {R}\) be maintained under insertions and deletions of points and squares?

  (iv) Is there a fast O(1)-approximation algorithm for the independent-set problem when \(\mathcal {R}\) is a set of rectangles in \(\mathbb {R}^2\)? The best known algorithm computes an \(O(\log \log n)\)-approximate independent set [12].