Abstract
In this paper we study several geometric problems of color-spanning sets: given n points with m colors in the plane, selecting m points with m distinct colors such that some geometric properties of the m selected points are minimized or maximized. The geometric properties studied in this paper are the maximum diameter, the largest closest pair, the planar smallest minimum spanning tree, the planar largest minimum spanning tree and the planar smallest perimeter convex hull. We propose an O(n 1+ε) time algorithm for the maximum diameter color-spanning set problem where ε could be an arbitrarily small positive constant. Then, we present hardness proofs for the other problems and propose two efficient constant factor approximation algorithms for the planar smallest perimeter color-spanning convex hull problem.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Most of the classic algorithms in computational geometry are based on the assumption that the locations of input points are known exactly. In practice, that is not always the case. Data stored in computers are imprecise in most cases. The causes of imprecise data are various, for example, the uncertain properties of a moving object (Cheng et al. 2004), measurement error, sampling error, network latency (Sistla et al. 1997; Pfoser and Jensen 1999), location privacy protection (Beresford and Stajano 2003; Cheng et al. 2006; Gedik and Liu 2005), etc. Any or a combination of these could be the causes leading to imprecision of the data.
A continuous region model such as a disc model, a square model and so on can be used to describe an imprecise point in general. There are many studies based on such a model. Löffler et al. studied the largest and the smallest convex hull problems based on the disc model, the square model, and the parallel line segment model (Kreveld and Löffler 2007). They proved the NP-hardness for some problems and proposed polynomial time algorithms for some other problems, ranging in running time from O(nlogn) to O(n 13). Other papers dealing with the problems based on the continuous region models include (Khanban and Edalat 2003; Nagai and Tokura 2000; Boissonnat and Lazard 1996) and so on.
In addition to this, a discrete point set model can be used to describe imprecise points. In the discrete point set model, a set of discrete points is used for the possible positions where an imprecise point may appear. In computational geometry, the discrete point set model is also called the color-spanning set model because the set of positions for an imprecise point is regarded as the positions with the same color. The input of color-spanning set problems are usually n points with m colors.
There has also been a lot of research studying the color-spanning set problems. Zhang et al. proposed a brute force algorithm to solve the minimum diameter color-spanning set (MDCS) problem and the running time of their algorithm is O(n m) (Zhang et al. 2009). Fleischer and Xu showed that the MDCS problem can be solved in polynomial time for the L 1 and L ∞ metrics, while it is NP-hard for L p (p=2,3,4,…) metrics (Fleischer 2010). They also proposed an efficient constant factor approximation algorithm for the problem. Abellanas et al. showed that the Farthest Color Voronoi Diagram (FCVD) is of complexity Θ(nm) if m≤n/2 (Abellanas et al. 2001a, 2001b). They proposed an efficient algorithm to construct FCVD and then they presented efficient algorithms to construct the smallest color-spanning circle, the smallest color-spanning rectangle and the narrowest color-spanning strip of arbitrary orientation with the help of FCVD. In Das et al. (2009) proposed an algorithm for identifying the smallest color-spanning corridor in O(n 2logn) time and O(n) space and an algorithm for identifying the smallest color-spanning rectangle of arbitrary orientation with an O(n 3logm) running time and O(n) space.
In the database community, a similar framework under a different name “uncertain data” has been used. An imprecise point is called an uncertain object and the different positions with the same color are regarded as the different possible instances of an uncertain object. Pei et al. have performed some research that pertains to geometric problems in this framework (Pei et al. 2007; Cheema et al. 2010; Yuen et al. 2010).
In this paper, we study five color-spanning set problems. We select m points with m colors such that
-
The diameter of the m points is maximized. This problem is called the Maximum Diameter Color-spanning Set problem (MaxDCS).
-
The distance between the closest pair of the m points is maximized. This problem is called the Largest Closest Pair Color-spanning Set problem (LCPCS).
-
The length of the planar minimum spanning tree over the m points is minimized. This problem is called the Planar Smallest Minimum Spanning Tree Color-spanning Set problem (PSMSTCS).
-
The length of the planar minimum spanning tree over the m points is maximized. This problem is called the Planar Largest Minimum Spanning Tree Color-spanning Set problem (PLMSTCS).
-
The perimeter of the planar convex hull over the m points is minimized. This problem is called the Planar Smallest Perimeter Convex Hull Color-spanning Set problem (PSPCHCS).
We discuss these five problems in the following five sections and then give conclusions in the last section.
2 An efficient algorithm for MaxDCS
Computing the diameter of a point set has a long history. By a reduction to set disjointness, it can be shown that computing the diameter of n points with the same color in R d requires Ω(nlogn) operations in the algebraic computation tree model (Preparata and Shamos 1985). However, to the best of our knowledge, the maximum diameter color-spanning set problem (see an example in Fig. 1) has not been studied yet.
Let S be a set of n points with m colors. The steps of our algorithm (MaxDCS1) to compute the maximum color-spanning diameter of S are as follows:
-
1.
Compute the maximum distance between two points of S in the plane (ignoring colors) using the algorithm in Preparata and Shamos (1985). Let these two points be p a and p b . If p a and p b have different colors, then the distance d 0 between p a and p b is the maximum color-spanning diameter D of S, and we can exit the algorithm; else, we continue to the next step.
-
2.
Let the subset of points with the same colors as that of points p a and p b be S ab (S ab also includes p a and p b ). Let \(S_{ab}'=S-S_{ab}\). We compute the distance between every pair of points, one in S ab and the other in \(S_{ab}'\), and let the resulting maximum distance be d ab .
-
3.
Let the set of points of \(S_{ab}'\) which are on or to the left of the line p a p b be set S l . Let the set of points of \(S_{ab}'\) which are on the right of the line p a p b be S r . For any point p in S r such that no point in S l has the same color of p, we put p into set \(S^{\prime}_{r}\). Symmetrically, for any point p in S l such that no point in S r has the same color of p, we put p into set \(S^{\prime}_{l}\). Let \(S_{l}^{\prime\prime}=S_{l}\setminus S_{l}'\) and \(S_{r}^{\prime\prime}=S_{r}\setminus S_{r}'\).
-
4.
Let \(S_{1}=S_{l} \cup S^{\prime}_{r}\) and compute the maximum distance d l between two points in S 1 (ignoring colors) using the algorithm in Preparata and Shamos (1985). Similarly, let \(S_{2}=S_{r} \cup S^{\prime}_{l}\) and compute the maximum distance d r between two points in S 2.
-
5.
Compute the diameter D′ of point set \(S_{l}^{\prime\prime}\cup S_{r}^{\prime\prime}\) (considering colors). The details of this step will be given in Algorithm MaxDCS2.
-
6.
Let D=max(d ab ,d l ,d r ,D′).
Next, we prove the correctness and the time complexity of our algorithm.
Lemma 1
For two points p e and p f in set S l (S r ), at least one of the four segments \(\overline{p_{a}p_{e}}\), \(\overline{p_{a}p_{f}}\), \(\overline{p_{b}p_{e}}\), \(\overline{p_{b}p_{f}}\) has a length longer than that of \(\overline{p_{e}p_{f}}\).
Proof
Since p e and p f are on the same side of line p a p b , there are only two cases for the positions of four points p a , p b , p e , p f :
-
1.
Four points p a , p b , p e , p f form a convex quadrangle (see Fig. 2(a)). If the length of segment \(\overline{p_{e}p_{f}}\) is longer than or equal to the lengths of the segments \(\overline{p_{a}p_{e}}\), \(\overline{p_{a}p_{f}}\), \(\overline{p_{b}p_{e}}\), \(\overline{p_{b}p_{f}}\), then the angles ∠p e p f p b and ∠p f p e p a are smaller than π/2. Because the length of the segment \(\overline{p_{a}p_{b}}\) is longer than or equal to all the above segments, the angles ∠p a p b p f and ∠p b p a p e are smaller than π/2. This contradicts to the fact that the sum of four angles of a convex quadrangle is 2π.
-
2.
The three points p a , p b , p e form a triangle and p f is in △p a p b p e , \(\overline{p_{e}p_{h}}\) is the height of triangle; moreover, the point p h must be on segment \(\overline{p_{a}p_{b}}\) (see Fig. 2(b)). Otherwise, the length of \(\overline{p_{a}p_{e}}\) or \(\overline{p_{b}p_{e}}\) is longer than \(\overline{p_{a}p_{b}}\). Since p f is either in the right triangle △p a p h p e or in the right triangle △p b p e p h , the length of \(\overline{p_{e}p_{f}}\) is shorter than the longer of \(\overline{p_{e}p_{a}}\) and \(\overline{p_{e}p_{b}}\).
Hence the lemma is proven. □
Let p x and p y be the two points corresponding to the maximum color-spanning diameter of S. There are five cases:
-
1.
p x is p a and p y is p b .
-
2.
p x ∈S ab and \(p_{y}\in S_{ab}'\).
-
3.
p x and p y are in set S l , or p x and p y are in set S r .
-
4.
p x is in set S l , while p y is in set \(S_{r}'\), or p x in set \(S^{\prime}_{l}\) and p y in set S r .
-
5.
p x is in set \(S_{l}^{\prime\prime}\) and p y is in set \(S_{r}^{\prime\prime}\).
Cases 1, 2 and 4 are computed at Steps 1, 2 and 4 of the algorithm MaxDCS1. We do not need to compute the diameter in Case 3 according to Lemma 1 and the fact that the color of p a (p b ) is different from those points in S l (S r ). In other words, the maximum diameter in Case 3 is less than d ab in Case 2. Step 1, 3, 4, and 6 can be finished in O(nlogn) time. Step 2 can be finished in O(k′n) time where k′ is the size of the set S ab . However, if k′ is larger than O(logn), in order to reduce the time complexity to O(nlogn), we cannot use a brute force method at Step 2. Instead, we can compute the farthest-point Voronoi diagram of set S ab in O(k′logk′) time (Berg et al. 2008). The Voronoi cell where each point p in set \(S_{ab}'\) is located can be found in O(logk′) time, and then we can compute the distance between p and corresponding site in an additional O(1) time. Therefore, Step 2 can be finished in O(nlogn) time.
How about the time complexity of Step 5 (or the algorithm MaxDCS2) (Algorithm 1)? The time consuming parts of MaxDCS2 are two FOR loops. Since those two loops are symmetric, we only analyze the first one. Notice that the difference between the consecutive steps inside the loop are point sets of two colors. In total, each point is inserted and deleted from S′ once, that means only O(n) insertions and deletions. It is shown that the diameter for n points without color after each insertion or deletion can be updated in O(n ε) time where ε could be an arbitrarily small positive constant (Agarwal et al. 1992; Eppstein 1996). Therefore, the running time of MaxDCS2 is O(n 1+ε).
In Steps 4 and 5 of the algorithm MaxDCS1, the two points realizing d l (d r or D′) could be on the same side of p a p b and could be of the same color. However, according to Lemma 1, those d l ,d r ,D′ are all shorter than d ab and we let D=max(d ab ,d l ,d r ,D′). Therefore, those d l ,d r ,D′ cannot be the real diameter D. Our algorithm considers the distances of all points with its farthest point of different color. Hence we have the following theorem.
Theorem 1
Let S be a set of n points of m colors. The maximum color-spanning diameter of S can be computed in O(n 1+ε) time, where ε could be an arbitrarily small positive constant.
3 Hardness of LCPCS
In this section, we show that LCPCS is NP-Complete even in one dimension. To facilitate the reading, we first present the proofs in the plane, under different metrics.
Theorem 2
LCPCS is NP-hard under the L p metric, for 2≤p<∞,
Proof
We prove the hardness of LCPCS by a reduction from 3-SAT. We give the proof for the L 2 metric in two dimensions and then show how to extend it to any L p metric and to higher dimensions.
Let F be a Boolean formula in conjunctive normal form with n variables x 1,…,x n and m clauses c 1,…,c m , each of size at most three. We take the following steps to construct an instance I of LCPCS.
For each Boolean variable x i ,¬x i in F, let k i be the maximum number of times that x i and ¬x i appear in F. Then we draw a rectangle V i vertically and separate it into k i −1 small rectangles horizontally. Every small rectangle has length b and height a. The diagonal length of each small rectangle is d (d 2=a 2+b 2,d<2a). We place 2k i points with different colors on the 2k i vertices of small rectangles (see Fig. 3). The 2k i points of different colors are placed on k i rows and two columns. Let A xy (1≤x≤k i ,1≤y≤2) denote the point at row x and column y. Then we draw another rectangle H i horizontally which is far away from V i and separate it into k i −1 small rectangles vertically. Every small rectangle has length c (c=d+ε) and height d−ε (see Fig. 3). We place 2k i points with different colors on the 2k i vertices of those small rectangles. Let B xy (1≤x≤2,1≤y≤k i ) denote the point at row x and column y. When x is an odd number, only A x1 and B 2x have the same distinct color, and only A x2 and B 1x have the same distinct color.Footnote 1 When x is an even number, only A x2 and B 2x have the same distinct color, and only A x1 and B 1x have the same distinct color (see Fig. 3).
Let P 1 be the set of points A xy where x is odd and y=1 or x is even and y=2. Let P 2 be the set of points A xy where x is even and y=1 or x is odd and y=2. Let Q 1 be the set of points B xy where x=1 and Q 2 be the set of points B xy where x=2. The idea is that if we want to maximize the distance between the closest pair with this configuration, we either choose the point sets P 1 and Q 1, or point sets P 2 and Q 2. The fist case represents the value 1 for this variable, and the second case represents the value 0. The rectangles corresponding to different variables lie far away enough to each other (at least 4d). The points in different variables have totally different colors.
For each clause in F, we add three points and these three points have the same distinct color. We deal with the clauses from left to right. For example, let the i-th clause be (x 1∨x 2∨¬x 3), and suppose that x 1 appears l 1−1 times, x 2 appears l 2−1 times, ¬x 3 appears l 3−1 times in the previous i−1 clauses. Then we put one point next to rectangle H 1 and it is right below the point \(B_{2l_{1}}\) with distance ε. The second point is next to rectangle H 2 and it is right below \(B_{2l_{2}}\) with distance ε. The third point is next to rectangle H 3 and it is right above \(B_{1l_{3}}\) with distance ε (see Fig. 4). One of these three points has to be selected for this color. For the distance between the closest pair to be maximum (i.e., equal to d), it is only possible when x 1 is assigned 1, or x 2 is assigned 1, or x 3 is assigned 0.
It is easy to see that F is satisfiable if and only if the largest distance between the closest pair is equal to d.
In order to extend the proof to the L p metric, the only requirement is d>a and d>b. When 1≤p<∞, we have d>a and d>b. Only when p=∞, we have d=a or d=b (then the above construction fails). Also, the hardness for two dimensions implies the hardness for higher dimensions. Therefore LCPCS is NP-hard for the L p metric, for 1≤p<∞, in two or higher dimensions. □
We next show that LCPCS is NP-hard even in one dimension.
Theorem 3
LCPCS is NP-Complete in one dimension.
Proof
We prove LCPCS is NP-Complete by a reduction from the 3-SAT problem. Of course, in this case all points lie on a line l. Given a 3-SAT formula F, we make the following construction.
For each Boolean variable x i in F, let k i be the maximum number of times that x i and ¬x i appear in F (see Fig. 5) . We first put 2k i points \(A_{1},\ldots,A_{2k_{i}}\) on the segment l i1 which is a part of the line l and the distance between two adjacent points is d/2. Then we put anther 2k i points \(B_{1},\ldots,B_{2k_{i}}\) on the segment l i2 which is another part of the line l. The distance between two points B j and B j+1 is d−ε when j is odd, or 2d when j is even. Furthermore, we require that B j+1 and A j have the same distinct color for 1≤j≤2k i −1, and B 1 and \(A_{2k_{i}}\) have the same distinct color.
Let P 1 be the set of points A j when j is odd, and P 2 be the set of points A j when j is even. Let Q 1 be the set of points B j when j is odd, and Q 2 be the set of points B j when j is even. If we want to maximize the distance between the closest pair with this configuration, we must select either the point sets P 1 and Q 1 or the point sets P 2 and Q 2. If we select P 1 and Q 1, x i is assigned 0 and if we select P 2 and Q 2, x i is assigned 1. The distance between every two variable gadgets is great enough (not less than 4d). The colors of points in different variable gadgets are totally different.
For every clause in F, we need three additional points P a , P b and P c . Certainly, only P a , P b and P c for the same clause have the same distinct color. We deal with the clauses from left to right. For example, let the i-th clause be (x u ∨x v ∨¬x w ), and suppose that x u appears h 1−1 times, x v appears h 2−1 times, ¬x w appears h 3−1 times in the previous i−1 clauses. Then we put P a on the segment l u2 and it is to the left of the point \(B_{2h_{1}-1}\) with distance ε, P b on l v2 and it is to the left of the point\(B_{2h_{2}-1}\) with distance ε, and P c on the segment l w2 and it is to the right of the point \(B_{2h_{3}}\) (see Fig. 6). The distance between the closest pair become maximum (i.e., equal to d) if and only if at least one of x u or x v or ¬x w is assigned 1.
Therefore, F is satisfiable if and only if the largest possible distance between the closest pair is equal to d. □
Actually we can prove that LCPCS is \((\frac{1}{2}+\varepsilon)\)-APX-hard, which means that it is NP-hard to find any approximation algorithm with an approximation ratio better than \(\frac{1}{2}\).
Theorem 4
LCPCS is \((\frac{1}{2}+\delta)\)-APX-hard \((0<\delta \le \frac{1}{2})\) in one dimension.
Proof
Consider Theorem 3 and Figs. 5 and 6, and let ε=d/2. Let O ∗ be the optimal solution value of LCPCS. We need to consider two cases:
-
(1)
O ∗=d if and only if the 3-SAT formula F is satisfiable.
-
(2)
O ∗≤d/2 if and only if the 3-SAT formula F is not satisfiable.
Hence, to find an approximation algorithm whose approximation ratio is better than \(\frac{1}{2}\) is equivalent to finding an algorithm for O ∗=d, which is NP-hard. The theorem is proved. □
4 Hardness of PSMSTCS
In this section, we prove the NP-completeness of the Planar Smallest Minimum Spanning Tree Color-Spanning Set (PSMSTCS) problem. First we show that this problem belongs to NP. Given an instance of the problem, we use as a certificate the m different color points chosen from n points. The verification algorithm computes the MST of those m points and check whether the length is at most L. This process can certainly be done in polynomial time.
We then prove that PSMSTCS is NP-hard by a reduction from the 3-SAT problem. The general idea is that for a 3-SAT formula, we put some colored points on the plane such that the given 3-SAT formula is satisfiable if and only if the length of the smallest color-spanning minimum spanning tree equals some given value.
First we put the point O with a distinct color at (0,0). Given a 3-SAT formula ψ, suppose that it has n 1 variables \(x_{1},x_{2},\ldots,x_{n_{1}}\) and m 1 clauses. For each variable x i (1≤i≤n 1), six points \(p_{i}^{1}\), \(p_{i}^{2}\), \(p_{i}^{3}\), \(p_{i}^{4}\), \(p_{i}^{5}\) and \(p_{i}^{6}\) are put at (400i−300,0), (400i−200,0), (400i−100,0), (400i,0), (400i−200,−100) and (400i,−100) respectively.
If the j-th literal, which is in the k-th clause in ψ, is x i (¬x i ), we denote the literal by x i,j,k (¬x i,j,k ). For every literal x i,j,k (¬x i,j,k ), where 1≤i≤n 1, j=1,2,3,… and 1≤k≤m 1, eight additional points are constructed. \(p_{i,j,k}^{7}\) is put at (400i−200,−100), \(p_{i,j,k}^{8}\) at (400i,−100), \(p_{i,j,k}^{9}\) at (0,400j−300), \(p_{i,j,k}^{10}\) at (0,400j−200), \(p_{i,j,k}^{11}\) at (0,400j−100), \(p_{i,j,k}^{12}\) at (0,400j), \(p_{i,j,k}^{13}\) at \((-\frac{1}{3m_{1}},400j-200)\) and \(p_{i,j,k}^{14}\) at \((-\frac{1}{3m_{1}},400j)\). Among those fourteen points, only \(p_{i}^{5}\) and \(p_{i}^{6}\) have the same distinct color, only \(p_{i,j,k}^{7}\) and \(p_{i,j,k}^{13}\) have the same distinct color, only \(p_{i,j,k}^{8}\) and \(p_{i,j,k}^{14}\) have the same distinct color, and every one of the other points has a distinct color. Figure 7(a) shows the gadget for the variable x 1 which is also the first literal and in the first clause.
Then, we have a set P of 6n 1+24m 1+1 points for the formula ψ. 4n 1 points (except O) are put on the x-axis, 2n 1+2×3m 1=2n 1+6m 1 points are put on the line y=−100, 4×3m 1=12m 1 (except O) points are put on the y-axis, 2×3m 1=6m 1 points are put on the line \(x=-\frac{1}{3m_{1}}\), one point is on the point O. Let T P be the planar smallest color-spanning minimum spanning tree over P.
Since only \(p_{i}^{5}\) and \(p_{i}^{6}\) have the same distinct color, only \(p_{i,j,k}^{7}\) and \(p_{i,j,k}^{13}\) have the same distinct color and only \(p_{i,j,k}^{8}\) and \(p_{i,j,k}^{14}\) have the same distinct color, we have to select either \(\{p_{i}^{5},p_{i,j,k}^{7},p_{i,j,k}^{14}\}\) or \(\{p_{i}^{6},p_{i,j,k}^{8},p_{i,j,k}^{13}\}\) to get the planar smallest color-spanning minimum spanning tree T P . Whichever set of points is selected, the length of T P is \(500n_{1}+400\times 3m_{1} +3m_{1}\times \frac{1}{3m_{1}}=500n_{1}+1200m_{1}+1\). If \(\{p_{i}^{5},p_{i,j,k}^{7},p_{i,j,k}^{14}\}\) are selected, x i is assigned 0. If \(\{p_{i}^{6},p_{i,j,k}^{8},p_{i,j,k}^{13}\}\) are selected, x i is assigned 1 (see Fig. 8).
Now, we need three additional points for every clause in the 3-SAT formula, where only the three points in the same clause have the same distinct color. If there is a literal x i1,j1,k1 which is the j1-th literal and in the k1-th clause, we put a point \(p_{i1,\lceil \frac{j1}{3} \rceil}^{\mathit{or}}\) at the position of \(p_{i1,j1,k1}^{13}\). If there is a literal ¬x i1,j1,k1, we put a point \(p_{i1,\lceil \frac{j1}{3} \rceil}^{\mathit{or}}\) at the position of \(p_{i1,j1,k1}^{14}\). Note that the points representing the literals in the same clause should have the same distinct color. For example, we assume that one clause is ¬x i1,j1,k1∨x i2,j1+1,k2∨x i3,j1+2,k3. We put three points \(p_{i1,\lceil \frac{j1}{3} \rceil}^{\mathit{or}}\), \(p_{i2,\lceil \frac{j1+1}{3} \rceil}^{\mathit{or}}\) and \(p_{i3,\lceil \frac{j1+2}{3}\rceil}^{\mathit{or}}\) at \(p_{i1,j1,k1}^{14}\), \(p_{i2,j1+1,k2}^{13}\) and \(p_{i3,j1+2,k3}^{13}\) respectively. Note that only \(p_{i1,\lceil \frac{j1}{3} \rceil}^{\mathit{or}}\), \(p_{i2,\lceil \frac{j1+1}{3}\rceil}^{\mathit{or}}\) and \(p_{i3,\lceil \frac{j1+2}{3} \rceil}^{\mathit{or}}\) have the same distinct color.
Figure 9 shows the variable gadget for the first clause (x 1∨x 2∨¬x 3) and Fig. 10 shows the gadgets for ψ=(x 1∨x 2∨¬x 3)∧(¬x 1∨x 3∨x 4). Let T opt be the planar smallest color-spanning minimum spanning tree over P plus the 3m 1 points for every clause. For literal x i,j,k (¬x i,j,k ), if a variable x i is assigned 1 (0), \(\{p_{i}^{6},p_{i,j,k}^{8},p_{i,j,k}^{13}\}\) (\(\{p_{i}^{5},p_{i,j,k}^{7},p_{i,j,k}^{14}\}\)) are selected. That means the point \(p_{i,\lceil\frac{j}{3}\rceil}^{\mathit{or}}\) can be selected to construct T opt without adding any extra length comparing with T P . Therefore, if at least one literal is assigned 1 in every clause, the length of T opt equals that of T P . If all three literals in at least one clause are assigned 0, all the three points for the clause cannot be covered by points in T P and the length of T opt is greater than that of T P . Therefore, we obtain that a 3-SAT formula ψ with n 1 variables \(x_{1},x_{2},\ldots,x_{n_{1}}\) and m 1 clauses is satisfiable if and only if the length of the corresponding T opt is equal to 500n 1+1200m 1+1. Because the 3-SAT problem is NP-Complete, we obtain the following theorem:
Theorem 5
PSMSTCS is NP-Complete.
5 Hardness of PLMSTCS
In this section, we prove the NP-completeness of the Planar Largest Minimum Spanning Tree Color-Spanning Set (PSMSTCS) problem. Again, we first show that this problem belongs to NP. Given an instance of the problem, we use as a certificate the m different color points chosen from n points. The verification algorithm computes the MST of those m points and check whether the length is at most L. This process can certainly be done in polynomial time.
We prove this problem is NP-hard by a reduction from the 3-SAT problem. First we put a point O with a distinct color at (0,0). For a 3-SAT formula ψ, suppose that it has n 1 variables \(x_{1},x_{2},\ldots,x_{n_{1}}\) and m 1 clauses. For each variable x i (1≤i≤n 1), five points \(p_{i}^{1}\), \(p_{i}^{2}\), \(p_{i}^{3}\), \(p_{i}^{4}\) and \(p_{i}^{5}\) are put at (201i−101,0), (201i−1,0), (201i,0), (201i−1,−10), (201i,−10) respectively.
If the j-th literal, which is in the k-th clause in ψ, is x i (¬x i ), we also denote the literal by x i,j,k (¬x i,j,k ). For every literal x i,j,k (or ¬x i,j,k ), ten additional points are put down. \(p_{i,j,k}^{6}\) is put at (201i−1,−9−2k), \(p_{i,j,k}^{7}\) at (201i,−9−2k), \(p_{i,j,k}^{8}\) at (201i−1,−10−2k), \(p_{i,j,k}^{9}\) at (201i,−10−2k), \(p_{i,j,k}^{10}\) at (0,400j−300), \(p_{i,j,k}^{11}\) at (0,400j−200), \(p_{i,j,k}^{12}\) at (0,400j−100), \(p_{i,j,k}^{13}\) at (0,400j), \(p_{i,j,k}^{14}\) at (−0.5,400j−200) and \(p_{i,j,k}^{15}\) at (−0.5,400j). Among those fifteen points, only \(p_{i}^{4}\) and \(p_{i}^{5}\) have the same distinct color, only \(p_{i,j,k}^{6}\) and \(p_{i,j,k}^{14}\) have the same distinct color, only \(p_{i,j,k}^{7}\) and \(p_{i,j,k}^{15}\) have the same distinct color, only \(p_{i,j,k}^{8}\) and \(p_{i,j,k}^{9}\) have the same distinct color, and every one of the other points has a distinct color. Figure 7(b) shows the gadget for the variable x 1 which is the first literal and in the first clause.
Then, we obtain a set P of 5n 1+30m 1+1 points in the plane. Let T P be the planar largest color-spanning minimum spanning tree over P. 3n 1+12m 1+1 points are on the x-axis and the y-axis, and every one of them has a distinct color.
Since only \(p_{i}^{4}\) and \(p_{i}^{5}\) have the same distinct color, only \(p_{i,j,k}^{6}\) and \(p_{i,j,k}^{14}\) have the same distinct color, only \(p_{i,j,k}^{7}\) and \(p_{i,j,k}^{15}\) have the same distinct color, only \(p_{i,j,k}^{8}\) and \(p_{i,j,k}^{9}\) have the same distinct color, we have to select either \(\{p_{i}^{4},p_{i,j,k}^{7},p_{i,j,k}^{8}, p_{i,j,k}^{14}\}\) or \(\{p_{i}^{5},p_{i,j,k}^{6},p_{i,j,k}^{9},p_{i,j,k}^{15}\}\) to obtain the planar largest color-spanning minimum spanning tree T P whose length is \((201+10)n_{1}+3m_{1}(400+0.5+2\sqrt{2})=211n_{1}+1201.5m_{1}+6\sqrt{2}m_{1}\). If \(\{p_{i}^{4},p_{i,j,k}^{7},p_{i,j,k}^{8},p_{i,j,k}^{14}\}\) are selected, x i is assigned 0. If \(\{p_{i}^{5},p_{i,j,k}^{6},p_{i,j,k}^{9},p_{i,j,k}^{15}\}\) are selected, x i is assigned 1 (see Fig. 11).
Now, we need three additional points for every clause in the 3-SAT formula, where only the three points in the same clause have the same distinct color. For every literal x i,j,k (or ¬x i,j,k ), we put one point \(p_{i,\lceil \frac{j}{3} \rceil}^{\mathit{or}}\) at the position of \(p_{i,j,k}^{14}\) (or \(p_{i,j,k}^{15}\)). Figure 12 shows the gadgets for ψ=(x 1∨x 2∨¬x 3)∧(¬x 1∨x 3∨x 4).
Let T opt be the planar largest color-spanning minimum spanning tree over P plus the 3m 1 points for every clause. For literal x i,j,k (¬x i,j,k ), if a variable x i is assigned 1 (0), \(\{p_{i}^{5},p_{i,j,k}^{6},p_{i,j,k}^{9},p_{i,j,k}^{15}\}\) (\(\{p_{i}^{4},p_{i,j,k}^{7},p_{i,j,k}^{8},p_{i,j,k}^{14}\}\)) are selected. This means that the point \(p_{i,\lceil\frac{j}{3}\rceil}^{\mathit{or}}\) can be selected to construct T opt . Therefore, if at least one literal is assigned 1 in every clause, the length of T opt equals that of T P plus m 1∗0.5. If all three literals in at least one clause are assigned 0, the length of T opt is less than that of T P plus 3m 1∗0.5. Consequently, we obtain that a 3-SAT formula ψ with n 1 variables \(x_{1},x_{2},\ldots,x_{n_{1}}\) and m 1 clauses is satisfiable if and only if the length of the corresponding T opt is equal to \(211n_{1}+1202m_{1}+6\sqrt{2}m_{1}\). Because the 3-SAT problem is NP-Complete, we obtain the following theorem:
Theorem 6
PLMSTCS is NP-Complete.
6 Hardness of PSPCHCS
Finally in Sect. 6, we prove the NP-completeness of the Planar Smallest Perimeter Convex Hull Color-Spanning Set (PSPCHCS) problem, followed with two simple approximation algorithms.
6.1 NP-completeness for PSPCHCS
First we show that this problem belongs to NP. Given an instance of the problem, we use as a certificate the m different color points chosen from n points. The verification algorithm computes the perimeter of the convex hull of those m points and check whether the perimeter is at most p. This process can certainly be done in polynomial time.
Again, we prove this problem is NP-hard by a reduction from the 3-SAT problem. For a given 3-SAT formula, suppose that it has n 1 variables \(x_{1},x_{2},\ldots,x_{n_{1}}\). First we draw a circle C. For two points a and b on C, let \(\widetilde{ab}\) be the arc of C from a to b and \(\overline{ab}\) be the line segment between a and b. Let the length of \(\overline{ab}\) be \(|\overline{ab}|\).
For each variable x j (j=1,2,…,n 1), we put 10 points \(x_{j}^{1}, x_{j}^{2},\ldots, x_{j}^{10}\) on the circle C in clockwise order (see Fig. 13). Note that in Fig. 13 these points are shown just for the clarity purpose, we can in fact screeze these points so that all the 10n 1 points can be put on the circle C. Only \(x_{j}^{2}\) and \(x_{j}^{3}\) have the same distinct color and only p has a distinct color for every one point p of the other points. \(|\overline{x_{j}^{1}x_{j}^{2}}|=|\overline{x_{j}^{3}x_{j}^{4}}|\). Then we add two additional points \(x_{j}^{11}\) and \(x_{j}^{12}\), where \(x_{j}^{11}\) is on the line segment \(\overline{x_{j}^{1}x_{j}^{2}}\) and \(x_{j}^{12}\) is on the line segment \(\overline{x_{j}^{3}x_{j}^{4}}\). Only \(x_{j}^{11}\) and \(x_{j}^{6}\) have the same distinct color, and only \(x_{j}^{12}\) and \(x_{j}^{9}\) have the same distinct color. Moreover, we have the following equations:
Δp 1≫Δp 2 ensures that either \(\overline{x_{j}^{1}x_{j}^{2}}\cup\overline{x_{j}^{2}x_{j}^{4}}\) or \(\overline{x_{j}^{1}x_{j}^{3}}\cup\overline{x_{j}^{3}x_{j}^{4}}\) is the part of CH opt . Suppose we select \(x_{j}^{1}\), \(x_{j}^{2}\) and \(x_{j}^{4}\) to be the vertices of CH opt . Then we must select \(x_{j}^{11}\), \(x_{j}^{8}\), \(x_{j}^{9}\), \(x_{j}^{10}\), \(x_{j}^{5}\) and \(x_{j}^{7}\) in order to obtain the minimum length.
If we select \(\overline{x_{j}^{1}x_{j}^{2}}\cup\overline{x_{j}^{2}x_{j}^{4}}\cup \overline{x_{j}^{4}x_{j}^{5}} \cup\overline{x_{j}^{5}x_{j}^{7}} \cup \overline{x_{j}^{7}x_{j}^{8}} \cup \overline{x_{j}^{8}x_{j}^{9}}\cup\overline{x_{j}^{9}x_{j}^{10}}\) as a part of CH opt , the Boolean variable x j is assigned 1. If we select \(\overline{x_{j}^{1}x_{j}^{3}}\cup\overline{x_{j}^{3}x_{j}^{4}}\cup \overline{x_{j}^{4}x_{j}^{5}} \cup\overline{x_{j}^{5}x_{j}^{6}} \cup \overline{x_{j}^{6}x_{j}^{7}}\cup\overline{x_{j}^{7}x_{j}^{8}}\cup \overline{x_{j}^{8}x_{j}^{10}}\) as a part of CH opt , x j is assigned 0.
Therefore, if we connect all the n 1 gadgets for the n 1 variables together to construct a convex hull CH, the perimeter of CH is n 1×(Z 1+2Z 2+ΔZ 2)+Z 4 where \(Z_{4}=2\sum_{j=1}^{n_{1}} (|\overline{x_{j}^{4}x_{j}^{5}}|+|\overline{x_{j}^{7}x_{j}^{8}}|)+\sum_{j=1}^{n_{1}-1}|\overline{x_{j}^{10}x_{j+1}^{1}}|+|\overline{x_{n_{1}}^{10}x_{1}^{1}}|\).
We need three additional points for every clause and only the three points for the same clause have the same distinct color. For every positive literal x j in one clause, we put a new point at the position of \(x_{j}^{13}\) which is inside the triangle \(\bigtriangleup x_{j}^{5}x_{j}^{6}x_{j}^{7}\) such that \(\Delta Z_{3} =|\overline{x_{j}^{5}x_{j}^{13}}|+|\overline{x_{j}^{13}x_{j}^{7}}|-|\overline{x_{j}^{5}x_{j}^{7}}|\ll\Delta Z_{2}\). For every negative literal ¬x j in one clause, we put a new point at the position of \(x_{j}^{14}\) which is inside the triangle \(\bigtriangleup x_{j}^{8}x_{j}^{9}x_{j}^{10}\) such that \(\Delta Z_{3} =|\overline{x_{j}^{8}x_{j}^{14}}|+|\overline{x_{j}^{14}x_{j}^{10}}|-|\overline{x_{j}^{8}x_{j}^{10}}|\ll\Delta Z_{2}\). For example, if there is a clause (x 1∨x 2∨¬x 3), we put three points at the position of \(x_{1}^{13}\), \(x_{2}^{13}\) and \(x_{3}^{14}\) respectively. \(x_{1}^{13}\), \(x_{2}^{13}\) and \(x_{3}^{14}\) lie in the triangle \(\bigtriangleup x_{1}^{5}x_{1}^{6}x_{1}^{7}\), \(\bigtriangleup x_{2}^{5}x_{2}^{6}x_{2}^{7}\) and \(\bigtriangleup x_{3}^{8}x_{3}^{9}x_{3}^{10}\) respectively.
We call \(x_{j}^{6}\) and \(x_{j}^{9}\) the apex-points, and call \(x_{j}^{13}\) and \(x_{j}^{14}\) the or-points. If all the three literals in one clause are assigned 0, then the corresponding apex-points will not be selected as the vertices of CH opt . One of the three or-points has to be selected as the vertex of CH opt because only they have the same distinct color. Then the perimeter of CH opt at least equals that of CH plus ΔZ 3. If a given 3-SAT formula with n 1 variables is satisfiable, the perimeter of CH opt equals n 1×(Z 1+2Z 2+ΔZ 2)+Z 4. Similarly, at least one literal of every clause is assigned 1 and the given 3-SAT formula is satisfiable if the perimeter of CH opt over the gadgets equals n 1×(Z 1+2Z 2+ΔZ 2)+Z 4. Otherwise, the perimeter of the convex hull CH opt ≥n 1×(Z 1+2Z 2+ΔZ 2)+Z 4+ΔZ 3. Because the 3-SAT problem is NP-Complete, we obtain the following theorem:
Theorem 7
PSPCHCS is NP-Complete.
6.2 Approximation algorithms for PSPCHCS
Here we present two simple approximation algorithms for the PSPCHCS problem.
6.2.1 The π-approximation algorithm
For every point p in the plane, select m−1 points p 1,…,p m−1. The colors of every two points of the m selected points are different. Moreover, p i (1≤i≤m−1) is the closest point from p among all the points which have the same color with p i . Use the m points to construct a convex hull CH p and compute the perimeter of CH p . Thus, we can obtain n perimeters and the smallest one is what we would return.
The running time for constructing a convex hull is O(mlogm) according to Graham (1972) and the running time for finding out the other m−1 points is O(n) when p is selected. Therefore, the total running time of this algorithm is O(n(n+mlogm)).
Assume that the convex hull we obtain is CH app , the length of CH app is per app , the optimal convex hull is CH opt and perimeter of CH opt is per opt . Now we prove that per app ≤πper opt .
Suppose p a and p b are the vertices of CH opt and \(\overline{p_{a}p_{b}}\) is the diameter of CH opt (see Fig. 14). Let \(r=|\overline{p_{a}p_{b}}|\). We draw a circle C with center p a and radius r. When we select p a as the above p and select the m−1 points p 1,…,p m−1 to construct \(\mathit{CH}_{p_{a}}\), \(\mathit{CH}_{p_{a}}\) must be inside C because p i (1≤i≤m−1) is the closest point from p among all the points which have the same color with p i and CH opt is inside C. Then the perimeter of \(\mathit{CH}_{p_{a}}\), denoted by \(\mathit{per}_{p_{a}}\), satisfies \(\mathit{per}_{p_{a}} \le 2\pi r\). Because \(\mathit{per}_{\mathit{app}} \le \mathit{per}_{p_{a}}\) and per opt ≥2r, per app ≤π∗per opt .
Theorem 8
There is a π-approximation algorithm for PSPCHCS with a running time O(n 2+nmlogm).
6.2.2 The \(\sqrt{2}\)-approximation algorithm
In Abellanas et al. (2001b) proposed an O(min{n(n−m)2,nm(n−m)}) time algorithm for computing the smallest perimeter axis-parallel rectangle enclosing at least one point of each color. We use the above algorithm to obtain the smallest perimeter axis-parallel rectangle R. After R is obtained, we construct a convex hull CH app over the points inside R. CH app is the convex hull we need.
Let the perimeter of R be per R and let the diagonal of R be l dia and the length of l dia be d. CH app contains at least one point of each color and CH app is totally inside R (see Fig. 15). Therefore per app ≤per R . Let R′ be the smallest perimeter axis-parallel rectangle enclosing CH opt , per R′ be the perimeter of R′, l′ dia be the diagonal of R′ and d′ be the length of l′ dia . Because R′ is the smallest perimeter axis-parallel rectangle, at least one point with a distinct color lies on every edge of R′. Then per R′≥per opt ≥2d′. We know \(\mathit{per}_{R'} \le 2\sqrt{2}d'\). Therefore, \(\mathit{per}_{R'}\ge \mathit{per}_{\mathit{opt}} \ge {\sqrt{2}\over 2}\mathit{per}_{R'}\). Hence we have \(\mathit{per}_{\mathit{opt}}\le \mathit{per}_{\mathit{app}}\le \sqrt{2} \mathit{per}_{\mathit{opt}}\), due to that per R′≥per R ≥per app .
Theorem 9
There is a \(\sqrt{2}\)-approximation algorithm for PSPCHCS with a running time O(min{n(n−m)2,nm(n−m)}).
7 Conclusions
In this paper we study several geometric problems of color-spanning sets. We propose an O(n 1+ε) time algorithm for MaxDCS and show that LCPCS is NP-Complete for the L p (1≤p<∞) metric. Moreover, we prove that LCPCS is \((\frac{1}{2}+\varepsilon)\)-APX-hard in one dimension, which means that finding an approximation algorithm whose approximation ratio is better than \(\frac{1}{2}\) is NP-hard. Then we prove that PSMSTCS, PLMSTCS and PSPCHCS are NP-Complete and propose two efficient constant factor approximation algorithms for PSPCHCS. For the future work, it will be interesting to investigate whether there exists an \(\frac{1}{2}\)-approximation for LCPCS in one dimension.
Notes
Throughout this paper, when we say that points in set S′ have the same distinct color and points in set S″ have the same distinct color, it means that points in S′ have a color c′ and points in S″ have a color c″ with c′≠c″.
References
Abellanas M, Hurtado F, Icking C, Klein R, Langetepe E, Ma L, Palop B, Sacristan V (2001a) The farthest color Voronoi diagram and related problems. In: Proceedings of the 17th European workshop on computational geometry (EWCG’01), pp 113–116
Abellanas M, Hurtado F, Icking C, Klein R, Langetepe E, Ma L, Palop B, Sacristan V (2001b) Smallest color-spanning objects. In: Proc 9th annu European sympos algorithms, pp 278–289
Agarwal PK, Eppstein D, Matousek J (1992) Dynamic half-space reporting, geometric optimization, and minimum spanning trees. In: Proceedings of the 33rd annual symposium on foundations of computer science, pp 80–89. ISBN:0-8186-2900-2
Beresford AR, Stajano F (2003) Location privacy in pervasive computing. IEEE Pervasive Comput 2(1):46–55
Berg M, Cheong O, Kreveld M, Overmars M (2008) Computational geometry: algorithms and applications, 3rd edn. Springer, Berlin
Boissonnat JD, Lazard S (1996) Convex hulls of bounded curvature. In: Proc 8th Canad conf on comput geom, pp 14–19
Cheema MA, Lin X, Wang W, Zhang W, Pei J (2010) Probabilistic reverse nearest neighbor queries on uncertain data. IEEE Trans Knowl Data Eng 22(4):550–564
Cheng R, Kalashnikov DV, Prabhakar S (2004) Querying imprecise data in moving object environments, knowledge and data engineering. IEEE Trans Knowl Data Eng 16(9):1112–1127
Cheng R, Zhang Y, Bertino E, Prabhakar S (2006) Preserving user location privacy in mobile data management infrastructures. In: Proceedings of the 6th intl workshop on privacy enhancing technologies (PET’06). LNCS, vol 4258. Springer, Berlin, pp 393–412
Das S, Goswani PP, Nandy SC (2009) Smallest color-spanning object revised. Int J Comput Geom Appl 19(5):457–478
Eppstein D (1996) Average case analysis of dynamic geometric optimization. Comput Geom Theory Appl 6(1):45–68
Fleischer R, Xu X (2010) Computing minimum diameter color-spanning sets. In: Proceedings of the 4th international workshop on frontiers in algorithmics (FAW’10). LNCS, vol 6213. Springer, Berlin, pp 285–292
Gedik B, Liu L (2005) A customizable k-anonymity model for protecting location privacy. In: Proceedings of the 25th international conference on distributed computing systems (ICDCS’05), pp 620–629
Graham RL (1972) An efficient algorithm for determining the convex hull of a finite planar set. Inf Process Lett 26:132–133
Khanban AA, Edalat A (2003) Computing Delaunay triangulation with imprecise input data. In: Proc 15th Canad conf on comput geom, pp 94–97
Kreveld MV, Löffler M (2007) Largest bounding box, smallest diameter, and related problems on imprecise points. In: Proc 10th workshop on algorithms and data structures. LNCS, vol 4619. Springer, Berlin, pp 447–458
Nagai T, Tokura N (2000) Tight error bounds of geometric problems on convex objects with imprecise coordinates. In: Jap conf on discrete and comput geom. LNCS, vol 2098. Springer, Berlin, pp 252–263
Pei J, Jiang B, Lin X, Yuan Y (2007) Probabilistic skylines on uncertain data. In: VLDB’2007, pp 15–26
Pfoser D, Jensen C (1999) Capturing the uncertainty of moving-objects representations. In: Proceedings of the 6th intl symp on advances in spatial databases (SSD’99). LNCS, vol 1651. Springer, Berlin, pp 111–131
Preparata FP, Shamos MI (1985) Computational geometry: an introduction. Springer, New York
Sistla PA, Wolfson O, Chamberlain S, Dao S (1997) Querying the uncertain position of moving objects. In: Temporal databases: research and practice. LNCS, vol 1399. Springer, Berlin, pp 310–337
Yuen SM, Tao Y, Xiao X, Pei J, Zhang D (2010) Superseding nearest neighbor search on uncertain spatial databases. IEEE Trans Knowl Data Eng 22(7):1041–1055
Zhang D, Chee YM, Mondal A, Tung AKH, Kitsuregawa M (2009) Keyword search in spatial databases: towards searching by document. In: Proceedings of the 25th IEEE international conference on data engineering (ICDE’09), pp 688–699
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Ju, W., Fan, C., Luo, J. et al. On some geometric problems of color-spanning sets. J Comb Optim 26, 266–283 (2013). https://doi.org/10.1007/s10878-012-9458-y
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10878-012-9458-y