On the Budgeted Priority p-Median Problem in High-Dimensional Euclidean Spaces

Zhang, Zhen; Huang, Zi-Yun; Tian, Zhi-Ping; Liu, Li-Mei; Xu, Xue-Song; Feng, Qi-Long

doi:10.1007/s40305-023-00533-w

On the Budgeted Priority p-Median Problem in High-Dimensional Euclidean Spaces

Published: 22 March 2024

(2024)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Journal of the Operations Research Society of China Aims and scope Submit manuscript

On the Budgeted Priority p-Median Problem in High-Dimensional Euclidean Spaces

Download PDF

Zhen Zhang^1,2,
Zi-Yun Huang³,
Zhi-Ping Tian^1,2,
Li-Mei Liu ORCID: orcid.org/0009-0006-0473-2104^1,2,
Xue-Song Xu^1,2 &
…
Qi-Long Feng^2,4

117 Accesses
Explore all metrics

Abstract

Given a set of clients and a set of facilities with different priority levels in a metric space, the Budgeted Priority $p$-Median problem aims to open a subset of facilities and connect each client to an opened facility with the same or a higher priority level, such that the number of opened facilities associated with each priority level is no more than a given upper limit, and the sum of the client-connection costs is minimized. In this paper, we present a data reduction-based approach for limiting the solution search space of the Budgeted Priority p-Median problem, which yields a $(1+\varepsilon )$-approximation algorithm running in $O(nd\log n)+(p\varepsilon ^{-1})^{p\varepsilon ^{-O(1)}}n^{O(1)}$ time in d-dimensional Euclidean space, where $n$ is the size of the input instance and p is the maximal number of opened facilities. The previous best approximation ratio for this problem obtained in the same time is $(3+\varepsilon )$.

To Close Is Easier Than To Open: Dual Parameterization To k-Median

A PTAS for the Geometric Connected Facility Location Problem

Article 02 February 2017

Approximation Algorithms for the Lower-Bounded Knapsack Median Problem

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Locating facilities to serve a given set of clients is a fundamental problem in the field of operational research. This problem becomes difficult when considering the following trade-off: Each client should be served by a neighboring facility, while we cannot open too many facilities.

The p-Median problem is one of the most commonly studied facility location-type problems, which balances the client-connection and facility-opening costs by limiting the number of opened facilities and minimizing the sum of distances of the clients from the corresponding facilities. It was known that the p-Median problem is NP-hard [1, 2], and considerable attention has been paid on developing approximation algorithms for it. The current best known approximation guarantees for the problem are the ratios of 2.67059 [3] and 2.406 [4] in general and Euclidean metrics, respectively.

The p-Median problem assumes that each client can be connected to an arbitrary opened facility, and thus, its algorithms fail to work in applications where clients differ in levels of required services. One such example is in the construction of supply chain networks, where the algorithms for the p-Median problem are not guaranteed to yield feasible solutions since they tend to minimize the supply chain cost without concerning the target service levels of the chain members. Motivated by such applications, Kumar and Sabharwal [5] introduced the Budgeted Priority p-Median problem, which can be formally defined as follows.

Definition 1

(Budgeted Priority p-Median [5]) An instance $({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )$ of the Budgeted Priority p-Median problem is specified by a set ${\mathcal {C}}$ of clients, a set ${\mathcal {F}}$ of facilities, a set ${\mathcal {L}}=\{1,\cdots , |{\mathcal {L}}|\}$ of service levels (or called priority levels), and a set ${\mathcal {P}}=\{p_1,\cdots , p_{|{\mathcal {L}}|}\}$ of real numbers no less than 1, where each $x\in {\mathcal {C}}\cup {\mathcal {F}}$ is located in a metric space and associated with a service level $\ell (x)\in {\mathcal {L}}$. A feasible solution $({\mathcal {D}},\varphi )$ to this instance opens a subset ${\mathcal {D}}\subseteq {\mathcal {F}}$ of facilities satisfying $|\{f\in {\mathcal {D}}:\ell (f)=i\}|\leqslant p_i$ for each $i\in {\mathcal {L}}$ and connects each client $c\in {\mathcal {C}}$ to an opened facility $\varphi (c)\in {\mathcal {D}}$ satisfying $\ell (c)\leqslant \ell (\varphi (c))$. The cost of such a solution is $\sum _{c\in {\mathcal {C}}}\delta (c,\varphi (c))$, where $\delta (c,\varphi (c))$ denotes the distance of c from $\varphi (c)$, and the Budgeted Priority p-Median problem aims to identify a feasible solution with minimal cost.

Despite its important role in applications involving different service levels, the Budgeted Priority p-Median problem is far less understood from the theoretical point of view compared with the standard p-Median problem (the latter is a special case of the Budgeted Priority p-Median problem where all the clients and facilities are associated with the same service level). Kumar and Sabharwal [5] showed that the natural linear programming relaxation of the Budgeted Priority p-Median problem has an unbounded integrality gap if the number of service levels is more than four, and gave a constant-factor approximation algorithm for instances with no more than two service levels. Without such a limitation on the number of service levels, whether the Budgeted Priority p-Median problem admits an O(1)-approximation algorithm still remains elusive.

In most real-world applications of facility location-type problems, the facilities providing services are much less than the clients with service requirements. Thus, taking the number of opened facilities as a fixed parameter is a feasible way for relaxing these problems in practice, as done in [6,7,8]. However, the Budgeted Priority p-Median problem has turned out to be still difficult when the maximal number of opened facilities (denoted by p) is small: Reductions to the Set Covering problem given by Guha and Khuller [1] indicate that the Budgeted Priority p-Median problem, even for the case where the service levels associated with the clients and facilities are uniform, is W[2]-hard if parameterized by p; Cohen-Addad et al. [2] showed that the approximation ratio of any FPT(p)-time (i.e., $g(p)n^{O(1)}$ time for an instance of size n and a positive function g) algorithm for the problem cannot be better than $(1+2\textrm{e}^{-1})$ under the Gap-Exponential Time Hypothesis. On the positive side, Feng et al. [9] gave a $(3+\varepsilon )$-approximation algorithm for the Budgeted Priority p-Median problem running in $(p\varepsilon ^{-1})^{O(p)}n^{O(1)}$ time based on a greedy sampling approach, which significantly improves the performance guarantees of polynomial-time algorithms.

1.1 Our Results

The negative result given by Cohen-Addad et al. [2] implies that we are unlikely to approximate the Budgeted Priority p-Median problem with a ratio better than $1+2\textrm{e}^{-1}$ in FPT time if taking p as the fixed parameter, but we can still hope to near-exactly solve the problem in Euclidean spaces using FPT-time algorithms since such a lower bound only holds in general metrics. Indeed, for continuous Euclidean instances where the services have only a single level and each facility can be opened at an arbitrary location of the space, a series of sampling-based $(1+\varepsilon )$-approximation algorithms running in FPT time are known for the Budgeted Priority p-Median problem [10,11,12]. Unfortunately, without such limitations on the Euclidean instances, the current best FPT-time approximation guarantee for the problem is still the ratio of $(3+\varepsilon )$ given in [9], and whether the problem can be near-exactly solved in FPT time remains an open question. The main result in this paper affirmatively answers this question.

Theorem 1

Given a constant $\varepsilon \in (0,1)$ and an instance $({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )$ of the Budgeted Priority p-Median problem with ${\mathcal {C}}\cup {\mathcal {F}}\subset {\mathbb {R}}^d$, there is a $(1+\varepsilon )$-approximation algorithm running in $O(nd\log n)+(p\varepsilon ^{-1})^{p\varepsilon ^{-O(1)}}n^{O(1)}$ time, where $n=|{\mathcal {C}}|+|{\mathcal {F}}|$ and $p=\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor $.

1.2 Our Techniques

As mentioned above, an Euclidean instance $({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )$ of the Budgeted Priority p-Median problem involving only a single service level (i.e., $|{\mathcal {L}}|=|{\mathcal {P}}|=1$) has several known FPT-time $(1+\varepsilon )$-approximation algorithms [10,11,12]. The main challenge encountered when removing such a limitation on the instance lies in the significantly increased difficulty of constructing feasible solutions: We need to simultaneously satisfy the different service requirements of the clients and the budgeted constraints defined by ${\mathcal {P}}$. This invalidates existing algorithms in Euclidean spaces.

In this paper, we deal with the Budgeted Priority p-Median problem based on a quite different data reduction-based method. Given an Euclidean instance of the Budgeted Priority p-Median problem, we show that a combination of an adapted coreset-construction approach and a Johnson–Lindenstrauss transform-type method can significantly reduce its size and map it into a low-dimensional Euclidean space. Based on such a reduced instance, we estimate the locations of the facilities opened in optimal solutions and in turn construct a small solution search space by exploring the properties of the Euclidean metrics. It is shown that enumerating over this search space yields the desired $(1+\varepsilon )$-approximation solution in FPT time.

1.3 Preliminaries

Let $\varepsilon $ denote a constant from (0, 1) and ${\mathcal {I}}=({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )$ be an instance of the Budgeted Priority p-Median problem satisfying $\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor =p$, $|{\mathcal {C}}|+|{\mathcal {F}}|=n$, and ${\mathcal {C}}\cup {\mathcal {F}}\subset {\mathbb {R}}^d$. Define $[t]=\{1,\cdots ,t\}$ for each integer $t\geqslant 1$. For each $i\in {\mathcal {L}}$, let ${\mathcal {C}}_i=\{c\in {\mathcal {C}}:\ell (c)=i\}$ and ${\mathcal {F}}_i=\{f\in {\mathcal {F}}:\ell (f)=i\}$, and define ${\mathcal {F}}^+_{i}=\bigcup _{j=i}^{|{\mathcal {L}}|}{\mathcal {F}}_j$. Let $({\mathcal {D}}^*,\varphi ^*)$ denote an optimal solution to ${\mathcal {I}}$. Without loss of generality, we can assume that $|{\mathcal {D}}^*|=p$ and let ${\mathcal {D}}^*=\{f^*_1,\cdots ,f^*_p\}$. Let $\ell ^*_i=\ell (f^*_i)$ for each $i\in [p]$. Given two data points x, y and a set ${\mathcal {X}}$ located in an Euclidean space, define $\delta (x,y)$ as the distance of x from y, and let $\delta ({\mathcal {X}},y)=\sum _{z\in {\mathcal {X}}}\delta (z,y)$ and $\delta (y,{\mathcal {X}})=\min _{z\in {\mathcal {X}}}\delta (y,z)$. Let $\textrm{opt}=\sum _{c\in {\mathcal {C}}}\delta (c,\varphi ^*(c))$ denote the cost of an optimal solution to ${\mathcal {I}}$.

The running time of our algorithm is analyzed based on the following algebraic fact.

Lemma 1

We have $\log ^j i\leqslant \max \{i,j^{O(j)}\}$ for each $i,j>1$.

Proof

If $j\geqslant \frac{\log i}{\log \log i}$, then we have $\log i\leqslant j^{O(1)}$ and hence $\log ^j i\leqslant j^{O(j)}$. If $j<\frac{\log i}{\log \log i}$, then we have $\log ^j i<\log ^{\frac{\log i}{\log \log i}}i=i$. Thus, Lemma 1 is true.

Our algorithm constructs nets for the facilities to identify a small solution search space. Such nets can be formally defined as follows.

Definition 2

($\lambda $-net [13]) Given a set ${\mathcal {X}}\subset {\mathbb {R}}^d$ and a real number $\lambda >0$, a subset ${\mathcal {X}}'\subseteq {\mathcal {X}}$ is called a $\lambda $-net of ${\mathcal {X}}$ if $\delta (x,{\mathcal {X}}'\backslash \{x\})>\lambda $ for each $x\in {\mathcal {X}}'$ and $\delta (x,{\mathcal {X}}')<\lambda $ for each $x\in {\mathcal {X}}$.

Har-Peled and Mendel [14] showed that a small-size net of a given set located in a low-dimensional Euclidean space can be constructed in near-linear time.

Lemma 2

([14]) Given a set ${\mathcal {X}}\subset {\mathbb {R}}^d$ and a real number $\lambda >0$, one can construct a $\lambda $-net of ${\mathcal {X}}$ of size no more than $\min \{(\lambda ^{-1}\max _{x,y\in {\mathcal {X}}}\delta (x,y))^d, |{\mathcal {X}}|\}$ in $|{\mathcal {X}}|\log |{\mathcal {X}}|2^{O(d)}$ time.

2 Data Reduction

In this section we reduce the size of instance ${\mathcal {I}}$ based on the stronger version of Johnson–Lindenstrauss transform [15] given by Narayanan and Nelson [16] and the coreset-construction approach given by Chen [11]. Lemma 3 and Lemma 4 are the guarantees of these two approaches.

Lemma 3

([16]) Given a set ${\mathcal {X}}\subset {\mathbb {R}}^d$ and a real number $\varepsilon \in (0,1)$, a mapping $h: {\mathbb {R}}^{d}\rightarrow {\mathbb {R}}^{{\tilde{d}}}$ satisfying ${\tilde{d}}=O(\varepsilon ^{-2}\log |{\mathcal {X}}|)$ and $\delta (h(x),h(y))\in [1,1+\varepsilon ]\delta (x,y)$ for each $x\in {\mathcal {X}}$ and $y\in {\mathbb {R}}^d$ can be constructed in $O(|{\mathcal {X}}|d\log |{\mathcal {X}}|)$ time.

We can assume that the mapping $h: {\mathcal {X}}'\rightarrow {\mathbb {R}}^{{\tilde{d}}}$ constructed by Lemma 3 is injective for each finite set ${\mathcal {X}}'\subset {\mathbb {R}}^d$. This is without loss of generality since we can make copies of the points from ${\mathbb {R}}^{{\tilde{d}}}$ that have multiple inverse images under h. Such an assumption means that we need to differentiate h(x) and h(y) for each two distinct points $x,y\in {\mathbb {R}}^d$, even for the case where h(x) and h(y) share the same value in each dimension (differentiating h(x) and h(y) is necessary since two points with the same dimension values can be associated with different service levels).

Lemma 4

([11]) Given a set ${\mathcal {X}}\subset {\mathbb {R}}^d$, a real number $\varepsilon \in (0,1)$, and an integer $k>0$, one can construct a weighted subset $\tilde{{\mathcal {X}}}\subseteq {\mathcal {X}}$ with a weight function $w:\tilde{{\mathcal {X}}}\rightarrow [1,+\infty )$ satisfying $\sum _{c\in \tilde{{\mathcal {X}}}}w(c)=|{\mathcal {X}}|$ in $O(|{\mathcal {X}}|\mathrm{{d}} k)$ time, such that $|\tilde{{\mathcal {X}}}|\leqslant \mathrm{{d}}(k\varepsilon ^{-1}\log |{\mathcal {X}}|)^{O(1)}$ and $\sum _{c\in \tilde{{\mathcal {X}}}}w(c)\delta (c,{\mathcal {D}})\in [1-\varepsilon ,1+\varepsilon ]\sum _{c\in {\mathcal {X}}}\delta (c,{\mathcal {D}})$ for each ${\mathcal {D}}\subset {\mathbb {R}}^d$ with $|{\mathcal {D}}|\leqslant k$.

The following corollary of Lemma 4 says that the coreset-construction approach given in [11] can be adapted to the Budgeted Priority p-Median problem.

Corollary 1

Given a real number $\varepsilon \in (0,1)$ and an instance ${\mathcal {I}}=({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )$ of the Budgeted Priority p-Median problem with $\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor =p$ and ${\mathcal {C}}\cup {\mathcal {F}}\subset {\mathbb {R}}^d$, one can construct a weighted subset $\tilde{{\mathcal {C}}}\subseteq {\mathcal {C}}$ with a weight function $w:\tilde{{\mathcal {C}}}\rightarrow [1,+\infty )$ satisfying $\sum _{c\in \tilde{{\mathcal {C}}}}w(c)=|{\mathcal {C}}|$ in $O(|{\mathcal {C}}|\mathrm{{d}} p)$ time, such that $|\tilde{{\mathcal {C}}}|\leqslant \mathrm{{d}}(p\varepsilon ^{-1}\log |{\mathcal {C}}|)^{O(1)}$ and $\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})\in [1-\varepsilon ,1+\varepsilon ]\sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})$ for each ${\mathcal {D}}\subseteq {\mathcal {F}}$ with $|{\mathcal {D}}|\leqslant p$.

Proof

For each $i\in {\mathcal {L}}$, denote by $\tilde{{\mathcal {C}}}_i$ the weighted subset of ${\mathcal {C}}_i$ and $w: \tilde{{\mathcal {C}}}_i\rightarrow [1,+\infty )$ the corresponding weight function constructed by Lemma 4 when taking ${\mathcal {C}}_i$, $\varepsilon $, and p as inputs. We have $\ell (c)=i$ for each $c\in \tilde{{\mathcal {C}}}_i$ due to the definition of ${\mathcal {C}}_i$ and the fact that $\tilde{{\mathcal {C}}}_i\subseteq {\mathcal {C}}_i$. Let $\tilde{{\mathcal {C}}}=\bigcup _{i=1}^{|{\mathcal {L}}|}\tilde{{\mathcal {C}}}_i$. Lemma 4 implies that

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}}w(c) =\sum _{i=1}^{|{\mathcal {L}}|}\sum _{c\in \tilde{{\mathcal {C}}}_i}w(c) =\sum _{i=1}^{|{\mathcal {L}}|}|{\mathcal {C}}_i|=|{\mathcal {C}}| \end{aligned}$$

(1)

and

$$\begin{aligned} |\tilde{{\mathcal {C}}}|&= \sum _{i=1}^{|{\mathcal {L}}|}|\tilde{{\mathcal {C}}}_i|\leqslant \sum _{i=1}^{|{\mathcal {L}}|} \mathrm{{d}}(p\varepsilon ^{-1}\log |{\mathcal {C}}_i|)^{O(1)} \leqslant \mathrm{{d}}(p\varepsilon ^{-1}\log |{\mathcal {C}}|)^{O(1)}. \end{aligned}$$

(2)

Also by Lemma 4, we know that constructing $\tilde{{\mathcal {C}}}$ and the corresponding weight function takes

$$\begin{aligned} O \left( \sum _{i=1}^{|{\mathcal {L}}|} |{\mathcal {C}}_i|\mathrm{{d}}p \right) = O(|{\mathcal {C}}|\mathrm{{d}}p) \end{aligned}$$

(3)

time in total. Moreover, it is the case that

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}_i}w(c)\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})&=\sum _{c\in \tilde{{\mathcal {C}}}_i}w(c)\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{i})\nonumber \\&\in [1-\varepsilon ,1+\varepsilon ]\sum _{c\in {\mathcal {C}}_i}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{i})\nonumber \\&= [1-\varepsilon ,1+\varepsilon ]\sum _{c\in {\mathcal {C}}_i}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}) \end{aligned}$$

(4)

for each $i\in {\mathcal {L}}$ and ${\mathcal {D}}\subseteq {\mathcal {F}}$ with $|{\mathcal {D}}|\leqslant p$, where the first and third steps are due to the fact that $\ell (c)=i$ for each $c\in \tilde{{\mathcal {C}}}_i$, and the second step follows from Lemma 4. Summing both sides of equality (4) over $i\in {\mathcal {L}}$, we know that each ${\mathcal {D}}\subseteq {\mathcal {F}}$ with $|{\mathcal {D}}|\leqslant p$ satisfies

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})&= \sum _{i=1}^{|{\mathcal {L}}|}\sum _{c\in \tilde{{\mathcal {C}}}_i}w(c)\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})\nonumber \\&\in [1-\varepsilon ,1+\varepsilon ]\sum _{i=1}^{|{\mathcal {L}}|}\sum _{c\in {\mathcal {C}}_i}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})\nonumber \\&= [1-\varepsilon ,1+\varepsilon ]\sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}), \end{aligned}$$

combining which with equality (1), inequality (2), and equality (3), we complete the proof of Corollary 1.

Based on Lemma 3 and Corollary 1, we obtain the following result about data reduction for the Budgeted Priority p-Median problem.

Lemma 5

Given a real number $\varepsilon \in (0,1)$ and an instance ${\mathcal {I}}=({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )$ of the Budgeted Priority p-Median problem satisfying $\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor =p$ and ${\mathcal {C}}\cup {\mathcal {F}}\subset {\mathbb {R}}^d$, one can construct a mapping $h: {\mathbb {R}}^d\rightarrow {\mathbb {R}}^{{\tilde{d}}}$ and a weighted set $\tilde{{\mathcal {C}}}\subset {\mathbb {R}}^{{\tilde{d}}}$ with a weight function $w: \tilde{{\mathcal {C}}}\rightarrow [1,+\infty )$ satisfying $\sum _{c\in \tilde{{\mathcal {C}}}}w(c)=|{\mathcal {C}}|$ and a level function $\ell : \tilde{{\mathcal {C}}}\rightarrow {\mathcal {L}}$ in $O(|{\mathcal {C}}|\mathrm{{d}}\log |{\mathcal {C}}|+(p \varepsilon ^{-1}|{\mathcal {C}}|)^{O(1)})$ time, such that ${\tilde{d}}=\varepsilon ^{-O(1)}(\log p+\log \log |{\mathcal {C}}|)$, $|\tilde{{\mathcal {C}}}|\leqslant (p\varepsilon ^{-1}\log |{\mathcal {C}}|)^{O(1)}$, and $\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\{h(f): f\in {\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}\})\in [1-\varepsilon ,1+O(\varepsilon )]\sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})$ for each ${\mathcal {D}}\subseteq {\mathcal {F}}$ with $|{\mathcal {D}}|\leqslant p$.

Proof

Denote by $h_1: {\mathbb {R}}^{d}\rightarrow {\mathbb {R}}^{d'}$ the mapping constructed by Lemma 3 when taking $\varepsilon $ and ${\mathcal {C}}$ as inputs, and let $\ell (h_1(x))=\ell (x)$ for each $x\in {\mathcal {C}}\cup {\mathcal {F}}$. Denote by ${\mathcal {C}}'$ the weighted subset of $\{h_1(c): c\in {\mathcal {C}}\}$ constructed by Corollary 1 when taking $\varepsilon $ and $(\{h_1(c): c\in {\mathcal {C}}\}, \{h_1(f): f\in {\mathcal {F}}\}, {\mathcal {L}}, {\mathcal {P}},\ell )$ as inputs, and let $w: {\mathcal {C}}'\rightarrow [1,+\infty )$ be the corresponding weight function. Let $h_2: {\mathbb {R}}^{d'}\rightarrow {\mathbb {R}}^{{\tilde{d}}}$ be the mapping constructed by Lemma 3 when taking ${\mathcal {C}}'$ and $\varepsilon $ as inputs. Define $\tilde{{\mathcal {C}}}=\{h_2(c):c\in {\mathcal {C}}'\}$. Let $h(x)=h_2(h_1(x))$ for each $x\in {\mathbb {R}}^{d}$. Let $w(c)=w(h_2^{-1}(c))$ and $\ell (c)=\ell (h_2^{-1}(c))$ for each $c\in \tilde{{\mathcal {C}}}$. The above process for data reduction is illustrated in Fig. 1.

Observe that

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}}w(c) =\sum _{c\in \tilde{{\mathcal {C}}}}w(h_2^{-1}(c)) =\sum _{c\in {\mathcal {C}}'}w(c) =|\{h_1(c):c\in {\mathcal {C}}\}|=|{\mathcal {C}}|, \end{aligned}$$

(5)

where the second and fourth steps are due to the assumption that the mappings constructed by Lemma 3 are injective, and the third step is due to Corollary 1. Moreover, Lemma 3 and Corollary 1 imply that

$$\begin{aligned} d'=O(\varepsilon ^{-2}\log |{\mathcal {C}}|) \end{aligned}$$

and

$$\begin{aligned} |\tilde{{\mathcal {C}}}|= |{\mathcal {C}}'|\leqslant d' (p\varepsilon ^{-1}\log |{\mathcal {C}}|)^{O(1)} = (p\varepsilon ^{-1}\log |{\mathcal {C}}|)^{O(1)}, \end{aligned}$$

(6)

combining which with Lemma 3 and Corollary 1, we know that constructing $\tilde{{\mathcal {C}}}$ and $h: {\mathbb {R}}^d\rightarrow {\mathbb {R}}^{{\tilde{d}}}$ takes

$$\begin{aligned} O(|{\mathcal {C}}|\mathrm{{d}}\log |{\mathcal {C}}|+|{\mathcal {C}}|d' p+|{\mathcal {C}}'|d' \log |{\mathcal {C}}'|)&\leqslant O(|{\mathcal {C}}|\mathrm{{d}}\log |{\mathcal {C}}|+(p \varepsilon ^{-1} |{\mathcal {C}}|)^{O(1)}) \end{aligned}$$

(7)

time in total, and

$$\begin{aligned} {\tilde{d}}&=O(\varepsilon ^{-2}\log |{\mathcal {C}}'|) \leqslant O(\varepsilon ^{-2}\log (p\varepsilon ^{-1}\log |{\mathcal {C}}|)) = \varepsilon ^{-O(1)}(\log p +\log \log |{\mathcal {C}}|). \end{aligned}$$

(8)

Considering a set ${\mathcal {D}}\subseteq {\mathcal {F}}$ with $|{\mathcal {D}}|\leqslant p$, we have

$$\begin{aligned}&\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\{h(f): f\in {\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}\})\nonumber \\&\;\;\;\quad \quad \in [1,1+\varepsilon ] \sum _{c\in {\mathcal {C}}'}w(c)\delta (c,\{h_1(f): f\in {\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}\})\nonumber \\&\;\;\;\quad \quad \in [1-\varepsilon ,1+O(\varepsilon )]\sum _{c\in {\mathcal {C}}}\delta (h_1(c),\{h_1(f): f\in {\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}\})\nonumber \\&\;\;\;\quad \quad \in [1-\varepsilon ,1+O(\varepsilon )]\sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}), \end{aligned}$$

(9)

where the first and third steps follow from Lemma 3, and the second step is due to Corollary 1. Combining formula (9) with equality (5), inequality (6), inequality (7), and inequality (8), we complete the proof of Lemma 5.

3 Our Algorithm

In this section we show how the desired $(1+\varepsilon )$-approximation solution can be constructed based on the reduced instance given by Lemma 5. We first introduce some notations. Let $h: {\mathbb {R}}^d\rightarrow {\mathbb {R}}^{{\tilde{d}}}$ and $\tilde{{\mathcal {C}}}\subset {\mathbb {R}}^{{\tilde{d}}}$ be the mapping and weighted set given by Lemma 5 when taking $\varepsilon $ and ${\mathcal {I}}$ as inputs, and let $w: \tilde{{\mathcal {C}}}\rightarrow [1,+\infty )$ and $\ell : \tilde{{\mathcal {C}}}\rightarrow {\mathcal {L}}$ denote the corresponding weight function and level function, respectively. As done in Sect. 2, we can make $h: {\mathcal {F}}\rightarrow {\mathbb {R}}^{{\tilde{d}}}$ injective by copying the points from ${\mathbb {R}}^{{\tilde{d}}}$ that have multiple inverse images. For each $f\in {\mathcal {F}}$, let $\ell (h(f))=\ell (f)$. Define $\tilde{{\mathcal {F}}}=\{h(f):f\in {\mathcal {F}}\}$ and $\tilde{{\mathcal {D}}}^*=\{h(f): f\in {\mathcal {D}}^*\}$. Let $\tilde{{\mathcal {F}}}_{i}=\{h(f): f\in {\mathcal {F}}_{i}\}$ and $\tilde{{\mathcal {F}}}^+_{i}=\{h(f): f\in {\mathcal {F}}^+_{i}\}$ for each $i\in {\mathcal {L}}$. Define $\delta _{\max }=\max _{c\in \tilde{{\mathcal {C}}}}\delta (c,\tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})$. The following lemma provides a way of estimating the cost of an optimal solution to ${\mathcal {I}}$.

Lemma 6

$\delta _{\max }\leqslant \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\leqslant (1+O(\varepsilon ))\mathrm{{opt}}.$

Proof

Observe that

$$\begin{aligned} \delta _{\max }&\leqslant \max _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\nonumber \\&\leqslant \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\nonumber \\&= \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\{h(f): f\in {\mathcal {D}}^*\cap {\mathcal {F}}^+_{\ell (c)}\})\nonumber \\&\leqslant (1+O(\varepsilon ))\sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}^*\cap {\mathcal {F}}^+_{\ell (c)})\nonumber \\&= (1+O(\varepsilon ))\mathrm{{opt}}, \end{aligned}$$

where the first step follows from the fact that $w(c)\geqslant 1$ for each $c\in \tilde{{\mathcal {C}}}$ and the definition of $\delta _{\max }$, the fourth step is due to Lemma 5, and the fifth step follows from the fact that $({\mathcal {D}}^*,\varphi ^*)$ is an optimal solution to ${\mathcal {I}}$. This completes the proof of Lemma 6.

3.1 The Partition Step

In this section, we estimate the locations of the facilities from $\tilde{{\mathcal {D}}}^*$ based on their neighboring clients. For each $i\in [p]$, let $c_i$ denote the client from $\{c\in \tilde{{\mathcal {C}}}:\ell (c)\leqslant \ell ^*_i\}$ nearest to $h(f^*_i)$, and define ${\mathcal {B}}_i(\alpha )=\{f\in \tilde{{\mathcal {F}}}: \delta (c_i,f)\leqslant \alpha \}$ for each $\alpha >0$. We partition the facilities from $\tilde{{\mathcal {F}}}$ into a set of ring cells centered at the clients from $\{c_1,\cdots ,c_p\}$: Let ${\mathcal {Q}}(i,0)={\mathcal {B}}_i(\varepsilon \delta _{\max }n^{-1})$ and ${\mathcal {Q}}(i,j)={\mathcal {B}}_i(\varepsilon (1+\varepsilon )^{j}\delta _{\max }n^{-1})\backslash {\mathcal {B}}_i(\varepsilon (1+\varepsilon )^{j-1}\delta _{\max }n^{-1})$ for each $i\in [p]$ and $j\in [\lceil \varepsilon ^{-2}\log n\rceil ]$, as shown in Fig. 2a. We have

$$\begin{aligned} \varepsilon (1+\varepsilon )^{\lceil \varepsilon ^{-2}\log n\rceil }\delta _{\max }n^{-1}&> \delta _{\max } = \max _{c\in \tilde{{\mathcal {C}}}}\delta (c,\tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c)}) \geqslant \delta (c_i,h(f^*_i)) \end{aligned}$$

for each $i\in [p]$, where the second step follows from the definition of $\delta _{\max }$, and the third step follows from the fact that $h(f^*_i)\in \tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c_i)}$ (due to the definition of $c_i$). Combining this inequality with the definition of ${\mathcal {Q}}(i,j)$, we know that there is an integer $j\in [0, \lceil \varepsilon ^{-2}\log n\rceil ]$ satisfying $h(f^*_i)\in {\mathcal {Q}}(i,j)$ for each $i\in [p]$. Denote by ${\mathcal {Q}}_i$ such a set ${\mathcal {Q}}(i,j)$ containing $h(f^*_i)$. Our idea for constructing solutions to ${\mathcal {I}}$ in FPT(p) time is to guess the sets ${\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p$, such that we can limit the search range of the opened facilities to the inverse images of the members of $\bigcup _{i=1}^{p}{\mathcal {Q}}_i$ under h. The following result implies that we can guess the sets ${\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p$ with a $(p\varepsilon ^{-1})^{O(p)}n^{O(1)}$ multiplicative overhead in the running time of our algorithm.

Lemma 7

Given the reduced instance $(\tilde{{\mathcal {C}}}, \tilde{{\mathcal {F}}}, {\mathcal {L}}, {\mathcal {P}}, \ell )$ of the Budgeted Priority p-Median problem, the set $\{{\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p\}$ can be guessed by enumerating over no more than $(p\varepsilon ^{-1})^{O(p)}n^{O(1)}$ items.

Proof

We have $|\tilde{{\mathcal {F}}}|=|{\mathcal {F}}|\leqslant n$ and $|\tilde{{\mathcal {C}}}|\leqslant (p\varepsilon ^{-1}\log n)^{O(1)}$ due to Lemma 5. Using the definitions of $\delta _{\max }$ and $c_i$, it can be seen that there are $|\tilde{{\mathcal {F}}}|\cdot |\tilde{{\mathcal {C}}}|\leqslant n(p\varepsilon ^{-1}\log n)^{O(1)}$ choices of the value of $\delta _{\max }$ and $|\tilde{{\mathcal {C}}}|^p\leqslant (p\varepsilon ^{-1}\log n)^{O(p)}$ choices of the set $\{c_1,\cdots ,c_p\}$. Consequently, the definition of ${\mathcal {Q}}(i,j)$ indicates that we have no more than $n(p\varepsilon ^{-1}\log n)^{O(p)}$ choices of the set $\{{\mathcal {Q}}(1,0),\cdots ,{\mathcal {Q}}(p,\lceil \varepsilon ^{-2}\log n\rceil )\}$. Combining this with the fact that each $i\in [p]$ satisfies ${\mathcal {Q}}_i\in \{{\mathcal {Q}}(i,0),\cdots ,{\mathcal {Q}}(i,\lceil \varepsilon ^{-2}\log n\rceil )\}$, the number of choices of the set $\{{\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p\}$ can be upper-bounded by $n(p\varepsilon ^{-1}\log n)^{O(p)}\cdot (\varepsilon ^{-2}\log n)^p$, which is no more than $(p\varepsilon ^{-1})^{O(p)}n^{O(1)}$ due to Lemma 1. Thus, Lemma 7 is true.

3.2 The Construction Step

Given the set $\{{\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p\}$, our algorithm constructs a net of ${\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j$ for each $i\in [p]$ and $j\in {\mathcal {L}}$ and considers the inverse images of the members of the net under h as candidates for the opened facilities (see Fig. 2b), as described in Algorithm 1. Define ${\mathcal {J}}$, ${\mathbb {D}}$, and ${\mathcal {N}}(i,j)$ in the same way as Algorithm 1 for each $i\in [p]$ and $j\in {\mathcal {L}}$. Let $({\mathcal {D}}^\dag ,\varphi ^\dag )$ denote the solution returned by the algorithm, and let $\tilde{{\mathcal {D}}}^\dag =\{h(f):f\in {\mathcal {D}}^\dag \}$. The following lemma implies that $({\mathcal {D}}^\dag ,\varphi ^\dag )$ can be constructed in FPT(p) time.

Lemma 8

We can obtain the solution $({\mathcal {D}}^\dag ,\varphi ^\dag )$ to ${\mathcal {I}}$ in no more than $O(nd\log n)+(p\varepsilon ^{-1})^{p\varepsilon ^{-O(1)}}n^{O(1)}$ time.

Proof

Define ${\mathcal {G}}=\{(i\in [p], j\in {\mathcal {L}}): |{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j|>1\}$. The definition of ${\mathcal {N}}(i,j)$ and Lemma 2 imply that

$$\begin{aligned} |{\mathcal {N}}(i,j)|&\leqslant \left( \frac{\max _{x,y\in {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j}\delta (x,y)}{\varepsilon \max _{x,y\in {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j}\delta (x,y)}\right) ^{{\tilde{d}}} =\varepsilon ^{-{\tilde{d}}} \leqslant (p\log n)^{\varepsilon ^{-O(1)}} \end{aligned}$$

(10)

for each $(i,j)\in {\mathcal {G}}$, where the third step follows from the fact that ${\tilde{d}}\leqslant \varepsilon ^{-O(1)}(\log p+\log \log n)$ (due to Lemma 5). Consequently, we know that

$$\begin{aligned} |{\mathcal {J}}|&=\sum _{(i,j)\in {\mathcal {G}}}|{\mathcal {N}}(i,j)|+ p|{\mathcal {L}}|- |{\mathcal {G}}|\nonumber \\&\leqslant |{\mathcal {L}}|(p\log n)^{\varepsilon ^{-O(1)}}\nonumber \\&\leqslant (p\log n)^{\varepsilon ^{-O(1)}}\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor \nonumber \\&= (p\log n)^{\varepsilon ^{-O(1)}}, \end{aligned}$$

(11)

where the first step follows from the definition of ${\mathcal {J}}$, the second step is due to inequality (10), the third step follows from the fact that $p_i\geqslant 1$ for each $i\in {\mathcal {L}}$, and the fourth step is derived from the fact that $p=\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor $.

Observe that Algorithm 1 constructs a net of ${\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j$ for each $(i,j)\in {\mathcal {G}}$ according to the maximal distance between the facilities from ${\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j$. Lemma 2 and the fact that ${\tilde{d}}\leqslant \varepsilon ^{-O(1)}(\log p+\log \log n)$ imply that this takes no more than

$$\begin{aligned} \sum _{(i,j)\in {\mathcal {G}}}2^{O({\tilde{d}})}|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j|^{O(1)}&\leqslant 2^{O({\tilde{d}})}n^{O(1)} \leqslant (p\log n)^{\varepsilon ^{-O(1)}}n^{O(1)} \end{aligned}$$

(12)

time in total. After this, Algorithm 1 enumerates all subsets $\tilde{{\mathcal {D}}}\subseteq {\mathcal {J}}$ with $|\tilde{{\mathcal {D}}}\cap \tilde{{\mathcal {F}}}_j|\leqslant p_i$, $\forall \, j\in {\mathcal {L}}$ and selects the one corresponding to the solution with minimal cost to the reduced instance $(\tilde{{\mathcal {C}}},\tilde{{\mathcal {F}}},{\mathcal {L}},{\mathcal {P}},\ell )$. Equality $\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor =p$ implies that each subset considered in this enumeration is of size no more than p, and we can complete this enumeration in

$$\begin{aligned} O(|{\mathcal {J}}|^p |\tilde{{\mathcal {C}}}|{\tilde{d}}p)&\leqslant (p\log n)^{p\varepsilon ^{-O(1)}}|\tilde{{\mathcal {C}}}|{\tilde{d}}p = (p\log n)^{p\varepsilon ^{-O(1)}} \end{aligned}$$

(13)

time, where the first step follows from inequality (11), and the second step follows from the fact that $|\tilde{{\mathcal {C}}}|\leqslant (p\varepsilon ^{-1}\log n)^{O(1)}$ and ${\tilde{d}}\leqslant \varepsilon ^{-O(1)}(\log p+\log \log n)$ (due to Lemma 5). Using inequality (12) and inequality (13), we know that Algorithm 1 runs in no more than $(p\log n)^{p\varepsilon ^{-O(1)}}n^{O(1)}$ time.

Lemma 5 implies that constructing the mapping $h:{\mathbb {R}}^d\rightarrow {\mathbb {R}}^{{\tilde{d}}}$ and the reduced instance $(\tilde{{\mathcal {C}}},\tilde{{\mathcal {F}}},{\mathcal {L}},{\mathcal {P}},\ell )$ takes $O(nd\log n+(p\varepsilon ^{-1}n)^{O(1)})$ time, and Lemma 7 says that there are at most $(p\varepsilon ^{-1})^{O(p)}n^{O(1)}$ choices of the set $\{{\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p\}$. Therefore, all inputs of Algorithm 1 can be known with a $(p\varepsilon ^{-1})^{O(p)}n^{O(1)}$ multiplicative and an $O(nd\log n+(p\varepsilon ^{-1}n)^{O(1)})$ additive overheads in the running time of the algorithm. Combining this with the fact that Algorithm 1 runs in at most $(p\log n)^{p\varepsilon ^{-O(1)}}n^{O(1)}$ time, we know that constructing $({\mathcal {D}}^\dag ,\varphi ^\dag )$ takes no more than $O(nd\log n)+(p\log n)^{p\varepsilon ^{-O(1)}}n^{O(1)}$ time, which can be upper-bounded by $O(nd\log n)+(p\varepsilon ^{-1})^{p\varepsilon ^{-O(1)}}n^{O(1)}$ due to Lemma 1, as desired.

We now consider the cost of $({\mathcal {D}}^\dag ,\varphi ^\dag )$. The following lemma says that it is a $(1+O(\varepsilon ))$-approximation solution to ${\mathcal {I}}$.

Lemma 9

$\sum _{c\in {\mathcal {C}}}\delta (c,\varphi ^\dag (c))\leqslant (1+O(\varepsilon ))\mathrm{{opt}}.$

Proof

The fact that $h(f^*_i)\in {\mathcal {Q}}_i$ and $\ell (h(f^*_i))=\ell (f^*_i)=\ell ^*_i$ implies that $h(f^*_i)\in {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}$ and thus $|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}|\geqslant 1$ for each $i\in [p]$. Given a real number $i\in [p]$, we, respectively, consider the following three cases: (1) $|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}|= 1$, (2) $|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}|>1$ and ${\mathcal {Q}}_i={\mathcal {Q}}(i,0)$, and (3) $|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}|>1$ and ${\mathcal {Q}}_i\ne {\mathcal {Q}}(i,0)$.

For case (1), the facility from ${\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}$ is added to ${\mathcal {J}}$ by Algorithm 1, and the fact that $h(f^*_i)\in {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}$ indicates that

$$\begin{aligned} \{h(f^*_i)\}={\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}\subseteq {\mathcal {J}}. \end{aligned}$$

(14)

For the case where $|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}|>1$, it can be seen that Algorithm 1 adds the facilities from a net ${\mathcal {N}}(i,\ell ^*_i)$ of ${\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}$ to ${\mathcal {J}}$, and the fact that ${\mathcal {N}}(i,\ell ^*_i)\subseteq {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}$ implies that $({\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i})\cap {\mathcal {J}}\ne \varnothing $. For case (2), let $f'_i\in {\mathcal {F}}$ be a facility satisfying $h(f'_i)\in ({\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i})\cap {\mathcal {J}}$. We have

$$\begin{aligned} \delta (h(f'_i),h(f^*_i))&\leqslant \delta (c_i,h(f'_i))+\delta (c_i,h(f^*_i))\nonumber \\&\leqslant 2\max _{f\in {\mathcal {Q}}_i}\delta (c_i,f)\nonumber \\&\leqslant \frac{2\varepsilon }{n}\delta _{\max }\nonumber \\&\leqslant \frac{2\varepsilon }{n}(1+O(\varepsilon ))\hbox {opt}, \end{aligned}$$

(15)

where the first step is due to triangle inequality, the second step follows from the fact that $\{h(f'_i),h(f^*_i)\}\subseteq {\mathcal {Q}}_i$, the third step is due to the assumption that ${\mathcal {Q}}_i={\mathcal {Q}}(i,0)$ and the definition of ${\mathcal {Q}}(i,0)$, and the fourth step follows from Lemma 6.

For case (3), let $h(f'_i)$ be the facility from ${\mathcal {N}}(i,\ell ^*_i)$ nearest to $h(f^*_i)$. We have $h(f'_i)\in {\mathcal {N}}(i,\ell ^*_i)\subseteq {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}$, and

$$\begin{aligned} \delta (h(f'_i),h(f^*_i))&\leqslant \varepsilon \max _{x,y\in {\mathcal {Q}}_i}\delta (x,y)\nonumber \\&\leqslant 2\varepsilon \max _{f\in {\mathcal {Q}}_i}\delta (c_i,f)\nonumber \\&\leqslant 2\varepsilon (1+\varepsilon )\delta (c_i,h(f^*_i))\nonumber \\&= O(\varepsilon )\delta (c_i,h(f^*_i)), \end{aligned}$$

(16)

where the first step follows from the fact that $h(f'_i)$ is the nearest facility to $h(f^*_i)$ in a $\max _{x,y\in {\mathcal {Q}}_i}\varepsilon \delta (x,y)$-net of ${\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}$, the second step follows from triangle inequality, and the third step is due to the assumption that ${\mathcal {Q}}_i\ne {\mathcal {Q}}(i,0)$ (which says that there exists a real number $j\in [\lceil \varepsilon ^{-2}\log n\rceil ]$ satisfying ${\mathcal {Q}}_i={\mathcal {Q}}(i,j)$) and the definition of ${\mathcal {Q}}(i,j)$ for $j>0$.

The argument above implies the existence of a set ${\mathcal {D}}'=\{f'_1,\cdots ,f'_p\}\subseteq {\mathcal {F}}$ with $h(f'_i)\in {\mathcal {J}}\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}$ for each $i\in [p]$, which satisfies

$$\begin{aligned} \delta (h(f'_i),h(f^*_i)) \leqslant \frac{2\varepsilon }{n}(1+O(\varepsilon ))\hbox {opt} + O(\varepsilon )\delta (c_i,h(f^*_i)) \end{aligned}$$

(17)

for each $i\in [p]$ due to equality (14), inequality (15), and inequality (16). Define $\tilde{{\mathcal {D}}}'=\{h(f):f\in {\mathcal {D}}'\}$. The fact that $h(f'_i)\in \tilde{{\mathcal {F}}}_{\ell ^*_i}$ for each $i\in [p]$ and $({\mathcal {D}}^*,\varphi ^*)$ is a feasible solution to ${\mathcal {I}}$ implies that inequality $|\tilde{{\mathcal {D}}}'\cap \tilde{{\mathcal {F}}}_j|=|{\mathcal {D}}^*\cap {\mathcal {F}}_j|\leqslant p_j$ holds for each $j\in {\mathcal {L}}$. Combining this with the definition of ${\mathbb {D}}$, we know that $\tilde{{\mathcal {D}}}'\in {\mathbb {D}}$.

For each $i\in [p]$, define $\tilde{{\mathcal {C}}}^*_i=\{c\in \tilde{{\mathcal {C}}}:\arg \min _{f\in \tilde{{\mathcal {D}}}^*\cap {\tilde{{\mathcal {F}}}^+_{\ell (c)}}}\delta (c,f)=h(f^*_i)\}$. It is the case that

$$\begin{aligned} \sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)\delta (h(f'_i),h(f^*_i))&\leqslant \sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c) \left( \frac{2\varepsilon }{n}(1+O(\varepsilon ))\mathrm{{opt}} + O(\varepsilon )\delta (c_i,h(f^*_i))\right) \nonumber \\&\leqslant O(\varepsilon )\mathrm{{opt}} +O(\varepsilon )\sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)\delta (c_i,h(f^*_i))\nonumber \\&\leqslant O(\varepsilon )\mathrm{{opt}} +O(\varepsilon )\sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)\delta (c,h(f^*_i))\nonumber \\&= O(\varepsilon )\mathrm{{opt}} +O(\varepsilon )\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^*\cap {\tilde{{\mathcal {F}}}^+_{\ell (c)}})\nonumber \\&\leqslant O(\varepsilon )\mathrm{{opt}}, \end{aligned}$$

(18)

where the first step follows from inequality (17), the second step is derived from the fact that $\sum _{c\in \tilde{{\mathcal {C}}}}w(c)=|{\mathcal {C}}|<n$ (due to Lemma 5), the third step follows from the fact that $c_i$ is the client from $\{c\in \tilde{{\mathcal {C}}}: \ell (c)\leqslant \ell ^*_i\}$ nearest to $h(f^*_i)$ for each $i\in [p]$, the fourth step is due to the definition of $\tilde{{\mathcal {C}}}^*_i$, and the fifth step is derived from Lemma 6. Consequently, we have

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}'\cap \tilde{{\mathcal {F}}}^+_{\ell (c)}) =&\sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)\delta (c,\tilde{{\mathcal {D}}}'\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\nonumber \\&\leqslant \sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)(\delta (c,h(f^*_i))+\delta (h(f^*_i),\tilde{{\mathcal {D}}}'\cap \tilde{{\mathcal {F}}}^+_{\ell (c)}))\nonumber \\&\leqslant \sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)(\delta (c,h(f^*_i))+\delta (h(f^*_i),h(f'_i)))\nonumber \\ =&\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^*\cap {\tilde{{\mathcal {F}}}^+_{\ell (c)}})\nonumber \\&+ \sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)\delta (h(f^*_i),h(f'_i))\nonumber \\ =&(1+O(\varepsilon ))\mathrm{{opt}}, \end{aligned}$$

(19)

where the second step is due to triangle inequality, the third step follows from the fact that $\ell (c)\leqslant \ell ^*_i$ and $h(f'_i)\in \tilde{{\mathcal {F}}}_{\ell ^*_i}\subseteq \tilde{{\mathcal {F}}}^+_{\ell (c)}$ for each $c\in \tilde{{\mathcal {C}}}^*_i$ and $i\in [p]$, the fourth step follows from the definition of $\tilde{{\mathcal {C}}}^*_i$, and the fifth step is due to Lemma 6 and inequality (18).

Combining the fact that $\tilde{{\mathcal {D}}}'\in {\mathbb {D}}$ and $\tilde{{\mathcal {D}}}^\dag = \arg \min _{\tilde{{\mathcal {D}}}\in {\mathbb {D}}}\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})$ with inequality (19) yields

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^\dag \cap \tilde{{\mathcal {F}}}^+_{\ell (c)})&\leqslant \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}'\cap \tilde{{\mathcal {F}}}^+_{\ell (c)}) \leqslant (1+O(\varepsilon ))\mathrm{{opt}}. \end{aligned}$$

(20)

Therefore, we get

$$\begin{aligned} \sum _{c\in {\mathcal {C}}}\delta (c,\varphi ^\dag (c))&= \sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}^\dag \cap {\mathcal {F}}^+_{\ell (c)})\nonumber \\&\leqslant \frac{1}{1-\varepsilon }\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^\dag \cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\nonumber \\&\leqslant \frac{1+O(\varepsilon )}{1-\varepsilon }\hbox {opt}\nonumber \\&= (1+O(\varepsilon ))\hbox {opt}, \end{aligned}$$

where the first step is due to the definition of $\varphi ^\dag $, the second step follows from Lemma 5, and the third step is due to inequality (20). This completes the proof of Lemma 9.

Lemma 8 and Lemma 9 imply that a $(1+O(\varepsilon ))$-approximation solution to ${\mathcal {I}}$ can be constructed in $O(nd\log n)+(p\varepsilon ^{-1})^{p\varepsilon ^{-O(1)}}n^{O(1)}$ time, which in turn implies that Theorem 1 is true.

4 Conclusions

In this paper we consider the Budgeted Priority p-Median problem in high-dimensional Euclidean spaces. We give a $(1+O(\varepsilon ))$-approximation algorithm running in FPT(p) time. Our main technical contribution is a data reduction-based approach for constructing small solution search spaces, which is quite different from the commonly used sampling-based methods and we believe is of independent interest.

References

Guha, S., Khuller, S.: Greedy strikes back: improved facility location algorithms. J. Algorithm. 31(1), 228–248 (1999)
Article MathSciNet Google Scholar
Cohen-Addad V., Gupta A., Kumar A., Lee E., Li J.: Tight FPT approximations for $k$-median and $k$-means. In: The Proceedings of the 46th International Colloquium on Automata, Languages, and Programming, pp. 42 (2019)
Cohen-Addad V., Grandoni F., Lee E., Schwiegelshohn C.: Breaching the 2 LMP approximation barrier for facility location with applications to $k$-median. In: The Proceedings of the 34th ACM-SIAM Symposium on Discrete Algorithms, pp. 940–986 (2023)
Cohen-Addad V., Esfandiari H., Mirrokni V. S., Narayanan S.: Improved approximations for Euclidean $k$-means and $k$-median, via nested quasi-independent sets. In: The Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pp. 1621–1628 (2022)
Kumar A., Sabharwal Y.: The priority $k$-median problem. In: The Proceedings of the 27th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, pp. 71–83 (2007)
Goyal, D., Jaiswal, R.: Tight FPT approximation for socially fair clustering. Inf. Process. Lett. 182, 106383 (2023)
Article MathSciNet Google Scholar
Goyal, D., Jaiswal, R.: Tight FPT approximation for constrained $k$-center and $k$-supplier. Theor. Comput. Sci. 940, 190–208 (2023)
Article MathSciNet Google Scholar
Bandyapadhyay S., Lochet W., Saurabh S.: FPT constant-approximations for capacitated clustering to minimize the sum of cluster radii. In: The Proceedings of the 39th International Symposium on Computational Geometry, pp. 12 (2023)
Feng Q., Zhang Z., Huang Z., Xu J., Wang J.: A unified framework of FPT approximation algorithms for clustering problems. In: The Proceedings of the 31st International Symposium on Algorithms and Computation, pp. 5 (2020)
Kumar, A., Sabharwal, Y., Sen, S.: Linear-time approximation schemes for clustering problems in any dimensions. J. ACM 57(2), pp. 5 (2010)
Chen K.: On $k$-median clustering in high dimensions. In: The Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1177–1185 (2006)
Jaiswal, R., Kumar, A., Sen, S.: A simple $D^2$-sampling based PTAS for $k$-means and other clustering problems. Algorithmica 70(1), 22–46 (2014)
Article MathSciNet Google Scholar
Gupta, A., Krauthgamer, R., Lee, J. R.: Bounded geometries, fractals, and low-distortion embeddings. In: The Proceedings of the 44th IEEE Annual Symposium on Foundations of Computer Science, pp. 534–543 (2003)
Har-Peled, S., Mendel, M.: Fast construction of nets in low-dimensional metrics and their applications. SIAM J. Comput. 35(5), 1148–1184 (2006)
Article MathSciNet Google Scholar
Johnson, W.B., Lindenstrauss, J.: Extensions of Lipschitz mappings into a Hilbert space. Contemp. Math. 26, 189–206 (1984)
Article MathSciNet Google Scholar
Narayanan S., Nelson J.: Optimal terminal dimensionality reduction in Euclidean space. In: The Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pp. 1064–1069 (2019)

Download references

Author information

Authors and Affiliations

School of Advanced Interdisciplinary Studies, Hunan University of Technology and Business, Changsha, 410205, Hunan, China
Zhen Zhang, Zhi-Ping Tian, Li-Mei Liu & Xue-Song Xu
Xiangjiang Laboratory, Changsha, 410205, Hunan, China
Zhen Zhang, Zhi-Ping Tian, Li-Mei Liu, Xue-Song Xu & Qi-Long Feng
Department of Computer Science and Software Engineering, Penn State Erie, The Behrend College, Erie, PA, 16563, USA
Zi-Yun Huang
School of Computer Science and Engineering, Central South University, Changsha, 410012, Hunan, China
Qi-Long Feng

Authors

Zhen Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Zi-Yun Huang
View author publications
You can also search for this author in PubMed Google Scholar
Zhi-Ping Tian
View author publications
You can also search for this author in PubMed Google Scholar
Li-Mei Liu
View author publications
You can also search for this author in PubMed Google Scholar
Xue-Song Xu
View author publications
You can also search for this author in PubMed Google Scholar
Qi-Long Feng
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

L. -M. Liu, X. -S. Xu, and Q. -L. Feng designed the study and contributed the central idea. Z. Zhang and Z.-Y. Huang did the analyses and wrote the initial draft of the paper. All authors discussed the results and revised the manuscript.

Corresponding author

Correspondence to Li-Mei Liu.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

This work was supported by the National Natural Science Foundation of China (Nos. 62202161, 62172446, and 62376092), Natural Science Foundation of Hunan Province (No. 2023JJ40240), and Open Project of Xiangjiang Laboratory (Nos.22XJ02002 and 22XJ03005).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Zhang, Z., Huang, ZY., Tian, ZP. et al. On the Budgeted Priority p-Median Problem in High-Dimensional Euclidean Spaces. J. Oper. Res. Soc. China (2024). https://doi.org/10.1007/s40305-023-00533-w

Download citation

Received: 04 September 2023
Revised: 20 November 2023
Accepted: 24 November 2023
Published: 22 March 2024
DOI: https://doi.org/10.1007/s40305-023-00533-w

Keywords

Mathematics Subject Classification

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

On the Budgeted Priority p-Median Problem in High-Dimensional Euclidean Spaces

Abstract

Similar content being viewed by others

To Close Is Easier Than To Open: Dual Parameterization To k-Median

A PTAS for the Geometric Connected Facility Location Problem

Approximation Algorithms for the Lower-Bounded Knapsack Median Problem

1 Introduction

Definition 1

1.1 Our Results

Theorem 1

1.2 Our Techniques

1.3 Preliminaries

Lemma 1

Proof

Definition 2

Lemma 2

2 Data Reduction

Lemma 3

Lemma 4

Corollary 1

Proof

Lemma 5

Proof

3 Our Algorithm

Lemma 6

Proof

3.1 The Partition Step

Lemma 7

Proof

3.2 The Construction Step

Lemma 8

Proof

Lemma 9

Proof

4 Conclusions

References

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation