1 Introduction

Locating facilities to serve a given set of clients is a fundamental problem in the field of operational research. This problem becomes difficult when considering the following trade-off: Each client should be served by a neighboring facility, while we cannot open too many facilities.

The p-Median problem is one of the most commonly studied facility location-type problems, which balances the client-connection and facility-opening costs by limiting the number of opened facilities and minimizing the sum of distances of the clients from the corresponding facilities. It was known that the p-Median problem is NP-hard [1, 2], and considerable attention has been paid on developing approximation algorithms for it. The current best known approximation guarantees for the problem are the ratios of 2.67059 [3] and 2.406 [4] in general and Euclidean metrics, respectively.

The p-Median problem assumes that each client can be connected to an arbitrary opened facility, and thus, its algorithms fail to work in applications where clients differ in levels of required services. One such example is in the construction of supply chain networks, where the algorithms for the p-Median problem are not guaranteed to yield feasible solutions since they tend to minimize the supply chain cost without concerning the target service levels of the chain members. Motivated by such applications, Kumar and Sabharwal [5] introduced the Budgeted Priority p-Median problem, which can be formally defined as follows.

Definition 1

(Budgeted Priority p-Median [5]) An instance \(({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )\) of the Budgeted Priority p-Median problem is specified by a set \({\mathcal {C}}\) of clients, a set \({\mathcal {F}}\) of facilities, a set \({\mathcal {L}}=\{1,\cdots , |{\mathcal {L}}|\}\) of service levels (or called priority levels), and a set \({\mathcal {P}}=\{p_1,\cdots , p_{|{\mathcal {L}}|}\}\) of real numbers no less than 1, where each \(x\in {\mathcal {C}}\cup {\mathcal {F}}\) is located in a metric space and associated with a service level \(\ell (x)\in {\mathcal {L}}\). A feasible solution \(({\mathcal {D}},\varphi )\) to this instance opens a subset \({\mathcal {D}}\subseteq {\mathcal {F}}\) of facilities satisfying \(|\{f\in {\mathcal {D}}:\ell (f)=i\}|\leqslant p_i\) for each \(i\in {\mathcal {L}}\) and connects each client \(c\in {\mathcal {C}}\) to an opened facility \(\varphi (c)\in {\mathcal {D}}\) satisfying \(\ell (c)\leqslant \ell (\varphi (c))\). The cost of such a solution is \(\sum _{c\in {\mathcal {C}}}\delta (c,\varphi (c))\), where \(\delta (c,\varphi (c))\) denotes the distance of c from \(\varphi (c)\), and the Budgeted Priority p-Median problem aims to identify a feasible solution with minimal cost.

Despite its important role in applications involving different service levels, the Budgeted Priority p-Median problem is far less understood from the theoretical point of view compared with the standard p-Median problem (the latter is a special case of the Budgeted Priority p-Median problem where all the clients and facilities are associated with the same service level). Kumar and Sabharwal [5] showed that the natural linear programming relaxation of the Budgeted Priority p-Median problem has an unbounded integrality gap if the number of service levels is more than four, and gave a constant-factor approximation algorithm for instances with no more than two service levels. Without such a limitation on the number of service levels, whether the Budgeted Priority p-Median problem admits an O(1)-approximation algorithm still remains elusive.

In most real-world applications of facility location-type problems, the facilities providing services are much less than the clients with service requirements. Thus, taking the number of opened facilities as a fixed parameter is a feasible way for relaxing these problems in practice, as done in [6,7,8]. However, the Budgeted Priority p-Median problem has turned out to be still difficult when the maximal number of opened facilities (denoted by p) is small: Reductions to the Set Covering problem given by Guha and Khuller [1] indicate that the Budgeted Priority p-Median problem, even for the case where the service levels associated with the clients and facilities are uniform, is W[2]-hard if parameterized by p; Cohen-Addad et al. [2] showed that the approximation ratio of any FPT(p)-time (i.e., \(g(p)n^{O(1)}\) time for an instance of size n and a positive function g) algorithm for the problem cannot be better than \((1+2\textrm{e}^{-1})\) under the Gap-Exponential Time Hypothesis. On the positive side, Feng et al. [9] gave a \((3+\varepsilon )\)-approximation algorithm for the Budgeted Priority p-Median problem running in \((p\varepsilon ^{-1})^{O(p)}n^{O(1)}\) time based on a greedy sampling approach, which significantly improves the performance guarantees of polynomial-time algorithms.

1.1 Our Results

The negative result given by Cohen-Addad et al. [2] implies that we are unlikely to approximate the Budgeted Priority p-Median problem with a ratio better than \(1+2\textrm{e}^{-1}\) in FPT time if taking p as the fixed parameter, but we can still hope to near-exactly solve the problem in Euclidean spaces using FPT-time algorithms since such a lower bound only holds in general metrics. Indeed, for continuous Euclidean instances where the services have only a single level and each facility can be opened at an arbitrary location of the space, a series of sampling-based \((1+\varepsilon )\)-approximation algorithms running in FPT time are known for the Budgeted Priority p-Median problem [10,11,12]. Unfortunately, without such limitations on the Euclidean instances, the current best FPT-time approximation guarantee for the problem is still the ratio of \((3+\varepsilon )\) given in [9], and whether the problem can be near-exactly solved in FPT time remains an open question. The main result in this paper affirmatively answers this question.

Theorem 1

Given a constant \(\varepsilon \in (0,1)\) and an instance \(({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )\) of the Budgeted Priority p-Median problem with \({\mathcal {C}}\cup {\mathcal {F}}\subset {\mathbb {R}}^d\), there is a \((1+\varepsilon )\)-approximation algorithm running in \(O(nd\log n)+(p\varepsilon ^{-1})^{p\varepsilon ^{-O(1)}}n^{O(1)}\) time, where \(n=|{\mathcal {C}}|+|{\mathcal {F}}|\) and \(p=\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor \).

1.2 Our Techniques

As mentioned above, an Euclidean instance \(({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )\) of the Budgeted Priority p-Median problem involving only a single service level (i.e., \(|{\mathcal {L}}|=|{\mathcal {P}}|=1\)) has several known FPT-time \((1+\varepsilon )\)-approximation algorithms [10,11,12]. The main challenge encountered when removing such a limitation on the instance lies in the significantly increased difficulty of constructing feasible solutions: We need to simultaneously satisfy the different service requirements of the clients and the budgeted constraints defined by \({\mathcal {P}}\). This invalidates existing algorithms in Euclidean spaces.

In this paper, we deal with the Budgeted Priority p-Median problem based on a quite different data reduction-based method. Given an Euclidean instance of the Budgeted Priority p-Median problem, we show that a combination of an adapted coreset-construction approach and a Johnson–Lindenstrauss transform-type method can significantly reduce its size and map it into a low-dimensional Euclidean space. Based on such a reduced instance, we estimate the locations of the facilities opened in optimal solutions and in turn construct a small solution search space by exploring the properties of the Euclidean metrics. It is shown that enumerating over this search space yields the desired \((1+\varepsilon )\)-approximation solution in FPT time.

1.3 Preliminaries

Let \(\varepsilon \) denote a constant from (0, 1) and \({\mathcal {I}}=({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )\) be an instance of the Budgeted Priority p-Median problem satisfying \(\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor =p\), \(|{\mathcal {C}}|+|{\mathcal {F}}|=n\), and \({\mathcal {C}}\cup {\mathcal {F}}\subset {\mathbb {R}}^d\). Define \([t]=\{1,\cdots ,t\}\) for each integer \(t\geqslant 1\). For each \(i\in {\mathcal {L}}\), let \({\mathcal {C}}_i=\{c\in {\mathcal {C}}:\ell (c)=i\}\) and \({\mathcal {F}}_i=\{f\in {\mathcal {F}}:\ell (f)=i\}\), and define \({\mathcal {F}}^+_{i}=\bigcup _{j=i}^{|{\mathcal {L}}|}{\mathcal {F}}_j\). Let \(({\mathcal {D}}^*,\varphi ^*)\) denote an optimal solution to \({\mathcal {I}}\). Without loss of generality, we can assume that \(|{\mathcal {D}}^*|=p\) and let \({\mathcal {D}}^*=\{f^*_1,\cdots ,f^*_p\}\). Let \(\ell ^*_i=\ell (f^*_i)\) for each \(i\in [p]\). Given two data points xy and a set \({\mathcal {X}}\) located in an Euclidean space, define \(\delta (x,y)\) as the distance of x from y, and let \(\delta ({\mathcal {X}},y)=\sum _{z\in {\mathcal {X}}}\delta (z,y)\) and \(\delta (y,{\mathcal {X}})=\min _{z\in {\mathcal {X}}}\delta (y,z)\). Let \(\textrm{opt}=\sum _{c\in {\mathcal {C}}}\delta (c,\varphi ^*(c))\) denote the cost of an optimal solution to \({\mathcal {I}}\).

The running time of our algorithm is analyzed based on the following algebraic fact.

Lemma 1

We have \(\log ^j i\leqslant \max \{i,j^{O(j)}\}\) for each \(i,j>1\).

Proof

If \(j\geqslant \frac{\log i}{\log \log i}\), then we have \(\log i\leqslant j^{O(1)}\) and hence \(\log ^j i\leqslant j^{O(j)}\). If \(j<\frac{\log i}{\log \log i}\), then we have \(\log ^j i<\log ^{\frac{\log i}{\log \log i}}i=i\). Thus, Lemma 1 is true.

Our algorithm constructs nets for the facilities to identify a small solution search space. Such nets can be formally defined as follows.

Definition 2

(\(\lambda \)-net [13]) Given a set \({\mathcal {X}}\subset {\mathbb {R}}^d\) and a real number \(\lambda >0\), a subset \({\mathcal {X}}'\subseteq {\mathcal {X}}\) is called a \(\lambda \)-net of \({\mathcal {X}}\) if \(\delta (x,{\mathcal {X}}'\backslash \{x\})>\lambda \) for each \(x\in {\mathcal {X}}'\) and \(\delta (x,{\mathcal {X}}')<\lambda \) for each \(x\in {\mathcal {X}}\).

Har-Peled and Mendel [14] showed that a small-size net of a given set located in a low-dimensional Euclidean space can be constructed in near-linear time.

Lemma 2

([14]) Given a set \({\mathcal {X}}\subset {\mathbb {R}}^d\) and a real number \(\lambda >0\), one can construct a \(\lambda \)-net of \({\mathcal {X}}\) of size no more than \(\min \{(\lambda ^{-1}\max _{x,y\in {\mathcal {X}}}\delta (x,y))^d, |{\mathcal {X}}|\}\) in \(|{\mathcal {X}}|\log |{\mathcal {X}}|2^{O(d)}\) time.

2 Data Reduction

In this section we reduce the size of instance \({\mathcal {I}}\) based on the stronger version of Johnson–Lindenstrauss transform [15] given by Narayanan and Nelson [16] and the coreset-construction approach given by Chen [11]. Lemma 3 and Lemma 4 are the guarantees of these two approaches.

Lemma 3

([16]) Given a set \({\mathcal {X}}\subset {\mathbb {R}}^d\) and a real number \(\varepsilon \in (0,1)\), a mapping \(h: {\mathbb {R}}^{d}\rightarrow {\mathbb {R}}^{{\tilde{d}}}\) satisfying \({\tilde{d}}=O(\varepsilon ^{-2}\log |{\mathcal {X}}|)\) and \(\delta (h(x),h(y))\in [1,1+\varepsilon ]\delta (x,y)\) for each \(x\in {\mathcal {X}}\) and \(y\in {\mathbb {R}}^d\) can be constructed in \(O(|{\mathcal {X}}|d\log |{\mathcal {X}}|)\) time.

We can assume that the mapping \(h: {\mathcal {X}}'\rightarrow {\mathbb {R}}^{{\tilde{d}}}\) constructed by Lemma 3 is injective for each finite set \({\mathcal {X}}'\subset {\mathbb {R}}^d\). This is without loss of generality since we can make copies of the points from \({\mathbb {R}}^{{\tilde{d}}}\) that have multiple inverse images under h. Such an assumption means that we need to differentiate h(x) and h(y) for each two distinct points \(x,y\in {\mathbb {R}}^d\), even for the case where h(x) and h(y) share the same value in each dimension (differentiating h(x) and h(y) is necessary since two points with the same dimension values can be associated with different service levels).

Lemma 4

([11]) Given a set \({\mathcal {X}}\subset {\mathbb {R}}^d\), a real number \(\varepsilon \in (0,1)\), and an integer \(k>0\), one can construct a weighted subset \(\tilde{{\mathcal {X}}}\subseteq {\mathcal {X}}\) with a weight function \(w:\tilde{{\mathcal {X}}}\rightarrow [1,+\infty )\) satisfying \(\sum _{c\in \tilde{{\mathcal {X}}}}w(c)=|{\mathcal {X}}|\) in \(O(|{\mathcal {X}}|\mathrm{{d}} k)\) time, such that \(|\tilde{{\mathcal {X}}}|\leqslant \mathrm{{d}}(k\varepsilon ^{-1}\log |{\mathcal {X}}|)^{O(1)}\) and \(\sum _{c\in \tilde{{\mathcal {X}}}}w(c)\delta (c,{\mathcal {D}})\in [1-\varepsilon ,1+\varepsilon ]\sum _{c\in {\mathcal {X}}}\delta (c,{\mathcal {D}})\) for each \({\mathcal {D}}\subset {\mathbb {R}}^d\) with \(|{\mathcal {D}}|\leqslant k\).

The following corollary of Lemma 4 says that the coreset-construction approach given in [11] can be adapted to the Budgeted Priority p-Median problem.

Corollary 1

Given a real number \(\varepsilon \in (0,1)\) and an instance \({\mathcal {I}}=({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )\) of the Budgeted Priority p-Median problem with \(\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor =p\) and \({\mathcal {C}}\cup {\mathcal {F}}\subset {\mathbb {R}}^d\), one can construct a weighted subset \(\tilde{{\mathcal {C}}}\subseteq {\mathcal {C}}\) with a weight function \(w:\tilde{{\mathcal {C}}}\rightarrow [1,+\infty )\) satisfying \(\sum _{c\in \tilde{{\mathcal {C}}}}w(c)=|{\mathcal {C}}|\) in \(O(|{\mathcal {C}}|\mathrm{{d}} p)\) time, such that \(|\tilde{{\mathcal {C}}}|\leqslant \mathrm{{d}}(p\varepsilon ^{-1}\log |{\mathcal {C}}|)^{O(1)}\) and \(\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})\in [1-\varepsilon ,1+\varepsilon ]\sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})\) for each \({\mathcal {D}}\subseteq {\mathcal {F}}\) with \(|{\mathcal {D}}|\leqslant p\).

Proof

For each \(i\in {\mathcal {L}}\), denote by \(\tilde{{\mathcal {C}}}_i\) the weighted subset of \({\mathcal {C}}_i\) and \(w: \tilde{{\mathcal {C}}}_i\rightarrow [1,+\infty )\) the corresponding weight function constructed by Lemma 4 when taking \({\mathcal {C}}_i\), \(\varepsilon \), and p as inputs. We have \(\ell (c)=i\) for each \(c\in \tilde{{\mathcal {C}}}_i\) due to the definition of \({\mathcal {C}}_i\) and the fact that \(\tilde{{\mathcal {C}}}_i\subseteq {\mathcal {C}}_i\). Let \(\tilde{{\mathcal {C}}}=\bigcup _{i=1}^{|{\mathcal {L}}|}\tilde{{\mathcal {C}}}_i\). Lemma 4 implies that

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}}w(c) =\sum _{i=1}^{|{\mathcal {L}}|}\sum _{c\in \tilde{{\mathcal {C}}}_i}w(c) =\sum _{i=1}^{|{\mathcal {L}}|}|{\mathcal {C}}_i|=|{\mathcal {C}}| \end{aligned}$$
(1)

and

$$\begin{aligned} |\tilde{{\mathcal {C}}}|&= \sum _{i=1}^{|{\mathcal {L}}|}|\tilde{{\mathcal {C}}}_i|\leqslant \sum _{i=1}^{|{\mathcal {L}}|} \mathrm{{d}}(p\varepsilon ^{-1}\log |{\mathcal {C}}_i|)^{O(1)} \leqslant \mathrm{{d}}(p\varepsilon ^{-1}\log |{\mathcal {C}}|)^{O(1)}. \end{aligned}$$
(2)

Also by Lemma 4, we know that constructing \(\tilde{{\mathcal {C}}}\) and the corresponding weight function takes

$$\begin{aligned} O \left( \sum _{i=1}^{|{\mathcal {L}}|} |{\mathcal {C}}_i|\mathrm{{d}}p \right) = O(|{\mathcal {C}}|\mathrm{{d}}p) \end{aligned}$$
(3)

time in total. Moreover, it is the case that

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}_i}w(c)\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})&=\sum _{c\in \tilde{{\mathcal {C}}}_i}w(c)\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{i})\nonumber \\&\in [1-\varepsilon ,1+\varepsilon ]\sum _{c\in {\mathcal {C}}_i}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{i})\nonumber \\&= [1-\varepsilon ,1+\varepsilon ]\sum _{c\in {\mathcal {C}}_i}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}) \end{aligned}$$
(4)

for each \(i\in {\mathcal {L}}\) and \({\mathcal {D}}\subseteq {\mathcal {F}}\) with \(|{\mathcal {D}}|\leqslant p\), where the first and third steps are due to the fact that \(\ell (c)=i\) for each \(c\in \tilde{{\mathcal {C}}}_i\), and the second step follows from Lemma 4. Summing both sides of equality (4) over \(i\in {\mathcal {L}}\), we know that each \({\mathcal {D}}\subseteq {\mathcal {F}}\) with \(|{\mathcal {D}}|\leqslant p\) satisfies

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})&= \sum _{i=1}^{|{\mathcal {L}}|}\sum _{c\in \tilde{{\mathcal {C}}}_i}w(c)\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})\nonumber \\&\in [1-\varepsilon ,1+\varepsilon ]\sum _{i=1}^{|{\mathcal {L}}|}\sum _{c\in {\mathcal {C}}_i}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})\nonumber \\&= [1-\varepsilon ,1+\varepsilon ]\sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}), \end{aligned}$$

combining which with equality (1), inequality (2), and equality (3), we complete the proof of Corollary 1.

Based on Lemma 3 and Corollary 1, we obtain the following result about data reduction for the Budgeted Priority p-Median problem.

Lemma 5

Given a real number \(\varepsilon \in (0,1)\) and an instance \({\mathcal {I}}=({\mathcal {C}},{\mathcal {F}},{\mathcal {L}}, {\mathcal {P}},\ell )\) of the Budgeted Priority p-Median problem satisfying \(\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor =p\) and \({\mathcal {C}}\cup {\mathcal {F}}\subset {\mathbb {R}}^d\), one can construct a mapping \(h: {\mathbb {R}}^d\rightarrow {\mathbb {R}}^{{\tilde{d}}}\) and a weighted set \(\tilde{{\mathcal {C}}}\subset {\mathbb {R}}^{{\tilde{d}}}\) with a weight function \(w: \tilde{{\mathcal {C}}}\rightarrow [1,+\infty )\) satisfying \(\sum _{c\in \tilde{{\mathcal {C}}}}w(c)=|{\mathcal {C}}|\) and a level function \(\ell : \tilde{{\mathcal {C}}}\rightarrow {\mathcal {L}}\) in \(O(|{\mathcal {C}}|\mathrm{{d}}\log |{\mathcal {C}}|+(p \varepsilon ^{-1}|{\mathcal {C}}|)^{O(1)})\) time, such that \({\tilde{d}}=\varepsilon ^{-O(1)}(\log p+\log \log |{\mathcal {C}}|)\), \(|\tilde{{\mathcal {C}}}|\leqslant (p\varepsilon ^{-1}\log |{\mathcal {C}}|)^{O(1)}\), and \(\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\{h(f): f\in {\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}\})\in [1-\varepsilon ,1+O(\varepsilon )]\sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)})\) for each \({\mathcal {D}}\subseteq {\mathcal {F}}\) with \(|{\mathcal {D}}|\leqslant p\).

Proof

Denote by \(h_1: {\mathbb {R}}^{d}\rightarrow {\mathbb {R}}^{d'}\) the mapping constructed by Lemma 3 when taking \(\varepsilon \) and \({\mathcal {C}}\) as inputs, and let \(\ell (h_1(x))=\ell (x)\) for each \(x\in {\mathcal {C}}\cup {\mathcal {F}}\). Denote by \({\mathcal {C}}'\) the weighted subset of \(\{h_1(c): c\in {\mathcal {C}}\}\) constructed by Corollary 1 when taking \(\varepsilon \) and \((\{h_1(c): c\in {\mathcal {C}}\}, \{h_1(f): f\in {\mathcal {F}}\}, {\mathcal {L}}, {\mathcal {P}},\ell )\) as inputs, and let \(w: {\mathcal {C}}'\rightarrow [1,+\infty )\) be the corresponding weight function. Let \(h_2: {\mathbb {R}}^{d'}\rightarrow {\mathbb {R}}^{{\tilde{d}}}\) be the mapping constructed by Lemma 3 when taking \({\mathcal {C}}'\) and \(\varepsilon \) as inputs. Define \(\tilde{{\mathcal {C}}}=\{h_2(c):c\in {\mathcal {C}}'\}\). Let \(h(x)=h_2(h_1(x))\) for each \(x\in {\mathbb {R}}^{d}\). Let \(w(c)=w(h_2^{-1}(c))\) and \(\ell (c)=\ell (h_2^{-1}(c))\) for each \(c\in \tilde{{\mathcal {C}}}\). The above process for data reduction is illustrated in Fig. 1.

Fig. 1
figure 1

Illustration of our data reduction method, which reduces the dimensionality twice and constructs a coreset. Here, we have \({\mathcal {C}}=\{c_1,\cdots ,c_4\}\), \({\mathcal {C}}'=\{h_1(c_2),h_1(c_3)\}\), and \(\tilde{{\mathcal {C}}}=\{h(c_2),h(c_3)\}\)

Observe that

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}}w(c) =\sum _{c\in \tilde{{\mathcal {C}}}}w(h_2^{-1}(c)) =\sum _{c\in {\mathcal {C}}'}w(c) =|\{h_1(c):c\in {\mathcal {C}}\}|=|{\mathcal {C}}|, \end{aligned}$$
(5)

where the second and fourth steps are due to the assumption that the mappings constructed by Lemma 3 are injective, and the third step is due to Corollary 1. Moreover, Lemma 3 and Corollary 1 imply that

$$\begin{aligned} d'=O(\varepsilon ^{-2}\log |{\mathcal {C}}|) \end{aligned}$$

and

$$\begin{aligned} |\tilde{{\mathcal {C}}}|= |{\mathcal {C}}'|\leqslant d' (p\varepsilon ^{-1}\log |{\mathcal {C}}|)^{O(1)} = (p\varepsilon ^{-1}\log |{\mathcal {C}}|)^{O(1)}, \end{aligned}$$
(6)

combining which with Lemma 3 and Corollary 1, we know that constructing \(\tilde{{\mathcal {C}}}\) and \(h: {\mathbb {R}}^d\rightarrow {\mathbb {R}}^{{\tilde{d}}}\) takes

$$\begin{aligned} O(|{\mathcal {C}}|\mathrm{{d}}\log |{\mathcal {C}}|+|{\mathcal {C}}|d' p+|{\mathcal {C}}'|d' \log |{\mathcal {C}}'|)&\leqslant O(|{\mathcal {C}}|\mathrm{{d}}\log |{\mathcal {C}}|+(p \varepsilon ^{-1} |{\mathcal {C}}|)^{O(1)}) \end{aligned}$$
(7)

time in total, and

$$\begin{aligned} {\tilde{d}}&=O(\varepsilon ^{-2}\log |{\mathcal {C}}'|) \leqslant O(\varepsilon ^{-2}\log (p\varepsilon ^{-1}\log |{\mathcal {C}}|)) = \varepsilon ^{-O(1)}(\log p +\log \log |{\mathcal {C}}|). \end{aligned}$$
(8)

Considering a set \({\mathcal {D}}\subseteq {\mathcal {F}}\) with \(|{\mathcal {D}}|\leqslant p\), we have

$$\begin{aligned}&\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\{h(f): f\in {\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}\})\nonumber \\&\;\;\;\quad \quad \in [1,1+\varepsilon ] \sum _{c\in {\mathcal {C}}'}w(c)\delta (c,\{h_1(f): f\in {\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}\})\nonumber \\&\;\;\;\quad \quad \in [1-\varepsilon ,1+O(\varepsilon )]\sum _{c\in {\mathcal {C}}}\delta (h_1(c),\{h_1(f): f\in {\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}\})\nonumber \\&\;\;\;\quad \quad \in [1-\varepsilon ,1+O(\varepsilon )]\sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}\cap {\mathcal {F}}^+_{\ell (c)}), \end{aligned}$$
(9)

where the first and third steps follow from Lemma 3, and the second step is due to Corollary 1. Combining formula (9) with equality (5), inequality (6), inequality (7), and inequality (8), we complete the proof of Lemma 5.

3 Our Algorithm

In this section we show how the desired \((1+\varepsilon )\)-approximation solution can be constructed based on the reduced instance given by Lemma 5. We first introduce some notations. Let \(h: {\mathbb {R}}^d\rightarrow {\mathbb {R}}^{{\tilde{d}}}\) and \(\tilde{{\mathcal {C}}}\subset {\mathbb {R}}^{{\tilde{d}}}\) be the mapping and weighted set given by Lemma 5 when taking \(\varepsilon \) and \({\mathcal {I}}\) as inputs, and let \(w: \tilde{{\mathcal {C}}}\rightarrow [1,+\infty )\) and \(\ell : \tilde{{\mathcal {C}}}\rightarrow {\mathcal {L}}\) denote the corresponding weight function and level function, respectively. As done in Sect. 2, we can make \(h: {\mathcal {F}}\rightarrow {\mathbb {R}}^{{\tilde{d}}}\) injective by copying the points from \({\mathbb {R}}^{{\tilde{d}}}\) that have multiple inverse images. For each \(f\in {\mathcal {F}}\), let \(\ell (h(f))=\ell (f)\). Define \(\tilde{{\mathcal {F}}}=\{h(f):f\in {\mathcal {F}}\}\) and \(\tilde{{\mathcal {D}}}^*=\{h(f): f\in {\mathcal {D}}^*\}\). Let \(\tilde{{\mathcal {F}}}_{i}=\{h(f): f\in {\mathcal {F}}_{i}\}\) and \(\tilde{{\mathcal {F}}}^+_{i}=\{h(f): f\in {\mathcal {F}}^+_{i}\}\) for each \(i\in {\mathcal {L}}\). Define \(\delta _{\max }=\max _{c\in \tilde{{\mathcal {C}}}}\delta (c,\tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\). The following lemma provides a way of estimating the cost of an optimal solution to \({\mathcal {I}}\).

Lemma 6

\(\delta _{\max }\leqslant \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\leqslant (1+O(\varepsilon ))\mathrm{{opt}}.\)

Proof

Observe that

$$\begin{aligned} \delta _{\max }&\leqslant \max _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\nonumber \\&\leqslant \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\nonumber \\&= \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\{h(f): f\in {\mathcal {D}}^*\cap {\mathcal {F}}^+_{\ell (c)}\})\nonumber \\&\leqslant (1+O(\varepsilon ))\sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}^*\cap {\mathcal {F}}^+_{\ell (c)})\nonumber \\&= (1+O(\varepsilon ))\mathrm{{opt}}, \end{aligned}$$

where the first step follows from the fact that \(w(c)\geqslant 1\) for each \(c\in \tilde{{\mathcal {C}}}\) and the definition of \(\delta _{\max }\), the fourth step is due to Lemma 5, and the fifth step follows from the fact that \(({\mathcal {D}}^*,\varphi ^*)\) is an optimal solution to \({\mathcal {I}}\). This completes the proof of Lemma 6.

3.1 The Partition Step

In this section, we estimate the locations of the facilities from \(\tilde{{\mathcal {D}}}^*\) based on their neighboring clients. For each \(i\in [p]\), let \(c_i\) denote the client from \(\{c\in \tilde{{\mathcal {C}}}:\ell (c)\leqslant \ell ^*_i\}\) nearest to \(h(f^*_i)\), and define \({\mathcal {B}}_i(\alpha )=\{f\in \tilde{{\mathcal {F}}}: \delta (c_i,f)\leqslant \alpha \}\) for each \(\alpha >0\). We partition the facilities from \(\tilde{{\mathcal {F}}}\) into a set of ring cells centered at the clients from \(\{c_1,\cdots ,c_p\}\): Let \({\mathcal {Q}}(i,0)={\mathcal {B}}_i(\varepsilon \delta _{\max }n^{-1})\) and \({\mathcal {Q}}(i,j)={\mathcal {B}}_i(\varepsilon (1+\varepsilon )^{j}\delta _{\max }n^{-1})\backslash {\mathcal {B}}_i(\varepsilon (1+\varepsilon )^{j-1}\delta _{\max }n^{-1})\) for each \(i\in [p]\) and \(j\in [\lceil \varepsilon ^{-2}\log n\rceil ]\), as shown in Fig. 2a. We have

$$\begin{aligned} \varepsilon (1+\varepsilon )^{\lceil \varepsilon ^{-2}\log n\rceil }\delta _{\max }n^{-1}&> \delta _{\max } = \max _{c\in \tilde{{\mathcal {C}}}}\delta (c,\tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c)}) \geqslant \delta (c_i,h(f^*_i)) \end{aligned}$$

for each \(i\in [p]\), where the second step follows from the definition of \(\delta _{\max }\), and the third step follows from the fact that \(h(f^*_i)\in \tilde{{\mathcal {D}}}^*\cap \tilde{{\mathcal {F}}}^+_{\ell (c_i)}\) (due to the definition of \(c_i\)). Combining this inequality with the definition of \({\mathcal {Q}}(i,j)\), we know that there is an integer \(j\in [0, \lceil \varepsilon ^{-2}\log n\rceil ]\) satisfying \(h(f^*_i)\in {\mathcal {Q}}(i,j)\) for each \(i\in [p]\). Denote by \({\mathcal {Q}}_i\) such a set \({\mathcal {Q}}(i,j)\) containing \(h(f^*_i)\). Our idea for constructing solutions to \({\mathcal {I}}\) in FPT(p) time is to guess the sets \({\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p\), such that we can limit the search range of the opened facilities to the inverse images of the members of \(\bigcup _{i=1}^{p}{\mathcal {Q}}_i\) under h. The following result implies that we can guess the sets \({\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p\) with a \((p\varepsilon ^{-1})^{O(p)}n^{O(1)}\) multiplicative overhead in the running time of our algorithm.

Fig. 2
figure 2

The images of candidates for the opened facilities under h

Lemma 7

Given the reduced instance \((\tilde{{\mathcal {C}}}, \tilde{{\mathcal {F}}}, {\mathcal {L}}, {\mathcal {P}}, \ell )\) of the Budgeted Priority p-Median problem, the set \(\{{\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p\}\) can be guessed by enumerating over no more than \((p\varepsilon ^{-1})^{O(p)}n^{O(1)}\) items.

Proof

We have \(|\tilde{{\mathcal {F}}}|=|{\mathcal {F}}|\leqslant n\) and \(|\tilde{{\mathcal {C}}}|\leqslant (p\varepsilon ^{-1}\log n)^{O(1)}\) due to Lemma 5. Using the definitions of \(\delta _{\max }\) and \(c_i\), it can be seen that there are \(|\tilde{{\mathcal {F}}}|\cdot |\tilde{{\mathcal {C}}}|\leqslant n(p\varepsilon ^{-1}\log n)^{O(1)}\) choices of the value of \(\delta _{\max }\) and \(|\tilde{{\mathcal {C}}}|^p\leqslant (p\varepsilon ^{-1}\log n)^{O(p)}\) choices of the set \(\{c_1,\cdots ,c_p\}\). Consequently, the definition of \({\mathcal {Q}}(i,j)\) indicates that we have no more than \(n(p\varepsilon ^{-1}\log n)^{O(p)}\) choices of the set \(\{{\mathcal {Q}}(1,0),\cdots ,{\mathcal {Q}}(p,\lceil \varepsilon ^{-2}\log n\rceil )\}\). Combining this with the fact that each \(i\in [p]\) satisfies \({\mathcal {Q}}_i\in \{{\mathcal {Q}}(i,0),\cdots ,{\mathcal {Q}}(i,\lceil \varepsilon ^{-2}\log n\rceil )\}\), the number of choices of the set \(\{{\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p\}\) can be upper-bounded by \(n(p\varepsilon ^{-1}\log n)^{O(p)}\cdot (\varepsilon ^{-2}\log n)^p\), which is no more than \((p\varepsilon ^{-1})^{O(p)}n^{O(1)}\) due to Lemma 1. Thus, Lemma 7 is true.

Algorithm 1
figure a

Construct the approximation solution

3.2 The Construction Step

Given the set \(\{{\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p\}\), our algorithm constructs a net of \({\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j\) for each \(i\in [p]\) and \(j\in {\mathcal {L}}\) and considers the inverse images of the members of the net under h as candidates for the opened facilities (see Fig. 2b), as described in Algorithm 1. Define \({\mathcal {J}}\), \({\mathbb {D}}\), and \({\mathcal {N}}(i,j)\) in the same way as Algorithm 1 for each \(i\in [p]\) and \(j\in {\mathcal {L}}\). Let \(({\mathcal {D}}^\dag ,\varphi ^\dag )\) denote the solution returned by the algorithm, and let \(\tilde{{\mathcal {D}}}^\dag =\{h(f):f\in {\mathcal {D}}^\dag \}\). The following lemma implies that \(({\mathcal {D}}^\dag ,\varphi ^\dag )\) can be constructed in FPT(p) time.

Lemma 8

We can obtain the solution \(({\mathcal {D}}^\dag ,\varphi ^\dag )\) to \({\mathcal {I}}\) in no more than \(O(nd\log n)+(p\varepsilon ^{-1})^{p\varepsilon ^{-O(1)}}n^{O(1)}\) time.

Proof

Define \({\mathcal {G}}=\{(i\in [p], j\in {\mathcal {L}}): |{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j|>1\}\). The definition of \({\mathcal {N}}(i,j)\) and Lemma 2 imply that

$$\begin{aligned} |{\mathcal {N}}(i,j)|&\leqslant \left( \frac{\max _{x,y\in {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j}\delta (x,y)}{\varepsilon \max _{x,y\in {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j}\delta (x,y)}\right) ^{{\tilde{d}}} =\varepsilon ^{-{\tilde{d}}} \leqslant (p\log n)^{\varepsilon ^{-O(1)}} \end{aligned}$$
(10)

for each \((i,j)\in {\mathcal {G}}\), where the third step follows from the fact that \({\tilde{d}}\leqslant \varepsilon ^{-O(1)}(\log p+\log \log n)\) (due to Lemma 5). Consequently, we know that

$$\begin{aligned} |{\mathcal {J}}|&=\sum _{(i,j)\in {\mathcal {G}}}|{\mathcal {N}}(i,j)|+ p|{\mathcal {L}}|- |{\mathcal {G}}|\nonumber \\&\leqslant |{\mathcal {L}}|(p\log n)^{\varepsilon ^{-O(1)}}\nonumber \\&\leqslant (p\log n)^{\varepsilon ^{-O(1)}}\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor \nonumber \\&= (p\log n)^{\varepsilon ^{-O(1)}}, \end{aligned}$$
(11)

where the first step follows from the definition of \({\mathcal {J}}\), the second step is due to inequality (10), the third step follows from the fact that \(p_i\geqslant 1\) for each \(i\in {\mathcal {L}}\), and the fourth step is derived from the fact that \(p=\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor \).

Observe that Algorithm 1 constructs a net of \({\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j\) for each \((i,j)\in {\mathcal {G}}\) according to the maximal distance between the facilities from \({\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j\). Lemma 2 and the fact that \({\tilde{d}}\leqslant \varepsilon ^{-O(1)}(\log p+\log \log n)\) imply that this takes no more than

$$\begin{aligned} \sum _{(i,j)\in {\mathcal {G}}}2^{O({\tilde{d}})}|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_j|^{O(1)}&\leqslant 2^{O({\tilde{d}})}n^{O(1)} \leqslant (p\log n)^{\varepsilon ^{-O(1)}}n^{O(1)} \end{aligned}$$
(12)

time in total. After this, Algorithm 1 enumerates all subsets \(\tilde{{\mathcal {D}}}\subseteq {\mathcal {J}}\) with \(|\tilde{{\mathcal {D}}}\cap \tilde{{\mathcal {F}}}_j|\leqslant p_i\), \(\forall \, j\in {\mathcal {L}}\) and selects the one corresponding to the solution with minimal cost to the reduced instance \((\tilde{{\mathcal {C}}},\tilde{{\mathcal {F}}},{\mathcal {L}},{\mathcal {P}},\ell )\). Equality \(\sum _{i=1}^{|{\mathcal {L}}|}\lfloor p_i\rfloor =p\) implies that each subset considered in this enumeration is of size no more than p, and we can complete this enumeration in

$$\begin{aligned} O(|{\mathcal {J}}|^p |\tilde{{\mathcal {C}}}|{\tilde{d}}p)&\leqslant (p\log n)^{p\varepsilon ^{-O(1)}}|\tilde{{\mathcal {C}}}|{\tilde{d}}p = (p\log n)^{p\varepsilon ^{-O(1)}} \end{aligned}$$
(13)

time, where the first step follows from inequality (11), and the second step follows from the fact that \(|\tilde{{\mathcal {C}}}|\leqslant (p\varepsilon ^{-1}\log n)^{O(1)}\) and \({\tilde{d}}\leqslant \varepsilon ^{-O(1)}(\log p+\log \log n)\) (due to Lemma 5). Using inequality (12) and inequality (13), we know that Algorithm 1 runs in no more than \((p\log n)^{p\varepsilon ^{-O(1)}}n^{O(1)}\) time.

Lemma 5 implies that constructing the mapping \(h:{\mathbb {R}}^d\rightarrow {\mathbb {R}}^{{\tilde{d}}}\) and the reduced instance \((\tilde{{\mathcal {C}}},\tilde{{\mathcal {F}}},{\mathcal {L}},{\mathcal {P}},\ell )\) takes \(O(nd\log n+(p\varepsilon ^{-1}n)^{O(1)})\) time, and Lemma 7 says that there are at most \((p\varepsilon ^{-1})^{O(p)}n^{O(1)}\) choices of the set \(\{{\mathcal {Q}}_1,\cdots ,{\mathcal {Q}}_p\}\). Therefore, all inputs of Algorithm 1 can be known with a \((p\varepsilon ^{-1})^{O(p)}n^{O(1)}\) multiplicative and an \(O(nd\log n+(p\varepsilon ^{-1}n)^{O(1)})\) additive overheads in the running time of the algorithm. Combining this with the fact that Algorithm 1 runs in at most \((p\log n)^{p\varepsilon ^{-O(1)}}n^{O(1)}\) time, we know that constructing \(({\mathcal {D}}^\dag ,\varphi ^\dag )\) takes no more than \(O(nd\log n)+(p\log n)^{p\varepsilon ^{-O(1)}}n^{O(1)}\) time, which can be upper-bounded by \(O(nd\log n)+(p\varepsilon ^{-1})^{p\varepsilon ^{-O(1)}}n^{O(1)}\) due to Lemma 1, as desired.

We now consider the cost of \(({\mathcal {D}}^\dag ,\varphi ^\dag )\). The following lemma says that it is a \((1+O(\varepsilon ))\)-approximation solution to \({\mathcal {I}}\).

Lemma 9

\(\sum _{c\in {\mathcal {C}}}\delta (c,\varphi ^\dag (c))\leqslant (1+O(\varepsilon ))\mathrm{{opt}}.\)

Proof

The fact that \(h(f^*_i)\in {\mathcal {Q}}_i\) and \(\ell (h(f^*_i))=\ell (f^*_i)=\ell ^*_i\) implies that \(h(f^*_i)\in {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}\) and thus \(|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}|\geqslant 1\) for each \(i\in [p]\). Given a real number \(i\in [p]\), we, respectively, consider the following three cases: (1) \(|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}|= 1\), (2) \(|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}|>1\) and \({\mathcal {Q}}_i={\mathcal {Q}}(i,0)\), and (3) \(|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}|>1\) and \({\mathcal {Q}}_i\ne {\mathcal {Q}}(i,0)\).

For case (1), the facility from \({\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}\) is added to \({\mathcal {J}}\) by Algorithm 1, and the fact that \(h(f^*_i)\in {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}\) indicates that

$$\begin{aligned} \{h(f^*_i)\}={\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}\subseteq {\mathcal {J}}. \end{aligned}$$
(14)

For the case where \(|{\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}|>1\), it can be seen that Algorithm 1 adds the facilities from a net \({\mathcal {N}}(i,\ell ^*_i)\) of \({\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}\) to \({\mathcal {J}}\), and the fact that \({\mathcal {N}}(i,\ell ^*_i)\subseteq {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}\) implies that \(({\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i})\cap {\mathcal {J}}\ne \varnothing \). For case (2), let \(f'_i\in {\mathcal {F}}\) be a facility satisfying \(h(f'_i)\in ({\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i})\cap {\mathcal {J}}\). We have

$$\begin{aligned} \delta (h(f'_i),h(f^*_i))&\leqslant \delta (c_i,h(f'_i))+\delta (c_i,h(f^*_i))\nonumber \\&\leqslant 2\max _{f\in {\mathcal {Q}}_i}\delta (c_i,f)\nonumber \\&\leqslant \frac{2\varepsilon }{n}\delta _{\max }\nonumber \\&\leqslant \frac{2\varepsilon }{n}(1+O(\varepsilon ))\hbox {opt}, \end{aligned}$$
(15)

where the first step is due to triangle inequality, the second step follows from the fact that \(\{h(f'_i),h(f^*_i)\}\subseteq {\mathcal {Q}}_i\), the third step is due to the assumption that \({\mathcal {Q}}_i={\mathcal {Q}}(i,0)\) and the definition of \({\mathcal {Q}}(i,0)\), and the fourth step follows from Lemma 6.

For case (3), let \(h(f'_i)\) be the facility from \({\mathcal {N}}(i,\ell ^*_i)\) nearest to \(h(f^*_i)\). We have \(h(f'_i)\in {\mathcal {N}}(i,\ell ^*_i)\subseteq {\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}\), and

$$\begin{aligned} \delta (h(f'_i),h(f^*_i))&\leqslant \varepsilon \max _{x,y\in {\mathcal {Q}}_i}\delta (x,y)\nonumber \\&\leqslant 2\varepsilon \max _{f\in {\mathcal {Q}}_i}\delta (c_i,f)\nonumber \\&\leqslant 2\varepsilon (1+\varepsilon )\delta (c_i,h(f^*_i))\nonumber \\&= O(\varepsilon )\delta (c_i,h(f^*_i)), \end{aligned}$$
(16)

where the first step follows from the fact that \(h(f'_i)\) is the nearest facility to \(h(f^*_i)\) in a \(\max _{x,y\in {\mathcal {Q}}_i}\varepsilon \delta (x,y)\)-net of \({\mathcal {Q}}_i\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}\), the second step follows from triangle inequality, and the third step is due to the assumption that \({\mathcal {Q}}_i\ne {\mathcal {Q}}(i,0)\) (which says that there exists a real number \(j\in [\lceil \varepsilon ^{-2}\log n\rceil ]\) satisfying \({\mathcal {Q}}_i={\mathcal {Q}}(i,j)\)) and the definition of \({\mathcal {Q}}(i,j)\) for \(j>0\).

The argument above implies the existence of a set \({\mathcal {D}}'=\{f'_1,\cdots ,f'_p\}\subseteq {\mathcal {F}}\) with \(h(f'_i)\in {\mathcal {J}}\cap \tilde{{\mathcal {F}}}_{\ell ^*_i}\) for each \(i\in [p]\), which satisfies

$$\begin{aligned} \delta (h(f'_i),h(f^*_i)) \leqslant \frac{2\varepsilon }{n}(1+O(\varepsilon ))\hbox {opt} + O(\varepsilon )\delta (c_i,h(f^*_i)) \end{aligned}$$
(17)

for each \(i\in [p]\) due to equality (14), inequality (15), and inequality (16). Define \(\tilde{{\mathcal {D}}}'=\{h(f):f\in {\mathcal {D}}'\}\). The fact that \(h(f'_i)\in \tilde{{\mathcal {F}}}_{\ell ^*_i}\) for each \(i\in [p]\) and \(({\mathcal {D}}^*,\varphi ^*)\) is a feasible solution to \({\mathcal {I}}\) implies that inequality \(|\tilde{{\mathcal {D}}}'\cap \tilde{{\mathcal {F}}}_j|=|{\mathcal {D}}^*\cap {\mathcal {F}}_j|\leqslant p_j\) holds for each \(j\in {\mathcal {L}}\). Combining this with the definition of \({\mathbb {D}}\), we know that \(\tilde{{\mathcal {D}}}'\in {\mathbb {D}}\).

For each \(i\in [p]\), define \(\tilde{{\mathcal {C}}}^*_i=\{c\in \tilde{{\mathcal {C}}}:\arg \min _{f\in \tilde{{\mathcal {D}}}^*\cap {\tilde{{\mathcal {F}}}^+_{\ell (c)}}}\delta (c,f)=h(f^*_i)\}\). It is the case that

$$\begin{aligned} \sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)\delta (h(f'_i),h(f^*_i))&\leqslant \sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c) \left( \frac{2\varepsilon }{n}(1+O(\varepsilon ))\mathrm{{opt}} + O(\varepsilon )\delta (c_i,h(f^*_i))\right) \nonumber \\&\leqslant O(\varepsilon )\mathrm{{opt}} +O(\varepsilon )\sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)\delta (c_i,h(f^*_i))\nonumber \\&\leqslant O(\varepsilon )\mathrm{{opt}} +O(\varepsilon )\sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)\delta (c,h(f^*_i))\nonumber \\&= O(\varepsilon )\mathrm{{opt}} +O(\varepsilon )\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^*\cap {\tilde{{\mathcal {F}}}^+_{\ell (c)}})\nonumber \\&\leqslant O(\varepsilon )\mathrm{{opt}}, \end{aligned}$$
(18)

where the first step follows from inequality (17), the second step is derived from the fact that \(\sum _{c\in \tilde{{\mathcal {C}}}}w(c)=|{\mathcal {C}}|<n\) (due to Lemma 5), the third step follows from the fact that \(c_i\) is the client from \(\{c\in \tilde{{\mathcal {C}}}: \ell (c)\leqslant \ell ^*_i\}\) nearest to \(h(f^*_i)\) for each \(i\in [p]\), the fourth step is due to the definition of \(\tilde{{\mathcal {C}}}^*_i\), and the fifth step is derived from Lemma 6. Consequently, we have

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}'\cap \tilde{{\mathcal {F}}}^+_{\ell (c)}) =&\sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)\delta (c,\tilde{{\mathcal {D}}}'\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\nonumber \\&\leqslant \sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)(\delta (c,h(f^*_i))+\delta (h(f^*_i),\tilde{{\mathcal {D}}}'\cap \tilde{{\mathcal {F}}}^+_{\ell (c)}))\nonumber \\&\leqslant \sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)(\delta (c,h(f^*_i))+\delta (h(f^*_i),h(f'_i)))\nonumber \\ =&\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^*\cap {\tilde{{\mathcal {F}}}^+_{\ell (c)}})\nonumber \\&+ \sum _{i=1}^p\sum _{c\in \tilde{{\mathcal {C}}}^*_i}w(c)\delta (h(f^*_i),h(f'_i))\nonumber \\ =&(1+O(\varepsilon ))\mathrm{{opt}}, \end{aligned}$$
(19)

where the second step is due to triangle inequality, the third step follows from the fact that \(\ell (c)\leqslant \ell ^*_i\) and \(h(f'_i)\in \tilde{{\mathcal {F}}}_{\ell ^*_i}\subseteq \tilde{{\mathcal {F}}}^+_{\ell (c)}\) for each \(c\in \tilde{{\mathcal {C}}}^*_i\) and \(i\in [p]\), the fourth step follows from the definition of \(\tilde{{\mathcal {C}}}^*_i\), and the fifth step is due to Lemma 6 and inequality (18).

Combining the fact that \(\tilde{{\mathcal {D}}}'\in {\mathbb {D}}\) and \(\tilde{{\mathcal {D}}}^\dag = \arg \min _{\tilde{{\mathcal {D}}}\in {\mathbb {D}}}\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}\cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\) with inequality (19) yields

$$\begin{aligned} \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^\dag \cap \tilde{{\mathcal {F}}}^+_{\ell (c)})&\leqslant \sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}'\cap \tilde{{\mathcal {F}}}^+_{\ell (c)}) \leqslant (1+O(\varepsilon ))\mathrm{{opt}}. \end{aligned}$$
(20)

Therefore, we get

$$\begin{aligned} \sum _{c\in {\mathcal {C}}}\delta (c,\varphi ^\dag (c))&= \sum _{c\in {\mathcal {C}}}\delta (c,{\mathcal {D}}^\dag \cap {\mathcal {F}}^+_{\ell (c)})\nonumber \\&\leqslant \frac{1}{1-\varepsilon }\sum _{c\in \tilde{{\mathcal {C}}}}w(c)\delta (c,\tilde{{\mathcal {D}}}^\dag \cap \tilde{{\mathcal {F}}}^+_{\ell (c)})\nonumber \\&\leqslant \frac{1+O(\varepsilon )}{1-\varepsilon }\hbox {opt}\nonumber \\&= (1+O(\varepsilon ))\hbox {opt}, \end{aligned}$$

where the first step is due to the definition of \(\varphi ^\dag \), the second step follows from Lemma 5, and the third step is due to inequality (20). This completes the proof of Lemma 9.

Lemma 8 and Lemma 9 imply that a \((1+O(\varepsilon ))\)-approximation solution to \({\mathcal {I}}\) can be constructed in \(O(nd\log n)+(p\varepsilon ^{-1})^{p\varepsilon ^{-O(1)}}n^{O(1)}\) time, which in turn implies that Theorem 1 is true.

4 Conclusions

In this paper we consider the Budgeted Priority p-Median problem in high-dimensional Euclidean spaces. We give a \((1+O(\varepsilon ))\)-approximation algorithm running in FPT(p) time. Our main technical contribution is a data reduction-based approach for constructing small solution search spaces, which is quite different from the commonly used sampling-based methods and we believe is of independent interest.