1 Introduction

The uncapacitated facility location problem (UFLP) has been extensively studied in the field of the facility location problem. Shmoys et al. (1997) provided the first constant factor approximation ratio 3.16 for UFLP. Chudak and Shmoys (2003) presented an improved approximation algorithm for UFLP with performance guarantee (1+1/e). Sviridenko (2002) used pipage rounding to get an approximation algorithm with approximation ratio 1.582. Jain and Vazirani (2001) proposed a primal-dual approximation algorithm for UFLP with approximation ratio 3. Mahdian et al. (2006) studied UFLP by dual-fitting and greedy augmentation which give approximation ratio 1.52. The currently best approximation ratio for the UFLP is 1.488 provided by Li (2013). By relating UFLP to set cover, Charikar and Guha (2005) proved the lower bound of UFLP is 1.463 by assuming NP \(\subseteq \) DTIME\((n^{\log \log n})\). The greedy augmentation introduced by Charikar and Guha (2005). Charikar et al. (2001) considered the uncapacitated facility location problem with penalties FLPWP and obtained a 3-performance guarantee based on primal-dual. Jain et al. (2003) presented a combinatorial 2-approximation ratio by dual fitting. Xu and Xu (2009) presented a 1.8526-approximation algorithm by primal-dual and local search heuristic. The currently best known approximation ratio for FLPWP is 1.488 which was given by Qiu and Kern (2016) based on dual-fitting and LP-rounding. Hayrapetyan et al. (2005) investigated the single level facility location problem with submodular penalties (FLPSP) and provided an 2.488-approximation algorithm. The penalties cost is a monotone increasing submodular function \(h(\cdot )\) defined on client set \({\mathcal {D}}\), i.e., for any \(A, B\subseteq {\mathcal {D}}\) with \(A\subseteq B\subseteq {\mathcal {D}}\), \(h(A)\le h(B)\) and \(h(A+{j})-h(A)\ge h(B+{j})-h(B)\). Li et al. (2013) gave an 2.375-approximation algorithm for the FLPSP by using the primal-dual and the greedy augmentation scheme. Then, Li et al. (2015b) provided an LP-rounding 2-approximation algorithm for the single level FLPSP. The k-facility location problem (k-UFLP) is a generalization of UFLP. The aim of k-UFLP is to open a subset of facilities and the size of the facility subset is at most k facilities, where k is a given constant positive integer, connect all clients to the closest opened facilities such that the total cost is minimized. Jain and Vazirani (2001) firstly considered the metric k-UFLP and gave a 6 primal-dual approximation algorithm. Jain et al. (2003) proved a 4-approximation algorithm by combining greedy scheme and dual fitting with factor-revealing LP. Zhang (2007) used local search approach and gave a 2+\(\sqrt{3}\)+\(\epsilon \)-approximation ratio, which is the best approximation algorithm for k-UFLP. There are also many variant such as the squared metric k-facility location problem, Zhang et al. (2023) gave a \(36.342+\epsilon \)-approximation algorithm based on local search scheme, which is the currently best known approximation ratio. The k-facility location problem with linear penalties is also an extension of k-UFLP, such as Wang et al. (2018) proved a \(2+1/p+\sqrt{3+2/p+1/p^{2}}+\epsilon \) based on local search scheme, where p is positive integer, \(\epsilon >0\). k-level uncapacitated facility location problem (k-LFLP) is a generalization of UFLP, when \(k=1\), the k-LFLP is UFLP. The aim of k-LFLP is to connect all clients to opened facilities from level 1 to level k such that the sum of the opening and connection cost is minimized. By observing the structure of solution of k-LFLP, Byrka and Rybicki (2012) proposed a 3-approximation algorithm by rounding a fractional solution to an extended LP formulation, which is also the currently best approximation algorithm for k-LFLP. For k-LFLP, Krishnaswamy and Sviridenko (2012) proved that there is no polynomial time approximation algorithm with performance guarantee better than 1.61 unless NP \(\subseteq \) DTIME\((n^{O(\log \log n)})\). The k-level facility location problem with submodular penalties (k-FLPSP) is a variant of k-LFLP. Li et al. (2012) presented a primal-dual algorithms with a performance guarantee of 6 for k-FLPSP. Later on, Li et al. (2015a) proposed an LP-rounding \(1+\frac{2}{1-e^{-2}}(\approx 3.314)\)-approximation algorithm, which is the best known approximation ratio. In this paper, we present an improved combinatorial algorithm for the k-FLPSP. The main idea to get the result are local search and greedy augmentation. The main steps is as follows: firstly, we restructure the k-FLPSP, and then we present our primal-dual greedy augmentation approximation algorithm for the k-FLPSP. Finally, scaling the cost of the facility and penalty function, we prove that the presented algorithm has an approximation ratio 2.9444.

2 Problem statement and notation

In this paper, we present an improved combinatorial algorithm for the k-FLPSP. In the k-FLPSP, let \({\mathcal {D}}\) be the client set, \({\mathcal {F}}\) be the facility set, \({\mathcal {F}}=\displaystyle \bigcup _{t}{\mathcal {F}}^{t}\), \(t\in \{1, 2, \cdots , k\} \). And \({\mathcal {F}}^{t}\) is the facilities set on the tth level, where \(t\in \{1, 2, \cdots , k\} \). P is the paths set of k-level facilities, \(P=\{p: p=(i_{1}\in {\mathcal {F}}^{1}, i_{2}\in {\mathcal {F}}^2,\cdots , i_{k}\in {\mathcal {F}}^{k})\}\). Each client is assigned to a sequence of k different facilities, each of the k facilities belongs to a distinct level from level 1 to level k. Each client should pay a connection cost \(c_{jp}\) for being connected, where \(c_{jp}=c_{ji_{1}}+\sum \limits _{t=2}^{k}c_{i_{t-1}i_{t}}\) is the connection cost between client j and path p, \(c_{ji_{1}}\) is the connection cost between client j and facility \(i_{1}\in {\mathcal {F}}^{1}\), \(c_{i_{t-1}i_{t}}\) is the connection cost between \(i_{t-1}\in {\mathcal {F}}^{t-1}\) and \(i_{t} \in {\mathcal {F}}^{t}\), where \(t\in \{2, \cdots , k\} \). We consider the uncapacitated k-FLPSP, there is no capacitated restrictions for each facility. For the sake of simplification, \(f_{i_{t}}\) is the opening cost of facility \(i_{t}\in {\mathcal {F}}^{t}\), \(t\in \{1, 2, \cdots , k\} \). Given a nondecreasing submodular function \(h(\cdot )\), h(S) is the penalty cost of client set \(S\subseteq {\mathcal {D}}\) and \(h(\phi ) = 0\). We consider the metric k-FLPSP, this means that the connection cost satisfies symmetry triangle inequality, such as \(c_{ij}\le c_{ij'}+c_{i'j'}+c_{i'j}\), for \(i, i'\in {\mathcal {F}}, j, j'\in {\mathcal {D}}\). Our aim is to select the facility subset of each level to open and connect all clients to the open facilities such that the total cost of k-FLPSP is minimized. The k-FLPSP can be formulated as the following integer programming:

$$\begin{aligned}&\min ~~\sum _{l=1}^{k}\sum _{i_{l}\in {\mathcal {F}}^{l}}f_{i_{l}}y_{i_{l}}+\sum _{j\in {\mathcal {D}}}\sum _{p\in P}c_{jp}x_{jp}+\sum _{S\subseteq {\mathcal {D}}}h(S)z_{S}\\ {}&\begin{array}{rl@{}ll} s.t.&{}{}\displaystyle \sum _{p\in P}x_{jp}+\sum _{S\subseteq {\mathcal {D}},j\in S}z_{S}\ge 1, ~~~~~\forall j\in {\mathcal {D}}, \\ {} &{}{} \displaystyle \sum _{p:i_{l}\in p}x_{jp}\le y_{i_{l}}, ~~~~~~~~~~~~~~~~~~\forall j\in {\mathcal {D}}, ~i_{l}\in {\mathcal {F}}^{l},~l=1,\cdots ,k,\\ {} &{}{} x_{jp}\ge 0,~~~~~~~~~~~~~~~~~~~~\forall p\in P,~j\in {\mathcal {D}},\\ {} &{}{} y_{i_{l}}\ge 0,~~~~~~~~~~~~~~~~~~~~\forall i_{l}\in \mathcal {F}^{l},~l=1,\cdots ,k,\\ {} &{}{} z_{S}\ge 0,~~~~~~~~~~~~~~~~~~~~\forall S\subseteq {\mathcal {D}}. \end{array} \end{aligned}$$
(1)

where \(x_{jp}=1\) if client j is connected to path p, otherwise variable \(x_{jp}=0\). When the facility on level l is open \(y_{i_{l}}=1\), and 0 otherwise. Variable \(z_{S}\) equals 1 if the client set S is penaltied, and 0 otherwise. The first constraint implies that a client can be connected to a path p or be punished at some set \(S\subseteq {\mathcal {D}}\). The second constraint indicates that if a client is connected to a path p, all the facilities of the path p must be opened.

The relaxation of the integer programming (1) is given as follows:

$$\begin{aligned}&\min ~~\sum _{l=1}^{k}\sum _{i_{l}\in {\mathcal {F}}^{l}}f_{i_{l}}y_{i_{l}}+\sum _{j\in {\mathcal {D}}}\sum _{p\in P}c_{jp}x_{jp}+\sum _{S\subseteq {\mathcal {D}}}h(S)z_{S}\\&\begin{array}{rl@{}ll} s.t.&{}\displaystyle \sum _{p\in P}x_{jp}+\sum _{S\subseteq {\mathcal {D}},j\in S}z_{S}\ge 1, ~~~~~\forall j\in {\mathcal {D}}, \\ &{} \displaystyle \sum _{p:i_{l}\in p}x_{jp}\le y_{i_{l}}, ~~~~~~~~~~~~~~~~~~\forall j\in {\mathcal {D}}, ~i_{l}\in {\mathcal {F}}^{l},~l=1,\cdots ,k,\\ &{} x_{jp}\ge 0,~~~~~~~~~~~~~~~~~~~~\forall p\in P,~j\in {\mathcal {D}},\\ &{} y_{i_{l}}\ge 0,~~~~~~~~~~~~~~~~~~~~\forall i_{l}\in {\mathcal {F}}^{l},~l=1,\cdots ,k,\\ &{} z_{S}\ge 0,~~~~~~~~~~~~~~~~~~~~\forall S\subseteq {\mathcal {D}}. \end{array} \end{aligned}$$
(2)

The dual program corresponding to the linear programming relaxation (2) is the following:

$$\begin{aligned}&\max ~~\sum _{j\in {\mathcal {D}}}\alpha _{j}\\&\begin{array}{rl@{}ll} s.t.&{}\displaystyle \alpha _{j}\le c_{jp}+\sum _{i_{l}\in P}\beta _{i_{l}j}, ~~~~~\forall p\in P, j\in {\mathcal {D}}, \\ &{} \displaystyle \sum _{j\in S}\alpha _{j}\le h(S), ~~~~~~~~~~~~~\forall S\subseteq {\mathcal {D}},\\ &{} \displaystyle \sum _{j\in {\mathcal {D}}}\beta _{i_{l}j}\le f_{i_{l}},~~~~~~~~~~~~~\forall i_{l}\in {\mathcal {F}}^{l},~l=1, \cdots , k,\\ &{} \alpha _{j}\ge 0,~~~~~~~~~~~~~~~~~~~~\forall j\in {\mathcal {D}},\\ &{} \beta _{i_{l}j}\ge 0,~~~~~~~~~~~~~~~~~~~\forall j\in {\mathcal {D}}, i_{l}\in \mathcal {F}^{l},~l=1, \cdots , k. \end{array} \end{aligned}$$

where \(\alpha _{j}\) is the total spending of client j in the process, and \(\beta _{i_{l}j}\) indicates the opening cost of facility \(i_{l}\) given by client j.

3 Primal-dual approximation algorithm

Different from the primal-dual approximation algorithm of Li et al. (2012), we change the way of last opening facility set to get much more tight result for k-FLPSP.

In order to understand the situation of each client when we run primal-dual algorithm, we give some definitions. Initially, all clients in \({\mathcal {D}}\) are unfrozen, if client j is connected to path p of which each facility of path p is open, client j is frozen. When \(\sum \limits _{j\in {\mathcal {D}}}\beta _{i_{l}j}=f_{i_{l}}\), the facilities \(i_{l}\) is open. For some path \(p=(i_{1}\in {\mathcal {F}}^{1}, \cdots , i_{l}\in {\mathcal {F}}^{l})\), if facilities \(i_{1}, i_{2},\cdots , i_{l-1}\) are all open and \(\alpha _{j}=c_{jp}+\sum _{l^{'}=1}^l\beta _{i_{l'}j}\), client j reaches facility \(i_{l}\in {\mathcal {F}}^{l}\). If facilities \(i_{l}\in {\mathcal {F}}^{l}\) is open, client j leaves \(i_{l}\) and makes contribution to connection cost for the facility of the \(l+1\)th level, \(1 \le l< k\) or \(l=k\), client j is connected. The dual variable \(\alpha _{j}\) of all unfrozen clients \(j\in {\mathcal {D}}\) increase uniformly at unit rate of time t. \(\beta _{i_{l}j}\) increases at same rate of \(\alpha _{j}\) when client \(j\in {\mathcal {D}}\) reaches unopened facility \(i_{l}\in {\mathcal {F}}^{l}\). All dual variables \(\beta _{i_{l}j}(j\in {\mathcal {D}})\) stop increasing when facility \(i_{l}\) is open. Time will stop until there is no unfrozen client. Facility \(i_{l}\in {\mathcal {F}}^{l}\) is temporarily open when \(\displaystyle \sum _{j\in {\mathcal {D}}}\beta _{i_{l}j}=f_{i_{l}}\), \(t_{i_{l}}\) is the moment of facility \(i_{l}\) temporarily open. The predecessor of \(i_{l}\) will be the facility in the \(l- 1\)th level via which \(i_{l}\) was for the first time reached by a client, i.e. \(pred(i_{l}):=\displaystyle \arg \min _{i\in {\mathcal {F}}^{l-1}}\{ t_{i}+c_{ii_{l}}\}\), \(t_{i}\) is time of facility i temporarily open, the predecessor of \(i_{1}\in {\mathcal {F}}^{1}\) is the client which is closest to \(i_{1}\), \(t_{pred(i_{1})}:=0\).

Algorithm 1

Step 1. Initialization: we introduce the notion of time t, Initially, \(t=0\), set \(\alpha _{j}=0(\forall j\in {\mathcal {D}})\), \(\beta _{i_{l}j}=0(i_{l}\in {\mathcal {F}}^{l}, 1\le l\le k, j\in {\mathcal {D}})\), all facilities are unopened and all clients are unfrozen. \({\tilde{S}}\) is the set of penalized clients, initially \({\tilde{S}}=\emptyset \). With increase of time, there will occur three events:

Event 1. Facility \(i_{k}\) is temporarily open, we freeze those clients j with \(\beta _{i_{k}j}>0\) and let those clients be connected to \(i_{k}\), facility \(i_{k}\) is the connecting witness for client j. The associated path of \(i_{k}\) is \(p(i_{k})=(i_{1}, i_{2}, \cdots , i_{k})\), \(i_{l}=pred(i_{l+1})\), \(\forall 1\le l\le k-1\), the predecessor of \(i_{1}\) is the client \(j_{i_{1}}\). The neighborhood of facility \(i_{k}\) are the clients which pay for connected path \(p(i_{k})\), i.e. \(N(i_{k}):=\{j\in {\mathcal {D}}|\beta _{i_{l}j}>0, i_{l}\in p(i_{k})\}.\)

Event 2. When unfrozen client j reaches temporarily open facility \(i_{k}\), freeze client j, we call facility \(i_{k}\) the connecting witness of client j.

Event 3. For some subset \(S\subseteq {\mathcal {D}}\), if \(\displaystyle \sum _{j\in S}\alpha _{j}=h(S)\), freeze all unfrozen clients in S, set \({\tilde{S}}:={\tilde{S}}\cup S\), we call clients in \({\tilde{S}}\) the penalized clients.

When all clients are frozen, Step 1 terminates. If several events occur simultaneously, the algorithm executes these events in an arbitrary order.

Step 2. We choose set \({\tilde{S}}\) in Step 1 as the penalized client set, the temporarily open facility set on level k is \(\mathcal {{\tilde{F}}}^{k}\) and sort these facilities temporarily according to the open time t with nondecreasing order. Let \(\mathcal {{\bar{F}}}^{k}\) be finial open facility set, we add facility \(i_k\) to \(\bar{{\mathcal {F}}}^k\) with order if and only if there is no facility \(i'_{k}\in \mathcal {{\bar{F}}}^{k}\), which is satisfied with \(c_{i_{k}i'_{k}}\le 3t_{i_{k}}\), upset \(\mathcal {{\bar{F}}}^{k}:=\mathcal {{\bar{F}}}^{k}\cup \{i_{k}\}\), open facility \(i_{k}\in \mathcal {{\bar{F}}}^{k}\) and the associated path \(p(i_{k})\). For each client \(j\in N(i_{k})\), if \(i_{k}\in \mathcal {{\bar{F}}}^{k}\), connect client j to the associated path \(p(i_{k})\) of facility \(i_{k}\). Otherwise, connect client j to the associated path \(p(i'_{k})\) of the closest facility \(i'_{k}\).

Algorithm 1
figure a

.

Lemma 1

Li et al. (2012) Algorithm 1 can be solved in polynomial time.

Lemma 2

If facility \(i_{k}, i'_{k}\in \mathcal {{\bar{F}}}^{k}\), then \(N(i_{k})\cap N(i'_{k})=\emptyset \).

Proof

We assume that \(N(i_{k})\cap N(i'_{k})\ne \emptyset \), there exists a client \(j\in N(i_{k})\cap N(i'_{k})\). Suppose that facility \(i_{k}\) is open after the opening of facility \(i'_{k}\), then

$$\begin{aligned} c_{i_{k}i'_{k}}>3t_{i_{k }}>2t_{i_{k }}>t_{i_{k }}+t_{i'_{k}}. \end{aligned}$$

Due to \(j\in N(i_{k})\), there exists a facility \(i_{l}\in p(i_{k})\) with \(\beta _{i_{i}j}>0\), and there exists a path \(p_{i_{l}}\) from \({\mathcal {F}}^{1}\) to facility \(i_{l}\), where \(c_{jp_{i_{l}}}\le t_{i_{l}}\). We consider the restructure of the path \(p_{i_{k}}\) as following: firstly, along the path \(p_{i_{l}}\) from \({\mathcal {F}}^{1}\) to \(i_{l}\), then along the associated path \(p(i_{k})\) of facility \(i_{k}\) from \(i_{l}\) to \(i_{k}\). According to the definition of predecessor, \(c_{jp_{i_{k}}}\le t_{i_{k}}\). Analogously, there exists a path \(p_{i'_{k}}\) of \(i'_{k}\) from \({\mathcal {F}}^{1}\) to \(i'_{k}\) with \(c_{jp_{i'_{k}}}\le t_{i'_{k}}\). In conclusion, \(c_{ji_{k}}+c_{ji'_{k}}\le c_{j p_{i_{k}}}+c_{j p_{i'_{k}}}\le t_{i_{k }}+t_{i'_{k}}< c_{i_{k}i'_{k}}\), this conflict with triangle inequality. \(\square \)

Lemma 3

If \(j\in N(i_{k})\backslash {\tilde{S}}\), \(i_{k}\in \mathcal {{\bar{F}}}^{k}\), then \(t_{i_{k}}\le 2\alpha _{j}\).

Proof

We assume that \(t_{i_{k}}> 2\alpha _{j}\), since client \(j\in N(i_{k})\backslash {\tilde{S}}\), from the proof of Lemma 2, we know that there exists a path \(p_{i_{k}}\) of facility \(i_{k}\) from \({\mathcal {F}}^{1}\) to facility \(i_{k}\) with \(c_{jp_{i_{k}}}\le t_{i_{k}}\). Client j firstly reaches the open facility \(i'_{k}\in {\mathcal {F}}^{k}\). Then we have \(t_{i'_{k}}\le \alpha _{j}\), there also exists a path \(p_{i'_{k}}\), such that \(c_{jp_{i'_{k}}}\le \alpha _{j}\).

Case 1. If facility \(i'_{k}\) is open, then \(c_{i_{k}i'_{k}}\le c_{jp_{i'_{k}}}+c_{jp_{i_{k}}}\le t_{i_{k}}+\alpha _{j}<\frac{3}{2}t_{i_{k}}<3t_{i_{k}}.\)

Case 2. If facility \(i'_{k}\) is not open, there exists an open facility \(i''_{k}\in \mathcal {{\bar{F}}}^{k}\), \(c_{i'_{k}i''_{k}}\le 3t_{i'_{k}}\) and \(t_{i''_{k}}\le t_{i'_{k}}\le \alpha _{j}<t_{i_{k}}\). So we can get

$$\begin{aligned} c_{i_{k}i''_{k}}&\le c_{jp_{i_{k}}}+c_{jp_{i'_{k}}}+c_{i'_{k}i''_{k}}\\&\le t_{i_{k}}+\alpha _{j}+3t_{i'_{k}}\\&\le t_{i_{k}}+4\alpha _{j}\\&<3t_{i_{k}}. \end{aligned}$$

This conflict with the condition of facility opening in Algorithm 1. \(\square \)

Lemma 4

Li et al. (2012) At any moment t in Step 1 of Algorithm 1, the cost of penalized clients set \({\tilde{S}}\) is:

$$\begin{aligned} \sum _{j\in {\tilde{S}}}\alpha _{j}(t)=h({\tilde{S}}). \end{aligned}$$

\(\alpha _{j}(t)\) is equal to \(\alpha _{j}\) which \(\alpha _{j}\) is at moment t, and \(\alpha _{j}\) increases uniformly with time t until client j is frozen.

In order to get tight bound of total cost of k-FLPSP, we analyze the opening cost of facilities in detail.

Lemma 5

\(f(p{(i_{k})})\le \displaystyle \sum _{i\in p{(i_{k})}}\sum _{j\in N(i_{k})\backslash {\tilde{S}}}\beta _{ij}+\sum _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}\), \(i_{k}\in \mathcal {{\bar{F}}}^{k}\).

Proof

$$\begin{aligned} f(p{(i_{k})})&=\sum _{i\in p{(i_{k})}}\sum _{j\in N(i_{k})}\beta _{ij}\\&=\sum _{i\in p{(i_{k})}}\sum _{j\in N(i_{k})\backslash {\tilde{S}}}\beta _{ij}+\sum _{j\in N(i_{k})\cap {\tilde{S}}}\beta _{ij}\\&\le \sum _{i\in p{(i_{k})}}\sum _{j\in N(i_{k})\backslash {\tilde{S}}}\beta _{ij}+\sum _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}. \end{aligned}$$

\(\square \)

Lemma 6

On Step 2 of the Algorithm 1, client j is connected to facility \(i_{k}\) and \(i_{k}\) is not a connecting witness of client j, while j makes no contribution to facilities of path \(p(i_{k})\), i.e. \(M(i_{k}):=\{j\in {\mathcal {D}}\backslash N(i_{k})\cup {\tilde{S}} | i(j)\ne i_{k}, {\bar{i}}(j)=i_{k}\}\). If facility i(j) is a connecting witness of client j at Step 1, client j at Step 2 is connected to facility \({\bar{i}}(j)\), then

$$\begin{aligned}c_{jp(i_{k})}\le 6\alpha _{j}, j\in M(i_{k}).\end{aligned}$$

Proof

We assumed that client j is the connecting witness of \(i'_{k}\) at Step 1, then there exists a path \(p_{i'_{k}}\), which \(c_{jp_{i'_{k}}}\le \alpha _{j}\) and \(t_{i'_{k}}\le \alpha _{j}\). According to triangle inequality, we have:

$$\begin{aligned} \begin{array}{rl} c_{jp(i'_{k})}&{}=c_{ji_{1}}+\displaystyle \sum _{l'=2}^{k}c_{i_{l'-1}i_{l'}}\\ &{}\le c_{jp_{i'_{k}}}+2\displaystyle \sum _{l'=2}^{k}c_{i_{l'-1}i_{l'}}\\ &{}\le \alpha _{j}+2t_{i'_{k}}\\ &{}\le 3\alpha _{j}. \end{array} \end{aligned}$$

If \(i'_{k}\in \mathcal {{\bar{F}}}^{k}\), then \(c_{jp(i_{k})}\le c_{jp(i'_{k})}\le 3\alpha _{j}.\)

If \(i'_{k}\notin \mathcal {{\bar{F}}}^{k}\), there exist facility \(i''_{k}\in \mathcal {{\bar{F}}}^{k}\) with \(c_{i'_{k}i''_{k}}\le 3t_{i'_{k}}\), and \(t_{i''_{k}}\le t_{i'_{k}}\), \(N(i'_{k})\cap N(i''_{k})\ne \emptyset .\) Let client \(j\in N(i'_{k})\cap N(i''_{k})\), from the proof above, we know that there exist a associated path \(p(i''_{k}):=(i''_{1},\cdots , i''_{k-1}, i''_{k}=i'')\) of facility \(i''_{k}\). The connection cost of client j satisfies the following inequality:

$$\begin{aligned} \begin{array}{rl} c_{jp(i_{k})}&{}\le c_{jp(i''_{k})}\\ &{}=c_{jp(i''_{1})}+\sum _{l=2}^{k}c_{i''_{l-1}i''_{l}}\\ &{}\le c_{jp(i'_{k})}+c_{i''_{k}i''_{k}}+2\sum _{l=2}^{k}c_{i''_{l-1}i''_{l}}\\ &{}\le c_{jp(i'_{k})}+c_{i''_{k}i''_{k}}+2t_{i''_{k}}\\ &{}\le \alpha _{j}+5t_{i'_{k}}\\ &{}\le 6\alpha _{j}. \end{array} \end{aligned}$$

\(\square \)

Lemma 7

Client j is connected to facility \(i_{k}\) at Step 2 of Algorithm 1 and \(i_{k}\) is the connecting witness of client j at Step 1, while client j makes no contribution to the opening of facilities of path \(p(i_{k})\), i.e. \(j\in M'(i_{k}), M'(i_{k}):=\{j\in {\mathcal {D}}\backslash N(i_{k})\cup {\tilde{S}}|i(j)=i_{k},{\bar{i}}(j)=i_{k}\}\), i(j) and \({\bar{i}}(j)\) are the same with Lemma 6, so we have

$$\begin{aligned} c_{jp(i_{k})}\le 3\alpha _{j}, j\in M'(i_{k}). \end{aligned}$$

Proof

Since client j is the connecting witness of facility \(i_{k}\) at Step 1, from Lemma 6 we know that: \(c_{jp(i_{k})}\le 3\alpha _{j}, j\in M'(i_{k}).\) \(\square \)

Lemma 8

For \(i_{k}\in \mathcal {{\bar{F}}}^{k}\),

$$\begin{aligned} 3f(p(i_{k}))+\sum _{j\in N(i_{k})\backslash {\tilde{S}}}c(jp(i_{k}))\le 6\sum _{j\in N(i_{k})\backslash {\tilde{S}}}\alpha _{j}+3\sum _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}. \end{aligned}$$

Proof

For client \(j\in N(i_{k})\backslash {\tilde{S}}\), the contribution of client j to the facilities of path \(p(i_{k})\), facility \(i_{l}\) is the first facility which client j makes contribution to, i.e. \(i_{l}(j):=\{i_{m}\in p(i_{k})| \beta _{i_{m}j}>0, \beta _{i_{m}j}=0, \forall 1\le n<m\}\). There exists a path \(p_{i_{l}(j)}\) from \(\mathcal {{\bar{F}}}^{1}\) to facility \(i_{l}\) with \(c_{jp_{i_{l}(j)}}\le t_{i_{l}}\). Let \(A=\sum _{l'=2}^{l}c_{i_{l'-1}i_{l'}}+\sum _{l'=2}^{k}c_{i_{l'-1}i_{l'}}\)

$$\begin{aligned}&3f(p(i_{k}))+\sum \limits _{j\in N(i_{k})\backslash {\tilde{S}}}c_{jp(i_{k})}\\&\le 3\sum \limits _{j\in N(i_{k})\backslash {\tilde{S}}}\sum \limits _{i\in p(i_{k})}\beta _{ij}+3\sum \limits _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}+\sum \limits _{j\in N(i_{k})\backslash {\tilde{S}}}c_{jp(i_{k})}\\&\le 3\sum \limits _{j\in N(i_{k})\backslash {\tilde{S}}}\sum \limits _{l'=1}^{k}\beta _{i_{l'}j}+\sum \limits _{j\in N(i_{k})\backslash {\tilde{S}} }(c_{jp_{i_{l}(j)}}+A)+3\sum \limits _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}\\&\le \sum \limits _{j\in N(i_{k})\backslash {\tilde{S}}}(3\sum \limits _{l'=1}^{k}\beta _{i_{l'}j}+c_{jp_{i_{l}(j)}}+A )+3\sum \limits _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}\\&\le \sum \limits _{j\in N(i_{k})\backslash {\tilde{S}}}[(\sum \limits _{l'=1}^{k}\beta _{i_{l'}j}+c_{jp_{i_{l}(j)}} )+2A]+3\sum \limits _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}\\&\le \sum \limits _{j\in N(i_{k})\backslash {\tilde{S}}}[(\sum \limits _{l'=1}^{k}\beta _{i_{l'}j}+c_{jp_{i_{l}(j)}})+2t_{i_{k}} ]+3\sum \limits _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}\\&\le \sum \limits _{j\in N(i_{k})\backslash {\tilde{S}}}(2\alpha _{j}+2t_{i_{k}})+3\sum \limits _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}\\&\le 6\sum \limits _{j\in N(i_{k})\backslash {\tilde{S}}}\alpha _{j}+3\sum \limits _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}. \end{aligned}$$

Lemma 8 is proved. \(\square \)

Theorem 1

For k-FLPSP, the approximation ratio of Algorithm 1 is 6.

Proof

FCh are the opening, connection and penalty cost, respectively.

$$\begin{aligned}&3(F+h)+C\\&=3(\sum _{i_{k}\in \mathcal {{\bar{F}}}^k}f(p(i_{k}))+\sum _{j\in {\tilde{S}}}\alpha _{j})+\sum _{i_{k}\in \mathcal {{\bar{F}}}^k}(\sum _{j\in M(i_{k})}c_{jp(i_{k})}+\sum _{j\in M'(i_{k})}c_{jp(i_{k})}\\&\quad +\sum _{j\in N(i_{k})\backslash {\tilde{S}}}c_{jp(i_{k})})\\&=\sum _{i_{k}\in \mathcal {{\bar{F}}}^k}(3f(p(i_{k}))+\sum _{j\in N(i_{k})\backslash {\tilde{S}}}c_{jp(i_{k})} )+\sum _{i_{k}\in \mathcal {{\bar{F}}}^k}(\sum _{j\in M(i_{k})}c_{jp(i_{k})}\\&\quad +\sum _{j\in M'(i_{k})}c_{jp(i_{k})} )+3\sum _{j\in {\tilde{S}}}\alpha _{j}\\&\le \sum _{i_{k}\in \mathcal {{\bar{F}}}^k}(6\sum _{j\in N(i_{k})\backslash {\tilde{S}}}\alpha _{j}+3\sum _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j})+\sum _{i_{k}\in \mathcal {{\bar{F}}}^k}(6\sum _{j\in M(i_{k})}\alpha _{j}\\&\quad +3\sum _{j\in M'(i_{k})}\alpha _{j})+3\sum _{j\in {\tilde{S}}}\alpha _{j}\\&\quad \le 6\sum _{i_{k}\in \mathcal {{\bar{F}}}^k}(\sum _{j\in N(i_{k})\backslash {\tilde{S}}}\alpha _{j}+\sum _{j\in M(i_{k})}\alpha _{j}+\sum _{j\in M'(i_{k})}\alpha _{j} )\\&\quad +3\sum _{i_{k}\in \mathcal {{\bar{F}}}^k}\sum _{j\in N(i_{k})\cap {\tilde{S}}}\alpha _{j}+3\sum _{j\in {\tilde{S}}}\alpha _{j}\\&\quad \le 6\sum _{j\in {\mathcal {D}}\backslash {\tilde{S}}}\alpha _{j}+6\sum _{j\in {\tilde{S}}}\alpha _{j}. \end{aligned}$$

Which proves the Theorem. \(\square \)

We restructure the k-FLPSP and firstly present the method of restructuring in the following, then we give some notations. Lastly, we use greedy augmentation to improve the initial solution.

Restructures: Consider the k-level facilities of the k-FLPSP. For every \(i_{1}\in {\mathcal {F}}^{1}\), let \(P_{i_{1}}=\{i_{1}, i_{2}\in {\mathcal {F}}^{2},\cdots ,i_{k}\in {\mathcal {F}}^{k}\}\). Let \({\mathcal {D}}\) be the set of clients, and the set of the "facility" be \(P:=\{p:p\in P _{i_{1}}, \forall i_{1}\in {\mathcal {F}}^{1}\}\), such as \(p:=(i_{1}\in {\mathcal {F}}^{1}, i_{2}\in {\mathcal {F}}^{2},\cdots , i_{k}\in {\mathcal {F}}^{k})\). the opening cost of each facility of each path p is calculated as follows:

$$\begin{aligned} f'_{i}:=\left\{ \begin{array}{ll} f_{i}, &{} when~i~firstly~be~computed~in~a~path,\\ 0, &{} otherwise.\\ \end{array} \right. \end{aligned}$$

This means that the opening cost of each facility can just be computed once. The opening cost of each path \(p\in P\) is \(f'_{p}=\displaystyle \sum _{i\in p}f'_{i}\). In order to improve the approximation ratio of the current solution, we use greedy augmentation technique. By executing some local search operations from an arbitrary integer feasible solution, we can improve the current approximation ratio. Let \({\mathcal {F}}_{0}\) be the set of open facilities, we randomly choose a facility of each level from \({\mathcal {F}}_{0}\), enumerate all paths, let \(P_{0}\) be the path set and \( S_{0}\) be the set of rejected clients in the current solution, let \( F_{0}\), \( C_{0}\), \( h_{0}\) be the current opening, connection and penalty costs. \(c^{{\mathcal {F}}_{0}}_{j}\) is the connection cost of the client to its closest open path in current solution, \(C({\mathcal {F}}_{0}\cup p)\) is the connection cost of all clients after adding path p to facility set \({\mathcal {F}}_{0}\). gain(S) and gain(p) can be calculated as follows:

$$\begin{aligned} gain(S)&=\sum _{j\in S\backslash S_{0}}c^{{\mathcal {F}}_{0}}_{j}+h_{0}-h_{S},~~~~~~S_{0}\subseteq S;\\ gain(p)&=C_{0}-C({\mathcal {F}}_{0}\cup p)-f'_{p},~~~~~p\in P\backslash P_{0}. \end{aligned}$$

We will try to improve the current solution by one of the two local search operations: either replacing \( S_{0}\) with a larger set S \( (S_{0}\subseteq S\subseteq {\mathcal {D}})\) or incorporating a path p from the path set P minus the current solution.

First, we extend the definition of gain(p) for \(p\in P_{0}\), and set \(gain(p)=0\) for \(p\in P_{0}\). We will improve the approximation ratio by iteration upset the set penaltied client or facility set until there is no set satisfies the condition. The following is the new greedy augmentation algorithm.

Algorithm 2

Step 1. The arbitrary initial feasible solution \(SOL_{0}\) with open facilities set \({\mathcal {F}}_{0}\), the path set is \(P_{0}\) and rejected clients set \( S_{0}\). Initialization s:\(=1\).

Step 2. Find a path \(p^{*}\in P\backslash P_{s-1}\) to maximize \(\displaystyle {\left\{ \frac{gain(p)}{f'_{p}}\right\} }\), where \(P_{s-1}\) is the path set of current solution, when \(s=1\), \(P_{s-1}= P_{0}\), let

$$\begin{aligned} r_{s}:=\frac{gain(p^{*})}{f'_{p^{*}}}=\max _{p\in P\backslash P_{s-1}} \displaystyle {\left\{ \frac{gain(p)}{f'_{p}}\right\} }. \end{aligned}$$

          We can calculate

$$\begin{aligned} m_{k}=\max _{S\subseteq {\mathcal {D}}\backslash S_{s-1}}{gain(S_{s-1}\cup S)}. \end{aligned}$$

Where \(S_{s-1}\) is the penaltied clients set in the \((s-1)\)th iteration, S is client subset which \(S\subseteq {\mathcal {D}}\backslash S_{s-1}\). If \(m_{s}> 0\), find a set of unrejected clients \(S^{*}\) which is the optimal solution of the following optimization problem

$$\begin{aligned} r'_{s}:=\max _{S\subseteq {\mathcal {D}}\backslash S_{s-1}}\displaystyle {\left\{ \frac{gain(S_{s-1}\cup S)}{h(S_{s-1}\cup S)-h(S_{s-1})}\right\} }, \end{aligned}$$

otherwise, set \(r'_{s}:=0\).

Step 3. If \(\max \{r_{s},r'_{s}\}\le 0\), the algorithm terminates, and outputs a feasible solution \(SOL_{s-1}\) (\(SOL_{s-1}\) is the solution which means we can’t found any client subset or facility subset to be added, so the Algorithm terminates in the sth iteration. The Algorithm outputs the \((s-1)th\) solution.) with open facilities set \(P_{s-1}\) and the rejected clients set \(S_{s-1}\).

Step 4. If \(r_{s}\ge r'_{s}\), open the path p and maintain the rejected clients set, meaning that we get a feasible solution \(SOL_{s-1}\) with \(P_{s}:=P_{s-1}\cup \{p\}\) and \(S_{s}:=S_{s-1}\); otherwise, extending the rejected clients set to \(S_{s-1}\cup S^{*}\) and maintaining the opening facilities set, meaning that we get a feasible solution \(SOL_{s-1}\) with \(P_{s}:= P_{s-1}\) and \(S_{s}:=S_{s-1}\cup S^{*}.\) Update \(s:=s+1\), and return to Step 1.

Algorithm 2
figure b

.

In the following, we present the whole algorithm.

Algorithm 3

Step 0. Given an instance of k-FLPSP, scale the facility cost and penalty function by a factor \(\delta =0.8571\).

Step 1. Through running the primal-dual algorithm (Algorithm 1) on the scaled instance to obtain a feasible solution \(SOL_{0}\) to the original instance.

Step 2. Let \(SOL_0\) be the initial feasible solution, apply the greedy augmentation algorithm (Algorithm 2) to get the solution \(\check{SOL}\), Where \(\check{SOL}\) is the solution which we obtain.

Algorithm 3
figure c

.

4 Analysis

We give the proof in Lemma 912 to prove that our algorithm can obtain a penaltied client subset in polynomial time or we can enumerate all the paths, and we also give the approximation ratio of Algorithm 3 in the following.

Lemma 9

Fujishige (2005) Suppose that f: \(2^{{\mathcal {D}}}\rightarrow R\) is a non-negative function with \(f(\emptyset )=0\), and g: \(2^{{\mathcal {D}}}\rightarrow R \) is a nonnegative function satisfying \(g(\emptyset )=0\) and \(g(S)>0\) for some \(S\subseteq {\mathcal {D}}\). Define the following minimum-ratio problem

$$\begin{aligned}&\min ~~~\frac{f(S)}{g(S)} \nonumber \\&\begin{array}{rl@{}ll} s.t.&{}\displaystyle g(S)>0, \\ &{}S\subseteq {\mathcal {D}}. \end{array} \end{aligned}$$
(4.1)

The Lagrangian function for f and \(-g\) associated with (4.1) is given by

$$\begin{aligned} L(\lambda , S)=f(S)-\lambda g(S), \end{aligned}$$

for \(\lambda \ge 0\) and \(S\subseteq {\mathcal {D}}\). Then a nonnegative \({\hat{\lambda }}\) is the minimum value of (4.1) if and only if

$$\begin{aligned} \min _{S\subseteq {\mathcal {D}}}L(\lambda ,S)&=0,~~0\le \lambda \le {\hat{\lambda }},\\ \min _{S\subseteq {\mathcal {D}}}L(\lambda ,S)&=0,~~{\hat{\lambda }}<\lambda . \end{aligned}$$

Furthermore, the minimum-ratio problem (4.1) is solvable in polynomial time if \(\displaystyle \min _{S\subseteq {\mathcal {D}}}L(\lambda ,S)\) is solvable in polynomial time for any \(\lambda \ge 0\).

Lemma 10

Li et al. (2013) For Step 2 of Algorithm 2,

$$\begin{aligned} m_{s}=\max \limits _{S\subseteq {\mathcal {D}}\backslash S_{s-1}}\{gain(S_{s-1}\cup S)\} \end{aligned}$$

is solvable in polynomial time and the maximum-ratio problem

$$\begin{aligned} r'_{s}:= \max _{S\subseteq {\mathcal {D}}\backslash S_{s-1}}\left\{ \frac{gain(S_{s-1}\cup S)}{P(S_{s-1})-P_{S_{s-1}}}\right\} \end{aligned}$$

is solvable in polynomial time when \(m_{k}>0\).

Lemma 11

The path p can be found in polynomial time in Algorithm 2.

Proof

There are total \(|{\mathcal {F}}^{1}||{\mathcal {F}}^{2}|\cdots |{\mathcal {F}}^{k}|\) paths after we restructure the k-FLPSP, for any constant k. The number of paths of the final solution is at most \(|{\mathcal {D}}||{\mathcal {F}}^{1}||{\mathcal {F}}^{2}|\cdots |{\mathcal {F}}^{k}|\) minus the number of paths which in the initial solution \(SOL_{0}\). So Step 2 of the Algorithm 2 can be finished in polynomial time. \(\square \)

Lemma 12

The Algorithm 2 can be completed in polynomial time.

Proof

In the Step 1 of Algorithm 2, we know that the Algorithm of Li et al. (2013) can be complete in the polynomial time. For Lemmas 911, it is possible in polynomial time to find a path or a subset of clients to be punished. So the Algorithm 2 can be executed in polynomial time. \(\square \)

Lemma 13

\(\sum \limits _{p\in P_{SOL}}gain(p)+gain(S_{SOL}\cup S_{s})\ge C_{s}-(F_{SOL}+h_{SOL}+C_{SOL}), \) where \(F_{SOL}\), \(C_{SOL}\), \(h_{SOL}\) are the opening, connection and penalty costs of arbitrary integer solution SOL of k-FLPSP, \(P_{SOL}\) is the path set of solution SOL. \(S_{s}\) is the penaltied clients set in current solution \(SOL_{s}\) and \(S_{SOL}\) is the penaltied clients set in solution SOL.

Proof

For arbitrary path \(p\in {\mathcal {F}}_{SOL}\), \({\mathcal {D}}_{SOL}(p)\) is the set of clients which are assigned to path p in SOL. For arbitrary client \(j\in {\mathcal {D}}_{SOL}(p)\), let \(\sigma (j)\) and \(\sigma _{SOL}(j)\) be the paths servicing j in the current solution \(SOL_{s}\)(the output feasible solution of the sth iteration in Algorithm 3 ) and solution SOL, respectively. Let \(c_{jp}=c_{ji_{1}}+\displaystyle \sum _{t=2}^kc_{i_{t-1}i_{t}}\).

By including path p and reassigning the clients in \({\mathcal {D}}_{SOL}(p)\backslash S_{s}\) to p, we can update the current solution. The \(gain'(p)\) is the resulted saving in cost, i.e.,

$$\begin{aligned} \begin{array}{rl} gain'(p)=\sum \limits _{j\in {\mathcal {D}}_{SOL}(p)\backslash S_{s}}(c_{j\sigma (j)}-c_{j\sigma _{SOL}(j)})-f'_{p}. \end{array} \end{aligned}$$

Note that it may occur that \(gain'(p)< 0\). From the definition of gain(p), we know that \(gain(p)>gain'(p)\). For \(j\in {\mathcal {D}}_{SOL}(p)\), let \(p=\sigma _{SOL}(j)\). Let \(P_{SOL}\) be the path set of solution SOL, \(S=S_{SOL}\cup S_{s}\). We have

$$\begin{aligned}&\sum \limits _{p\in P_{SOL}}gain'(p)+gain(S)\\&=\sum \limits _{p\in P_{SOL}}(-f'_{p}+\sum \limits _{j\in {\mathcal {D}}_{SOL}(p)\backslash S_{s}}(c_{j\sigma (j)}-c_{j\sigma _{SOL}(j)}))+ \sum \limits _{j\in S\backslash S_{s}}c_{j\sigma (j)}\\&\quad -(h(S)-h(S_{s}))\\&=\displaystyle -\sum \limits _{p\in P_{SOL}}f'_{p}+(\sum \limits _{p\in P_{SOL}}\sum \limits _{j\in {\mathcal {D}}_{SOL}(p)\backslash S_{s}}c_{j\sigma (j)}+\sum \limits _{j\in S\backslash S_{s}}c_{j\sigma (j)})\\&\quad -\sum \limits _{p\in P_{SOL}}\sum \limits _{j\in {\mathcal {D}}_{SOL}(p)\backslash S_{s}}c_{j\sigma _{SOL}(j)} -(h(S)-h(S_{s}))\\&\ge C_{s}-F_{SOL}-h_{SOL}-C_{SOL}. \end{aligned}$$

So we obtain the inequality \(\square \)

From the definition of \(r_{s+1}\) and \(r'_{s+1}\) in Algorithm 3, and Lemma 13, we have the following lemma.

Lemma 14

\(\max \{r_{s+1},r'_{s+1}\}\ge \frac{C_{s}-F_{SOL}-h_{SOL}-C_{SOL}}{F_{SOL}+h_{SOL}}.\)

Proof

Firstly, we assume \(h(S_{SOL}\cup S_{s})-h(S_{s})>0\). Otherwise, with slight simplification the following proof can be adapted to the special cases with \(f'_{p}=0\) for paths \(p\in {\mathcal {F}}_{SOL}\) or \(h(S_{SOL}\cup S_{s})-h(S_{s})=0\). We consider the special case that there exists some special paths with \(f'_{p}=0\), when \(f'_{p}=0\), this means that the cost of facilities in p have been computed before, let \(P''\) be the facility set of all the special paths, we can calculate as follows:

$$\begin{aligned}&\max \limits _{p\in P_{SOL}\backslash P''}{\left\{ \frac{gain(p)}{f'_{p}}\right\} }F_{SOL}+\sum \limits _{p\in P''}gain(p)\\&=\max \limits _{p\in P_{SOL}\backslash P''}{\left\{ \frac{gain(p)}{f'_{p }}\right\} }\sum \limits _{p\in P_{SOL}\backslash P''}\sum \limits _{i\in p}f'_{i}+\sum \limits _{p\in P''}gain(p)\\&=\sum \limits _{p\in P_{SOL}\backslash P''}\max \limits _{p\in P_{SOL}}{\left\{ \frac{gain(p)}{\sum \limits _{i\in p} f'_{i}}\right\} }\sum _{i\in p}f'_{i}+\sum _{p\in P''}gain(p)\\&\ge \sum \limits _{ p\in P_{SOL}\backslash P''}{\left( \frac{gain(p)}{\sum \limits _{i\in p}f'_{i}}\sum \limits _{i\in p}f'_{i}\right) }+\sum \limits _{p\in P''}gain(p)\\&=\sum \limits _{p\in P_{SOL}}gain(p). \end{aligned}$$

We can obtain the result by corresponding adjustment.

Now we consider the general case. From the above definition, we have

$$\begin{aligned} \max _{p\in P_{SOL}}{\left\{ \frac{gain(p)}{f'_{p}}\right\} }F_{SOL}&=\max _{p\in P_{SOL}}{\left\{ \frac{gain(p)}{f'_{p }}\right\} }\sum _{p\in P_{SOL}}\sum _{i\in p}f'_{i} \nonumber \\&=\sum _{p\in P_{SOL}}\max _{p\in P_{SOL}}{\left\{ \frac{gain(p)}{\sum \limits _{i\in p} f'_{i}}\right\} }\sum \limits _{i\in p}f'_{i} \nonumber \\&\ge \sum \limits _{ p\in P_{SOL}}{\left( \frac{gain(p)}{\sum \limits _{i\in p}f'_{i}}\sum \limits _{i\in p}f'_{i}\right) } \nonumber \\&=\sum _{p\in P_{SOL}}gain(p). \end{aligned}$$
(4.2)

The reason for the second equality holds is that the opening cost of each facility of path p is just computed once, then the connection cost of the opened facilities is 0 after it is first opened in a path. So the total opening cost of facility set of solution SOL is \(F_{SOL}=\displaystyle \sum _{p\in P_{SOL}}\sum _{i\in p}f'_{i}\).

Due to the submodularity of \(h(\cdot )\), we have

$$\begin{aligned} \frac{gain(S_{SOL}\cup S_{s})}{h(S_{SOL}\cup S_{s})-h(S_{s})}h(S_{SOL})&\ge \frac{gain(S_{SOL}\cup S_{s})}{h(S_{SOL}\cup S_{s})-h(S_{s})}(h(S_{SOL}\cup S_{s})-h(S_{s})) \nonumber \\&=gain(S_{SOL}\cup S_{s}). \end{aligned}$$
(4.3)

Combining the result of Lemma 11 and (4.2), (4.3), we have

$$\begin{aligned} \max&\displaystyle {\left\{ \max _{{p}\in P_{SOL}}\displaystyle {\left\{ \frac{gain(p)}{f'_{p}}\right\} },\frac{gain(S_{SOL}\cup S_{s})}{h(S_{SOL}\cup S_{s})-h(S_{s})}\right\} }(F_{SOL}+h(S_{SOL}))\\&\ge \max _{{p}\in P_{SOL}}\displaystyle {\left\{ \frac{gain(p)}{f'_{p}}\right\} }F_{SOL}+\frac{gain(S_{SOL}\cup S_{s})}{h(S_{SOL}\cup S_{s})-h(S_{s})}h(S_{SOL})\\&\ge \sum _{p\in P_{SOL}}gain(p)+gain(S_{SOL}\cup S_{s})\\&\ge C_{s}-(F_{SOL}+h_{SOL}+C_{SOL}). \end{aligned}$$

According to the definitions of \(r_{s+1}\) and \(r'_{s+1}\), we have the following:

$$\begin{aligned} \max \{r_{s+1},r'_{s+1}\}&\ge \max \displaystyle {\left\{ \max _{{p}\in P_{SOL}\backslash P_{s}}\displaystyle {\left\{ \frac{gain(p)}{f'_{p}}\right\} }, \frac{gain((S_{SOL}\backslash S_{s})\cup S_{s})}{h((S_{SOL}\backslash S_{s})\cup S_{s})-h(S_{s})}\right\} }\\&=\max \displaystyle {\left\{ \max _{{p}\in P_{SOL}}\displaystyle {\left\{ \frac{gain(p)}{f'_{p}}\right\} }, \frac{gain(S_{SOL}\cup S_{s})}{h(S_{SOL}\cup S_{s})-h(S_{s})}\right\} }\\&\ge \frac{C_{s}-(F_{SOL}+h_{SOL}+C_{SOL})}{F_{SOL}+h_{SOL}}. \end{aligned}$$

\(\square \)

Lemma 15

Assume that SOL is an arbitrary feasible solution, \(F_{0}\), \(C_{0}\), \(h_{0}\) are the opening, connection and penalty cost of the initial feasible solution \(SOL_{0}\), respectively. After greedy augmentation, the total cost of the resulted solution is not exceeding

\(F_{0}+h_{0}+(F_{SOL}+h_{SOL})\max \{0,\ln (\frac{C_{0}-C_{SOL}}{F_{SOL}+h_{SOL}})\}+F_{SOL}+h_{SOL}+C_{SOL}.\)

Proof

For iteration s \((s\ge 0)\), the current solution \(SOL_{s}\) has opening cost \(F_{s}\), connection cost \(C_{s}\) and penalty cost \(h_{s}\). When \(C_{0}\le F_{SOL}+h_{SOL}+C_{SOL}\), the lemma is true. Lemma 14 indicates that there exists an integer \(m\ge 1\) such that \(C_{m}\le F_{SOL}+h_{SOL}+C_{SOL}\) and \(C_{k} > F_{SOL}+h_{SOL}+C_{SOL}\) for all \(0\le s< m\). It suffices to bound the cost at iteration m and the same bound will still hold for the final solution.

Consider an arbitrary iteration s \((0\le s <m)\). It follows from Lemma 14 and Algorithm 3 that

$$\begin{aligned} \frac{C_{s}+F_{s}+h_{s}-(C_{s+1}+F_{s+1}+h_{s+1})}{F_{s+1}+h_{s+1}-(F_{s}+h_{s})}\ge \frac{C_{s}-F_{SOL}-h_{SOL}-C_{SOL}}{F_{SOL}+h_{SOL}}. \end{aligned}$$

or equivalently,

$$\begin{aligned} F_{s+1}+h_{s+1}-F_{s}-h_{s}\le (F_{SOL}+h_{SOL})\frac{C_{s}-C_{s+1}}{C_{s}-C_{SOL}}. \end{aligned}$$

From the definition that in every iteration only one of \(F_{s}\) and \(h_{s}\) changes. We have

$$\begin{aligned} \begin{array}{rl} F_{m}+h_{m}+C_{m}&{}= F_{0}+h_{0}+\sum \limits _{s=1}^m(F_{s}+h_{s}-F_{s-1}-h_{s-1})+C_{m}\\ &{}\le F_{0}+h_{0}+(F_{SOL}+h_{SOL})\sum \limits _{s=1}^m\frac{C_{s-1}-C_{s}}{C_{s-1}-C_{SOL}}+C_{m}. \end{array} \end{aligned}$$
(4.4)

The derivation of the last expression of (4.4) for \(C_m\) is \(1-\frac{F_{SOL}+C_{SOL}}{C_{m-1}-C_{SOL}}\), for \(C_{m-1}>F_{SOL}+C_{SOL}\ge C_{SOL}\), so the right hand of inequality (4.4) increases monotonically about \(C_{m}\). When \(C_{m}=F_{SOL}+h_{SOL}+C_{SOL}\), (4.4) can arrive the maximal value. In the following discussion, we assume that \(C_{m}=F_{SOL}+h_{SOL}+C_{SOL}\). Finally, we have

$$\begin{aligned} \begin{array}{rl} &{}F_{m}+h_{m}+C_{m}\\ &{}\le F_{0}+h_{0}+(F_{SOL}+h_{SOL})\sum \limits _{s=1}^m\frac{C_{s-1}-C_{s}}{C_{s-1}-C_{SOL}}+C_{m}\\ &{}=F_{0}+h_{0}+(F_{SOL}+h_{SOL})\sum \limits _{s=1}^m(1- \frac{C_{s}-C_{SOL}}{C_{s-1}-C_{SOL}})+C_{m}\\ &{}\le F_{0}+h_{0}+(F_{SOL}+h_{SOL})\sum \limits _{s=1}^m \ln (\frac{C_{s-1}-C_{SOL}}{C_{s}-C_{SOL}})+C_{m}\\ &{}=F_{0}+h_{0}+(F_{SOL}+h_{SOL})\ln (\frac{C_{0}-C_{SOL}}{C_{m}-C_{SOL}})+C_{m}\\ &{}=F_{0}+h_{0}+(F_{SOL}+h_{SOL})\ln (\frac{C_{0}-C_{SOL}}{F_{SOL}+P_{SOL}})+F_{SOL}+h_{SOL}+C_{SOL}. \end{array} \end{aligned}$$

\(\square \)

Theorem 2

The approximation ratio for Algorithm 3 is no more than 2.9444.

Proof

Let \(F_{OPT}, C_{OPT}, h_{OPT}\) denote the opening, connection and penalty costs of the optimal solution to the original instance. Let the opening, connection and penalty costs of the solution output of Algorithm 2 be F, C and h. Applying the primal-dual algorithm to the modified instance, we get a solution \(SOL_{0}\) with facility opening cost \(F'\), penalty cost \(h'\) and connection cost \(C'\), which corresponding to facility cost \(F_{0}=F'/ \delta \), penalty cost \(h_{0}=h'/ \delta \) and connection cost \(C_{0}=C'\), respectively. If the initial solution \(SOL_{0}\), which is the output solution of Algorithm 1, is viewed as a feasible solution of the original instance.

From Theorem 1 and scale the opening and penalty costs by a factor \(\delta \),

$$\begin{aligned} \begin{array}{rl} 3\delta (F_{0}+h_{0})+C_{0}&{}=3(F'+h')+C'\\ &{}\le 6[\delta (F_{OPT}+h_{OPT})+C_{OPT}]. \end{array} \end{aligned}$$

There are two possibilities.

Case 1. \(C_{0} \le F_{OPT}+h_{OPT}+C_{OPT}\).

$$\begin{aligned} \check{F}+\check{h}+\check{C}&\le F_{0}+h_{0}+C_{0}\\&=\frac{3\delta (F_{0}+h_{0})+C_{0}}{3\delta }+(1-\frac{1}{3\delta })C_{0}\\&\le (3-\frac{1}{3\delta })(F_{OPT}+h_{OPT})+(1+\frac{5}{3\delta })C_{OPT}. \end{aligned}$$

Case 2. \(C_{0} > F_{OPT}+h_{OPT}+C_{OPT}\).

$$\begin{aligned} C_{0} \le 6[\delta (F_{OPT}+h_{OPT})+C_{OPT}]-3\delta (F_{0}+h_{0}), \end{aligned}$$

Lemma 15 implies that the cost after greedy augmentation is at most

$$\begin{aligned}&F_{0}+h_{0}+(F_{OPT}+h_{OPT})\ln \displaystyle {\left( \frac{6\delta ( F_{OPT}+h_{OPT})+5C_{OPT}-3\delta (F_{0}+h_{0}) }{F_{OPT}+h_{OPT}}\right) }\\&+F_{OPT}+h_{OPT}+C_{OPT}. \end{aligned}$$

The derivation of \(F_{0}+h_{0}\) in the above indicates that when

$$\begin{aligned} F_{0}+h_{0}=F_{OPT}+h_{OPT}+ \frac{5}{3\delta }C_{OPT}, \end{aligned}$$

The polynomial achieves its maximal value.

$$\begin{aligned} \check{F}+\check{h}+\check{C}\le (2+\ln (3\delta ))(F_{OPT}+h_{OPT})+(1+\frac{5}{3\delta })C_{OPT}. \end{aligned}$$

From Case 1 and Case 2, we have

$$\begin{aligned} \check{F}+\check{h}+\check{C}&\le \max \{3-\frac{1}{3\delta },2+\ln (3\delta )\}(F_{OPT}+h_{OPT})+(1+\frac{5}{3\delta })C_{OPT}\\&\le (2+\ln (3\delta ))(F_{OPT}+h_{OPT})+(1+\frac{5}{3\delta })C_{OPT}\\&\le \max \{2+\ln (3\delta ), 1+\frac{5}{3\delta }\} OPT. \end{aligned}$$

When \(\delta =0.8571\), Algorithm 3 can achieve the best approximation ratio 2.9444. So the approximation factor for the Algorithm 3 is no more than 2.9444. \(\square \)

5 Conclusion

We consider k-FLPSP for any constant k in this paper and give an improved approximation algorithm. In Algorithm 2, when we use greedy augmentation for path, the facility in a path may appear in many paths, we define a new opening cost for each facility to avoid repeating the computation of opening costs, every facility cost is just computed once. How to analyse the opening cost properly, it is a question also exist in k-FLPSP and all extensions of k-FLPSP. k-FLPSP is an extension of k-LFLP, we don’t know whether our algorithm can be used in k-LFLP. The lower bound of k-LFLP is 1.61, there is a sharp gap between the current upper bound and lower bound.