1 Introduction and background

The decisions involved in supply chain management can be distinguished, based on the time horizon of impact, into strategic level decisions, such as facility location, tactical level decisions, such as inventory policy and operational level decisions, such as transportation and routing [11]. The impact of strategic level decisions spans over a greater period than tactical decisions, while the latter have a longer impact than operational decisions [21]. The integration of decisions across different levels has been increasingly studied in recent research streams, due to the significant benefits it can deliver for supply chains. Despite the fact that management at the strategic and tactical level concerns long-term and short-term decisions respectively, they are nonetheless interrelated and thus more information is exploited when integrating them.

Several authors have addressed integration in supply chains; Daskin et al. [1] integrate location and inventory decisions at distribution centers (DCs) and develop a mixed integer non-linear program (MINLP) which they solve with the help of Lagrangian relaxation (LR). The same model was studied by Shen et al. [19], who tackle it using the Branch-and-Price method. Risk pooling was accounted for, in the sense that a retailer can act as a distribution center. More thoroughly, risk pooling was discussed by Vidyarthi et al. [23]. The model addressed by Daskin et al. [1] and Shen et al. [19] was expanded by Ozsen et al. [15] to include capacity constraints for the single-sourcing case, in which there is a single plant that supplies all the retailers. Multiple sourcing for the capacitated problem was studied in later work of Diabat et al. [5], for which the authors develop an efficient Genetic Algorithm (GA) and report satisfactory results.

While the aforementioned works consider inventory only at the distribution centers (DCs), it is more practical in many cases to also consider inventory at the level of the retailers, although this leads to a more complicated formulation. This was done by Diabat et al. [9], where the authors simultaneously decide the location as well as inventory policy at both DCs and retailers. Once again, an LR based heuristic is developed to obtain results. In later work Diabat et al. [3], develop a set of propositions in order to improve the solutions obtained through the sub-problems, which were obtained through heuristics in their earlier work, while Diabat et al. [7] develop a GA to solve a capacitated location inventory problem. The integrated location inventory problem was also addressed by Diabat et al. [3], but this time for a closed-loop supply chain, meaning that the authors model both the forward and the reverse logistics network, the latter involving the collection of used products and their delivery to remanufacturing centers. Their model is implemented as a MINLP and solved with the help of an LR heuristic. Diabat [2] uses a hybrid genetics/simulated annealing algorithm to solve a vendor managed inventory (VMI) problem.

Shu [20] and Teo and Shu [22] consider location, distribution and inventory replenishment in a formulation solved with the help of a greedy heuristic and column generation, respectively. The heuristic is better able to tackle large sized problems than column generation. Another notable heuristic is developed by Jayaraman and Ross [12], who integrate strategic with operational level decisions and further consider cross-docking in the supply chain. Diabat and Richard [8] develop a nested LR approach for an integrated location and inventory model, formulated as a Mixed Integer Linear Program (MILP). Diabat and Theodorou [10] use the idea of reformulation and piecewise linearization in solving the uncapacitated joint location inventory problem that was developed by Diabat et al. [9]. Aside from the integration of strategic and tactical level decisions, or strategic and operational level decisions, the integration of tactical and operational level decisions has also been addressed and yields benefits for the supply chain. Le et al. [14] study the integrated inventory and routing problem for perishable products and implement a column-generation based solution approach. A tabu search heuristic was also developed by Diabat et al. [4] for the integrated inventory and routing problem with perishable products.

As mentioned, previous work of Diabat et al. [9] focuses on developing an LR approach for the multi-echelon joint location-inventory model that simultaneously decides the location of DCs as well as inventory policies at both DCs and retailers. The continuous relaxation of the formulation is non-convex and thus the problem cannot be solved to optimality using commercial software, even for small instances. This was the motivation for developing an LR approach; however, despite its contribution, the developed method posed certain drawbacks. These included the fact that the sub-problems were not solved to optimality. The authors observed that the value of the lower bound was sometimes greater than the upper bound, producing a negative gap. Furthermore, another disadvantage of the method is that any minor change in the objective function or in the constraints could result in the method not functioning and requiring further calibration. Therefore, the formulation does not provide flexibility in modifying or extending it to account for realistic considerations, such as capacity constraints. Later work of Diabat et al. [6] develops an improved LR based heuristic to solve the problem. Yet, the problem remains of developing an extended formulation that can be solved without the use of heuristics, but rather with the help of commercial software.

In light of the aforementioned points, the aim of the current work is to extend the formulation studied in Diabat et al. [6, 9] and Diabat and Theodorou [10] to account for capacity constraints and to develop an appropriate method to address the extended formulation. The LR approach developed in Diabat et al. [6, 9] is not capable of solving the extended problem with capacity constraints. Thus, the main contribution of this work is the transformation of the problem into one that is not only solvable using commercial software but also capable of finding a high quality feasible solution through commercial software, by the linearization of the non-linear terms. While LR thus far has contributed in producing a set of lower bounds for the problem at hand, the best of which is then converted into a feasible upper bound with the help of heuristics, there is no guarantee that even the optimal Lagrangian bound will be within a certain gap of the optimal solution. This is overcome in our new approach due to the fact that the generated solution is feasible, so there is no need for more heuristics to modify the lower bound and make it feasible.

The current paper is structured as follows: Sect. 2 presents the problem statement as well as the formulation of the problem, while Sect. 3 is dedicated to the computational analysis conducted and the discussion of the results. Finally, Sect. 4 summarizes the conclusions of the current work, while providing certain directions for future research.

2 Problem statement and formulation

The current section will begin by describing the multi-echelon joint inventory location problem, without capacity constraints and with the adoption of the power-of-two inventory policy. Subsequently, the capacity constraints will be modeled and described, before being added to create the extended formulation. Finally, the current section will conclude with the linearization of the non-linear terms encountered in the extended formulation.

2.1 Multi-echelon joint inventory-location model (MJIL)

The objective of the multi-echelon joint inventory-location problem is to simultaneously determine the number and location of distribution centers (DCs), the retailers assigned to each DC and the amount and delivery time of products to each facility (retailer or DC). The retailers, with deterministic demand and with the ability to hold inventory, are assigned to DCs from which they will be supplied with the product. On their part, DCs order a product at regular intervals from the manufacturer and they also hold inventory resulting from product that has been ordered without having been shipped to the retailer. The aim of the model is to minimize the incurred costs, which include the fixed costs of placing any order, the inventory and shipping costs, as well as the fixed costs of setting up a DC. For a complete description of the model parameters the reader is referred to Diabat et al. [9]. At this point, only certain parameters as well as the sets and decision variables of the problem are presented. These hold for all formulations provided beyond this point, while subsequent models will introduce certain new variables.

Sets

I::

set of retailers, indexed by \( i=1,{\ldots },\left| I \right| \)

J::

set of possible DC locations, indexed by \( j=1,{\ldots }, \left| J \right| \)

Parameters

\(f_j:\) :

fixed cost of establishing a DC at location j

\(\hat{k}_j , k_i , b_{ij} , c_{ij} , e_{ij}:\) :

parameters calculated for convenience, see [9]

M::

sufficiently large number

\(\varepsilon :\) :

sufficiently small number

Decision variables

\(T_{ij}:\) :

cycle-time (time between orders placed) of retailer i when served by DC j

\(\hat{T}_j:\) :

cycle-time of DC j

Binary decision variables

$$\begin{aligned} X_j&=\left\{ {\begin{array}{ll} 1&{} \text {if a DC is opened at location }j \\ 0&{} \mathrm{otherwise}\\ \end{array}} \right. \\ Y_{ij}&=\left\{ {\begin{array}{ll} 1 &{} \text {if a retailer }i\text { is served by DC at location }j\\ 0 &{} \mathrm{otherwise} \\ \end{array}} \right. \end{aligned}$$

The formulation of the original problem is presented as follows:

$$\begin{aligned} \mathop {\min }\limits _{X,Y,\hat{T},T} \sum \limits _{j\in J} \left( {f_j +\frac{\hat{k}_j }{\hat{T}_j }}\right) X_j +\sum \limits _{j\in J} \sum \limits _{i\in I} \left( {\frac{k_i }{T_{ij} }+b_{ij} +c_{ij} T_{ij} +e_{ij} max\left\{ {\hat{T}_j ,T_{ij} } \right\} }\right) Y_{ij} \end{aligned}$$
(1)

Subject to:

$$\begin{aligned}&\sum \limits _{j\in J} Y_{ij} =1 \quad \quad \quad \quad \forall i\in I\end{aligned}$$
(2)
$$\begin{aligned}&Y_{ij} \le X_j \quad \;\, \quad \quad \quad \quad \forall i\in I,j\in J\end{aligned}$$
(3)
$$\begin{aligned}&Y_{ij} , X_j \in \left\{ {0, 1} \right\} \quad \quad \forall i\in I,j\in J\end{aligned}$$
(4)
$$\begin{aligned}&\frac{\hat{T}_j }{T_{ij} }=2^{N_{ij} } \;\;\qquad \quad \quad \forall i\in I,j\in J\end{aligned}$$
(5)
$$\begin{aligned}&N_{ij} \in {\mathbb {Z}} \qquad \, \qquad \quad \quad \forall i\in I,j\in J\end{aligned}$$
(6)
$$\begin{aligned}&\hat{T}_j , T_{ij} \ge \varepsilon \;\, \qquad \quad \quad \forall i\in I, j\in J \end{aligned}$$
(7)

The objective function aims to minimize the sum of inventory, ordering, location and shipping costs. Constraints (2) ensure that each retailer is assigned to exactly one DC, while constraints (3) prohibit retailers to be assigned to DCs that are not opened. Constraints (4) ensure that the location and assignment variables are binary, while constraints (5)–(7) represent the power-of-two inventory policy adopted. Regarding the inventory, a power-of-two policy is adopted, according to which the ratio of the time elapsed between orders for the DC over the time between orders at the retailer is a power of two. It has been shown by Roundy [18] that this policy is 98 % effective at finding an optimal policy. Thus, adopting this simplifying policy can greatly reduce the complexity, without greatly sacrificing the accuracy of the solution.

2.2 MJIL extended with capacity constraints

A reformulation of the problem is presented at this point, with the addition of a set of capacity constraints and the elimination of the power-of-two inventory policy constraints. The resulting non-linear formulation will then be linearized through exact and piecewise linearization of the various non-linear terms. The extended model will be more practical, but at the same time more difficult to solve mainly due to the fact that it is not possible to know in advance which retailers will be assigned to each DC.

As mentioned, one of the contributions of the current work is the addition of capacity constraints to the formulation presented in Diabat et al. [6, 9] and Diabat and Theodorou [10]. When adding capacity constraints, two types of problems can be distinguished: the problem with single-sourcing, in which the entire demand of a retailer is satisfied by one DC, and that with multiple-sourcing, according to which each retailer can be served by more than one DC. In this paper we consider single sourcing. It is important to consider whether the cycle time of the DC is greater than the cycle times of all retailers assigned to it, or in other words, it is important to focus on the case where \(\hat{T}_j >T_{ij}\) for any retailer i that is assigned to DC j. The reason behind this consideration is that if the opposite holds, \(T_{ij} \ge \hat{T}_j\), it is implied that the DC will place an order every time retailer i places an order. Therefore, the DC would not carry any inventory to be shipped to retailer i. We introduce the following binary variable:

$$\begin{aligned} Z_{ij} =\left\{ {\begin{array}{ll} 1&{} \quad \mathrm{if}\quad T_{ij} \le \hat{T}_j \\ 0&{} \quad \mathrm{otherwise} \\ \end{array}} \right. \end{aligned}$$

The problem that arises now is that the capacity constraint for each DC will depend on which retailers are assigned to it. However, as we do not know the assignment in advance, the capacity constraint modeled in Eq. (14) requires multiplying the demand of all retailers by \(Y_{ij}\). Further, let us define \(q_j\), which denotes the maximum capacity of DC j. Then, the capacity constraints can be written as follows:

$$\begin{aligned}&\hat{T}_j \sum \limits _{i\in I} d_i Y_{ij} Z_{ij} \le q_j \;\quad \quad \quad \quad \forall j\epsilon J\end{aligned}$$
(8)
$$\begin{aligned}&MZ_{ij} \ge \hat{T}_j -T_{ij} \quad \quad \quad \quad \quad \quad \forall i\in I; j\in J\end{aligned}$$
(9)
$$\begin{aligned}&M( {Z_{ij} -1})\le \hat{T}_j -T_{ij} \!\quad \quad \quad \forall i\in I; j\in J \end{aligned}$$
(10)

Constraints (8) state that DC j is responsible for carrying inventory that is enough for all retailers assigned to it, with cycle times that are not greater than the DC cycle time. Constraints (9) state that if \(\hat{T}_j\) is larger than \(T_{ij}\) then \(Z_{ij}\) must be equal to 1, whereas if \(\hat{T}_j\) is not larger than \(T_{ij}\), then \(Z_{ij}\) can be either 0 or 1. In a similar manner, constraints (10) ensure that if \(\hat{T}_j\) is less than \(T_{ij}\), then \(Z_{ij}\) must be equal to 0. On the other hand, if \(\hat{T}_j\) is not less than \(T_{ij}\), then \(Z_{ij}\) can be either 0 or 1. At this point, it is worth noting the case where \(\hat{T}_j =T_{ij}\). In this case, constraints (9)–(10) are redundant. However, recall that for the case that \(Y_{ij} =1\) and \(\hat{T}_j =T_{ij}\), DC j still does not need to carry any inventory to be shipped to retailer i. Therefore, constraints (8) will still hold for this case. Another important note is that since the problem described by Eqs. (1)–(10) is solved without considering the power-of-two constraints, and then the optimal values of \(\hat{T}_j\) and \(T_{ij}\) are rounded to the nearest power-of-two values, the relation between \(\hat{T}_j\) and \(T_{ij}\) may change before and after the rounding. However, for any possible change in the relationship, the capacity restriction will still be respected. For example, if \(\hat{T}_j <T_{ij}\) changes to \(\hat{T}_j =T_{ij}\), then in both cases DC j will not need to carry any inventory to be shipped to retailer i. If \(\hat{T}_j >T_{ij}\) changes to \(\hat{T}_j =T_{ij}\), then the left hand side of constraints (8) would be larger than its actual value and therefore this constraint would still be satisfied. After dropping the power-of-two constraints, adding the capacity constraints and dropping the index j from \(T_{ij}\), the model can be stated as follows:

$$\begin{aligned} \min \sum \limits _{j\in J} \left( {f_j +\frac{\hat{k}_j }{\hat{T}_j }}\right) X_j + \sum \limits _{j\in J} \sum \limits _{i\in I} \left( {\frac{k_i }{T_i }+b_{ij} +c_{ij} T_i +e_{ij} max\left\{ {\hat{T}_j ,T_i } \right\} }\right) Y_{ij} \end{aligned}$$
(11)

s.t.

$$\begin{aligned}&\sum \limits _{j\in J} Y_{ij} =1 \quad \quad \quad \quad \,\,\quad \quad \quad \forall i\in I\end{aligned}$$
(12)
$$\begin{aligned}&Y_{ij} \le X_j \quad \quad \quad \quad \quad \quad \quad \quad \,\,\,\forall i\in I, j\in J\end{aligned}$$
(13)
$$\begin{aligned}&\hat{T}_j \sum \limits _{i\in I} d_{i} Y_{ij} Z_{ij} \le q_j \quad \quad \,\,\, \forall j\epsilon J\end{aligned}$$
(14)
$$\begin{aligned}&MZ_{ij} \ge \hat{T}_j -T_i \;\;\quad \quad \quad \quad \forall i\in I, j\in J\end{aligned}$$
(15)
$$\begin{aligned}&M( {Z_{ij} -1})\le \hat{T}_j -T_i \;\;\quad \forall i\in I, j\in J\end{aligned}$$
(16)
$$\begin{aligned}&Y_{ij} ,X_j , Z_{ij} \in \left\{ {0, 1} \right\} \;\;\quad \quad \forall i\in I, j\in J\end{aligned}$$
(17)
$$\begin{aligned}&\hat{T}_j , T_{i} \ge \varepsilon \quad \qquad \qquad \qquad \quad \forall i\in I, \forall j\in J \end{aligned}$$
(18)

It is possible to observe by comparing the formulation presented by (11)–(18) with the formulation by (1)–(7) that the additional constraints are (14)–(16) (capacity constraints), while constraints (5) and (6) (power-of-two inventory policy) have been eliminated. Here, constraints (14) ensure that the total demand of retailers assigned to a DC will not exceed its capacity. Constraints (15) and (16) determine the binary variable \(Z_{ij}\) based on the cycle times of retailers and DCs. This non-linearity is one of the reasons for which this new problem cannot be solved using the Lagrangian relaxation heuristic developed in Diabat et al. [6, 9]. The objective of this paper is to reformulate some of the non-linear terms and use piecewise linearization to approximate the remaining non-linear terms to convert the MINLP (11)–(18) to an MILP that is solvable using commercial solvers, such as CPLEX. We also test the quality of the new approach by comparing the objective function values of the obtained solutions to the objective function values of the solutions generated by solving the problem using a different heuristic that is explained in the following section of the computational analysis.

2.3 Piecewise linear based approximation

In this section we propose an MILP approximation for problem (11)–(18). The approximation is based on finding an exact linear formulation for some of the non-linear terms in the problem and linearizing the remaining non-linear terms using piecewise linear approximation.

2.3.1 An equivalent linear reformulation

We define the variables \(T_{ij}^{max}\), \(W_{ij}^{max}\), \(W_{ij}\) and \(\hat{W} _{ij}\) as follows:

$$\begin{aligned} \begin{array}{lll} &{} max\left\{ {\hat{T}_j ,T_i } \right\} =T_{ij}^{max} \\ &{} T_{ij}^{max} Y_{ij} =W_{ij}^{max}\\ &{}T_i Y_{ij} =W_{ij} \\ &{} Z_{ij} \hat{T}_j Y_{ij} =\hat{W} _{ij} \\ \end{array} \end{aligned}$$

Then the following problem is equivalent to problem (11)–(18)

$$\begin{aligned} \min \sum \limits _{j\in J} \left( {f_j +\frac{\hat{k}_j }{\hat{T}_j }}\right) X_j +\sum \limits _{j\in J} \sum \limits _{i\in I} \left( {\frac{k_i }{T_i }Y_{ij} +b_{ij} Y_{ij} +c_{ij} W_{ij} +e_{ij} W_{ij}^{max} }\right) \end{aligned}$$
(19)

s.t.

$$\begin{aligned}&\sum \limits _{j\in J} Y_{ij} =1\, \;\qquad \quad \quad \quad \quad \quad \quad \quad \quad \forall i\in I\end{aligned}$$
(20)
$$\begin{aligned}&Y_{ij} \le X_j \,\;\qquad \quad \quad \quad \quad \quad \quad \quad \quad \quad \forall i\in I, j\in J\end{aligned}$$
(21)
$$\begin{aligned}&\sum \limits _{i\in I} d_{i} \hat{W} _{ij} \le q_j \;\qquad \quad \quad \quad \quad \quad \quad \forall j\in J\end{aligned}$$
(22)
$$\begin{aligned}&M Z_{ij} \ge \hat{T}_j -T_i \;\quad \; \qquad \quad \quad \quad \quad \forall i\in I, j\in J \end{aligned}$$
(23)
$$\begin{aligned}&M ( {Z_{ij} -1})\le \hat{T}_j -T_i \,\;\qquad \quad \quad \forall i\in I, j\in J \end{aligned}$$
(24)
$$\begin{aligned}&T_{ij}^{max} \ge \hat{T}_j \!\,\;\;\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \forall i\in I, j\in J \end{aligned}$$
(25)
$$\begin{aligned}&T_{ij}^{max} \ge T_i \;\quad \;\, \quad \quad \quad \quad \quad \quad \quad \quad \quad \forall i\in I, j\in J \end{aligned}$$
(26)
$$\begin{aligned}&T_{ij}^{max} \le W_{ij}^{max} +M( {1-Y_{ij} })\;\quad \forall i\in I, j\in J \end{aligned}$$
(27)
$$\begin{aligned}&T_i \le W_{ij} + M( {1-Y_{ij} })\;\!\quad \quad \quad \quad \forall i\in I, j\in J \end{aligned}$$
(28)
$$\begin{aligned}&\hat{T}_j \le \hat{W} _{ij} + M( {2-Y_{ij} -Z_{ij} })\; \!\!\!\quad \forall i\in I, j\in J \end{aligned}$$
(29)
$$\begin{aligned}&Y_{ij} , X_j \in \left\{ {0, 1} \right\} \; \quad {\begin{array}{ll} \quad \quad \quad \qquad \quad \!\!\quad {\forall i\in I,} {j\in J} \\ \end{array} } \end{aligned}$$
(30)
$$\begin{aligned}&T_{ij}^{max} , W_{ij}^{max} , W_{ij} , \hat{W} _{ij} \ge 0\, {\begin{array}{ll} \quad \quad {\forall i\in I,} {j\in J} \\ \end{array} } \end{aligned}$$
(31)
$$\begin{aligned}&T_i , \hat{T}_j \ge \varepsilon \quad {\begin{array}{ll} \quad \quad \qquad \qquad \qquad \qquad {\forall i\in I,} {j\in J} \\ \end{array} } \end{aligned}$$
(32)

Note how this formulation contains additional constraints, namely (26)–(29), as part of the linearization, while the objective function and constraints (22) have been adapted based on the newly introduced variables. Regarding the new constraints, (25) and (26) ensure that the new variable \(T_{ij}^{max}\) will be equal to the greatest among \(T_i \) and \(\hat{T}_j\), thus replacing \(max\left\{ {\hat{T}_j ,T_i } \right\} \) from the objective function. Constraints (27)–(29) help define the new variables \(W^{max}_{ij} W_{ij}\) and \(\hat{W} _{ij}\) which replace non-linear terms in the objective function and in constraints (22).

2.3.2 Piecewise linearization

The only non-linear terms remaining in problem (19)–(31) are \(\dfrac{\hat{k}_j }{\hat{T}_j } X_j\) and \(\dfrac{k_i }{T_i } Y_{ij}\). To linearize these two terms, we use piecewise linear functions to approximate \(\dfrac{1}{\hat{T}_j }\) and \(\dfrac{1}{T_i }.\) Since \(\dfrac{1}{\hat{T}_j }\) and \(\dfrac{1}{T_i }\) are convex functions, they can be linearized by allowing the binary variables that are used in the linearization to be continuous. This is very important as this linearization does not result in a higher number of binary variables.

To linearize \(\dfrac{1}{\hat{T}_j }\), assume that we divide the range of \(\hat{T}_j\) into \(n=1,\ldots , N\) linear pieces using equidistant placement of the breakpoints. This will lead to \(N+1\) break points \(\hat{t} _j ^1, \hat{t} _j ^2, \ldots , \hat{t} _j ^{(N+1)}\). Then,\( \dfrac{1}{\hat{T}_j }\cong \dfrac{1}{\hat{t} _j ^n}\hat{U} _j ^n+\dfrac{1}{\hat{t} _j ^{n+1} }(1-\hat{U} _j ^n)=\dfrac{1}{\hat{t} _j ^{n+1} }+(\dfrac{1}{\hat{t} _j ^n}-\dfrac{1}{\hat{t} _j ^{n+1}})\hat{U} _j ^n\), where \(\hat{U} _j ^n\in \left[ {0,1} \right] .\) Let \(\hat{V} _j ^n=\hat{U} _j ^nX_j\), then, \(\dfrac{1}{\hat{T}_j }X_j \cong \dfrac{1}{\hat{t} _j ^{n+1}}X_j +(\dfrac{1}{\hat{t} _j ^n}-\dfrac{1}{\hat{t} _j ^{n+1}})\hat{V} _j ^n\). Similarly, \(\dfrac{1}{T_i }\cong \dfrac{1}{t_i ^{n+1}}+(\dfrac{1}{t_i ^n}-\dfrac{1}{t_i ^{n+1}})U_i ^n\). Now, by letting \(V_{ij} ^n=U_i ^nY_{ij}\) we get \(\dfrac{1}{T_i }Y_{ij} \cong \dfrac{1}{t_i ^{n+1}}Y_{ij} +(\dfrac{1}{t_i ^n}-\dfrac{1}{t_i ^{n+1}})V_{ij} ^n\)

Therefore, problem (19)–(31) can be approximated as follows

$$\begin{aligned}&\min \sum \limits _{j\in J} f_j X_j + \sum \limits _{j\in J} \sum \limits _{n\in N} \frac{\hat{k}_j }{\hat{t} _j ^{n+1}}X_j + \sum \limits _{j\in J} \sum \limits _{n\in N} \hat{k}_j \left( \frac{1}{\hat{t} _j ^n}-\frac{1}{\hat{t} _j ^{n+1}}\right) \hat{V} _j ^n+ \sum \limits _{j\in J} \sum \limits _{i\in I} \sum \limits _{n\in N} \frac{k_i }{t_i ^{n+1}}Y_{ij} \nonumber \\&\quad + \sum \limits _{j\in J} \sum \limits _{i\in I} \sum \limits _{n\in N} k_i \left( \frac{1}{t_i ^n}-\frac{1}{t_i ^{n+1}}\right) V_{ij} ^n+ \sum \limits _{j\in J} \sum \limits _{i\in I} \left( {b_{ij} Y_{ij} +c_{ij} W_{ij} +e_{ij} W_{ij}^{max} }\right) \end{aligned}$$
(33)

s.t.

$$\begin{aligned}&\sum \limits _{j\in J} Y_{ij} =1\; \quad \qquad \qquad \qquad \qquad \quad \forall i\in I\end{aligned}$$
(34)
$$\begin{aligned}&Y_{ij} \le X_j \;\quad \,\qquad \qquad \qquad \qquad \qquad \forall i\in I,j\in J\end{aligned}$$
(35)
$$\begin{aligned}&\sum \limits _{i\in I} d_{i} \hat{W} _{ij} \le q_j \;\quad \qquad \qquad \qquad \quad \forall j\epsilon J\end{aligned}$$
(36)
$$\begin{aligned}&MZ_{ij} \ge \hat{T}_j -T_i \;\quad \qquad \quad \qquad \,\,\,\,\,\,\forall i\in I, j\in J\end{aligned}$$
(37)
$$\begin{aligned}&M( {Z_{ij} -1})\le \hat{T}_j -T_i\quad \qquad \quad \,\,\forall i\in I,j\in J\end{aligned}$$
(38)
$$\begin{aligned}&T_{ij}^{max} \ge \hat{T}_j \qquad \qquad \qquad \qquad \qquad \!\;\;\forall i\in I,j\in J\end{aligned}$$
(39)
$$\begin{aligned}&T_{ij}^{max} \ge T_i \qquad \qquad \qquad \qquad \qquad \;\;\forall i\in I,j\in J\end{aligned}$$
(40)
$$\begin{aligned}&T_{ij}^{max} \le W_{ij}^{max} +M( {1-Y_{ij} })\;\!\quad \forall i\in I,j\in J\end{aligned}$$
(41)
$$\begin{aligned}&T_i \le W_{ij} + M( {1-Y_{ij} })\;\quad \qquad \quad \!\! \forall i\in I,j\in J\end{aligned}$$
(42)
$$\begin{aligned}&\hat{T}_j \le \hat{W} _{ij} + M( {2-Y_{ij} -Z_{ij} })\!\!\quad \forall i\in I,j\in J\end{aligned}$$
(43)
$$\begin{aligned}&\sum \limits _{n\in N} \hat{U} _j ^n=1\;\quad \qquad \,\!\!\qquad \qquad \qquad \quad \forall j\in J\end{aligned}$$
(44)
$$\begin{aligned}&\sum \limits _{n\in N} U_{ij} ^n=1\;\quad \!\!\qquad \qquad \qquad \qquad \quad \forall i\in I,j\in J\end{aligned}$$
(45)
$$\begin{aligned}&X_j -1\le \hat{V} _j ^n-\hat{U} _j ^n\;\quad \!\!\qquad \quad \quad \quad \forall n\in N\quad \forall j\in J\end{aligned}$$
(46)
$$\begin{aligned}&Y_{ij} -1\le \hat{V} _{ij} ^n-\hat{U} _{ij} ^n\;\quad \;\,\,\,\,\,\qquad \quad \forall n \in N,j\in J,i\in I\end{aligned}$$
(47)
$$\begin{aligned}&{\begin{array}{ll} {\hat{V} _j ^n,\hat{U} _j ^n,\hat{V} _{ij} ^n,} \\ {\hat{U} _{ij} ^n\in [0, 1]} \\ \end{array} } \qquad \qquad \qquad \qquad \quad \! \,\,\forall i\in I,j\in J\end{aligned}$$
(48)
$$\begin{aligned}&X_j ,Y_{ij} ,Z_{ij} \in \left\{ {0, 1} \right\} \;\qquad \qquad \quad \forall i\in I,j\in J\end{aligned}$$
(49)
$$\begin{aligned}&{\begin{array}{ll} {\hat{T}_j ,T_i , T_{ij}^{max} , W_{ij}^{max} ,} \\ { W_{ij} , \hat{W} _{ij} \ge 0} \\ \end{array} }\,\qquad \qquad \quad \forall i\in I,j\in J \end{aligned}$$
(50)

The additional constraints in this formulation are constraints (44)–(47), which represent the piecewise linearization for the terms found in the previous objective function, \(\dfrac{\hat{k}_j }{\hat{T}_j } X_j\) and \(\dfrac{k_i }{T_i } Y_{ij}\). Aside from these, the formulation is identical to that described by (19)–(32).

3 Computational analysis

In this section we present the results obtained from the computational analysis. The data sets used for the current analysis were found in Diabat et al. [6, 9] and they include both small and large size instances of the problem. For small sizes of the problem, i.e. (\(\left| I \right| \times \left| J \right| )=(6\times 3, 8\times 4, 10\times 5)\), we are able to obtain an optimal solution with the help of the commercial software GAMS (General Algebraic Modeling System), using the BARON solver. Despite the high computational time required for this solver, and its inability to solve certain instances to optimality, nevertheless we compare its results with those obtained using the piecewise linearization, which constitute an upper bound to the objective value, in order to assess the quality of the bound and evaluate the proposed approach.

Table 1 Comparative results for instance (\(6\times 3\))
Table 2 Comparative results for instance (\(8\times 4)\)
Table 3 Comparative results for instance (\(10\times 5\))

We conduct our experiments on a total of 48 instances, 12 different instances for each of these sizes, namely (6 \(\times \) 3), (8 \(\times \) 4), (10 \(\times \) 5) and (15 \(\times \) 10). The results are summarized in Tables 1, 2, 3, 4 and they include the number of retailers and DCs, as well as the value obtained from each method and the time required to reach a solution, reported in both seconds and hours. Finally, the time and value ratio are given in order to compare the two solution approaches. From all tables we can observe the very high computational time for BARON, while Table 4 shows that the solver reaches its limit as GAMS runs out of memory and optimality is not achieved.

Table 4 Comparative results for instance (\(15\times 10)\)

Results indicate that the linearization achieves very good results, with solutions being on overall average for all instances within 1 % of the solution obtained from the commercial solver, or 1.02 % if we exclude the instances of Table 4, for which BARON solver achieved worse results than the linearization. In addition, the gap remains fairly stable even as the instance size increases: from Table 1, the average solution gap is 1.0226, while from Tables 2 and 3 it is 1.0268 and 1.0247, respectively. These results were obtained within a fraction of the time, on average 0.01 %. The small gap demonstrates the high solution quality obtained through the linearization, while its significantly reduced time further enhances its attractiveness.

For large instances a different heuristic is developed, which relies on solving the two problems sequentially. Essentially, the new heuristic approach is based on solving the facility location problem and the inventory management problem separately. The first step is to solve the following uncapacitated facility location problem (UFLP).

$$\begin{aligned} \min \sum \limits _{j\in J} f_j X_j +\sum \limits _{j\in J} \sum \limits _{i\in I} b_{ij} Y_{ij} \end{aligned}$$

s.t.

$$\begin{aligned} \begin{array}{ll} \sum \limits _{j\in J} Y_{ij} =1 &{}\quad \forall i\in I\\ Y_{ij} \le X_j \quad \forall i\in I; &{}\quad \forall j\in J\\ X_j ,Y_{ij} \in \left\{ {0, 1} \right\} &{} \quad \forall i\in I;\quad \forall j\in J \end{array} \end{aligned}$$

From solving this problem, we obtain the optimal values corresponding to the set of open DCs and the set of assignments, \(X_j^*\) and \(Y_{ij}^*\), respectively. Then, we define \(J_o \)to be the set of open DCs, i.e. \(J_o =\left\{ {j{\setminus } X_j^*=1} \right\} \). Also, let \(I_j\) be the set of retailers assigned to DC \(j\in J_o \), i.e. \(I_j =\left\{ {i{\setminus } Y_{ij}^*=1} \right\} , \forall j\in J_o .\) Having defined these subsets, we can now decompose the problem into multiple one-DC multi-retailer problems. For every \(j\in J_o \), we have a one-DC multi-retailer inventory problem, as follows, which we solve using the algorithm by Roundy [18].

$$\begin{aligned} \min \frac{\hat{k}_j }{\hat{T}_j }+\sum \limits _{i\in I_j } \left( \frac{k_i }{T_i }+c_{ij} T_i +e_{ij} max\left\{ {\hat{T}_j ,T_i } \right\} \right) \end{aligned}$$

This decomposition allows for easier solving of the problem, since we know the assignment of retailers to DCs. Now, we solve for \(\hat{T}_j ^*\) and \(T_i ^*\). Then, for every \(j\in J_o \) and for every \(i\in I_j\) if \(\hat{T}_j ^*>T_i ^*\) we set \(Z_{ij} ^*=1\), otherwise we set \(Z_{ij} ^*=0\). At this point we need to check whether the capacity constraints (14) are satisfied. Given that we have solved the inventory problem without considering the capacity constraints, they will most likely be violated. Let \(J_V\) be the set of open DCs for which the capacity constraint is violated, i.e. \(J_V =\{ {j\in J_o {\setminus } \hat{T}_j ^*\sum _{i\in I_j } d_{ij} Y_{ij} ^*Z_{ij} ^*>q_j } \}\). For these DCs, the demand of retailers assigned with cycle time less than that of the DC exceeds the capacity. Thus, it is important to adjust the cycle time of some of the retailers, in order to make it equal to the cycle time of the DC so that the latter does not carry inventory for that retailer. Having detected the retailers for which the cycle time should be adjusted, we solve the knapsack problem for every DC in the violated set, or for every \(j\in J_V\):

Table 5 Comparative results for instance (\(20\times 15\))
Table 6 Comparative results for instance (\(25\times 15\))
$$\begin{aligned} \max \sum \limits _{i\in I_j } Z_{ij} ^V \end{aligned}$$

s.t.

$$\begin{aligned} \begin{array}{ll} \sum \limits _{i\in I_j } d_{i} Z_{ij} ^V\le \dfrac{q_j }{\hat{T}_j ^*}&{} \quad \forall j\\ Z_{ij} ^V\in \left\{ {0,1} \right\} &{} \quad \forall i, j \end{array} \end{aligned}$$
Table 7 Comparative results for instance (\(30\times 15\))
Table 8 Comparative results for instance (\(50\times 20\))

Now, for any \(j\in J_V\) and \(i\in I_j\) with \(Z_{ij} ^V=0\), we change \(Z_{ij} ^*\) from 1 to 0, and \(T_i ^*\) from its old value to \(\hat{T}_j ^*\). The objective is to maintain the values of \(Z_{ij} ^*\) wherever possible; however, some will have to become equal to 0, only enough to make the solution feasible. The aim of this problem is to decide which will become 0 and which will remain 1. Thus, this problem will dictate which retailers are best removed from the capacity constraint, which means we would have to adjust their cycle times. Next, we calculate the objective function value of the sequential heuristics approach as follows:

$$\begin{aligned} SeqHeu=\sum \limits _{j\in J_o } \left( {f_j +\frac{\hat{k}_j }{\hat{T}_j ^*}}\right) +\sum \limits _{j\in J_o } \mathop \sum \limits _{i\in I_j } \left( {\frac{k_i }{T_i ^*}+b_{ij} +c_{ij} T_i ^*+e_{ij} max\left\{ {\hat{T}_j ^*,T_i ^*} \right\} }\right) \end{aligned}$$

For the significantly larger instances, where (\(\text{ retailers }\times \text{ DCs })\) are (20 \(\times \) 15), (25 \(\times \) 15), (30 \(\times \) 15) and (50 \(\times \) 20) we compare the piecewise linear heuristics and the sequential heuristic in terms of the objective function value obtained, as well as the solution time. For each of these sizes, we run 12 different instances, leading to a total of 48 instances, and the results are shown in Tables 5, 6, 7, 8.

From these tables it is possible to observe that the longest time used by the piecewise linearization based heuristic was 8.5 h for solving the instance of size (50 \(\times \) 20), while for the sequential based heuristic this time was 6.8 h, reported in Table 8. While the linearization requires a slightly higher computational time, on total average 3.3 h compared to 2.9 h for the sequential heuristic, its solutions are better by an average of 50 %, weighing the trade-off between time and quality to its advantage.

4 Conclusion

The current work addresses the capacitated facility location inventory problem. The addition of capacity constraints leads to a model of higher practical value. However, it poses an additional challenge in terms of solving the problem, due to the non-linear terms that occur. In light of this, we aim to reformulate some of the terms to make them linear and implement piecewise linearization for the rest, in order to render the problem solvable with the help of commercial software. For large problems, we develop a sequential heuristic approach. We solve both small and large instances of the problem to evaluate the linearization and the sequential heuristic. Our results demonstrate the overall superiority of the linearization in terms of time and solution quality.

The promising results obtained by using the linearization demonstrate the need for further investigating such methods in future research. For example, future work could focus on implementing for the same problem a strategy for optimally placing breakpoints for the linearization and selecting the placement which minimizes either the maximal deviation of the piecewise linear function from the original one or the area between the functions. This can be found in the works of [13, 16, 17]. The current work could also be extended by adding multiple products or various transportation options to the supply chain.