1 Introduction

Design of logistics networks is one of the main planning processes in supply chain management (Melo et al. 2009). Design decisions such as location, number, and capacity of supply chain’s facilities have long-term effects on the performance of a company in terms of costs and customer service level. Logistics networks can be categorized into three main groups including: forward logistics networks (FLNs), reverse logistics networks (RLNs), and finally, integrated forward/reverse logistics networks (IFRLNs).

During the last decade, due to the stringent pressures from environmental regulations and economic beneficial from remanufacturing and recovery activities, many companies such as Dell, Kodak, GM, and Xerox have emphasized on designing RLNs (see Easwaran and Üster 2010; Fleischmann et al. 2000). The RLNs are often designed for the purpose of collecting used, refurbished, or defective products from customers and then carrying out some recovery activities. Final destination of collected products in these networks is often recovery, remanufacturing, disposal centers, or secondary markets. Reverse and forward logistics networks have usually influence to each other and many resources, such as transport and warehouse capacity can be share between them (see Fleischmann et al. 2001; Lee and Dong 2008). Therefore, the design of IFRLNs has gained significant attention in both practice and academia during the recent years that is also considered in this paper.

In reverse networks, acquisition or collection of used products is introduced as a main part of product recovery management (Guide et al. 2003). Several companies offer financial incentives to holders of used products in order to influence quantity and quality of returns. The amount of financial incentive paid by a company to collect one unit of a used product is called unit acquisition price (Aras et al. 2008). Acquisition prices are critical decisions in product recovery management playing two main roles: they determine the cost for buying per unit of each used product, and through a relationship between the acquisition price and return amount of each used product, they can change the required network’s facilities and their capacities for performing recovery activities. Figure 1 illustrates the relation between strategic design decisions and acquisition price decisions in recovery or reverse networks.

Fig. 1
figure 1

The relationship between strategic design decisions and acquisition price decisions in reverse logistics networks

Figure 1 illustrates the importance of integrating design and acquisition price decisions in RLNs. Additionally, as mentioned by Guide (2000), market-driven product-acquisition management approaches have been applied by many remanufacturing firms. In 1997, Xerox Europe implemented an end-of-life equipment take-back and reprocessing program and obtained more than $80 million savings (Maslennikova and Foley 2000). However, in the related literature, a few studies including (Aras et al. 2010, 2008), and (Keyvanshokooh et al. 2013) have addressed location and acquisition price decisions simultaneously in an integrated optimization problem. In Aras et al. (2010, 2008), location decisions are made for collection centers in a considerably simplified RLN. Keyvanshokooh et al. (2013) proposed a deterministic MILP model to design an IFRLN and they solved the problem using a commercial solver for small-sized test problems. In this paper, for the first time, we consider acquisition price decisions in a comprehensive model for design and planning of an IFRLN under stochastic demands and potential returns of customer zones. In addition, we propose an efficient solution approach for solving the problem.

In logistics network design, strategic decisions should be viable to function well under complex and stochastic business environments for many years. Stochastic IFRLN design have been addressed by several research studies (See Listeş 2007; Lee et al. 2010; Pishvaee et al. 2009; Lee and Dong 2009; De Rosa et al. 2013) and most of them created stochastic optimization models using two-stage stochastic programming. In spite of many advantages of using two-stage stochastic programs, there exist two main difficulties with this methodology. Firstly, creating scenarios and obtaining their associated probabilities could be a problematic and cumbersome task. Secondly, an adequate number of scenarios could lead to a large-scale optimization problem. In the area of IFRLN design under stochastic parameters, a few studies have simultaneously addressed these difficulties. In this paper, to overcome the first issue, LHS is applied to generate a fan of scenarios and backward scenario reduction technique is used to reduce the number of scenarios. To overcome the second one, a novel simulation-based simulated annealing (SA) algorithm is proposed as a solution approach. Additionally, to capture the time variable probabilistic behavior of stochastic parameters, a planning horizon with multiple tactical periods is taken into account and a dynamic pricing approach for collecting used products is developed.

The remainder of this paper is organized as follows: in Sect. 2, a literature review in the related area is represented. Section 3 defines the problem’s characteristics. The mathematical model is developed in Sect. 4. Simulation-based SA is proposed in Sect. 5 to solve the stochastic problem, and the scenario generation technique is described in Sect. 6. In Sect. 7, computational results are presented. Finally, Sect. 8 concludes the paper and suggests future research directions.

2 Literature review

Based on the presented surveys in the area of supply chain network design and closed loop supply chains(See, Akçalı et al. 2009; Melo et al. 2009; Govindan et al. 2015), there are few works that have dealt with IFRLN design under uncertainty by using scenario-based stochastic programs. In this section, the research studies related to this area are reviewed.

Sheppard (1974) was one of the first research studies that used a scenario approach for a FL problem and then, this approach has been used for designing logistics networks. In the presented literature review, the studies are classified based on the supply chain structure, decisions, and objective. In addition, a discussion about optimization issues related to these works is presented.

In IFRLN design, the main characteristics of logistics networks are the types of echelons and the echelons in which location decisions should be determined. In general, the echelons related to the forward direction consist of suppliers, plants, warehouses, and distribution centers, whereas the echelons corresponding to the reverse direction include collection centers, disposal centers, and recovery/remanufacturing centers (Govindan and Fattahi 2015). Nowadays, both strategic and tactical decisions should be determined simultaneously in designing logistics networks to achieve an integrated and efficient system (Goetschalckx et al. 2002). In the area of IFRLN design, capacity and supplier selection as strategic planning decisions and transportation and inventory decisions as tactical ones have been included in the problem in addition to classical location-allocation decisions. A few studies have considered stochastic parameters in the problem with a multi-period planning horizon. Lee and Dong (2009), Cardoso et al. (2013), and De Rosa et al. (2013) considered multiple strategic periods to deal with dynamic design of IFRLNs. However, in order to capture time variability of problem’s parameters, (Zeballos et al. 2014) such as our case addressed a stochastic multi-period IFRLN design in which periods were tactical planning intervals.

Most of the studies in this area proposed mixed-integer linear programming models (MILP) using two-stage stochastic programming (see e.g. Lee and Dong 2009; De Rosa et al. 2013; Zeballos et al. 2014). Furthermore, the presented optimization models have two common objective functions: cost minimization and profit maximization, and a few studies such as (Ramezani et al. 2013a) examined the problem with multiple objectives. In this area, a sufficient number of scenarios for scenario-based stochastic programs can lead to an extremely large-scale optimization problem. Therefore, it is essential to propose a solution approach for solving the stochastic programs. Unfortunately, the majority of studies did not attempt to deal with this issue and used commercial solvers to solve their mathematical models. Listeş (2007) and Khatami et al. (2015) have used Benders’ decomposition to solve problems with simple IFRLN under stochastic demand and return.

Heuristic and meta-heuristic algorithms can also be developed for solving logistics network design problems (Melo et al. 2009), and in this area, (Lee and Dong 2009) developed a solution approach by integrating sample average approximation (SAA) with a heuristic algorithm. However, these approaches for stochastic RLN or IFRLN design are very scarce. SA as a local-search based meta-heuristic is presented by Kirkpatrick (1984). The local-search based meta-heuristics unlike constructive and population-based algorithms work with only one individual (see Glover and Kochenberger 2003; Blum and Roli 2003). In deterministic decision-making environments, the SA is applied for solving logistics network design problems by Pishvaee et al. (2010b), Fattahi et al. (2015a), and Hsu and Li (2011) and in this paper, a novel simulation-based SA algorithm is developed to solve the stochastic IFRLN design problem. In Table 1, studies related to IFRLN design with stochastic parameters are classified in terms of above-mentioned features.

Table 1 Review of studies related to stochastic IFRLN design

From the presented literature review and Table 1, we can conclude that stochastic models for planning and designing IFRLNs are still scarce in the related literature. Moreover, a few models can capture stochasticity of parameters over a planning horizon with multiple tactical intervals. Additionally, proposing efficient solution approaches will be required in this area. To the best of our knowledge, no work has integrated the acquisition price decisions for collecting used products and location decisions in a stochastic model. Therefore, it is crucial to conduct this study to address the mentioned gaps.

3 Problem description

The IFRLN discussed in this paper is a multi-stage and multi-product logistics network. In the forward network, suppliers, production/recovery centers, and distribution centers are considered whereas the reverse network includes collection centers, disposal centers and production/recovery centers. In the integrated network, hybrid facilities are considered in which both production and recovery facilities are established at the same location. As mentioned by Lee and Dong (2009), hybrid facilities may produce significant cost savings in comparison with separate production or recovery centers. Several studies (See e.g. Easwaran and Üster 2010; Pishvaee et al. 2010a) considered logistics networks with hybrid production and recovery centers. Furthermore, (Lee et al. 2010) and (Salema et al. 2010) considered hybrid facilities for real-life case studies related to an international electronic company and a glass closed loop supply chain, respectively. The structure of the IFRLN is illustrated in Fig. 2.

Fig. 2
figure 2

The IFRLN structure

As shown in Fig. 2, in the forward network, the required raw materials are shipped from suppliers to production/recovery centers. New products are produced in production/recovery centers with consideration of their pre-specified bill of materials and then, the products are delivered to customer zones via distribution centers. Here, customer zones’ demands for multiple products are assumed stochastic in multiple tactical periods.

In the reverse direction, the return amount of used products from each customer zone is dependent on the financial incentive offered by the company. In accordance with quality levels of the used products, they are classified into different types and thereupon, various acquisition prices can be offered for each used product with different qualities. Furthermore, the potential amounts of used products with different quality levels are considered stochastic in multiple periods. Collection centers collect the returned products from customer zones and after inspection, divide them into scrapped and recoverable products. The scrapped products are shipped to disposal centers for recycling or proper disposal. The recoverable products are sent to production/recovery centers. It is worth noting that the recoverable products after recovery activities (remanufacturing or de-manufacturing) are forwarded to distribution centers as new ones.

The transformation of used products into usable products again may take different forms consist of recycling, repair, de-manufacturing and remanufacturing (Fleischmann et al. 2000). In this paper, such as some real-life recovery systems, recovered products after remanufacturing or de-manufacturing are the same as new produced ones, and several studies (see e.g. Pishvaee et al. 2010a; Listeş 2007; Salema et al. 2010; Cardoso et al. 2013) proposed their models in accordance with this assumption. Furthermore, examples of real-life case studies for such a recovery system are cleaning of polluted sands (Barros et al. 1998), recovering lead/acid used batteries (Subulan et al. 2014) and (Kannan et al. 2010), remanufacturing of electronic equipment (Listeş 2007). Figure. 3 illustrates the process of IFRLN in this paper.

Fig. 3
figure 3

The processes of IFRLN

The other underlying main assumptions of the model are as follows:

  • The locations of customer zones, suppliers, production/recovery centers, and disposal centers are predetermined and fixed.

  • The strategic design decisions consist of the locations of distribution centers, collection centers, and hybrid distribution-collection facilities. Furthermore, these design decisions are under an available budget limitation.

  • The tactical decisions consist of the amount of production, transportation flows through the network, inventory levels of products at distribution centers, and acquisition prices for used products.

  • All types of facilities are considered capacitated.

  • The objective is to maximize the net income of the IFRLN over a multiple tactical periods.

  • It is not mandatory to satisfy all customer zones’ demands. In order to satisfy the demands, the products can only be forwarded to them from the distribution centers.

  • The expected price of customer zones for returning one unit of each used product, customer zones’ willingness to return used products, is described by uniform distribution.

  • Regarding quality levels of returned products, their remanufacturing costs are different.

3.1 Dynamic pricing for collection of used products

In this paper, return amounts of used products from customer zones are dependent to the corresponding acquisition prices that are offered by the company. In addition, the used products have different quality levels in accordance with their usage rate and duration. Therefore, a set of discrete quality levels \(Q, q\in Q\), is considered for each used product \(m\in M\), and the used products with quality level 1 and \(\left| Q \right| \) have the highest and lowest quality, respectively.

Each customer having a used product with quality level \(q\in Q\) has an expected price for returning it. Moreover, because of diversity of customers in each customer zone, the expected price of customer zone \(i\in C\) for returning one unit of used product \(m\in M\) with quality level \(q\in Q\) is described by uniform distribution \(\left[ {A_{i,m,q},B_{i,m,q} } \right] \) (see Keyvanshokooh et al. 2013). For each used product with lower quality, the values of \(A_{i,m,q}\) and \(B_{i,m,q}\) are less. In Fig. 4, the relations between the offered acquisition prices and proportion returns of used product \(m\in M\) with quality levels \(q,{q}^{\prime }\in Q, \left( {{q}^{\prime }>q} \right) \), from customer zone \(i\in C\) are illustrated.

Fig. 4
figure 4

The relation between acquisition price value and proportion return of used product m with quality levels q and \(q^{\prime }\) form customer zone \(i (q^{\prime } > q)\)

Let \({\textit{APR}}_{i,m,q}^t\) and \({\textit{PRE}}_{i,m,q}^t\) be the offered acquisition price to customer zone \(i\in C\) and proportion return from customer zone \(i\in C\) for used product \(m\in M\) with quality level \(q\in Q\) in tactical period \(t\in T\), respectively and \(A_{i,m,q} \le {\textit{APR}}_{i,m,q}^t \le B_{i,m,q} \), then,

$$\begin{aligned} {\textit{PRE}}_{i,m,q}^t =\frac{{\textit{APR}}_{i,m,q}^t -A_{i,m,q} }{B_{i,m,q} -A_{i,m,q} }, \qquad \forall i\in C,\,\forall m\in M,\, \forall q\in Q,\, \forall t\in T. \end{aligned}$$
(1)

The uniform distribution for reservation incentives of customer zones makes the mathematical model nonlinear. Furthermore, it is desirable to restrict the acquisition prices of the used products to a finite set in real applications. Therefore, \(L=\left\{ {1,2,\ldots ,l,\ldots ,NL} \right\} \) is considered as a set of acquisition price levels for buying used products and we discretize acquisition prices into NL disjoint levels from \(A_{i,m,q}\) to \(B_{i,m,q}\).

Let \(\theta _{i, m, q, l}^t\) be binary decision variables that are equal to one if the price level \(l\in L\) is chosen for buying per unit used product \(m\in M\) with quality level \(q\in Q\) from customer zone \(i\in C\), in tactical period \(t\in T\). Then, \({\textit{APR}}_{i,m,q}^t\) and \({\textit{PRE}}_{i,m,q}^t\) can be obtained by using the following constraints:

$$\begin{aligned} \sum _{l\in L} {\theta _{i, m, q, l}^t }= & {} 1 \qquad \forall i\in C,\, \forall m\in M,\,\forall q\in Q,\,\forall t\in T \end{aligned}$$
(2)
$$\begin{aligned} {\textit{APR}}_{i,m,q}^t= & {} A_{i,m,q} +\sum _{l\in L} {\theta _{i, m, q, l}^t \left( {\frac{l-1}{NL-1}} \right) } \left( {B_{i,m,q} -A_{i,m,q} } \right) \nonumber \\&\forall i\in C,\,\forall m\in M,\,\forall q\in Q,\, \forall t\in T \end{aligned}$$
(3)
$$\begin{aligned} {\textit{PRE}}_{i,m,q}^t= & {} \sum _{l\in L} {\left( {\frac{l-1}{NL-1}} \right) \theta _{i, m, q, l}^t } \qquad \forall i\in C,\, \forall m\in M,\,\forall q\in Q,\,\forall t\in T \end{aligned}$$
(4)

4 Mathematical modeling

In this section, an MILP model is presented for the problem using two-stage stochastic programming with recourse. The general formulation for two-stage stochastic programs is as follows:

$$\begin{aligned} {\mathop {\hbox {Min}}\limits _{\mathbf{x}\in X}} \, \mathbf{c}^{T}{} \mathbf{x} + E\left( {Q\left( \mathbf{x} \right) } \right) . \end{aligned}$$
(5)

In optimization problem (5), \(\mathbf{x}, X\) and \(Q\left( \mathbf{x} \right) \) are the vector of first stage decisions, a non-empty set of feasible decisions, and the recourse function. The stochastic parameters are assumed as a discrete and finite sample space, in this paper. The sample space can be presented by \(\varvec{\upzeta } = \left\{ {\zeta ^{1}, \zeta ^{2},\ldots , \zeta ^{\left| S \right| } } \right\} \) with corresponding probabilities \(\pi ^{1}, \pi ^{2},\ldots , \pi ^{\left| S \right| }\) and \(\zeta ^{s}\) is a given realization of the stochastic parameters that depends on scenario \(s\in S\). Therefore, \(E\left( {Q\left( \mathbf{x} \right) } \right) \) in problem (5) can be calculated as \(\sum \limits _{s\in S} {\pi ^{s}\times Q\left( {\mathbf{x}, \zeta ^{s}} \right) }\), where,

$$\begin{aligned} Q\left( {\mathbf{x}, \zeta ^{s}} \right) = \mathop {\hbox {min}}\limits _{\mathbf{y}^{s}} \left\{ {\left( {\mathbf{q}^{s}} \right) ^{T}{} \mathbf{y}^{s}: W^{s}{} \mathbf{y}^{s}=\mathbf{h}^{s}-T^{s}{} \mathbf{x}, \mathbf{y}^{s}\ge 0} \right\} . \end{aligned}$$

Here, \(\mathbf{y}^{s}\) and \(\zeta ^{s}=\left( {\mathbf{q}^{s},W^{s},T^{s}, \mathbf{h}^{s}} \right) \) represent the vector of recourse decisions and particular realization of stochastic parameters for scenario \(s\in S\), respectively. For more information about the two-stage stochastic programs, one can refer to Birge and Louveaux (2011). Here, the required notations for our two-stage stochastic programming formulation are introduced.

4.1 Two-stage stochastic programming formulation

In this subsection, the problem is formulated using two-stage stochastic programing. The stochastic parameters in our problem are potential return (RE) and demand (DE) of customer zones and we have \(\zeta ^{s}=\left( {{\textit{DE}}^{s},{\textit{RE}}^{s}} \right) \) as a specific realization of the uncertain parameters in scenario \(s\in S\). Here, \(\varvec{x}, \varvec{y}, \varvec{z}, \varvec{\theta }\) represent binary variables, the first stage decisions, in which the indices are omitted.

$$\begin{aligned} \hbox {Max:} \quad {\textit{BU}}-\sum _{i\in J} {{\textit{CD}}_i x_i } -\sum _{i\in J} {{\textit{CC}}_i y_i } -\sum _{i\in J} {{\textit{CH}}_i z_i } +\sum _{s\in S} {\pi ^{s}\times Q\left( {\varvec{x}, \varvec{y}, \varvec{z}, \varvec{\theta }, \zeta ^{s}} \right) }. \end{aligned}$$
(6)

Subject to:

$$\begin{aligned}&x_i +y_i +z_i \le 1, \qquad \forall i\in J, \end{aligned}$$
(7)
$$\begin{aligned}&\sum _{i\in J} {{\textit{CD}}_i x_i } +\sum _{i\in J} {{\textit{CC}}_i y_i } +\sum _{i\in J} {{\textit{CH}}_i z_i } \le {\textit{BU}}, \end{aligned}$$
(8)
$$\begin{aligned}&{{\varvec{x}}}, {{\varvec{y}}}, {{\varvec{z}}}, {\varvec{\theta }} \in \left\{ {0, 1} \right\} . \end{aligned}$$
(9)

In which \(Q\left( {\varvec{x}, \varvec{y}, \varvec{z}, \varvec{\theta }, \zeta ^{s}} \right) \) is the solution of the following second-stage problem:

$$\begin{aligned}&Q\left( {\varvec{x}, \varvec{y}, \varvec{z}, \varvec{\theta }, \zeta ^{s}} \right) \nonumber \\&\quad = \hbox {Max:} \left( {\sum _{i\in J} {\sum _{{i}^{\prime }\in C} {\sum _{m\in M} {\sum _{t\in T} {{\textit{SP}}_m^t f_{i, {i}^{\prime }, m}^{t,s} } } } } } \right) -\left( {\sum _{i\in K} {\sum _{{i}^{\prime }\in P} {\sum _{r\in R} {\sum _{t\in T} {{\textit{CB}}_{i,r}^t w_{i, {i}^{\prime }, r}^{t,s} } } } } } \right) \nonumber \\&\qquad -\left( {\sum _{i\in P} {\sum _{m\in M} {\sum _{t\in T} {{\textit{CP}}_{i,m}^t n_{i, m}^{t,s} } } } } \right) -\left( {\sum _{i\in P} {\sum _{m\in M} {\sum _{q\in Q} {\sum _{t\in T} {{\textit{CR}}_{i,m,q}^t g_{i, m, q}^{t,s} } } } } } \right) \nonumber \\&\qquad -\left( {\sum _{i\in J} {\sum _{m\in M} {\sum _{t\in T} {{\textit{CHD}}_{i,m} h_{i, m}^{t,s} } } } +\sum _{i\in P} {\sum _{{i}^{\prime }\in J} {\sum _{m\in M} {\sum _{t\in T} {{\textit{CHD}}_{{i}^{\prime },m} \left( {{f_{i, {i}^{\prime }, m}^{t,s} }/{2\Lambda _{i{i}^{\prime }} }} \right) } } } } } \right) \nonumber \\&\qquad -\left( {\sum _{i\in J} {\sum _{{i}^{\prime }\in C} {\sum _{m\in M} {\sum _{t\in T} {{\textit{CPF}}_{i,m}^t f_{i, {i}^{\prime }, m}^{t,s} } } } } } \right) -\left( {\sum _{i\in C} {\sum _{{i}^{\prime }\in J} {\sum _{m\in M} {\sum _{q\in Q} {\sum _{t\in T} {{\textit{CPB}}_{i,m}^t v_{i, {i}^{\prime }, m, q}^{t,s} } } } } } } \right) \nonumber \\&\qquad -\left( {\sum _{i\in J} {\sum _{{i}^{\prime }\in D} {\sum _{m\in M} {\sum _{q\in Q} {\sum _{t\in T} {{\textit{CDS}}_{i,m}^t v_{i, {i}^{\prime }, m, q}^{t,s} } } } } } } \right) \nonumber \\&\qquad -\left( {\begin{array}{l} \sum \limits _{i\in K} {\sum \limits _{{i}^{\prime }\in P} {\sum \limits _{r\in R} {\sum \limits _{t\in T} {{\textit{CTR}}_{i,{i}^{\prime },r} w_{i, {i}^{\prime }, r}^{t,s} } } } } +\sum \limits _{i\in P} {\sum \limits _{{i}^{\prime }\in J} {\sum \limits _{m\in M} {\sum \limits _{t\in T} {{\textit{CTF}}_{i,{i}^{\prime },m} f_{i, {i}^{\prime }, m}^{t,s} } } } } \\ \quad +\sum \limits _{i\in J} {\sum \limits _{{i}^{\prime }\in C} {\sum \limits _{m\in M} {\sum \limits _{t\in T} {{\textit{CTF}}_{i,{i}^{\prime },m} f_{i, {i}^{\prime }, m}^{t,s} } } } } \\ \sum \limits _{i\in C} {\sum \limits _{{i}^{\prime }\in J} {\sum \limits _{m\in M} {\sum \limits _{q\in Q} {\sum \limits _{t\in T} {{\textit{CTB}}_{i,{i}^{\prime },m} v_{i, {i}^{\prime }, m, q}^{t,s} } } } } } +\sum \limits _{i\in J} {\sum \limits _{{i}^{\prime }\in D} {\sum \limits _{m\in M} {\sum \limits _{q\in Q} {\sum \limits _{t\in T} {{\textit{CTB}}_{i,{i}^{\prime },m} v_{i, {i}^{\prime }, m, q}^{t,s} } } } } }\\ \quad +\sum \limits _{i\in J} {\sum \limits _{{i}^{\prime }\in P} {\sum \limits _{m\in M} {\sum \limits _{q\in Q} {\sum \limits _{t\in T} {{\textit{CTB}}_{i,{i}^{\prime },m} v_{i, {i}^{\prime }, m, q}^{t,s} } } } } } \\ \end{array}} \right) \nonumber \\&\qquad -\left( \sum _{i\in C} \sum _{m\in M} \sum _{q\in Q} \sum _{l\in L} \sum _{t\in T} \left( {A_{ i, m, q} +\left( {\frac{l-1}{\left| L \right| -1}} \right) \left( {B_{ i, m, q} -A_{ i, m, q} } \right) } \right) \right. \nonumber \\&\qquad \left. \times \left( {\left( {\frac{l-1}{\left| L \right| -1}} \right) RE_{i,m,q}^{t,s} } \right) \times \theta _{i, m, q, l}^t \right) -\left( {\sum _{i\in C} {\sum _{m\in M} {\sum _{t\in T} {{\textit{CS}}_{i,m}^t u_{i, m}^{t,s} } } } } \right) . \end{aligned}$$
(10)

Subject to:

$$\begin{aligned}&\sum _{m\in M} {{\textit{BM}}_{r,m} n_{i, m}^{t,s} } =\sum _{{i}^{\prime }\in K} {w_{{i}^{\prime }, i, r}^{t,s} }, \qquad \forall i\in P,\,\forall r\in R,\,\forall t\in T, \end{aligned}$$
(11)
$$\begin{aligned}&\sum _{{i}^{\prime }\in P} {w_{i, {i}^{\prime }, r}^{t,s} } \le {\textit{SC}}_{i,r}^t, \qquad \forall i\in K,\,\forall r\in R,\,\forall t\in T, \end{aligned}$$
(12)
$$\begin{aligned}&n_{i, m}^{t,s} +\sum _{q\in Q} {g_{i, m, q}^{t,s} } =\sum _{{i}^{\prime }\in J} {f_{i, {i}^{\prime }, m}^{t,s} }, \qquad \forall i\in P, \,\forall m\in M,\, \forall t\in T, \end{aligned}$$
(13)
$$\begin{aligned}&\sum _{m\in M} {\gamma _{i,m}^{\mathrm{P}} n_{i, m}^{t,s} } +\sum _{m\in M} {\gamma _{i,m}^{\mathrm{R}} \left( {\sum _{q\in Q} {g_{i, m, q}^{t,s} } } \right) } \le {\textit{PC}}_i^t, \qquad \forall i\in P,\,\forall t\in T, \end{aligned}$$
(14)
$$\begin{aligned}&\sum _{{i}^{\prime }\in P} {f_{{i}^{\prime },i, m}^{t,s} } +h_{i, m}^{t-1,s} =\sum _{{i}^{\prime }\in C} {f_{i, {i}^{\prime }, m}^{t,s} } +h_{i, m}^{t,s}, \quad \forall i\in J,\,\forall m\in M,\,\forall t\in T, \end{aligned}$$
(15)
$$\begin{aligned}&\sum _{{i}^{\prime }\in J} {f_{{i}^{\prime },i, m}^{t,s} } +u_{i, m}^{t,s} ={\textit{DE}}_{i,m}^{t,s}, \qquad \forall i\in C, \, \forall m\in M, \,\forall t\in T, \end{aligned}$$
(16)
$$\begin{aligned}&\sum _{m\in M} {\gamma _{i,m}^{\mathrm{S}} \times \left( {\sum _{{i}^{\prime }\in P} {\left( {{f_{{i}^{\prime },i, m}^{t,s} }/{\Lambda _{{i}^{\prime }i} }} \right) } +h_{i, m}^{t,s} } \right) } \le {\textit{HC}}_i x_i +{\textit{HCH}}_i z_i, \qquad \forall i\in J,\,\forall t\in T, \end{aligned}$$
(17)
$$\begin{aligned}&\sum _{m\in M} {\left( {\gamma _{i,m}^{\mathrm {PF}} \sum _{{i}^{\prime }\in C} {f_{i, {i}^{\prime }, m}^{t,s} } } \right) } \le {\textit{FC}}_i x_i +{\textit{FCH}}_i z_i, \qquad \forall i\in J,\,\forall t\in T, \end{aligned}$$
(18)
$$\begin{aligned}&\left( {1-\beta _{m,q} } \right) \sum _{{i}^{\prime }\in C} {v_{{i}^{\prime },i,m, q}^{t,s} } =\sum _{{i}^{\prime }\in D} {v_{i,{i}^{\prime },m, q}^{t,s} }, \qquad \forall i\in J,\, \forall m\in M,\,\forall q\in Q,\,\forall t\in T, \end{aligned}$$
(19)
$$\begin{aligned}&\beta _{m,q} \sum _{{i}^{\prime }\in C} {v_{{i}^{\prime },i,m, q}^{t,s} } =\sum _{{i}^{\prime }\in P} {v_{i,{i}^{\prime },m, q}^{t,s} }, \qquad \forall i\in J,\,\forall m\in M,\,\forall q\in Q,\,\forall t\in T, \end{aligned}$$
(20)
$$\begin{aligned}&g_{i, m, q}^{t,s} =\sum _{{i}^{\prime }\in J} {v_{{i}^{\prime },i,m, q}^{t,s} }, \qquad \forall i\in P,\,\forall m\in M,\,\forall q\in Q,\,\forall t\in T, \end{aligned}$$
(21)
$$\begin{aligned}&\sum _{m\in M} {\left( {\gamma _{i,m}^{{\textit{PB}}} \left( {\sum _{{i}^{\prime }\in C} {\sum _{q\in Q} {v_{{i}^{\prime },i,m, q}^{t,s} } } } \right) } \right) } \le {\textit{CCA}}_i y_i +{\textit{CCH}}_i z_i, \qquad \forall i\in J,\,\forall t\in T, \end{aligned}$$
(22)
$$\begin{aligned}&\sum _{l\in L} {\theta _{i, m, q, l}^t } =1, \qquad \forall i\in C,\, \forall m\in M,\,\forall q\in Q,\,\forall t\in T, \end{aligned}$$
(23)
$$\begin{aligned}&\sum _{{i}^{\prime }\in J} {v_{i,{i}^{\prime },m, q}^{t,s} } \le \sum _{l\in L} {\left( {\frac{l-1}{L-1}} \right) \theta _{i, m, q, l}^t RE_{i,m,q}^{t,s} }, \qquad \forall i\in C,\,\forall m\in M,\, \forall q\in Q,\,\forall t\in T, \end{aligned}$$
(24)
$$\begin{aligned}&\varvec{w}, \varvec{f}, \varvec{v}, \varvec{n}, \varvec{u}, \varvec{g}, \varvec{h} \ge 0. \end{aligned}$$
(25)

The objective function (6) is the sum of unspent available budget for designing the IFRLN and the expected net income of the logistics network over the planning horizon. The objective function of the second-stage (10) is equal to the total revenue from selling the final products to customer zones minus the total tactical costs. The tactical costs consist of ten parts: (1) costs of buying raw materials form suppliers, (2) the production costs, (3) the remanufacturing costs, (4) the storage costs, (5) the processing costs to forward products from distribution centers and hybrid facilities to customer zones, (6) the processing costs on used products in collection centers and hybrid facilities, (7) the disposal costs, (8) the transportation costs, (9) costs of buying used products from customer zones, (10) shortfall penalty for unsatisfied customer zones’ demands.

Sets

I

Set of all entities in the network including suppliers, production/recovery centers, disposal centers, customer zones, and potential locations for distribution, collection centers, and hybrid facilities, \(\left( {i, {i}^{\prime }\in I} \right) \),

K

Set of suppliers \((K\subset I)\),

P

Set of production/recovery centers \((P\subset I)\),

J

Set of potential locations for distribution centers, collection centers, and hybrid facilities \((J\subset I)\),

C

Set of customer zones \((C\subset I)\),

D

Set of disposal centers \((D\subset I)\),

M

Set of products \((m\in M)\),

L

Set of price levels for buying used products \((l\in L)\) and the number of price levels is NL,

R

Set of raw materials \((r\in R)\),

Q

Set of quality levels of used products \((q\in Q)\),

T

Set of tactical periods \((t\in T)\),

S

Set of scenarios \((s\in S)\).

Costs

\({\textit{CD}}_i\)

Fixed cost of opening a distribution center at potential location \(i\in J\),

\({\textit{CC}}_i\)

Fixed cost of opening a collection center at potential location \(i\in J\),

\({\textit{CH}}_i\)

Fixed cost of opening a hybrid facility at potential location \(i\in J\),

\({\textit{CB}}_{i,r}^t\)

Cost of buying one unit of raw material \(r\in R\) from supplier \(i\in K\), in tactical period \(t\in T\),

\({\textit{CTR}}_{i,{i}^{\prime },r}\)

Cost of transporting one unit of raw material \(r\in R\) from supplier \(i\in K\) to production/recovery center \({i}^{\prime }\in P\),

\({\textit{CTF}}_{i,{i}^{\prime },m}\)

Cost of transporting one unit of product \(m\in M\) between entities i and \({i}^{\prime }\) in forward direction,

\({\textit{CTB}}_{i,{i}^{\prime },m}\)

Cost of transporting one unit of product \(m\in M\) between entities i and \({i}^{\prime }\) in reverse direction,

\({\textit{CP}}_{i,m}^t\)

Cost of producing one unit of product \(m\in M\) in production/recovery center \(i\in P\), in tactical period \(t\in T\),

\({\textit{CPF}}_{i,m}^t\)

Cost of processing to forward one unit of product \(m\in M\) from distribution center or hybrid facility \(i\in J\), in tactical period \(t\in T\),

\({\textit{CHD}}_{i,m}\)

Cost of holding one unit of product \(m\in M\) in distribution center or hybrid facility \(i\in J\),

\({\textit{CS}}_{i,m}^t\)

Cost of penalizing one unit of non-satisfied product \(m\in M\) for customer zone \(i\in C\), in tactical period \(t\in T\),

\({\textit{CR}}_{i,m,q}^t\)

Cost of recovering one unit of recoverable product \(m\in M\) with quality level \(q\in Q\) in production/recovery center \(i\in P\), in tactical period \(t\in T\),

\({\textit{CPB}}_{i,m}^t\)

Cost of processing one unit of used product \(m\in M\) in collection center or hybrid facility \(i\in J\), in tactical period \(t\in T\),

\({\textit{CDS}}_{i,m}^t\)

Cost of disposing one unit of scrapped product \(m\in M\) in disposal center \(i\in D\), in tactical period \(t\in T\).

Other input parameters

\({\textit{DE}}_{i,m}^{t,s}\)

Demand of customer zone \(i\in C\) for product \(m\in M\),in tactical period \(t\in T\) and scenario \(s\in S\),

\({\textit{RE}}_{i,m,q}^{t,s}\)

Potential return amount of used product \(m\in M\) with quality level \(q\in Q\) from customer zoned \(i\in C\), in tactical period \(t\in T\) and scenario \(s\in S\),

\(A_{ i, m, q}\)

Lower endpoint of uniform distribution related to expected price of customer zone \(i\in C\) for one unit of used product \(m\in M\) with quality level \(q\in Q\)

\(B_{ i, m, q}\)

Upper endpoint of uniform distribution related to expected price of customer zone \(i\in C\) for one unit of used product \(m\in M\) with quality level \(q\in Q\),

\({\textit{BM}}_{r,m}\)

Required amount of raw material \(r\in R\) for producing one unit of product \(m\in M\),

\({\textit{SP}}_m^t\)

Selling price for one unit of product \(m\in M\), in tactical period \(t\in T\),

\({\textit{SC}}_{i,r}^t\)

Capacity of supplier \(i\in K\) for raw material \(r\in R\), in tactical period \(t\in T\),

\({\textit{PC}}_i^t\)

Capacity of production/recovery center \(i\in P\), in tactical period \(t\in T\),

\({\textit{HC}}_i\)

Storage capacity of potential location \(i\in J\), if a distribution center is opened at location i,

\({\textit{HCH}}_i\)

Storage capacity of potential location \(i\in J\), if a hybrid facility is opened at location i,

\({\textit{FC}}_i\)

Processing capacity of potential location \(i\in J\) for forwarding finished products to customer zones, if a distribution center is opened at location i,

\({\textit{FCH}}_i\)

Processing capacity of potential location \(i\in J\) for forwarding finished products to customer zones, if a hybrid facility is opened at location i,

\({\textit{CCA}}_i\)

Capacity of potential location \(i\in J\) for processing returned products, if a collection center is opened at location i,

\({\textit{CCH}}_i\)

Capacity of potential location \(i\in J\) for processing returned products, if a hybrid facility is opened at location i,

\(\Lambda _{i{i}^{\prime }}\)

Number of forward deliveries between entities \(i\in P\) and \({i}^{\prime }\in J\) in each tactical period,

\(\gamma _{i,m}^{\mathrm{P}}\)

Coefficient for using capacity of production/recovery center \(i\in P\) to produce one unit of product \(m\in M\),

\(\gamma _{i,m}^{\mathrm{R}}\)

Coefficient for using capacity of production/recovery center \(i\in P\) to remanufacture one unit of product \(m\in M\),

\(\gamma _{i,m}^{\mathrm{S}}\)

Coefficient for using storage capacity of distribution center or hybrid facility \(i\in J\) to stock one unit of finished product \(m\in M\),

\(\gamma _{i,m}^{\mathrm {PF}}\)

Coefficient for using capacity of distribution center or hybrid facility \(i\in J\) to handle one unit of finished product \(m\in M\),

\(\gamma _{i,m}^{\mathrm {PB}}\)

Coefficient for using capacity of collection center or hybrid facility \(i\in J\) to handle one unit of used product \(m\in M\),

\(\beta _{m,q}\)

Average recoverable fraction of used product \(m\in M\) with quality level \(q\in Q\)

\({\textit{BU}}\)

The available budget for opening facilities in the network,

\(\pi ^{s}\)

The probability of scenario \(s\in S\).

Binary variables

\(\theta _{i, m, q, l}^t\)

1 if the price level \(l\in L\) is selected for purchasing per unit used product \(m\in M\)with quality level \(q\in Q\) from customer zone \(i\in C\), in tactical period \(t\in T\),

\(x_i \)

1 if a distribution center is established at potential location \(i\in J\),

\(y_i\)

1 if a collection center is established at potential location \(i\in J\),

\(z_i\)

1 if a hybrid facility is established at potential location \(i\in J\).

Continuous variables

\(w_{i, {i}^{\prime }, r}^{t,s}\)

Amount of raw material \(r\in R\) shipped from supplier \(i\in S\) to production/recovery center \({i}^{\prime }\in P\),in tactical period \(t\in T\), and scenario \(s\in S\),

\(f_{i, {i}^{\prime }, m}^{t,s}\)

Amount of finished product \(m\in M\) shipped from entity \(i\in I\) to entity \({i}^{\prime }\in I\) in forward direction, in tactical period \(t\in T\), and scenario \(s\in S, \quad \left( {i\in P,{i}^{\prime }\in J} \right) \hbox {or} \left( {i\in J, {i}^{\prime }\in C} \right) \),

\(v_{i, {i}^{\prime }, m, q}^{t,s}\)

Amount of used product \(m\in M\) with quality level \(q\in Q\) shipped from entity \(i\in I\) to entity \({i}^{\prime }\in I\) in reverse direction, in tactical period \(t\in T\), and scenario \(s\in S, \quad \left( {i\in C,{i}^{\prime }\in J} \right) or\left( {i\in J, {i}^{\prime }\in D} \right) or\left( {i\in J, {i}^{\prime }\in P} \right) \),

\(n_{i, m}^{t,s}\)

Amount of product \(m\in M\) produced at production/recovery center \(i\in P\) in tactical period \(t\in T\) and scenario \(s\in S\),

\(u_{i, m}^{t,s}\)

Un-satisfied demand of product \(m\in M\) for customer zone \(i\in C\) in tactical period \(t\in T\) and scenario \(s\in S\),

\(g_{i, m, q}^{t,s}\)

Amount of used product \(m\in M\) with quality level \(q\in Q\) remanufactured at production/recovery center \(i\in P\) in tactical period \(t\in T\)and scenario \(s\in S\),

\(h_{i, m}^{t,s}\)

Amount of product \(m\in M\) held at distribution center or hybrid facility \(i\in J\) in tactical period \(t\in T\) and scenario \(s\in S\).

Constraints (7) guarantee that in each potential location, at most one type of facilities can be opened. Budget constraint (8) limits the allowed capital for locating new facilities in the logistics network to the available budget. Constraints (9) state that the corresponding variables are binary.

Constraints (11) assure that each production/recovery center receives an enough amount of each raw material from suppliers to produce different products, in each tactical period. Constraints (12) guarantee that in each tactical period, the raw material flows forwarding from each supplier does not exceed the corresponding capacity of that supplier. In accordance to constraints (13), in each tactical period, all produced and remanufactured products in each production/recovery center, should be forwarded to distribution centers or hybrid facilities. Capacity of production/recovery centers are restricted by constraints (14). Constraints (15) and (16) are balance constraints in the forward direction. Constraints (15) ensure that in each tactical period and for each product, the input product flows plus the product amount that is stored at the end of the previous period in any distribution center or hybrid facility is equal to the output product flows plus the quantity of the product that is stored at the end of the current tactical period. Constraints (16) state that the customer zone’s demand for each product should be partially or completely satisfied in each tactical period. In accordance to constraints (17), each open distribution center or hybrid facility must not hold the finished products more than its storage capacity, in each tactical period. Constraints (18) limit each open distribution center or hybrid facility to not forward products to customer zones more than its processing capacity, in each tactical period. Constraints (1921) are balance constraints in the reverse direction. Constraints (22) limit the amount of collected used products to the capacity of processing used products in each collection center or hybrid facility, in each tactical period. Constraints (23) assure that in each tactical period and for each customer zone, only one level for acquisition price should be selected for each used product with each quality level. In accordance to constraints (24), in each tactical period, the amount of used products forwarded from each customer zone to collection centers is less than or equal to the purchased amount of that used product. Finally, Constraints (25) restrict corresponding variables from taking negative values. In the next section, the simulation-based SA algorithm for solving the stochastic optimization problem is presented.

5 Solution approach

The considered problem includes the capacitated location problem which is known to be NP-complete (Davis and Ray 1969) and the IFRLN design problem is an NP-hard optimization problem (Pishvaee et al. 2010a). Therefore, commercial solvers such as CPLEX cannot solve the proposed model in a reasonable time for large-sized networks or a large number of scenarios. In this section, a simulation-based solution approach on the basis of simulated annealing (SA) is proposed to solve the problem.

\(\left( {\varvec{x}, \varvec{y}, \varvec{z}, \varvec{\theta }} \right) \) are the first stage decisions in the proposed two-stage stochastic program and should be made before the realization of uncertain parameters. Furthermore, these decisions are binary variables and for any feasible solution of them, the resulting problem would be linear and can be solved simply by using commercial solvers. Variables \({\varvec{\theta }}\) are multi-dimensional binary variables and make the optimization problem too complex. However, in accordance with our implementations, these variables have smaller implication on the problem’s objective in comparison with major configuration decisions (x, y, z), and if we relax variables \({\varvec{\theta }}\) to be continuous between 0 and 1, an upper bound problem would be obtained that is very close to the original problem in terms of the objective value. As a consequence, in the proposed solution approach, the SA algorithm is applied to swiftly search the feasible domain of variables (x, y, z). Then, a developed linear-relaxation based heuristic obtains the binary and feasible values for variables \({\varvec{\theta }}\) in accordance with the best solution found by the SA algorithm.

The proposed solution approach has three main steps. In the first step, we relax binary variables \({\varvec{\theta }}\) to be continuous between 0 and 1 and assume that these variables are scenario dependent as second stage decisions to obtain an upper bound problem. In the second step, we use the SA algorithm to find the best solution for the upper bound problem. The SA algorithm searches on the feasible domain of variables (x, y, z) and applies a simulation approach to calculate the objective function of the upper bound problem. In the simulation phase of the SA, after finding each feasible solution for variables (x, y, z), the solution is fixed in the upper bound problem and the obtained linear programming problem is solved separately for each scenario using any LP solver (e.g., CPLEX) to determine the objective function and the optimal values for the rest of continuous variables (\(\varvec{w}, \varvec{f}, \varvec{v}, \varvec{n}, \varvec{u}, \varvec{g}, \varvec{h}, {\varvec{\theta }})\). Finally, in the last step of the solution algorithm, we use a fixing procedure and then, solve a simple MILP model to obtain the binary values for variables \({\varvec{\theta }}\) and the problem’s objective in accordance with the best solution of the SA algorithm.

5.1 Encoding scheme and decoding procedure for the SA

A \(2\times \left| J \right| \) matrix containing numbers between 0 and 1 is assumed as an encoding scheme. \(\left| J \right| \) is the number of potential locations for distribution centers, collection centers, and hybrid facilities. Figure 5 shows the encoding scheme for an illustrative example with eight potential locations. Here, \(\gamma _j\) and \({\gamma }_j^{\prime }\) are the values in the first row and jth column, and second row and jth column in the solution representation matrix, respectively. (e.g. \(\gamma _2 =0.5601\) and \({\gamma }_5^{\prime } =0.7389\))

Fig. 5
figure 5

The solution representation matrix for an illustrative example

A decoding procedure is developed to generate feasible solutions for (x, y, z) from the encoding matrix. The binary values for (x, y, z) should not violate the budget constraint, \(\sum \limits _{i\in J} {CD_i x_i } +\sum \limits _{i\in J} {CC_i y_i } +\sum \limits _{i\in J} {CH_i z_i } \le BU\), that the presented decoding procedure handles this issue. In Fig. 6, \(\left( {\hat{\varvec{x}},\hat{\varvec{y}},\hat{\varvec{z}}} \right) \) are the obtained values for binary decisions (x, y, z).

Fig. 6
figure 6

The pseudo code of developed decoding procedure for the SA algorithm

In the decoding procedure, parameter \(\alpha \) has a predetermined value between 0 and 1 that is used to generate a solution for (x, y, z) in accordance with the values of the SA’s representation matrix. Furthermore, as another strategy, namely strategy II, we can use \(\max \left( {\gamma _i,{\gamma }_i^{\prime }} \right) \) in section a of the decoding procedure instead of \(\gamma _i +{\gamma }_i^{\prime }\). Both of these strategies are examined in the computational results part.

After finding a feasible solution for binary variables (x, y, z), by fixing their values in the upper bound problem in which variables \({\varvec{\theta }}\) is assumed to be scenario-dependent and continuous between 0 and 1, a linear programming (LP) problem would be obtained. This LP problem is separable for each scenario and hence, we can solve it for each scenario using CPLEX solver to determine the objective and values of continuous variables (\(\varvec{w}, \varvec{f}, \varvec{v}, \varvec{n}, \varvec{u}, \varvec{g}, \varvec{h}, {\varvec{\theta }}\)).

5.2 SA’s framework

Local searching on a solution has a main impact on the SA’s performance. Here, we use normal distribution function as a popular probabilistic distribution function for searching on a continuous search space. Therefore, a randomly generated \(2\times \left| J \right| \) matrix from a normal distribution with mean parameter \(\mu \) and standard deviation \(\sigma \) is added to the elements of the previous solution representation matrix to obtain a new solution in its neighborhood. Moreover, in the new solution, the matrix elements are prevented from taking values more than one or less than zero and therefore, after local searching, the value of an element should be set to zero, if it is less than zero and one, if it is more than one. In this paper, in accordance with a large number of empirical experiments, the values of \(\mu \) and \(\sigma \) are considered 0 and 0.15, respectively. Additionally, most of the matrix’s elements take a value between 0 and 1 by these values for \(\mu \) and \(\sigma \).

The acceptance probability is computed for a new worse solution in the SA algorithm after a local searching by using relation (26). \(f^{\mathrm{new}}\) and \(f^{\mathrm{old}}\) are the objective values of the current and new solutions in relation (26) in which \(f^{\mathrm{new}}<f^{\mathrm{old}}\). Moreover, c and \(\tau \) represent the coefficient of the temperature and the temperature, successively.

$$\begin{aligned} P^{A}=\exp \left( {{-\left( {f^{\mathrm{old}}-f^{\mathrm{new}}} \right) }/{\left( {c\times \tau } \right) }} \right) . \end{aligned}$$
(26)

In the SA algorithm, if a randomly generated number between (0, 1) is less than \(P^{A}\), the new solution will be accepted. Furthermore, a cooling scheme by which a new temperature is equal to the previous temperature minus one, \(\tau ^{\mathrm{new}}=\tau -1\), is utilized.

5.3 A heuristic for obtaining pricing variables

In this section, a heuristic as the last step of the solution approach is developed to obtain binary and near-optimal values for variables \({\varvec{\theta }}\). The values for variables \({\varvec{\theta }}\) that are obtained from the best solution of the SA algorithm are continuous between 0 and 1 and scenario dependent. Here, these relaxed values are demonstrated by \(\tilde{\varvec{{\theta }}}\), and \({\tilde{\theta }}_{i,m,q, l}^{t,s}\) is the obtained value for price level \(l\in L\) to buy per unit used product \(m\in M\) with quality level \(q\in Q\) from customer zone \(i\in C\), in tactical period \(t\in T\) and scenario \(s\in S\). Afterward, we can compute the approximate values \(({\bar{\varvec{\theta }}})\) for variables \({\varvec{\theta }}\) that are not dependent to scenarios by using relation (27). It is worth noting that, these approximate values are still continuous between 0 and 1.

$$\begin{aligned} \bar{{\theta }}_{i,m,q, l}^t =\sum _{s\in S} {\pi _s \times \tilde{\theta }_{i,m,q, l}^{t,s} } \qquad \forall i\in C,\,\forall m\in M,\, \forall q\in Q,\,\forall l\in L, \,\forall t\in T \end{aligned}$$
(27)

As previously mentioned, the SA’s objective is related to the objective value of the upper bound problem. In the last step of the solution approach, using the developed heuristic, the binary values for variables \({\varvec{\theta }}\) and the value of original problem’s objective should be obtained.

The heuristic, in accordance to the best SA’s solution, fixes variables (x, y, z) in the original MILP model. In addition, variables \({\varvec{\theta }}\) in which their approximate values \(({\bar{\varvec{\theta }}})\) are more than \(\varepsilon \), should be fixed to 1 in the model. Here, \(\varepsilon \) is a threshold parameter with predetermined value between 0 and 1. Afterward, the objective value and binary values for the rest of variables \({\varvec{\theta }}\) are achieved by solving the model using CPLEX solver. It is worth noting that in the computational results section, we set a proper value for \(\varepsilon \) based on the empirical experiments and investigate its impact on the performance of the heuristic. The main framework of the solution approach is presented in Fig. 7.

Fig. 7
figure 7

The framework of the proposed solution approach

6 Modeling demand and potential return uncertainty

In this paper, a two-stage stochastic programming model is developed in which the second stage has multiple periods. In such a stochastic program, one of the main concerns is to obtain an efficient fan of discrete scenarios for stochastic parameters and estimate their probabilities. Here, two stochastic processes are assumed to model customer zones’ demands and potential returns in multiple periods. The stochastic processes are based on a modified autoregressive model of first order that is presented by Sodhi (2005). In our case, in tactical period \(t\in T\) and for scenario \(s\in S\), the demand of customer zone \(i\in C\) for product \(m\in M\) and the potential return of customer zone \(i\in C\) for product \(m\in M\) with quality level \(q\in Q\) are obtained by relation (28) and (29), respectively.

$$\begin{aligned} {\textit{DE}}_{i,m}^{t,s} -\mu _{i,m}^t= & {} \rho _{i,m} \left[ {{\textit{DE}}_{i,m}^{t-1,s} -\mu _{i,m}^{t-1} } \right] +\varepsilon _{i,m}^{t,s}, \end{aligned}$$
(28)
$$\begin{aligned} {\textit{RE}}_{i,m,q}^{t,s} -{\mu }_{i,m,q}^{\prime t}= & {} {\rho }_{i,m,q}^{\prime } \left[ {{\textit{RE}}_{i,m,q}^{t-1,s} -{\mu }_{i,m,q}^{\prime t-1} } \right] +{\varepsilon }_{i,m,q}^{\prime t,s}. \end{aligned}$$
(29)

In relations (28) and (29), \(\varvec{\rho }\) and \({\varvec{\rho }^{\prime }}\) are autoregressive parameters, \(\varvec{\mu }\) and \({\varvec{\mu }}^{\prime }\) are predetermined time’s parameters to capture life cycle and seasonality of the new products and potential returns over the planning horizon, \(\varvec{\varepsilon }\) and \({\varvec{\varepsilon }}^{\prime }\) are error terms that are assumed to be normal distribution functions with mean zero and variance \(\varvec{\sigma }_{\varvec{\varepsilon }}^2, {\varvec{\sigma }}_{{\varvec{\varepsilon }^{\prime }}}^{{\prime }2}\), respectively. It is worth noting that in the first period, \(\left[ {{\textit{DE}}_{i,m}^{0,s} -\mu _{i,m}^0 } \right] \) and \(\left[ {{\textit{RE}}_{i,m,q}^{0,s} -{\mu }_{i,m,q}^{\prime 0}} \right] \) are considered to be zero.

To make different scenarios, the error terms should be generated and Fig. 8 illustrates the process of error term’s generation in which \(\left| T \right| \) and \(\left| S \right| \) are the numbers of tactical periods and scenarios, respectively. In this paper, such as (Fattahi et al. 2015b) and (Govindan and Fattahi 2015), LHS method is applied for generating \(\varvec{\varepsilon }\) and \({\varvec{\varepsilon }}^{\prime }\).

Fig. 8
figure 8

Error generation for a scenario fan

A large number of scenarios will make the stochastic program complex in terms of computational tractability. Therefore, after generating a fan of scenarios, backward scenario reduction technique is used to reduce the number of scenarios. Forward and backward scenario reduction methods are two types of well-known scenario reduction techniques. For more detailed explanations about these techniques, one can refer to Dupačová et al. (2003).

7 Computational results

In this section, the stochastic model and solution approach are examined, carefully. Furthermore, the developed pricing heuristic for the solution approach is investigated, separately. Finally, the importance of considering stochasticity in the problem is discussed, and the scenario generation method is examined in terms of out-of-sample and in-sample stability.

7.1 Assessing the performance of the stochastic model and solution algorithm

To investigate the performance of the solution approach and MILP model, commercial software GAMS 24.1.2 using CPLEX solver is used to solve the mathematical model and the solution approach is coded with MATLAB R2012a software and CPLEX 12. A personal computer with Intel Core i7-640M CPU (2.8 GHz), with 4.00 GB of RAM, is used for all implementations.

Several test problems with different sizes including small-, medium- and large-sized tests are generated. In the test problems, we assume that \(\left| R \right| =6, \left| Q \right| =2, \left| L \right| =5\), and \(\left| S \right| =15\), and other characteristics of them are illustrated in Table 2, and the method of generating test problems’ parameters using the uniform distributions is explained in “Appendix 1”. Moreover, to obtain scenario fans for the test problems, we generate 300 scenarios by the LHS method, and then, reduce the number of them to 15 scenarios by the backward scenario reduction technique.

Table 2 Characteristics of test problems

Here, parameter \(\upalpha \) in the SA’s decoding procedure and threshold \(\varepsilon \) in the pricing heuristic are set to 0.5 and 0.9, respectively. The solution approach had a good performance with these parameters’ values in enormous empirical experiments. Furthermore, the performance of the pricing heuristic with this threshold value is examined later.

The SA’s performance in the solution approach is dependent to its controllable factors. Here, these factors are the acceptance probability, initial temperature, temperature’s coefficient, and cooling scheme. In this paper, the factors are set for the SA algorithm using Taguchi experimental design (see Montgomery et al. 1984; Taguchi 1986). The orthogonal arrays \(L_{9} (3^{4})\) and two criteria, relative percentage deviation (RPD) and signal-to-noise (S/N) ratio, are considered for the Taguchi method to calibrate these factors such as Zarandi et al. (2013) and Fattahi et al. (2015a). One can refer to those research studies for more information about this issue. The final levels of them are reported in Table 3.

Table 3 The values of controllable factors of the SA

The results from solving the problem with the solution approach and the MILP model with the CPLEX solver are reported for the generated test problems in Table 4. Furthermore, both developed decoding strategies are examined for solving the test problems. It should be noted that the reported relative gaps of the CPLEX solver is based on the difference between the lower and upper bound of its branch and cut algorithm and hence, does not mean the diversion from the optimal objective value. Moreover, the maximum allowable CPU time is set to 6 h for solving the model using CPLEX solver. Optimal solutions are achieved in some test problems using the CPLEX solver. Therefore, the results of the solution approach are compared to the optimal objective values in these test problems to make sure that the solution approach is efficient for our optimization problem. The reported optimality gaps for these test problems are calculated using relation (30).

$$\begin{aligned} \hbox {Solution approach's optimality gap}= & {} \frac{\hbox {Optimal objective}- \hbox {Solution approach's objective}}{\hbox {Optimal objective}} \nonumber \\&\times 100\,\%. \end{aligned}$$
(30)
Table 4 Comparison between results of the solution approach and CPLEX solver

As illustrated in Table 4, when the sizes of test problems become large, the CPLEX solver cannot find their optimal solutions in a reasonable time. Moreover, in test problems P21–P24, the CPLEX solver cannot start its solving procedure because of the out of memory error.

As reported in Table 4, the proposed solution approach has a good performance in solving the stochastic program in terms of the objective value and computational time. Figure 9 shows how the SA algorithm converges to the steady state after some limited iterations for test P15. Furthermore, the average optimality gap of the solution approach with decoding strategy I and II are 3.12 and 3.10 %, respectively. Figure 10 compares the optimality gaps of the solution approach with decoding strategy I and II.

Fig. 9
figure 9

The convergence of the SA algorithm with decoding strategy I for P15

Fig. 10
figure 10

The comparison between decoding strategy I and II

7.2 Investigating the presented heuristic in the solution algorithm

In the solution approach, SA algorithm obtains the objective function of the upper bound problem. Furthermore, in the last step of the solution approach, the presented heuristic determines the final objective for the original problem and binary values for acquisition price decisions. The closeness between the SA’s and final objective value, final objective/SA’s objective \({\times }100\) %, shows the heuristic’s performance. The comparison between them is reported in Table 5 for several test problems.

Table 5 The closeness between final objective value of the solution approach and SA’s objective

The average closeness between SA and final objective values in the considered test problems is 98.59 %. Moreover, the time of performing the heuristic is negligible in comparison with the solution approach’s CPU time. Therefore, we can conclude that the presented heuristic in the solution approach performs well.

As mentioned before, the value of threshold \(\upvarepsilon \) is set to 0.9 in the solution approach based on several empirical experiments. The value of threshold has a main role on the performance of the heuristic. In Table 6, in several test problems with randomly fixed binary values for decisions (x, y, z), different threshold values are set for the heuristic and then, the final objective for the problem after performing the heuristic are obtained.

Table 6 Results of the pricing heuristic for different values for the threshold in the heuristic

From Table 6, we can conclude that threshold value 0.9 is suitable for the heuristic in terms of optimality and CPU time.

7.3 Assessing the importance of stochasticity in the problem

To evaluate the importance of parameters’ stochasticity in the IFRLN design problem, the expected value of perfect information (EVPI) and value of stochastic solution (VSS), as two well-known criteria, are calculated for the stochastic model in several test problems. The VSS measures the difference between the objective value of the two-stage stochastic program and the expected value of deterministic problems’ objectives for different discrete scenarios in which first stage decisions are fixed based on solving a problem with the expected value of stochastic parameters. Furthermore, a decision-maker can earn a maximum amount by accurate and complete information about the future stochastic parameters; this amount is equal to EVPI. In stochastic optimization problems, the EVPI reveals potential worth of more accurate forecasts for stochastic parameters. For more explanations about EVPI and VSS, one can refer to “Appendix 2” and (Birge and Louveaux 2011). Table 7 illustrates EVPI and VSS for some test problems.

Table 7 Values of EVPI and VSS for some test problems

As illustrated in Table 7, the meaningful values of VSS and EVPI show that it is important to capture stochasticity of demands and potential returns of customer zones in designing the IFRLN.

7.4 Assessing in-sample and out-of-sample stability for scenario generation procedure

In this paper, LHS is applied to generate scenarios and then, the number of scenarios is reduced by using backward scenario reduction technique. Because of randomness nature of the LHS, by implementing scenario generation procedure several times with the same input parameters, different scenario fans will be obtained. In-sample stability of a scenario generation procedure assures that whichever of these scenario fans is used in the stochastic optimization problem, the optimal objective value is (approximately) the same (Lium et al. 2009; Govindan and Fattahi 2015). Here, different scenario fans including 15 scenarios are obtained by using the scenario generation procedure for several test problems and their optimal objective values are reported in Table 8.

Out-of-sample stability assures that true objective values for first stage decisions in two-stage stochastic programs are also approximately same as the models’ objective values. Generally, out-of-sample stability can be evaluated using simulation (Lium et al. 2009). Here, the optimal first stage decisions from solving several test problems are simulated by 1000 scenarios. To simulate them, first stage decisions are fixed in the MILP model and then, the obtained linear programming problem is solved for each scenario. In Table 8, simulation results are also reported. Here, \(\left( {\varvec{x}, \varvec{y}, \varvec{z}, \varvec{\theta }} \right) \) variables are first stage decisions. For an acceptable scenario generation procedure, out-of sample and in-sample stability are necessary properties.

Table 8 Simulation response and objective values for the stochastic model with different scenario fans

As illustrated in the last column of Table 8, for each test problem, the difference between the maximum and minimum objective values for different scenario fans with 15 scenarios is relatively small. Furthermore, the average difference between simulation value and objective function value for considered test problems is 0.48 %. Therefore, we can conclude that the scenario generation procedure has in-sample and out-of-sample stability, relatively. The frequency analysis function from simulating the optimal first stage decisions of P15 with 1000 scenarios is also shown in Fig. 11.

Fig. 11
figure 11

Comparison between simulation response and objective values for each scenario in P15. a The optimal objective value for each scenario in P15. b Simulation responses for optimal first stage decisions of P15

In Fig. 11b, we can see the true distribution function of the IFRLN’s net income for optimal solution of the optimization problem.

7.5 Discussion about acquisition price decisions

In this section, an illustrative example is generated with \(\left| K \right| =2, \left| P \right| =3, \left| J \right| =8, \left| C \right| =12, \left| D \right| =2, \left| M \right| =3, \left| R \right| =6, \left| T \right| =4, \left| Q \right| =2, \left| L \right| =5\), and \(\left| S \right| =15\) in which the available budget for IFRLN design (BU) is equal to 40,000. The optimal solution for this hypothetical example is achieved by using the CPLEX solver. In the obtained optimal solution, the available capacities for forwarding finished products in open distribution centers/hybrid facilities and handling used products in open collection centers/hybrid facilities are 23,991 and 19,019, successively. For the presented example, these results can highlight the financial importance of reverse and forward networks in the IFRLN in comparison to each other.

The financial benefits from forward and reverse networks in the IFRLN are sensitive to costs of buying raw materials. In the example, to investigate this issue, we consider a multiplier coefficient for these costs and change their values to perform a sensitivity analysis. The optimal capacities for forwarding finished products and collecting used products with different values for costs of buying raw materials are illustrated in Fig. 12.

Fig. 12
figure 12

Sensitivity of optimal IFRLN capacities to costs of buying raw materials

As demonstrated in Fig. 12, in the example, when costs of buying raw materials increase, the optimal capacities for forwarding finished products decrease and for collecting used products increase. In the IFRLN, both of forward and reverse networks have main impacts on the network’s net income and in some situations, the investment on reverse network is more beneficial than forward one and vice versa.

In the hypothetical example, Fig. 13 illustrates changes of optimal acquisition prices offered to customer zone 4 for collecting used product 3 with different quality levels over the planning horizon.

Fig. 13
figure 13

Changes of acquisition prices offered to customer zone 4 for collecting used product 3

Fig. 14
figure 14

Sensitivity of the optimal objective value to parameter \(\eta \)

Fig. 15
figure 15

Sensitivity of the optimal objective value to parameter \({\eta }^{\prime }\)

In order to investigate, the importance of dynamic pricing in comparison with static one, we can add constraints (31) into the model. Here, \(\eta \) is a possible change in acquisition prices between two consequent tactical periods and sensitivity of optimal objective value to \(\eta \) is illustrated in Fig. 14 for the hypothetical example. For more information about the importance of dynamic pricing in comparison with static pricing, one can refer to Keyvanshokooh et al. (2013).

$$\begin{aligned}&\left( B_{i,m,q}-{A_{i,m,q} } \right) \left| {\sum _{l\in L} {\theta _{i, m, q, l}^{t+1} \left( {\frac{l-1}{\left| L \right| -1}} \right) } -\sum _{l\in L} {\theta _{i, m, q, l}^t \left( {\frac{l-1}{\left| L \right| -1}} \right) } } \right| \le \eta \nonumber \\&\quad \forall i\in C,\,\forall m\in M,\,\forall q\in Q,\,\forall t\in T\backslash \left\{ {NT} \right\} \end{aligned}$$
(31)

Moreover, we can impose constraints (32) into the model to prevent from offering different acquisition prices to different customer zones for collecting used products. Here, for each used product, \({\eta }^{\prime }\) is a possible difference between acquisition prices offered to two different customer zones.

$$\begin{aligned}&\left| {\sum _{l\in L} {\theta _{i, m, q, l}^t \left( {\frac{l-1}{\left| L \right| -1}} \right) \left( {A_{i,m,q} -B_{i,m,q} } \right) } -\sum _{l\in L} {\theta _{{i}^{\prime }, m, q, l}^t \left( {\frac{l-1}{\left| L \right| -1}} \right) } \left( {A_{i,m,q} -B_{i,m,q} } \right) } \right| \nonumber \\&\quad \le {\eta }^{\prime } \qquad \forall i,{i}^{\prime }\in C, \forall m\in M, \forall q\in Q, \forall t\in T \end{aligned}$$
(32)

In Fig. 15, the sensitivity of objective value to \({\eta }^{\prime }\) is illustrated.

8 Conclusion

In this paper, a multi-commodity, multi-period IFRLN design problem under demand and return uncertainty is proposed. This stochastic optimization problem, in contrary to previous studies in this area, assumed that the amounts of returned products with different quality levels are dependent on acquisition prices in the reverse direction. An MILP model using two-stage stochastic programming was developed for the problem. The model simultaneously considered the strategic and tactical planning decisions in the IFRLN. Location decisions of distribution centers, collection centers, and hybrid facilities as strategic decisions and acquisition price decisions as tactical decisions were considered as first stage decisions. Other tactical decisions were considered scenario-dependent and as second stage decisions. The presented model can only be solved to optimality for small- and medium-sized test problems using CPLEX solver. To cope with the problem’s intractability, a simulation-based SA algorithm was proposed. Furthermore, to deal with uncertain parameters and construct an efficient fan of scenarios, the LHS and backward scenario reduction technique were applied.

In the computational results, it was shown that the gap between the optimal objective value and the approximate one obtained by using the simulation-based SA was reasonably low. In addition, the scenario generation method had a good performance in terms of in-sample and out-of-sample stability. Meaningful values for EVPI and VSS in several test problems illustrated the main importance of considering uncertainty in the IFRLN design. We also discussed about the significance of the proposed model and driven some managerial insights using several numerical experiments including sensitivity analysis on the main parameters of the problem.

In the simulation phase of the presented solution approach, LP problems should be solved using an LP solver for each scenario and hence, proposing a heuristic method to determine the approximate objective function and second stage decisions without solving LP problems can improve the solution approach in terms of the computational time. As another future research direction, one can examine the optimization problem in which a normal distribution function defines the expected price of each customer zone for one unit of each used product. Moreover, there are also many opportunities to extend exact, heuristic and other meta-heuristic solution approaches for the IFRLN design under uncertainty. Finally, an interesting extension would be to consider environmental and social aspects in the problem’s objective or constraints to have a sustainable logistics network.