1 Introduction

Delivering at a desired service level in some systems such as postal and cargo delivery plays a key role in the logistics of commodities. Unlike most previous works which did not consider time-definite delivery service in supply chain networks, it is becoming more and more difficult to ignore it. Moreover, high-speed delivery with no delay affects the competitive environment, hindering those who does not adapt. However, the major difficulty of these applications is to determine the network topology and scheduled departure time from the origins. So, developing a proper model is a must to obtain a high-quality distribution strategy. For the time-sensitive commodities, the distribution topology of the network highly affects the distribution time and cost. Hub location problem (HLP) is a special kind of network design problem, which is applied on the transportation problems (Campbell 2009; Song and Carter 2008; Gelareh and Nickel 2011), delivery systems (Liu et al. 2003; Tan and Kara 2008; Yaman et al. 2007), and telecommunications (Kewcharoenwong and Üster 2014; Carello et al. 2004). Readers interested in the literature dealing with HLP are referred to O’Kelly and Miller (1994), Campbell (1994a, (2002), Klincewicz (1998), Bryan and O’Kelly (1999), Alumur and Kara (2008) and Farahani et al. (2013), and the references cited therein.

The most important task in transshipment network is to pass the flow between origins and destinations. To achieve this goal, there are two alternatives: one is to connect each origin and destination node to each other, which is not efficient and it is very costly. The other way is using hubs to consolidate the flow, which is received from origins and redistribute it to destination nodes directly (if the destination node is allocated to this hub) or other hubs. In the common form of HLP, there are supply or demand nodes. It is assumed that there is a positive flow between nodes according to the real practice. The advantage of using a hub network is that economies of scale can be achieved by combining the flow into an integral whole.

Three assumptions are regularly stated in the classical version of HLP. First of all, the hubs are completely interconnected. Second, the discount factor of using a hub link is between zero and one. Finally, there is no direct connection between non-hubs. The first and last assumptions are relaxed in this paper. An incomplete hub network is designed by the relaxation of the first assumption. For the incomplete hub network topology, as Karimi and Setak (2014) categorized, there are four types, including tree, ring, special, and general topologies. The suggested model in this paper presents the general incomplete hub network. Additionally, other topologies are also discussed to get better interpretation.

In this paper, postal and cargo delivery systems, which are the major applications of hub network design are addressed. In these systems, questions are raised about the location of hubs, routing the flows, designing the network, and scheduling departure time from the origins to reach the destinations on time. However, far too little attention has been paid to this case of HLP, which considers locating, routing, and scheduling problems together in an integrated model.

The flow shipment scheduling in some applications such as postal delivery systems has motivated us to propose a mathematical model for an incomplete hub network. The flow shipment scheduling in an incomplete hub location-routing problem (FSSIHLRP) tries to minimize the routing cost of flows and the fixed costs. Fixed costs include the cost of opening hubs, the cost of establishing hub links (i.e. an edge connecting one hub to another hub), spoke links (i.e. an edge connecting a non-hub to a hub), and non-hub link (i.e. an edge connecting a non-hub to another non-hub). Some preprocessing is used to improve the solution time.

The rest of the paper is organized as follows. Next section illustrates the related background of FSSIHLRP. Section 3 describes the problem and the related model. In Sect. 4, some preprocessing steps are introduced for the model to eliminate some of the variables from the formulation. Characterization and evaluation of the preprocessing steps are studied in Sect. 5. Finally, concluding remarks are given in the last section.

2 Previous related research

To provide a high-level service, FSSIHLRP considers the general topology for hub network design, non-stop or direct shipment, and time-definite delivery system. In this section, after introducing the primary papers in the literature, the general network topology of HLP is reviewed. Next, the literature of the direct shipment in HLP is surveyed. Finally, the commodity time-definite delivery is studied.

The first mathematical formulation for HLP was introduced by O’kelly (1986). Then, he investigated a quadratic mathematical model for a single allocation hub median location problem (O’kelly 1987). Campbell (1994b) reported the capacitated, median, covering, and center HLP. The rest of the lengthy and rich literature of HLP includes developing formulations and solution methods.

General hub network topology In the general topology, no network structure other than connectivity is imposed on the hub network. Moreover, non-hub links can be used in this topology. To the best knowledge of the authors, the general topology of HLP was suggested by Yoon et al. (2000) in the telecommunication area. They did not consider the fixed cost of opening the hubs. In addition, they used a hierarchical network, but the direct shipment was not allowed. Nickel et al. (2001) proposed a mathematical model for HLP in the urban public transportation problem, which considers fixed costs of opening the hubs, the cost of establishing hub links, spoke links, and flow shipment. Nevertheless, their model did not take the fixed costs of the non-hub links into account. Our proposed model is an extension of their model. Yoon and Current (2008) subsequently provided another model for the general topology, but the routing costs were not discussed in their model. Gelareh and Nickel (2011) developed Nickel et al. (2001) model and reduced the dimension of the problem by decreasing the number of the variables and constraints. Their model was referred to as HLPPT (i.e. HLP for public transportation). It should be noted that their model which was customized for the transportation problem cannot design a network with only one hub. In addition, they did not address the direct shipment and non-hub link cost in their model. Catanzaro et al. (2011) prepared a mathematical model, which clustered the network into some sub-networks in which at least one hub should be located at least a hub in each of them, and found the routing of the flows to minimize flow shipment cost. A single allocation incomplete hub covering network with fuzzy travel times was suggested by Davari et al. (2013). In their model, the objective function maximized the minimum covered flows on the network. In addition, the direct shipment was not permitted in the model. Setak et al. (2013) suggested a model for hub location-routing problem, in which the general hub network topology was used. To consider the flow shipment scheduling in this network structure, we propose a mathematical formulation in this paper. As another related research, Karimi and Setak (2014) presented a model which minimized the fixed costs of opening hubs, establishing hub links, spoke links, and flow shipment costs in an incomplete network.

Direct shipment As one of the elementary assumptions in this area, the non-stop shipment is not allowed in HLP. However, Aykin (1994) relaxed this assumption for the first time. Sung and Jin (2001) provided another model of direct shipment using multiple allocation strategy. Furthermore, more researchers such as Nickel et al. (2001), Yoon and Current (2008), Catanzaro et al. (2011), and Mahmutoğulları et al. (2015) have employed the non-stop shipment in their models.

Time-definite delivery system In this delivery system, travel time of each O–D pair is guaranteed with a desired service level. Service time for the commodity shipment is studied in different distribution strategies. Lin and Chen (2004) considered a generalized capacitated hub-and-spoke network and provided two transportation modes, which satisfied a chosen service level. Tan and Kara (2008) presented an integer programming formulation for cargo delivery system and implemented it for the Turkish network. Moreover, their model determined the departure time of the vehicles from hub to hub and hub to non-hub in a single allocation complete hub network topology. Yaman et al. (2007) proposed a mathematical model for the ground transportation cargo delivery systems. They also concentrated on the departure time and the number of stops between each O–D. Chen et al. (2008) considered the routing decision and tree topology over an existing network, while the model minimized the violations of the predefined service time. As another example, Campbell (2009) constructed a model of the time-definite transportation. While guaranteeing a desired service time, he considered the total routing cost, subject to the feasible assignments. Karimi and Bashiri (2011) proposed four mathematical formulations, in which the travel time was limited on each connection. Ishfaq (2012) discussed the design of a less than truck load logistics and offered multiple delivery time services. Yaman et al. (2012) introduced a model, called release time scheduling hub location problem (RSHL), for cargo delivery systems, which decided on the release time of vehicles from each node for the next-day delivery service. Peker and Kara (2015) proposed a p-hub maximal covering problem to find the best locations for the hubs to maximize the satisfied demands within a coverage bound.

A broad categorization of the literature is illustrated in Table 1 by considering seven characteristics, namely hub network structure, network fixed costs, allocation strategy, number of hubs, direct shipment, time-definite delivery system, and flow shipment scheduling. The attributes of FSSIHLRP are also presented in the last row.

Table 1 Review of the existing literature

To help filling the shown gap in Table 1, in this study, an incomplete hub location-routing problem with a general hub network topology is proposed. There is no capacity restriction for FSSIHLRP which is focused on the scheduling of leaving times from the origins and hubs to destinations. This problem fits in the classes of general hub network topology, non-stop shipment, and time-definite delivery systems.

3 Problem definition and formulation

3.1 Description of the problem

The study of commodity delivery systems plays an important role in the supply chain network. On time delivery is the key factor for the success of such systems. As is known, the transportation systems and delivery of commodities are the applications of HLP. In other words, HLP is one of the most important tools for the transportation network design.

It is supposed that a delivery company wants to meet a desired service time using an appropriate transportation network. The company should know about the components of the transportation network such as location of hubs and non-hubs, commodities routing from origins to destinations, scheduled departing time from origins, and departure time from the hub nodes.

In the light of the described problem, in this section, a hub location network is formulated as FSSIHLRP. The resulted network consists of hubs, non-hubs, hub links, spoke links, and non-hub links. As stated before, delivery systems intend to ship the commodity within the qualified service time. In these systems, commodities travel from/to an origin via hub/hubs or directly to a destination. Three important variables (i.e. location of hubs, routing of the flows, and planning the departure times) are determined. The objective of FSSIHLRP is to minimize the fixed cost of designing the network and the routing cost of flows. The advantage of using FSSIHLRP over the related literature is that FSSIHLRP integrates the planning issues with an incomplete hub network design. The following assumptions are considered in FSSIHLRP.

Assumptions

  • The locations of customers, their flows and distances, fixed costs of opening hubs, hub links, spoke links, and direct links are given.

  • Fixed costs of opening hub links, spoke links, and direct links are symmetric.

  • The travel cost graph is non-directional and satisfies the triangle inequality.

  • Each node can potentially be a hub.

  • There is an advantage for routing streams through hubs.

  • The number of hubs is unknown, which means model will decide on that.

  • Direct connections between non-hubs are allowed.

  • An incomplete hub network is considered.

  • A discount factor is used to take advantage of hub links in the network.

  • The multiple allocation strategy is employed.

  • The direct link between two points yields the non-stop travel between them.

  • This model restricts the travel time with a certain service level. Flows departing from a hub need to wait for all the required incoming flows.

  • The scheduled departure times from the origins should be no later than the specified times.

  • At certain service time, at least predefined units of flow should be delivered to the destinations.

3.2 FSSIHLRP formulation

FSSIHLRP is formulated by the following set, parameters, decision variables, objective function, and constraints.

Set

N :

Set of all nodes indexed by i, j, k, and m

Parameters

\(\alpha \) :

Cost discount factor (\(0\le \alpha \le 1)\); it is used in the travelling cost for hub links,

L :

Number of days per year which is equivalent to 365,

T :

Maximum coverage service level for travelling time between all pairs,

\(c_{ij}\) :

Transfer annual cost per unit of flow from nodes i to j,

\(t_{ij}\) :

Travel time from nodes i to j,

\(F_k \) :

Fixed annual average cost of opening a hub at node k,

\(\mathrm{FH}_{km} \) :

Fixed annual average cost of establishing a hub link between nodes k and m,

\(\mathrm{FS}_{ij} \) :

Fixed annual average cost of establishing a spoke link between nodes i and j,

\(\mathrm{FD}_{ij} \) :

Fixed annual average cost of establishing a direct link between nodes i and j,

\(w_{ij} \) :

Daily average amount of flow which is transported between nodes i to j,

\(\gamma _i \) :

The time which the scheduled leaving time from node i should not be later than,

\(\sigma _i \) :

Minimum daily flow which is picked from node i,

\(\mathrm{st}_k \) :

Average service time at hub node k,

\(f_i \left( {s_i } \right) \) :

Flow arrival function which calculates the amount of flow accumulated at node i until \(s_i \); logically, it should be a non-decreasing function.

Decision variables

\(a_{ij}^{km}\) :

\(\left\{ {\begin{array}{*{20}c} 1 \\ 0 \\ \end{array} } \right. \,\,\begin{array}{*{20}l} {{\text {If the flow originated at}}\,i\,{\text {and destined to }}_{{{{j}}\,}} {\text {is routed via hubs}}\,k\,{\text {and}}\,m,}\\ {{\text {Otherwise}}} \\ \end{array}\)

\(e_{ij}^k \) :

\( \left\{ {\begin{array}{*{20}c} 1 \\ 0 \\ \end{array} } \right. \,\,\begin{array}{*{20}l} {{\text {If the flow originated at}}\,{\text {non-hub}}\,{\text {and destined to }}_{{{{j}}\,}} {\text {is routed via hub}}\,k,} \\ {{\text {Otherwise,}}} \\ \end{array}\)

\(f_{ij}^k\) :

\( \left\{ {\begin{array}{*{20}c} 1 \\ 0 \\ \end{array} } \right. {} {} \begin{array}{*{20}l} {{\text {If the flow originated at}}{} \,i{\text { and destined to non-hub }}_{{{{j}}{} }} {\text {is routed via hub }}{} k,} \\ {{\text {Otherwise,}}} \\ \end{array}\)

\(g_{ij}\) :

\(\left\{ {\begin{array}{*{20}c} 1 \\ 0 \\ \end{array} } \right. \,\,\begin{array}{*{20}l} {{\text {If }}i{\text { is connected and directly routed to }}_{{{{j}}\,}}{\text {; only one of them is a hub}},} \\ {{\text {Otherwise,}}} \\ \end{array}\)

\(b_{ij}\) :

\(\left\{ {\begin{array}{*{20}c} 1 \\ 0 \\ \end{array} } \right. \,\,\begin{array}{*{20}l} {{\text {If }}i{\text { is connected and directly routed to }}_{{{{j}}\,}}{\text {; none of them is a hub}},} \\ {{\text {Otherwise,}}} \\ \end{array}\)

\(z_{km}\) :

\(\left\{ {\begin{array}{*{20}c} 1 \\ 0 \\ \end{array} } \right. \,\,\begin{array}{*{20}l} {{\text {If hub nodes }}k{\text { and }}m{\text { are directly connected}},} \\ {{\text {Otherwise,}}} \\ \end{array}\)

\(y_k\) :

\(\left\{ {\begin{array}{*{20}c} 1 \\ 0 \\ \end{array} } \right. \,\,\begin{array}{*{20}l} {{\text {If }}k{\text { is hub}},} \\ {{\text {Otherwise,}}} \\ \end{array}\)

\(s_{i}\) :

Scheduled leaving time from node i,

\(D_{kj}\) :

Departure time from hub k to nodej,

To clarify the performances of the binary variables, all of them are depicted in Fig. 1, except \(y_k\). In this figure, the circles and rectangles show the non-hub and hub nodes, respectively. In addition, the line and dash line arrows, respectively, illustrate the direct links and paths between the nodes.

Fig. 1
figure 1

Non-zero binary variables in the path for sending flow between i and j

FSSIHLRP

$$\begin{aligned}&\min \sum \limits _i {\sum _j {\sum _k {\sum _m {L\alpha c_{km} w_{ij} } } } } a_{ij}^{km} +\sum _i {\sum _j {\sum _k {Lw_{ij} (c_{ik} e_{ij}^k +c_{kj} f_{ij}^k )} } } +\sum _i {\sum _j {Lc_{ij} w_{ij} g_{ij} } } \nonumber \\&\quad +\sum _i {\sum _j {Lc_{ij} w_{ij} b_{ij} } } +\sum _k {F_k y_k } +\sum _k {\sum _{m>k} {FH_{km} z_{km} } } +\sum _i {\sum _{j>i} {FS_{ij} g_{ij} } } +\sum _i {\sum _{j>i} {\mathrm{FD}_{ij} b_{ij} } } \end{aligned}$$
(1)
$$\begin{aligned}&\quad \sum _{m\ne i} {a_{ij}^{im} } +\sum _{m\ne i} {e_{ij}^m } +g_{ij} +b_{ij} =1 \quad \forall i,j,(i\ne j)\end{aligned}$$
(2)
$$\begin{aligned}&\quad \sum _{m\ne j} {a_{ij}^{mj} } +\sum _{m\ne j} {f_{ij}^m } +g_{ij} +b_{ij} =1 \quad \forall i,j,(i\ne j)\end{aligned}$$
(3)
$$\begin{aligned}&\quad \sum _{m\ne i,k} {a_{ij}^{km} } +f_{ij}^k -\sum _{m\ne j,k} {a_{ij}^{mk} } -e_{ij}^k =0 \quad \forall i,j,k,(i\ne j,i\ne k,j\ne k)\end{aligned}$$
(4)
$$\begin{aligned}&e_{ij}^k \le 1-y_i \quad \forall i,j,k,(i\ne j)\end{aligned}$$
(5)
$$\begin{aligned}&f_{ij}^k \le 1-y_j \quad \forall i,j,k,(i\ne j)\end{aligned}$$
(6)
$$\begin{aligned}&e_{ij}^k +\sum _{m\ne j,k} {a_{ij}^{mk} } \le y_k \quad \forall i,j,k,(i\ne j,i\ne k,j\ne k)\end{aligned}$$
(7)
$$\begin{aligned}&f_{ij}^k +\sum _{m\ne i,k} {a_{ij}^{km} } \le y_k \quad \forall i,j,k,(i\ne j,i\ne k,j\ne k)\end{aligned}$$
(8)
$$\begin{aligned}&g_{ij} +2a_{ij}^{ij} +\sum _{m\ne i,j} {a_{ij}^{im} } +\sum _{m\ne i,j} {a_{ij}^{mj} } \le y_i +y_j \quad \forall i,j,(i\ne j) \end{aligned}$$
(9)
$$\begin{aligned}&g_{ij} \le 2-y_i -y_j \quad \forall i,j,(i\ne j)\end{aligned}$$
(10)
$$\begin{aligned}&a_{ij}^{km} +a_{ij}^{mk} \le z_{km} \quad \forall i,j,k,m,(i\ne j,k<m)\end{aligned}$$
(11)
$$\begin{aligned}&b_{ij} \le 1-y_i \quad \forall i,j,(i\ne j) \end{aligned}$$
(12)
$$\begin{aligned}&b_{ij} \le 1-y_j \quad \forall i,j,(i\ne j)\end{aligned}$$
(13)
$$\begin{aligned}&e_{ij}^k \le g_{ik}\quad \forall i,j,k,(i\ne j) \end{aligned}$$
(14)
$$\begin{aligned}&f_{ij}^k \le g_{kj}\quad \forall i,j,k,(i\ne j) \end{aligned}$$
(15)
$$\begin{aligned}&g_{ij} -g_{ji} =0 \quad \forall i,j,(i<j) \end{aligned}$$
(16)
$$\begin{aligned}&b_{ij} -b_{ji} =0 \quad \forall i,j,(i<j) \end{aligned}$$
(17)
$$\begin{aligned}&z_{km} -z_{mk} =0 \quad \forall k,m,(k<m)\end{aligned}$$
(18)
$$\begin{aligned}&s_i \le \gamma _i \quad \forall i \end{aligned}$$
(19)
$$\begin{aligned}&f_i (s_i )\ge \sigma _i \quad \forall i \end{aligned}$$
(20)
$$\begin{aligned}&(e_{ij}^k +b_{ij} g_{ik} )(s_i +st_k y_k +t_{ik} g_{ik} )\le D_{kj} \quad \forall i,j,k,(i\ne j)\end{aligned}$$
(21)
$$\begin{aligned}&a_{ij}^{km} (st_m y_m +D_{kj} +t_{km} z_{km} )\le D_{mj} \quad \forall i,j,k,m,(i\ne j),(k\ne m) \end{aligned}$$
(22)
$$\begin{aligned}&f_{ij}^m (t_{mj} +D_{mj} )\le T \quad \forall i,j,m,(i\ne j) \end{aligned}$$
(23)
$$\begin{aligned}&a_{ij}^{km} \in \left\{ {0,1} \right\} \quad \forall i,j,k,m \end{aligned}$$
(24)
$$\begin{aligned}&e_{ij}^k \in \left\{ {0,1} \right\} \quad \forall i,j,k \end{aligned}$$
(25)
$$\begin{aligned}&f_{ij}^k \in \left\{ {0,1} \right\} \quad \forall i,j,k \end{aligned}$$
(26)
$$\begin{aligned}&b_{ij} \in \left\{ {0,1} \right\} \quad \forall i,j \end{aligned}$$
(27)
$$\begin{aligned}&g_{ij} \in \left\{ {0,1} \right\} \quad \forall i,j \end{aligned}$$
(28)
$$\begin{aligned}&z_{km} \in \left\{ {0,1} \right\} \quad \forall k,m \end{aligned}$$
(29)
$$\begin{aligned}&y_k \in \left\{ {0,1} \right\} \quad \forall k \end{aligned}$$
(30)
$$\begin{aligned}&s_i \ge 0 \quad \forall i \end{aligned}$$
(31)
$$\begin{aligned}&D_{kj} \ge 0 \quad \forall i,k \end{aligned}$$
(32)

Objective function (1) minimizes four cost elements of the problem: (1) Delivery cost (i.e. the 1\(\mathrm{st}\), 2\(\mathrm{nd}\), 3\(\mathrm{rd}\), and 4\(\mathrm{th}\) parts of the objective function); (2) Fixed cost of opening hubs (i.e. the 5\(\mathrm{th}\) part of the objective function); (3) Fixed cost of establishing hub links (i.e. the 6\(\mathrm{th}\) part of the objective function); (4) Fixed cost of establishing direct and spoke links (i.e. the last part of the objective function). Constraints (2), (3), and (4) are the flow conservation constraints. In fact, these constraints hold the flow balance on O–D pairs. Constraints (5) and (6) guarantee that, for \(e_{ij}^k \) and \(f_{ij}^k \), i and j should be non-hub nodes, respectively. Constraints (7) and (8) ensure that flows can move through a node to reach the destination when the node is a hub. Constraints (9) illustrate that employing a link for moving flows from an origin to a destination depends on their character (i.e. being hub nodes or not). When only one of i or j is hub, constraints (10) ensure that \(g_{ij}\) can be one. It should be noted that, when none of i and j is hub, constraints (9) force \(g_{ij} \) to be zero. Therefore, constraints (10) work correctly in this case. The constraints (11) guarantee that a hub link should exist before being used in any flow path. Constraints (12) and (13) demonstrate that \(b_{ij}\) can be non-zero when i and j are not hub. Constraints (14) and (15) check that sending or receiving flows between two non-hubs needs a spoke link. Constraints (16)–(18) illustrate that the receiving path is the same as the sending path. Constraints (19) ensure that the flows should not be sent earlier than the latest time. Constraints (20) show that at least the minimum flow, picked from a node, should be delivered to the destination with the desired service time in a day. Constraints (21) and (22) illustrate that, to depart from a hub node toward the destinations, a vehicle should wait for all the vehicles to arrive from the nodes that are routed via this hub to the destinations. Constraints (23) express that the accomplishment time of any delivery service should happen by the predetermined service time. Finally, constraints (24)–(32) show the domain of the variables.

3.3 Linearization

This subsection presents linearization for the non-linear constraints (i.e. (21), (22), and (23)), presented in FSSIHLRP. A Big M-type linearization method is suggested for the mentioned non-linear constraints as follows:

$$\begin{aligned}&s_i +t_{ik} g_{ik} +st_k y_k -(2-g_{ik} -b_{ij} -e_{ij}^k )BN_1 \le D_{kj} \qquad \quad \forall i,j,k,(i\ne j)\end{aligned}$$
(33)
$$\begin{aligned}&D_{kj} +st_m y_m +t_{km} z_{km} -(1-a_{ij}^{km} )BN_2 \le D_{mj} \qquad \qquad \qquad \forall i,j,k,m,(i\ne j),(k\ne m)\nonumber \\ \end{aligned}$$
(34)
$$\begin{aligned}&D_{mj} +t_{mj} f_{ij}^m \le T \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \forall i,j,m,(i\ne j) \end{aligned}$$
(35)

New linear constraints (33), (34), and (35) are used instead of (21), (22), and (23), respectively. In these new constraints, \(BN_1\) and \(BN_2 \) play the role of big numbers. \(BN_1\) is set at \(\mathop {\max t_{ij} }\limits _{i,j} +\mathop {\max s_k }\limits _k +\mathop {\max \gamma _i }\limits _i \) and \(BN_2 \) is set at T. These values are chosen to tighten the constraints.

Theorem 1

Constraints (33), (34), and (35) correctly linearize (21), (22), and (23), respectively.

First, observe that all the delivery times should be within T. Therefore, (35) is a correct linearization for (23). From constraints (2), (3), and (4), for (33), there are three cases if \(g_{ik} =1\).

Case 1 \(b_{ij} =1\)and \(e_{ij}^k =0\). In this case, both (33) and (21) yield the same left-hand sides.

Case 2 \(b_{ij} =0\) and \(e_{ij}^k =1\). This case is similar to case 1.

Case 3 \(b_{ij} =0\) and \(e_{ij}^k =0\). The left-hand side of (21) yields zero and the left-hand side of (33) returns a negative value. Therefore, as \(D_{kj} \) is non-negative, the two constraints work similarly in this case.

In addition, there are three cases if \(g_{ik} =0\). These cases are the same as the previous ones. However, the results of these cases when \(g_{ik} \) is zero are the same as those of the third case when \(g_{ik} \) is one. Therefore, (33) is a correct linearization for (21). Finally, for (34), there are two cases.

Case 1 \(a_{ij}^{km} =1\). In this case, both (34) and (22) yield the same right-hand sides.

Case 2 \(a_{ij}^{km} =0\). The left-hand side of (22) returns zero. Additionally, the left-hand side of (34) yields a negative value. However, as \(D_{mj} \) is non-negative, (34) is a true linearization for (22). \(\square \)

3.4 Piecewise linear flow arrival function

FSSIHLRP is for an arbitrary flow arrival function. This function demonstrates the flows come to origins. As presented before, it should logically be a non-decreasing function. As observed in Fig. 2, the flow arrival function is a piecewise linear and convex function if linear interpolation is used between two points within the time bounds. Moreover, the piecewise linear function can be an approximation of nonlinear arrival pattern, which is more used in the real applications such as postal delivery systems. Clearly, a piecewise linear function includes several parts with a unique constant and coefficient in each part, as shown in Fig. 2. In the approximations of a nonlinear flow arrival function, by growing the number of parts, the error is reduced:

Fig. 2
figure 2

A sample piecewise linear pattern

In Fig. 2, the origin contains three parts for piecewise linear function. The notations used in this figure are also defined in the rest of the paper. The piecewise linear flow arrival function is formulated using the following new notations:

Set

\(M_i\) :

Set of all piecewise linear parts for node i indexed by\(p_i \),

Parameters

\(\mathrm{co}_{ip_i } \) :

Coefficient of part p for node i,

\(\mathrm{con}_{ip_i } \) :

Constant of part p for node i,

\(\mathrm{up}_{ip_i } \) :

The upper bound of part p for node i,

\(\mathrm{lo}_{ip_i } \) :

The lower bound of part p for node i.

Decision variables

\({s}'_{ip_i } \) :

Scheduled leaving time from nodei at the time related to part \({{p}}_i \),

\(h_{ip_i }\) :

\( \left\{ {\begin{array}{*{20}c} 1 \\ 0 \\ \end{array} } \right. \,\,\begin{array}{*{20}l} {{\text {If part }}{{p}}_{i} \,{\text {is}}\,{\text {selected}}\,{\text {for}}\,{\text {node}}\,i,} \\ {{\text {Otherwise,}}} \\ \end{array}\)

Constraints instead of (19) and (20):

$$\begin{aligned}&\sum _{p_i } {{s}'_{ip_i } } \le \gamma _i \quad \forall i\end{aligned}$$
(36)
$$\begin{aligned}&\sum _{p_i } {(\mathrm{co}_{ip_i } {s}'_{ip_i } +h_{ip_i } \mathrm{con}_{ip_i } )} \sum _j {w_{ij} } \ge \sigma _i \quad \forall i\end{aligned}$$
(37)
$$\begin{aligned}&{s}'_{ip_i } \le \mathrm{up}_{ip_i } h_{ip_i } \quad \forall i,p_i \end{aligned}$$
(38)
$$\begin{aligned}&\mathrm{lo}_{ip_i } h_{ip_i } \le {s}'_{ip_i } \quad \forall i,p_i \end{aligned}$$
(39)
$$\begin{aligned}&\sum _{p_i } {h_{ip_i } } =1 \quad \forall i\end{aligned}$$
(40)
$$\begin{aligned}&\sum _{p_i } {{s}'_{ip_i } } =s_i \quad \forall i \end{aligned}$$
(41)

3.5 Extensions to FSSIHLRP

In the previous paragraphs, a basic model called FSSIHLRP was suggested. Now, we introduce other extensions. As is known, in the general topology, the multiple allocation strategy is used for connecting non-hubs to hubs. To employ a single allocation policy, constraints (42) should be added to FSSIHLRP.

$$\begin{aligned} \sum _{j\ne i} {g_{ij} } \le 1\quad \forall i \end{aligned}$$
(42)

Another well-known hub network topology is the tree of hubs. This type has been introduced and used in a few articles in this area (Chou 1990; Kim and Tcha 1992; Contreras et al. 2009). In this hub network topology, there is just one route to transport from one hub to the other hubs. Adding constraint (43) to the model changes the hub network to a tree topology. Constraints (18) cause the right side of this constraint to be doubled.

$$\begin{aligned} \sum _k {\sum _{m\ne k} {z_{km} } } =2\left( \sum _k {y_k } -1\right) \end{aligned}$$
(43)

The ring of hub topology is another network design, which is commonly used in telecommunication systems (Lee et al. 1993; Chiu et al. 1995; Contreras et al. 2009). In this design, each hub node is just connected to two hubs. Using constraints (44) helps FSSIHLRP use the ring topology.

$$\begin{aligned} \sum _{m\ne k} {z_{km} } =2y_k \quad \forall k \end{aligned}$$
(44)

Sometimes, decision makers need to limit the number of stopping in the flow shipment path between each O–D pair. By employing a new parameter, namely \(q_{ij} \), a new constraint (45) should be attached to the model. \(q_{ij} \) is the maximum number of stopping hub nodes to shipping flow from origin i to destination j.

$$\begin{aligned} \sum _k {\sum _{m\ne k} {a_{km}^{ij} } } \le q_{ij} \quad \forall i,j,i\ne j \end{aligned}$$
(45)

Furthermore, a famous type of hub location is p-HLP. In this type, decision makers or owners of the network decide to use only a predetermined number of hubs, called p. Clearly, adding constraints (46) to FSSIHLRP yields FSSI p-HLRP.

$$\begin{aligned} \sum _k {y_k } =p \end{aligned}$$
(46)

4 Preprocessing

Before solving any FSSIHLRP, we perform preprocessing steps to remove some of the variables. These are planned to cut the size of the problem and to tighten the linear relaxation. In addition, they are run by taking advantage of the following properties.

Property 1

In any optimal solution to the model, for all i and \(p_i \) with \(\mathrm{up}_{ip_i } \mathrm{co}_{ip_i } +\mathrm{con}_{ip_i } <\frac{\sigma _i }{\sum \limits _j {w_{ij} } }\), \(h_{ip_i } =0\).

Proof

Assume that for some i and \(p_i \), \(h_{ip_i } =1\). As stated by constraints (38), the upper bound of \({s}'_{ip_i }\) is \(\mathrm{up}_{ip_i } \). Therefore, when \(\sum \limits _j {w_{ij} } (\mathrm{up}_{ip_i } \mathrm{co}_{ip_i } +\mathrm{con}_{ip_i } )\) is less than the minimum flow, which is picked from node i, constraints (37) will be violated, which guarantees that \(h_{ip_i }\) should be zero. \(\square \)

Corollary 1

It is the corollary of property 1 and constraints (39). For all i, \(\mathrm{lp}_i \) is defined as the last \(p_i \) where \(h_{ip_i } =0\). The minimum feasible value for \(s_i \) is \(\mathrm{lo}_{i_{\mathrm{lp}_{i} +1}} \), which is called \(\mathrm{slo}_i \).

Property 2

In any optimal solution to the model, for all ik, and m with \(\mathrm{slo}_i +t_{ik} +\mathrm{st}_k +\mathrm{st}_m +t_{km} >T\), \(a_{ij}^{km} =0\) for all j.

Proof

Without loss of generality, we simply assume that k and mare hub nodes and i is connected to k. Therefore, as shown by constraints (33), \(D_{kj} \)will be \(\mathrm{slo}_i +t_{ik} +st_k \). According to constraints (34), if \(a_{ij}^{km} =1\), then \(D_{mj}\) will be \(\mathrm{slo}_i +t_{ik} +st_k +st_m +t_{km} \). If \(D_{mj}\) is greater than the maximum coverage service time, constraints (35) will be violated. To capture the feasible solution, in this case, \(a_{ij}^{km}\) should be zero. \(\square \)

Property 3

In any optimal solution to the model, for all ij, and k with \(\mathrm{slo}_i +t_{ik} +st_k +t_{kj} >T\), \(e_{ij}^k =0\) and \(f_{ij}^k =0\).

Proof

Assume that, for some ij, and k, \(e_{ij}^k =1\) and \(f_{ij}^k =1\). It means that i and j are connected to the same hub (i.e. k) and there is no direct link between them. According to constraints (33), \(D_{kj} \) will be \(\mathrm{slo}_i +t_{ik} +st_k \). Moreover, as demonstrated by constraints (34), \(D_{kj} =D_{mj} \), since for this i and j, all \(a_{ij}^{km} \) is zero; therefore, the left-hand side of constraints (35) will be \(\mathrm{slo}_i +t_{ik} +st_k +t_{kj} \). If this left-hand side is greater than the maximum coverage service time, constraints (35) will be violated. To capture the feasible solution, in this case, \(e_{ij}^k \) and \(f_{ij}^k\) should be zero. \(\square \)

Property 4

In any optimal solution to the model, for all ij, and k with \(\mathrm{slo}_i +t_{ik} +st_k +\mathop {\min }\limits _m (t_{mj} +t_{km} +st_m )>T\), \(a_{ij}^{km} =0\) for all m.

Proof

Assume that, for some ijk, and m, \(a_{ij}^{km} =1\). Clearly, the flow originated at i and delivered to j is routed via hubs k and m. Without loss of generality, we simply suppose that the flow path between i and j is two stops. Therefore, from constraints (33), (34), and (35), if \(\mathrm{slo}_i +t_{ik} +st_k +\mathop {\min }\limits _m (t_{mj} +t_{km} +st_m )\) as the worst case of routing via hubs k and m is greater than the service time, the model will be infeasible. Hence, \(a_{ij}^{km}\) should be zero in this condition. \(\square \)

5 Computational experiments

5.1 Dataset

For the computational studies, we use two commonly used test beds for HLPs. The first dataset is CAB, which is extracted from Civil Aeronautics Board for airline passengers. This set of problems was introduced by O’kelly (1987). The x-node problem, which is obtained by considering the first x nodes from all the nodes, is employed. The second dataset, called AP, was introduced into the literature by Ernst and Krishnamoorthy (1996). The AP dataset was formed by a real application in a postal delivery system. A characteristic of this dataset was that the flow matrix was not symmetrical.

Meanwhile, some parameters such as fixed costs, travel time, maximum time for leaving from the nodes, minimum flow that is picked, service time at hub nodes, and piecewise linear functions are not a part of the datasets. We produce these parameters in the following ways. The fixed costs are calculated and scaled for CAB and AP by a method derived from Gelareh (2008). We consider \(F_k =\hbox {2.5}\times \hbox {10}^{\hbox {7}}\), \(\mathrm{FH}_{km} =500c_{km} \), and \(\mathrm{FS}_{ij} =\mathrm{FD}_{ij} =200c_{km} \) as well as \(F_k =\hbox {5}\times \hbox {10}^{\hbox {3}}\), \(\mathrm{FH}_{km} =200c_{km} \), and \(\mathrm{FS}_{ij} =\mathrm{FD}_{ij} =100c_{km} \) for CAB and AP, respectively. The travel times (hour) are set at \(c_{ij} /6000\) and \(c_{ij} \) for CAB and AP, respectively. The flow is also divided by L to rescale it to daily flow. The service times are randomly generated between 0 and 1 h for these datasets. The maximum time for leaving from the nodes for all the problems is 8. The minimum flow, which is picked from each node i, randomly generated from a uniform probability distribution with the parameters (0, \(0.4\sum \limits _j {w_{ij} } )\). For each node in CAB and AP, the illustrated piecewise linear function in Fig. 3 is used. We take \(\alpha \in \left\{ {0.5,0.6,0.7,0.8,0.9} \right\} \) for CAB and AP datasets.

Fig. 3
figure 3

The employed piecewise linear function for CAB and AP

Moreover, in this study, we use a new dataset called Iranian Roads Network (IRN) which contains 20 nodes. The travel time (minute) and the distance between the cities are extracted from Google Map. Moreover, the fixed cost of hubs is calculated by the procedure introduced by Tan and Kara (2008) for the Turkish network. In this procedure, we consider only the population of each city, land price, and industrialization level of each city which are collected from Statistical Center of Iran (http://www.amar.org.ir/). The travel costs between the nodes are considered as 1 % of their distances. The flow matrix is calculated by \(w_{ij} =0.001\times P_i \frac{P_j }{\sum \limits _{k=1}^{20} {P_k -P_i } }\) for each pair of cities, in which \(P_i \) is the population of the \(i^\mathrm{th}\) city. Moreover, we consider \(\mathrm{FH}_{km} =100c_{km} \) and \(\mathrm{FS}_{ij} =\mathrm{FD}_{ij} =80c_{km} \). Furthermore, the employed piecewise linear function for IRN is shown in Fig. 3. The minimum flow picked from each city is set at 100. In addition, the maximum time for receiving flow for each city is set at 480 min. We take \(\alpha \in \left\{ {0.5,0.6,0.7,0.8,0.9} \right\} \) and \(T\in \left\{ {1620,1800,1980,2160} \right\} \). The IRN dataset is available at http://wp.kntu.ac.ir/hkarimi/files/IRN.rar Additionally, we run the model on a server with a dual core 2.53 GHz Intel processor and 4 GB of RAM and commercial solver CPLEX, version 12.2.

5.2 Preprocessing performances

We put 5400 s time bound on CPLEX. Tables 2, 3, 4, 5, 6, 7 and 8 summarize the performance results. They have four main components. The first three columns describe the problem number, discount factor, and maximum service time level. The second five columns give the optimum solution characteristics such as number of hubs, hub links, spoke links, non-hub links, and optimum objective function value (OFV). In the next two columns, the CPLEX results of the model without our suggested preprocessing are illustrated. This type of run is referred to as type 1. The number of nodes in the branch and cut tree and the solution time are reported in these columns. The final main component of these tables gives the same results which are achieved using the model with preprocessing, called type 2. In AP with \(T=40\) in each discount factor value, Model 1 cannot obtain the optimum solution in the predefined solution time bound. Therefore, the results are illustrated at 5400 s.

Table 2 Results for the CAB dataset with type 1 and type 2 (part 1)
Table 3 Results for the CAB dataset with type 1 and type 2 (part 2)
Table 4 Results for the CAB dataset with type 1 and type 2 (part 3)
Table 5 Results for the AP dataset with type 1 and type 2 (part 1)
Table 6 Results for the AP dataset with type 1 and type 2 (part 2)
Table 7 Results for the AP dataset with type 1 and type 2 (part 3)
Table 8 Results for the IRN dataset with Type 1 and Type 2

As reported in Tables 2, 3, 4, 5, 6, 7 and (8), on average, the solution times for CAB, AP, and IRN are reduced by 75.57, 80.41, and 82.96 %, respectively, using all preprocessing. Moreover, averagely, the number of nodes in the branch and cut tree is decreased by 90.87, 95.67, and 95.86 % using type 2 for CAB, AP, and IRN, respectively.

The total number of non-fixed variables before preprocessing for CAB and AP with 10 nodes is 12,470. For example, when coverage time bound is 40 for AP, the number of non-fixed variables is decreased by about 60 % . To conclude from these Tables 2, 3, 4, 5, 6 and 7, in the tight problems (i.e. the problem including less service time), the solution time and number of nodes in the branch and cut tree are significantly improved by type 2. Therefore, we can also conclude that the linear relaxation is improved significantly with the inclusion of the preprocessing.

5.3 Impact of maximum coverage service time and discount factor

Even though the primary emphasis of the FSSIHLRP is the total cost minimization, other interesting insights can be gained for by a comparison to the coverage service time measure. In general, a solution yielding the minimum total cost is not obtained by the minimum coverage service time. In fact, these two parameters are in conflict. A loss in one causes a gain in the other. Therefore, in the light of this conflict, in Figs. 4, 5, and 6, for CAB, AP with different sizes, and IRN, the effect of service time and also the discount factor on the OFV is described. In Fig. 6, the lines with 1800 and 1980 coverage time are the same, but the line with 2160 coverage time is different. However, due to the resolution of the chart, these three lines lay on each other.

Fig. 4
figure 4

OFV performances in terms of discount factor and coverage time for CAB

Fig. 5
figure 5

OFV performances in terms of discount factor and coverage time for AP

Fig. 6
figure 6

OFV performances in terms of discount factor and coverage time for IRN

The direct impact of higher coverage time is a reduction in the OFV of a network. Therefore, decision makers should consider this conflict in their problems. Moreover, it is interesting to note that a higher discount factor yields a higher OFV. This impact has been discussed more in the hub location literature by O’Kelly and Bryan (1998). In addition, as an interesting observation for all problems of IRN, the scheduled leaving time from the cities close to the border such as Bandar Abbas, Tabriz, and Zahedan is much more than that in the cities close to the center of Iran such as Arak, Isfahan, and Yazd. Furthermore, in this dataset, by increasing the discount factor, the number of hubs is decreased. As another additional insight, which is gained from the study of IRN, number of non-hub links is increased by increasing discount factor. Further analysis showed that by increasing the service time, the number of spoke links is decreased. Consequently, there is a significant negative relation between the number of spoke links and non-hub links for IRN. As a surprising observation, in all problems of IRN, the hub location at Tehran survives, while the remaining hub locations switch from a city to another city by changing the discount factor. In addition, as a striking observation, for all problems of IRN except for \(\alpha =0.5\), the hub networks topology are tree. Furthermore, in IRN, with the exception of \(\alpha =0.9\), no change in hub network topology is detected. As another interesting observation, the created network in \(\alpha =0.6,T=1620\) is the same as \(\alpha =0.8,T=1620\). The difference in the OFVs is due to the difference in the discount factors.

6 Conclusion

The purpose of the current study was to introduce a mathematical formulation that decides on the locations of hubs in an incomplete hub network topology, the allocations of non-hubs to hubs, non-hubs to non-hubs and the departure time from nodes simultaneously to improve service quality at minimum cost. Additionally, some preprocessing was introduced based on the new assumptions. In FSSIHLRP, the scheduled leaving time from the origin node and the departure time from a hub to a node were determined. In this study, a model and some tools were provided for decision makers to consider the maximum service time in their companies. To evaluate the proposed preprocessing, the well-known CAB and AP datasets and a new dataset called IRN were used. The results showed the addition of these tools was better than solving the model without them both in terms of reducing the CPU time and the used node in the branch and cut tree. In conclusion, using these tools caused a faster and efficient branch and cut algorithm.

The generalizability of these results is subject to certain limitations. For instance, no consideration of the effect of the OFV on the service time and vice versa and not including the effects of time zone in the problem, since this network is a large one like a global network. Overcoming these restrictions will lead to more real-world and challenging future works.