1 Introduction

Hub networks are frequently employed in transportation, telecommunication and computer systems to efficiently route commodities between many origins and destinations. A distinguishing feature of hub networks is the use of transshipment, consolidation, or sorting points for commodities, called hub facilities, to connect a large number of origin/destination (O/D) pairs by using a small number of links. Commodities having the same origin but different destinations are consolidated when routed to the hubs and are then combined with other commodities having different origins but the same destination. The use of hub facilities helps centralize commodity handling and sorting operations, reduce set-up costs, and achieve economies of scale on routing costs through the consolidation of flows. Hub networks can be seen as hierarchical networks which, in their most basic form, contain two levels: an access-level network connecting O/D nodes to hubs, and a hub-level network connecting hub nodes to one another. The design of hub networks involves selecting nodes to place hub facilities, determining the arcs to connect O/D nodes and hubs, and selecting the paths to route commodities.

Hub network design problems (HNDPs) lie at the heart of network design planning in transportation and telecommunication systems. Application areas of HNDPs in transportation include air freight and passenger travel, postal delivery, express package delivery, trucking, liner shipping, public transportation, and rapid transit systems. Demand corresponds to passengers, mail, express packages, or goods carried by airplanes, trucks, trains, or vessels moved on physical networks such as roads and railways or through the air or water. Hub facilities are sorting centers or transportation terminals in which one or more transportation modes interact. Hubs are used as intermediate facilities to consolidate flows, perform an activity on commodities (e.g., sort, assemble, label), or transfer them to other modes of transportation. Consolidation of flows at hubs enables economies of scale on transportation costs, not only on the routing of flows between hubs but also between O/D nodes and hubs. In some applications hubs are required to route commodities; in others they are not required but are desirable for economic reasons.

Applications of HNDPs in telecommunications arise in the design of distributed data networks, where commodities correspond to electronic data that are routed over a variety of physical links such as co-axial cables and fiber optic links or through the air via satellite channels and microwave links. Hub facilities correspond to hardware such as switches, concentrators, and multiplexors which help to provide efficient connections between tributary and backbone networks. Large set-up costs for hub facilities and communication links, in combination with economies of scale in data transmissions and network utilization, motivate the use of hub-and-spoke architectures.

HNDPs constitute a challenging class of network optimization problems involving two types of design decisions: (1) the location of hub facilities at nodes of an underlying network, and (2) the activation of various classes of links to connect origins, destinations, and hubs. Given the inherent complexity of the interaction between these two types of decisions, HNDPs were first studied from a facility location perspective. In particular, the so-called hub location problems (HLPs) focus on the interaction between hub facilities and consider the location of hubs as the key decision. Most HLPs use a set of assumptions that simplify the design and routing decisions to the point of being completely determined by the allocation decisions of O/D nodes to hubs. When such simplifying assumptions are relaxed, HNDPs are more closely related to multicommodity network design problems (MNDPs). In fact, HNDPs are a particular class of MNDPs in which node selection decisions are taken into account. A specific class of such general HNDPs, denoted as hub arc location problems (HALPs), has also been studied, in which a set of hub arcs and their associated hub nodes need to be selected. In this case, the modeling of O/D paths becomes more involved, as the allocation of nodes to hubs no longer determines the routing of flow through the hub network. HALPs retain some assumptions of HLPs, especially the ones regarding the design of the access-level network. The rich variety of applications has also given rise to HNDPs with specific hub network topologies and to more general models involving design decisions on both hub and access levels as well as additional node selection decisions.

This chapter studies HNDPs from a network design perspective. We focus on the role network design and routing decisions play in the formulation and solution of various classes of HNDPs. Section 2 starts with some preliminaries, including the key features of hub networks, the types of decisions that can be taken into account, and how these decisions interact with one another. We also describe commonly considered assumptions and properties of HNDPs and how these impact their formulation. In particular, Sect. 3 introduces different formulations for various classes of HLPs considering three allocation patterns: multiple, single, and r-allocation. Section 4 presents more complex HNDPs such as HALPs and other problems with specific hub network topologies such as tree-star, star-star, ring-star, and hub line networks. For all these classes of problems, we highlight their most relevant applications and describe some formulations which have been developed and exploited in combination with decomposition methods to solve them. Section 5 provides a historical review of key references on HNDPs together with some of the most significant milestones in the field. Conclusions and perspectives follow in Sect. 6.

2 Preliminaries

A generic hub network design problem can be described as follows. Consider a complete graph \(\mathcal {G} =(\mathcal {N},\mathcal {E})\), where \(\mathcal {N}\) is the set of nodes representing the origins and destinations of flows as well as the set of potential hub locations, and \(\mathcal {E}\) is the set of edges. For each node pair (i, j), let \(W_{ij} \geq 0\) and \(d_{ij} \geq 0\) denote the amount of flow to be routed and the distance, respectively, from the origin \(i \in \mathcal {N}\) to the destination \(j \in \mathcal {N}\). For each node \(i \in \mathcal {N}\), \(f_i\) is the fixed set-up cost for locating a hub, whereas for each \(e \in \mathcal {E}\), \(g_e\) denotes the fixed set-up cost for activating an (undirected) hub arc. A hub arc \(e=(i,j) \in \mathcal {E}\) connects two different hub nodes i and j and has a unit flow cost of \(\alpha d_{ij}\). The parameter α (0 ≤ α ≤ 1) is used as a discount factor to provide reduced unit flow costs on hub arcs to reflect economies of scale resulting from consolidation of flows between hubs. The unit flow cost between O/D pairs is given by the length of the path between the origin and destination nodes in the solution network. Each O/D path has a collection leg from the origin node to the first hub, possibly a transfer leg between the first and the last hubs, and a distribution leg from the last hub to the destination node.

Depending on the assumptions and the application considered, the solution network of an HNDP consists of up to four types of arcs: (1) hub arcs connecting two hubs with a discounted flow cost, (2) bridge arcs also connecting two hub nodes but without benefiting from the reduced unit flow cost of a hub arc, (3) access arcs connecting non-hub nodes and hubs, and (4) direct arcs connecting two non-hub nodes. A generic HNDP consists of locating a set of hub facilities, activating a set of arcs, and determining the routing of flows through the hub network, with the objective of minimizing the total set-up and flow cost.

HLPs are the most studied class of HNDPs in the literature. They focus on the location of a set of hub facilities and the assignment of O/D nodes to these facilities. Arc selection and routing decisions are mainly determined by the assumptions made on the cost structure and the assignment pattern. In particular, there are four assumptions underlying most HLPs: (1) commodities have to be routed via a set of hubs, (2) hub, access and bridge arcs have no set-up cost, (3) the discount factor α is the same for all hub arcs and does not depend on the amount of flow routed on each hub arc, (4) distances \(d_{ij}\) satisfy the triangle inequality. The following properties are a direct consequence of these assumptions:

  • O/D paths with hubs: Assumption 1 prohibits direct connections between O/D nodes that are not hubs and hence, O/D paths must include at least one hub node. Note that this assumption is rather mild, as it is always possible to add a dummy hub and associated flow costs to represent direct connections between non-hub nodes.

  • Fully-interconnected hubs: Assumption 2 allows hubs to be interconnected at no extra cost and, together with Assumptions 3 and 4, an important resulting property is that the set of hub arcs defines a complete subgraph on the set of hub nodes. As a consequence, hub arc selection decisions become trivial once the location of hub nodes is known.

  • One-hub-arc O/D paths: Another important property obtained when combining all assumptions is that O/D paths contain at least one and at most two hubs. However, it is important to note that whenever Assumption 2 or 4 is not satisfied, paths may contain more than two hubs and more than one hub arc.

The above properties not only simplify the network design decisions in HLPs, as they are completely determined by the location and assignment decisions, but, most importantly, also significantly reduce the number of O/D paths that need to be considered on a hub network. In HLPs, O/D paths include either a single hub node and no hub arc, or two hub nodes and a single hub arc. Moreover, because of Assumptions 2 and 4, each collection and distribution leg, if present, contains only one access arc. O/D paths are thus of the form (i, k, m, j), where \((k,m) \in \mathcal {N} \times \mathcal {N}\) is the ordered pair of hubs to which i and j are allocated, respectively. The flow cost of routing \(W_{ij}\) along the path (i, k, m, j) is then given by \(W_{ij}\left (\chi d_{ik} + \alpha d_{km} + \delta d_{mj}\right ),\) where χ, α, and δ represent the collection, transfer and distribution costs along the path. To reflect economies of scale between hubs, we assume that α < χ and α < δ. Note that these paths contain one, two or at most three arcs, depending on the number of visited hubs and on whether the origin and destination are hub or non-hub nodes. As a consequence, there are only \(O(n^2)\) paths for each O/D pair. As we will show in Sect. 3, this allows the development of tight path-based formulations with \(O(n^4)\) variables that explicitly consider all these paths and, for some allocation patterns, do not even require the use of flow conservation constraints.
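
The path structure described above can be illustrated with a minimal Python sketch. The function names, the default values of χ, α and δ, and the four-node instance (nodes placed on a line) are assumptions made only for this illustration; they do not come from a specific reference.

```python
from itertools import product


def path_cost(W, d, i, k, m, j, chi=1.0, alpha=0.5, delta=1.0):
    """Flow cost of routing W[i][j] along the O/D path (i, k, m, j)."""
    return W[i][j] * (chi * d[i][k] + alpha * d[k][m] + delta * d[m][j])


def cheapest_path(W, d, hubs, i, j, **costs):
    """Enumerate every ordered hub pair (k, m) and keep the cheapest path.

    With |hubs| open hubs there are |hubs|^2 candidate paths per O/D pair,
    which is at most n^2 when every node is a potential hub.
    """
    return min(
        (path_cost(W, d, i, k, m, j, **costs), (i, k, m, j))
        for k, m in product(hubs, repeat=2)
    )


# Hypothetical instance: four nodes on a line, unit demand between distinct nodes.
pos = [0, 1, 3, 4]
d = [[abs(a - b) for b in pos] for a in pos]
W = [[0 if a == b else 1 for b in range(4)] for a in range(4)]

cost, path = cheapest_path(W, d, hubs=[1, 2], i=0, j=3)
print(cost, path)
```

Choosing k = m covers the one-hub paths, so the same enumeration handles both path types.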

In the case of more general HNDPs that do not satisfy some of the above mentioned assumptions, the modeling of O/D paths becomes more involved given that hub nodes are not necessarily fully interconnected and due to the presence of bridge arcs. O/D paths may contain more than three arcs and visit more than two hub nodes. The transfer leg can use several bridge and hub arcs, depending on whether additional assumptions on the structure of O/D paths are considered or not. This means that a much larger number of O/D paths exist. In fact, for the case of a complete graph the number of paths between any pair of nodes is given by \(\sum _{i=0}^{n-2}(n-2)!/(n-2-i)!\). As a consequence, path-based formulations for HNDPs would have up to \(O(n^{n-2})\) variables. Flow conservation constraints are now needed when extending arc-based formulations of HLPs, which contain only \(O(n^4)\) variables. In Sect. 4, we highlight the added complexity in formulating and solving HNDPs where non-trivial arc selection and routing decisions need to be made.
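
To give a sense of this growth, the following Python sketch evaluates the path-count expression and cross-checks it by enumerating ordered sequences of intermediate nodes between two fixed nodes of a complete graph. The function names are hypothetical and the enumeration is only practical for very small n.

```python
from itertools import permutations
from math import factorial


def count_paths_formula(n):
    """Evaluate sum_{i=0}^{n-2} (n-2)! / (n-2-i)! for a complete graph on n nodes."""
    return sum(factorial(n - 2) // factorial(n - 2 - i) for i in range(n - 1))


def count_paths_enumeration(n):
    """Count simple paths from node 0 to node n-1 by listing the ordered
    sequences of i intermediate nodes, for i = 0, ..., n-2."""
    intermediate = range(1, n - 1)
    return sum(1 for i in range(n - 1) for _ in permutations(intermediate, i))


for n in (4, 5, 6, 10):
    # Compare with the n^2 hub-pair paths per O/D pair available in an HLP.
    print(n, count_paths_formula(n), n * n)
```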

Figure 18.1a shows an example of a solution network of a HLP in which different structures on O/D paths arise (squares represent hub nodes and circles represent non-hub nodes). The path (5, 8, 3, 4) is a two-hub path formed by the access arcs (5, 8), (4, 3) and the hub arc (8, 3). The path (9, 9, 2, 1) is also a two-hub path but containing only the access arc (1, 2) and the hub arc (2, 9). The path (3, 3, 9, 9) is yet another two-hub path formed only by the hub arc (3, 9). The path (4, 3, 3, 6) is a one-hub path containing only the access arcs (4, 3) and (6, 3). The path (5, 8, 8, 8) is also a one-hub path containing the single access arc (5, 8).

Fig. 18.1 Solution network of a hub location problem (a) and a hub network design problem (b)

Figure 18.1b shows an example of a solution network of a more general HNDP in which different structures on O/D paths arise (dashed lines represent bridge arcs). The path (5, 8, 9, 6) is a three-hub path formed by the bridge arc (5, 8), the hub arc (8, 9), and the access arc (9, 6). The path (1, 2, 8, 9, 3, 4) is a four-hub path containing the access arcs (1, 2), (4, 3) and the hub arcs (2, 8), (8, 9), and (9, 3).

3 Hub Location Problems

HLPs focus on the location of hub facilities and the assignment of O/D nodes to open hubs. At the hub-level network, hub arc selection decisions are completely determined by the location of the hubs, given that they are fully interconnected with hub arcs. At the access-level network, arc selection decisions are given by the allocation of O/D nodes to hubs. There are three possible allocation strategies: multiple assignments, single assignments, and r-allocation. In the case of HLPs in which there is no set-up cost for the activation of access arcs, once the hub locations are known, the flow cost is minimized by finding a shortest path on the network induced by the selected hubs for each O/D pair, resulting in a multiple allocation pattern of O/D nodes to hubs. That is, an O/D node may be directly connected to more than one hub facility. A multiple assignment pattern simplifies the routing decisions and provides greater flexibility on hub networks, allowing lower flow cost solutions. However, it may considerably increase the network design cost, as a larger number of access links must be activated. Applications in which it would be reasonable to consider multiple assignments arise mainly in transportation, in particular in air freight and passenger travel, public transportation, and rapid transit systems. In these cases, access arcs either do not correspond to physical links or are associated with existing physical infrastructure (e.g., roads or highways) and hence, there is no set-up cost associated with them.

In a single assignment strategy, each O/D node must be connected to exactly one hub facility. All commodities with the same origin (or destination) are thus routed via the same access arc. Applications of a single assignment strategy arise in telecommunications, where access arcs correspond to physical links having significant set-up costs which need to be installed to provide connection and communication services to terminal nodes. Other applications arise in transportation, in particular in express package and postal delivery where commodities are usually consolidated at O/D nodes to be sent to the same sorting facility. Finally, in an r-allocation strategy each O/D node can be connected to at most r hubs. This strategy generalizes both single and multiple assignment strategies and, at the same time, provides the flexibility of allowing nodes to be allocated to two or more hubs while keeping some control on the number of access arcs on the solution network. In what follows, we describe the most relevant formulations that have been introduced to model each of the allocation strategies. We also point to the most relevant solution algorithms developed for each of these classes of problems.

3.1 Multiple Assignments

We can use the so-called flow-based formulations to model HLPs with multiple assignments. They use continuous variables to determine the amount of flow routed on a particular arc originated at a given node. In the case of multiple assignments, we need three sets of flow variables to model the collection, transfer, and distribution legs in an O/D path. In particular, for the collection leg we define the continuous variables \(U_{ik}\), \(i,k \in \mathcal {N}\), equal to the amount of flow from origin node i sent directly to hub k via access arc (i, k). For the transfer leg, let \(Y_{ikm}\), \(i,k,m \in \mathcal {N}\), be equal to the amount of flow originated at node i and passing through hub arc (k, m). Finally, for the distribution leg let \(X_{ijm}\), \(i,j,m \in \mathcal {N}\), be equal to the amount of flow from origin i sent from hub m directly to destination j via access arc (m, j). We also define binary location variables \(z_i\), \(i \in \mathcal {N}\), equal to 1 if and only if a hub is located at node i. Using these sets of decision variables, we can formulate HLPs with multiple assignments as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & \quad &\displaystyle \sum_{k \in \mathcal{N}}f_{k}z_{k}+\sum_{i,k \in \mathcal{N}} \chi d_{ik} U_{ik} + \sum_{i,k,m \in \mathcal{N}} \alpha d_{km}Y_{ikm} + \sum_{i,j,m \in \mathcal{N}} \delta d_{mj} X_{ijm} \\ \mbox{subject to} & \quad &\displaystyle \sum_{k \in \mathcal{N}} U_{ik} = O_i \qquad i \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.1)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle \sum_{m \in \mathcal{N}} X_{ijm} = W_{ij} \qquad i,j \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.2)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle U_{ik} + \sum_{m \in \mathcal{N}} Y_{imk} = \sum_{m \in \mathcal{N}} Y_{ikm} + \sum_{j \in \mathcal{N}} X_{ijk} \qquad i, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.3)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle U_{ik} \leq O_i z_k \qquad i, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.4)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle X_{ijm} \leq W_{ij} z_m \qquad i,j,m \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.5)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle U_{ik}, Y_{ikm}, X_{ijm} \geq 0 \qquad i, j, k, m \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.6)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle z_{k}\in \{0,1\} \qquad k \in \mathcal{N}. {} \end{array} \end{aligned} $$
(18.7)

Constraints (18.1)–(18.3) correspond to the flow conservation equations for a network flow problem for each origin node i. In particular, for each node \(i\in \mathcal {N}\) there is a network with 2n + 1 nodes. The first node is the source, with a supply of \(O_i = \sum_{j \in \mathcal{N}} W_{ij}\) (the total flow originating at i), followed by n transshipment nodes, one for each possible hub node \(k \in \mathcal {N}\). Finally, the demand at each of the n destination nodes j is \(W_{ij}\). Constraints (18.4)–(18.5) ensure that flows are routed via open hubs. The above formulation contains \(O(n^3)\) variables and \(O(n^3)\) constraints. If the flow requirements are symmetric, i.e., \(W_{ij} = W_{ji}\), \(\forall i,j \in \mathcal {N}\), and if the collection and distribution costs are equal (χ = δ), then the \(U_{ik}\) variables can be eliminated from the formulation by using:

$$\displaystyle \begin{aligned} U_{ik} = \sum_{j \in \mathcal{N}} X_{jik} \qquad \forall i,k \in \mathcal{N}. \end{aligned}$$
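
The role of the flow conservation constraints (18.1)–(18.3) can be checked on a small example. The sketch below, written under the assumption that a routing is given as a list of paths (i, k, m, j), aggregates the paths into the flow variables and verifies the three constraint families. The three-node instance, the chosen routes, and the function names are hypothetical.

```python
from collections import defaultdict


def flows_from_routes(W, routes):
    """Aggregate O/D paths (i, k, m, j) into the flow variables U, Y and X."""
    U, Y, X = defaultdict(float), defaultdict(float), defaultdict(float)
    for i, k, m, j in routes:
        w = W[i][j]
        U[i, k] += w          # collection leg on access arc (i, k)
        if k != m:
            Y[i, k, m] += w   # transfer leg on hub arc (k, m)
        X[i, j, m] += w       # distribution leg on access arc (m, j)
    return U, Y, X


def satisfies_conservation(W, U, Y, X):
    """Check constraints (18.1)-(18.3) for every origin node i."""
    N = range(len(W))
    for i in N:
        if sum(U[i, k] for k in N) != sum(W[i]):           # (18.1)
            return False
        for j in N:
            if sum(X[i, j, m] for m in N) != W[i][j]:      # (18.2)
                return False
        for k in N:
            inflow = U[i, k] + sum(Y[i, m, k] for m in N)
            outflow = sum(Y[i, k, m] for m in N) + sum(X[i, j, k] for j in N)
            if inflow != outflow:                          # (18.3)
                return False
    return True


# Hypothetical instance: hubs at nodes 0 and 2; node 1 is allocated to both.
W = [[0, 2, 3], [1, 0, 4], [5, 6, 0]]
routes = [(0, 0, 0, 1), (0, 0, 2, 2), (1, 0, 0, 0),
          (1, 2, 2, 2), (2, 2, 0, 0), (2, 2, 0, 1)]
print(satisfies_conservation(W, *flows_from_routes(W, routes)))
```

Dropping any route from the list leaves some demand unserved, so the check then fails on (18.1) or (18.2).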

Arc-based formulations can also be adapted for the case of HLPs with multiple assignments. For each \(i,j,k,m \in \mathcal {N}\), we define binary variables \(x_{ijkm}\) equal to 1 if and only if the flow originated at i and destination j is routed via hub arc (k, m). Using the same set of location variables \(z_i\) in combination with the arc variables \(x_{ijkm}\), the problem can be stated as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & \quad &\displaystyle \sum_{k \in \mathcal{N}} f_k z_{k} + \sum_{i,j,k,m \in \mathcal{N}} W_{ij}\left(\chi d_{ik} + \alpha d_{km} + \delta d_{mj}\right)x_{ijkm} \\ \mbox{subject to} & \quad &\displaystyle \sum_{k,m \in \mathcal{N}} x_{ijkm}= 1 \qquad i,j \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.8)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle \sum_{m \in \mathcal{N}} x_{ijkm} + \sum_{m \in \mathcal{N} \setminus \left\lbrace k \right\rbrace } x_{ijmk} \leq z_{k} \qquad i,j,k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.9)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle x_{ijkm} \geq 0 \qquad i,j,k,m \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.10)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle z_{k}\in \{0,1\} \qquad k \in \mathcal{N}. {} \end{array} \end{aligned} $$
(18.11)

Constraints (18.8) state that exactly one hub arc must be selected to route the flow from origin i to destination j. Constraints (18.9) ensure that O/D paths (i, k, m, j) use only open hubs. This formulation has \(O(n^4)\) variables and \(O(n^3)\) constraints and usually provides tight LP bounds. In addition, we note that this arc-based formulation is equivalent to a path-based formulation given that all O/D paths are completely characterized by the arc variables \(x_{ijkm}\).

Given that O/D nodes can be connected to more than one hub facility, we can exploit some properties of the structure of O/D paths in a preprocessing step that significantly reduces the number of variables required in the formulation. In particular, it is known that every flow uses at most one direction of a hub arc, the one with lower flow cost. Let \(F_{ijkm} = W_{ij}\left(\chi d_{ik} + \alpha d_{km} + \delta d_{mj}\right)\) denote the flow cost of path (i, k, m, j). We thus define an undirected flow cost \(F_{ije}\) for each \(e=(k,m) \in \mathcal {E}\) and \(i,j \in \mathcal {N}\) as \(F_{ije} = \min \{ F_{ijkm}, F_{ijmk} \}\). The number of variables can be further reduced by defining a set of candidate hub arcs \(E_{ij}\) for each O/D pair. This is done by using the property that no flow will be routed through a hub arc containing two hubs whenever it is cheaper to route it through only one of them.
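
The preprocessing step can be sketched in Python as follows. The sketch assumes that a hub arc {k, m} is kept as a candidate for the pair (i, j) only when its undirected cost is strictly lower than both single-hub routes through k and through m; the strict inequality, the function names, and the four-node instance are assumptions of this illustration.

```python
def flow_cost(W, d, i, j, k, m, chi=1.0, alpha=0.5, delta=1.0):
    """Cost of routing W[i][j] along the path (i, k, m, j)."""
    return W[i][j] * (chi * d[i][k] + alpha * d[k][m] + delta * d[m][j])


def undirected_cost(W, d, i, j, k, m, **c):
    """Cost of the cheaper of the two directions of hub arc {k, m}."""
    return min(flow_cost(W, d, i, j, k, m, **c), flow_cost(W, d, i, j, m, k, **c))


def candidate_hub_arcs(W, d, i, j, **c):
    """Hub arcs {k, m}, k < m, that beat both one-hub routes for the pair (i, j)."""
    n = len(d)
    candidates = []
    for k in range(n):
        for m in range(k + 1, n):
            one_hub = min(flow_cost(W, d, i, j, k, k, **c),
                          flow_cost(W, d, i, j, m, m, **c))
            if undirected_cost(W, d, i, j, k, m, **c) < one_hub:
                candidates.append((k, m))
    return candidates


# Hypothetical instance: four nodes on a line, unit demand between distinct nodes.
pos = [0, 1, 3, 4]
d = [[abs(a - b) for b in pos] for a in pos]
W = [[0 if a == b else 1 for b in range(4)] for a in range(4)]

print(candidate_hub_arcs(W, d, 0, 1))  # neighbouring nodes: few useful hub arcs
print(candidate_hub_arcs(W, d, 0, 3))  # end-to-end pair: every hub arc is useful
```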

The variables \(x_{ijkm}\) can be projected out from the arc-based formulation via Benders decomposition to obtain a valid formulation in the space of the binary variables \(z_i\). The Benders reformulation of the arc-based formulation is:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & \quad &\displaystyle \sum_{k \in \mathcal{N}} f_k z_{k} + \eta \\ \mbox{subject to} & &\displaystyle \sum_{k \in \mathcal{N}} z_{k} \geq 1 \qquad {} \end{array} \end{aligned} $$
(18.12)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \eta \geq \sum_{i \in \mathcal{N}} a^r_{i} z_{i} \qquad r=1,\dots,|Q_D|, {} \end{array} \end{aligned} $$
(18.13)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle z_{k}\in \{0,1\} \qquad k \in \mathcal{N}, {} \end{array} \end{aligned} $$
(18.14)

where \(Q_D\) is the set of extreme points of the dual subproblem associated with constraints (18.8)–(18.9). Non-dominated Benders cuts (18.13) can be efficiently generated with ad hoc algorithms that rely on the solution of linear and network flow problems.

3.2 Single Assignments

Flow-based formulations can also be adapted to model HLPs with single assignments. As in the case of multiple assignments, we use continuous variables to compute the amount of flow routed on a particular arc originated at a given node. However, in the case of single assignments, we only need to use the set of flow variables associated with the hub arcs (\(Y_{ikm}\)). For each pair \(i,k\in \mathcal {N}\), we also define binary location/allocation variables \(z_{ik}\), equal to one if and only if node i is assigned to hub k. When i = k, variable \(z_{kk}\) indicates whether a hub is established at node k. Let \(O_i = \sum_{j \in \mathcal{N}} W_{ij}\) and \(D_i = \sum_{j \in \mathcal{N}} W_{ji}\) denote the total flow originating at and destined to node i, respectively. HLPs with single assignments can be formulated as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{k \in \mathcal{N}}f_{k}z_{kk}+\sum_{i,k \in \mathcal{N}} \left(\chi O_i + \delta D_i \right)d_{ik} z_{ik} + \sum_{i,k,m \in \mathcal{N}} \alpha d_{km}Y_{ikm} \\ \mbox{subject to} & &\displaystyle \sum_{m \in \mathcal{N}} Y_{imk} + O_i z_{ik} = \sum_{j \in \mathcal{N}} W_{ij}z_{jk} + \sum_{m \in \mathcal{N}} Y_{ikm} \qquad i, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.15)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{k \in \mathcal{N}} z_{ik}= 1 \qquad i \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.16)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle z_{ik}\leq z_{kk} \qquad i, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.17)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle z_{ik}\in \{0,1\} \qquad i, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.18)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle Y_{ikm} \geq 0 \qquad i, k, m \in \mathcal{N}. {} \end{array} \end{aligned} $$
(18.19)

Constraints (18.15) state that the flow entering hub k, either directly from node i or via other hubs m, has to be equal to the flow leaving it towards either other hubs m or destination nodes j. Constraints (18.16) ensure that each O/D node is assigned to exactly one hub node. Finally, constraints (18.17) guarantee that O/D nodes are assigned to open hubs. The above formulation contains \(O(n^3)\) variables and \(O(n^2)\) constraints.

HLPs with single assignments are closely related to classical discrete location problems. In fact, they can be modeled as facility location problems with additional quadratic costs associated with the interaction of O/D nodes. HLPs with single assignments can be stated as the following quadratic binary integer program:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{k \in \mathcal{N}}f_{k}z_{kk} + \sum_{i,k \in \mathcal{N}} \left(\chi O_i + \delta D_i \right)d_{ik} z_{ik} + \sum_{i,j,k,m \in \mathcal{N}} \alpha W_{ij}d_{km}z_{ik}z_{jm}\qquad {} \\ \mbox{subject to} & &\displaystyle \mbox{(18.16)--(18.18)}. \end{array} \end{aligned} $$
(18.20)

Note that constraints (18.16)–(18.18) define the set of feasible solutions to the so-called uncapacitated facility location problem (UFLP). In fact, when the quadratic term of the objective (18.20) is removed, the HLP with single assignments reduces to the UFLP. However, contrary to the UFLP, integrality conditions on the allocation variables \(z_{ik}\) need to be explicitly stated to have a valid formulation. This is mainly due to the fact that objective (18.20) is non-convex.
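
As an illustration of objective (18.20), the following Python sketch evaluates it for a given single assignment. It relies on the observation that, for binary allocation variables, the product of the two allocation variables in the quadratic term equals 1 only for the hub of i and the hub of j, so the quadruple sum collapses to a sum over O/D pairs. The encoding of a solution as a list `assign`, the function name, and the instance data are assumptions of this sketch.

```python
def single_assignment_cost(W, d, f, assign, chi=1.0, alpha=0.5, delta=1.0):
    """Objective (18.20) for a single assignment, where assign[i] is the hub of node i."""
    n = len(W)
    hubs = set(assign)
    # Constraints (18.16)-(18.17): a node used as a hub must be assigned to itself.
    if any(assign[k] != k for k in hubs):
        raise ValueError("every hub must be assigned to itself")
    O = [sum(W[i]) for i in range(n)]                       # outgoing flow of node i
    D = [sum(W[j][i] for j in range(n)) for i in range(n)]  # incoming flow of node i
    setup = sum(f[k] for k in hubs)
    access = sum((chi * O[i] + delta * D[i]) * d[i][assign[i]] for i in range(n))
    # Quadratic term: only the hub pair (assign[i], assign[j]) contributes.
    transfer = sum(alpha * W[i][j] * d[assign[i]][assign[j]]
                   for i in range(n) for j in range(n))
    return setup + access + transfer


# Hypothetical instance: four nodes on a line, unit demand, set-up cost 10 per hub.
pos = [0, 1, 3, 4]
d = [[abs(a - b) for b in pos] for a in pos]
W = [[0 if a == b else 1 for b in range(4)] for a in range(4)]
f = [10, 10, 10, 10]

print(single_assignment_cost(W, d, f, assign=[1, 1, 2, 2]))  # two hubs
print(single_assignment_cost(W, d, f, assign=[1, 1, 1, 1]))  # one hub
```

On this instance the two-hub solution trades a higher set-up cost for cheaper access and discounted inter-hub flow.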

We now discuss different approaches that have been considered to handle the quadratic term of the objective (18.20). The first one is to use the reformulation-linearization technique of Adams and Sherali (1990) to obtain the following linear MIP formulation:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & \quad &\displaystyle \sum_{k \in \mathcal{N}}f_{k}z_{kk} + \sum_{i,k \in \mathcal{N}} \left(\chi O_i + \delta D_i \right)d_{ik} z_{ik} + \sum_{i,j,k,m \in \mathcal{N}} \alpha W_{ij}d_{km}x_{ijkm} \\ \mbox{subject to} & \quad &\displaystyle \mbox{(18.16)--(18.18)} \\ & \quad &\displaystyle \sum_{m \in \mathcal{N}} x_{ijkm}=z_{ik} \qquad i, j, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.21)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle \sum_{k \in \mathcal{N}} x_{ijkm}=z_{jm} \qquad i, j, m \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.22)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle x_{ijkm} \geq 0 \qquad i, j, k, m \in \mathcal{N}, {} \end{array} \end{aligned} $$
(18.23)

where \(x_{ijkm}\), \(i,j,k,m \in \mathcal {N}\), are variables equal to 1 if and only if the flow originated at i and destination j transits via hub arc (k, m). This formulation can be seen as an arc-based formulation in which constraints (18.21)–(18.22) are flow conservation equations for \(n^2\) networks, each associated with an O/D pair (i, j). In addition, it contains \(O(n^4)\) variables and \(O(n^3)\) constraints and is known to provide tight LP bounds. Moreover, constraints (18.16) can be replaced by

$$\displaystyle \begin{aligned} \sum_{k,m \in \mathcal{N}} x_{ijkm} = 1 \qquad \forall i,j \in \mathcal{N}, {} \end{aligned} $$
(18.24)

to obtain an alternative valid formulation. This highlights that, due to the particular structure of a fully interconnected hub-level network, this formulation can also be seen as a path-based formulation given that it uses path variables \(x_{ijkm}\) to characterize all O/D paths visiting either one or two hub nodes. In this case, constraints (18.24) correspond to the convexity constraints associated with O/D pairs. These arc/path-based formulations have been used in combination with decomposition methods to develop ad hoc solution algorithms for efficiently solving various HLPs with single assignments (see Sect. 5).

It is possible to use projection methods to eliminate the path variables \(x_{ijkm}\) of arc-based formulations and obtain MIP formulations with fewer variables. Two such methods have been considered. The first is a direct method used in Mirchandani (2000) to project out flow variables for network loading problems. The second is an indirect method used in Rardin and Wolsey (1993) for uncapacitated fixed charge network flow problems. Labbé and Yaman (2004) apply the direct projection method on an arc-based formulation and analyze the strength and dominance of these projection inequalities. The authors prove that a subset of these projection inequalities is facet-defining and that some others are dominated by other families of facet-defining inequalities. Labbé et al. (2005) show that the projection inequalities defined by a subset of the extreme rays of the projection cone are sufficient to provide a valid formulation for HLPs with single assignments. In particular, HLPs with single assignments can be formulated as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{k \in \mathcal{N}}f_{k}z_{kk}+\sum_{i,k \in \mathcal{N}}(\chi O_i + \delta D_i)d_{ik}z_{ik} + \sum_{k,m \in \mathcal{N}} \alpha d_{km}y_{km} \\ \mbox{subject to} & &\displaystyle \mbox{(18.16)--(18.18)} \\ & &\displaystyle y_{km} \geq \sum_{(i,j) \in K} W_{ij}\left(z_{ik} + z_{jm} - 1\right) \qquad k, m \in \mathcal{N}, K \subseteq \mathcal{N} \times \mathcal{N} {} \end{array} \end{aligned} $$
(18.25)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle y_{km} \geq 0 \qquad k, m \in \mathcal{N}, {} \end{array} \end{aligned} $$
(18.26)

where \(y_{km}\), \(k,m \in \mathcal {N}\), are an additional set of continuous variables equal to the amount of flow routed on hub arc (k, m). For each arc (k, m), constraints (18.25) and (18.26) imply

$$\displaystyle \begin{aligned}y_{km} = \max_{K \subseteq \mathcal{N} \times \mathcal{N}}\sum\limits_{(i,j) \in K} W_{ij}\left(z_{ik} + z_{jm} - 1\right) = \sum\limits_{(i,j) \in K_{km}} W_{ij}\left(z_{ik} + z_{jm} - 1\right), \end{aligned}$$

where \(K_{km}\) is the set of all demands which are routed on hub arc (k, m). This formulation contains only \(O(n^2)\) variables but an exponential number of constraints. Constraints (18.25) are a particular case of a more general class of facet-defining inequalities which can be separated in polynomial time.
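
The identity between the maximum over subsets K and the flow actually routed on a hub arc can be verified by brute force on a tiny instance. In the Python sketch below, a single assignment is encoded as a list `assign` (a hypothetical encoding), so the first allocation indicator equals 1 exactly when assign[i] == k and the second exactly when assign[j] == m. The enumeration of all subsets is exponential and serves only as an illustration.

```python
from itertools import combinations, product


def hub_arc_flow(W, assign, k, m):
    """Total demand of the O/D pairs (i, j) with i assigned to k and j assigned to m."""
    n = len(W)
    return sum(W[i][j] for i, j in product(range(n), repeat=2)
               if assign[i] == k and assign[j] == m)


def max_projection_rhs(W, assign, k, m):
    """Largest right-hand side of (18.25) over all subsets K, by enumeration."""
    n = len(W)
    # Each term is W_ij if both allocations hold, 0 if exactly one holds,
    # and -W_ij if neither holds.
    terms = [W[i][j] * ((assign[i] == k) + (assign[j] == m) - 1)
             for i, j in product(range(n), repeat=2)]
    best = 0  # empty set K, consistent with the non-negativity constraint (18.26)
    for size in range(1, len(terms) + 1):
        for subset in combinations(terms, size):
            best = max(best, sum(subset))
    return best


# Hypothetical instance: hubs at nodes 0 and 2, node 1 assigned to hub 0.
W = [[0, 2, 3], [1, 0, 4], [5, 6, 0]]
assign = [0, 0, 2]
print(hub_arc_flow(W, assign, 0, 2), max_projection_rhs(W, assign, 0, 2))
```

The maximizing subset simply collects the terms with a positive contribution, which are exactly the pairs routed on the hub arc.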

An alternative way to project out the path variables \(x_{ijkm}\) is to use Benders decomposition (BD) to obtain a valid reformulation in the space of the original \(z_{ik}\) variables. In particular, the Benders reformulation of the arc-based formulation is:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{k \in \mathcal{N}}f_{k}z_{kk} + \sum_{i,k \in \mathcal{N}} (\chi O_i + \delta D_i)d_{ik} z_{ik} + \eta \\ \mbox{subject to} & &\displaystyle \mbox{(18.16)--(18.18)} \\ & &\displaystyle \sum_{k \in \mathcal{N}} z_{kk} \geq 1 \qquad {} \end{array} \end{aligned} $$
(18.27)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \eta \geq \sum_{i,k \in \mathcal{N}} a^r_{ik} z_{ik} \qquad r=1,\dots,|P_D|, {} \end{array} \end{aligned} $$
(18.28)

where P D is the set of extreme points of the dual subproblem associated with constraints (18.21)–(18.22). Even though there is an exponential number of constraints (18.28), non-dominated cuts can be efficiently separated with ad hoc algorithms that rely on the solution of linear and network flow problems.

3.3 r-Allocation

The r-allocation strategy provides flexibility in the design of hub networks without explicitly considering set-up costs on access arcs. It has as particular cases both single and multiple assignment strategies. Flow-based and arc-based formulations can also be adapted to model HLPs with r-allocation.

In the case of the flow-based formulation, we combine the location/allocation variables z ik from the single assignments variant with the flow variables U ik, Y ikm, and X ijm from the multiple assignments variant to model the collection, transfer, and distribution legs, respectively. Similarly to the multiple assignments strategy, we also need the U ik variables for the collection leg, as it is no longer possible to model it using the allocation variables z ik. Using these sets of variables, we obtain the following flow-based formulation:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & \quad &\displaystyle \sum_{k \in \mathcal{N}}f_{k}z_{kk}+\sum_{i,k \in \mathcal{N}} \chi d_{ik} U_{ik} + \sum_{i,k,m \in \mathcal{N}} \alpha d_{km}Y_{ikm} + \sum_{i,j,m \in \mathcal{N}} \delta d_{mj} X_{ijm} \\ \mbox{subject to} & &\displaystyle \mbox{(18.1)--(18.3), (18.6), (18.17)--(18.18)} \\ & &\displaystyle \sum_{k \in \mathcal{N}} z_{ik} \leq r \qquad i \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.29)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle U_{ik} \leq O_i z_{ik} \qquad i, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.30)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle X_{ijm} \leq W_{ij} z_{jm} \qquad i,j,m \in \mathcal{N}. {} \end{array} \end{aligned} $$
(18.31)

Constraints (18.29) ensure that each O/D node is allocated to at most r hub facilities, whereas constraints (18.30) and (18.31) state that flow can be routed on access arcs (i, k) and (j, m), respectively, only if they have been activated. These constraints are equivalent to constraints (18.4) and (18.5) from the multiple assignment variant but yield stronger bounds. Note that modeling HLPs with an r-allocation strategy requires combining not only the sets of variables but also constraints (18.1)–(18.3) and (18.6) from the multiple assignments variant with constraints (18.17)–(18.18) from the single assignments variant.

In the case of the arc-based formulation, the location/allocation z ik variables and the routing variables x ijkm from the single assignments variant are enough to model the problem. The formulation is as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & \quad &\displaystyle \sum_{k \in \mathcal{N}}f_{k}z_{kk} + \sum_{i,j,k,m \in \mathcal{N}} W_{ij}\left(\chi d_{ik} + \alpha d_{km} + \delta d_{mj}\right)x_{ijkm} \\ \mbox{subject to} & &\displaystyle \mbox{(18.17)--(18.18), (18.29)} \\ & &\displaystyle \sum_{k \in \mathcal{N}} \sum_{m \in \mathcal{N}}x_{ijkm}= 1 \qquad \forall\; i, j \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.32)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle \sum_{m \in \mathcal{N}} x_{ijkm} \leq z_{ik} \qquad i, j, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.33)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle \sum_{k \in \mathcal{N}} x_{ijkm} \leq z_{jm} \qquad i, j, m \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.34)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \quad &\displaystyle x_{ijkm} \geq 0 \qquad i, j, k, m \in \mathcal{N}. {} \end{array} \end{aligned} $$
(18.35)

Constraints (18.32) state that the flow associated with each node pair must be routed using one O/D path. Constraints (18.33) and (18.34) state that only O/D paths associated with active access arcs can be used to route commodities.
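Because the routing variables are uncapacitated, once the allocations are fixed each O/D pair can simply be routed on the cheapest path among those permitted by constraints (18.33) and (18.34). The Python sketch below illustrates this per-unit computation. The function name `cheapest_route`, the dictionary `alloc` (the hubs to which each node is allocated), and the small line-shaped instance are hypothetical and serve only as an illustration.

```python
def cheapest_route(i, j, alloc, d, chi, alpha, delta):
    """Cheapest per-unit path (cost, k, m) for pair (i, j).

    alloc[i] lists the hubs with z_ik = 1, so only paths (i, k, m, j) with
    k in alloc[i] and m in alloc[j] satisfy (18.33)-(18.34).
    """
    return min(
        (chi * d[i][k] + alpha * d[k][m] + delta * d[m][j], k, m)
        for k in alloc[i]
        for m in alloc[j]
    )


# Hypothetical instance: four nodes on a line, hubs located at nodes 1 and 2.
pos = [0, 1, 5, 6]
d = [[abs(a - b) for b in pos] for a in pos]
r = 2
alloc = {0: [1], 1: [1], 2: [2], 3: [1, 2]}   # node 3 uses both hubs
assert all(len(hubs) <= r for hubs in alloc.values())  # constraint (18.29)

best = cheapest_route(0, 3, alloc, d, chi=1.0, alpha=0.5, delta=1.0)
```

Multiplying the returned cost by W ij gives the contribution of the pair to the objective function.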

As expected, these formulations considering such flexible allocation strategies tend to be more difficult to solve as compared to the best formulations presented in Sects. 3.1 and 3.2 for specific multiple and single allocation variants.

4 Hub Network Design Problems

Full interconnection between hub nodes may be prohibitive in applications where there is a considerable set-up cost associated with the hub arcs. To overcome this drawback of standard HLPs, several problems considering incomplete hub networks have been studied. Formulating and solving more general HNDPs is a bigger challenge than for standard HLPs. This is because HNDPs involve additional design decisions, such as the activation of hub, access, and bridge arcs, as well as non-trivial routing decisions. It is no longer possible to state HNDPs as quadratic extensions of facility location problems but rather as extensions of MNDPs in which node selection (i.e., location) decisions need to be taken into account. HNDPs are a class of network optimization problems known to be significantly more difficult to solve in practice than facility location problems. One of the main reasons is that O/D paths may contain more than three arcs and visit more than two hubs. As a consequence, they can no longer be determined mainly by the allocation decisions. Flow conservation constraints and additional design variables for arc selection decisions are now needed to explicitly model O/D paths in both flow-based and arc-based formulations. This has a negative impact on the quality of the LP bounds associated with these formulations when compared to the LP bounds obtained with standard HLPs.

In the first part of this section we concentrate on a particular class of HNDPs, referred to as hub arc location problems (HALPs), which have as key decisions the location of hub arcs. These problems retain some of the assumptions used in hub location models, especially the ones that relate to the cost structure and allocation patterns, to simplify the design decisions at the access-level network and to focus on the design decisions at the hub-level network. In the second part we study HALPs that consider specific hub network topologies arising from various applications and highlight how these topologies impact the routing decisions.

4.1 Hub Arc Location Problems

A fundamental difference between HALPs and HLPs is that solution networks may no longer have a fully interconnected hub-level network. HALPs explicitly consider link activation decisions on hub and bridge arcs. Additional restrictions may be imposed on the topology of hub-level networks. However, an important simplifying assumption that is retained from HLPs, as compared with more general HNDPs, is that they do not involve non-trivial link activation decisions on access arcs. That is, similar to HLPs, assignment patterns determine the design decisions in the access-level network. As a result, single, multiple, and r-allocation HALP variants can be considered.

In HALPs hubs are not necessarily fully interconnected, either because of the set-up costs on the hub arcs or because additional conditions on the network topology are imposed. This causes O/D paths to become more involved, since they may use more than three arcs and visit more than two hubs. Similar to HLPs, because of Assumptions 2 and 4, each collection and distribution leg, if present, employs either one access arc or one bridge arc. However, the transfer leg can now use several bridge and hub arcs, depending on the particular assumptions considered on the structure of O/D paths.

To reduce the added complexity of the routing decisions in HALPs, an additional assumption, referred to as the one-hub-arc O/D path assumption, can be considered. It states that O/D paths must contain at most one hub arc on the transfer leg. In turn, this limits paths to at most three arcs, where the first and last arcs are either access or bridge arcs and the intermediate arc, if it exists, is a hub arc. This assumption is used to duplicate the level of service obtained in HLPs and is also consistent with practice. In air transportation, for example, it ensures that a passenger will never have to change flights more than twice. In ground transportation, it is convenient to restrict the number of break-bulk terminals that each commodity has to pass through so as to reduce handling and congestion at terminals and to provide a form of performance guarantee. O/D paths are once more of the form (i, k, m, j), and we can thus define their associated flow costs as \(W_{ij}\left (\chi d_{ik} + \alpha d_{km} + \delta d_{mj}\right )\).

In what follows we first describe some HALPs that consider the one-hub-arc assumption. We then discuss other more general HALPs that do not consider any assumption on the structure of O/D paths. In particular, we show how the routing decisions become more involved, given that one must determine whether or not a discount applies between two hub nodes.

4.1.1 Models with One-Hub-Arc O/D Paths

These problems do not consider set-up costs on the activation of hub nodes and hub arcs. Instead, they consider a cardinality constraint on the number of hub arcs in the solution network. The selected hub arcs induce a set of hub nodes, but there is no limit on the number of activated hubs. These HALPs consider multiple assignments and the goal is to minimize the total flow cost.

Given that in this case bridge arcs can only exist in the collection or distribution legs, a flow-based formulation can be obtained by using the same set of flow variables U ik, Y ikm, and X ijm used in HLPs to model flows passing on the collection, transfer, and distribution legs, respectively. In addition, for \((k,m) \in \mathcal {E}\), we define binary variables y km equal to one if and only if hub arc (k, m) is selected. Using these sets of variables, we can formulate the problem as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{i,k \in \mathcal{N}} \chi d_{ik} U_{ik} + \sum_{i,k,m \in \mathcal{N}} \alpha d_{km}Y_{ikm} + \sum_{i,j,m \in \mathcal{N}} \delta d_{mj} X_{ijm} \\ \mbox{subject to} & &\displaystyle \mbox{(18.1)--(18.2), (18.4)--(18.7)} \\ & &\displaystyle \sum_{(k,m) \in \mathcal{E}} y_{km} = q {} \end{array} \end{aligned} $$
(18.36)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle z_k \leq \sum_{(k,m) \in \mathcal{E}} y_{km} \qquad k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.37)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle U_{ik} = \sum_{m \in \mathcal{N}} Y_{ikm} \qquad i, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.38)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{m \in \mathcal{N}} Y_{imk} = \sum_{j \in \mathcal{N}} X_{ijk} \qquad i, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.39)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle Y_{ikk} \leq O_i z_k \qquad i, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.40)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle Y_{ikm} + Y_{imk} \leq M_{ikm} y_{km} \qquad i\in \mathcal{N}, (k,m) \in \mathcal{E} {} \end{array} \end{aligned} $$
(18.41)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle y_{km} \in \{0,1\} \qquad (k,m) \in \mathcal{E}, {} \end{array} \end{aligned} $$
(18.42)

where M ikm ≤ O i is an upper bound on the amount of flow originated at i that can be routed via hub arc (k, m). Constraints (18.36) force the number of selected hub arcs to be equal to q, whereas constraints (18.37) ensure that a location variable z k is activated only if there exists at least one hub arc incident to node k. These constraints, in combination with (18.4)–(18.7) and (18.40), ensure that flow variables U ik, Y ikm, and X ijm are used only at open hubs. Constraints (18.41) guarantee that flow is routed via two different hub nodes with a discounted cost only if the associated hub arc is selected. Finally, constraints (18.38) and (18.39) are flow conservation constraints for each node i and each potential hub k. This formulation contains O(n 3) variables and O(n 3) constraints.

A more general class of HALPs with multiple assignments has also been studied. In particular, these problems contain both set-up costs and cardinality constraints on hub arcs and hub nodes. For each \(e \in \mathcal {E}\), g e denotes the set-up cost for selecting hub arc e. This class of HALPs consists of locating a set of at most q hub arcs (q ≥ 1) that induce a set of at most p hub nodes (p ≥ 2), and of determining the routing of commodities through the hub network, with the objective of minimizing the total set-up and flow cost.

Taking into account the one-hub-arc assumption, we define the cost for routing W ij when using hub arc e = (k, m) as \(F_{eij}= W_{ij} \min \left \lbrace F^1_{eij}, F^2_{eij}, F^3_{eij}, F^4_{eij} \right \rbrace ,\) where

$$\displaystyle \begin{aligned}\begin{array}{ll} F^1_{eij} = \chi d_{ik}+\alpha d_{km}+\delta d_{mj}; & F^2_{eij} = \chi d_{im}+\alpha d_{mk}+\delta d_{kj}; \\ F^3_{eij} =\chi d_{ik}+\delta d_{kj}; & F^4_{eij} = \chi d_{im}+\delta d_{mj}. \end{array}\end{aligned} $$

Note that the definition of F eij uses some properties of the considered multiple assignments pattern and cost structure. First, every commodity uses at most one direction of a hub arc, the one with lower flow cost. It is thus possible to know a priori how the end hub nodes of a given hub arc e would be connected with origin i and destination j, in case such commodity is routed via hub arc e. Second, no commodity will be routed through a hub arc whenever it is cheaper to route it through only one of its hub nodes. Therefore, some O/D paths may not contain a transfer leg (i.e., a hub arc).
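The definition of F eij can be evaluated directly as the minimum over the two orientations of the hub arc and the two single-hub paths. The Python sketch below does so for one commodity. The function name `hub_arc_routing_cost`, the nested-list data layout, and the line-shaped instance are illustrative assumptions, and distances are assumed to be symmetric.

```python
def hub_arc_routing_cost(i, j, k, m, W, d, chi, alpha, delta):
    """Cost of routing demand W[i][j] when hub arc e = (k, m) is available."""
    f1 = chi * d[i][k] + alpha * d[k][m] + delta * d[m][j]  # i -> k -> m -> j
    f2 = chi * d[i][m] + alpha * d[m][k] + delta * d[k][j]  # i -> m -> k -> j
    f3 = chi * d[i][k] + delta * d[k][j]                    # through hub k only
    f4 = chi * d[i][m] + delta * d[m][j]                    # through hub m only
    return W[i][j] * min(f1, f2, f3, f4)


# Hypothetical instance: four nodes on a line, two units of demand per pair.
pos = [0, 1, 5, 6]
d = [[abs(a - b) for b in pos] for a in pos]
W = [[2] * 4 for _ in range(4)]

via_hub_arc = hub_arc_routing_cost(0, 3, 1, 2, W, d, chi=1.0, alpha=0.5, delta=1.0)
via_one_hub = hub_arc_routing_cost(0, 1, 1, 2, W, d, chi=1.0, alpha=0.5, delta=1.0)
```

In the first call the commodity uses the hub arc in the direction from node 1 to node 2, whereas in the second call it is cheaper to route through hub 1 only, so the path contains no transfer leg.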

For each \(i,j \in \mathcal {N}\) and \(e\in \mathcal {E}\) we define (undirected) routing variables x ije equal to 1 if and only if the demand with origin i and destination j is routed via hub arc e. Using these variables, an arc-based formulation for this class of HALPs can be obtained as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{i \in \mathcal{N}}f_i z_i + \sum_{e \in \mathcal{E}}g_{e}y_{e} + \sum_{i,j \in \mathcal{N}}\sum_{e \in \mathcal{E}} F_{eij} x_{ije} \\ \mbox{subject to} & &\displaystyle \sum_{e \in \mathcal{E}}y_{e} \le q \quad \qquad {} \end{array} \end{aligned} $$
(18.43)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{i\in \mathcal{N}}z_{i} \le p \quad \qquad {} \end{array} \end{aligned} $$
(18.44)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{e\in \mathcal{E}}x_{ije} = 1 \qquad \forall i,j \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.45)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle x_{ije} \le y_{e} \qquad \forall e \in \mathcal{E}, i,j \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.46)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle y_{e} \le z_k \quad \qquad \forall e=(k,m) \in \mathcal{E}{} \end{array} \end{aligned} $$
(18.47)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle y_{e} \le z_m \quad \qquad \forall e=(k,m) \in \mathcal{E}{} \end{array} \end{aligned} $$
(18.48)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle y_{e}, z_i \in \{0, 1\}\; \qquad \forall e \in \mathcal{E}, i \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.49)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle x_{ije} \geq 0 \qquad \forall e \in \mathcal{E}, i,j \in \mathcal{N}. {} \end{array} \end{aligned} $$
(18.50)

Constraints (18.43) and (18.44) state the maximum cardinality constraint on the hub arcs and hub nodes, respectively. Constraints (18.45) guarantee that every commodity is assigned to exactly one hub arc, whereas (18.46) allow commodities to be routed only via selected hub arcs. Constraints (18.47) and (18.48) ensure that the end nodes of hub arcs are open hub nodes. This formulation has O(n 4) variables and O(n 4) constraints.
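For very small instances, the structure of this formulation can be illustrated by plain enumeration: once a set of hub arcs is selected, the induced hubs are the end nodes of those arcs and every commodity is assigned to its cheapest selected hub arc. The Python sketch below is such an enumeration and is not a practical solution method. The function name, the data layout, and the numerical values are hypothetical, and hub set-up costs are assumed to be non-negative so that only induced hubs are opened.

```python
from itertools import combinations


def solve_halp_by_enumeration(F, f, g, edges, p, q):
    """Enumerate hub-arc subsets with at most q arcs and at most p induced hubs.

    F[e][(i, j)] : routing cost of commodity (i, j) via hub arc index e
    f[h], g[e]   : set-up costs of hub node h and hub arc e
    edges[e]     : end nodes (k, m) of hub arc e
    Returns (best cost, tuple of selected hub-arc indices).
    """
    pairs = list(F[0].keys())
    best = (float("inf"), None)
    for size in range(1, q + 1):            # at least one arc, as (18.45) requires
        for S in combinations(range(len(edges)), size):
            hubs = {h for e in S for h in edges[e]}   # end nodes must be open hubs
            if len(hubs) > p:
                continue
            cost = sum(f[h] for h in hubs) + sum(g[e] for e in S)
            cost += sum(min(F[e][ij] for e in S) for ij in pairs)
            if cost < best[0]:
                best = (cost, S)
    return best


# Hypothetical data: three candidate hub arcs and two commodities.
edges = [(0, 1), (0, 2), (1, 2)]
F = [{(0, 1): 4, (1, 2): 10}, {(0, 1): 6, (1, 2): 9}, {(0, 1): 8, (1, 2): 3}]
f, g = [1, 1, 1], [2, 2, 2]

two_arcs = solve_halp_by_enumeration(F, f, g, edges, p=3, q=2)
one_arc = solve_halp_by_enumeration(F, f, g, edges, p=2, q=2)
```

In this example, limiting the number of hubs to two excludes every pair of hub arcs, since any two distinct arcs on three nodes induce three hubs.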

This general class of HALPs can be stated as the minimization of a real-valued supermodular set function. This fundamental property, which is also known for other types of facility location problems (Wolsey 1983), can be exploited to develop formulations. In particular, using supermodular properties, it is possible to completely eliminate the routing variables x ije from the above formulation. For each \(i,j\in \mathcal {N}\), we order the elements of \(\mathcal {E}\) by non-decreasing values of their coefficients F ije, and we denote by e r the r-th element according to that ordering. That is, \(F_{ije_{1}} \leq F_{ije_{2}} \leq \dots \leq F_{ije_{|\mathcal {E}|}} \leq F_{ije_{|\mathcal {E}|+1}}\), where \(F_{ije_{|\mathcal {E}|+1}}=F_{ij{e^*}}\) is the cost for the fictitious edge e* such that (1) \(F_{ij{e^*}}>\max _{e\in \mathcal {E}}F_{ije}\), for all \(i,j\in \mathcal {N}\); and (2) \(\sum _{i,j \in \mathcal {N}}F_{ij{e^*}}>\max _{e\in \mathcal {E}}(f_e+\sum _{i,j \in \mathcal {N}}F_{ije})\). This assumption guarantees that at least one hub arc variable y e is at value one in any optimal solution. A formulation for this class of HALPs is as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{i \in \mathcal{N}}f_i z_i + \sum_{e\in \mathcal{E}}g_{e}y_{e} + \sum_{i,j \in \mathcal{N}}\eta_{ij} \\ \mbox{subject to} & &\displaystyle \mbox{(18.43)--(18.44), (18.47)--(18.49)} \\ & &\displaystyle \eta_{ij} \ge F_{ije_r}+\sum_{e\in \mathcal{E}}(F_{ije}-F_{ije_r})^{-}\; y_{e} \quad r=1,\dots,|\mathcal{E}|+1, \ i,j\in \mathcal{N}, {}\\ \end{array} \end{aligned} $$
(18.51)

where η ij are continuous decision variables used to evaluate the flow cost of O/D pair (i, j) and \((x)^{-}= \min \left \lbrace 0, x \right \rbrace \). Constraints (18.51) are the so-called supermodular constraints, which compute the flow cost for each O/D pair by taking into account only the set of open hub arcs. This formulation has only O(n 2) variables and O(n 4) constraints.
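To see why constraints (18.51) evaluate the flow cost correctly, note that for a binary vector y the largest right-hand side over all r equals the cheapest routing cost among the open hub arcs, or the cost of the fictitious edge when no hub arc is open. The Python sketch below checks this numerically for a single O/D pair. The function name `supermodular_bound` and the numerical data are hypothetical, and the negative part of x is taken as min{0, x}.

```python
from itertools import combinations


def supermodular_bound(F_ij, open_arcs, F_star):
    """Largest right-hand side of (18.51) for one O/D pair.

    F_ij      : routing costs F_ije, one entry per candidate hub arc e
    open_arcs : indices e with y_e = 1
    F_star    : cost of the fictitious edge, assumed to exceed max(F_ij)
    """
    breakpoints = sorted(F_ij) + [F_star]   # costs in non-decreasing order
    best = float("-inf")
    for F_r in breakpoints:
        rhs = F_r + sum(min(0.0, F_ij[e] - F_r) for e in open_arcs)
        best = max(best, rhs)
    return best


# Hypothetical costs of one O/D pair over four candidate hub arcs.
F_ij = [7.0, 3.0, 5.0, 9.0]
F_star = 100.0

# The bound equals the cheapest open hub arc for every non-empty subset.
for size in range(1, len(F_ij) + 1):
    for S in combinations(range(len(F_ij)), size):
        assert supermodular_bound(F_ij, S, F_star) == min(F_ij[e] for e in S)
```

With the hub arcs of cost 7 and 5 open, for instance, the inequalities for r = 1, ..., 5 give right-hand sides 3, 5, 5, 3, and -88, so the minimization sets η ij to 5.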

4.1.2 Models with Arbitrary O/D Paths

We now focus on a general class of HALPs that relax the one-hub-arc O/D path assumption and allow paths to contain more than one hub/bridge arc on the transfer leg. A flow-based formulation can be obtained by using the same set of flow variables U ik, Y ikm, and X ijm as before plus an additional set of flow variables B ikm, \(i,k,m \in \mathcal {N}\), equal to the amount of flow originated at node i and passing through bridge arc (k, m). Let β denote the unit flow cost of bridge arcs, where β > α. A flow-based formulation can be stated as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{i,k \in \mathcal{N}} \chi d_{ik} U_{ik} + \sum_{i,k,m \in \mathcal{N}}d_{km}\left( \alpha Y_{ikm} + \beta B_{ikm}\right) + \sum_{i,j,m \in \mathcal{N}} \delta d_{mj} X_{ijm} \\ \mbox{subject to} & &\displaystyle \mbox{(18.1)--(18.2), (18.4)--(18.7), (18.36), (18.37), (18.40), (18.41)} \\ & &\displaystyle U_{ik} + \sum_{m \in \mathcal{N}} \left( Y_{imk} + B_{imk}\right) \\ & &\displaystyle = \sum_{m \in \mathcal{N}} \left( Y_{ikm} + B_{ikm}\right) +\sum_{j \in \mathcal{N}} X_{ijk} \quad i, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.52)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle B_{ikm} \leq M_{ikm} z_{k} \qquad i, k, m \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.53)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle B_{ikm} \leq M_{ikm} z_{m} \qquad i, k, m \in \mathcal{N} {} \\ & &\displaystyle B_{ikm} \geq 0 \qquad i, k, m \in \mathcal{N}. \end{array} \end{aligned} $$
(18.54)

Constraints (18.52) correspond to the flow conservation equations for each origin i and potential hub node k. Note that the flow entering into a hub can come either directly from the origin i or from other hub nodes via hub arcs or bridge arcs. Similarly, the flow leaving the node can go either directly to destination nodes j or to other hub nodes via hub arcs and bridge arcs. Constraints (18.53) and (18.54) ensure that bridge arcs are used only between open hub nodes. This formulation contains O(n 3) variables and O(n 3) constraints.

Arc-based formulations can also be adapted for this class of HALPs. Given that O/D paths can no longer be characterized by using only the routing variables x ijkm, as is the case in HLPs with multiple assignments and HALPs with one-hub-arc O/D paths, we need to combine them with other variables to properly model O/D paths. In particular, we use the U ik and X ijm variables used in previous flow-based formulations to model the collection and distribution legs, respectively, together with the routing variables x ijkm that state whether the hub arc (k, m), and its associated discounted cost, is used to route the demand associated with node pair i, j. In addition, we define the (non-discounted) routing variables b ijkm equal to one if and only if the flow with origin i and destination j uses bridge arc (k, m). Note that both x ijkm and b ijkm are required to properly model the transfer leg. An arc-based formulation can be stated as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{i,j \in \mathcal{N}} W_{ij} \left( \sum_{k \in \mathcal{N}} \chi d_{ik} U_{ijk} + \sum_{k,m \in \mathcal{N}}d_{km}\left( \alpha x_{ijkm} + \beta b_{ijkm}\right) + \sum_{m \in \mathcal{N}} \delta d_{mj} X_{ijm}\right) \\ \mbox{subject to} & &\displaystyle \mbox{(18.7), (18.36) and (18.37)} \\ & &\displaystyle \sum_{k \in \mathcal{N}} U_{ijk} = 1 \qquad i,j \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.55)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{m \in \mathcal{N}} X_{ijm} = 1 \qquad i,j \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.56)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle U_{ijk} + \sum_{m \in \mathcal{N}} \left( x_{ijmk} + b_{ijmk}\right) \\ & &\displaystyle = \sum_{m \in \mathcal{N}} \left(x_{ijkm} + b_{ijkm}\right) + X_{ijk} \quad i, j, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.57)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle U_{ijk} \leq z_k \qquad i,j, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.58)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle X_{ijm} \leq z_m \qquad i,j,m \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.59)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle x_{ijkm} + x_{ijmk} \leq y_{km} \qquad i, j \in \mathcal{N}, (k,m) \in \mathcal{E} {} \end{array} \end{aligned} $$
(18.60)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{m \in \mathcal{N}} b_{ijkm} \leq z_{k} \qquad i,j, k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.61)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{k \in \mathcal{N}} b_{ijkm} \leq z_{m} \qquad i,j, m \in \mathcal{N} {} \\ & &\displaystyle U_{ijk}, X_{ijk} \geq 0 \qquad i, j, k \in \mathcal{N}. \\ & &\displaystyle x_{ijkm}, b_{ijkm} \geq 0 \qquad i, j, k, m \in \mathcal{N}. \end{array} \end{aligned} $$
(18.62)

Constraints (18.55)–(18.57) correspond to the flow conservation equations for each node pair and potential hub. Constraints (18.58)–(18.59) force the flow on the access arcs to be routed via open hubs. Constraints (18.60) ensure that discounted costs apply on the transfer leg only for the selected hub arcs, whereas (18.61) and (18.62) allow bridge arcs to be used only between open hub nodes. This formulation has O(n 4) variables and O(n 4) constraints.

Note that none of the HLPs and HALPs discussed so far required flow conservation constraints in their arc-based formulations. All previous formulations exploited, in one way or another, the property (or assumption) that O/D paths can be characterized by the hubs to which origins and destinations are assigned. Note that not only has the required number of variables doubled, but several additional constraints are also needed to model feasible O/D paths.

4.2 Specific Hub Network Topologies

We now focus on HNDPs that consider specific hub network topologies emerging from various applications in transportation and telecommunications. In particular, we study four topologies: star-star hub networks, tree-star hub networks, cycle-star hub networks, and hub line networks. We describe the main applications associated with these topologies and provide formulations that exploit their structure.

4.2.1 Star-Star Hub Networks

A star-star hub network consists of a set of hub nodes directly connected to a central hub node (i.e., the hub-level network is a star). Each O/D node is connected to a hub node, creating a set of stars at the access-level network (see Fig. 18.2a). Applications of such networks arise in the design of satellite communication networks (Helme and Magnanti 1989), where homing stations (hub facilities) containing an earth station and a local switch are used in combination with terrestrial and satellite links to connect node pairs. Nodes connected to the same homing station communicate through the local switch, whereas nodes connected to different homing stations use their assigned earth stations and the satellite. Other applications of star-star hub networks arise in the area of cargo delivery. Yaman (2008) provides a concrete application associated with one of the largest cargo delivery companies in Turkey, in which a star-star hub network with central hub located in Ankara is used. Commodities originated at a city are sent to a single hub. At the hub, cargo arriving from different cities is collected and sorted. If the destination is served by the same hub, the cargo is routed directly to its destination. Otherwise, the cargo is sent to a central hub facility where it is further routed to the hub of the destination and eventually to its destination. Hub arcs are served with higher capacity trucks or cargo airplanes.

Fig. 18.2 Structure of (a) cycle-star, (b) star-star, (c) tree-star, and (d) line hub networks

Let 0 denote the central hub, which has already been located, and let d k0 denote the unit flow cost between hub k and node 0. Assuming a star structure at the hub-level network simplifies, to some extent, the hub arc selection and routing decisions. For instance, arc selection decisions are determined by the location decisions: if a hub is located at node k, the hub arc (k, 0) will be activated. Moreover, exactly two types of paths exist to connect node pairs. On the one hand, if two nodes i and j are assigned to the same hub k, then the flow from node i to node j will follow the path (i, k, j), containing only two access arcs and no hub arcs. On the other hand, if node i is assigned to hub k and node j is assigned to hub m ≠ k, then the flow from i to j will follow the path (i, k, 0, m, j). That is, it will contain two access arcs and two hub arcs. This means that in order to compute the flow cost for each O/D pair we only need to know which type of path will be used. It is possible to exploit this feature to model star-star hub networks as follows. For each \(k \in \mathcal {N}\) and \(i,j \in \mathcal {N}\), i ≠ j, we define the variable u ijk equal to one if and only if exactly one of the nodes i and j is assigned to hub k. That is, when u ijk is equal to one, flows between nodes i and j (in both directions) are routed on the hub arc (k, 0). Using these variables in combination with the location/allocation variables z ik we obtain the following formulation:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{k \in \mathcal{N}}f_{k}z_{kk}{+}\sum_{i,k \in \mathcal{N}}(\chi O_i {+} \delta D_i)d_{ik}z_{ik} {+} \sum_{k \in \mathcal{N}}\sum_{i,j \in \mathcal{N} : i\neq j} \left( W_{ij} {+}W_{ji} \right) \alpha d_{k0}u_{ijk} \\ \mbox{subject to} & &\displaystyle \mbox{(18.16)--(18.18)} \\ & &\displaystyle u_{ijk} \geq z_{ik} - z_{jk} \quad k \in \mathcal{N}, i,j \in \mathcal{N}, i \neq j {} \end{array} \end{aligned} $$
(18.63)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle u_{ijk} \geq z_{jk} - z_{ik} \quad k \in \mathcal{N}, i,j \in \mathcal{N}, i \neq j. {} \end{array} \end{aligned} $$
(18.64)

Note that constraints (18.63)–(18.64) are used to model the nonlinear term u ijk = |z ik − z jk| that makes u ijk variables equal to one whenever two O/D nodes are assigned to different hubs. This formulation contains O(n 3) variables and O(n 3) constraints.
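The role of the u ijk variables can be illustrated by evaluating the transfer cost of a given single assignment. In the Python sketch below, u ijk is set to |z ik − z jk|, the smallest value allowed by (18.63)–(18.64). The function name `star_star_transfer_cost`, the data layout, and the instance are hypothetical. As a further assumption of this sketch, each unordered pair {i, j} is counted once, so that the term (W ij + W ji) is not counted twice.

```python
def star_star_transfer_cost(hub_of, W, d0, alpha):
    """Transfer cost of a star-star network computed from the u_ijk variables.

    hub_of[i] : hub to which node i is assigned (single assignment)
    W[i][j]   : demand from i to j
    d0[k]     : unit flow cost between hub k and the central hub 0
    """
    n = len(W)
    hubs = sorted(set(hub_of))
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):           # unordered pairs (sketch assumption)
            for k in hubs:
                z_ik = 1 if hub_of[i] == k else 0
                z_jk = 1 if hub_of[j] == k else 0
                u_ijk = abs(z_ik - z_jk)    # smallest u allowed by (18.63)-(18.64)
                total += (W[i][j] + W[j][i]) * alpha * d0[k] * u_ijk
    return total


# Hypothetical instance: nodes 0, 1 use hub 1 and nodes 2, 3 use hub 3.
hub_of = [1, 1, 3, 3]
W = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
d0 = {1: 3.0, 3: 5.0}

cost = star_star_transfer_cost(hub_of, W, d0, alpha=0.5)
```

Each of the four pairs assigned to different hubs uses both hub arcs (1, 0) and (3, 0), whereas the two pairs sharing a hub contribute nothing to the transfer cost.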

An alternative formulation can be obtained by using projection inequalities similar to the ones introduced in Sect. 3.2 for HLPs with single assignments. In particular, for each \(k \in \mathcal {N}\) we define continuous variables y k equal to the amount of flow routed on hub arc (k, 0). We can then formulate this problem as:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{k \in \mathcal{N}}f_{k}z_{kk}+\sum_{i,k \in \mathcal{N}}(\chi O_i + \delta D_i)d_{ik}z_{ik} + \sum_{k \in \mathcal{N}} \alpha d_{k0}y_{k} \\ \mbox{subject to} & &\displaystyle \mbox{(18.16)--(18.18)} \\ & &\displaystyle y_{k} \geq \sum_{(i,j) \in K} \left( W_{ij}+W_{ji} \right) \left(z_{ik} - z_{jk}\right) \qquad k \in \mathcal{N}, K \subseteq \mathcal{N} \times \mathcal{N}, {} \end{array} \end{aligned} $$
(18.65)

where constraints (18.65) have the same interpretation as (18.25), i.e., to compute the amount of flow routed on hub arc (k, 0). This formulation has only O(n 2) variables but an exponential number of constraints.

4.2.2 Tree-Star Hub Networks

A tree-star hub network consists of a set of hub nodes connected via a spanning tree (i.e., the hub-level network is a tree). Each O/D node is assigned to exactly one hub, creating a set of stars at the access-level network (see Fig. 18.2b). Potential applications of such networks arise in the design of digital data service networks (Lee et al. 1996), where private service networks are constructed for individual organizations by connecting customer sites to digital switching offices (hub facilities) with bridging capabilities. These hubs are connected with fiber optic links and, given that there is a very high set-up cost associated with these links, service providers usually consider tree topologies to minimize the number of links required to provide connection services between customer sites. Other applications of tree-star hub networks arise in the design of rapid transit systems. Contreras et al. (2010) give a concrete example in the design of the high-speed train network in Spain, which has been designed with a tree structure; it is intended that, when finished, each city (O/D node) with more than 10,000 inhabitants will be within 50 km of some high-speed train station (hub facility). Kim and Tcha (1992) provide additional applications of tree-star networks in the design of community access television systems (CATV).

Contrary to star-star topologies, arc selection and routing decisions become more involved in the case of tree-star topologies as these cannot be determined by the location/allocation decisions. In fact, even if the location of hubs and allocation of O/D nodes to hubs is given, the problem is still NP-hard as it reduces to the well-known optimum communication spanning tree problem (Hu 1974; Zetina et al. 2019). Before discussing formulations, we define the graph of flows \(\mathcal {G}_F = (\mathcal {N}, \mathcal {E}_{F})\), as the undirected graph with node set \(\mathcal {N}\) and an edge associated with each pair \((i,j) \in \mathcal {N} \times \mathcal {N}\) such that W ij + W ji > 0. We assume that \(\mathcal {G}_F\) is made up of a single connected component since otherwise the problem can be decomposed into several independent ones, one for each connected component in \(\mathcal {G}_F\). Whenever a particular application requires a single tree and the graph of flows contains more than one connected component, we can replace the flows of value zero with W ij = 𝜖 > 0 sufficiently small.

Hub arc variables \(y_{km}\) used in HALPs can also be employed to construct the tree of hubs. Moreover, flow conservation constraints are explicitly included in formulations to model O/D paths. A flow-based formulation for the design of tree-star hub networks can be stated as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{i,k \in \mathcal{N}} \left(\chi O_i + \delta D_i \right)d_{ik} z_{ik} + \sum_{i,k,m \in \mathcal{N}} \alpha d_{km}Y_{ikm} \\ \mbox{subject to} & &\displaystyle \mbox{(18.15)--(18.19), (18.41)--(18.42)} \\ & &\displaystyle \sum_{k \in \mathcal{N}} z_{kk} = p {} \end{array} \end{aligned} $$
(18.66)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{(k,m) \in \mathcal{E}} y_{km} = p -1 {} \end{array} \end{aligned} $$
(18.67)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle z_{km} + y_{km} \leq z_{mm} \qquad (k,m) \in \mathcal{E} {} \end{array} \end{aligned} $$
(18.68)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle z_{mk} + y_{km} \leq z_{kk} \qquad (k,m) \in \mathcal{E}. {} \end{array} \end{aligned} $$
(18.69)

Constraints (18.66) and (18.67) ensure that exactly p hub nodes and p − 1 hub arcs are selected, respectively. In combination with flow conservation constraints (18.15), they guarantee that the selected p − 1 hub arcs define a single connected component associated with the p selected hubs, i.e., a tree spanning all hubs. Constraints (18.68) and (18.69) are a stronger version of the standard linking constraints (18.47) and (18.48). Finally, note that the assumption that the graph of flows \(\mathcal {G}_F\) contains a single connected component, together with (18.66), (18.67) and (18.15), eliminates the need for subtour elimination constraints. However, when direct connections are allowed between non-hub nodes, the following set of constraints needs to be included to obtain a valid formulation:

$$\displaystyle \begin{aligned} \sum_{(k,m) \in S \times S} y_{km} \leq \sum_{k \in S \setminus \left\lbrace s\right\rbrace } z_k \qquad \forall S \subseteq \mathcal{N}, s \in S. {} \end{aligned} $$
(18.70)
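
The structure these constraints impose on the hub level can be sketched as a feasibility check on a candidate solution. The following Python snippet is an illustrative sketch rather than part of the formulation: the function name `is_hub_tree` and its inputs (a collection of hubs and a list of undirected hub arcs) are assumptions. It checks the arc count of (18.67), that hub arcs only join hubs as required by (18.68)–(18.69), and the absence of cycles that (18.70) rules out, using a union-find structure.

```python
def is_hub_tree(hubs, hub_arcs):
    """Check whether hub_arcs form a spanning tree on the node set hubs."""
    hubs = set(hubs)
    arcs = {frozenset(arc) for arc in hub_arcs}  # undirected, duplicates merged

    # Counterpart of (18.67): exactly p - 1 hub arcs.
    if len(arcs) != len(hubs) - 1:
        return False
    # Counterpart of (18.68)-(18.69): both endpoints of a hub arc are hubs.
    if any(len(arc) != 2 or not arc <= hubs for arc in arcs):
        return False

    parent = {k: k for k in hubs}

    def find(k):
        # Find the representative of k, halving the path along the way.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for arc in arcs:
        k, m = tuple(arc)
        root_k, root_m = find(k), find(m)
        if root_k == root_m:
            return False  # this arc would close a cycle among the hubs
        parent[root_k] = root_m
    # p - 1 arcs without cycles on p hubs necessarily connect all hubs.
    return True
```

For instance, with four hubs, a triangle on three of them has the right number of arcs but leaves the fourth hub disconnected, and the check rejects it because the third arc closes a cycle.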

An arc-based formulation can also be used to design tree-star hub networks:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{i,k \in \mathcal{N}}(\chi O_i + \delta D_i)d_{ik}z_{ik} + \sum_{i,j,k,m \in \mathcal{N}} \alpha W_{ij}d_{km}x_{ijkm} \\ \mbox{subject to} & &\displaystyle \mbox{(18.16)--(18.18), (18.66)--(18.69)} \\ & &\displaystyle z_{ik} + \sum_{m \in \mathcal{N}} x_{ijmk} = \sum_{m \in \mathcal{N}} x_{ijkm} + z_{jk} \quad i, j, k \in \mathcal{N}. {} \end{array} \end{aligned} $$
(18.71)

Constraints (18.71) are the flow conservation equations stating that for each node pair (i, j) and potential hub k, the flow from i to j may enter node k either directly from its origin via access arc (i, k) or through another hub via a hub arc (m, k). Similarly, the flow may exit node k either directly to its destination via access arc (k, j) or through another hub via a hub arc (k, m).
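
As a small check of how (18.71) traces an O/D path, consider an illustrative case in which origin i is allocated to hub k, destination j is allocated to hub m, and the flow uses hub arcs (k, l) and (l, m), with k, l and m distinct. The intermediate hub l and this particular path are assumptions made only for the example. Then \(z_{ik} = z_{jm} = 1\) and \(x_{ijkl} = x_{ijlm} = 1\), and (18.71) reads as follows at the three hubs on the path:

$$\displaystyle \begin{aligned} \begin{array}{lrcl} \mbox{at } k: & z_{ik} + \sum_{q \in \mathcal{N}} x_{ijqk} = 1 + 0 & = & 1 + 0 = \sum_{q \in \mathcal{N}} x_{ijkq} + z_{jk}, \\ \mbox{at } l: & z_{il} + \sum_{q \in \mathcal{N}} x_{ijql} = 0 + 1 & = & 1 + 0 = \sum_{q \in \mathcal{N}} x_{ijlq} + z_{jl}, \\ \mbox{at } m: & z_{im} + \sum_{q \in \mathcal{N}} x_{ijqm} = 0 + 1 & = & 0 + 1 = \sum_{q \in \mathcal{N}} x_{ijmq} + z_{jm}. \end{array} \end{aligned} $$

At every other node both sides are zero, so no flow of the pair (i, j) enters or leaves it.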

4.2.3 Cycle-Star Hub Networks

A cycle-star hub network consists of a set of hub nodes connected with a set of hub arcs by means of a cycle. Each O/D node must be connected to exactly one hub node, creating a set of stars at the access-level network (see Fig. 18.2c). Potential applications of cycle-star hub networks arise in the design of telecommunication networks (Lee et al. 1993; Xu et al. 1999) where a number of tributary networks are connected to a backbone network via a set of hubs. Given the large set-up costs associated with the installation of a set of links, network planners usually consider the design of a network containing the minimum number of links. Although tree-star and line-star topologies are attractive network topologies for this goal, these may not be appropriate for telecommunications networks where there are requirements for the backbone network to guarantee the existence of at least one path between O/D nodes in case a backbone link fails. A cycle-star hub network ensures connectivity of the network in such a disruptive scenario while minimizing the set-up cost. Additional applications arise in the design of rapid transit systems. Network planners may be interested in the extension of public transportation networks in a metropolitan area by installing a circular rapid transit line, such as a subway, a tram or an express lane. Examples of circular lines are the Moscow Underground, the Melbourne Circular Tram Line, and some of the Montreal bus lines (e.g., 33, 55, and 470). In some situations, a cycle is desirable not only due to reliability requirements but also because it offers an alternative path which can considerably reduce the travel time between node pairs.

Similar to tree-star topologies, arc selection and routing decisions are more involved as these cannot be determined by location/allocation decisions. In fact, even if the location of hubs and the allocation of O/D nodes are given, the problem is still NP-hard as it reduces to the so-called minimum flow cost Hamiltonian cycle problem (Ortiz-Astorquiza et al. 2015). Given that hub arcs are undirected and uncapacitated, for each pair of hub nodes there exist exactly two possible paths on the cycle, and the flows associated with the O/D nodes allocated to such hubs will be routed through the least cost path, which contains an undetermined number of hub arcs. A flow-based formulation for the design of cycle-star hub networks can be stated as follows:
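
The routing rule on a given cycle, namely choosing the cheaper of the two hub paths between a pair of hubs, can be sketched as follows. This Python snippet is an illustration under stated assumptions: the function name `cheapest_cycle_path`, the representation of the cycle as an ordered list of hubs, the distance matrix `d_example`, and the 4-hub instance are all hypothetical. It returns only the hub-level distance; multiplying by the discount factor and by the flow of the O/D pair would give the corresponding hub-level cost term.

```python
def cheapest_cycle_path(cycle, d, origin_hub, destination_hub):
    """Return (distance, path) of the cheaper of the two hub paths on a cycle.

    cycle lists the hubs in the order in which they appear on the cycle and
    d[k][m] is the distance of hub arc (k, m).
    """
    if origin_hub not in cycle or destination_hub not in cycle:
        raise ValueError("both hubs must lie on the cycle")
    p = len(cycle)
    start = cycle.index(origin_hub)

    def walk(step):
        # Follow the cycle in one direction until the destination is reached.
        position, distance, path = start, 0.0, [origin_hub]
        while cycle[position] != destination_hub:
            nxt = (position + step) % p
            distance += d[cycle[position]][cycle[nxt]]
            position = nxt
            path.append(cycle[position])
        return distance, path

    # Compare the two directions by distance; ties keep the first direction.
    return min(walk(+1), walk(-1), key=lambda result: result[0])


# Hypothetical symmetric distances for the hub cycle 0 - 1 - 2 - 3 - 0.
d_example = [
    [0, 4, 9, 1],
    [4, 0, 3, 9],
    [9, 3, 0, 2],
    [1, 9, 2, 0],
]
best_distance, best_path = cheapest_cycle_path([0, 1, 2, 3], d_example, 0, 2)
```

In this instance the path 0, 3, 2 (distance 3) is preferred to the path 0, 1, 2 (distance 7), which illustrates why the number of hub arcs on an O/D path is not known in advance.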

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{i,k \in \mathcal{N}} \left(\chi O_i + \delta D_i \right)d_{ik} z_{ik} + \sum_{i,k,m \in \mathcal{N}} \alpha d_{km}Y_{ikm} \\ \mbox{subject to} & &\displaystyle \mbox{(18.15)--(18.19), (18.41)--(18.42), (18.68)--(18.69)} \\ & &\displaystyle \sum_{k \in \mathcal{N}} z_{kk} = p {} \end{array} \end{aligned} $$
(18.72)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{(k,m) \in \mathcal{E}} y_{km} = p {} \end{array} \end{aligned} $$
(18.73)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{m \in \mathcal{N} : (k,m) \in \mathcal{E}} y_{km} = 2z_{kk} \qquad k \in \mathcal{N}. {} \end{array} \end{aligned} $$
(18.74)

Constraints (18.72)–(18.74), together with flow conservation equations (18.15), guarantee that the set of selected hub arcs forms a cycle of hubs. Similar to tree-star hub networks, subtour elimination constraints (18.70) need to be incorporated when considering direct connections between non-hub nodes to obtain a valid formulation.

An arc-based formulation can be stated as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{i,k \in \mathcal{N}}(\chi O_i + \delta D_i)d_{ik}z_{ik} + \sum_{i,j,k,m \in \mathcal{N}} \alpha W_{ij}d_{km}x_{ijkm} \\ \mbox{subject to} & &\displaystyle \mbox{(18.16)--(18.18), (18.71), (18.68)--(18.69), (18.72)--(18.74)}. \end{array} \end{aligned} $$

4.2.4 Hub Line Networks

A hub line network consists of a set of hub nodes connected by means of a path (or line). In this case, each O/D node can be assigned to more than one hub node, i.e., a multiple allocation pattern (see Fig. 18.2d). Potential applications for hub lines arise in public transportation planning, in particular in the design of rapid transit systems and highway networks (Martins de Sá et al. 2015). Network planners may consider the expansion of an existing network in a metropolitan region to improve users’ travel time by installing a rapid transit line, such as a subway, tram, or light rail line or an express bus lane. Hubs correspond to central stations such as subway, tram, bus or train stations. The aim is to minimize the total travel time between node pairs. Additional applications of hub line topologies appear in the design of road networks, where network planners may be interested in extending the current road network in urban, suburban, or rural regions by constructing a new path-shaped highway or express lane. Hub nodes can be seen as a set of interchanges between highways and other existing roads (Lari et al. 2008).

Similar to tree-star and cycle-star hub networks, arc selection and routing decisions are not determined by the location decisions. Hub arc variables \(y_{km}\) are needed to construct the (undirected) line of hubs. Also, flow variables \(U_{ijk}\) and \(X_{ijk}\) used in HALPs are required to model the routing at the access-level network. An arc-based formulation for the design of hub line networks is as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{minimize} & &\displaystyle \sum_{i,j \in \mathcal{N}} W_{ij} \left( \sum_{k \in \mathcal{N}} \chi d_{ik} U_{ijk} + \sum_{k,m \in \mathcal{N}} \alpha d_{km} x_{ijkm} + \sum_{m \in \mathcal{N}} \delta d_{mj} X_{ijm}\right) \\ \mbox{subject to} & &\displaystyle \mbox{(18.7), (18.55)--(18.56), (18.58)--(18.60)} \\ & &\displaystyle \sum_{k \in \mathcal{N}} z_{k} = p {} \end{array} \end{aligned} $$
(18.75)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{(k,m) \in \mathcal{E}} y_{km} = p -1 {} \end{array} \end{aligned} $$
(18.76)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle \sum_{m \in \mathcal{N} : (k,m) \in \mathcal{E}} y_{km} \leq 2z_k \qquad k \in \mathcal{N} {} \end{array} \end{aligned} $$
(18.77)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & &\displaystyle U_{ijk} + \sum_{m \in \mathcal{N}} x_{ijmk} = \sum_{m \in \mathcal{N}} x_{ijkm} + X_{ijk} \quad i, j, k \in \mathcal{N}. {} \end{array} \end{aligned} $$
(18.78)

Constraints (18.75) and (18.76) set the number of hub nodes to p and the number of hub arcs to p − 1, respectively, while constraints (18.77) limit the degree of each hub node to at most two. Constraints (18.78) are the flow conservation equations, which properly account for flows whenever a hub k is used. All these constraints together ensure that the hub-level network is connected, forming a hub line. Similar to tree-star and cycle-star hub networks, subtour elimination constraints (18.70) need to be added when considering direct connections between non-hub nodes to obtain a valid formulation.
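
The counting and degree conditions alone do not force a line: with four hubs, for example, a triangle on three hubs plus an isolated hub has p − 1 arcs and no degree above two. The following Python sketch makes this concrete by recovering the order of the hubs along the line, or reporting that none exists. It is an illustration only; the function name `hub_line_order` and its inputs are assumptions, and hub labels are assumed to be sortable (e.g., integers).

```python
def hub_line_order(hubs, hub_arcs):
    """Return the hubs in line order, or None if hub_arcs do not form a path."""
    hubs = set(hubs)
    arcs = {frozenset(arc) for arc in hub_arcs}  # undirected, duplicates merged
    neighbors = {k: set() for k in hubs}
    for arc in arcs:
        if len(arc) != 2 or not arc <= hubs:
            return None  # hub arcs may only join two distinct hubs
        k, m = tuple(arc)
        neighbors[k].add(m)
        neighbors[m].add(k)

    # Counterparts of the arc count (18.76) and the degree bound (18.77).
    if len(arcs) != len(hubs) - 1:
        return None
    if any(len(adjacent) > 2 for adjacent in neighbors.values()):
        return None
    if len(hubs) == 1:
        return list(hubs)

    endpoints = sorted(k for k, adjacent in neighbors.items() if len(adjacent) == 1)
    if len(endpoints) != 2:
        return None

    # Walk from one endpoint; a valid line visits every hub exactly once.
    order, previous, current = [endpoints[0]], None, endpoints[0]
    while True:
        following = [m for m in neighbors[current] if m != previous]
        if not following:
            break
        previous, current = current, following[0]
        order.append(current)
    return order if len(order) == len(hubs) else None
```

The final length test plays the role that the flow conservation equations play in the formulation: it rejects solutions in which some hubs are not reachable along the line.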

5 Bibliographical Notes

The study of HLPs began with the pioneering work of O’Kelly (1986a), for continuous models, and O’Kelly (1986b), for discrete models. Campbell (1994a), Klincewicz (1998), and Bryan and O’Kelly (1999) provide early reviews and focus on classification schemes, fundamentals, and models with applications in the areas of telecommunications and air transportation. Campbell et al. (2001) wrote a comprehensive survey in which the location of hubs is the key decision. Alumur and Kara (2008) provided a classification scheme and review of the growing literature on network hub location models before 2008. Campbell and O’Kelly (2012) provided an insight into early motivations for analyzing hub location models and highlighted recent research directions. Contreras and O’Kelly (2019) wrote a concise overview of the main developments and most recent trends in hub location such as flow dependent discounted costs, capacitated models, uncertainty, dynamic and multi-modal models, and competition and collaboration.

In what follows, we highlight some of the most relevant references with respect to the development of mathematical models and solution algorithms for solving hub location and hub network design problems.

5.1 Hub Location Problems

O’Kelly (1987) provided the first formulation for a hub location problem with single assignments. O’Kelly formulated this problem as a discrete location problem with additional quadratic costs associated with the interaction of O/D nodes. The study of hub location models with multiple assignments originated in Campbell (1992), in which various formulations for this class of problems were presented.

The work of Skorin-Kapov et al. (1997) and Ernst and Krishnamoorthy (1996, 1998b) presented the first generation of tight arc-based formulations and useful flow-based formulations for single and multiple allocation variants of HLPs. However, the tightest arc-based formulation known so far for multiple assignment variants is the one independently introduced by Hamacher et al. (2004) and Marín (2005a). The former obtains this formulation by lifting facet-defining inequalities of the well-known uncapacitated facility location problem whereas the latter obtains the same set of facet-defining constraints as well as other facets by reformulating the problem as a set packing problem and identifying maximum cliques in an auxiliary graph. Boland et al. (2004) presented preprocessing procedures to reduce the number of variables and constraints for flow-based formulations, as well as some valid inequalities that improve LP relaxation bounds of capacitated variants.

Contreras et al. (2009b) used the Benders reformulation of Sect. 3.1 to develop an exact algorithm for uncapacitated HLPs with multiple assignments that, in combination with other algorithmic features such as preprocessing, a heuristic, and elimination tests, provided solutions for large-scale instances with up to 500 nodes. This Benders reformulation has also been extended to solve multi-level capacitated instances with up to 300 nodes (Contreras et al. 2012), and stochastic problems dealing with uncertainties in both demand flows and transportation costs (Contreras et al. 2011a).

Arc-based formulations have also been used to develop ad hoc solution algorithms for various HLPs with single assignments. Pirkul and Schilling (1998) use a Lagrangean relaxation (LR) in which constraints (18.16), (18.21)–(18.22) are relaxed to approximately solve p-hub median problems with single assignments. Contreras et al. (2009a) and Elhedhli and Wu (2010) use LRs in which constraints (18.21)–(18.22) are relaxed to solve capacitated HLP variants. Contreras et al. (2011b) use the LR of Contreras et al. (2009a) to solve integer restricted master problems containing a small subset of the \(x_{ijkm}\) variables, generating more if needed with a column generation procedure. This lower bounding procedure is embedded within a branch-and-price algorithm to solve capacitated instances with up to 200 nodes. Contreras et al. (2010, 2017) presented some families of extended cut-set inequalities that can help improve the LP bounds associated with flow-based formulations. Labbé et al. (2005) developed a branch-and-cut algorithm that uses several families of projection inequalities to solve quadratic capacitated variants with up to 50 nodes.

Camargo and Miranda (2012) and Camargo et al. (2011) were the first works to introduce Benders reformulations for solving HLPs with single assignments. The authors use a hybrid outer-approximation / Benders decomposition algorithm for dealing with the nonlinearity caused by functions used to represent congestion at hubs. Contreras et al. (2021) recently used a Benders reformulation within a branch-and-cut framework to optimally solve uncapacitated and capacitated instances with up to 900 nodes.

5.2 Hub Network Design Problems

O’Kelly and Miller (1994) is the first work discussing the need to include additional design decisions in hub location models. The authors provide a classification of hub network topologies based on protocols that consider the allocation pattern of O/D nodes, the interconnection between hub nodes, and the possibility of allowing direct connections between O/D nodes.

Campbell et al. (2005a) introduced HALPs and provided a classification scheme for them that accounts for assumptions on hub-level network decisions, access-level network decisions, and O/D path decisions. In a follow-up paper, Campbell et al. (2005b) presented an exact enumeration-based algorithm to solve instances with up to 25 nodes and q = 6 for several classes of HALPs. Contreras and Fernández (2014) showed how a general class of HALPs can be stated as the minimization of a real-valued supermodular set function and developed a branch-and-cut algorithm using supermodular cuts to solve various particular cases of HALPs for instances with up to 125 nodes.

More complex HALPs have been studied where additional features need to be taken into account. In the context of air passenger transportation, Sasaki et al. (2014) study competitive HALPs with multiple assignments in a Stackelberg framework. Gelareh et al. (2010) deal with another competitive HALP with multiple assignments arising in liner shipping. The authors extend path-based formulations for this variant and present an LR algorithm to obtain bounds for instances with up to 20 nodes. Tanash et al. (2017) focus on HALPs with single assignments in which flow-dependent costs are considered. The authors propose a branch-and-bound algorithm that uses an arc-based formulation to solve instances with up to 50 nodes. Gelareh and Nickel (2011) study HALPs with multiple assignments arising in urban transport and liner shipping. A Benders decomposition algorithm is proposed to solve the considered problem. A multi-period extension of this problem is presented in Gelareh et al. (2015), where a Benders decomposition is also used to solve it. Rothenbächer et al. (2016) deal with HALPs with multiple assignments in which there exist capacities on hub arcs. A branch-and-price algorithm that uses a path-based formulation is developed to solve the problem. The pricing of the path variables is NP-hard as it corresponds to solving a shortest path problem with resource constraints. Camargo et al. (2017) focus on HALPs with multiple assignments in which flow-dependent discounted flow costs and hop-constraints are considered. The authors present a Benders reformulation and a branch-and-cut algorithm to solve it. After a number of algorithmic features are employed to generate non-dominated cuts, instances with up to 80 nodes can be optimally solved.

In the case of star-star hub network topologies, Labbé and Yaman (2008) performed a polyhedral analysis and showed that inequalities (18.63)–(18.65) are facet-defining. Using an LR algorithm based on the above formulation, the authors solved instances with up to 150 nodes. Tree-star hub networks seem to be more challenging to solve. Martins de Sá et al. (2013) presented a Benders decomposition algorithm that uses the arc-based formulation to optimally solve instances with up to 100 nodes. Contreras et al. (2017) developed a branch-and-cut algorithm for HNDPs with a cycle-star topology using a flow-based formulation in combination with a general class of mixed-dicut inequalities to solve instances with up to 100 nodes. In the case of hub lines, Martins de Sá et al. (2015) introduced a Benders reformulation based on an arc-based formulation and developed a branch-and-cut algorithm to solve instances with up to 100 nodes.

6 Conclusions and Perspectives

We have provided an overview of hub network design problems in which both location and arc selection are key decisions. We focused on the role network design and routing decisions play in the formulation and solution of various classes of hub network design problems of increasing complexity. We pointed out how the assumptions and properties presented in Sect. 2 simplify the network design decisions, giving rise to a first generation of hub location models dealing mostly with the location of hubs and the assignment of O/D nodes to open hubs. We have also highlighted how network design decisions become more involved when removing some of these assumptions, leading to the study of a second generation of models that share more features with the more complex multi-commodity network design problems than with discrete facility location problems.

Although substantial progress has been made by researchers and practitioners in the area of hub network design, there is still significant work ahead. In many practical applications additional design and tactical decisions need to be taken into account to accurately model the associated systems. For instance, some applications require the design of more complex access networks that are no longer determined by an assignment pattern of O/D nodes to hubs. Klincewicz (1998) reviews various models arising in the design of telecommunications networks in which tributary trees are used. Yaman et al. (2007) consider a concrete application in cargo delivery systems in which multi-stop access paths visiting more than one O/D node on the way to a hub node are used to route commodities. Camargo et al. (2013), Rodríguez-Martín et al. (2014), Cardoso Lopes et al. (2016), and Kartal et al. (2017), among others, study models arising in freight transportation and express delivery in which collection, transfer or distribution tours have to be designed. The formulation and solution of such complex problems are far more challenging than those of standard HLPs and even HALPs.

Other applications, such as in airline and ground transportation, require additional design decisions associated with the nodes and served commodities (Alibeyg et al. 2016, 2018). For example, in the case of airline companies, network planners have to design their network when entering the market, or may have to modify already established networks through alliances, mergers and acquisitions. The decisions are to determine the cities that will be part of their network and which O/D flights to activate in order to offer air travel services to passengers between cities so as to maximize the profit.

Finally, another interesting facet of hub networks which has been rarely studied is the integration of network design with scheduling decisions. Yaman et al. (2012) study a concrete application arising in cargo delivery systems for next-day delivery in which the goal is to simultaneously design a hub network and to decide on the release times of trucks from each demand center so that the total cargo guaranteed to be delivered by the next day is above a threshold while minimizing the flow cost. Masaeli et al. (2018) study another model arising in parcel delivery systems in which the number of dispatches to operate on the hub network as well as the time period of dispatching each vehicle from a hub are taken into account while designing the hub network.