1 Introduction

Flexibility is generally viewed as a firm’s ability to match production to uncertain demand. A certain level of flexibility can curb the damage caused by uncertain demand, whereas a lack of flexibility can result in significant loss. For example, Chrysler’s Neon-based PT Cruiser was a very fashionable model in 2000 and 2001. The dedicated plant in Toluca, Mexico, was not able to keep up with its demand, while the plant making the Neon in Belvidere, Illinois, was underutilized but not configured to build the PT Cruiser. The estimated loss was $240 million in profit and another 0.5 points of market share in each of those years (Biller et al. 2006). Since then, companies have been moving from focused factories to flexible factories (Deng and Shen 2013). The Ford Motor Company, for example, invested $485 million in 2002 in two Canadian engine plants to retool them with a flexible system. The company also developed a plan to convert its engine and transmission plants worldwide into flexible systems. Similar initiatives have also been launched at companies such as General Motors and Nissan.

In 1995, Jordan and Graves initiated a stream of research on supply chain flexibility by examining the economic benefits of chaining relative to full flexibility (Jordan and Graves 1995). Since then, k-chain flexibility has become an important research topic. The efficiency of the long chain, or of sparse process structures in general, has been justified theoretically by Chou et al. (2010), who showed that the performance of the 2-chain is close to 96 % of that of the full flexibility system. Many studies have also reported that the 2-chain is an ideal structure. A comprehensive review of recent and closely related literature is provided in Sect. 2. In the current research, we examine the efficiency of chaining structures in stochastic settings, namely loss systems and queueing systems. The majority of existing studies are based on newsvendor settings (e.g., Chou et al. 2010). In many systems, however, orders arrive at random and production times are uncertain, yet few studies have examined chaining configurations in such stochastic systems. In addition, few works analyse queueing systems explicitly because of theoretical intractability, and the majority of previous results on flexibility system performance are obtained through simulation.

To analyse the effectiveness of the k-chain, we apply recent results on skill-based stochastic systems (Adan and Weiss 2012; Visschers et al. 2012). By redefining the system state space, we design efficient algorithms that compute the performance of the k-chain numerically in both loss systems and queueing systems. On the basis of our numerical studies, we observe that the 2-chain is no longer effective in loss systems, although it still performs well in queueing systems.

The paper is organized as follows. In Sect. 2, we review the most recent and important works on flexibility design. In Sect. 3, we describe our model and provide some basic properties. In Sects. 4 and 5, we design computation algorithms for loss systems and queueing systems, respectively, and summarize our observations from numerical studies. Conclusions are drawn in Sect. 6.

2 Literature review

In this section, we review the most recent and important works on flexibility design. Jordan and Graves (1995) first examined the economic benefits of chaining relative to full flexibility, thus initiating a stream of research on supply chain flexibility. Graves and Tomlin (2003) then investigated various structures for achieving horizontal flexibility within a single supply chain level. Readers are referred to Chou et al. (2008) for reviews of process flexibility before 2008.

In 2010, significant theoretical progress was made in exploring the efficiency of chaining structures. Chou et al. (2010) showed that a simple chaining structure performs surprisingly well under a variety of realistic demand distributions, even when the system is large. They also identified a class of conditions under which only a sparse flexible structure is needed for the expected performance to come close to that of the full flexibility system for general problems. Chou et al. (2010) further examined how the range and response dimensions interact to affect the performance of process flexible structures. Deng and Shen (2013) argued that effective flexible process structures are essentially highly connected graphs and utilized the concept of graph expansion (a measure of graph connectivity) to obtain various insights into this design problem.

Process flexibility is important in system design. Simchi-Levi and Wei (2012) developed a theory that explains the effectiveness of long chain designs in finite-size systems. Simchi-Levi and Wei (2015) showed that the long chain is superior to a large class of sparse flexibility designs. Simchi-Levi et al. (2013) noted that a 3-chain design is significantly more robust than a 2-chain one and achieves the same robustness as full flexibility under high uncertainty levels; furthermore, investment in process flexibility designs alters the optimal inventory placements. Désir et al. (2016) reported that, in this setting, a disconnected network with 2n edges rather than the long chain can be optimal, even for independent and identically distributed demands. Wang and Zhang (2015) assessed the performance of the long chain under different demand distributions from a distribution-free perspective. Iravani et al. (2005) focused on the strategic-level issue of how flexibility can be generated by multi-purpose resources, such as cross-trained labour, flexible machines and flexible factories. Iravani et al. (2007) emphasised flexible service centres, such as inbound call centres with cross-trained agents, and modelled these centres as parallel queueing systems with flexible servers. A recent paper (Shi et al. 2015) developed a theory for the design of process flexibility in multi-period production systems and proved that any partial flexibility structure satisfying the Generalized Chaining Condition (GCC) is almost optimal under a class of policies. The authors further showed that GCC structures can gain nearly as much benefit as the fully flexible system when the traffic intensity is fairly high.

The majority of works focus on symmetric systems. Nonetheless, some studies design process flexibility for unbalanced networks on the basis of chaining structures. Mak and Shen (2009) reported that the heavily advocated chaining heuristic can sometimes perform unsatisfactorily when resources are not perfectly flexible. Deng and Shen (2013) proposed additional flexibility design guidelines for unbalanced networks, in which the numbers of plants and products are unequal, by refining the well-known Chaining Guidelines; the results of an extensive computational study suggest that their refinements effectively determine flexible configurations with minimal shortfall in unbalanced networks. Besides the works mentioned above, many important studies have also been conducted on flexibility. Readers are referred to Chen et al. (2015), Hopp et al. (2005), Hopp et al. (2009), Iravani et al. (2003), Iravani et al. (2005), Iravani and Teo (2005), Iravani and Krishnamurthy (2007), Kula et al. (2004), Peltokorpi et al. (2015) and Sennott et al. (2006) for other related works.

The study of skill-based systems provides support for our analysis of system performance. Adan et al. (2010) studied an Erlang loss system with multi-type customers and servers and reported that the probabilities of assigning arriving customers to idle servers can be chosen in such a way that the Markov process describing the system is reversible, with a simple product-form stationary distribution. Visschers et al. (2012) examined a system with multi-type jobs and servers in which waiting jobs are served on a first-come-first-served (FCFS) basis, and arriving jobs that encounter several idle servers are assigned to a feasible server at random. They showed that there exist assignment probabilities under which the system has a product-form stationary distribution and developed explicit expressions for it. Adan and Weiss (2012) derived a simple explicit steady-state distribution for a loss system with multi-type customers and skill-based servers under the assign-to-longest-idle-server (ALIS) policy; they also reported that this system is insensitive, so the results hold for general service time distributions as well. Adan and Weiss (2012) further established a product-form solution for the matching of two infinite multi-type sequences.

3 Model description

We model the stochastic systems with a set of servers that serve several types of customers. We denote the set of servers by \( \mathcal {M} \) and the number of servers by \( |\mathcal {M}| \). We denote the set of customer types by \( \mathcal {C} \); hence, there are \( |\mathcal {C}| \) types of customers. Each server \(j\) can serve a subset of customer types, denoted by \( \mathcal {C}(j) \). Equivalently, each type-\(i\) customer can be served by a subset of servers \( \mathcal {M}(i) \), and the union of these subsets is \( \mathcal {M} \). This relationship is specified by a bipartite graph \( \mathscr {G} = (\mathcal {C},\mathcal {M},E) \), where E denotes the set of edges connecting servers to the customer types they can serve. We define \( \mathcal {C}(S) = \bigcup _{j \in S} \mathcal {C}(j) \) as the set of customer types that can be handled by the servers in \( S \subset \mathcal {M} \), and \( \mathcal {U}(S) = \overline{\mathcal {C}(\overline{S})} \) as the set of customer types that can be served only by the servers in S.
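To make these set operations concrete, the following sketch (in Python, with hypothetical helper names not taken from the paper) stores a flexibility structure as a mapping from each server to the customer types it can serve and evaluates \( \mathcal {C}(S) \) and \( \mathcal {U}(S) \) for a subset of servers S.

```python
# A minimal sketch (hypothetical helper names): a flexibility structure is stored
# as a dict mapping each server j to the set of customer types C(j) it can serve.

def C(S, C_of):
    """C(S): union of the customer types served by the servers in S."""
    types = set()
    for j in S:
        types |= C_of[j]
    return types

def U(S, C_of):
    """U(S): customer types served ONLY by servers in S, i.e. the complement
    of C(complement of S) within the set of all customer types."""
    all_servers = set(C_of)
    return C(all_servers, C_of) - C(all_servers - set(S), C_of)
```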

For any two graphs \( \mathscr {G} = (\mathcal {C},\mathcal {M},E) \) and \( \mathscr {G'} = (\mathcal {C'},\mathcal {M'},E') \), if \( \mathcal {C} = \mathcal {C'} \), \( \mathcal {M} = \mathcal {M'} \) and \( E \subset E' \), then \( \mathscr {G} \) is a spanning subgraph of \( \mathscr {G'} \), denoted by \( \mathscr {G} \subset \mathscr {G'} \). Based on this definition, we obtain the following lemma, which basically states that in a more flexible network each server can serve more types of customers and each customer type can be assigned to more servers.

Lemma 1

If \( \mathscr {G} \subset \mathscr {G}' \), then \( \mathcal {C}^{\mathscr {G}}(S) \subset \mathcal {C}^{\mathscr {G}'}(S) \) and \( \mathcal {U}^{\mathscr {G}}(S) \supset \mathcal {U}^{\mathscr {G}'}(S) \) for any \( S \subset \mathcal {M} \).

Proof

For every \( i\in \mathcal {C}^{\mathscr {G}}(S) \), there exists \( j\in S \) such that server \(j\) can serve type-\(i\) customers, that is, the edge \( e_{ij} \in E \). Since \( \mathscr {G} \subset \mathscr {G'} \), we have \( E \subset E' \) by the definition of a spanning subgraph, so \( e_{ij} \in E' \), \( i \in \mathcal {C}^{\mathscr {G'}}(S)\) and hence \( \mathcal {C}^{\mathscr {G}}(S) \subset \mathcal {C}^{\mathscr {G'}}(S) \). The proof of the second claim is similar, since \( \mathcal {U}^{\mathscr {G}}(S) \supset \mathcal {U}^{\mathscr {G'}}(S) \) is equivalent to \( \mathcal {C}^{\mathscr {G}}(\overline{S}) \subset \mathcal {C}^{\mathscr {G'}}(\overline{S}) \). \(\square \)

We assume that arrivals follow Poisson processes and that service times are exponential. Type-\(i\) customers arrive according to an independent Poisson process with rate \( \lambda _i \), and server \(j\) works at rate \( \mu _j \). The arrival and service processes are mutually independent. We introduce the notation \( \lambda _A = \sum _{i \in A}\lambda _i \) for \( A \subset \mathcal {C} \) and \( \mu _S = \sum _{j \in S}\mu _j \) for \( S \subset \mathcal {M} \). On the basis of Lemma 1, we derive the following monotonicity properties. Proposition 1 states that a larger set of servers covers more demand, and that more demand can be served only by that larger set. Proposition 2 compares two graphs \( \mathscr {G} \subset \mathscr {G'} \): since \( \mathscr {G'} \) has more edges (links), its flexibility is higher. Given a set of servers S, the servers can serve more customer types in \( \mathscr {G'} \), whereas the demand that can be served only by the servers in S decreases in \( \mathscr {G'} \).

Proposition 1

Monotonicity property 1: For a graph \( \mathscr {G} \) and any \( S \subset S' \subset \mathcal {M} \), we have

$$\begin{aligned} \lambda _{\mathcal {C}(S)} \le \lambda _{\mathcal {C}(S')} \end{aligned}$$
(1)
$$\begin{aligned} \lambda _{\mathcal {U}(S)} \le \lambda _{\mathcal {U}(S')}. \end{aligned}$$
(2)

Proposition 2

Monotonicity property 2: For any two graphs \( \mathscr {G} \subset \mathscr {G'} \) and \( S \subset \mathcal {M} \), we have

$$\begin{aligned} \lambda ^{\mathscr {G}}_{\mathcal {C}(S)} \le \lambda ^{\mathscr {G'}}_{\mathcal {C}(S)} \end{aligned}$$
(3)
$$\begin{aligned} \lambda ^{\mathscr {G}}_{\mathcal {U}(S)} \ge \lambda ^{\mathscr {G'}}_{\mathcal {U}(S)} \end{aligned}$$
(4)

In our stochastic networks, the service discipline is a combination of FCFS and ALIS. An arriving customer who finds more than one available server is assigned to the server that has been idle the longest. An arriving customer who finds no available server is either lost or joins the queue; we refer to the two resulting systems as loss systems and queueing systems, respectively.

The flexibility design in such stochastic networks involves deciding the flexibility of each server, that is, the subset of customer types \( \mathcal {C}(j) \) that server \(j\) can serve. The flexibility level of server \(j\) is defined as \( | \mathcal {C}(j) | \). The system with complete resource pooling has full flexibility, whereas the system with dedicated service has level-1 flexibility. We denote a generic connected bipartite graph by \( \mathscr {G} \), the complete resource pooling graph by \( \mathscr {F} \) and the dedicated graph by \( \mathscr {D} \).

In this study, we focus on the case in which each server initially serves a dedicated customer type. Hence, \( |\mathcal {C}| = |\mathcal {M}| \), which we denote by K. In particular, we are interested in a bipartite graph structure called tailored chaining. Tailored chaining with k-flexibility is called a k-chain: server \(j\) serves the k customer types \( j, (j+1),\dots ,(j+k-1) \), where indices larger than K are taken modulo K, so every server has flexibility level k. Figure 1 shows the bipartite graphs of the dedicated, full flexibility and tailored chaining (2-flexibility) structures for \( K =3 \). In the dedicated system, each supplier can supply only one dedicated type of demand; in the tailored chaining setting (the 2-chain in this example), each supplier can serve two types of demand; and in the full flexibility system, a supplier can serve all types of demand. In the following sections, we study k-chain efficiency in both loss systems and queueing systems, and compare the performance of the k-chain with those of the dedicated system and the system with full flexibility.
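As an illustration of the structures in Fig. 1, the sketch below (a hypothetical `k_chain` helper, building on the \( \mathcal {C}(S) \)/\( \mathcal {U}(S) \) helpers above and using 0-based indices) constructs the adjacency map of a k-chain; \( k=1 \) yields the dedicated graph \( \mathscr {D} \), \( k=K \) the full flexibility graph \( \mathscr {F} \), and the final assertions check Lemma 1 on a small example.

```python
def k_chain(K, k):
    """Adjacency map of a k-chain: server j (0-based) serves types j, ..., j+k-1 (mod K)."""
    return {j: {(j + r) % K for r in range(k)} for j in range(K)}

# The three structures of Fig. 1 for K = 3:
dedicated = k_chain(3, 1)   # {0: {0}, 1: {1}, 2: {2}}
two_chain = k_chain(3, 2)   # {0: {0, 1}, 1: {1, 2}, 2: {0, 2}}
full_flex = k_chain(3, 3)   # every server serves all three types

# Lemma 1 in action: adding edges can only enlarge C(S) and shrink U(S).
S = {0, 1}
assert C(S, two_chain) <= C(S, full_flex)
assert U(S, two_chain) >= U(S, full_flex)
```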

Fig. 1
figure 1

Structural system flexibility

4 Loss system

In the loss system, arriving customers who find no available server leave the system without being served. Customers who find more than one available server are assigned to the server that has been idle the longest. The percentage of customers who leave without being served, i.e., the loss percentage, is the main performance measure for the loss system.

We define the system state at time t as \( X(t) = s \), where \( s = (j_1,j_2,\dots ,j_k) \) is the list of idle servers at time t, ordered by the time at which they became idle; thus, server \( j_1 \) has been idle the longest, and so on. Given this state definition, X(t) is a continuous-time, finite-state Markov chain under the ALIS policy with stationary distribution (Adan and Weiss 2012):

$$\begin{aligned} \pi _{X}(j_1,\dots ,j_k) = \pi _{X}(\emptyset )\frac{\mu _{j_1}}{\lambda _{\mathcal {C}(j_1)}}\frac{\mu _{j_2}}{\lambda _{\mathcal {C}(\{j_1,j_2\})}}\cdots \frac{\mu _{j_k}}{\lambda _{\mathcal {C}(\{j_1,\dots ,j_k\})}}, \end{aligned}$$
(5)

which satisfies the partial balance equations

$$\begin{aligned} \pi _{X}(j_1,\dots ,j_k)\lambda _{\mathcal {C}(\{j_1,\dots ,j_k\})} = \pi _{X}(j_1,\dots ,j_{k-1})\mu _{j_k}, \end{aligned}$$
(6)

and for all \( m \ne {j_1,\dots ,j_k} \),

$$\begin{aligned} \pi _{X}(j_1,\dots ,j_k)\mu _m= \, & {} \pi _{X}(m,j_1,\dots ,j_k)\lambda _{\mathcal {C}(m)} + \pi _{X}(j_1,m,\dots ,j_k)\lambda _{\mathcal {C}(m)\setminus \mathcal {C}(j_1)}\nonumber \\&+ \dots +\pi _{X}(j_1,\dots ,j_k,m)\lambda _{\mathcal {C}(m)\setminus \mathcal {C} (\{j_1,\dots ,j_k\})}. \end{aligned}$$
(7)

Let \( \theta _i \) be the fraction of type-\(i\) customers who are lost. By the Poisson arrivals see time averages (PASTA) property, a type-\(i\) customer is lost if and only if the set of idle servers contains no server from \( \mathcal {M}(i) \). Hence,

$$\begin{aligned} \theta _i = \sum _{\{j_1,\dots ,j_k\} \cap \mathcal {M}(i) = \emptyset } \pi _{X}(j_1,\dots ,j_k). \end{aligned}$$
(8)

The total number of states is \( K!\sum _{k=0}^K \frac{1}{k!} \), so computing the steady-state distribution and the performance measures requires at least \( O(K!\sum _{k=0}^K \frac{1}{k!}) \) time, which quickly becomes prohibitive.

However, in most cases, the information regarding server idle times, although necessary for real-time operations, is not needed for performance analysis. Therefore, for analysis purposes we can redefine the system state at time t as \( Y(t) = S \), where \( S = \{j_1,j_2,\dots ,j_k\} \) is the (unordered) set of idle servers at time t. Recall that \( s = (j_1,j_2,\dots ,j_k) \) is the list of idle servers ordered by the time at which they became idle; thus, s is merely one permutation of S. Let \( \mathcal {P}(S) \) be the set of all permutations of S; then we have

$$\begin{aligned} \pi _{Y}(S) = \pi _{Y}(\{j_1,j_2,\dots ,j_k\}) = \sum _{(\tilde{j}_1,\dots ,\tilde{j}_k) \in \mathcal {P}(S)} \pi _{X}(\tilde{j}_1,\dots ,\tilde{j}_k). \end{aligned}$$
(9)

By applying the product form (5) and the partial balance equation (6), we obtain a recursion for the steady-state probabilities

$$\begin{aligned} \begin{aligned} \pi _{Y}(S) \lambda _{\mathcal {C}(S)}&= \sum _{(\tilde{j}_1,\dots ,\tilde{j}_k) \in \mathcal {P}(S)} \pi _{Y}(\emptyset )\frac{\mu _{\tilde{j}_1}}{\lambda _{\mathcal {C}(\tilde{j}_1)}}\frac{\mu _{\tilde{j}_2}}{\lambda _{\mathcal {C}(\{\tilde{j}_1,\tilde{j}_2\})}}\cdots \frac{\mu _{\tilde{j}_k}}{\lambda _{\mathcal {C}(\{\tilde{j}_1,\dots ,\tilde{j}_k\})}}\times \lambda _{\mathcal {C}(\{\tilde{j}_1,\dots ,\tilde{j}_k\})} \\&= \sum _{(\tilde{j}_1,\dots ,\tilde{j}_k) \in \mathcal {P}(S)} \pi _{Y}(\emptyset )\frac{\mu _{\tilde{j}_1}}{\lambda _{\mathcal {C}(\tilde{j}_1)}}\frac{\mu _{\tilde{j}_2}}{\lambda _{\mathcal {C}(\{\tilde{j}_1,\tilde{j}_2\})}}\cdots \frac{\mu _{\tilde{j}_{k-1}}}{\lambda _{\mathcal {C}(\{\tilde{j}_1,\dots ,\tilde{j}_{k-1}\})}}\times \mu _{\tilde{j}_k} \\&=\sum _{j \in S} \pi _{Y}(S\setminus \{j\})\mu _j. \end{aligned} \end{aligned}$$
(10)

The loss percentage of type-\(i\) customers is given by

$$\begin{aligned} \theta _i = \sum _{S \cap \mathcal {M}(i) = \emptyset } \pi _{Y}(S), \end{aligned}$$
(11)

where

$$\begin{aligned} \pi _{Y}(S) = \pi _{\tilde{Y}}(S) = \pi _{\tilde{Y}}(\emptyset )\frac{\mu _{j_1}}{\eta _{j_1}(\{j_1\})}\frac{\mu _{j_2}}{\eta _{j_2}(\{j_1,j_2\})}\cdots \frac{\mu _{j_k}}{\eta _{j_k}(\{j_1,\dots ,j_k\})}, \end{aligned}$$
(12)

and \( \eta _{k}(S) \) can be recursively calculated by

$$\begin{aligned} \eta _{k}(S) = \lambda _{\mathcal {C}(S)} \left( 1 + \sum _{j \in S{\setminus} \{k\}} \frac{\eta _{j}(S{\setminus} \{k\})}{\eta _{k}(S{\setminus} \{j\})} \right) ^{-1}, \forall k \in S. \end{aligned}$$
(13)

Given this redefinition of the state space, the total number of states is reduced to \( 2^K \), and using recursion (10) the computation time is reduced to \( O( K^22^K ) \). This makes the results of Adan and Weiss (2012) applicable to much larger instances of our problem.
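The reduced state space and recursion (10) translate directly into a dynamic program over the \( 2^K \) subsets of idle servers. The following sketch (a hypothetical `loss_rates` function reusing the helpers above) builds the unnormalized \( \pi _{Y}(S) \) in order of increasing \( |S| \), normalizes, and evaluates the per-type loss percentages in (11); it reflects our reading of the algorithm rather than an official implementation.

```python
from itertools import combinations

def loss_rates(C_of, lam, mu):
    """Per-type loss percentages theta_i via recursion (10) and Eq. (11).
    C_of: server -> set of types; lam: type -> arrival rate; mu: server -> service rate."""
    servers = list(C_of)
    pi = {frozenset(): 1.0}                      # unnormalized pi_Y, with pi_Y(empty) = 1
    for size in range(1, len(servers) + 1):      # build pi_Y(S) by increasing |S|
        for S in combinations(servers, size):
            S = frozenset(S)
            lam_CS = sum(lam[i] for i in C(S, C_of))
            pi[S] = sum(pi[S - {j}] * mu[j] for j in S) / lam_CS
    total = sum(pi.values())                     # normalization
    theta = {}
    for i in lam:                                # Eq. (11): a type-i arrival is lost
        M_i = {j for j in servers if i in C_of[j]}   # iff no server in M(i) is idle
        theta[i] = sum(p for S, p in pi.items() if not (S & M_i)) / total
    return theta

# Symmetric 2-chain, K = 4, rho = 1 (lambda = mu = 1); Table 1 reports roughly 0.39.
theta = loss_rates(k_chain(4, 2), {i: 1.0 for i in range(4)}, {j: 1.0 for j in range(4)})
```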

In the following, we investigate the symmetric case in which \( \lambda _i = \lambda \) and \( \mu _j = \mu \), and define \( \rho = \lambda /\mu \). We use the loss rate \( \theta (K,k,\rho ) \) as the key system performance measure and study the effects of its parameters: system size K, flexibility k (k-chain, \(1 \le k \le K \)) and traffic intensity \( \rho \). In the dedicated symmetric system (\( k=1 \)), each server is in one of only two states: either empty or busy serving a customer. The probability that a server is empty is \( (1+\rho )^{-1} \), whereas the probability that it is busy is \( \rho (1+\rho )^{-1} \). In the complete resource pooling system (\( k=K \)), the probability that an arriving customer is lost equals the probability that all servers are busy, given by the Erlang loss formula.
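Both benchmarks admit simple closed forms, which Proposition 3 below states formally; a small numerical sketch (hypothetical function names) is as follows.

```python
from math import factorial

def dedicated_loss(rho):
    """Loss probability of a single dedicated server (M/M/1/1): rho / (1 + rho)."""
    return rho / (1.0 + rho)

def pooled_loss(K, rho):
    """Erlang loss of the fully pooled system: offered load K*rho on K servers."""
    a = K * rho
    return (a**K / factorial(K)) / sum(a**n / factorial(n) for n in range(K + 1))
```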

Proposition 3

The performance of k-chain in loss systems:

  1. 1

    For the dedicated system, the loss percentage is \( \rho (1+\rho )^{-1} \).

  2. 2

    For the complete resource pooling system, the loss percentage is

    $$\begin{aligned} \frac{(K\rho )^K}{K!}\left( \sum _{k=0}^{K} \frac{(K\rho )^k}{k!} \right) ^{-1}. \end{aligned}$$
    (14)
  3. 3

    For other k-chain systems, loss percentage can be calculated by (11).

Thanks to this computational simplification, we can handle systems with up to 16 servers and 16 customer types; computing larger instances would require further algorithmic improvements, which we leave as a topic for future research. For each system of size K, we compute the loss percentage of all possible k-chain configurations, i.e., \( k = 1, 2, \dots , K \). We also study three scenarios: light load (\( \rho = 0.5 \)), medium load (\( \rho = 1 \)) and heavy load (\( \rho = 2 \)). All numerical results are shown in Tables 1, 2 and 3 and summarized in the following observations.

Table 1 Loss rate of k-chain configurations in Erlang Loss System when \( \rho = 1\)
Table 2 Loss rate of k-chain configurations in Erlang Loss System when \( \rho = 0.5\)
Table 3 Loss rate of k-chain configurations in Erlang Loss System when \( \rho = 2\)

Observation 1

Loss rate \( \theta (K,k,\rho ) \) is convex and decreases with system size K. In particular, the decrease is insignificant when k is small for all \( \rho \).

This observation is consistent with economies of scale: in any k-chain, loss decreases as system size increases, but the marginal decrease diminishes, as depicted in Fig. 2.

Similar to Simchi-Levi and Wei (2012), who observed that the performance of the 2-chain improves with K but that the improvement converges to zero exponentially quickly, Table 1 shows that for a low-flexibility system such as the 2-chain, the benefit of increasing system size is insignificant; when \( K \geqslant 5 \), the loss percentage remains essentially unchanged. This finding implies that flexibility can be maintained in several separate sub-systems, that is, several short chains, to reduce organizational complexity. For example, when \( \rho =1 \), the loss rate of a 2-chain system of size \( K=16 \) is approximately 0.3901. If we separate the system into four independent sub-systems of size \( K=4 \), the loss rate of each sub-system is 0.3902, so system performance barely changes. That is, relative to the full flexibility design, the 2-chain is comparatively more effective in smaller systems, and several small closed chains, each connecting several plants and products, can perform just as well as one long chain.
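This chain-splitting comparison can be reproduced with the loss-rate sketch above; the snippet below (hypothetical, symmetric case with \( \rho = 1 \)) contrasts one long 2-chain of size 16 with four independent 2-chains of size 4, whose per-type loss rates the paper reports as roughly 0.3901 and 0.3902.

```python
# One long 2-chain of size 16 versus four independent 2-chains of size 4, at rho = 1.
def symmetric_rates(K):
    return {i: 1.0 for i in range(K)}, {j: 1.0 for j in range(K)}

lam16, mu16 = symmetric_rates(16)
long_2chain = loss_rates(k_chain(16, 2), lam16, mu16)[0]   # paper: about 0.3901

lam4, mu4 = symmetric_rates(4)
short_2chain = loss_rates(k_chain(4, 2), lam4, mu4)[0]     # paper: about 0.3902
```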

Fig. 2
figure 2

Performance of k-chain configuration in a loss system with \( \rho = 1 \)

Fig. 3
figure 3

Performance of k-chain configuration in a loss system under different loads

Observation 2

Loss rate \( \theta (K,k,\rho ) \) is convex and decreases with k.

Generally, increasing flexibility reduces loss. However, in all three cases, most of the increase in throughput is achieved by the 2-chain, since the loss percentage decreases significantly when k changes from 1 to 2; higher orders of chaining increase throughput only marginally and in progressively diminishing amounts. These results suggest that total flexibility is generally unnecessary if increases in flexibility require massive investments.

Observation 3

The marginal benefit of increasing the flexibility is insignificant when traffic intensity is high.

Figure 3 shows that the marginal benefit is much smaller when traffic intensity is high. Take \( K=16 \) as an example: comparing Tables 2 and 3 shows that adding flexibility is more effective in lightly loaded systems than in heavily loaded ones. When \( \rho =0.5 \), the loss percentage drops from 0.3333 to 0.0045, a reduction of about 98.64 %, as flexibility increases from 1 to 16; the corresponding reduction is only 21.13 % when \( \rho =2 \). Thus, there is less incentive to implement flexibility when the system is under heavy load.

Observation 4

2-chain is no longer effective in the loss system.

Chou et al. (2010) show that, in newsvendor settings with \(\rho =1 \), the performance of the 2-chain is already close to 96 % of that of the full flexibility system. Here, we use the dedicated and complete resource pooling systems as benchmarks and evaluate k-chain efficiency as follows:

$$\begin{aligned} \psi _{k,K}(\rho ) = \frac{L_{1,K}(\rho )-L_{k,K}(\rho )}{L_{1,K}(\rho )-L_{K,K}(\rho )}, \end{aligned}$$
(15)

where \(L_{k,K}(\rho )\) is the customer loss rate in a loss system of size K with k-chain configuration and traffic intensity \(\rho \).
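Given the loss rates of the dedicated, k-chain and fully flexible configurations, the efficiency measure (15) is straightforward to evaluate; a one-line sketch (hypothetical name, reusing the earlier functions) follows.

```python
def chain_efficiency(L_1, L_k, L_K):
    """Eq. (15): gain of the k-chain over the dedicated system, relative to the
    gain achieved by full flexibility."""
    return (L_1 - L_k) / (L_1 - L_K)

# e.g. psi_{2,4}(1), using the earlier sketches:
# chain_efficiency(dedicated_loss(1.0), short_2chain, pooled_loss(4, 1.0))
```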

Table 4 Efficiency of k-chain (%) with different sizes in Erlang Loss System when \( \rho = 1\)

As can be seen from Table 4, when \( K=16 \) the 2-chain achieves only about one third of the improvement of the full flexibility system. Two factors explain why the properties that hold in the newsvendor model do not carry over to the Erlang loss model: (1) service times are uncertain, which adds system uncertainty that flexibility design alone cannot absorb; and (2) operations are dynamic and in real time, so a customer who finds no idle feasible server is lost immediately, whereas in the newsvendor model orders can be held for a period and satisfied at the end of the period.

5 Queueing system

The queueing system described in this section is similar to the loss system presented above, except that customers who find no available server upon arrival join the queue and wait for service. Visschers et al. (2012) analyzed a continuous-time Markov chain describing a queueing system with multi-type customers and multi-type servers and obtained product-form solutions under Assumption 1.

Assumption 1

[Assignment condition in Visschers et al. (2012)] For \(i=1,\dots ,K\) and for every subset \(\left\{ M_1,\dots ,M_i\right\} \subset \mathcal {M} \), the following holds:

$$\begin{aligned} \prod _{j=1}^{i}\lambda _{M_j}\left( \left\{ M_1,\dots , M_{j-1}\right\} \right) = \prod _{j=1}^{i}\lambda _{\overline{M}_j}\left( \left\{ \overline{M}_1,\dots ,\overline{M}_{j-1}\right\} \right) \end{aligned}$$
(16)

for every permutation \(\overline{M}_1,\dots ,\overline{M}_i\) of \(M_1,\dots ,M_i\), where \(\lambda _{M_j}\left( \left\{ M_1,\dots , M_{j-1}\right\} \right) \) is the activation rate of machine \(M_j\) (See Visschers et al. (2012) for the definition and more details).

Remark 1

Assumption 1 is important for the correctness of the analysis: Visschers et al. (2012) explicitly constructed examples in which no product-form solution exists when the random assignment probabilities are not chosen appropriately.

Remark 2

Visschers et al. (2012) conjectured that the assignment probability distributions become less relevant as the traffic intensity approaches 1. Hence, assuming a specific set of random assignment probabilities is not restrictive, and the product-form solution is a good approximation for general assignment probability distributions.

Under Assumption 1, Visschers et al. (2012) obtained the product-form solution:

$$\begin{aligned} \pi (s)=\alpha _i^{n_i}\dots \alpha _1^{n_1}\frac{\prod _{j=1}^{i}\lambda _{M_j}\left( \left\{ M_1,\dots , M_{j-1}\right\} \right) }{\prod _{j=1}^{i}\mu _{\left\{ M_1,\dots , M_j\right\} }}\pi (0), \end{aligned}$$
(17)

where \(s=(n_i, M_i, n_{i-1}, M_{i-1}, \dots , n_1, M_1)\) and \( \alpha _j = \lambda _{\mathcal {U}(\{M_1,\dots , M_j\})} \mu _{\{M_1,\dots , M_j\}}^{-1} \).

We define the state as \( (M_1,n_1,\dots ,M_i,n_i;M_{i+1},\dots ,M_K) \); that is, the system has i busy servers and \( K-i \) idle servers, with the corresponding numbers of customers waiting between the busy servers. Here \( M_1,\dots ,M_K \) is a permutation of \( 1,\dots ,K \). Servers \( M_1,\dots ,M_i \) serve customers in increasing order of arrival times, and \( n_j \) customers wait between servers \( M_j \) and \( M_{j+1} \), \(1 \le j \le i\). Note that, because of the FCFS processing order, the customers waiting between servers \( M_j \) and \( M_{j+1} \) can be handled only by the servers \( M_1 ,\dots , M_j\) and not by any of the servers \( M_{j+1} ,\dots , M_i\) or any of the idle servers (see Fig. 4). Servers \( M_{i+1},\dots ,M_K \) are idle, ordered by increasing idle time. Similar to Adan and Weiss (2014), we consider the queueing system under the FCFS-ALIS policy. The steady-state probability is given by [see Adan and Weiss (2014)]

$$\begin{aligned}&\pi (M_1,n_1,\dots ,M_i,n_i;M_{i+1},\dots ,M_K) \nonumber \\&\quad = B \left( \prod _{j=1}^{i}\mu _{\{M_1,\dots , M_j\}} \prod _{j=i+1}^{K}\lambda _{\mathcal {C}(\{M_j,\dots , M_K\})}\right) ^{-1} \alpha _1^{n_1}\dots \alpha _i^{n_i}, \end{aligned}$$
(18)

where B is a normalization constant.

Fig. 4
figure 4

Graphical illustration of the system state in the queueing system

The marginal probability of the server status \( \varvec{M}_i = (M_1,\dots ,M_i;M_{i+1}, \dots , M_K) \) is given by

$$\begin{aligned} \pi (\varvec{M}_i) = B \left( \prod _{j=1}^{i}\mu _{\{M_1,\dots , M_j\}} \prod _{j=i+1}^{K}\lambda _{\mathcal {C}(\{M_j,\dots , M_K\})}\right) ^{-1} (1-\alpha _1)^{-1}\dots (1-\alpha _i)^{-1}. \end{aligned}$$
(19)

Let \( \Phi _{ki} = \{j | 1 \le j \le i, k \in \mathcal {U}(\{M_1,\dots , M_j\})\} \). As mentioned above, \( n_j \) is the number of customers waiting between servers \( M_j \) and \( M_{j+1} \). We use \(n_{kj} \) and \(N_k\) to denote the number of type-\(k\) customers among these \( n_j \) customers and the total number of type-\(k\) customers in the queue, respectively.

Conditioning on server status \( \varvec{M}_i = (M_1,\dots ,M_i;M_{i+1}, \dots ,M_K) \), similar to the discussion in Visschers et al. (2012), \( n_j \) is a geometric random variable with parameter \( \alpha _j \),

$$\begin{aligned} \mathbf {E}[n_j|\varvec{M}_i] = \frac{\alpha _j}{1-\alpha _j}; \end{aligned}$$
(20)

\( n_{kj} \) is a geometric random variable with parameter \( \eta _{kj} \) ,

$$\begin{aligned} \mathbf {E}[n_{kj}|\varvec{M}_i] = \frac{\eta _{kj}}{1-\eta _{kj}}, \end{aligned}$$
(21)

where

$$\begin{aligned} \eta _{kj} = \frac{\lambda _k}{\mu _{\{M_1,\dots , M_j\}} - \lambda _{\mathcal {U}(\{M_1,\dots , M_j\})} + \lambda _k}; \end{aligned}$$

and the expectation of \( N_k \) is expressed as

$$\begin{aligned} \mathbf {E}[N_k|\varvec{M}_i] = \sum _{j \in \Phi _{ki}} \frac{\eta _{kj}}{1-\eta _{kj}}. \end{aligned}$$
(22)

By the law of total probability, combining Eqs. (19) and (22) yields

$$\begin{aligned} \mathbf {E}[N_k] = B \sum _{\varvec{M}_i} \frac{\sum _{j \in \Phi _{ki}} \frac{\eta _{kj}}{1-\eta _{kj}}}{\prod _{j=1}^{i}(1-\alpha _j)\mu _{\{M_1,\dots , M_j\}} \prod _{j=i+1}^{K}\lambda _{\mathcal {C}(\{M_j,\dots , M_K\})}} \end{aligned}$$
(23)

By applying Little’s law, \( \mathbf {E}[N_k] = \lambda _k \mathbf {E}[W_k] \), we can compute the average waiting time of type-\(k\) customers. The average number of customers in the queue and the corresponding average waiting time are therefore given by

$$\begin{aligned} \mathbf {E}[N] = \sum _{k=1}^K \mathbf {E}[N_k], \quad \mathbf {E}[W] = \mathbf {E}[N]/\lambda _{\mathcal {C}}. \end{aligned}$$
(24)

Hence, we have

$$\begin{aligned} \mathbf {E}[N]&= B \sum _{k=1}^K \sum _{\varvec{M}_i} \frac{\sum _{j \in \Phi _{ki}} \frac{\eta _{kj}}{1-\eta _{kj}}}{\prod _{j=1}^{i}(1-\alpha _j)\mu _{\{M_1,\dots , M_j\}} \prod _{j=i+1}^{K}\lambda _{\mathcal {C}(\{M_j,\dots , M_K\})}} \end{aligned}$$
(25)
$$\begin{aligned}&= B \sum _{\varvec{M}_i} \frac{ \sum _{j = 1}^{i} \sum _{k \in \mathcal {U}(\{M_1,\dots , M_j\})} \frac{\eta _{kj}}{1-\eta _{kj}}}{\prod _{j=1}^{i}(1-\alpha _j)\mu _{\{M_1,\dots , M_j\}} \prod _{j=i+1}^{K}\lambda _{\mathcal {C}(\{M_j,\dots , M_K\})}}. \end{aligned}$$
(26)

Proposition 4

The average waiting time in the queueing system is given by

$$\begin{aligned} \mathbf {E}[W] = B \lambda _{\mathcal {C}}^{-1} \sum _{\varvec{M}_i} \frac{ \sum _{j = 1}^{i} \sum _{k \in \mathcal {U}(\{M_1,\dots , M_j\})} \frac{\eta _{kj}}{1-\eta _{kj}}}{\prod _{j=1}^{i}(1-\alpha _j)\mu _{\{M_1,\dots , M_j\}} \prod _{j=i+1}^{K}\lambda _{\mathcal {C}(\{M_j,\dots , M_K\})}}. \end{aligned}$$
(27)

In particular, if all \( \lambda _i = \lambda \) and \( \mu _j = \mu \), let \( \rho := \lambda /\mu \); then the formula simplifies to

$$\begin{aligned} \mathbf {E}[W]&= B \lambda _{\mathcal {C}}^{-1} \sum _{\varvec{M}_i} \frac{\sum _{j = 1}^{i} \sum _{k \in \mathcal {U}(\{M_1,\dots , M_j\})} \frac{\rho }{j(1-\alpha _j)}}{\prod _{j=1}^{i} j\mu (1-\alpha _j) \prod _{j=i+1}^{K}\lambda _{\mathcal {C}(\{M_j,\dots , M_K\})}}, \end{aligned}$$
(28)

where \( \alpha _j = \lambda _{\mathcal {U}(\{M_1,\dots , M_j\})}/j\mu \).
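Proposition 4 can be evaluated numerically by summing over all ordered server configurations, i.e., over the permutations of the K servers together with the number i of busy servers, which is feasible for the moderate K used in our experiments. The sketch below (a hypothetical `avg_wait` function reusing the earlier helpers) implements the symmetric formula (28), with the normalization constant B obtained from the marginal distribution (19); it reflects our reading of the formulas rather than the authors' code.

```python
from itertools import permutations

def avg_wait(K, k, lam, mu):
    """E[W] of the symmetric k-chain queueing system via Eqs. (19) and (28).
    Enumerates all ordered server configurations; practical only for small K."""
    C_of = k_chain(K, k)
    servers = list(range(K))
    inv_B = 0.0       # sum of unnormalized pi(M_i) over all configurations (gives 1/B)
    ew_sum = 0.0      # the double sum in Eq. (28), before multiplying by B / lambda_C
    for perm in permutations(servers):
        for i in range(K + 1):                       # first i servers of perm are busy
            busy_prod, num = 1.0, 0.0
            for j in range(1, i + 1):
                u = len(U(set(perm[:j]), C_of))      # |U({M_1, ..., M_j})|
                alpha = lam * u / (j * mu)           # alpha_j
                busy_prod *= j * mu * (1.0 - alpha)
                num += u * (lam / mu) / (j * (1.0 - alpha))
            idle_prod = 1.0
            for p in range(i, K):                    # idle servers perm[i], ..., perm[K-1]
                idle_prod *= lam * len(C(set(perm[p:]), C_of))
            weight = 1.0 / (busy_prod * idle_prod)   # pi(M_i) / B, from Eq. (19)
            inv_B += weight
            ew_sum += num * weight
    return ew_sum / inv_B / (K * lam)                # multiply by B, divide by lambda_C

# Example: average delay of a 2-chain with K = 4 servers at rho = 0.9 (mu = 1).
w = avg_wait(4, 2, 0.9, 1.0)
```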

In the following, we consider the symmetric case where all \( \lambda _i = \lambda \) and \( \mu _j = \mu \). The dedicated system consists of K parallel, independent M/M/1 queues, so the average number of customers in the system is \( K\rho (1-\rho )^{-1} \). For each customer type, the average response time (the sum of waiting and service times) is \( 1/(\mu - \lambda ) \), and the average waiting time is \( 1/(\mu - \lambda ) - 1/\mu = \rho /(\mu - \lambda )\).

The full flexibility system is in fact an M/M/K queue with total arrival rate \( \lambda _{\mathcal {C}} = K\lambda \). Let \( \rho ' := \lambda _{\mathcal {C}}/K\mu \) (which equals \( \rho \) in the symmetric case) and let \( p_k \) denote the stationary probability that k customers are in the system. The stationary distribution is given by

$$\begin{aligned} p_k = {\left\{ \begin{array}{ll} p_0\frac{(K \rho ')^k}{k!},\quad k \le K\\ p_0\frac{K^K \rho '^k}{K!},\quad k \ge K \end{array}\right. } \end{aligned}$$
(29)

where

$$\begin{aligned} p_0 = \left[ \sum _{k=0}^{K-1} \frac{(K\rho ')^k}{k!} + \frac{(K\rho ')^K}{K!} \frac{1}{1-\rho '} \right] ^{-1}. \end{aligned}$$
(30)

The probability that an arriving customer will be forced to wait in the queue is

$$\begin{aligned} p_{K_+} = p_0 \frac{(K\rho ')^K}{K!} \frac{1}{1-\rho '}. \end{aligned}$$
(31)

The expected number of customers in the system is given by

$$\begin{aligned} E_K = K\rho ' + \frac{\rho '}{1-\rho '} p_{K_+}. \end{aligned}$$
(32)
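The two benchmark waiting times, the dedicated M/M/1 case and the fully pooled M/M/K case of Eqs. (29)–(32), can be coded directly; a sketch with hypothetical names (taking the per-type rate \( \lambda \), so that the total arrival rate is \( K\lambda \)) follows.

```python
from math import factorial

def wait_dedicated(lam, mu):
    """Average waiting time in one dedicated M/M/1 queue: rho / (mu - lam)."""
    return (lam / mu) / (mu - lam)

def wait_pooled(K, lam, mu):
    """Average waiting time in the fully pooled M/M/K queue with total arrival
    rate K*lam (per-type rate lam), following Eqs. (29)-(32)."""
    a = K * lam / mu                    # offered load, K * rho'
    rho_p = a / K                       # rho', per-server utilization
    p0 = 1.0 / (sum(a**n / factorial(n) for n in range(K))
                + a**K / factorial(K) / (1.0 - rho_p))          # Eq. (30)
    p_wait = p0 * a**K / factorial(K) / (1.0 - rho_p)           # Eq. (31)
    queue_len = p_wait * rho_p / (1.0 - rho_p)                  # E_K - K*rho', cf. Eq. (32)
    return queue_len / (K * lam)                                # Little's law
```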

In the queueing system, arriving customers who find no available server join the queue, and the waiting time (delay) in this queue is the main performance measure. We denote by \( W_{k,K}(\rho ) \) the customer waiting time in a queueing system of size K with k-chain configuration and traffic intensity \( \rho \). Figure 5 shows the average waiting time for different levels of chaining configuration (flexibility) and five traffic intensity levels, \( \rho = 0.6, 0.7, 0.8, 0.9 \) and 0.95. As expected, the heavier the traffic load, the longer the waiting time. Moreover, the average waiting time drops significantly when flexibility increases from 1 to 2. Table 5 provides additional details.

Table 5 Average waiting time of k-chain with different sizes and traffic loads
Fig. 5
figure 5

Performance of k-chain configuration in a queueing system with \( K = 8 \)

As in the loss system, we use the following benchmark to evaluate k-chain efficiency:

$$\begin{aligned} \psi _{k,K}(\rho ) = \frac{\mathbf {E}[W_{1,K}(\rho )]-\mathbf {E}[W_{k,K}(\rho )]}{\mathbf {E}[W_{1,K}(\rho )]-\mathbf {E}[W_{K,K}(\rho )]}. \end{aligned}$$
(33)

Table 6 shows the efficiency of the k-chain for different sizes and traffic loads. For given K and \(\rho \), \( \psi _{k,K}(\rho ) \) increases with the flexibility k, so the efficiency values in Table 6 increase from left to right. For given K and k, \( \psi _{k,K}(\rho ) \) increases with the traffic intensity \( \rho \). Take \( K=8 \) as an example: the 2-chain cannot achieve >90 % efficiency when \( \rho < 0.9 \), whereas the 3-chain attains >93 % efficiency under all investigated traffic loads. A significant increment is observed from the 2-chain to the 3-chain or 4-chain; the marginal increment becomes very small as flexibility increases further. To facilitate efficient system operation, we therefore suggest employing a 3-chain or 4-chain structure rather than a 2-chain. Taking traffic intensity into account, a 3-chain configuration may be enough when the system is under a heavy traffic load, whereas for systems with \( \rho =0.8 \) or \( \rho =0.9 \) a 4-chain structure could be better. More details can be seen in Fig. 6. Our observation is consistent with Shi et al. (2015), who proved, in a quite different setting, that the efficiencies of chaining structures approach 100 % as the traffic intensity approaches 1.

Table 6 Efficiency of k-chain with different sizes and traffic loads (%)
Fig. 6
figure 6

Efficiency of k-chain configuration in a queueing system with \( K = 8 \)

Figure 7 shows the average waiting time under the k-chain configuration as the system size K increases from 2 to 8, with \( \rho =0.9 \). \( \mathbf {E}[W_{k,K}(\rho )] \) decreases with K towards a limit. This result implies that when a large k-chain is split into several independent small chains, system performance does not vary significantly. For example, the average waiting time is approximately 1.91 for a 2-chain system of size \( K = 8 \); if we separate it into two independent sub-systems of size \( K = 4 \), the average waiting time in each is 2.39.

Fig. 7
figure 7

Average waiting time under k-chain configuration with \( \rho =0.9 \)

6 Conclusion

In this study, we investigate k-chain efficiency in stochastic networks with uncertainty on both the demand and supply sides, considering loss systems and queueing systems. The key performance measure for loss systems is the percentage of lost customers. Unlike Chou et al. (2010), who show that the performance of the 2-chain is close to 96 % of that of the full flexibility system when \( \rho =1 \), we find that the 2-chain is no longer effective in the loss system; the situation worsens as traffic intensity increases, and flexibility must be increased to achieve better performance. The key performance measure for queueing systems is the average waiting time. In these systems, the 2-chain performs rather well when the traffic intensity is close to 1, which is consistent with the existing literature showing that the 2-chain is an ideal structure. However, the efficiency of the 2-chain drops when the traffic intensity is lower, so a higher flexibility level, e.g., a 3-chain or 4-chain, is required.

Our work can be extended in several directions. One possible direction for future work is the analysis of asymmetric systems. In this research, we focus on the symmetric case in which all servers are identical, all customer types are identical, and the number of servers equals the number of customer types; in reality, however, we often face asymmetric systems in which demands differ across customer types or service rates differ across servers. Another direction is to investigate structural properties of more general network designs rather than focusing only on the k-chain configuration. In addition, better algorithms could be developed to compute the performance of large systems.