1 Introduction

In this paper, we consider a parallel network of \(N\) single-server queues. The speeds of the servers vary over time and are in addition mutually dependent. More specifically, we assume that these service speeds are governed by a single, irreducible, continuous-time Markov chain with a finite state space. For this network, we are interested in both the marginal and the joint workload processes for each of the queues, as well as the processes describing the virtual waiting time and the queue length. Stationary distributions for these processes are difficult to obtain, since the workload process pertaining to one queue as well as the virtual waiting-time and the queue length processes are correlated with the corresponding processes of the other queues. Our goal in this paper is to derive the heavy-traffic behaviour of the network by obtaining the limiting stationary distributions of the aforementioned processes. These results can serve as simple and accurate approximations when the network is heavily utilised or can be combined with known light-traffic results to obtain approximations for arbitrarily loaded systems (see, for example, [14]).

The study of this general network is motivated by the fact that multi-queue performance models with time-varying and mutually dependent service speeds find a wide variety of applications. An example is the field of wireless networks, where multiple users transmit data packets through a wireless medium at speeds that are typically varying over time and mutually dependent, for example due to phenomena such as ‘shadow fading’ (cf. [38]). Another such application constitutes an I/O subsystem of an application server (see, for example, [40]), in which the content of multiple I/O buffers is transferred to clients at varying and mutually dependent speeds, due to the varying level of congestion of the application server’s network connection. A final example is given by the phenomenon of garbage collection in multi-threaded computer systems (cf. [33]). Typically, when the total memory utilisation in such a system exceeds a certain threshold, the processing speeds of the threads are temporarily reduced and are as such mutually dependent.

Queueing models with service speeds that vary over time have received attention in multiple settings in the literature. In practice, service speeds may be dependent on factors such as the workload present in the system, which leads to the formulation of queues with state-dependent service rates; see, for example, [3] for an overview. Another branch of work on time-varying service speeds is that of service rate control, where the aim is to minimise waiting and capacity costs (for example [2, 16, 35, 41]) or to optimise a trade-off between service quality and service speed (for example [20]) based on the state of the system by dynamically varying the service speed. In our case, the service speeds depend on an external environment that is governed by a Markov process. Analyses of single-server queueing models with Markov-modulated service speeds can be found in [17, 27, 29, 30, 37]. However, none of these papers concern themselves with the derivation of heavy-traffic asymptotics. In this paper, we focus on a queueing network where the service speeds of all servers in the network are simultaneously governed by a single continuous-time Markov chain. This allows us to incorporate mutual dependencies between the service speeds into the model. Conceptually, there are no additional challenges in obtaining heavy-traffic results for the queueing network with multiple queues compared to the single-queue case, although deriving the results for the multi-queue case is more cumbersome at times.

We are mainly interested in the heavy-traffic asymptotics of the network of queues. The study of queues in heavy traffic was initiated by Kingman with a series of papers in the 1960s, starting with [24]; see [25] for an overview of these early results. These papers were largely focused on the use of Laplace transforms. In our case, however, Laplace transforms for the stationary distribution of the total workload process or even the workload process for a queue in isolation are hard to obtain. The workload process of a queue in isolation can in principle be modelled as a reflected Markov additive process (MAP). For the definition and an overview of the standard theory on MAPs, see [1, Section XI.2]. However, the stationary distribution of the workload process is not easily derived from that. For example, standard techniques such as relating the Laplace transforms of the stationary workload conditional on the states of the modulator to each other typically lead to a linear system with a number of equations smaller than the number of unknowns, defying straightforward solutions, as shown in [21]. Less straightforward computations might involve studying the singularities of the characterising matrix exponent pertaining to the reflected MAP (cf. [21]). In the past, stationary distributions for special cases of reflected MAPs have also been analysed by studying their spectral expansion (for example [28]) or by determining the boundary probabilities in terms of the solution of a generalised eigenvalue problem (for example [39]).

As it is not clear that the approach via Laplace transforms will work in our case, we will use a functional central limit theorem approach mainly developed by Iglehart and Whitt; see [43] for an overview. This is not always trivial; see for example [10, 26]. Heavy-traffic approximations for generalised Jackson networks were studied in [5, 15]. However, the model that we consider does not fall in the framework of generalised Jackson networks. Instead, we tailor more classical arguments for single-node systems to our setting. An advantage of our approach is that it can be extended to allow for variations or generalisations of our model. For example, it is assumed that the workload input processes of the queues are compound Poisson processes. As we will see in the sequel, however, our heavy-traffic analysis still works through completely under relaxed assumptions if Lemma 3.2 can be proved for this more general setting.

As we study networks with general service speeds, our model also captures a class of queues with service interruptions. Heavy-traffic asymptotics for single-server queues with vacations have been studied in [23]. Related but different problems are networks with interruptions, of which durations and frequency scale with the traffic intensity, and have been studied in [6, 23] and [43, Section 14.7]. As opposed to these models, our model allows the durations of consecutive service interruptions, which we assume to be independent of the traffic intensity, to be interdependent through the Markovian random environment (see also [8]), and the interruptions are not restricted to a point in time the queue empties.

For the network that we study in this paper, we find that the marginal workload, virtual waiting-time and queue length processes pertaining to a queue in isolation exhibit state-space collapse under heavy-traffic assumptions and have exponential limiting distributions. Moreover, we show that the limiting distribution of the joint workload process (as well as that of the virtual waiting-time and the queue length processes) corresponds to the stationary distribution of an \(N\)-dimensional semi-martingale reflected Brownian motion (SRBM) with state space \(\mathbb {R}_+^N\) (see, for example, [7, Theorem 6.2] for a definition). The reflection matrix corresponding to this SRBM is an identity matrix, so that positive conclusions about the existence of a stationary distribution can be drawn (cf. [18]). However, computing this distribution is challenging. The conditions needed for the stationary distribution to have a product form do not apply to our model, and results such as those of [11] seem hard to translate to our setting. In this paper, we therefore show how to use the numerical methods developed in [9] for steady-state analysis of multi-dimensional SRBMs to analyse the joint limiting distribution of the stationary workload process. This allows us to compute quantities such as the correlation coefficients between the marginal components.

The rest of this paper is organised as follows. Section 2 describes the model in detail, gives the necessary notation and gives several preliminary results. In Sect. 3, we derive the heavy-traffic limit for a properly scaled workload process and observe that the stationary distribution of the marginal workload processes converges to an exponential distribution. Section 4 extends these results to heavy-traffic limits for the virtual waiting-time and queue length processes. Finally, in Sect. 5, we study how one can compute the joint distribution of the limiting processes pertaining to the workloads, virtual waiting times and the queue lengths, by viewing these as SRBMs. By means of simulation results, we also show that the obtained heavy-traffic results give rise to accurate approximations for considerably loaded systems, which mark the usefulness of the heavy-traffic analysis that we perform from an application perspective.

2 Notation and preliminaries

In this section, we introduce the notation used in this paper, and we present several preliminary results. In the remainder of this paper, vectors and matrices are printed in bold face. Furthermore, \(\varvec{0}\) and \(\varvec{1}\) represent vectors of appropriate size where each of the elements are equal to zero and one, respectively.

2.1 Arrival processes

We study the heavy-traffic asymptotics of a network consisting of \(N\) parallel single-server queues \(Q_1, \ldots , Q_N\), each with its own dedicated arrival stream. Type-\(i\) customers arrive at \(Q_i\) according to a Poisson process with rate \(\lambda _i\) and have a service requirement distributed according to a random variable \(B_i\) with finite first two moments \(\mathbb {E}[B_i]\) and \(\mathbb {E}[B_i^2]\). In particular, we represent by \(B_{i,j}\) the service requirement of the \(j\)-th arriving type-\(i\) customer. We assume the service requirements of all customers to be mutually independent. Further, we denote by \(\{N_i(t), t>0\}\) a unit-rate Poisson process. Then, the cumulative workload that enters \(Q_i\) during the time interval \([0,t)\) is given by

$$\begin{aligned} V_i(\lambda _i t) = \sum _{j=1}^{N_i(\lambda _i t)} B_{i,j}, \end{aligned}$$

where the arrival rate is left as part of the argument, as this will prove to be useful for heavy-traffic scaling purposes in the sequel. In the remainder of this paper, we will refer to \(\{V_i(t), t \ge 0\}\) as the arrival process of \(Q_i\). The mean corresponding to this arrival process is given by \(m_{V,i} = \mathbb {E}[V_i(1)] = \mathbb {E}[B_i]\). Similarly, the variance is given by \(\sigma ^{2}_{V, i} = \text{ Var }[V_i(1)] = \mathbb {E}[N_i(1)]\text{ Var }[B_i]+\text{ Var }[N_i(1)]\mathbb {E}[B_i]^2 = \text{ Var }[B_i]+\mathbb {E}[B_i]^2 = \mathbb {E}[B_i^2]\). Note that the arrival process has stationary and independent increments, so that \(t^{-1}\mathbb {E}[V_i(t)] = m_{V,i}\) and \(t^{-1}\text{ Var }[V_i(t)] = \sigma ^{2}_{V, i}\) for any \(t>0\).

2.2 Cumulative service processes

The service speeds of the \(N\) servers serving \(Q_1, \ldots , Q_N\) may vary over time and are mutually dependent. More specifically, the joint process of these service speeds is modulated by a single irreducible, stationary, continuous-time Markov chain \(\{\varPhi (t), t \ge 0\}\) with finite state space \(\varvec{\mathcal {S}}\) and invariant probability measure \(\varvec{\pi } = (\pi _i)_{i\in \varvec{\mathcal {S}}}\). When this Markov chain resides in the state \(\omega \in \varvec{\mathcal {S}}\), the server of \(Q_i\) drains its queue at service rate \(\phi _{i}(\omega )\). We have as a consequence that the workload that the server of \(Q_i\) has been capable of processing during the time interval \([0,t)\) is represented by

$$\begin{aligned} C_i(t) = \int _{0}^t \phi _i(\varPhi (s)) \mathrm{d}s. \end{aligned}$$

We will also refer to the process \(\{C_i(t), t \ge 0\}\) as the cumulative service process of \(Q_i\). Note that, as the Markov process \(\{\varPhi (t), t \ge 0\}\) is in stationarity, the increments of the process \(\{C_i(t), t \ge 0\}\) are also stationary. The mean corresponding to the process \(\{C_i(t), t \ge 0\}\) is given by

$$\begin{aligned} m_{C,i} = \mathbb {E}[C_i(1)] = \int _{0}^1\sum _{\omega \in \varvec{\mathcal {S}}}\phi _i(\omega )\mathbb {P}\left( \varPhi (s) = \omega \right) \mathrm{d}s = \sum _{\omega \in \varvec{\mathcal {S}}} \phi _i(\omega ) \pi _\omega . \end{aligned}$$

Since the \(C_i\)-process has stationary increments, it holds that \(t^{-1}\mathbb {E}[C_i(t)] = m_{C,i}\) for any \(t>0\). We denote the asymptotic variance \(\lim _{t \rightarrow \infty } t^{-1}\text{ Var }[C_i(t)]\) by \(\sigma ^{2}_{C, i}\). Similarly, the long-run time-averaged covariance between the cumulative service processes of the servers at \(Q_i\) and \(Q_j\) is represented by \(\gamma ^{C}_{i, j} = \lim _{t \rightarrow \infty } \frac{1}{t} \text{ Cov }[C_i(t), C_j(t)]\). Computing expressions for \(\sigma ^{2}_{C, i}\) and \(\gamma ^{C}_{i, j}\) is not trivial. We focus on this problem in Sect. 5.2.

2.3 Scaling

A queue \(Q_i\) is said to be ‘stable’ if the expected amount of arriving work \(\lambda _i\mathbb {E}[B_i]\) per time unit is smaller than the average workload \(m_{C,i}\) that its server is capable of processing per time unit. Equivalently, \(Q_i\) is stable if its load, defined as \(\rho _i = \frac{\lambda _i\mathbb {E}[B_i]}{m_{C,i}}\), is less than one. We are interested in the performance of the network of queues in heavy traffic; i.e. the case for which the arrival rates \(\lambda _1, \ldots , \lambda _N\) are scaled so that \((\rho _1, \ldots , \rho _N) \rightarrow \varvec{1}\). For this purpose, it is convenient to introduce the index \(r\). In the \(r\)-th system, each arrival rate \(\lambda _i\) is taken so that \(\beta _i(1-\rho _i)^{-1} = r\), where the \(\beta _i\)-parameters control the rate at which the arrival rates are scaled by \(r\), while the series of service requirements \(B_{i,1}, B_{i,2}, \ldots \) and the \(C_i\)-processes are not scaled by \(r\). The heavy-traffic limit for any performance measure of the system corresponds to the limit \(r \rightarrow \infty \). We denote by \(\lambda _{i,r}\) the arrival rate of type-\(i\) customers corresponding to the \(r\)-th system, so that \(\lambda _{i,r} \rightarrow \frac{m_{C,i}}{\mathbb {E}[B_i]}\) when \(r \rightarrow \infty \). For notational convenience, we write for two functions \(f(r)\) and \(g(r)\) that \(f(r) = o(g(r))\) if \(\lim _{r \rightarrow \infty } f(r)/g(r) = 0\).

2.4 Functional central limit theorems for primitive processes

For purposes that will become clear in the sequel, we now state heavy-traffic limits for the primitive processes that are scaled in time by a factor \(r^2\). First, for the scaled arrival processes, we observe that \(\mathbb {E}[V_i(\lambda _{i,r} r^2 t)] = \lambda _{i,r}r^2\mathbb {E}[B_i]t\). As the arrival processes constitute independent renewal reward processes, the functional central limit theorem for renewal reward processes (see, for example, [43, Theorem 7.4.1]) implies that

$$\begin{aligned}&\Big \{\Big (\frac{V_1(\lambda _{1,r}r^2t)-\lambda _{1,r}r^2\mathbb {E}[B_1]t}{\sqrt{\lambda _{1,r}}r}, \ldots , \frac{V_N(\lambda _{N,r}r^2t)- \lambda _{N,r}r^2\mathbb {E}[B_N]t}{\sqrt{\lambda _{N,r}}r}\Big ), \nonumber \\&\qquad \qquad t\ge 0\Big \} \,{\buildrel d \over \rightarrow }\,\{\varvec{Z}_V(t), t\ge 0\} \end{aligned}$$
(1)

as \(r \rightarrow \infty \), where \(\{\varvec{Z}_V(t), t\ge 0\}\) is an \(N\)-dimensional Brownian motion with zero drift and covariance matrix \(\varvec{\Gamma }^V = \mathrm{diag}(\sigma ^{2}_{V, 1}, \ldots , \sigma ^{2}_{V, N})\).

Similarly, after observing that \(\mathbb {E}[C_i(r^2t)] = m_{C,i}r^2t\), it follows from results in [42] that the time-scaled cumulative service processes satisfy

$$\begin{aligned} \Big \{\Big (\frac{C_1(r^2t)-m_{C,1}r^2t}{r}, \ldots , \frac{C_n(r^2t)-m_{C,N}r^2t}{r}\Big ), t \ge 0\Big \} \,{\buildrel d \over \rightarrow }\,\{\varvec{Z}_C(t), t\ge 0\} \end{aligned}$$
(2)

as \(r \rightarrow \infty \), where \(\{\varvec{Z}_C(t), t\ge 0\}\) is an \(N\)-dimensional Brownian motion with zero drift and covariance matrix \(\varvec{\Gamma }^C\) with elements \(\Gamma ^C_{i,j} = \gamma ^{C}_{i, j}\). Alternatively, this result follows from the functional central limit theorem for MAPs obtained in [34, Theorem 3.4]. Using the results of [34], we will show how to obtain expressions for \(\gamma ^C_{i,j}\) in Sect. 5.2.

A heavy-traffic limit for the joint scaled net-input process now follows by combining (1) and (2) with the observation that \(\frac{\lambda _{i,r}r^2\mathbb {E}[B_i]t-m_{C,i}r^2t}{r} = \beta _im_{C,i}t\). In particular, this leads to

$$\begin{aligned} \Big \{\Big (\frac{V_1(\lambda _{1,r}r^2t)-C_1(r^2t)}{r}, \ldots , \frac{V_N(\lambda _{N,r}r^2t)-C_N(r^2t)}{r}\Big ), t \ge 0\Big \} \,{\buildrel d \over \rightarrow }\,\{\varvec{Z}(t), t\ge 0\} \end{aligned}$$
(3)

as \(r \rightarrow \infty \), where \(\{\varvec{Z}(t) = (Z_1(t), \ldots , Z_N(t)), t \ge 0\}\) is an \(N\)-dimensional Brownian motion with drift vector \(\varvec{\mu } = (-\beta _1m_{C,1}, \ldots , -\beta _Nm_{C,N})\) and covariance matrix

$$\begin{aligned} \varvec{\Gamma } = \mathrm{diag}\left( \frac{m_{C,1}}{\mathbb {E}[B_1]}\sigma ^{2}_{V, 1}, \ldots , \frac{m_{C,N}}{\mathbb {E}[B_N]}\sigma ^{2}_{V, N}\right) +\varvec{\Gamma }^C. \end{aligned}$$
(4)

2.5 Representations

Let \(\{\varvec{W}_r(t) = (W_{1,r}(t), \ldots , W_{N, r}(t)), t \ge 0 \}\) be the process that describes the workload in each queue of the \(r\)-th system at time \(t\) and let \(\varvec{W}_r = (W_{1,r}, \ldots , W_{N, r}) = \varvec{W}_r(\infty )\) denote the workload in the system in steady state. The processes \(\{\varvec{D}_{r}(t), t \ge 0 \}\) and \(\{\varvec{L}_{r}(t), t \ge 0 \}\) as well as \(\varvec{D}_{r}\) and \(\varvec{L}_{r}\) are similarly defined for the virtual waiting time (the delay faced by an imaginary customer arriving at time \(t\)) and the queue length (excluding the customer in service), respectively.

The workload \(W_{i, r}(t)\) present in \(Q_i\) at time \(t\) can be represented by the one-sided reflection of the net-input process \(\{V_i(\lambda _{i,r}t)-C_i(t), t \ge 0\}\), under the assumption that \(W_{i,r}(0) = 0\):

$$\begin{aligned} W_{i, r}(t)&= V_i(\lambda _{i,r}t)-C_i(t)-\inf _{s \in [0, t]}\left\{ V_i(\lambda _{i,r}s)-C_i(s)\right\} \nonumber \\&= \sup _{s \in [0,t]} \left\{ V_i(\lambda _{i,r}t)-V_i(\lambda _{i, r}s)-(C_i(t)-C_i(s))\right\} \!. \end{aligned}$$
(5)

As the joint cumulative service process \(\{(C_1(t), \ldots , C_N(t)), t \ge 0\}\) has stationary increments, it holds that \(\Big (C_1(t)-C_1(s), \ldots , C_N(t)-C_N(s)\Big ) \,{\buildrel d \over =}\,\Big (C_1(t-s), \ldots , C_N(t-s)\Big )\), where \(\,{\buildrel d \over =}\,\) means equality in distribution. Furthermore, since the arrival processes are independent, and compound Poisson processes have time-reversible increments, we also have that \(\Big (V_1(\lambda _{1,r}t)-V_1(\lambda _{1,r}s), \ldots , V_N(\lambda _{N,r}t)-V_N(\lambda _{N,r}s)\Big ) \,{\buildrel d \over =}\,\Big (V_1(\lambda _{1,r}(t-s)), \ldots , V_N(\lambda _{N,r}(t-s))\Big )\). Due to this, we have by (5) that \(\varvec{W}_{r}(t)\) satisfies

$$\begin{aligned} \varvec{W}_{r}(t)&\,{\buildrel d \over =}\,\Big (\sup _{s \in [0,t]} \left\{ V_1(\lambda _{1,r}(t-s))-C_1(t-s)\right\} , \ldots , \sup _{s \in [0,t]} \left\{ V_N(\lambda _{N,r}(t-s))\right. \\&\quad \left. -\,\,C_N(t-s)\right\} \Big ) \\&= \Big (\sup _{s \in [0,t]} \left\{ V_1(\lambda _{1,r}(s))-C_1(s)\right\} , \ldots , \sup _{s \in [0,t]} \left\{ V_N(\lambda _{N,r}(s))-C_N(s)\right\} \Big ). \end{aligned}$$

By letting \(t \rightarrow \infty \), this results in

$$\begin{aligned} \varvec{W}_r \,{\buildrel d \over =}\,\Big (\sup _{s\ge 0} \left\{ V_1(\lambda _{1, r} s) - C_1(s)\right\} , \ldots , \sup _{s \ge 0} \left\{ V_N(\lambda _{N, r} s) - C_N(s)\right\} \Big ). \end{aligned}$$
(6)

In this study, we are particularly interested in the distribution of the scaled workload \(\widetilde{\varvec{W}}_{r} = \frac{\varvec{W}_r}{r}\) (as well as the similarly defined scaled virtual waiting time \(\widetilde{\varvec{D}}_{r}\) and scaled queue length \(\widetilde{\varvec{L}}_{r}\)) in heavy traffic, i.e. as \(r \rightarrow \infty \). It is easily seen from (6) that the scaled workload can be written in terms of the similarly scaled net-input process. That is, after scaling time by a factor \(r^2\), we have

$$\begin{aligned} \widetilde{\varvec{W}}_{r} \,{\buildrel d \over =}\,\Big (\sup _{t\ge 0} \Big \{\frac{V_1(\lambda _{1, r} r^2t) - C_1(r^2t)}{r}\Big \}, \ldots , \sup _{t \ge 0} \Big \{\frac{V_N(\lambda _{N, r} r^2t) - C_N(r^2t)}{r}\Big \}\Big ). \end{aligned}$$
(7)

3 Heavy-traffic asymptotics of the workload

In this section, we derive the following heavy-traffic asymptotic result for the scaled workload \(\widetilde{\varvec{W}}_r\).

Theorem 3.1

For the scaled workload vector \(\widetilde{\varvec{W}}_r\), we have

$$\begin{aligned} \widetilde{\varvec{W}}_r \,{\buildrel d \over \rightarrow }\,\overline{\varvec{Z}}, \end{aligned}$$

as \(r \rightarrow \infty \), where \(\overline{\varvec{Z}} = (\overline{Z}_1, \ldots , \overline{Z}_N)\), \(\overline{Z}_i =\sup _{t\ge 0} \{Z_i(t)\}\), and \(Z_i(t)\) is as introduced in Sect. 2.

In order to prove this theorem, observe that, as opposed to the infinite-domain case, the supremum of càdlàg functions on a finite domain \([0,M)\), \(M \in \mathbb {R}_+\), is a continuous functional; see, for example, [43]. The proof uses this fact in combination with an additional result stated in Lemma 3.4. To prove Lemma 3.4, we first establish upper bounds of the tail probabilities of the suprema of the processes \(\{V_i(\lambda _{i,r}t)-\mathbb {E}[V_i(\lambda _{i,r})]t, t \ge 0\}\) and \(\{\mathbb {E}[C_i(1)]t - C_i(t), t \ge 0\}\) in Lemmas 3.2 and 3.3, respectively.

Lemma 3.2

For the arrival process \(\{V_i(\lambda _{i,r}), t \ge 0\}\) of \(Q_i\), we have that

$$\begin{aligned} \mathbb {P}\left( \sup _{t \in [0, T)} \{V_i(\lambda _{i,r}t)-\mathbb {E}[V_i(\lambda _{i,r})]t\} \ge x\right) \le \frac{\lambda _{i,r}\mathbb {E}[B_i^2]T}{x^2} \end{aligned}$$

for any \(r, x, T \in \mathbb {R}_+\).

Proof

As \(\{V_i(\lambda _{i,r}t)-\mathbb {E}[V_i(\lambda _{i,r})]t, t \ge 0\}\) is a right-continuous martingale, we have by Doob’s inequality (cf. [31, Theorem II.1.7]) that \(\mathbb {P}(\sup _{t \in [0, T)} \{V_i(\lambda _{i,r}t)-\mathbb {E}[V_i(\lambda _{i,r})]t\} \ge x) \le x^{-2}\sup _{t \in [0, T)} \{\text{ Var }[V_i(\lambda _{i,r}t)]\}\). Since \(\text{ Var }[V_i(\lambda _{i,r}t)] = \lambda _{i,r}\sigma ^{2}_{V, i}t\) is strictly increasing in \(t\), the lemma follows. \(\square \)

Lemma 3.3

For the cumulative service process \(\{C_i(t), t\ge 0\}\) pertaining to the server of \(Q_i\), there exist for every \(x, T \in \mathbb {R}_+\) a set of positive real constants \(c_1, c_2, c_3\) and \(c_4\) such that

$$\begin{aligned} \mathbb {P}\left( \sup _{t\in [0, T)} \{\mathbb {E}[C_i(1)]t - C_i(t)\} \ge x\right) \le \frac{c_1T}{x^2}+\frac{c_2}{T} + \frac{c_3T}{e^{c_4\sqrt{x}}}. \end{aligned}$$

Proof

The lemma is a consequence of Proposition 1 in [22]. Define \(h = \max _{\omega \in \varvec{\mathcal {S}}} \{\phi _i(\omega )\}\) and \(H(t) = ht-C_i(t)\). The process \(\{H(t), t \ge 0\}\) represents increments of the regenerative process \(\{h-\phi _i(\varPhi (t)), t \ge 0\}\) and regenerates for example every time the Markov process \(\{\varPhi (t), t \ge 0\}\) enters the reference state \(\omega = \varPhi (0)\). We denote the \(n\)-th of such regeneration times by \(T_n\). Furthermore, we define \(\gamma _n^* = \sup _{T_{n-1}\le t \le T_n} \{ H(t) - H(T_{n-1})\}\) and \(\nu _n = T_n-T_{n-1}\). Note that \(\nu _1, \nu _2, \ldots \) can be seen as i.i.d. samples from a random variable \(Y\), and represent return times of state \(\omega \) in the Markov chain \(\{\varPhi (t), t \ge 0\}\). Proposition 1 in [22] now implies that, for all \(x, T \in \mathbb {R}_+\), there exist positive real constants \(d_1, d_2, d_3\) and \(d_4\) such that

$$\begin{aligned} \mathbb {P}\left( \sup _{t\in [0, T)} \{\mathbb {E}[C_i(1)]t - C_i(t)\} > x\right) \le d_1\left( e^{-d_2\frac{x^2}{T}}+e^{-d_3 T} + Te^{-d_4\sqrt{x}}\right) , \end{aligned}$$
(8)

if \(\mathbb {E}[e^{\sqrt{\sup _{0 \le t \le Y}\{H(t)\}}}] < \infty \) and \(\mathbb {E}[e^{\sqrt{\gamma _n^*}}]<\infty \) for any \(n\in \mathbb {N}_+\). This statement follows by substituting the variables \(B_t\), \(b\) and \(Q(x)\) in [22, Proposition 1] by \(H(t)\), \(h-\mathbb {E}[C_i(1)]\) and \(\sqrt{x}\), respectively. To show that the necessary conditions hold in our case, observe that \(H(t)\) is non-decreasing in \(t\) and takes values from \([0,ht]\). By combining this with the fact that \(\sqrt{x} < \epsilon x+\frac{1}{\epsilon }\) for any \(x \ge 0\) and \(\epsilon > 0\), we have that \(\mathbb {E}[e^{\sqrt{\sup _{0 \le t \le Y}\{H(t)\}}}]=\mathbb {E}[e^{\sqrt{H(Y)}}] \le \mathbb {E}[e^{\sqrt{hY}}] < \mathbb {E}[e^{\epsilon h Y+\epsilon ^{-1}}] = e^{\epsilon ^{-1}}\mathbb {E}[e^{\epsilon hY}]\) for any \(\epsilon > 0\). As \(\gamma _n^* \le h \nu _n\) for any \(n>0\), similar computations yield that \(\mathbb {E}[e^{\sqrt{\gamma _n^*}}] < e^{\epsilon ^{-1}}\mathbb {E}[e^{\epsilon h Y}]\) for all \(n \in \mathbb {N}\) and any \(\epsilon >0\). Subsequently, note that the regeneration time \(Y\), which constitutes the return time of state \(\omega \) in the Markov chain \(\{\varPhi (t), t \ge 0\}\), can be decomposed into a period of time \(Y_1\) until the transition away from \(\omega \), and the following period \(Y_2\) until re-entry into state \(\omega \). The former period \(Y_1\) is exponentially distributed with a certain rate \(\alpha \), so that \(\mathbb {E}[e^{\epsilon h Y_1}] = \frac{\alpha }{\alpha -\epsilon h}\) for \(\epsilon < h^{-1}\alpha \). The latter period \(Y_2\) is easily seen to be stochastically smaller than a geometrically distributed random variable with the positive success parameter \(q = \min _{\omega ' \in \varvec{\mathcal {S}}\backslash \{\omega \}} \{\mathbb {P}(\varPhi (1) = \omega \mid \varPhi (0) = \omega ')\}\). Hence, \(\mathbb {E}[e^{\epsilon h Y_2}] \le \frac{q e^{\epsilon h}}{1-(1-q)e^{\epsilon h}}\) for \(\epsilon <-h^{-1}\log (1-q)\). As \(Y_1\) and \(Y_2\) are mutually independent, we thus have for \(0 < \epsilon < h^{-1}\min \{\alpha , -\log (1-q)\}\) that \(e^{\epsilon ^{-1}}\mathbb {E}[e^{\epsilon hY}] \le e^{\epsilon ^{-1}}\frac{\alpha }{\alpha -\epsilon h}\frac{q e^{\epsilon h}}{1-(1-q)e^{\epsilon h}} < \infty \), so that the necessary conditions are satisfied. The lemma now follows from (8) by noting that \(e^{-T} < T^{-1}\) for all \(T>0\) and taking \(c_1 = d_1d_2^{-1}, c_2 = d_1d_3^{-1}, c_3 = d_1\) and \(c_4 = d_4\). \(\square \)

Based on the results obtained in Lemmas 3.2 and 3.3, we now establish the final auxiliary result needed to prove Theorem 3.1. This result is summarised in the following lemma.

Lemma 3.4

The scaled net-input process \(\{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}, t>0\}\) corresponding to \(Q_i\) satisfies

$$\begin{aligned} \lim _{M\rightarrow \infty } \lim _{r \rightarrow \infty } \mathbb {P}\left( \sup _{t\ge M} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\Big \} \ge x\right) = 0 \end{aligned}$$

for all \(x,M \in \mathbb {R}_+\).

Proof

The first part of the proof is inspired by the proof of (20) in [32]. For any \(r\), let \(b_{i,r} = \frac{\mathbb {E}[V_i(\lambda _{i,r})] + \mathbb {E}[C_i(1)]}{2}\), so that \(b_{i,r} - \mathbb {E}[V_i(\lambda _{i,r})] = \mathbb {E}[C_i(1)] - b_{i,r} = \frac{m_{C,i}-\lambda _{i,r}\mathbb {E}[B_i]}{2} = \frac{1}{2}\beta _im_{C,i} r^{-1}\). Due to the subadditivity property of the supremum operator, we have for any \(M > 0\) that

$$\begin{aligned}&\mathbb {P}\left( \sup _{t\ge M} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\Big \} \ge x\right) \nonumber \\&\quad \le \mathbb {P}\left( \sup _{t\ge M} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-b_{i,r} r^2t}{r}\Big \} +\sup _{t\ge M} \Big \{\frac{b_{i,r} r^2t-C_i(r^2t)}{r}\Big \}\ge x\right) \nonumber \\&\quad \le \mathbb {P}\left( \sup _{t\ge M} \{V_i(\lambda _{i,r}r^2 t)-b_{i,r} r^2t\} \ge 0\right) + \mathbb {P}\left( \sup _{t\ge M} \{b_{i,r} r^2t-C_i(r^2t)\}\ge 0\right) \nonumber \\&\quad \le \sum _{j=0}^\infty \mathbb {P}\left( \sup _{t \in [2^jM, 2^{j+1}M)} \{V_i(\lambda _{i,r}r^2 t)-b_{i,r} r^2 t\} \ge 0\right) \nonumber \\&\qquad \quad + \sum _{j=0}^\infty \mathbb {P}\left( \sup _{t \in [2^jM, 2^{j+1}M)} \{b_{i,r} r^2 t-C_i(r^2t)\}\ge 0\right) \nonumber \\&\quad = \sum _{j=0}^\infty \mathbb {P}\left( \sup _{t \in [2^jr^2M, 2^{j+1}r^2M)} \{V_i(\lambda _{i,r} t)-\mathbb {E}[V_i(\lambda _{i,r})] t -\frac{1}{2}\beta _im_{C,i}r^{-1}t\} \ge 0\right) \nonumber \\&\qquad \quad + \sum _{j=0}^\infty \mathbb {P}\left( \sup _{t \in [2^jr^2M, 2^{j+1}r^2M)} \{\mathbb {E}[C_i(1)] t -C_i(t)-\frac{1}{2}\beta _im_{C,i}r^{-1} t\}\ge 0\right) \nonumber \\&\quad \le \sum _{j=0}^\infty \mathbb {P}\left( \sup _{t \in [0, 2^{j+1}r^2M)} \{V_i(\lambda _{i,r} t)-\mathbb {E}[V_i(\lambda _{i,r})] t\} \ge 2^{j-1}\beta _im_{C,i}rM\right) \nonumber \\&\qquad \quad + \sum _{j=0}^\infty \mathbb {P}\left( \sup _{t \in [0, 2^{j+1}r^2M)} \{\mathbb {E}[C_i(1)]t -C_i(t)\}\ge 2^{j-1}\beta _im_{C,i}rM\right) \nonumber \\&\quad \le \sum _{j=0}^\infty \frac{\lambda _{i,r}\mathbb {E}[B_i^2]2^{j+1}r^2M}{2^{2j-2}\beta _i^2m_{C,i}^2r^2M^2} + \sum _{j=0}^\infty \Big ( \frac{c_12^{j+1}r^2M}{2^{2j-2}\beta _i^2m_{C,i}^2r^2M^2}+\frac{c_2}{2^{j+1} m_{C,i}r^2M}\nonumber \\&\qquad \quad +\frac{c_32^{j+1}r^2M}{e^{c_4\sqrt{2^{j-1}\beta _im_{C,i}rM}}} \Big ) \end{aligned}$$
(9)

for certain positive constants \(c_1\), \(c_2\), \(c_3\) and \(c_4\). The second-to-last inequality follows by observing that the maximum value of \(-\frac{1}{2} \beta _i m_{C,i}r^{-1}t\) in the domain \(t \in [2^jr^2M, 2^{j+1}r^2M]\) equals \(-2^{j-1}\beta _im_{C,i}rM\) and by enlarging the intervals of the suprema to also include \([0, 2^jr^2M)\). The last inequality follows from Lemmas 3.2 and 3.3. Simplifying (9) leads to

$$\begin{aligned}&\mathbb {P}\left( \sup _{t\ge M} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\Big \} \ge x\right) \nonumber \\&\quad \le \frac{16(\lambda _{i,r}\mathbb {E}[B_i^2]+c_1)}{\beta _i^2m_{C,i}^2M} +\frac{c_2}{m_{C,i}r^2M}+\sum _{j=0}^\infty f_{i,j}(r,M), \end{aligned}$$
(10)

where \(f_{i,j}(r, M) = c_32^{j+1}r^2Me^{-c_4\sqrt{2^{j-1}\beta _im_{C,i}rM}}\). The lemma now follows from (10) by taking the limit \(r\rightarrow \infty \) and subsequently the limit \(M\rightarrow \infty \), if \(\lim _{r\rightarrow \infty } \sum _{j=0}^\infty f_{i,j}(r,M) = 0\). To show that this condition holds, observe that the derivative of \(f_{i,j}\) with respect to \(r\) reads \(\frac{\partial }{\partial r} f_{i,j}(r,M) = c_32^jrMe^{-h_{i,j}(M)\sqrt{r}}(4-h_{i,j}(M)\sqrt{r})\), where \(h_{i,j}(M) := c_4\sqrt{2^{j-1}\beta _im_{C,i}M}\). As a result, \(\frac{\partial }{\partial r} f_{i,j}(r,M)<0\) if and only if \(4-h_{i,j}(M)\sqrt{r}<0\). Due to the monotonicity of \(h_{i,j}(M)\) and \(\sqrt{r}\) in \(j\) and \(r\), respectively, there thus exist positive constants \(j_0\) and \(r_0\), so that \(\frac{\partial }{\partial r} f_{i,j}(r,M)<0\) for any \(j\ge j_0\) and \(r\ge r_0\). This results in the fact that \(\sup _{r \ge r_*} f_{i,j}(r, M) = f_{i,j}({r_*},M)\) for every \(r_* \ge r_0\). Hence, an upper bound for \(\sum _{j=0}^\infty f_{i,j}(r,M)\) when \(r \ge r_* \ge r_0\) is given by

$$\begin{aligned} \sum _{j=0}^\infty f_{i,j}(r,M)\! =\! \sum _{j=0}^{j_0-1} f_{i,j}(r,M)\, + \sum _{j=j_0}^\infty f_{i,j}(r,M) \le \sum _{j=0}^{j_0-1} f_{i,j}(r,M) \,+ \!\sum _{j=j_0}^\infty f_{i,j}({r_*},M). \end{aligned}$$
(11)

When \(r \rightarrow \infty \), we can use (11) with \(r_*\) taken arbitrarily large so that

$$\begin{aligned} \lim _{r\rightarrow \infty } \sum _{j=0}^\infty f_{i,j}(r,M) \le \lim _{r \rightarrow \infty } \sum _{j=0}^{j_0-1} f_{i,j}(r,M) + \sum _{j=j_0}^\infty \lim _{r_* \rightarrow \infty } f_{i,j}({r_*},M). \end{aligned}$$

By observing that \(\lim _{r \rightarrow \infty } f_{i,j}(r,M) = 0\), this inequality reduces to \(\lim _{r\rightarrow \infty } \sum _{j=0}^\infty f_{i,j}(r,M) \le 0\). Since \(f_{i,j}(r,M)\ge 0\), it thus must hold that \(\lim _{r\rightarrow \infty } \sum _{j=0}^\infty f_{i,j}(r,M) = 0\), which concludes the proof. \(\square \)

Using these auxiliary results, we can now prove Theorem 3.1.

Proof of Theorem 3.1

By (7), it is enough to show that

$$\begin{aligned} \lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\sup _{t\ge 0} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\Big \} \ge x_i\Big \}\right) = \mathbb {P}\left( \bigcap _{i=1}^N \Big \{ \sup _{t\ge 0}\{Z_i(t)\} \ge x_i\Big \}\right) \end{aligned}$$
(12)

for all \(x_1, \ldots , x_N \ge 0\). We first obtain a lower bound for the left-hand side of (12):

$$\begin{aligned}&\lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\sup _{t\ge 0} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\Big \} \ge x_i\Big \}\right) \nonumber \\&\quad \ge \lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\sup _{t\in [0,M)} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\Big \} \ge x_i\Big \}\right) \nonumber \\&\quad =\mathbb {P}\left( \bigcap _{i=1}^N \Big \{ \sup _{t\in [0,M)}\{Z_i(t)\} \ge x_i\Big \}\right) \end{aligned}$$
(13)

for all \(M\in \mathbb {R}_+\), where the equality follows from (3) together with a combination of the continuous mapping theorem and the continuity property of the supremum operator applied to càdlàg-functions on the finite domain \([0,M)\). Next, to derive an upper bound for the left-hand side of (12), denote by \(E_{M,i}\) the event that

$$\begin{aligned} \sup _{t \in [0,M)} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\Big \} = \sup _{t \ge 0} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\Big \}, \end{aligned}$$

and let \(E_{M,i}^c\) be its complementary event. It is trivial to see that \(\mathbb {P}(\bigcap _{i=1}^N \{\sup _{t\in [0,M)} \{Z_i(t)\} \ge x_i\})\) is an upper bound for \(\lim _{r \rightarrow \infty } \mathbb {P}(\bigcap _{i=1}^N \{\sup _{t\ge 0} \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\} \ge x_i; E_{M,i}\})\) for all \(M \in \mathbb {R}_+\). Furthermore, we have that \(\sum _{i=1}^N \mathbb {P}(\sup _{t\ge M} \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\} \ge x_i)\) is an upper bound for \(\mathbb {P}(\bigcap _{i=1}^N \{\sup _{t\ge 0} \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\} \ge x_i\}; \bigcup _{i=1}^N E_{M,i}^c)\). Therefore, we obtain by using De Morgan’s law that

$$\begin{aligned}&\lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\sup _{t\ge 0} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\Big \} \ge x_i\Big \}\right) \nonumber \\&\quad \le \;\mathbb {P}\left( \bigcap _{i=1}^N \Big \{ \sup _{t\in [0,M)}\{Z_i(t)\} \ge x_i\Big \}\right) \nonumber \\&\qquad + \,\lim _{r \rightarrow \infty } \sum _{i=1}^N \mathbb {P}\left( \sup _{t\ge M} \Big \{\frac{V_i(\lambda _{i,r}r^2 t)-C_i(r^2t)}{r}\Big \} \ge x_i\right) . \end{aligned}$$
(14)

When \(M \rightarrow \infty \), the lower bound established in (13) converges to \(\mathbb {P}(\bigcap _{i=1}^N \{ \sup _{t\in [0,\infty )}\{Z_i(t)\} \ge x_i\})\). The upper bound found in (14) also converges to this expression, as the second term in the right-hand side of (14) vanishes due to Lemma 3.4. From this, (12) immediately follows, which proves the theorem. \(\square \)

Remark 3.1

The joint distribution of \(\overline{\varvec{Z}}\) is not straightforward to derive explicitly. However, explicit expressions for the marginal distribution of \(\overline{Z}_i\) are not hard to obtain. Note that \(\overline{Z}_i = \sup _{t \ge 0} Z_i(t)\) is the all-time supremum of a one-dimensional Brownian motion with negative drift \(-\beta _im_{C,i}\) and variance \(\frac{m_{C,i}}{\mathbb {E}[B_i]}\sigma ^{2}_{V, i}+\sigma ^{2}_{C, i}\). It is well known that the all-time supremum of a Brownian motion with negative drift \(-a\) and variance \(b\) is exponentially (\(\frac{2a}{b}\)) distributed. Therefore, the distribution of the steady-state scaled workload \(\widetilde{W}_{i,r}\) present in \(Q_i\) converges to an exponential distribution with rate \(2\beta _i\left( \frac{\sigma ^{2}_{V, i}}{\mathbb {E}[B_i]}+\frac{\sigma ^{2}_{C, i}}{m_{C,i}}\right) ^{-1}\) as \(r \rightarrow \infty \). In the next section, we will see that the limiting distributions of \(\widetilde{D}_{i,r}\) and \(\widetilde{L}_{i,r}\) only differ from the limiting distribution of \(\widetilde{W}_{i,r}\) by a multiplicative factor \(m_{C,i}^{-1}\) and \(\mathbb {E}[B_i]^{-1}\), respectively. As a result, the distributions of the steady-state delay \(\widetilde{D}_{i,r}\) and the steady-state queue length \(\widetilde{L}_{i,r}\) also converge to exponential distributions with rates \(2\beta _im_{C,i}\left( \frac{\sigma ^{2}_{V, i}}{\mathbb {E}[B_i]}+\frac{\sigma ^{2}_{C, i}}{m_{C,i}}\right) ^{-1}\) and \(2\beta _i\mathbb {E}[B_i]\left( \frac{\sigma ^{2}_{V, i}}{\mathbb {E}[B_i]}+\frac{\sigma ^{2}_{C, i}}{m_{C,i}}\right) ^{-1}\), respectively. We will study the derivation of the complete distribution of \(\overline{\varvec{Z}}\) in Sect. 5.3.

4 Extension to virtual waiting times and queue lengths

In Sect. 3, we derived a heavy-traffic limit theorem for the scaled workload vector \(\widetilde{\varvec{W}}_{r}\). In this section, we extend this result to heavy-traffic limits for the distributions of the virtual waiting-time vector \(\widetilde{\varvec{D}}_{r}\) and the queue length vector \(\widetilde{\varvec{L}}_{r}\) by considering the joint distribution of \(\widetilde{\varvec{D}}_{r}\) and \(\widetilde{\varvec{W}}_{r}\) as well as that of \(\widetilde{\varvec{L}}_{r}\) and \(\widetilde{\varvec{W}}_{r}\) in Sects. 4.1 and 4.2, respectively. It turns out that, when \(r \rightarrow \infty \), the distributions of both \(\widetilde{\varvec{D}}_{r}\) and \(\widetilde{\varvec{L}}_{r}\) are elementwise equal to the distribution of \(\widetilde{\varvec{W}}_{r}\) up to a multiplicative constant.

4.1 Heavy-traffic asymptotics of the virtual waiting time

We now study the distribution of the scaled virtual waiting time in heavy traffic. First, we obtain the tail probability of the joint distribution of \(\widetilde{\varvec{D}}_{r}\) and \(\widetilde{\varvec{W}}_{r}\) as \(r \rightarrow \infty \) in Proposition 4.1. Based on this, we obtain an extension of Theorem 3.1 for the scaled virtual waiting time in Corollary 4.2.

Proposition 4.1

The tail probability of the limiting joint distribution of \(\widetilde{\varvec{D}}_{r}\) and \(\widetilde{\varvec{W}}_{r}\) satisfies

$$\begin{aligned} \lim _{r \rightarrow \infty }&\mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{D}_{i,r} \ge s_i; \widetilde{W}_{i,r} \ge t_i\Big \}\right) = \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\overline{Z}_i \ge \max \{m_{C,i}s_i,t_i\}\Big \}\right) \end{aligned}$$

with \(\overline{Z}_1, \ldots , \overline{Z}_N\) as defined in Theorem 3.1.

Proof

Observe that since the waiting time faced by an imaginary type-\(i\) customer arriving at time \(u\) is longer than \(s_i\) time units, the workload present in \(Q_i\) just before \(u\) is larger than \(C_i(u+s_i)-C_i(u)\). This is evident, since the latter number represents the amount of work that the server of \(Q_i\) is able to process in the \(s_i\) time units following time \(u\). In other words, the event \(\{D_{i,r}(u)>s_i\}\) is tantamount to the event \(\{W_{i,r}(u)>C_i(u+s_i)-C_i(u)\}\) for \(i=1, \ldots , N\), so that in steady state (i.e. \(u \rightarrow \infty \)) we have

$$\begin{aligned}&\mathbb {P}\left( \bigcap _{i=1}^N \Big \{D_{i,r}>s_i; W_{i,r}>t_i\Big \}\right) =\mathbb {P}\left( \bigcap _{i=1}^N \Big \{W_{i,r}>\max \{C_i(s_i), t_i\}\Big \}\right) . \end{aligned}$$
(15)

Based on this, we obtain an expression for the tail probability of the joint distribution of \(\widetilde{\varvec{D}}_r\) and \(\widetilde{\varvec{W}}_r\):

$$\begin{aligned} \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{D}_{i,r} \ge s_i; \widetilde{W}_{i,r} \ge t_i\Big \}\right)&= \mathbb {P}\left( \bigcap _{i=1}^N \Big \{W_{i,r} \ge \max \{C_i(rs_i), rt_i\}\Big \}\right) \nonumber \\&= \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{W}_{i,r} \ge \max \Big \{\frac{C_i(rs_i)}{r}, t_i\Big \}\Big \}\right) , \end{aligned}$$
(16)

where we used (15) in the first equality. We now focus on showing that

$$\begin{aligned}&\lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{W}_{i,r} \ge \max \Big \{\frac{C_i(rs_i)}{r}, t_i\Big \}\Big \}\right) = \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\overline{Z}_i \ge \max \{m_{C,i}s_i,t_i\}\Big \}\right) , \end{aligned}$$
(17)

which, combined with (16), directly implies the result to be proved. To this end, we observe that, since \(\{C_i(t), t \ge 0\}\) is a renewal reward process, \(r^{-1}C_i(rs_i) \rightarrow m_{C,i}s_i\) almost surely as \(r \rightarrow \infty \) due to standard results in renewal theory. Denote by \(F_{i,r}^\epsilon \) for any \(\epsilon > 0\) the event that \(r^{-1}C_i(rs_i) \in [m_{C,i}s_i-\epsilon , m_{C,i}s_i+\epsilon ]\) and let \(F_{i,r}^{\epsilon , c}\) be its complementary event. Thus, \(\lim _{r \rightarrow \infty } \mathbb {P}(F_{i,r}^\epsilon ) = 1\). As a result, we have, due to De Morgan’s law, that

$$\begin{aligned}&\mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{W}_{i,r} \ge \max \Big \{\frac{C_i(rs_i)}{r}, t_1\Big \}\Big \}\right) \\&\quad = \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{W}_{i,r} \ge \max \Big \{\frac{C_i(rs_i)}{r}, t_i\Big \};F_{i,r}^\epsilon \Big \}\right) + o(1). \end{aligned}$$

Letting \(r \rightarrow \infty \) in this expression, using the definition of the event \(F_{i,r}^\epsilon \) and applying Theorem 3.1, we obtain the following lower bound for the left-hand side of (17):

$$\begin{aligned}&\lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{W}_{i,r} \ge \max \Big \{\frac{C_i(rs_i)}{r}, t_i\Big \}\Big \}\right) \ge \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\overline{Z}_i \ge \max \{m_{C,i}s_i\!+\!\epsilon ,t_i\}\Big \}\right) . \end{aligned}$$
(18)

Similarly, an upper bound for the left-hand side of (17) is given by

$$\begin{aligned}&\lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{W}_{i,r} \ge \max \Big \{\frac{C_i(rs_i)}{r}, t_i\Big \}\Big \}\right) \le \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\overline{Z}_i \ge \max \{m_{C,i}s_i\!-\!\epsilon ,t_i\}\Big \}\right) . \end{aligned}$$
(19)

In Remark 3.1, we found that \(\overline{Z}_i\) is exponentially distributed for \(i=1,\ldots N\), so that the joint distribution of \(\overline{\varvec{Z}}\) has no discontinuity in the point \((m_{C,1}s_1, \ldots , m_{C,N}s_N)\). As a consequence, by taking the limit \(\epsilon \rightarrow 0\) in the right-hand sides of (18) and (19), we obtain (17), which, as explained above, proves the proposition. \(\square \)

From Proposition 4.1, the heavy-traffic limit for the virtual waiting time follows in the following corollary.

Corollary 4.2

For the scaled virtual waiting-time vector \(\widetilde{\varvec{D}}_r\), it holds that

$$\begin{aligned} \widetilde{\varvec{D}}_r \,{\buildrel d \over \rightarrow }\,\Big (\frac{1}{m_{C,1}}, \ldots , \frac{1}{m_{C,N}}\Big )\overline{\varvec{Z}}, \end{aligned}$$

as \(r \rightarrow \infty \), with \(\overline{\varvec{Z}}\) defined in Theorem 3.1.

Proof

This is an immediate result from Proposition 4.1 by taking \(t_1=\ldots =t_N\) \( = 0\). \(\square \)

4.2 The joint queue-length distribution

In this section, we obtain an extension of Theorem 3.1 for the scaled steady-state queue length \(\widetilde{\varvec{L}}_r\) in heavy traffic. Let \(B_{i,r}^R\) be the remaining service requirement of a type-\(i\) customer in service in the \(r\)-th system if \(L_{i,r}>0\), and zero otherwise. It is then trivially seen that

$$\begin{aligned} \varvec{W}_r = \left( B_{1,r}^R, \ldots , B_{N,r}^R\right) + \left( \sum _{j=1}^{L_{1,r}}\widehat{B}_{1,j}, \ldots , \sum _{j=1}^{L_{N,r}}\widehat{B}_{N,j}\right) \end{aligned}$$
(20)

for all \(i>0\), where \(\widehat{B}_{i,j}\) represents the service requirement of the waiting customer in the \(j\)-th waiting position of \(Q_i\) and is distributed according to \(B_i\). These service requirements are mutually independent as well as independent from \(\varvec{W}_r\) and \(\varvec{L}_r\). Note that \(\widehat{B}_{i,j}\) is defined differently from \(B_{i,j}\), which we defined in Sect. 2 to be the service requirement of the \(j\)-th arriving type-\(i\) customer since the start of the queueing process. The scaled version of (20) is given by

$$\begin{aligned} \widetilde{\varvec{W}}_r = \left( \widetilde{B}_{1,r}^R, \ldots , \widetilde{B}_{N,r}^R\right) + \frac{1}{r}\left( \sum _{j=1}^{r\widetilde{L}_{1,r}}\widehat{B}_{1,j}, \ldots , \sum _{j=1}^{r\widetilde{L}_{N,r}}\widehat{B}_{N,j}\right) , \end{aligned}$$
(21)

where \(\widetilde{B}_{i,r}^R = \frac{1}{r} B_{i,r}^R\) for \(i=1, \ldots , N\). It is intuitively tempting to conclude that \((\widetilde{B}_{1,r}^R, \ldots , \widetilde{B}_{N,r}^R) \rightarrow \varvec{0}\) as \(r \rightarrow \infty \), and based on that, conclude that \(\widetilde{\varvec{W}}_r\) and \(\widetilde{\varvec{L}}_r\) are equal elementwise up to a multiplicative constant. However, this is not straightforward, since, for example, \(\widetilde{\varvec{L}}_r\) and \((\widetilde{B}_{1,r}^R, \ldots , \widetilde{B}_{N,r}^R)\) are not independent. We make these results rigorous in this section. Inspired by [44, Proposition 1], we first obtain another representation for the joint distribution of \(\widetilde{L}_{i,r}\) and \(\widetilde{W}_{i,r}\) for a single queue \(Q_i\) in Lemma 4.3. Based on this result, we derive the heavy-traffic asymptotics for \((\widetilde{L}_{i,r}, \widetilde{W}_{i,r}, \widetilde{B}_{i,r}^R)\) in Lemma 4.4, which imply that \(\widetilde{B}_{i,r}^R \rightarrow 0\) as \(r \rightarrow \infty \). We subsequently conclude that \((\widetilde{B}_{1,r}^R, \ldots , \widetilde{B}_{N,r}^R) \rightarrow \varvec{0}\) as \(r \rightarrow \infty \) and derive the joint distribution of \(\widetilde{\varvec{L}}_r\) and \(\widetilde{\varvec{W}}_r\) as \(r \rightarrow \infty \) in Proposition 4.5. From this, an extension of Theorem 3.1 for the scaled queue length \(\widetilde{\varvec{L}}_r\) follows in Corollary 4.6.

In order to construct an additional representation for the joint distribution of \(\widetilde{L}_{i,r}\) and \(\widetilde{W}_{i,r}\), we need to introduce some additional notation. Denote by \(W_{i,n}^r\) and \(L_{i,n}^r\) the workload present in \(Q_i\) and the queue length of \(Q_i\), respectively in the \(r\)-th system, just before the \(n\)-th arrival of a type-\(i\) customer. Furthermore, \(A_{i,j}^r\) refers to the time between the \(j\)-th and the \((j+1)\)-st arriving type-\(i\) customer in the \(r\)-th system, so that \(S_{i, n}^{A, r} = \sum _{j=1}^n A_{i,j}^r\) and \(S_{i,n}^B = \sum _{j=1}^n B_{i,j}\) represent the cumulative series of interarrival times and service requirements of type-\(i\) customers. By construction of the heavy-traffic scaling, \(A_{i, j}^r \,{\buildrel d \over \rightarrow }\,A_{i,j}\) and \(\mathbb {E}[A_{i, j}^r] \rightarrow \mathbb {E}[A_{i, j}]\) as \(r \rightarrow \infty \), where \(A_{i, j}\) are i.i.d. samples from an exponential \(\left( m_{C,i}/\mathbb {E}[B_i]\right) \) distribution. Finally, we define \(S_{i,n}^r = S_{i,n}^B-C_i(S_{i,n}^{A,r})\). The required representation is now given in the following lemma.

Lemma 4.3

For any \(x, y>0\) and \(i= 1, \ldots , N\), the joint distribution of \(\widetilde{L}_{i,r}\) and \(\widetilde{W}_{i,r}\) satisfies

$$\begin{aligned} \mathbb {P}\left( \widetilde{L}_{i,r} \ge x; \widetilde{W}_{i,r} \ge y\right) =&\, \mathbb {P}\left( W_{i, r}+B_{i}\ge C_i(S_{i, \lceil rx \rceil }^{A,r}); \right. \nonumber \\&\left. r^{-1}\max \Big \{W_{i, r} + S_{i, \lceil rx \rceil }^r, \max _{j \in \{1, \ldots , \lceil rx \rceil \}}\{S^r_{i,\lceil rx \rceil }-S^r_{i, j}\}\Big \}\!\ge \! y\right) \!. \end{aligned}$$

Proof

The proof is inspired by [44, Proposition 1]. Observe that, for any \(k \ge 1\) and \(n \ge 1\), the event \(\{L_{i, n+k}^{r} \ge k\}\) coincides with the event that the workload the server at \(Q_i\) was capable of processing between the arrival of the \(n\)-th and \((n+k)\)-th customer, \(C_i(S_{i, n+k-1}^{A,r})-C_i(S_{i, n-1}^{A,r})\), does not exceed the amount \(W_{i, n}^r+B_{i,n}\) of work present in \(Q_i\) just after the arrival of the \(n\)-th customer. Hence, we have that

$$\begin{aligned} \{L_{i, n+k}^r \ge k\} = \{W_{i, n}^r+B_{i,n}\ge C_i(S_{i, n+k-1}^{A,r})-C_i(S_{i, n-1}^{A,r})\}. \end{aligned}$$
(22)

Moreover, due to Lindley’s recursion, which is given by \(W_{i, n+1}^r = \max \{W_{i, n}^r + S_{i, n}^r-S_{i, n-1}^r, 0\}\) or \(W_{i, n+k}^r = \max \{W_{i, n}^r + S_{i, n+k-1}^r-S_{i, n-1}^r, \max _{j \in \{0, \ldots , k-1\}}\{S_{i,n+k-1}-S_{i, n+j}\}\}\), we have for any \(y\ge 0\) that

$$\begin{aligned} \{W_{n+k}^r \ge y\} = \Big \{\max \Big \{W_{i, n}^r + S_{i, n+k-1}^r-S_{i, n-1}^r, \max _{j \in \{0, \ldots , k-1\}}\{S_{i,n+k-1}^r-S_{i, n+j}^r\}\Big \}\ge y \Big \}\!. \end{aligned}$$
(23)

By combining (22) and (23), taking the probabilities of these events, letting \(n \rightarrow \infty \) and observing that the vector \((L_{i, n}^r, W_{i, n}^r)\) weakly converges to \((L_{i,r}, W_{i,r})\), we obtain

$$\begin{aligned}&\mathbb {P}\left( L_{i, r} \ge k; W_{i,r} \ge y\right) \\&= \mathbb {P}\left( W_{i, r}+B_{i}\ge C_i(S_{i, k}^{A,r}); \max \Big \{W_{i, r} + S_{i, k}^r, \max _{j \in \{1, \ldots , k\}}\{S^r_{i,k}-S^r_{i, j}\}\Big \}\ge y\right) \!, \end{aligned}$$

for any \(k \ge 1, y \ge 0\). By noting that \(\mathbb {P}(\widetilde{L}_{i,r} \ge x, \widetilde{W}_{i,r} \ge y) = \mathbb {P}(L_{i,r} \ge \lceil rx \rceil , r^{-1}W_{i,r} \ge y)\), the desired statement follows immediately. \(\square \)

Based on Lemma 4.3, we derive the heavy-traffic asymptotics of \((\widetilde{L}_{i,r}, \widetilde{W}_{i,r}, \widetilde{B}_{i,r}^R)\) in the following lemma. This lemma directly implies that \(\widetilde{B}_{i,r}^R \rightarrow 0\) as \(r \rightarrow \infty \).

Lemma 4.4

For any queue, the scaled steady-state queue length, workload and remaining service requirement exhibit state-space collapse under heavy-traffic assumptions. In particular, we have that

$$\begin{aligned} (\widetilde{L}_{i,r}, \widetilde{W}_{i,r}, \widetilde{B}_{i,r}^R) \,{\buildrel d \over \rightarrow }\,\left( \frac{1}{\mathbb {E}[B_i]}, 1, 0\right) \overline{Z}_i \end{aligned}$$

as \(r \rightarrow \infty \) for any \(i\in \{1, \ldots , N\}\), with \(\overline{Z}_i\) defined in Sect. 2.

Proof

Again, the proof is inspired by [44, Proposition 1]. We first focus on the joint distribution of \(\widetilde{L}_{i,r}\) and \(\widetilde{W}_{i,r}\). Due to the strong law of large numbers, \(r^{-1}S_{i, \lceil rx \rceil }^{A,r} \rightarrow \mathbb {E}[A_{i,j}]x = \frac{\mathbb {E}[B_i]x}{m_{C,i}}\) almost surely as \(r \rightarrow \infty \). Moreover, \(t^{-1}C_i(t) \rightarrow m_{C,i}\) almost surely as \(t \rightarrow \infty \), so that

$$\begin{aligned} \frac{C_i(S_{i, \lceil rx \rceil }^{A,r})}{r} = \frac{C_i(S_{i, \lceil rx \rceil }^{A,r})}{S_{i, \lceil rx \rceil }^{A,r}}\frac{S_{i, \lceil rx \rceil }^{A,r}}{r} \rightarrow \mathbb {E}[B_i]x \end{aligned}$$
(24)

in probability as \(r \rightarrow \infty \). We further have, due to the weak law of large numbers, that \(r^{-1}S_{i,\lceil rx \rceil }^B \rightarrow \mathbb {E}[B_i]x\), so that \(r^{-1}S_{i, \lceil rx \rceil }^r \rightarrow 0\) and \(r^{-1}\max _{j \in \{1, \ldots , \lceil rx \rceil \}}\{S^r_{i,\lceil rx \rceil }-S^r_{i, j}\}\rightarrow 0\) as \(r \rightarrow \infty \). Let, for any \( \epsilon >0\), \(G_{i, r}^\epsilon \) denote the event

$$\begin{aligned} \{&r^{-1}C_i(S_{i, \lceil rx \rceil }^{A,r}) \!\in \! [\mathbb {E}[B_i]x-\epsilon ,\mathbb {E}[B_i]x+\epsilon ]; r^{-1}S_{i,\lceil rx \rceil }^B \in [\mathbb {E}[B_i]x-\epsilon ,\mathbb {E}[B_i]x+\epsilon ]; \\&\qquad r^{-1}S_{i, \lceil rx \rceil }^r \in [-\epsilon , \epsilon ]; r^{-1}\max _{j \in \{1, \ldots , \lceil rx \rceil \}}\{S^r_{i,\lceil rx \rceil }-S^r_{i, j}\} \in [0, \epsilon ]\}. \end{aligned}$$

Due to the convergence results above, \(\lim _{r \rightarrow \infty } \mathbb {P}(G_{i, r}^\epsilon ) = 1\) so that \(\mathbb {P}(\widetilde{L}_{i,r} \ge x; \widetilde{W}_{i,r} \ge y) = \mathbb {P}(\widetilde{L}_{i,r} \ge x; \widetilde{W}_{i,r} \ge y; G_{i, r}^\epsilon ) + o(1)\). After combining this with Lemma 4.3 and consequently taking the limit \(r \rightarrow \infty \), we obtain

$$\begin{aligned}&\lim _{r \rightarrow \infty }\mathbb {P}\left( \widetilde{W}_{i,r} \ge \max \{\mathbb {E}[B_i]x+\epsilon , y+\epsilon \}\right) \\&\qquad \le \lim _{r \rightarrow \infty }\mathbb {P}\left( \widetilde{L}_{i,r} \ge x; \widetilde{W}_{i,r} \ge y\right) \le \lim _{r \rightarrow \infty }\mathbb {P}\left( \widetilde{W}_{i,r} \ge \max \{\mathbb {E}[B_i]x-\epsilon , y-\epsilon \}\right) \!, \end{aligned}$$

since \(\widetilde{B}_{i} \rightarrow 0\) as \(r \rightarrow \infty \). By first applying Theorem 3.1 on the left-hand side and the right-hand side, next noting that the distribution of \(\overline{Z}_i\) has no discontinuity points (cf. Remark 3.1), and finally letting \(\epsilon \rightarrow 0\), we obtain

$$\begin{aligned} \lim _{r \rightarrow \infty } \mathbb {P}\left( \widetilde{L}_{i,r} \ge x; \widetilde{W}_{i,r} \ge y\right) = \mathbb {P}\left( \overline{Z}_i \ge \max \{\mathbb {E}[B_i]x, y\}\right) \!. \end{aligned}$$
(25)

It remains to consider the convergence of \(\widetilde{B}_{i,r}^R\). We show that \(\lim _{r \rightarrow \infty }\mathbb {P}(\widetilde{B}_{i,r}^R>\delta ) = 0\) for all \(\delta >0\), which finalises the proof of the desired statement. Note that due to representation (21), we have that \(\mathbb {P}(\widetilde{B}_{i,r}^R > \delta ) = \mathbb {P}(\widetilde{W}_{i,r} > \frac{1}{r} \sum _{j=1}^{r\widetilde{L}_{i,r}}\widehat{B}_{i,j}+\delta )\). Let \(H_{i,r}^\epsilon \) denote the event \(\{\frac{1}{n}\sum _{j=1}^n \widehat{B}_{i,j}\in (\mathbb {E}[B_i] - \epsilon , \mathbb {E}[B_i]+\epsilon ) \text{ for } \text{ all } n \ge \sqrt{r}\}\). By using the law of total probability and noting that \(\lim _{r \rightarrow \infty } \mathbb {P}(H_{i,r}^\epsilon ) = 1\) due to the weak law of large numbers, we thus have, similar to earlier calculations, that

$$\begin{aligned} \mathbb {P}\left( \widetilde{B}_{i,r}^R > \delta \right)&= \mathbb {P}\left( \widetilde{W}_{i,r} > \frac{1}{r} \sum _{j=1}^{r\widetilde{L}_{i,r}}\widehat{B}_{i,j}+\delta ; H_{i,r}^\epsilon \right) + o(1)\\&= \mathbb {P}\left( \widetilde{W}_{i,r} > \widetilde{L}_{i,r}\frac{1}{r\widetilde{L}_{i,r}} \sum _{j=1}^{r\widetilde{L}_{i,r}}\widehat{B}_{i,j}+\delta ; H_{i,r}^\epsilon \right) + o(1). \end{aligned}$$

By taking the limit \(r \rightarrow \infty \) and using the established convergence of \(\widetilde{L}_{i,r}\), we obtain

$$\begin{aligned} \lim _{r \rightarrow \infty } \mathbb {P}\left( \widetilde{W}_{i,r} > \widetilde{L}_{i,r}(\mathbb {E}[B_i]+\epsilon )+\delta \right)&\le \lim _{r \rightarrow \infty } \mathbb {P}\left( \widetilde{B}_{i,r}^R > \delta \right) \nonumber \\&\le \lim _{r \rightarrow \infty } \mathbb {P}\left( \widetilde{W}_{i,r} > \widetilde{L}_{i,r}(\mathbb {E}[B_i]-\epsilon )+\delta \right) \!. \end{aligned}$$

By letting \(\epsilon \rightarrow 0\) and noting, as before, that the limiting distribution of \(\widetilde{W}_{i,r}\) has no discontinuity points, this leads to \(\lim _{r \rightarrow \infty }\mathbb {P}(\widetilde{B}_{i,r}^R>\delta ) = \lim _{r \rightarrow \infty } \mathbb {P}(\widetilde{W}_{i,r} > \widetilde{L}_{i,r}\mathbb {E}[B_i]+\delta )\) for any \(\delta >0\). Observe that (25) implies that \(\lim _{r \rightarrow \infty } \mathbb {P}(\widetilde{W}_{i,r} > \widetilde{L}_{i,r}\mathbb {E}[B_i]+\delta ) = 0\) for any \(\delta >0\), which completes the proof. \(\square \)

Based on the previous results, we now obtain the limiting joint distribution of \(\widetilde{\varvec{L}}_r\) and \(\widetilde{\varvec{W}}_r\) in the following proposition.

Proposition 4.5

The tail probability of the limiting joint distribution of \(\widetilde{\varvec{L}}_{r}\) and \(\widetilde{\varvec{W}}_{r}\) satisfies

$$\begin{aligned}&\lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{L}_{i,r} \ge s_i; \widetilde{W}_{i,r} \ge t_i\Big \}\right) = \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\overline{Z}_i \ge \min \{\mathbb {E}[B_i]s_i,t_i\}\Big \}\right) \end{aligned}$$
(26)

with \(\overline{Z}_1, \ldots , \overline{Z}_N\) defined in Sect. 2.

Proof

Equation (21) implies that the event \(\{\widetilde{L}_{i,r} \ge s_i\}\) coincides with the event \(\{\widetilde{W}_{i,r} \ge \widetilde{B}_{i,r}^R+\frac{1}{r}\sum _{j=1}^{rs_i} \widehat{B}_{i,j}\}\), as the \(\widehat{B}_{i,j}\) can only take non-negative values. Thus, we have

$$\begin{aligned} \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{L}_{i,r} \ge s_i; \widetilde{W}_{i,r} \ge t_i\Big \}\right) =\mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{W}_{i,r} \ge \max \{\widetilde{B}_{i,r}^R+\frac{1}{r}\sum _{j=1}^{rs_i} \widehat{B}_{i,j}, t_i\}\Big \}\right) \!. \end{aligned}$$

Let \(H_{i,r}^\epsilon \) be defined as before and recall that \(\lim _{r \rightarrow \infty } \mathbb {P}(\bigcap _{i=1}^N H_{i,r}^\epsilon ) = 1\), so that, due to the law of total probability,

$$\begin{aligned}&\mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{L}_{i,r} \ge s_i; \widetilde{W}_{i,r} \ge t_i\Big \}\right) \\&\quad = \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{W}_{i,r} \ge \max \{\widetilde{B}_{i,r}^R+s_i\frac{1}{rs_i}\sum _{j=1}^{rs_i} \widehat{B}_{i,j}, t_i\}; H_{i,r}^\epsilon \Big \}\right) + o(1). \end{aligned}$$

Note that, according to Lemma 4.4, \(\widetilde{B}_{i,r}^R \rightarrow 0\) as \(r \rightarrow \infty \) for \(i=1, \ldots , N\), so that also \((\widetilde{B}_{1,r}^R, \ldots , \widetilde{B}_{N,r}^R) \rightarrow \varvec{0}\) as \(r \rightarrow \infty \). We thus obtain

$$\begin{aligned}&\lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{W}_{i,r} \ge \max \{\mathbb {E}[B_i]+\epsilon , t_i\}\Big \}\right) \le \lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{L}_{i,r} \ge s_i; \widetilde{W}_{i,r} \ge t_i\Big \}\right) \\&\qquad \qquad \qquad \qquad \qquad \le \lim _{r \rightarrow \infty } \mathbb {P}\left( \bigcap _{i=1}^N \Big \{\widetilde{W}_{i,r} \ge \max \{\mathbb {E}[B_i]-\epsilon , t_i\}\Big \}\right) . \end{aligned}$$

By taking the limit \(\epsilon \rightarrow 0\), an application of Theorem 3.1 and the notion that the distribution of \(\overline{\varvec{Z}}\) has no discontinuity points yield the desired result. \(\square \)

Corollary 4.6

For the scaled queue length vector \(\widetilde{\varvec{L}}_r\), it holds that

$$\begin{aligned} \widetilde{\varvec{L}}_r \,{\buildrel d \over \rightarrow }\,\Big (\frac{1}{\mathbb {E}[B_1]}, \ldots , \frac{1}{\mathbb {E}[B_N]}\Big )\overline{\varvec{Z}}, \end{aligned}$$

as \(r \rightarrow \infty \), with \(\overline{\varvec{Z}}\) defined in Sect. 2.

Proof

The desired statement follows immediately from Proposition 4.5 by taking \(t_1 = \ldots = t_N = 0\). \(\square \)

5 Application to a two-layered network

In this section, we apply the results obtained so far in this paper to a network that is inspired by a manufacturing application and fits the class of so-called layered queueing networks (see e.g. [1214]). We will also refer to this network as the two-layered network. We first describe the network in more detail in Sect. 5.1 and show that this particular model fits naturally in the general framework described in Sect. 2. Then, in Sect. 5.2, we study the question of how to compute the covariance matrix \(\varvec{\Gamma }\) of the \(N\)-dimensional Brownian motion \(\varvec{Z}\) based on this example. More specifically, we obtain expressions for the covariance terms \(\gamma ^{C}_{i, j}\), by using results from the literature on MAPs. We also compute the limiting distributions of \(\widetilde{\varvec{W}}_r\), \(\widetilde{\varvec{D}}_r\) and \(\widetilde{\varvec{L}}_r\). Doing so in an exact fashion turns out to be hard. Therefore, we study how to numerically obtain the limiting distributions, by viewing \(\overline{\varvec{Z}}\) as an \(N\)-dimensional SRBM in Sect. 5.3. Finally, in Sect. 5.4, we conclude by means of simulation that the distribution of \(\widetilde{\varvec{W}}_r\) converges quickly to the distribution of \(\overline{\varvec{Z}}\) as \(r \rightarrow \infty \), and therefore, that the heavy-traffic asymptotics constitute useful approximations for stable systems with a considerable load.

5.1 Description of the two-layered network

The two-layered network that we consider in this section is an extension of the machine-repair model (cf. [36, Chapter 5]) and consists of \(N\) machines \(M_1, \ldots , M_N\) as well as a single repairman \(R\), see Fig. 1. The second layer of this network constitutes the classical machine-repair model, where each machine breaks down after a stochastic lifetime, and the repairman repairs the machines in the order of breakdown. In the event of a breakdown, the machine moves to the repair buffer, where it will wait if the repairman is busy repairing, otherwise repair will start instantly. Contrary to the classical machine-repair model, we assume that each machine \(M_i\) also processes its own queue \(Q_i\) of products at a service speed of one when it is operational, which forms the first layer of the two-layered network.

Fig. 1
figure 1

The two-layered model under consideration

When lifetimes and repair times follow a phase-type distribution, this networks fits the general model given in Sect. 2, as the availability of the Markov chains can then be modelled by a continuous-time Markov chain \(\{\varPhi (t), t \ge 0\}\). For the sake of brevity, we will assume in the remainder of Sect. 5 that \(N=2\) and that the lifetime and repair-time distributions of \(M_i\) are exponentially distributed with rate \(\sigma _i\) and \(\nu _i\), respectively. Then, \(\{\varPhi (t), t \ge 0\}\) operates on the state space \(\varvec{\mathcal {S}}= \{(U,U), (U,R), (R,U), (W,R), (R,W)\}\). A state \(\varvec{\omega } = (\omega _1, \omega _2)\in \varvec{\mathcal {S}}\) represents for each machine \(M_i\) its condition of being up (\(\omega _i = U\)), in repair (\(\omega _i = R\)), or waiting in the repair buffer for repair (\(\omega _i = W\)) at time \(t\). The generator matrix \(\varvec{Q}\) with elements \(q_{i,j}\), \(i,j \in \varvec{\mathcal {S}}\), that corresponds to this Markov chain is given by

$$\begin{aligned} \varvec{Q} = \left( \begin{array}{ccccc} -\sigma _1-\sigma _2 &{} \sigma _2 &{} \sigma _1 &{} 0 &{} 0 \\ \nu _2 &{} -\nu _2-\sigma _1 &{} 0 &{} \sigma _1 &{} 0 \\ \nu _1 &{} 0 &{} -\nu _1-\sigma _2 &{} 0 &{} \sigma _2 \\ 0 &{} 0 &{} \nu _2 &{} -\nu _2 &{} 0 \\ 0 &{} \nu _1 &{} 0 &{} 0 &{} -\nu _1 \\ \end{array} \right) , \end{aligned}$$

and we let \(q_i = -q_{i,i}\) be the sum of the outgoing rates of state \(i\). The continuous-time Markov chain \(\{\varPhi (t), t \ge 0\}\) is irreducible and aperiodic, so that its invariant probability measure \(\varvec{\pi }=(\pi _{j})_{j \in \varvec{\mathcal {S}}}\) is uniquely determined by the equations \(\varvec{\pi }\varvec{Q} = \varvec{0}\) and \(\varvec{\pi }\varvec{1}\) = 1 and can be obtained explicitly in terms of the model parameters \(\sigma _1, \sigma _2, \nu _1\) and \(\nu _2\). Since the machines drain their queues of products at service rate one if they are operational (and zero otherwise), the connection with the general framework in Sect. 2 is completed by choosing the state-dependent service speeds as \(\phi _i(\varvec{\omega }) = 1\!\!1_{\{\omega _i = U\}}\), where \(1\!\!1_{\{A\}}\) denotes the indicator function on the event \(A\).

5.2 Derivation of the covariance matrix

Now that the two-layered network is cast as a special instance of the general model given in Sect. 2, we show how to compute expressions for the covariance matrix \(\varvec{\Gamma }\) of the \(N\)-dimensional Brownian motion \(\varvec{Z}\) completely in terms of the model parameters. We do this based on the example of the two-layered network described in Sect. 5.1. However, the following methods can also be used to find the covariance matrix \(\varvec{\Gamma }\) for any instance of the model given in Sect. 2 without any conceptual complications. By (4), it remains to compute expressions for the covariance terms \(\gamma ^{C}_{i, j} = \lim _{t \rightarrow \infty } \frac{1}{t}\text{ Cov }[C_i(t), C_j(t)]\) for all \(i,j \in \{1, \ldots , N\}\). In order to compute these, observe that the increments of \(\{C_i(t), t \ge 0\}\) and \(\{C_j(t), t \ge 0\}\) are conditionally independent given \(\{\varPhi (t), t \ge 0\}\). Therefore, we can view \(\{(\varPhi (t), C_i(t)), t \ge 0\}\), \(\{(\varPhi (t), C_j(t)), t \ge 0\}\) and \(\{(\varPhi (t), C_i(t)+C_j(t)), t \ge 0\}\) as MAPs. As a consequence, a functional central limit theorem for MAPs obtained in [34] can be applied to compute \(\gamma ^{C}_{i, j}\) for all \(i,j \in \{1, \ldots , N\}\). Let \(\omega _{\small {\hbox {ref}}}\in \varvec{\mathcal {S}}\) be an arbitrary reference state and let \(T_k\) be the \(k\)-th time after \(t=0\) that the Markov chain \(\{\varPhi (t), t \ge 0\}\) enters this state. Then, the results of [34] imply the following lemma.

Lemma 5.1

Suppose that \(\{Y(t), t \ge 0\}\) is a Markov-modulated drift process, of which the drift equals \(d_k\) when the Markov chain \(\{\varPhi (t), t \ge 0\}\) is in state \(k\in \varvec{\mathcal {S}}\). Furthermore, suppose that \(|d_k| < \infty \) for each \(k \in \varvec{\mathcal {S}}\) and that \(\sum _{k \in \varvec{\mathcal {S}}} \pi _k d_k =0\). Then, \(\{\frac{1}{\sqrt{s}}Y(st), t \ge 0\}\) converges in distribution, as \(s\rightarrow \infty \), to a driftless Brownian motion starting at \(0\) with variance parameter

$$\begin{aligned} \sigma _Y^2 = 2\sum _{k\in \varvec{\mathcal {S}}}\pi _k \left( \frac{d_k^2}{q_k} + \sum _{l \in \varvec{\mathcal {S}}\backslash \{\{k\}\cup \{\omega _{\small {\hbox {ref}}}\}\}}\frac{q_{k,l}d_kf_l}{q_k}\right) \!, \end{aligned}$$
(27)

where the \(f_l\)-parameters are the unique solution to the set of linear equations

$$\begin{aligned} f_m = \frac{d_m}{q_m} + \sum _{n \in \varvec{\mathcal {S}}\backslash \{\{m\}\cup \{\omega _{\small {\hbox {ref}}}\}\}} \frac{q_{m,n}}{q_m}f_n. \end{aligned}$$

In particular, we have that \(\lim _{t \rightarrow \infty } \frac{1}{t}\text{ Var }[Y(t)] = \sigma _Y^2\).

Proof

The convergence in distribution immediately follows from [34, Theorem 3.4] by taking \(X(t) = \varPhi (t)\) and \(D_{i,j} = V_{i,j} = \upsilon _i = 0\) for all \(i,j\) in the notation of that paper. To show the result for the asymptotic variance of the modulated process \(Y\), observe that \(M(t) = \max _{k:T_k \le t}\{k\}\) counts the number of times the Markov chain returned to the reference state until time \(t\), so that \(\{M(t), t \ge 0\}\) can be interpreted as a (delayed) renewal process. As a consequence,

$$\begin{aligned} \lim _{t \rightarrow \infty } \frac{\text{ Var }[Y(t)]}{t} =&\lim _{t \rightarrow \infty } \frac{\text{ Var }[Y(\sum _{i=1}^{M(t)} (T_{i+1}-T_{i}))]+o(t)}{t} \\ =&\lim _{t \rightarrow \infty } \frac{\mathbb {E}[M(t)]\text{ Var }[Y(T_2-T_1)]+\text{ Var }[M(t)]\mathbb {E}[Y(T_2-T_1)]^2}{t} \\ =&\, \text{ Var }[Y(T_2-T_1)]\lim _{t \rightarrow \infty } \frac{\mathbb {E}[M(t)]}{t} = \frac{\text{ Var }[Y(T_2-T_1)]}{\mathbb {E}[T_2-T_1]}. \end{aligned}$$

Section 3 in [34] shows that \(\text{ Var }[Y(T_2-T_1)] = \mathbb {E}[(Y(T_2-T_1))^2] = \sigma _Y^2\mathbb {E}[T_2-T_1]\), which concludes the proof. \(\square \)

We now apply this lemma to obtain the covariance matrix for the two-layered model with \(N=2\). In particular, to compute \(\sigma ^{2}_{C, 1}\), we study the process \(Y(t) = C_1(t) - \mathbb {E}[C_1(t)] = C_1(t) - (\pi _{(U,U)}+\pi _{(U,R)})t\) with conditional drift \(d_k = 1\!\!1_{\{k \in \{(U,U), (U,R)\}\}}-(\pi _{(U,U)}+\pi _{(U,R)})\) when the modulator \({\varPhi }\) resides in state \(k\). As \(\text{ Var }[Y(t)] = \text{ Var }[C_1(t)]\) for any \(t \ge 0\), an expression for \(\sigma ^{2}_{C, 1}\) is then readily given in Lemma 5.1 by (27). An expression for \(\sigma ^{2}_{C, 2}\) can be found similarly to the computations above or simply by interchanging the indices in the expression of \(\sigma ^{2}_{C, 1}\). Observe that an expression for \( \lim _{t \rightarrow \infty } \frac{1}{t}\text{ Var }[C_1(t)+C_2(t)]\) can also be found using the same technique, but now considering the process \(Y(t) = {C_1(t)+C_2(t) - (\mathbb {E}[C_1(t) + C_2(t)])} = {C_1(t)+C_2(t) - (2\pi _{(U,U)}+\pi _{(U,R)}+\pi _{(R,U)})t}\) instead with \(d_k = 1\!\!1_{\{k \in \{(U,U), (U,R)\}\}}+ 1\!\!1_{\{k \in \{(U,U), (R,U)\}\}}-(2\pi _{(U,U)}+\pi _{(U,R)}+\pi _{(R,U)})\). Again, it then holds that an expression for \(\lim _{t \rightarrow \infty } \frac{1}{t}\text{ Var }[C_1(t)+C_2(t)]\) is given in (27). After these computations, the covariance matrix \(\varvec{\Gamma }\) can be expressed explicitly in terms of the model parameters. The covariance parameters \(\gamma ^{C}_{1, 1}\) and \(\gamma ^{C}_{2, 2}\) are by definition equal to \(\sigma ^{2}_{C, 1}\) and \(\sigma ^{2}_{C, 2}\), for which we have already derived explicit expressions. As for the remaining parameters, we have that both \(\gamma ^{C}_{1, 2}\) and \(\gamma ^{C}_{2, 1}\) are equal to

$$\begin{aligned}&\lim _{t \rightarrow \infty } \frac{1}{t} \text{ Cov }[C_1(t), C_2(t)]\\&\quad = \frac{1}{2}\left( \lim _{t \rightarrow \infty } \frac{1}{t} \text{ Var }[C_1(t)+C_2(t)]-\lim _{t \rightarrow \infty } \frac{1}{t} \text{ Var }[C_1(t)]-\lim _{t \rightarrow \infty } \frac{1}{t} \text{ Var }[C_2(t)]\right) \!, \end{aligned}$$

where all of the terms between the brackets in the right-hand side are now known. As the rest of the terms appearing in (4) were already expressed in terms of the model parameters, the covariance matrix \(\varvec{\Gamma }\) is now explicitly known.

5.3 Numerical evaluation of the limiting distribution of \(\overline{\varvec{Z}}\)

Now that \(\varvec{\Gamma }\) can be computed explicitly, we investigate in this section the joint distribution of \(\overline{\varvec{Z}}\), i.e. the limiting distribution of the scaled workload \(\widetilde{\varvec{W}}_r\), in stationarity. Since the limiting distributions of \(\widetilde{\varvec{D}}_r\) or \(\widetilde{\varvec{L}}_r\) equal the distribution of \(\overline{\varvec{Z}}\) up to a scalar as observed in Corollaries 4.2 and 4.6, the results also directly relate to the limiting distributions of the scaled virtual waiting time and the scaled queue length.

To study the joint distribution of \(\overline{\varvec{Z}}\) as defined in Theorem 3.1, we first observe that this distribution equals the stationary distribution of an \(N\)-dimensional SRBM. In particular, by the definitions of \(\varvec{Z}(t)\) and \(\overline{Z}_i(t)\) in Sect. 2 and Theorem 3.1, respectively, we have that the process \(\overline{\varvec{Z}}(t) = \{\overline{Z}_1(t), \ldots , \overline{Z}_N(t)\}\) satisfies

$$\begin{aligned} \overline{\varvec{Z}}(t) =&\left( \sup _{s \in [0,t]}\{Z_1(s)\}, \ldots , \sup _{s \in [0,t]}\{Z_N(s)\}\right) \\ \,{\buildrel d \over =}\,&\left( \sup _{s \in [0,t]}\{Z_1(t)-Z_1(t-s)\}, \ldots , \sup _{s \in [0,t]}\{Z_N(t)-Z_N(t-s)\}\right) \\ =&\left( Z_1(t)-\inf _{s \in [0,t]}\{Z_1(s)\}, \ldots , Z_N(t)-\inf _{s \in [0,t]}\{Z_N(s)\}\right) \\ =&\, \varvec{Z}(t) + \varvec{R}\varvec{Y}(t), \end{aligned}$$

where the equality in distribution follows since multi-dimensional Brownian motions are time-reversible [4, Lemma II.2]. In this representation, \(\varvec{R}\) is the \(N\times N\) identity matrix, and \(\varvec{Y}(t) = (Y_1(t), \ldots , Y_N(t)) = (-\inf _{s \in [0,t]}{\{Z_1(s)\}}, \ldots , -\inf _{s \in [0,t]}{\{Z_N(s)}\})\). Observe that \(\{\varvec{Y}(t), t \ge 0\}\) is a continuous, non-decreasing process starting in \(\varvec{0}\), of which the elements \(Y_i\) can only increase at times \(t\) when \(\overline{Z}_i(t) = 0\). A process with such a representation is known to be an SRBM on the state space \(\mathbb {R}_+^N\) (see, for example, [7, Section 7.4]). By letting \(t \rightarrow \infty \), it is now clear that the joint distribution of \(\overline{\varvec{Z}}\) coincides with the stationary distribution of an SRBM on the non-negative orthant with drift vector \(\varvec{\mu }\), covariance matrix \(\varvec{\Gamma }\) and reflection matrix \(\varvec{R}\).

Computing the stationary distribution of a multi-dimensional SRBM is in general a challenging problem. Although the SRBM corresponding to our model satisfies the conditions derived in [18] for a unique stationary distribution to exist, it does not necessarily satisfy the necessary requirements found in [19] for this distribution to have a product form. A numerical approach obtained in [9] to compute the stationary distribution is, however, applicable to our setting.

We now apply this numerical algorithm to the two-layered network and observe several parameter effects. Note that for the two-layered network, \(\varvec{R}\) resolves to a 2\(\times \)2 identity matrix, and the underlying Brownian motion \(\{\varvec{Z}(t), t \ge 0\}\) has a drift vector \(\varvec{\mu } = \left( -\beta _1(\pi _{(U,U)}+\pi _{(U,R)}), -\beta _2(\pi _{(U,U)}+\pi _{(R,U)})\right) \) and a covariance matrix \(\varvec{\Gamma } = \mathrm{diag}\left( \frac{\mathbb {E}[B_1^2]}{\mathbb {E}[B_1]}(\pi _{(U,U)}+\pi _{(U,R)}), \frac{\mathbb {E}[B_2^2]}{\mathbb {E}[B_2]}(\pi _{(U,U)}+\pi _{(R,U)})\right) + \varvec{\Gamma }^C\), where \(\varvec{\Gamma }^C\) is a 2\(\times \)2 matrix consisting of the elements \(\gamma ^{C}_{i, j}\) computed in Sect. 5.2. For a number of instances of the two-layered network, we have computed several characteristics of the stationary distribution, such as the first two moments and the cross-moment of \(\overline{Z}_1\) and \(\overline{Z}_2\). The results are summarised in Table 1, where for each of the instances the calculated values for \(\mathbb {E}[\overline{Z}_1]\), \(\mathbb {E}[\overline{Z}_2]\) and the correlation coefficient \(\text{ Corr }[\overline{Z}_1,\overline{Z}_2] = \frac{\mathbb {E}[\overline{Z}_1\overline{Z}_2]-\mathbb {E}[\overline{Z}_1]\mathbb {E}[\overline{Z}_2]}{\sqrt{\mathbb {E}[\overline{Z}_1^2]-\mathbb {E}[\overline{Z}_1]^2}\sqrt{\mathbb {E}[\overline{Z}_2^2]-\mathbb {E}[\overline{Z}_2]^2}}\) are given. Recall that the marginal distribution of \(\overline{Z}_i\) is exponential, so that \(\mathbb {E}[\overline{Z}_i^2] = 2\mathbb {E}[\overline{Z}_i]^2\). Observe also that the limiting distributions of \(\widetilde{\varvec{D}}_r\) and \(\widetilde{\varvec{L}}_r\) are equal to the distribution of \(\overline{\varvec{Z}}\) up to a scalar, so that \(\text{ Corr }[\overline{Z}_1,\overline{Z}_2]\) does not only represent the correlation coefficient pertaining to the limiting distribution of the scaled workload \(\widetilde{\varvec{W}}_r\), but also to that of the scaled virtual waiting time and the scaled queue length. It follows from Table 1 that the competition between the machines of the repair facilities can be of such a level, that the correlation coefficient pertaining to the queue lengths is significant. Moreover, by taking the first instance as a reference, we observe that the correlation coefficient is highly influenced by the relative convergence speed of the arrival rates (instance no. 2), the variability of the service times (instance no. 3), the level of asymmetry in the model parameters (instance no. 4), the frequency of machine breakdowns and speed of machine repairs with respect to the arrivals and services of products (instance no. 5), and the duration of the machine lifetimes with respect to that of their repairs (instance no. 6).

Table 1 Numerical results for several instances of the two-layered network

5.4 Comparison with simulation results

We end this section with an assessment of the quality of the distribution of \(\overline{\varvec{Z}}\) as an approximation for the joint workload distribution in systems with a considerable load. In Table 2, simulation results for the scaled workload \(\widetilde{\varvec{W}}_r\) corresponding to the values \(r=5,10, 20\) are given for each of the instances given in Table 1. Recall that \(\rho _i = 1-\frac{\beta _i}{r}\), so that \(r = 5, 10, 20\) corresponds to \(\rho _i = 0.8, 0.9, 0.95\) if \(\beta _i = 1\). Thus, the values \(r=5,10, 20\) represent systems that operate under a high load, as is often the case in practice.

Table 2 Simulation results for \(\widetilde{\varvec{W}}_5, \widetilde{\varvec{W}}_{10}\) and \(\widetilde{\varvec{W}}_{20}\)

As expected, Tables 1 and 2 suggest that the distribution of \(\overline{\varvec{Z}}\) generally approximates the distribution of \(\widetilde{\varvec{W}}_r\) well in terms of marginal means and the correlation coefficient. In particular, the tables confirm that \(\mathbb {E}[\widetilde{W}_{i,r}]\) converges to \(\mathbb {E}[\overline{Z}_i]\) from below as \(r \rightarrow \infty \) at a fast rate, so that \(\mathbb {E}[\overline{Z}_i]\) is a provably useful upper bound close to the actual value of \(\mathbb {E}[\widetilde{W}_{i,r}]\) for large \(r\) (i.e. significantly loaded systems). Surprisingly, the rate at which \(\mathbb {E}[\widetilde{W}_{i,r}]\) converges to \(\mathbb {E}[\overline{Z}_i]\) does not seem to differ much between the model instances. The slowest convergence occurs in the third model instance due to the high variability of the service times, but it does not deviate much from the other instances. The only outlying rate of convergence can be found in the expected scaled waiting time of the first queue in the second model instance, where convergence is a lot faster. However, this is obvious by the nature of our scaling, since \(\beta _1 = 1/2\) for that model instance instead of \(\beta _1 = 1\). Furthermore, the values of \(\text{ Corr }[\overline{Z}_1,\overline{Z}_2]\) turn out to be accurate approximations of the values \(\text{ Corr }[\widetilde{W}_{1,r},\widetilde{W}_{2,r}]\), \(r=5,10,20\), for almost all of the model instances. Thus, the limiting distribution seems to capture the correlation structure between the queue lengths in the stable case rather well. One can argue that the fifth model instance is an exception to this. However, due to the high frequency of machine breakdowns and repairs, there hardly is any correlation between the queues, making correlation coefficients hard to approximate accurately.