Keywords

1 Introduction

We consider a multiclass G/G/1 queueing system. When the server is idle, we use a balanced routing control policy to determine what kind of customer is served. This problem arises in a number of important applications. For example, it can be viewed as a order model of Van Mieghem [1]. It is interesting to think of our system as modeling order fulfillment at a firm which dynamically receives orders from customers for several different types or classes of goods and services. In order not to keep one kind of customer’s orders in the firm for too long, we need to settle the problem of routing control in such a system. The purpose of this paper is to show how the firm should sequence the different orders that are competing for its scarce resources.

Load balancing is a canonical method for sharing resources among different customers efficiently. Two versions of the load balancing problem have been studied in the literature: static and dynamic. The static version was first analyzed by Azar et al. [2] using the balls-and-bins model. In this paper, we are interested in the dynamic version, in Vvednskaya et al. [3], a routing policy that is based on the partial queue length information is proposed. The model was also studied by Mitzenmacher [4].

Chen and Ye [5] extends and complements the previous works in several ways. Firstly, based on Mitzenmacher’s, they establish the asymptotic optimality of the balanced routing under more general queueing systems. Secondly, they complements the work of He and Down [6], they rigorously establish the heavy traffic limit theorem for the queueing system under balance routing. Compared to Chen and Ye [5], the distinguishing feature of our analysis is that here we consider a multiclass G/G/1 queueing system. We extend the balanced routing control policy to multiclass G/G/1 queueing system.

The paper is organized as follows. In the next section we present our model. Section 3 analyze the fluid limit of the queueing processes, and establish the uniform attraction property of the fluid limit. In Sect. 4, we show that for any \( c \ge 2, \) the balanced routing control is asymptotically optimal. We conclude in Sect. 5 with extensions and discussion.

2 The Model

We consider a general single-server multiclass queueing system. Customers are categorized into \( L \) different classes depending on their specific arrival patterns. Class \( l \in \mathcal{L} = \left\{ {1, \ldots ,L} \right\} \) customers arrive at the system following a renewal process with arrival rate \( \lambda_{l} \). At the server, the customers are served at a rate of \( \mu_{l} \) according to the first-in first-out (FIFO) discipline.

For class \( l \) customers, denote the interarrival times between consecutive customers as \( u_{l,k} ,\;l \in \mathcal{L};\;k \ge 1 \), and denote the service time of \( k{\text{th}} \) customer at the server by \( v_{l,k} ,\;l \in \mathcal{L};\;k \ge 1 \). We assume that \( \left\{ {u_{l,k} ,\;k \ge 1} \right\},\;l \in \mathcal{L} \), and \( \left\{ {v_{l,k} ,\;k \ge 1} \right\},l \in \mathcal{L} \), are i.i.d. random sequence, all with finite second moments. In particular, \( u_{l,k} \) are random variables with mean \( {1 \mathord{\left/ {\vphantom {1 {\lambda_{l} }}} \right. \kern-0pt} {\lambda_{l} }} \) and variance \( a_{l}^{2} ,l \in \mathcal{L} \); and \( v_{l,k} \) are random variables with mean \( {1 \mathord{\left/ {\vphantom {1 {\mu_{l} }}} \right. \kern-0pt} {\mu_{l} }} \) and variance \( b_{l}^{2} ,l \in \mathcal{L} \). Denote the traffic intensity as \( \rho_{l} = {{\lambda_{l} } \mathord{\left/ {\vphantom {{\lambda_{l} } {\mu_{l} }}} \right. \kern-0pt} {\mu_{l} }},l \in \mathcal{L} \). For any \( \mathcal{L}^{\prime } \subseteq \mathcal{L}, \) let \( \lambda_{{\mathcal{L}^{\prime } }} = \sum\limits_{{l \in \mathcal{L}^{\prime } }} {\lambda_{l} } \) and \( \mu_{{\mathcal{L}^{\prime } }} = \sum\limits_{{l \in \mathcal{L}^{\prime } }} {\mu_{l} } \). Let \( \lambda_{0} = {{\lambda_{\mathcal{L}} } \mathord{\left/ {\vphantom {{\lambda_{\mathcal{L}} } L}} \right. \kern-0pt} L} \) denote the average arrival rate and \( \mu_{0} = {{\mu_{\mathcal{L}} } \mathord{\left/ {\vphantom {{\mu_{\mathcal{L}} } L}} \right. \kern-0pt} L} \) the average service rate.

We call \( A\left( t \right) = \left( {A_{l} \left( t \right)} \right)_{{l \in \mathcal{L}}} \) the arrival process, where \( A_{l} (t) \) represents the number of class \( l \) customers that have arrived during the time interval \( \left[ {0,t} \right] \); and call \( S\left( t \right) = \left( {S_{l} \left( t \right)} \right)_{{l \in \mathcal{L}}} \) the service process, where \( S_{l} (t) \) is the number of class \( l \) customers that are served during the first \( t \) time units that the server devotes to class \( l \).

$$ \begin{gathered} U_{l} (0) = 0,\quad U_{l} (k) = \sum\limits_{{k^{\prime } = 1}}^{k} {u_{{l,k^{\prime } }} } ,\quad k \ge 1,\;l \in \mathcal{L}, \hfill \\ V_{l} (0) = 0,\quad V_{l} (k) = \sum\limits_{{k^{\prime } = 1}}^{k} {v_{{l,k^{\prime } }} } ,\quad k \ge 1,\;l \in \mathcal{L}, \hfill \\ \end{gathered} $$
$$ A_{l} (t) = \sup \left\{ {k \ge 0:U_{l} (k) \le t} \right\},\;{\text{and}}\;S_{l} (t) = \sup \left\{ {k \ge 0:V_{l} (k) \le t} \right\}. $$

Let \( q = \left( {q_{1} , \ldots ,q_{L} } \right)^{\prime } \) be a nonnegative integer valued vector (with \( q_{l} \) denote the queue length of class \( l \) customers). Let \( \pi (q) = \left( {j_{1} ,j_{2} , \ldots ,j_{L} } \right)^{\prime } \) be a permutation of indices \( \mathcal{L} \) such that \( q_{{j_{1} }} \ge q_{{j_{2} }} \ge \cdots \ge q_{{j_{L} }} \), and \( \pi_{l} (q) \) be the set of the first \( l \) components of \( \pi (q) \). Let \( \phi (j,k;\pi ) \) be a zero–one random variable, and \( \phi (j,k;\pi ) = 1 \) represents that the \( k{\text{th}} \) service is dispatched to the class \( j \) customers. For \( \forall q \) and \( \forall \mathcal{L}^{\prime} \subseteq \mathcal{L} \) that satisfy \( \pi_{l} (q) = \mathcal{L}^{\prime } \) for some \( l \) \( \left( {1 \le l \le L} \right), \)

$$ \Upphi \left( {\mathcal{L}^{\prime } ,k} \right): = \sum\limits_{{j \in \mathcal{L}^{\prime } }} {\phi \left( {j,k;\pi (q)} \right)} . $$
(1)

We assume that the routing sequence is independent of the arrival process and the service time process. For ease of analysis, we introduce

$$ R\left( {\mathcal{L}^{\prime } ,0} \right) = 0,\;R\left( {\mathcal{L}^{\prime } ,k} \right) = \sum\limits_{{k^{\prime} = 1}}^{k} {\Upphi \left( {\mathcal{L}^{\prime } ,k^{\prime } } \right)} ,\;k \ge 1,\;{\text{for all}}\;\mathcal{L}^{\prime } \subseteq \mathcal{L}. $$

3 Fluid Limit and Uniform Attraction

Here we start with presenting the main processes in this paper. First, let \( Q(t) = \left\{ {Q_{l} (t)} \right\}_{{l \in \mathcal{L}}} \) be the queue length at time \( t \), where \( Q_{l} (t) \) denote the total number of customers of class \( l \) in the system at time \( t \). Then, \( B_{l} (t) \) is the number of customers of class \( l \) that are served during \( [0,t] \), \( T_{l} (t) \) represents total amount of the time during \( [0,t] \) that the server allocates to class \( l \). The above can be described as follows:

$$ B_{l} (t) = \sum\limits_{k = 1}^{S(t)} {\phi \left( {l,k;\pi \left( {Q\left( {V\left( k \right) - } \right)} \right)} \right)} ,\;\sum\limits_{{l \in \mathcal{L}}} {B_{l} (t)} = S(t) = \sum\limits_{{l \in \mathcal{L}}} {S_{l} (t)} ,B_{l} (t) = S_{l} \left( {T_{l} (t)} \right). $$
(2)

The derived process that characterizes the dynamics of the queueing system is the queue-length process:

$$ Q_{l} (t) = Q_{l} (0) + A_{l} (t) - S_{l} \left( {T_{l} (t)} \right) \ge 0, $$
(3)

\( Q_{l} (0) \) is the initial number of customers of class \( l \) in the system, \( l \in \mathcal{L} \).

Now define the workload process and the idling process: \( W\left( t \right) = \left( {W_{l} \left( t \right)} \right)_{{l \in \mathcal{L}}} \) and \( Y\left( t \right) = \left( {Y_{l} \left( t \right)} \right)_{{l \in \mathcal{L}}} , \) where

$$ W_{l} (t) = V_{l} \left( {Q_{l} (0)} \right) + V_{l} \left( {A_{l} (t)} \right) - T_{l} (t) = V_{l} \left( {Q_{l} (t)} \right) = \frac{1}{{\mu_{l} }}Q_{l} (t), $$
(4)

and \( Y_{l} (t) = t - \sum\limits_{{l \in \mathcal{L}}} {T_{l} (t)} \). In addition, we define \( W_{0} (t) \) and \( Y_{0} (t) \) as follows,

$$ W_{0} (t) = \frac{1}{{\mu_{\mathcal{L}} }}\sum\limits_{{l \in \mathcal{L}}} {\mu_{l} W_{l} (t)} = \frac{1}{{\mu_{\mathcal{L}} }}\sum\limits_{{l \in \mathcal{L}}} {Q_{l} (t)} , $$
(5)
$$ Y_{0} (t) = \frac{1}{{\mu_{\mathcal{L}} }}\sum\limits_{{l \in \mathcal{L}}} {\mu_{l} Y_{l} (t)} = t - \frac{1}{{\mu_{\mathcal{L}} }}\sum\limits_{{l \in \mathcal{L}}} {\mu_{l} T_{l} (t)} . $$
(6)

Theorem 1 (Fluid Limit)

Let \( M \) be a given positive constant, and suppose \( \left| {\bar{Q}^{n} (0)} \right| = \sum\nolimits_{{l \in \mathcal{L}}} {\bar{Q}_{l}^{n} (0)} \le M \) for all \( n \). Then, for any subsequence of fluid scaled processes, there exists a further subsequence, denoted by \( \mathcal{N}, \) such that, along \( \mathcal{N}, \)

$$ \begin{aligned} & \left( {\bar{Q}^{n} (t),\;\bar{B}^{n} (t),\;\bar{T}^{n} (t),\;\bar{W}^{n} (t),\;\bar{W}_{0}^{n} (t),\;\bar{Y}^{n} (t),\;\bar{Y}_{0}^{n} (t)} \right) \to \\ & \left( {\bar{Q}(t),\;\bar{B}(t),\;\bar{T}(t),\;\bar{W}(t),\;\bar{W}_{0} (t),\;\bar{Y}(t),\;\bar{Y}_{0} (t)} \right),\;u.o.c \\ \end{aligned} $$
(7)

Theorem 2 (Uniform Attraction)

Let \( M \) be a given positive constant, and suppose \( \left| {\bar{Q}^{n} (0)} \right| = \sum\nolimits_{{l \in \mathcal{L}}} {\bar{Q}_{l}^{n} (0)} \le M \) for all \( n. \) Assume that the heavy traffic condition \( \rho = 1 \) holds. Then, there exists a time \( T_{M} > 0 \) such that, all fluid levels are the same after the time \( T_{M} \) and the fluid levels remain fixed afterward: for all \( l \in \mathcal{L}, \)

$$ \bar{Q}_{l} (t) = \mu_{0} \bar{W}_{0} (t)\left( { = \mu_{0} \bar{W}_{0} (0)} \right),\;{\text{for}}\;t \ge T_{M} . $$
(8)

4 Diffusion Limit and Asymptotic Optimality

We apply the standard diffusion scaling (along with centering) to the derived process:

$$ \hat{Q}^{n} (t): = \frac{1}{n}Q^{n} \left( {n^{2} t} \right),\;\hat{W}^{n} (t): = \frac{1}{n}W^{n} \left( {n^{2} t} \right),\;\hat{W}_{0}^{n} (t): = \frac{1}{n}W_{0}^{n} \left( {n^{2} t} \right). $$
(9)

We can rewrite the unscaled system workload process for the \( n{\text{th}} \) network, and apply the diffusion scaling to the equation, we have

$$ \hat{W}_{0}^{n} (t) = \frac{1}{n}W_{0}^{n} \left( {n^{2} t} \right) = \hat{X}_{0}^{n} (t) + \hat{Y}_{0}^{n} (t), $$
(10)

where

$$ \begin{aligned} \hat{X}_{0}^{n} \left( t \right): = & \frac{1}{{n\mu_{\mathcal{L}}^{n} }}\sum\limits_{{l \in \mathcal{L}}} {\mu_{l}^{n} V_{l}^{n} \left( {\tilde{Q}_{l}^{n} \left( 0 \right)} \right)} \\ & + \frac{1}{{\mu_{\mathcal{L}}^{n} }}\sum\limits_{{l \in \mathcal{L}}} {\mu_{l}^{n} V_{l}^{n} \left( {\hat{A}_{l}^{n} \left( t \right) - \hat{S}^{n} \left( {\tilde{T}_{l}^{n} \left( t \right)} \right)} \right)} + n\left( {\rho^{n} - 1} \right)t, \\ \end{aligned} $$
(11)
$$ \hat{Y}_{0}^{n} (t): = n\left( {t - \frac{1}{{\mu_{\mathcal{L}}^{n} }}\sum\limits_{{l \in \mathcal{L}}} {\mu_{l}^{n} \tilde{T}_{l}^{n} (t)} } \right), $$
(12)

For each \( n, \) we also have

$$ \hat{Y}_{0}^{n} (t)\;{\text{is}}\;{\text{nondecreasing}}\;{\text{in}}\;t \ge 0,\;{\text{and}}\;\hat{Y}_{0}^{n} (0) = 0. $$
(13)

In view of (4) and with the centering procedure as in the above, we can write the following equality for the workload process,

$$ \hat{W}_{l}^{n} \left( t \right) = \frac{1}{{\mu_{l}^{n} }}\hat{Q}_{l}^{n} \left( t \right) + \hat{V}_{l}^{n} \left( {\tilde{Q}_{l}^{n} \left( 0 \right)} \right) + \hat{V}_{l}^{n} \left( {\tilde{A}_{l}^{n} \left( t \right)} \right) + \frac{1}{{\mu_{l}^{n} }}\hat{S}_{l}^{n} \left( {\tilde{B}_{l}^{n} \left( t \right)} \right). $$
(14)

By the functional central limit theorem for the renewal process (refer to Chen and Yao [7], Chap. 5), we have

$$ \left( {\hat{A}^{n} ,\hat{V}^{n} ,\hat{S}^{n} } \right) \Rightarrow \left( {\hat{A},\hat{V},\hat{S}} \right),{\text{as}}\;\;n \to \infty , $$
(15)

For ease of exposition, assume that

$$ Q_{l}^{n} \left( 0 \right) = 0,\;{\text{for}}\;{\text{all}}\;\;l \in \mathcal{L}\;\;{\text{and}}\;{\text{all}}\;n. $$
(16)

It follows from Theorem 1 that as \( n \to \infty , \)

$$ \left( {\tilde{Q}_{l}^{n} \left( 0 \right),\tilde{T}_{l}^{n} \left( t \right)} \right) \to \left( {0,\rho_{l} t} \right),\;u.o.c, $$
(17)

It follows from the random time-change theorem (refer to Chen and Yao [7], Chap. 5), the process \( \hat{X}^{n} \left( t \right) \) converges weakly as follows,

$$ \hat{X}_{0}^{n} \left( t \right) \Rightarrow \hat{X}_{0} \left( t \right): = \theta t + \sum\limits_{{l \in \mathcal{L}}} {V_{k} \left( {\hat{A}_{l} \left( t \right) - \hat{S}\left( {\rho_{l} t} \right)} \right)} ,\;{\text{as}}\;n \to \infty , $$
(18)

where the limit \( \bar{X}_{0} = \left\{ {\hat{X}_{0} \left( t \right),\;t \ge 0} \right\} \) is a Brownian motion.

We show that the derived diffusion scaled process approach to some limits described by \( \hat{W}_{0} \left( t \right),\;\hat{Y}_{0} \left( t \right) \) in the next theorem. The limiting process are characterized by the following relations for all \( t \ge 0 \) and \( l \in \mathcal{L}: \)

$$ \hat{W}_{0} \left( t \right) = \hat{X}_{0} \left( t \right) + \hat{Y}_{0} \left( t \right) \ge 0; $$
(19)
$$ \hat{Y}_{0} (t)\;{\text{is}}\;{\text{nondecreasing}}\;{\text{in}}\;t,\;{\text{with}}\;\hat{Y}_{0} (0) = 0; $$
(20)
$$ \int_{0}^{\infty } {\hat{W}_{0} \left( t \right)d\hat{Y}_{0} \left( t \right)} = 0. $$
(21)

Theorem 3

Suppose that the heavy traffic condition \( \rho = 1 \) holds. Under the balanced routing policy, we have the following results.

  1. (a)

    Diffusion Limit: The following weak convergence holds when \( n \to \infty : \)

$$ \left( {\hat{W}_{0}^{n} \left( t \right),\;\hat{Y}_{0}^{n} \left( t \right),\;\hat{Q}^{n} \left( t \right)} \right) \Rightarrow \left( {\hat{W}_{0} \left( t \right),\;\hat{Y}_{0} \left( t \right),\;\hat{Q}\left( t \right)} \right), $$

in particular, \( \hat{W}_{0} \) is a single-dimensional RBM, \( \hat{Y}_{0} \) is the associated regulator, and \( \hat{Q}\left( t \right) = Q^{*} \left( {\hat{W}_{0} \left( t \right)} \right), \) is the fixed point with all components equal to \( \mu_{0} \hat{W}_{0} \left( t \right). \)

  1. (b)

    Asymptotic optimality: The balanced routing is asymptotically optimal in the following sense: Let \( \hat{W}_{0}^{n,G} \) and \( \hat{Q}^{n,G} \) denote the scaled workload and queue length process associated with any feasible routing scheme \( G. \). Then, for all \( t \ge 0 \) and \( u \ge 0, \) we have

$$ \mathop {\lim \inf }\limits_{n \to \infty } P\left\{ {\hat{W}_{0}^{n,G} \left( t \right) > u} \right\} \ge P\left\{ {\hat{W}_{0} \left( t \right) > u} \right\},\;{\text{and}} $$
(22)
$$ \mathop {\lim \inf }\limits_{n \to \infty } P\left\{ {\mathop {\hbox{max} }\limits_{l} \hat{Q}_{l}^{n,G} \left( t \right) > u} \right\} \ge P\left\{ {\mathop {\hbox{max} }\limits_{l} \hat{Q}_{l} \left( t \right) > u} \right\}. $$
(23)

5 Concluding Remarks

In this paper, we extend the balanced routing control policy to multiclass G/G/1 queueing system. We prove that, under the diffusion scale to minimize the total workload process and the maximum queue length process, for any \( c \ge 2, \) the balanced routing control is asymptotically optimal. In heavy traffic, the multi-server simplifies to a single server with service capacity equal to the sum of the parallel servers, and the analysis still applies.