1 Introduction

In this paper, we consider a control problem in the finite \( M^{X}/G/1/N+1 \) queueing model. Optimal control of queueing models has always been an important topic in the queueing literature and may go back as far as Erlang's study to determine the minimal number of servers c in the M/G/c/c queue such that the loss probability remains below some predefined level. In fact, optimal control of real-world systems is the driving force behind the analysis of most queueing models. A common theme is the determination of parameters (such as the number of servers, the system capacity, and the service rate) that optimize a certain performance measure of the system. This paper falls in the same area. We address an operational problem in a system with a finite buffer, batch arrivals, and a single server, where the service of jobs can be done in one of two modes, regular or high (fast). Switching between modes is possible after a service completion and takes a random duration. In other words, the controllable parameter is the choice of service mode after each service completion. We focus on three performance measures: the long-run average number of jobs in the buffer, the long-run fraction of lost jobs, and the long-run fraction of batches with losses.

An application of this queueing model appears in a telecommunication system with a controllable transmission rate: connections are given less bandwidth (transmission rate) when the transmission line is near full capacity. The question is at which capacity the rate should be slowed down. Because the performance criteria in our queueing model differ from those in the communication problem, we seem to answer the opposite question: how many jobs should be waiting in the buffer before the server works harder? Although this question is quite natural, little attention has been paid in the literature to finite-buffer queueing models with controllable service. Infinite-buffer systems with controllable service, on the other hand, have been studied extensively; see, e.g., Dshalalow [1] for a recent overview. In [2], Nishimura and Jiang developed an algorithm for the steady-state probabilities of an infinite-buffer, single-arrival M/G/1 queue with two service modes and switch-over times under any so-called two-level hysteretic switching rule, i.e., changes in the service mode take place only if the number of jobs in the system rises above a fixed high level or falls below a fixed low level. In [3], Nobel and Tijms extended this model to compound Poisson input, and they used Markov decision theory to calculate the optimal one-level hysteretic rule (i.e., once a change from the slower to the faster service mode is made, this mode is continued until the system is empty) that minimizes the long-run average number of jobs in the system. By exploiting well-known results for the standard \( M^{X}/G/1 \) queue, a tailor-made policy-iteration algorithm was developed via an embedding technique to cope with the problem of the infinite state space. Also, in Nobel [4], these results for the \( M^{X}/G/1 \) queue formed the basis for a regenerative analysis, so that for every two-level hysteretic switching rule, the average number of jobs in the system could be calculated.
In both [4] and [3], the discrete fast Fourier transform (FFT) played an essential role in the calculation of the probability distributions of the number of individual arrivals during a service time and a switching time via their generating functions. Because the application of the FFT requires that these generating functions are given explicitly, some minor restriction with respect to the generality of the service-time and switching-time distributions was required in these papers.

As said, in this paper we discuss the finite-buffer counterpart. The essential differences from the infinite-buffer model are twofold. First, the state space is now finite, suggesting a simplification compared to the infinite model; second, the queueing results analogous to those of the standard \( M^{X}/G/1 \) model, which were so crucial in the analysis of both [3] and [4], are not available for the standard \( M^{X}/G/1/N+1 \) model. As a consequence, we cannot simply adapt the policy-iteration algorithm of [3] or the regenerative approach of [4] to the finite-buffer model. Moreover, other performance measures now become relevant for practical applications, e.g., the loss probability of a job.

The traditional approach for analyzing a controlled queueing model is to formulate a stochastic dynamic program (Markov decision model) and then prove that the optimal policy has certain structural properties, see Koole [5] and Stidham and Weber [6] for recent surveys. For an overview of this approach, we refer to the monographs of Kitaev and Rykov [7] or Sennot [8], and the review paper by Cavazos-Cadena and Sennot [9]. We remark that most results obtained by this approach are valid only in queueing models with exponential services. The contribution of this paper is that we are able to determine the optimal switching strategy (not necessarily hysteretic!) for the performance criteria mentioned above and for a wide range of service and switching distributions. Our approach combines the three usual features: modeling, analyzing, and calculating.

  • Modeling: We reformulate the control problem as a semi-Markov decision model. The value-iteration algorithm will be applied for determining optimal policies.

  • Analyzing: We use general results from Nobel [10] to find expressions of the one-step costs in the semi-Markov decision model. These expressions are formulated in terms of coefficients of different generating functions.

  • Calculating: We apply the discrete fast Fourier transform for inverting these generating functions.

The paper is organized as follows. In Sect. 2, the queueing model is described, and in Sect. 3, the semi-Markov decision model is formulated with the associated value-iteration algorithm. In Sects. 4 and 5, we explain how to calculate the one-step costs, which are required in the decision model. In Sect. 6, some numerical results are given.

2 Description of the Queueing Model

We consider a finite-buffer \( M^{X}/G/1/N+1 \) queueing model with two service modes. Batches of jobs arrive at a single-server station according to a Poisson process. The batch sizes are independent, identically distributed random variables. The single server can handle one job at a time and serves the jobs in the order of their arrival (within a batch, in random order). The service times are independent random variables, but their distribution depends on the service mode. Before serving a job, the server chooses between a regular-speed service mode (R) and a high-speed service mode (H). Only at service completion epochs can the server switch from one mode to the other. Every time the server decides to change mode, a vacation (or switch-over) time is required to prepare for the new mode. These switch-over times are independent random variables, again of one of two types: switching from regular to high, or vice versa. In accordance with practical applications, we assume that the server always switches from the high-speed mode to the regular-speed mode when the system becomes empty and the last service was done in high-speed mode. Service is disabled during a switch-over time, and the next job to be served stays in the buffer until the switch-over time expires and service can start. The buffer is a waiting room with N places. Batches that upon arrival do not find sufficient places in the buffer are partially rejected.

This completes the description of the model. The controller's task is to specify when to switch to the other service mode, in such a way that an optimization criterion is met. The three optimization criteria we propose are as follows:

  • Minimize the long-run average number of jobs in the buffer,

  • Minimize the long-run fraction of jobs that are lost, and

  • Minimize the long-run fraction of batches that are not fully accepted.

Clearly, we consider in all three cases the so-called expected (long-run) average cost optimality criterion. Because we deal with finite state and action spaces—which we will specify in the sequel—an optimal policy is stationary deterministic [11]. An optimal (stationary deterministic) policy can be found by the value-iteration algorithm of Markov decision theory [11, 12]. Before we give this algorithm in the context of our control problem, we specify the notation.

Notation

λ:

rate of Poisson arrivals of batches;

\( \beta_{k} \) :

Pr{batch contains k jobs} (k = 1, 2,…);

\( \beta^{(1)} \) :

mean batch size;

A :

generic random variable representing the interarrival time of batches;

S R :

generic random variable representing the service time of a job under the regular speed service mode;

S H :

similarly under the high-speed mode;

V R :

generic random variable representing the switch-over time for changing from regular speed to high-speed service mode;

V H :

similarly vice versa;

\( \alpha_{j}^{(L)} \) :

Pr{j jobs arrive during a service time S L } (L = R, H);

\( \phi_{j}^{(L)} \) :

Pr{j jobs arrive during switch-over time V L } (L = R, H).

3 The Semi-Markov Decision Model

The control problem of finding the best policy is formulated in a semi-Markov decision model. The elements are decision epochs, state space, action space, transition probabilities, and one-step costs.

  • The decision epochs occur at the end of the service times and at the end of the switching times. The latter are not true decision epochs; they are introduced merely to simplify the transition probabilities and the one-step expected costs (see below).

  • State space:

    $$ S = \{ (i,K,L) \mid i = 0,\ldots,N;\; K = S,V;\; L = R,H \} $$

    where state (i, S, L) describes the situation that a service in L-mode has been completed leaving behind i jobs in the buffer and state (i, V, L) denotes that a switching time V L has expired with i jobs in the buffer.

  • Action spaces A(i, K, L) of feasible actions in state (i, K, L):

    $$ \begin{array}{ll} A(0,S,L) = \{R\}, & A(i,S,L) = \{R,H\} \quad (i \ge 1,\; L = R,H), \\ A(i,V,R) = \{H\} \quad (i \ge 1), & A(i,V,H) = \{R\} \quad (i \ge 0). \end{array} $$
  • One-step transition probabilities \( p_{st}(a) \), denoting the probability that the system jumps to state t at the next decision epoch, given that the state is s at the current decision epoch and that action a is chosen. For instance, suppose that a service in regular mode has been completed leaving behind \( i \ge 1 \) jobs in the buffer and that the action is to switch to high-speed mode. Then, the probability of a full buffer at the next decision epoch is

    \( p_{(i,S,R)(N,V,H)}(H) = \Pr\{\text{at least } N-i \text{ jobs arrive during switch-over time } V_{R}\} = \sum\nolimits_{k \ge N-i} \phi_{k}^{(R)}. \)

    In this way, all transition probabilities are expressed in terms of the distributions \( (\beta_{k}) \), \( (\alpha_{k}^{(L)}) \), and \( (\phi_{k}^{(L)}) \). We summarize here the transition probabilities under action a = R only.

    $$ \begin{aligned} p_{(0,S,R)(j,S,R)}(R) & = \sum\nolimits_{k=1}^{j+1} \beta_{k}\, \alpha_{j+1-k}^{(R)} \quad & (j \le N-1), \\ p_{(0,S,R)(N,S,R)}(R) & = \sum\nolimits_{k>N} \beta_{k} + \sum\nolimits_{k=1}^{N} \beta_{k} \sum\nolimits_{l>N-k} \alpha_{l}^{(R)}, & \\ p_{(i,S,R)(j,S,R)}(R) & = \alpha_{j+1-i}^{(R)} \quad & (i \ge 1,\; i-1 \le j \le N-1), \\ p_{(i,S,R)(N,S,R)}(R) & = 1 - \sum\nolimits_{k=0}^{N-i} \alpha_{k}^{(R)} \quad & (i \ge 1), \\ p_{(i,S,H)(j,V,H)}(R) & = \phi_{j-i}^{(H)} \quad & (i \ge 0,\; i \le j \le N-1), \\ p_{(i,S,H)(N,V,H)}(R) & = \sum\nolimits_{k \ge N-i} \phi_{k}^{(H)} \quad & (i \ge 0), \\ p_{(0,V,H)(j,S,R)}(R) & = \sum\nolimits_{k=1}^{j+1} \beta_{k}\, \alpha_{j+1-k}^{(R)} \quad & (j \le N-1), \\ p_{(0,V,H)(N,S,R)}(R) & = \sum\nolimits_{k>N} \beta_{k} + \sum\nolimits_{k=1}^{N} \beta_{k} \sum\nolimits_{l>N-k} \alpha_{l}^{(R)}, & \\ p_{(i,V,H)(j,S,R)}(R) & = \alpha_{j+1-i}^{(R)} \quad & (i \ge 1,\; i-1 \le j \le N-1), \\ p_{(i,V,H)(N,S,R)}(R) & = \sum\nolimits_{k>N-i} \alpha_{k}^{(R)} \quad & (i \ge 1). \end{aligned} $$

    All other one-step transition probabilities under action R are zero. The set of transition probabilities under action H is constructed similarly. Note that when service is enabled, the system can hold up to N + 1 jobs, while only up to N jobs (the buffer size) can be present when service is disabled.

  • Expected times between decision epochs: \( \tau_{s}(a) \) denotes the expected time until the next decision epoch, given that in state s action a is chosen. Again, we give them here for a = R only:

    $$ \begin{aligned} \tau_{(0,S,R)}(R) & = E[A + S_{R}] = \frac{1}{\lambda} + E[S_{R}], & \\ \tau_{(i,S,R)}(R) & = E[S_{R}] \quad & (i \ge 1), \\ \tau_{(i,S,H)}(R) & = E[V_{H}] \quad & (i \ge 0), \\ \tau_{(0,V,H)}(R) & = E[A + S_{R}] = \frac{1}{\lambda} + E[S_{R}], & \\ \tau_{(i,V,H)}(R) & = E[S_{R}] \quad & (i \ge 1). \end{aligned} $$
  • Expected one-step costs: \( c_{s} \left( a \right) \) are the total expected costs incurred until the next decision epoch if in state s action a is taken. Because we consider three different optimization criteria, we deal with three different one-step cost specifications. They will be discussed in the following section.

  • Expected average cost criterion: Let D be a stationary deterministic policy. The process that describes the consecutive states of the control system (under D) is a Markov chain. The transition probabilities of this chain are \( p_{st} \left( {D\left( s \right)} \right) \) given above (D(s) is the action in state s). Clearly, the chain has only one closed (sub)set of states. Hence, the expected average cost of policy D is

    $$ g(D): = \frac{{\sum\nolimits_{s \in S} {c_{s} (D(s))\pi (s|D)} }}{{\sum\nolimits_{s \in S} {\tau_{s} (D(s))\pi (s|D)} }}, $$

    where \( \pi (s|D) \) is the stationary distribution in state s under policy D, cf. [12, Sect. 3.5]. The goal is to find an optimal policy D*, i.e., a stationary policy for which

    $$ g^{*} := g(D^{*}) \le g(D) $$

    for all stationary policies D.
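Before turning to the algorithm, formulas of the kind given in Sect. 3 can be sanity-checked numerically: the transition probabilities out of each state must sum to one. The following Python sketch (with an invented arrival distribution standing in for \( \alpha_{k}^{(R)} \) and a constant batch size of 2) builds the row out of state (0, S, R) under action R from the corresponding formulas above; it is an illustration only, not part of the model.

```python
import numpy as np

# Sanity check (invented data): the row of transition probabilities out of
# state (0, S, R) under action R, following the formulas of Sect. 3.
N = 5
beta = np.zeros(N + 2)
beta[2] = 1.0                                 # constant batch size 2
rng = np.random.default_rng(0)
alpha = rng.dirichlet(np.ones(2 * N))         # stand-in for alpha_k^(R)

p = np.zeros(N + 1)                           # p[j] = p_{(0,S,R)(j,S,R)}(R)
for j in range(N):                            # j <= N - 1
    p[j] = sum(beta[k] * alpha[j + 1 - k] for k in range(1, j + 2))
# j = N: batch overflow plus overflow of the arrivals during the service
p[N] = beta[N + 1:].sum() + sum(
    beta[k] * alpha[N - k + 1:].sum() for k in range(1, N + 1))
print(p.sum())                                # the row sums to 1
```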

3.1 The Value-Iteration Algorithm

Once all the elements of the Markov decision model are known, we can use the value-iteration algorithm to calculate the optimal stationary deterministic policy D*. We give the formulation of the algorithm in general terms (see also [12, Sect. 3.5]).

First choose a positive number \( \tau \) with \( \tau \le \min_{s,a} \tau_{s}(a) \) and a tolerance \( \epsilon \), e.g., \( \epsilon = 10^{-6} \).

INIT For all \( s \in S \), choose non-negative numbers \( W_{0} (s) \) with

$$ W_{0} (s) \le \mathop {\hbox{min} }\limits_{a} \{ c_{s} (a)/\tau_{s} (a)\} . $$

Let n:= 1.

LOOP For all \( s \in S \), calculate

$$ W_{n}(s) = \min_{a \in A(s)} \left[ \frac{c_{s}(a)}{\tau_{s}(a)} + \frac{\tau}{\tau_{s}(a)} \sum\limits_{t \in S} p_{st}(a) W_{n-1}(t) + \left\{ 1 - \frac{\tau}{\tau_{s}(a)} \right\} W_{n-1}(s) \right], $$

and let \( D_{n} (s) \in A(s) \) be the action that minimizes the right-hand side.

EVAL Compute the bounds,

$$ m_{n} = \min_{s \in S} \left\{ W_{n}(s) - W_{n-1}(s) \right\}, \quad M_{n} = \max_{s \in S} \left\{ W_{n}(s) - W_{n-1}(s) \right\}. $$

TEST If \( M_{n} - m_{n} \leq \epsilon\, m_{n} \), then STOP with the resulting policy D n ; else let n := n + 1 and go to LOOP.

This algorithm returns after, say, n iterations a stationary deterministic policy D n . Let g n := (m n  + M n)/2. Then, \( g_{n} \to g^{*} \) as \( n \to \infty \), and upon termination \( \left| g_{n} - g^{*} \right| \le \epsilon\, g^{*} \). In other words, g n is an approximation of the minimal expected average cost.
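The INIT/LOOP/EVAL/TEST scheme above can be sketched compactly. The following Python fragment runs the recursion on a toy two-state, two-action semi-Markov decision model with invented transition data (not the queueing model of Sect. 3); it illustrates the data transformation with the constant \( \tau \) and the span-based stopping test.

```python
import numpy as np

# Toy two-state, two-action semi-Markov decision model (all data invented).
P = np.array([[[0.8, 0.2], [0.3, 0.7]],       # P[s, a, t]
              [[0.5, 0.5], [0.1, 0.9]]])
c = np.array([[2.0, 4.0], [6.0, 3.0]])        # one-step costs c_s(a)
tau = np.array([[1.0, 2.0], [1.5, 1.0]])      # interdecision times tau_s(a)
t0, eps = tau.min(), 1e-9                     # tau <= min_{s,a} tau_s(a)

W = np.zeros(2)                               # W_0(s) <= min_a c_s(a)/tau_s(a)
for n in range(1, 10001):
    # LOOP: the data-transformed value-iteration recursion
    Q = c / tau + (t0 / tau) * (P @ W) + (1.0 - t0 / tau) * W[:, None]
    W_new = Q.min(axis=1)
    m, M = (W_new - W).min(), (W_new - W).max()   # EVAL: the bounds
    policy = Q.argmin(axis=1)
    W = W_new
    if M - m <= eps * m:                      # TEST: stop on a small span
        break

g = (m + M) / 2.0                             # approximates g*
print(n, g, policy)
```

Since the one-step cost rates of the toy model lie between 2 and 3 in every state under the best action, the resulting g lands in that interval; the same structure carries over directly to the state and action spaces of Sect. 3.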

4 The Expected One-Step Costs

In this section, we specify the expected one-step costs \(c_s \left(a\right) \) for the three optimization criteria.

Consider an arbitrary feasible state-action pair (s, a) of the Markov decision model. Immediately after the action, there is a unique buffer content q. For instance, suppose \( (s,a) = ((i,S,R),H) \) with \( i \ge 1 \). Then, q = i because the server is disabled due to switching. On the other hand, if \( (s,a) = ((i,S,H),H) \) with \( i \ge 1 \), then q = i − 1 because the server takes a job out of the buffer. Also, the time until the next decision epoch depends on the state-action pair (s, a). We denote an arbitrary interdecision time by X. To illustrate, consider the same examples: (s, a) = ((i, S, R), H) yields \( X = V_{R} \) (switching and preparing for the other service mode); (s, a) = ((i, S, H), H) yields \( X = S_{H} \) (service duration). We summarize the buffer content q and interdecision time X for all feasible state-action pairs (s, a) in Table 1. Notice that we already introduced the expectations of X in Sect. 3 as \( \tau_{s}(a) \). The calculation of the expected one-step costs, though, requires the complete distribution of the interdecision time X.

Table 1 Buffer content and interdecision time, given the state-action pair

The one-step costs are costs incurred during the interdecision time X after the buffer started with q jobs. Therefore, we may denote the expected one-step costs by

$$ c_{s} (a) = w(X,q), $$

where X and q are given in Table 1. Because the buffer content does not change during an interarrival time A, we can express \( w(A + S_{L}, 0) \) in terms of \( w(S_{L}, q) \):

$$ \begin{aligned} w(A + S_{R} ,0) & = \sum\limits_{k = 1}^{N} {\beta_{k} w(S_{R} ,k - 1)} + \sum\limits_{k > N} {\beta_{k} w(S_{R} ,N)} , \\ w(A + S_{H} ,0) & = \sum\limits_{k = 1}^{N} {\beta_{k} w(S_{H} ,k - 1)} + \sum\limits_{k > N} {\beta_{k} w(S_{H} ,N)} . \\ \end{aligned} $$

In the following section, we shall determine the functions w(X, q) in detail. We call these functions the expected interdecision cost functions.
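As a small numerical illustration of the mixture above, the following Python sketch computes \( w(A+S_{R},0) \) from invented values of \( \beta_{k} \) and \( w(S_{R},q) \) with N = 4; batches larger than the buffer are captured by the second (partial-rejection) term.

```python
import numpy as np

# Illustration (invented numbers): w(A + S_R, 0) as the batch-size mixture
# above, with N = 4 buffer places.
N = 4
beta = np.array([0.0, 0.5, 0.3, 0.1, 0.05, 0.03, 0.02])  # beta_k, k = 0..6
wS = np.array([1.0, 2.0, 3.5, 5.0, 7.0])                 # w(S_R, q), q = 0..N

# a batch of k <= N jobs leaves k - 1 jobs behind after the first service
# starts; larger batches are partially rejected and fill the buffer to N
w0 = sum(beta[k] * wS[k - 1] for k in range(1, N + 1)) \
     + beta[N + 1:].sum() * wS[N]
print(w0)
```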

5 The Expected Interdecision Cost Functions

As said, it remains to calculate the functions \( w(X,q) \) in the three optimization criteria for \( X = S_{R} ,S_{H} ,V_{R} ,V_{H} \), and \( q = 0,1, \ldots ,N \). This problem is simplified by applying some general results from Chap. 6 in [10]. There, similar functions in the context of a lost-sales production/inventory model are studied. For that purpose, we have to introduce the following two sets of generating functions.

5.1 Generating Functions Associated with the Model

$$ \begin{aligned} \beta(z) & := \sum\limits_{k=1}^{\infty} \beta_{k} z^{k}, \\ B(z) & := \lambda(1 - \beta(z)), \\ A_{L}(z) & := \sum\limits_{j=0}^{\infty} \alpha_{j}^{(L)} z^{j} \quad \text{the generating function of } (\alpha_{j}^{(L)})_{j} \ (L = R, H), \\ \Phi_{L}(z) & := \sum\limits_{j=0}^{\infty} \phi_{j}^{(L)} z^{j} \quad \text{similarly for } (\phi_{j}^{(L)})_{j}. \end{aligned} $$

5.2 Generating Functions Associated with X

Let X be any of the two service times S R , S H or any of the two switch-over times V R , V H . It has cumulative distribution function \( G(x): = {\text{Pr(}}X \le x ) \) and Laplace–Stieltjes transform \( \widehat{G}(s): = \int_{0}^{\infty } {\exp ( - sx)G(dx)} . \)

$$ \begin{aligned} \Psi(z) & = \sum\limits_{k=0}^{\infty} \psi_{k} z^{k} := \widehat{G}(B(z)), \\ \Gamma(j,z) & = \sum\limits_{k=0}^{\infty} \gamma_{k}(j) z^{k} := j \widehat{G}'(B(z)) - \tfrac{1}{2} z B'(z) \widehat{G}''(B(z)), \quad j = 1,2,\ldots, \\ \Delta(z) & = \sum\limits_{k=0}^{\infty} \delta_{k} z^{k} := \frac{z}{(1-z)B(z)} \left\{ 1 - \Psi(z) + B(z) \widehat{G}'(B(z)) \right\}, \\ \Theta(z) & = \sum\limits_{k=0}^{\infty} \theta_{k} z^{k} := \frac{-z B'(z)}{(1-z)B^{2}(z)} \left\{ 1 - \Psi(z) + B(z) \widehat{G}'(B(z)) - \tfrac{1}{2} B^{2}(z) \widehat{G}''(B(z)) \right\}, \\ \Xi(z) & = \sum\limits_{k=0}^{\infty} \xi_{k} z^{k} := \frac{1}{1-z} \left\{ E[X] + \frac{\Psi(z) - 1}{B(z)} \right\}. \end{aligned} $$

Notice that if \( X = S_{R} \), then \( \Psi(z) = A_{R}(z) \): the generating function of the number of arrivals during service time \( S_{R} \). Similarly, if \( X = V_{R} \), then \( \Psi(z) = \Phi_{R}(z) \): the generating function of the number of arrivals during switch-over time \( V_{R} \). (Also, the high-speed equivalents are valid.)
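The coefficients of such generating functions can be recovered by evaluating the function at the roots of unity and applying the discrete FFT, which is the computational device used throughout this paper. A Python sketch for an assumed illustrative case: an exponential time X with mean 1 (so \( \widehat{G}(s) = 1/(1+s) \)), a constant batch size of 4, and λ = 0.15.

```python
import numpy as np

# FFT inversion of Psi(z) = G_hat(B(z)) for an assumed illustrative case:
# exponential X with mean 1 (G_hat(s) = 1/(1+s)), constant batch size 4,
# lambda = 0.15.
lam, batch, M = 0.15, 4, 256                 # M evaluation points

z = np.exp(2j * np.pi * np.arange(M) / M)    # M-th roots of unity
B = lam * (1.0 - z ** batch)                 # B(z) = lambda (1 - beta(z))
G_hat = 1.0 / (1.0 + B)                      # LST of Exp(1) at B(z)

psi = np.fft.fft(G_hat).real / M             # psi_k, k = 0, ..., M - 1
print(psi[:5], psi.sum())
```

The recovered coefficients are exact up to aliasing (terms of order \( \psi_{M+k} \)), which is negligible here; e.g., \( \psi_{0} = \widehat{G}(\lambda) = 1/1.15 \), and only multiples of the batch size carry mass.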

To calculate the expected interdecision cost functions w(X, q), we consider the three optimization criteria separately.

  • Average number of jobs in the buffer.

Let Q(t) be the number of jobs in the buffer at time t, given that at time 0, a decision is taken. Hence,

$$ w(X,q) = E\left[ \int_{(0,X)} Q(t)\,dt \,\Big|\, Q(0+) = q \right], \quad q = 0,1,\ldots,N. $$

Notice that the buffer content Q(t) on the time interval (0, X) changes only due to arriving batches. The following theorem gives the exact expression.

Theorem 1

Let the random variable X be either a service time or a switching time. Then, \( w(X,N) = N\,E[X] \), and for q = 0, 1,…,N − 1:

$$ w(X,q) = N\,E[X] + \sum\limits_{k=0}^{N-q-1} \gamma_{k}(N-q) - (N-q)\,\delta_{N-q} + \theta_{N-q-1}, $$

where \( \gamma_{k} (N - q),\delta_{N - q} ,\theta_{N - q - 1} \) are the coefficients in the terms of the generating functions associated with X.

Proof

The proof follows directly from Theorem 6.4.2 of [10]. We only need to equate the number of empty places in the buffer with the inventory on hand in the lost-sales production/inventory model discussed in [10, Chap. 6].\( \square \)

  • Loss probability of jobs.

This is the problem of finding the switching policy that minimizes the loss probability, i.e., the long-run fraction of jobs that are rejected. The rejection rate is the long-run average number of jobs rejected per unit time. Hence, the loss probability is simply the rejection rate divided by the average number of individual arrivals per unit time, which is \( \lambda \beta^{(1)} \). Therefore, it suffices to find the switching policy that minimizes the rejection rate. For that problem, the expected interdecision cost function is

w(X, q) := the expected number of rejected jobs during the random time X, given that at the start of this interval q jobs are in the buffer.

This is easy:

$$ w(X,q) = \sum\limits_{k > N - q} {(k - N + q)\psi_k} ,\quad q = 0,1, \ldots ,N, $$

where the probabilities \( \psi_k \) are defined earlier in this section. As a side remark, the infinite sum can be easily rewritten as a finite sum, because \( \sum\nolimits_{k > 0} {k\psi_{k} = \lambda \beta^{(1)} E[X]} \).
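To illustrate the finite-sum rewrite, the sketch below computes w(X, q) both directly from the tail and via the identity \( \sum\nolimits_{k>0} k\psi_{k} = \lambda \beta^{(1)} E[X] \), for an assumed simple case in which the \( \psi_{k} \) are Poisson (single arrivals, deterministic X); the two expressions agree.

```python
import math

# Two ways to compute w(X, q) for the rejection-rate criterion, in an
# assumed simple case: single arrivals (beta^(1) = 1) and deterministic X,
# so the psi_k are Poisson with mean lambda * E[X].
lam, EX, N, q = 0.15, 10.0, 8, 3
mean = lam * EX                               # = lambda * beta^(1) * E[X]
K = 60                                        # truncation; tail negligible
psi = [math.exp(-mean) * mean ** k / math.factorial(k) for k in range(K)]

m = N - q
w_tail = sum((k - m) * psi[k] for k in range(m + 1, K))
w_finite = mean - sum(k * psi[k] for k in range(m + 1)) \
           - m * (1.0 - sum(psi[: m + 1]))
print(w_tail, w_finite)
```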

  • Fraction of not fully accepted batches.

The optimal switching policy that minimizes this fraction will be calculated by solving the equivalent problem of finding the switching policy that minimizes the long-run average number of batches per unit time that are not fully accepted. So,

w(X, q) := the expected number of batches not fully accepted during the random time X, given that at the start of this interval q jobs are in the buffer.

From Theorem 6.6.1 in [10], we borrow the following result.

Theorem 2

Let the random variable X be either a service time or a switching time. Then,

$$ w(X,q) = \sum\limits_{k > N-q} \psi_{k} + \lambda\, \xi_{N-q}, \quad q = 0,1,\ldots,N, $$

where the numbers \( \psi_{k} \) and \( \xi_{k} \) are the coefficients in the terms of the generating functions associated with X.

6 Numerical Results

We have chosen the following \( M^{X}/G/1/N+1 \) model with two service modes. The parameters of the arrivals, batches, and service and switching times are

$$ \lambda = 0.15,\; \beta^{(1)} = 4,\; E[S_{R}] = 1.7,\; E[S_{H}] = 1.5,\; E[V_{R}] = 1,\; E[V_{H}] = 1. $$

We consider both a constant and a geometrically distributed batch size. The buffer size is varied from N = 10 to 150. The regular service time \( S_{R} \) is taken to be Erlang-2, while the high-speed service time \( S_{H} \) is Coxian-2 distributed with squared coefficient of variation \( C_{S_{H}}^{2} = 4 \). The switch-over times are both taken to be exponentially distributed.

These distributions give us analytic expressions for the generating function β(z) of the batch size and for the Laplace–Stieltjes transform \( \widehat{G}(s) \) of each of the service and switching times. Hence, all generating functions mentioned in Sect. 5 can be given explicitly. The numerical values of the coefficients of these functions (i.e., \( \alpha_{k}^{(L)}, \phi_{k}^{(L)}, \psi_{k}, \gamma_{k}(j), \delta_{k}, \theta_{k}, \xi_{k} \)) can then be calculated efficiently by the discrete fast Fourier transform. Thus, all ingredients of the semi-Markov decision model are ready for use.

The results are presented in Table 2 (constant batch sizes) and Table 3 (geometric batch sizes). We have considered the criteria “average number of jobs present” and “fraction of lost jobs.” The stopping tolerance \( \epsilon \) in the value-iteration algorithm was set to \( 10^{-6} \). The number of iterations then varied from 216 (buffer size N = 10) to 11,234 (buffer size N = 150) for the loss-probability criterion with constant batch sizes, and from 112 (N = 10) to 10,243 (N = 150) for the average-buffer-content criterion with constant batch sizes. The number of iterations in the case of geometric batch sizes is slightly smaller (e.g., N = 150 requires 9,106 and 8,473 iterations, respectively). In all cases, we found that the optimal switching rules are of the hysteretic type (m, M), where m < M. That is, when the service mode is regular and the buffer content becomes M or more, the server switches to high speed; when the service mode is high and the buffer content becomes m or less, the server switches to regular mode. (M = N + 1 means that the regular mode is always applied.) We conjecture that there are examples in which the optimal policy is not of the hysteretic type, but we have not yet found one.

Table 2 Constant batches. Optimal loss probabilities \( \pi_{\text{loss}}^{*} \) and optimal buffer contents Q*
Table 3 Geometric batches. Optimal loss probabilities \( \pi_{\text{loss}}^{*} \) and optimal buffer contents Q*

We see from the figures in the tables that the optimal switching rules for the two optimization criteria are quite different. To see the deteriorating effect of minimizing the loss probability on the average buffer content, and vice versa, we give, apart from the optimal values, also the results for the non-optimized criterion. As expected, the difference between these values fades away for large buffer sizes, when the loss probability becomes small.

Furthermore, we compare the results with those for the infinite buffer. With the method of [4], we determined the policy that minimizes the average number of jobs in the system for the infinite-buffer model with the same parameter values. (Note that the average number of jobs in the system differs from the average number in the buffer by at most one, the job in service.) We see that the optimal policies are the same for the infinite model and for the finite model with a large buffer, but the average number of jobs in the system (in the infinite model) is quite different, especially in the geometric case.

Finally, we remark that while the criterion values for geometric batches and constant batches are completely different, the optimal switching rules are very similar. This insensitivity of the optimal policy to the batch-size distribution was already observed in [3].

7 Conclusions

In this paper, we considered a natural optimization problem in the context of a queueing system with a finite buffer and two service modes: “at which levels of buffer content should the server switch service mode?” We formulated the optimization problem as a semi-Markov decision model with fictitious decision epochs and solved it by the value-iteration algorithm. In all cases we evaluated numerically, we found that the optimal policy is of the hysteretic type. We doubt whether the optimal policy is always hysteretic, but we cannot provide a counter-example. Furthermore, we saw that one has to be careful in approximating the performance of the finite-buffer system by the corresponding infinite-buffer system, even for large buffers (with small loss rates). The average number of jobs in the system can be far off, as we noticed in the case of geometric batch sizes. Further study may give more insight into this, to us, surprising phenomenon.