1 Introduction

In this paper, we introduce the concert queueing game. This model is motivated by the following scenario. Before going to, say, a popular rock concert with unassigned seats, one faces the following dilemma: Should one go early to secure good seats, but wait a long time in queue, or go late when queues are smaller but the better seats are already taken? Similarly, in a busy cafeteria that opens at noon for lunch, should one go early when the queues are long, or perhaps go hungry a bit longer and avoid the long queues? Similar trade-offs govern customer decisions in many queueing situations, such as visiting a retail store on the day of a huge sale, queueing in front of a book store before the release of a very popular book, visiting the DMV office or a movie theater, and so on. Similar, although not identical, trade-offs may be found in diverse areas that involve periodic congestion, from choosing the best time for commuting to work to choosing the start time for downloading a large file over the Internet.

The proposed model is meant to study the emerging system behavior when users are faced with such service delay vs. queueing delay trade-offs, and choose their arrival times strategically. We assume that there is a large but finite number of customers that need to be served in a first-come first-served manner. The server at the queue becomes active at a particular time. Customers can choose to arrive and queue up both before and after that time. The cost structure of each customer is additive and linear in the waiting time and in the service completion time. Alternatively, a customer may be interested in the number of users served before her rather than the service completion time (see Remark 4 below). Multiple classes of customers are allowed that differ in their cost coefficients. We primarily focus on a finite number of classes, but also briefly address the same model with a continuum of classes.

The analysis in this paper is carried out within a fluid model, which is motivated as the fluid-scale limit of the stochastic queueing system with prescribed arrival timing. This fluid model offers a great deal of analytical simplification. The game of arrivals defined over this model belongs to the class of non-atomic games (Schmeidler 1973), where each customer is infinitesimal and therefore his effect on the others is negligible. We show that this game has a unique Nash equilibrium point (in terms of the aggregate arrival profile), and explicitly identify this point.

An important property of any equilibrium solution is the social efficiency loss it entails, when compared to the social optimum. A popular measure of this loss is the price of anarchy (PoA), which equals the maximum, over all equilibria, of the ratio of the social cost of the equilibrium solution to that of the socially optimal one. In our model, we show that the PoA in the single class setting is exactly 2 for all parameter values. In the multi-class setting, we develop tight upper bounds and corresponding lower bounds on the PoA that depend only on the range of the cost parameters across customer classes. Furthermore, we consider several mechanisms that can be used to reduce the PoA; the analysis is carried out within the single-class setting for simplicity. These mechanisms include service time restrictions, assigning priorities to certain segments of the population, or charging tariffs that depend on the time of service. Equilibrium profiles associated with these mechanisms are easily derived. These are discussed in Section 6; a key observation here is that by suitably dividing the population into n segments, along any of these three ways, the PoA can be reduced to 1 + 1/n, so that it converges to 1 as n → ∞.

Strategic queueing problems that involve self-optimizing customers have been extensively studied for over four decades, spanning problems of admission control, routing, reneging, choice of priorities, pricing, and related issues. A sizable part of this literature is summarized in the monograph (Hassin and Haviv 2003). A central issue in this context is the comparison of the individual equilibrium and the socially optimal solution. This may be traced back to Naor’s seminal paper (Naor 1969), which considers these two solutions in the context of admission control to a single-server queue, and suggests pricing as a means to induce the social optimum. Recently, Haviv and Roughgarden (2007) provided bounds on the PoA for the problem of routing into n parallel servers, and Gilboa-Freedman et al. (2009) studied the PoA in Naor’s model. In Holt and Sherman (1982), an interesting perspective is taken wherein time of arrival of customers is determined through a first-price auction.

Equilibrium arrival patterns to queues with a finite service period were apparently first considered in Glazer and Hassin (1983), where a Poisson-distributed number of homogeneous customers may choose their arrival times with the goal of minimizing their waiting time. In this model, service ends at a specified time and customers are indifferent to the time-of-day when their service is completed. Several extensions and variations of this model have been considered, e.g., in Rapoport et al. (2004), Lariviere and van Mieghem (2004), Hassin and Kleiner (2009), and are further described in Hassin and Haviv (2003) (Chapter 6) and Hassin and Kleiner (2009). The model of Wang and Zhu (2004) incorporates preferences for early service within a multiple shift scheme, where the service period is divided into evenly-spaced shifts, and the waiting time in each shift is determined by the number of customers who choose this shift.

A related body of research exists in the transportation literature, where equilibrium trip-timing patterns were extensively studied in the context of the so-called bottleneck or morning commute problem. Vickrey (1969) introduced a fluid flow model, where homogeneous commuters choose their departure time for travel through a single bottleneck of fixed flow capacity. The cost function for each commuter includes a penalty for arriving early or late to the destination (relative to the desired arrival time), in addition to the cost of delay in the bottleneck. Pointers to the extensive ensuing literature regarding this model and its generalizations may be found in Lindley (2004). In particular, Newell (1987) introduced commuter heterogeneity in terms of their (linear) cost coefficients, in addition to the required arrival time. Lindley (2004) provides existence and uniqueness results for the multiclass model with nonlinear costs, under fairly general conditions. We note that our fluid model can be considered a special case of these models with the desired arrival times all set to zero, although the bottleneck model does not have a predetermined opening time. However, the explicit expressions presented here for the equilibrium, as well as the analysis of the PoA and the ways to reduce it, are new.

The organization of this paper is as follows: In Section 2 we describe our model. We start with a brief description of the stochastic queueing system and its fluid scale analysis, followed by a description of the arrival game for the fluid model. In Section 3 we focus on the single-class case, for which the results are particularly simple, and show that the unique Nash equilibrium corresponds to a uniform arrival profile over a finite interval. We generalize the results to the multi-class setting in Section 4, where we consider a finite number of classes. In both Sections 3 and 4, we also compute and bound the price of anarchy for the derived equilibrium. We briefly discuss the generalization to a continuum of classes in Section 5. In Section 6 we discuss some ways to reduce the price of anarchy. In Section 7, we present numerical results for a simple experiment that suggest that the equilibrium arrival profile that is valid in the fluid regime may be close to equilibrium in the finite-N queue, for N reasonably large. Finally, we end with a brief conclusion in Section 8.

2 Model description

This section introduces the fluid model that we analyze in this paper. We start with a brief description of the underlying stochastic queueing system, discuss the fluid limit of this model (with fixed arrival patterns), and then describe the game model that we consider in the rest of the paper.

2.1 The stochastic queueing system

Consider a queueing system that caters to a finite number N of customers, which are served on a first-come first-served basis. N may be random, with a finite mean E(N). The required service times of these customers form an i.i.d. sequence \((V_j : 1 \leq j \leq N)\) that may have a general marginal distribution with rate \(\mu = 1/E(V_j)\). Each customer j independently picks his arrival time, as a sample from a probability distribution with CDF \(F_j(\cdot)\). Service starts at time t = 0, and continues until all customers are served. Customers may arrive and queue up both before and after t = 0. For simplicity, assume at the outset that each \(F_j\) is supported on a finite time interval.

Suppose that customers wish to be served as early as possible, while minimizing their waiting time. We capture these (possibly conflicting) goals through a linear cost function. Let

$$ c_j(w_j,\tau_j) = \alpha_j w_j + \beta_j \tau_j$$

denote j’s cost function, where \(w_j\) is his waiting time in the queue, \(\tau_j\) his service completion time, and \(\alpha_j > 0\), \(\beta_j > 0\) are the respective cost sensitivities. The cost parameters \((\alpha_j,\beta_j)\) of each customer define his type (or class). The type of each arriving customer is assumed to be randomly selected according to a known and common probability distribution over a given set of types. The customer’s type is considered private information, that is, it is known to the customer himself but not to others.

Given the collection \({\mathbf F} = \{F_j\}\) of arrival time distributions for all customers, both \(w_j\) and \(\tau_j\) become well-defined random variables, and we may consider the expected cost \(C^j_{\mathbf F}=E_{\mathbf F}(\alpha_j w_j + \beta_j \tau_j)\), where \(E_{\mathbf F}\) is the expectation induced by \({\mathbf F}\). As usual, we say that the collection \({\mathbf F} = \{F_j\}\) of arrival time distributions is a Nash equilibrium point (NEP) for this problem if no user j can reduce his expected cost \(C^j_{\mathbf F}\) by unilaterally modifying his arrival time distribution \(F_j\). Our goal is to characterize these NEPs and study their properties. This will be done within an approximate fluid model, which greatly facilitates the analysis and leads to explicit solutions. We thus turn to consider the fluid approximation of this system.

2.2 The fluid limit

This section motivates the fluid model that is the main subject of this paper, by considering the fluid-scale limit of the above-described queueing system as the number of arrivals becomes large. Our discussion here is limited to the case where the arrival time distributions of all customers are pre-specified and not a result of equilibrium considerations. For such a system we identify the fluid limit. Our equilibrium analysis then focuses on the resultant fluid model. In this paper we do not consider the equilibrium arrival distribution in the finite customer setting or its convergence to the equilibrium fluid model arrival profile as the number of arrivals increases to infinity. This is an interesting research direction that is deferred to future work.

Consider a sequence of queueing systems indexed by n ≥ 1, defined on a common probability space, and let \(N^n\) be the population size (number of customers) in the nth system. Assume that \(\lim_{n\to \infty} \frac{N^n}{n} = \Lambda > 0\) (with probability 1). Let \(F^n_i(\cdot)\) denote the arrival time distribution for customer i in the nth system. The service parameters are as described above. In particular, the service time distribution does not depend on n, and has rate μ.

Let \({F}^n(t) = \sum_{i=1}^{N^n} F^n_i(t)\) denote the aggregate arrival profile in the nth system. Suppose that the collection \(\{F^n_i\}\) is given so that

$$ \label{eqn:sec_2_1} \frac{1}{n} {F}^n(nt) \rightarrow F(t) $$
(1)

as n → ∞, uniformly on compact sets (u.o.c.), where F(·) is the fluid arrival profile. It follows that F(·) is the CDF of a positive measure on the real line with total mass Λ. More specifically, F is non-decreasing, right-continuous, with F( − ∞ ) = 0 and F( ∞ ) = Λ.

Note that the time axis is scaled by a factor of n in Eq. 1. This accounts for the increase in the overall service time requirement of all N n customers in the nth system. Importantly, under this time scaling the service time of a single customer diminishes to zero as n increases.

We further observe that the same fluid arrival profile F can arise from different choices of individual arrival distributions, ranging from i.i.d. arrival times to deterministic ones. The following simple example illustrates this point.

Example 1

Suppose F is a uniform distribution on [ − T,T] with mass Λ, namely \(F(t)=\Lambda\frac{t\,+\,T}{2T}\) on that interval. Then, F can arise out of each of the following possibilities.

  1. a.

    IID arrivals: Each \(F^n_i\) corresponds to a uniform distribution on [ − nT,nT] for some T > 0, namely \(F^n_i(t)=\frac{t\,+\,nT}{2nT}\) on [ − nT,nT]. Then, \(\frac{1}{n} {F}^n(nt)= \frac{N^n}{n}\left (\frac{t\,+\,T}{2T} \right )\) and this converges to F(t), almost surely, u.o.c. as n → ∞.

  2. b.

    Deterministic arrivals: \(F^n_i\) corresponds to the deterministic arrival time \(t_i = (2i-n)T\). Equivalently, \(F^n_i(t)={\mathbf 1}\{t\geq (2i-n)T\}\), where 1{·} denotes an indicator function.
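As a rough numerical illustration of case (b), the following Python sketch compares the scaled aggregate profile \(\frac{1}{n} F^n(nt)\) with the fluid profile F(t). It assumes, purely for illustration, that \(N^n = n\) (so that Λ = 1); the function names are hypothetical and not part of the model.

```python
def scaled_profile_deterministic(n: int, T: float, t: float) -> float:
    """(1/n) * F^n(n t) for deterministic arrival times t_i = (2i - n) T.

    Assumption (illustration only): the population size is N^n = n.
    F^n(n t) counts the customers whose arrival time is at most n * t.
    """
    count = sum(1 for i in range(1, n + 1) if (2 * i - n) * T <= n * t)
    return count / n


def fluid_profile(T: float, t: float) -> float:
    """Uniform fluid profile on [-T, T] with total mass 1 (Lambda = 1)."""
    return min(1.0, max(0.0, (t + T) / (2 * T)))


# The gap between the two profiles should shrink as n grows.
for n in (10, 100, 1000):
    gap = max(
        abs(scaled_profile_deterministic(n, 1.0, t) - fluid_profile(1.0, t))
        for t in (-1.0, -0.5, 0.0, 0.3, 1.0)
    )
    print(n, gap)
```

Under these assumptions the scaled count differs from the fluid profile by at most one customer, that is, by at most 1/n at each t.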

We next consider the queue-length process and its fluid limit. Let Q n(t) denote the queue length at time t (including the customer in service), namely, Q n(t) is the cumulative number of arrivals minus service completions up to and including time t, and let

$$\bar{Q}^n(t) = \frac{Q^n(nt)}{n}$$

denote its scaled version. To specify the fluid limit of Q n, let S(t) = μt 1{t ≥ 0} denote the fluid-scale potential service process (recall that service starts at t = 0). Also define X(t) = F(t) − S(t), and

$$ \label{queue} {Q}(t)= {X}(t)+ \sup\limits_{0 \leq s \leq t}[-{X}(s)]^+ \,. $$
(2)

Then, by standard results (see, for example, Chen and Yao 2001, Theorem 6.5 and its proof) it follows that as n→ ∞, the following process-level convergence of the scaled queueing process:

$$ \bar{Q}^n(\cdot) \rightarrow {Q}(\cdot) $$

holds, almost-surely, u.o.c. This process-level convergence result evidently relies on the functional strong law of large numbers.

The limit queue process Q(t) corresponds to a fluid system with deterministic input and output streams of fluid. The cumulative arrival process is given by F(t), and the service rate is μ(t) = μ 1{t ≥ 0}. This fluid model will be the subject of our subsequent analysis.
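The reflection formula in Eq. 2 can be evaluated numerically on a time grid. The sketch below is one possible discretization and is not part of the original analysis: the function name `fluid_queue` and the grid-based running maximum are assumptions, and the supremum over an empty range (t < 0) is taken to be zero.

```python
from typing import Callable, List


def fluid_queue(F: Callable[[float], float], mu: float, times: List[float]) -> List[float]:
    """Evaluate Q(t) = X(t) + sup_{0 <= s <= t} [-X(s)]^+ on a sorted time grid.

    X(t) = F(t) - mu * t * 1{t >= 0}. The supremum is approximated by a running
    maximum over the grid points s >= 0, so accuracy depends on the grid.
    """
    running_sup = 0.0  # value of the sup term; 0 while t < 0 (assumed convention)
    queue = []
    for t in times:
        x = F(t) - (mu * t if t >= 0 else 0.0)
        if t >= 0:
            # Starting from 0 makes this equal to the sup of the positive part of -X.
            running_sup = max(running_sup, -x)
        queue.append(x + running_sup)
    return queue


# Illustrative data: uniform arrival profile on [-T, T] with mass lam, as in Example 1.
lam, T, mu = 2.0, 1.0, 1.0
uniform_F = lambda t: min(lam, max(0.0, lam * (t + T) / (2 * T)))
grid = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
print(list(zip(grid, fluid_queue(uniform_F, mu, grid))))
```

With these illustrative parameters the queue builds up to 1 at time zero, stays level while arrivals match the service rate, and then drains to zero at t = 2.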

2.3 The multiclass fluid model

We proceed to describe the concert arrival game for the fluid model with a finite number of customer classes. The customer population is represented by the set [0, Λ], where Λ stands for the total workload, and each customer corresponds to a single point in this interval. These infinitesimal customers arrive at a service facility with potential service rate μ (in terms of fluid units per unit time), that activates at time t = 0. Thus, all customers may be served within T f  = Λ/μ time units. All customers join a single queue, and are served in the order of their arrival. If a non-zero mass of customers arrives simultaneously (represented by a jump in F(t)), then their queueing order is determined randomly and with symmetric probabilities.

Customers may belong to different classes, which differ in terms of their cost parameters. Let \({\cal I}=\{1,2,\dots,I\}\) denote the set of customer classes. For each class \(i\in{\cal I}\), let Λ i denote the total workload carried by its members. Thus ∑  i Λ i  = Λ, and serving all class i customers requires Λ i /μ time units. The cost function for a class i customer is given by

$$ C_i(w,\tau)=\alpha_i w + \beta_i \tau $$

where w is this customer’s waiting time in the queue, τ ≥ 0 his service completion time, and α i  > 0, β i  > 0 are the respective cost sensitivities to the waiting time and service completion time that specify his class or type.

Consider a customer who arrives at time t and is placed at the end of a queue of size q. His waiting time will be w = q/μ + max {0, − t} so that he completes his service and leaves the system at τ = t + w = q/μ + max {0,t}. (Note that the service time of individual customers is null since customers are infinitesimal.)

Let F i denote the class-i arrival profile. It is the CDF of a positive measure on the real line with total mass Λ i . Thus, F i ( − ∞ ) = 0, F i ( ∞ ) = Λ i and F i (t) is right-continuous and non-decreasing in t. An arrival profile is the collection {F i } of arrival profiles, one for each class. The sum F(t) = ∑  i F i (t) denotes the aggregate arrival profile. As discussed in Section 2.2, an arrival profile F i should be interpreted as a deterministic summary of the arrival decisions of the individual customers, which may themselves be deterministic or stochastic. The following restriction applies to each F i .

Remark 1

To avoid lingering over some mathematical subtleties, we shall assume at the outset that the measure represented by \(F_i\) has no singular continuous component, and is therefore the sum of an absolutely continuous component and a discrete component (see Royden 1988, pp. 108–113, for instance).

Given the aggregate arrival profile \(F = \sum_i F_i\), the queue-size process Q(t) is uniquely defined by Eq. 2. Therefore, the expected waiting time W(t) of a potential arrival at time t is well defined as well. Specifically, if Q(t) is continuous at t, then the waiting time is deterministic and given by \(W(t) = Q(t)/\mu + \max\{0,-t\}\). If Q(t) has a jump at t (due to an upward jump in the arrival profile F), then the position of an arriving customer would be uniformly distributed in [Q(t − ),Q(t + )] with average \(\bar{Q}(t)=\frac{1}{2}(Q(t-)+Q(t+))\), so that the expected waiting time is \(W(t)=\bar{Q}(t)/\mu+\max\{0,-t\}\). Let \(W_F(t)\) denote the expected waiting time that corresponds to a given arrival profile F.

The expected cost of a class i customer that arrives at t is now given by

$$ \label{eqn:CiFt} C^i_F(t) = \alpha_i W_F(t) + \beta_i (t+W_F(t))\,. $$
(3)

More generally, the expected cost incurred by a class i customer who selects his arrival by sampling from probability distribution G is

$$ {\cal C}^i_F(G)= \int_{-\infty}^{\infty} \left ( \alpha_i W_F(t) + \beta_i (t+ W_F(t)) \right ) d G(t) \,. $$

We proceed to define the Nash equilibrium for the induced game. A multi-strategy for this game is a collection {G s (·), s ∈ [0,Λ]} of probability distributions on the real line, one for each customer s, represented by their CDFs.

Definition 1

A multi-strategy {G s (·), s ∈ [0,Λ]} is a Nash equilibrium point if

  1. (i)

    \(F(t) = \int_0^{\Lambda} G_s(t) ds\) is well defined for each t, and

  2. (ii)

    For any customer s ∈ [0,Λ] of class i,

    $$ {\cal C}^i_F(G_s) \leq {\cal C}^i_F(\tilde{G}),\quad \text{for every CDF } \tilde{G}. $$

That is, no customer s can improve his cost by modifying his own arrival time distribution. Note that this definition makes use of the fact that the action of a single (infinitesimal) customer does not affect the arrival profile F(t). This property is shared by the class of non-atomic anonymous games (cf. Schmeidler 1973), to which the present model belongs.

The specific consideration of each customer in the last definition is too detailed for our purpose. A more useful definition may be given in terms of the class arrival profiles.

Definition 2

An arrival profile \(\{F_i,\,i\in{\cal I}\}\) is an equilibrium profile if, for each class i, there exists a set \({\cal T}_i\) of F i -measure Λ i on which \(C^i_F(t)\) is minimal, namely,

$$ C^i_F(\tau) \leq C^i_F(t) \quad \mbox{\rm for all } \tau\in {\cal T}_i \mbox{\rm\ \ and }-\infty<t<\infty \,. $$

Essentially, this definition requires the cost \(C^i_F(t)\) to be minimal on the support of F i .

The two definitions may be seen to be compatible in the following sense:

  1. (i)

    First, given an equilibrium profile \(\{F_i,\,i\in {\cal I}\}\), a compatible equilibrium multi-strategy \(\{G_s(\cdot),\, s \in [0,\Lambda]\}\) may be obtained (for example) by letting \(G_s = F_i/\Lambda_i\) for each customer s of class i. Thus, all customers of a given class i are assigned identical arrival distributions, which add up to the given arrival profile \(F_i\) for that class. This immediately implies that \(F(t) \stackrel{\triangle}{=} \int_0^{\Lambda} G_s(t) ds =\sum_i F_i(t)\), and property (ii) of Definition 1 now follows since, for \(G_s = F_i/\Lambda_i\), we get by Definition 2 that \({\cal C}^i_F(G_s)= \min_t C^i_F(t)\), while the latter is clearly not larger than \({\cal C}^i_F(\tilde{G})\) for any CDF \(\tilde{G}\).

  2. (ii)

    Conversely, an equilibrium multi-strategy {G s (·), s ∈ [0,Λ]} induces a unique arrival profile for each class, given by \(F_i(t)=\int_0^{\Lambda} G_s(t){\mathbf 1}\{s\in S_i \} ds\), where S i is the set of class i customers. Now, {F i } is an equilibrium profile. Indeed, by Definition 1(ii) it follows that, for each s ∈ S i , \({\cal C}^i_F(G_s) = \min_t C^i_F(t)\), hence there must exist a set of times \({\cal T}_s\) of G s -measure 1 on which \(C_F^i(t)\) attains that minimal value. Therefore \(C_F^i(t)\) is minimal also on the union \({\cal T}_i = \bigcup_{s\in S_i} {\cal T}_s\), while the F i -measure of \({\cal T}_i\) is Λ i , since the G s measure of \({\cal T}_i\) is 1 for each s ∈ S i (as \(1\geq G_s({\cal T}_i)\geq G_s({\cal T}_s)=1\)). Thus, the requirements of Definition 2 are satisfied.

3 Analysis of the single-class model

To bring out salient features of the analysis, we first consider the single-class case. Here all customers share the same cost parameters, and we may drop the class index i from the notation. The results in this case are particularly simple: The equilibrium arrival profile turns out to be a uniform distribution, and the price of anarchy exactly equals 2.

The following lemma will be useful in simplifying the expression for the cost function under equilibrium conditions. Some notation is introduced first. Recall that T f  = Λ/μ, and let

$$ t^*= \inf\{t \geq 0: F(t) < \mu t \}. $$

This is the first time beyond 0 at which the server becomes starved.

Lemma 1

For any equilibrium arrival profile F ,

  1. (i)

    \(t^*= T_f\) (i.e., the server works at full rate till the last customer is served).

  2. (ii)

    There are no point masses in F, so that F(t) is absolutely continuous in t.

  3. (iii)

    For t ≤ T f ,

    $$ \label{eqn:1g1} W_F(t)= F(t)/\mu - t. $$
    (4)

As is apparent from the proof below, Condition (ii) in Lemma 1 is applicable even to the finite n queueing system (not just its fluid limit).

Proof of Lemma 1

  1. (i)

    Clearly, \(t^* \leq T_f\), since all customers are served by T f at full service rate. Suppose that \(t^*< T_f\). Then F cannot be an equilibrium arrival profile. To see this, note that Q(t *) = 0 by definition of t *, so that W(t *) = 0. Furthermore, since \(t^*< T_f\), a positive mass of customers have not been served yet, and since Q(t *) = 0 these customers have not arrived by t *, so that F(t *) < Λ. Thus, those customers that arrive after t * can improve their cost by arriving at t * instead and getting served immediately. This implies that F cannot be an equilibrium profile.

  2. (ii)

    Suppose that F has a point mass of size λ > 0 at some \(t = t_1\). Then, a customer that arrives at \(t_1\) sees, on average, half (λ/2) of the customers that arrive at \(t_1\) before her. However, by arriving at \(t_1 - \epsilon\) with \(\epsilon > 0\), such a customer would arrive ahead of this bunch, thereby reducing her waiting time by at least \(\lambda/(2\mu) - \epsilon\), and leaving earlier. Clearly, for \(\epsilon\) small enough this means that arriving at \(t_1\) is not optimal for such a customer. It follows that F has no point masses, namely no discrete component. Since F has no continuous singular component by assumption, it follows that F is absolutely continuous.

  3. (iii)

    We have just established that F has no point masses. This implies that an arrival at t will see the entire queue Q(t) before him. For t < 0, Q(t) = F(t), and the equality in Eq. 4 follows since − t is the customer wait before the server becomes active, and F(t)/μ is the remaining queueing delay once the server becomes active. For 0 ≤ t ≤ T f , Eq. 4 follows from part (i) of this Lemma as service proceeds at full rate in the interval [0,T f ], which implies that Q(t) = F(t) − μt, while W(t) = Q(t)/μ. □

It follows from Lemma 1 that under the equilibrium arrival profile F, the cost C F (t) at any time t ≤ T f equals

$$ \label{eqn:CF} C_F(t) = (\alpha+\beta) F(t)/\mu - \alpha t. $$
(5)

Let \(T_0= -\frac{\Lambda}{\mu} \frac{\beta}{\alpha}\). The cost in Eq. 5 becomes independent of t for t ∈ [T 0, T f ] if we select F = F * where F *(t) = 0 for t ≤ T 0, F *(t) = Λ for t ≥ T f , and

$$ F^*(t) = \Lambda \frac{t-T_0}{T_f-T_0}\,, \quad t \in [T_0, T_f] \,. $$

In that case, Eq. 5 gives C F (t) = Λβ/μ = βT f for t ∈ [T 0, T f ].

Theorem 1

F* is the unique equilibrium arrival profile with \(T_0= -\frac{\Lambda}{\mu} \frac{\beta}{\alpha}\) and \(T_f= \frac{\Lambda}{\mu}\).

Proof

We first verify that \(F^*\) is an equilibrium profile. First, as noted above, \(C_{F^*}(t)=\beta \Lambda/\mu\stackrel{\triangle}{=} c_0\) for \(t \in [T_0, T_f]\). For \(t > T_f\) we have W(t) = 0, hence \(C_{F^*}(t) = \beta t > \beta T_f = c_0\). For \(t < T_0\), an arrival at t is first in queue and gets served at 0, hence \(C_{F^*}(t)=\alpha(-t) > -\alpha T_0 = c_0\). Thus, \(C_{F^*}(t)\) is minimal on the interval \([T_0, T_f]\), which has \(F^*\)-measure Λ. Hence \(F^*\) is an equilibrium arrival profile by Definition 2.

We next show that F * is the unique equilibrium. Let F be any equilibrium arrival profile. By Definition 2, there exists a set \({\cal T}\) of F-measure Λ on which C F (t) equals some constant c 1, while C F (t) ≥ c 1 elsewhere. From Lemma 1, we know that all customers are served by T f so that F(T f ) = Λ. Therefore, we can restrict the set \({\cal T}\) to ( − ∞ ,T f ]. Moreover, as C F (t) is continuous by Lemma 1, we can replace \({\cal T}\) with its closure without changing the above properties. To summarize, \({\cal T}\) can be taken to be a closed set which is bounded above by T f .

Let \(t_1\) be the maximal point in \({\cal T}\). As just noted, \(t_1 \leq T_f\). We claim that \(t_1 = T_f\). Indeed, if \(t_1 < T_f\), then an arrival at time \(t_1\) is the last to arrive and thus gets served last at \(T_f\), so that \(C_F(t_1) = \alpha(T_f - t_1) + \beta T_f > \beta T_f = C_F(T_f)\), which is a contradiction to \(t_1\in {\cal T}\). Therefore \(t_1 = T_f\), implying that \(T_f \in {\cal T}\).

Now, by definition of \({\cal T}\), \(T_f \in {\cal T}\) implies that C F (t) = C F (T f ) = βT f for every \(t\in {\cal T}\). Note that this cost is identical to the cost computed for F * on [T 0,T f ]. But since Eq. 5 holds at any equilibrium, it follows that F(t) = F *(t) for \(t\in{\cal T} \cap [T_0,T_f]\). But this implies that F(t) = F *(t) for t ∈ [T 0,T f ], since F * is strictly increasing on that interval while F(t) is continuous and cannot increase outside the set \({\cal T}\) (as \({\cal T}\) has F-measure Λ). Finally, noting that \(F^*(T_0)=0\) and \(F^*(T_f)=\Lambda\), F is completely defined and equals F *. □
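The equilibrium property in Theorem 1 can be checked numerically. The following Python sketch builds \(F^*\) from the expressions for \(T_0\) and \(T_f\) and evaluates the cost of Eq. 3 at a few arrival times; the function names and the parameter values are illustrative assumptions, and the waiting time uses Eq. 4 for \(t \leq T_f\) and W(t) = 0 afterwards, as in the proof.

```python
from typing import Callable, Tuple


def single_class_equilibrium(lam: float, mu: float, alpha: float,
                             beta: float) -> Tuple[float, float, Callable[[float], float]]:
    """Return (T_0, T_f, F_star) for the single-class fluid model."""
    T_f = lam / mu
    T_0 = -(lam / mu) * (beta / alpha)

    def F_star(t: float) -> float:
        # Uniform arrival profile on [T_0, T_f] with total mass lam.
        if t <= T_0:
            return 0.0
        if t >= T_f:
            return lam
        return lam * (t - T_0) / (T_f - T_0)

    return T_0, T_f, F_star


def arrival_cost(t: float, F: Callable[[float], float], mu: float,
                 alpha: float, beta: float, T_f: float) -> float:
    """Cost alpha * W + beta * (t + W) of arriving at time t under profile F.

    Uses W(t) = F(t)/mu - t for t <= T_f (Eq. 4) and W(t) = 0 for t > T_f.
    """
    w = F(t) / mu - t if t <= T_f else 0.0
    return alpha * w + beta * (t + w)


# Illustrative parameters (assumed): Lambda = 10, mu = 2, alpha = 1, beta = 3.
T_0, T_f, F_star = single_class_equilibrium(10.0, 2.0, 1.0, 3.0)
for t in (-20.0, T_0, -7.0, 0.0, T_f, 6.0):
    print(t, arrival_cost(t, F_star, 2.0, 1.0, 3.0, T_f))
```

For these values \(T_0 = -15\) and \(T_f = 5\); the cost equals \(\beta T_f = 15\) throughout \([T_0, T_f]\) and is strictly larger outside that interval, as the proof requires.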

Remark 2

Observe that the equilibrium cost C F (t) = βT f  = Λβ/μ is independent of α. To understand that, note that for the last arriving customer at t = T f , the waiting time is zero and total cost is just the lateness cost βT f , which also has to be the cost at other time instants t ∈ [T 0,T f ] at equilibrium.

Remark 3

The equilibrium queue size increases linearly for \(T_0 \leq t \leq 0\) according to \(Q(t)=F^*(t)=\frac{\mu \alpha}{\alpha+\beta}(t-T_0)\), and decreases linearly for \(0 \leq t \leq T_f\) according to \(Q(t)=F^*(t)-\mu t=\frac{\mu \beta}{\alpha+\beta}(T_f-t)\). The maximal queue size is obtained at time zero and equals \(Q(0)=\Lambda\frac{\beta}{\alpha+\beta}\). Interestingly, the latter is independent of the service rate μ.

We next evaluate the price of anarchy (PoA) for the single class model. Recall that the social cost J soc is the sum of costs over all customers. For a given arrival profile F, we obtain by Eq. 3 (with the class index dropped),

$$ \label{Jsoc} J_{\rm soc}(F) = \int C_F(t)dF(t) = \int(\alpha W_F(t) + \beta (t+W_F(t))dF(t) \,. $$
(6)

The PoA quantifies the efficiency loss due to selfish decision making by individuals, as the maximum ratio of the social cost at any equilibrium (J eq) to the optimal social cost (J opt). The PoA is then an upper bound on the above ratio for any equilibrium, and equal to this ratio when the equilibrium is unique. Since, by Theorem 1, the equilibrium arrival profile is unique, we simply define PoA as

$$ {\rm PoA} = \frac{J_{\rm eq}}{J_{\rm opt}}. $$

Proposition 1

(PoA for the single-class model) Recall that T f  = Λ/μ. Then

  1. (i)

    \(J_{\rm opt}= \frac{1}{2}\beta \Lambda T_f\).

  2. (ii)

    \(J_{\rm eq} = \Lambda \beta T_f\).

  3. (iii)

    Consequently, PoA = 2.

Proof

  1. (i)

    At the socially optimal solution, the arrival instants of all customers are to be selected to minimize the social cost. Since the fluid model is deterministic, the arrival time of every customer can be set to the instant his service is due to start, which eliminates all queueing delay and is therefore optimal. Thus, W(t) ≡ 0. It is also evident that starving the server before all work is done cannot be optimal, so that the server must work at full rate μ from t = 0 to T f  = Λ/μ. Putting these two observations together implies that the uniform arrival profile F(t) = Λt/T f for 0 ≤ t ≤ T f is optimal. Therefore, by Eq. 6,

    $$ J_{\rm opt} = \int \beta t dF(t) = \frac{1}{2} \beta \Lambda T_f \,.$$

    (This expression becomes obvious once we observe that the mean arrival time is T f /2.)

  2. (ii)

    Recall that the cost for each customer at the unique equilibrium profile F * is constant and equal to βT f . Therefore, the social cost is Λβ T f .

Thus, for the single class model, the social cost at equilibrium is always twice the optimal cost, for any choice of cost parameters and service rate.
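Proposition 1 can also be checked by integrating Eq. 6 numerically. The sketch below uses a simple midpoint rule; the helper names, the number of grid cells, and the parameter values are assumptions made for illustration only.

```python
from typing import Callable, Tuple


def social_cost(density: Callable[[float], float], cost: Callable[[float], float],
                a: float, b: float, cells: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of cost(t) * density(t) over [a, b]."""
    h = (b - a) / cells
    total = 0.0
    for k in range(cells):
        t = a + (k + 0.5) * h
        total += cost(t) * density(t)
    return total * h


def price_of_anarchy(lam: float, mu: float, alpha: float,
                     beta: float) -> Tuple[float, float, float]:
    """Return (J_eq, J_opt, PoA) for the single-class fluid model."""
    T_f = lam / mu
    T_0 = -T_f * beta / alpha

    # Equilibrium: uniform density on [T_0, T_f], waiting time W = F(t)/mu - t.
    eq_density = lam / (T_f - T_0)

    def eq_cost(t: float) -> float:
        w = eq_density * (t - T_0) / mu - t
        return alpha * w + beta * (t + w)

    j_eq = social_cost(lambda t: eq_density, eq_cost, T_0, T_f)

    # Social optimum: arrivals at rate mu on [0, T_f], no waiting, cost beta * t.
    j_opt = social_cost(lambda t: mu, lambda t: beta * t, 0.0, T_f)
    return j_eq, j_opt, j_eq / j_opt


# Illustrative parameters (assumed): Lambda = 10, mu = 2, alpha = 1, beta = 3.
print(price_of_anarchy(10.0, 2.0, 1.0, 3.0))
```

With these values the closed forms give \(J_{\rm eq} = \Lambda\beta T_f = 150\) and \(J_{\rm opt} = \frac{1}{2}\beta\Lambda T_f = 75\), and the numerical ratio should agree with PoA = 2 up to rounding error.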

We close this section by pointing out an important extension to the basic cost model.

Remark 4

(When order of service matters) The cost function considered so far includes two components: the delayed service cost and the waiting cost in the queue. In many settings of interest, such as queueing for a better seat, it is not the time at which service is obtained that is important, but rather the number of customers that obtain service before us. Fortunately, this leads to only minor changes in our fluid model. To see this, note that this change corresponds to replacing the cost function C(t) = αW(t) + β(t + W(t)) from Eq. 3 with

$$\label{new_cost} \hat{C}(t)= \alpha W(t)+\beta F(t)\,. $$
(7)

For this new cost, we can repeat the argument in Lemma 1 to deduce that \(t^*= T_f\) in equilibrium, and therefore W(t) = F(t)/μ − t. Thus, the cost (Eq. 7) equals

$$ \hat{C}(t) = \frac{F(t)}{\mu}(\alpha + \hat{\beta} ) - \alpha t, $$

where \(\hat{\beta}= \beta \mu\). Comparing with Eq. 5, it is evident that the two cost functions coincide once β is replaced by \(\hat{\beta}\). Thus, our previous results hold for the modified cost function as well after making this substitution. In particular, the PoA remains 2.

4 The multiclass problem

We now turn to the multiclass fluid model, where customers can be heterogeneous in terms of their cost parameters. As described in Section 2.3, we divide the customer population into a finite number of classes, each characterized by distinct parameters. In the next section we briefly consider the multiclass model with a continuum of classes.

4.1 The equilibrium profile

We proceed to identify explicitly the equilibrium arrival profile. To that end, define the cost ratio parameters

$$ m_i=\frac{\alpha_i}{\alpha_i+\beta_i}\,, \quad i=1,\dots,I\,. $$

Let us re-order the class indices in increasing order of \(m_i\), so that \(m_i \leq m_{i+1}\). We will assume for simplicity that all the cost ratio parameters \(m_i\) are distinct. When this is not the case, one can simply unify customer classes that have identical \(m_i\)'s, and all the results of this section essentially hold.

Theorem 2

Suppose \(m_1 < m_2 < \dots < m_I\). Then, the equilibrium profile \(\{F_i\}\) exists, is unique, and is specified as follows: Let \(T_0 < T_1 < \dots < T_I\) be an increasing sequence of time instants defined by

$$\label{TT} T_I=\Lambda/\mu,\quad T_{i-1}=T_{i}-\frac{\Lambda_i}{\mu m_i}\,, \;\; i=1,\dots,I\,. $$
(8)

Then, \(F_i\) corresponds to a uniform distribution on \([T_{i-1},T_i]\) with density \(\mu m_i\), namely

$$ \label{density} F_i'(t)= \mu m_i {\mathbf 1}\{T_{i-1}\leq t < T_i\} \,. $$
(9)
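The recursion in Eq. 8 is straightforward to evaluate numerically. The following Python sketch sorts the classes by their cost ratio and computes the breakpoints; the function name `equilibrium_breakpoints` and the sample parameter values are illustrative assumptions, not part of the model.

```python
def equilibrium_breakpoints(lams, alphas, betas, mu):
    """Breakpoints T_0 < ... < T_I of Theorem 2 (Eq. 8).

    Returns (order, m, T): `order` lists the original class indices sorted by
    increasing m_i, `m` holds the sorted cost ratios, and `T` holds T_0..T_I.
    Distinct m_i are assumed, as in the theorem.
    """
    n_classes = len(lams)
    ratios = [a / (a + b) for a, b in zip(alphas, betas)]
    order = sorted(range(n_classes), key=lambda i: ratios[i])
    m = [ratios[i] for i in order]
    lam_sorted = [lams[i] for i in order]

    T = [0.0] * (n_classes + 1)
    T[n_classes] = sum(lams) / mu                 # T_I = Lambda / mu
    for i in range(n_classes, 0, -1):             # backward recursion of Eq. 8
        T[i - 1] = T[i] - lam_sorted[i - 1] / (mu * m[i - 1])
    return order, m, T


# Hypothetical two-class instance (values chosen only for illustration).
order, m, T = equilibrium_breakpoints(
    lams=[1.0, 1.0], alphas=[3.0, 1.0], betas=[1.0, 1.0], mu=1.0)
print(order, m, T)
```

Class i (in sorted order) then arrives at the constant rate `mu * m[i]` on `[T[i], T[i + 1]]`, so that each interval carries exactly the mass of its class.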

We proceed to prove this result. To begin with, observe that Lemma 1 and its proof remain unchanged in the multiclass case. Thus, under any equilibrium profile {F i }, the server operates at its full rate μ from time 0 till the last customer is served. Hence all customers are served by time T f  = Λ/μ. Furthermore, a customer that joins the queue at time t will leave it at time τ = F(t)/μ. Therefore, the cost function for a class i arrival at t is given by

$$ \begin{array}{rll} \label{cost3} C_i(t) &=& \alpha_i (\tau-t) + \beta_i \tau = (\alpha_i+\beta_i)\tau - \alpha_i t \\ &=& (\alpha_i+\beta_i)\frac{F(t)}{\mu} - \alpha_i t \,. \end{array} $$
(10)

The next Lemma establishes the relationship between the arrival times of the different classes at equilibrium.

Lemma 2

Let {F i } be an equilibrium profile.

  1. (i)

    If an interval (t1,t2) belongs to the support of F i (t), then

    $$F_i'(t)=\mu m_i \quad \mbox{for } t\in(t_1,t_2)\,.$$
  2. (ii)

    Let i and j be two class indices so that m i  < m j . Then all arrivals of class i occur before those of class j.

The following lemma is useful for proving Lemma 2.

Lemma 3

Let {F i } be an equilibrium profile, and denote F = ∑  i F i . Then, there are no gaps in the aggregate arrival profile, i.e., F(t2) − F(t1) > 0 for all t2 > t1 such that 0 < F(t1) < Λ.

Proof

Suppose, to the contrary, that there are no arrivals on (t 1,t 2). By our assumptions on t 1 there are some arrivals both before and after this interval. Since the server operates at full rate over (t 1,t 2), it follows that the last customer to enter before t 1 will not get served before t 2. Therefore, by arriving just before t 2, this customer will reduce her waiting time while leaving at the same time as before, thereby improving her cost. Thus, this arrival profile cannot be an equilibrium profile. □

Proof of Lemma 2

  1. (i)

    By the equilibrium definition, it follows that \(C_i(t)\) is constant on \((t_1,t_2)\). From Lemma 1 it easily follows that each \(F_i\) is absolutely continuous, so it admits a density that we denote by \(F_i'(t)\).

    Noting Eq. 10, it follows by differentiation that on that interval,

    $$ F_i'(t)=\mu\frac{\alpha_i}{\alpha_i+\beta_i} = \mu m_i\,. $$
  2. (ii)

    Suppose there are classes i and j with m i  < m j such that some class j arrivals arrive in some interval (t 1,t 2) just before class i arrivals in some interval (t 2,t 3) with t 1 < t 2 < t 3. That there will be non-zero arrivals in each of these two intervals is given by Lemma 3. Let us compare the cost incurred by a class j arrival on these two intervals. For t ∈ (t 1,t 2), C j (t) is constant (by definition of the equilibrium) and equals C j (t 2) (by continuity). Now, from item (i) we know that F′(t) = μm i on (t 2,t 3), hence on that interval,

    $$ \begin{array}{rll} C^{\prime}_j(t) &=& \frac{d}{dt}\left( (\alpha_j+\beta_j)\frac{F(t)}{\mu} - \alpha_j t \right)\\ &=& (\alpha_j+\beta_j)\frac{F'(t)}{\mu} - \alpha_j = (\alpha_j+\beta_j) m_i - \alpha_j \\ &=& (\alpha_j+\beta_j) (m_i -m_j) <0 \,. \end{array} $$

    This implies that the cost C j (t) is strictly smaller on (t 2,t 3) than on (t 1,t 2), which shows that the latter interval cannot be in the support of F j at equilibrium, contrary to our assumption. □

Proof of Theorem 2

To establish Theorem 2, we first show that an equilibrium profile must have the indicated form. From Lemma 2(ii) it follows that the arrivals of the different classes are ordered in increasing order of their \(m_i\) parameters. Now, from Lemma 3 it follows that the arrivals of each class i are supported on a single interval \([\tau_i,T_i]\), and that these intervals are contiguous so that \(\tau_i = T_{i-1}\). From Lemma 2(i) we see that the arrival profile of each class i on its interval \([T_{i-1},T_i]\) is uniform with rate \(\mu m_i\). Computing the overall arrival volume on that interval gives \(\mu m_i (T_i - T_{i-1}) = \Lambda_i\), which implies the recursive relation in Eq. 8. Finally, \(T_I = \Lambda/\mu\) follows from Lemma 1, as already indicated.

It is now a simple matter to verify that the indicated arrival profile is indeed an equilibrium profile. Clearly, the cost \(C_i(t)\) is constant on \([T_{i-1},T_i]\) by construction. Moreover, arguing as in the proof of Lemma 2, it is readily verified that \(C_i'(t) > 0\) for \(t > T_i\) and \(C_i'(t) < 0\) for \(t < T_{i-1}\), thereby establishing that the cost \(C_i(t)\) is indeed minimized on the support \([T_{i-1},T_i]\) of \(F_i\). □
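The verification step of the proof can also be checked numerically. The sketch below builds the piecewise-linear aggregate profile, evaluates the cost of a single infinitesimal arrival, and compares the cost on each class's support with the minimum over a time grid. The helper names and the two-class instance are hypothetical; the service time is taken as max(t, F(t)/μ), an assumption that extends Eq. 10 to arrivals after the queue has emptied.

```python
def breakpoints(lams, m, mu):
    """T_0..T_I from Eq. 8; classes are assumed already sorted by increasing m."""
    T = [sum(lams) / mu]
    for lam, mi in zip(reversed(lams), reversed(m)):
        T.append(T[-1] - lam / (mu * mi))
    return T[::-1]


def aggregate_F(t, T, m, mu):
    """Piecewise-linear aggregate profile with slope mu*m[i] on [T[i], T[i+1]]."""
    return sum(mu * mi * max(0.0, min(t, hi) - lo)
               for mi, lo, hi in zip(m, T[:-1], T[1:]))


def class_cost(t, alpha, beta, T, m, mu):
    """Cost of one infinitesimal arrival at time t, given everyone else's profile."""
    # Service occurs at F(t)/mu while a queue exists, and immediately afterwards.
    tau = max(t, aggregate_F(t, T, m, mu) / mu)
    return alpha * (tau - t) + beta * tau


# Hypothetical two-class instance, listed so that m_1 < m_2.
alphas, betas, lams, mu = [1.0, 3.0], [1.0, 1.0], [1.0, 1.0], 1.0
m = [a / (a + b) for a, b in zip(alphas, betas)]
T = breakpoints(lams, m, mu)
grid = [T[0] - 1.0 + k * (T[-1] - T[0] + 2.0) / 600 for k in range(601)]
for i, (a, b) in enumerate(zip(alphas, betas)):
    on_support = class_cost(0.5 * (T[i] + T[i + 1]), a, b, T, m, mu)
    best_on_grid = min(class_cost(t, a, b, T, m, mu) for t in grid)
    print(f"class {i + 1}: cost on support {on_support:.6f}, "
          f"grid minimum {best_on_grid:.6f}")
```

If the profile is an equilibrium, the two printed values agree for every class up to rounding, since no arrival time outside the support can do better.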

We end this subsection with a few observations regarding the equilibrium profile. The aggregate arrival profile F(t) = ∑  i F i (t) can be expressed more explicitly as follows. F(t) is piecewise linear, with slope μm i on [T i − 1,T i ]. The times T i are given by

$$ \label{Ti} T_i=\Lambda/\mu - \sum\limits_{j=i+1}^I\frac{\Lambda_j}{\mu m_j}\,. $$
(11)

At these times,

$$ \label{FTi} F(T_i)=\Lambda-\sum\limits_{j=i+1}^I \Lambda_j = \sum\limits_{j=1}^i \Lambda_j $$
(12)

with linear interpolation on [T i − 1,T i ] at slope μm i (see Fig. 1). Note that T 0 < 0 (since m i  < 1), so that arrivals start before t = 0 as in the single class case. Further, the aggregate arrival profile is convex for t ≤ T I , meaning that the arrival rate is increasing in time, reaching its peak towards the end of the service period. Still, the queue length is strictly decreasing beyond t = 0 (which again follows since m i  < 1.) Finally, arrivals are ordered in increasing order of \(m_i=\frac{\alpha_i}{\alpha_i+\beta_i}\), or equivalently in increasing order of \(\frac{\alpha_i}{\beta_i}\) which indicates the relative cost they attribute to waiting over being late.

Fig. 1 The cumulative distribution of the aggregate arrival profile in equilibrium

4.2 Price of anarchy

We proceed to compute and bound the Price of Anarchy (PoA) for the multiclass model. To enhance readability, all derivations of the results of this subsection are presented in Appendix A.

We first compute the social cost at equilibrium, J eq, and the optimal social cost J opt.

Proposition 2

In the multiclass model,

$$ \label{Jeq} J_{\rm eq} = \frac{1}{\mu} \sum\limits_{i,j=1}^I \Lambda_i\Lambda_j \alpha_i \min\left\{\frac{\beta_i}{\alpha_i},\frac{\beta_j}{\alpha_j}\right\} \,. $$
(13)

and

$$ \label{Jopt} J_{\rm opt} = \frac{1}{2\mu} \sum\limits_{i,j=1}^I \Lambda_i\Lambda_j \min\{\beta_i,\beta_j\} \,. $$
(14)

The following simple bounds on J eq and J opt readily follow from Eqs. 13 and 14:

$$ \label{JeqBound} J_{\rm eq} \leq \frac{1}{\mu} \sum\limits_{i,j=1}^I \Lambda_i\Lambda_j \beta_i \leq \frac{1}{\mu}\beta_{\max} \Lambda^2 $$
(15)
$$ J_{\rm opt} \geq \frac{1}{2\mu} \beta_{\min}\sum\limits_{i,j=1}^I \Lambda_i\Lambda_j = \frac{1}{2\mu}\beta_{\min} \Lambda^2 $$
(16)

where \(\Lambda=\sum_{i=1}^I \Lambda_i\), β min  =  min i (β i ), and β max  =  max i (β i ). We proceed to derive additional bounds on the ratio of J eq and J opt.

From Eqs. 13 and 14, we obtain the following explicit expression for the PoA:

$$ \label{PoA} {\rm PoA} \stackrel{\triangle}{=} \frac{J_{\rm eq}}{J_{\rm opt}} = 2 \frac{\sum_{i,j=1}^I \Lambda_i\Lambda_j \alpha_i \min\left\{\frac{\beta_i}{\alpha_i},\frac{\beta_j}{\alpha_j}\right\}}{\sum_{i,j=1}^I \Lambda_i\Lambda_j \min\{\beta_i,\beta_j\}} \,. $$
(17)

As we will see below, the PoA ranges around the single-class value of 2. We proceed to present some bounds on its value. Essentially, we will be interested in bounds that depend only on the ranges of the cost parameters (α i and β i ) but not on the relative size (Λ i ) of the customer classes. We start with some special cases, where only one parameter varies across classes.

Proposition 3

  1. (i)

    Identical wait sensitivities. Suppose α i  ≡ α0: the wait sensitivities are identical for all customer classes. Then

    $$ {\rm PoA} =2. $$
  2. (ii)

    Identical lateness sensitivities. Suppose β i  ≡ β0: the lateness sensitivities are identical for all classes. Then

    $$ \label{bound1a} {\rm PoA} \leq 2, $$
    (18)

    and

    $$ \label{bound1b} {\rm PoA} \geq 2 - (1-{I}^{-1}) \left(1-\frac{\alpha_{\min}}{\alpha_{\max}}\right) \geq 1 + \frac{\alpha_{\min}}{\alpha_{\max}}, $$
    (19)

    where α max  =  max i α i , α min  =  min i α i , and I is the number of classes.

Item (i) of the last proposition is evidently an exact extension of the PoA result for the single-class case, giving the same value of 2. Regarding (ii), we first note the upper bound of 2 is strict unless all the α i ’s are equal as well. Thus, in this case, diversity in the waiting sensitivities of the customers actually improves the PoA compared to the single class case. As for the lower bound, for two user classes (I = 2) with α 1 < α 2 it reads

$$ {\rm PoA} \geq 1.5+0.5\,\frac{\alpha_1}{\alpha_2}. $$

We observe that this bound is tight, and is achieved when Λ1 = Λ2.
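Equations 13, 14 and 17 are simple double sums, so the special cases above can be reproduced directly. The following Python sketch evaluates them; the function names and the numerical instances are illustrative assumptions.

```python
def social_costs(lams, alphas, betas, mu):
    """Return (J_eq, J_opt) from Eqs. 13 and 14."""
    j_eq = 0.0
    j_opt = 0.0
    for lam_i, a_i, b_i in zip(lams, alphas, betas):
        for lam_j, a_j, b_j in zip(lams, alphas, betas):
            weight = lam_i * lam_j
            j_eq += weight * a_i * min(b_i / a_i, b_j / a_j)
            j_opt += weight * min(b_i, b_j)
    return j_eq / mu, j_opt / (2.0 * mu)


def price_of_anarchy(lams, alphas, betas, mu=1.0):
    """PoA = J_eq / J_opt (Eq. 17); the service rate mu cancels in the ratio."""
    j_eq, j_opt = social_costs(lams, alphas, betas, mu)
    return j_eq / j_opt


# Identical wait sensitivities (Proposition 3(i)): the PoA is exactly 2.
print(price_of_anarchy([1.0, 2.0, 3.0], [2.0, 2.0, 2.0], [0.5, 1.0, 4.0]))

# Identical lateness sensitivities, two equally sized classes with alpha_1 < alpha_2:
# the value should match 1.5 + 0.5 * alpha_1 / alpha_2.
print(price_of_anarchy([1.0, 1.0], [1.0, 2.0], [1.0, 1.0]))
```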

We now turn to consider the general case, when both sets of cost parameters may vary across customer classes. The following set of bounds is obtained simply by bounding separately the ratios of each pair of corresponding terms in the numerator and denominator of Eq. 17.

Proposition 4

Let \(H_{\max} = \max_{i,j} H(i,j)\) and \(H_{\min} = \min_{i,j} H(i,j)\), where

$$ H(i,j) = \frac{(\alpha_i+\alpha_j)\min\left\{\frac{\beta_i}{\alpha_i},\frac{\beta_j}{\alpha_j}\right\}} {2\min\{\beta_i,\beta_j\}}. $$

Then

$$ \label{Gbound} 2H_{\min} \leq {\rm PoA} \leq 2H_{\max}. $$
(20)

Consequently,

$$ \label{G1} {\rm PoA} \leq 1+ \frac{\alpha_{\max}}{\alpha_{\min}} , $$
(21)
$$ \label{G2} {\rm PoA} \leq 1+ \frac{\beta_{\max}}{\beta_{\min}}, $$
(22)
$$ \label{G3} {\rm PoA} \geq \left(1+\frac{\alpha_{\min}}{\alpha_{\max}}\right)\frac{\beta_{\min}}{\beta_{\max}}. $$
(23)
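The bounds of Proposition 4 can be compared against the exact value of Eq. 17 on any given instance. The sketch below does so for one hypothetical three-class instance; the helper names are assumptions introduced here for illustration.

```python
def poa(lams, alphas, betas):
    """Exact PoA from Eq. 17."""
    num = den = 0.0
    for lam_i, a_i, b_i in zip(lams, alphas, betas):
        for lam_j, a_j, b_j in zip(lams, alphas, betas):
            num += lam_i * lam_j * a_i * min(b_i / a_i, b_j / a_j)
            den += lam_i * lam_j * min(b_i, b_j)
    return 2.0 * num / den


def h_range(alphas, betas):
    """Return (H_min, H_max) over all class pairs, with H(i,j) as in Proposition 4."""
    values = [
        (a_i + a_j) * min(b_i / a_i, b_j / a_j) / (2.0 * min(b_i, b_j))
        for a_i, b_i in zip(alphas, betas)
        for a_j, b_j in zip(alphas, betas)
    ]
    return min(values), max(values)


# Hypothetical instance in which both cost parameters vary across classes.
lams, alphas, betas = [1.0, 2.0, 3.0], [1.0, 2.0, 4.0], [3.0, 1.0, 2.0]
h_min, h_max = h_range(alphas, betas)
value = poa(lams, alphas, betas)
print(f"{2 * h_min:.4f} <= {value:.4f} <= {2 * h_max:.4f}")
print("Eq. 21 upper bound:", 1 + max(alphas) / min(alphas))
print("Eq. 22 upper bound:", 1 + max(betas) / min(betas))
print("Eq. 23 lower bound:",
      (1 + min(alphas) / max(alphas)) * min(betas) / max(betas))
```

Note that the bounds in Eqs. 20 to 23 do not involve the class sizes, whereas the exact value does.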

Equation 22 provides an upper bound on the PoA in terms of the β parameters only. In fact, a tighter bound of this form may be derived through somewhat refined analysis. This bound also points to the “worst case” conditions in terms of the PoA when the (β i ) parameters are given.

Proposition 5

\( {\rm PoA} \leq 1+\sqrt{\frac{\beta_{\max}}{\beta_{\min}}}\) .

We note that the bound of the last proposition is tight, in the sense that for any set of \(\beta_i\)'s, the bound is satisfied with equality for some \((\alpha_i, \Lambda_i)\) parameters. Indeed, as implied by the proof, setting the \(\beta_i\)'s in increasing order, equality is obtained for \(\Lambda_2 = \dots = \Lambda_{I-1} = 0\), \(\Lambda_1/\Lambda_I = \sqrt{\beta_I/\beta_1}\), and \(\alpha_I/\alpha_1 = \beta_I/\beta_1\) (cf. Eq. 17).
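As a worked check of this tightness claim, consider only the two classes with nonzero mass and relabel them 1 and 2. Since the PoA in Eq. 17 is invariant to rescaling the \(\beta_i\)'s and the \(\Lambda_i\)'s, one may normalize \(\beta_1 = 1\) and \(\Lambda_2 = 1\) (a normalization introduced here for convenience), and write \(s = \sqrt{\beta_2/\beta_1}\), so that \(\beta_2 = s^2\) and \(\Lambda_1 = s\). The condition \(\alpha_2/\alpha_1 = \beta_2/\beta_1\) gives \(\beta_1/\alpha_1 = \beta_2/\alpha_2\), hence \(\alpha_i \min\{\beta_i/\alpha_i, \beta_j/\alpha_j\} = \beta_i\), and the two sums in Eq. 17 become

$$ \sum\limits_{i,j} \Lambda_i\Lambda_j \beta_i = (\Lambda_1+\Lambda_2)(\Lambda_1\beta_1+\Lambda_2\beta_2) = s(s+1)^2, \qquad \sum\limits_{i,j} \Lambda_i\Lambda_j \min\{\beta_i,\beta_j\} = \Lambda_1^2\beta_1 + 2\Lambda_1\Lambda_2\beta_1 + \Lambda_2^2\beta_2 = 2s(s+1)\,. $$

Therefore

$$ {\rm PoA} = 2\,\frac{s(s+1)^2}{2s(s+1)} = 1+s = 1+\sqrt{\frac{\beta_2}{\beta_1}}\,, $$

which coincides with the bound of Proposition 5.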

5 The continuous parameter model

We next consider our model with a continuous set of customer classes, rather than discrete. It may be argued that this model is more realistic, which comes at the expense of larger computational (and possibly technical) difficulty. Our treatment here will be brief and informal, and we will essentially rely on the discrete-parameter results to infer the form of the equilibrium arrival profile in the present case.

Let q ∈ I denote here the continuous class parameter. We can identify q with the two cost parameters \((\alpha_q,\beta_q)\in \Re_+^2\). Let \(g_1(q) \geq 0\) be a density function on I, with total mass \(\int g_1(q)dq =\Lambda\). Thus, \(g_1(q)\) denotes the density of arrivals of class q. We assume that there are no point masses in the cost parameter distribution, so that \(g_1\) is finite.

Let m q  = α q /(α q  + β q ) ∈ [0,1] denote the cost ratio parameter for class q customers. Since the equilibrium arrival profile is completely characterized by this parameter, it will be useful to define its density. Thus, let g(m) ≥ 0 denote a density function on [0,1], which is obtained from g 1 as

$$ g(m) = \int {\mathbf 1}\{m_q=m\} g_1(q) dq \,. $$

We assume that g(m) is finite as well. Obviously, \(\int g(m)dm = \Lambda\). Further, let

$$ G(m)=\int_0^m g(\eta)d\eta $$

denote the (absolutely continuous) cumulative distribution function of g. For simplicity, we will assume that g has finite support (i.e., m is bounded).

As in the discrete parameter case, let F(t) describe the aggregate arrival profile of the customer population. The equilibrium arrival profile is defined as before. Looking at the continuous model as the limit of the discrete one, with the number of classes going to infinity, we may infer the following analogous properties of the equilibrium arrival profile (see Lemmas 2 and 3, Theorem 2 and Fig. 1).

  1. 1.

    The server operates at full rate μ till the last customer is served. Thus, the last customer is served at T f  = Λ/μ.

  2. 2.

    Arrivals occur in increasing order of m. That is, customers of class q 1 arrive before those of class q 2 if \(m_{q_1}<m_{q_2}\).

  3. 3.

    If arrivals at time t have cost ratio parameter m(t), then

    $$ \label{dF} F^{\prime}(t)=\mu m(t). $$
    (24)

It follows that all customers with m ≤ m(t) arrive up to time t, hence

$$ F(t)=G(m(t)). $$

We proceed to derive differential equations for m(t) and F(t). Differentiating the last equation gives

$$ F^{\prime}(t)=g(m(t))m^{\prime}(t) $$

and together with Eq. 24 we get

$$ \label{dm} m^{\prime}(t) = \mu \frac{m(t)}{g(m(t))}\;\;, \quad t\leq T_f. $$
(25)

The boundary condition for this equation is obtained by noting that the last arrivals occur at T f  = Λ/μ and have maximal m. Thus, letting m max denote the maximal point in the support of g(m),

$$ m(T_f)=m_{\max}. $$

m(t) may now be computed from the differential equation with a boundary condition. The equilibrium arrival profile F(t) may then be computed using F(t) = G(m(t)).

We note that a direct equation for F(t) follows by combining Eq. 24 with Eq. 25, yielding

$$ F^{\prime \prime}(t) =\frac{\mu F^{\prime}(t)}{g(\mu^{-1}F^{\prime}(t))} $$

with terminal conditions F′(T f ) = μm max and F(T f ) = Λ. It is clearly seen that F′′(t) ≥ 0 over t ≤ T f , hence F(t) is convex there.

It is easy to verify that the arrival profile thus defined is indeed an equilibrium profile. Recall that the cost function for a class q arrival is given by (see Eq. 10):

$$ C_{q}(t) = (\alpha_{q}+\beta_{q})\frac{F(t)}{\mu} - \alpha_{q} t = (\alpha_{q}+\beta_{q}) \left(\frac{F(t)}{\mu} - m_{q} t\right). $$

It further follows by construction and Eq. 24 that customers with parameter \(m_q\) arrive at time \(t_q\) defined by \(F'(t_q) = \mu m_q\). We will show that \(t_q\) minimizes \(C_q\). Differentiating, we get

$$ C_{q}^{\prime}(t) = (\alpha_{q}+\beta_{q}) \left(\frac{F^{\prime}(t)}{\mu} - m_{q}\right). $$

Therefore, C q ′(t) = 0 at t = t q . Furthermore,

$$ C_{q}^{\prime \prime}(t) = (\alpha_{q}+\beta_{q}) \frac{F^{\prime \prime}(t)}{\mu} \,. $$

But as observed above, F(t) is convex on \(t \leq T_f\) and hence so is \(C_q\). Thus, \(t_q\) is a minimizer there. It is also clear that the cost \(C_q\) is increasing for t beyond \(T_f\), hence \(t_q\) is a global minimizer of \(C_q\).

We turn to an example that illustrates the required computations in the simple case of uniformly distributed cost parameters.

Example 2

Let the cost ratio parameter m of the customer population be uniformly distributed on some interval [m 0,m 1], namely

$$ g(m) = g_0 {\mathbf 1}\{m_{0}\leq m \leq m_{1}\}\,, \quad g_0=\frac{\Lambda}{m_{1}-m_{0}}. $$

Then, by Eq. 25,

$$ m^{\prime}(t) = \frac{\mu}{g_0} m(t)\,, \;\; t\leq T_f\,; \quad m(T_f)=m_{1}, $$

with the solution

$$ m(t)=m_{1}e^{ \frac{\mu}{g_0}(t-T_f)}\,, \;\; T_0\leq t\leq T_f. $$

Here T 0 must satisfy m(T 0) = m 0, so that \(T_0=T_f-\frac{g_0}{\mu}\log\big(\frac{m_{1}}{m_{0}}\big)\). The equilibrium arrival density F′ is given by

$$ F^{\prime}(t)=g(m(t))m^{\prime}(t) = g_0 m^{\prime}(t) = \mu m_1 e^{ \frac{\mu}{g_0}(t-T_f)}\,, \quad T_0\leq t\leq T_f. $$

Evidently, the arrival density at equilibrium turns out to be an exponentially increasing function. Finally, the cumulative arrival distribution F(t) may be obtained by integrating F′ and using \(F(T_f) = \Lambda\), yielding

$$ F(t)= g_0 m_1 \big(e^{ \frac{\mu}{g_0}(t-T_f)}-1\big)+\Lambda\,, \quad T_0\leq t\leq T_f. $$
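For a general density g, Eq. 25 has to be integrated numerically, backward in time from the terminal condition at \(T_f\). The sketch below does this with a classical fourth-order Runge-Kutta scheme (a choice made here, not prescribed by the model) and uses Example 2 as a sanity check, since the closed form is available. The solver name and the parameter values are assumptions.

```python
import math


def solve_m_backward(rhs, m_final, t_final, t_start, steps=2000):
    """Classical RK4 for m'(t) = rhs(m), integrated from t_final back to t_start."""
    h = (t_start - t_final) / steps      # negative step: backward in time
    m = m_final
    for _ in range(steps):
        k1 = rhs(m)
        k2 = rhs(m + 0.5 * h * k1)
        k3 = rhs(m + 0.5 * h * k2)
        k4 = rhs(m + h * k3)
        m += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return m


# Hypothetical parameter values for Example 2.
Lam, mu, m0, m1 = 1.0, 1.0, 0.25, 0.75
g0 = Lam / (m1 - m0)                      # uniform density of m on [m0, m1]
T_f = Lam / mu
T_0 = T_f - g0 / mu * math.log(m1 / m0)   # time of the first arrival

# Inside the support g(m) = g0, so the right-hand side of Eq. 25 is mu*m/g0.
m_at_T0 = solve_m_backward(lambda m: mu * m / g0, m1, T_f, T_0)

# The closed-form F(t) from the example, evaluated at T_0, should vanish.
F_at_T0 = g0 * m1 * (math.exp(mu / g0 * (T_0 - T_f)) - 1.0) + Lam

print(m_at_T0, m0)    # numerical m(T_0) against the exact value m0
print(F_at_T0)        # approximately zero
```

For a non-uniform g, only the lambda passed as `rhs` changes; the profile then follows from F(t) = G(m(t)).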

To close this section, we observe that the PoA bounds from Section 4.2, which depend only on the range of the cost parameters α and β, should hold without modification in the present continuous-parameter model as well.

6 Reducing the price of anarchy

We next discuss some ways in which the social inefficiency of the equilibrium solution, hence the price of anarchy, may be reduced. For simplicity, we consider the setting of a single customer class. The generalization to multi-class is conceptually straightforward, but requires more elaborate calculations. A key message of this section is that the fluid model is sufficiently tractable to provide elegant and intuitive answers to many natural methods for reducing the PoA.

We consider three methods in this context:

  • Temporal segmentation, where certain parts of the population can be served only after specified time thresholds.

  • Priority assignment, where certain parts of the populations are given absolute service priority over others.

  • Time-dependent tariffs, where customers who are served earlier are charged more.

In these cases we show that by appropriately segmenting the population into n parts, the PoA can be reduced from 2 to \(1+\frac{1}{n}\).

Both temporal segmentation and priority assignment can be viewed as partial forms of customer scheduling through appointment setting, and may be applicable whenever the latter is relevant. A familiar application which is handled along similar lines is airplane boarding, where economy passengers are assigned different priority based on their seat location. Price differentiation and segmentation are of course important topics in operations research and economics, and our treatment here barely scratches the surface.

We start the discussion by considering in some detail temporal segmentation with two groups. Here we will compute explicitly the equilibrium for the different choices of the temporal delay threshold and population shares, and establish the optimal choices that lead to the minimal PoA of \(\frac{3}{2}\). This derivation is also of independent interest, as it brings out some interesting structural properties of the equilibrium at different levels of separation between the two populations. We then proceed to consider (albeit in lesser detail) the n-level schemes for temporal and priority segmentation, and conclude with a brief discussion on the use of differential tariffs. Later, in Appendix A, we also point out the performance degradation that may occur with suboptimal pricing. Without loss of generality, we take Λ = 1 in this section.

6.1 Temporal segmentation: two segments

Consider the case where the population is divided into two segments. Specifically, assume that a proportion a ∈ (0,1) of the population is allowed to be served at any time t ≥ 0, while a proportion (1 − a) is allowed to be served only after some time \(\hat{\tau}>0\). Call these the first population and second population, respectively. Note that the restriction is only one-sided: no upper bounds are imposed on the service times of the first population. We do allow the second population to queue up before time \(\hat{\tau}\), so that after time \(\hat{\tau}\) they join the end of the queue of population 1 customers at the service facility (if any) and are served after them. Within the same population, service is always in the order of customer arrivals. After time \(\hat{\tau}\), customers from either population join at the end of the main queue.

We first note that the first population may be fully served by time \(\frac{a}{\mu}\). Therefore, if we set \(\hat{\tau} > \frac{a}{\mu}\), then the server would be idle in the interval \([\frac{a}{\mu},\hat{\tau}]\), which is clearly inefficient. We therefore restrict attention to the case where \(\hat{\tau} \leq \frac{a}{\mu}\).

The following proposition summarizes our findings regarding the equilibrium arrival distributions for different values of \(\hat{\tau} \leq \frac{a}{\mu}\). Let m denote the ratio \(\frac{\alpha}{\alpha +\beta}\), and define

$$ \label{eqn:defn_tau_st} \tau_0 = \frac{a}{\mu}-(1-a) \frac{\beta}{\alpha \mu}\,. $$
(26)

Proposition 6

  1. (i)

    For \( \tau_0 \leq \hat{\tau} \leq \frac{a}{\mu}\), the equilibrium arrival profile of each population is unique, with the first population arriving uniformly over the interval \(\big[ \hat{\tau} - \frac{a}{\mu m},\, \hat{\tau}\big]\) at rate μm, and the second population arriving uniformly over the interval \(\big[\tau_0, \frac{1}{\mu}\big]\) at rate μm. The PoA increases linearly from \(2(a^2 + 1 - a)\) to 2 as \(\hat{\tau}\) decreases from \(\frac{a}{\mu}\) to \(\tau_0\).

  2. (ii)

    For \( 0\leq \hat{\tau} < \tau_0\), in any equilibrium, the joint arrival profile of both populations is uniform over the interval \(\big[-\frac{\beta}{\alpha \mu},\, \frac{1}{\mu}\big]\), with rate μm. Population 1 alone arrives over the interval \(\big[-\frac{\beta}{\alpha \mu}, \hat{\tau}\big]\), and the remainder from both populations arrive in arbitrary order over \(\big[\hat{\tau}, \frac{1}{\mu}\big]\). Here, the PoA equals 2.

The proof is presented in Appendix A.

A central point to note is that population 2’s cost is not affected by this segmentation. The only effect is the potential gain to population 1. (Of course, in the stochastic setting one can expect some loss to population 2 due to the increase in queue size uncertainty as time progresses. This is an interesting point for future study.)

Another interesting observation is the phase change that occurs at the critical value of \( \hat{\tau} = \tau_0\). Above this value, population 1 obtains a concrete cost improvement over the unsegmented case. Below this value, although population 2 does refrain from arriving before \(\hat{\tau}\), there is no gain or loss for either population.

Returning to the issue of equilibrium efficiency, the important observation is that the PoA is minimized by setting \(\hat{\tau}\) to the extreme value of \(\frac{a}{\mu}\). Here, the unique equilibrium corresponds to both populations blissfully unaware of each other, as population 1 finishes its service exactly at \(\hat{\tau}\). The first population arrives as if the second does not exist and the server facility opens at time 0, while the second population arrives as if the server facility opens at time a/μ and queues up appropriately before time a/μ (see Fig. 2 for an illustration). Further observing that the minimum of \(2(a^2 + 1 - a)\) is obtained for a = 0.5, we obtain the main conclusion of this subsection, stated in Corollary 1.

Fig. 2 Equilibrium queue length profile for the two populations. Population 1 comprises a proportion a of the customers and is served in the interval [0, a/μ]. Population 2 comprises a proportion (1 − a) and is allowed service after time a/μ, although it starts queueing from time \(\tau_{0}\) onwards

Corollary 1

The PoA for a two-group temporal segmentation is minimized by setting a = 0.5 and \(\hat{\tau}=\frac{0.5}{\mu}\). The minimal value is \(\frac{3}{2}\).

Thus, the optimum is attained by splitting the two populations equally, and minimizing their interaction by allowing the second to be served only when the first has finished.
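The dependence of the PoA on a and \(\hat{\tau}\) described in Proposition 6 can be tabulated with a few lines of Python. In the sketch below, the per-customer cost of population 1, \(\alpha\big(\frac{a}{\mu m} - \hat{\tau}\big)\), is not stated in the text: it is inferred here from Proposition 6(i) by evaluating the cost of the earliest population 1 arrival, under the assumption that this customer is served at time 0. The function name and parameter values are likewise illustrative.

```python
def two_segment_poa(a, tau_hat, alpha, beta, mu):
    """PoA of two-segment temporal segmentation with Lambda = 1 (Proposition 6)."""
    if not 0.0 <= tau_hat <= a / mu:
        raise ValueError("tau_hat must lie in [0, a/mu]")
    m = alpha / (alpha + beta)
    tau0 = a / mu - (1.0 - a) * beta / (alpha * mu)     # Eq. 26
    if tau_hat < tau0:
        return 2.0                                       # case (ii): no gain
    # Case (i). Assumption: the first population 1 customer arrives at
    # tau_hat - a/(mu*m) < 0 and is served at time 0, so she only pays for waiting.
    cost_pop1 = alpha * (a / (mu * m) - tau_hat)
    cost_pop2 = beta / mu                                # unaffected by segmentation
    social_cost = a * cost_pop1 + (1.0 - a) * cost_pop2
    optimal_cost = beta / (2.0 * mu)
    return social_cost / optimal_cost


alpha, beta, mu = 2.0, 1.0, 1.0          # hypothetical values
for a in (0.25, 0.5, 0.75):
    print(a, two_segment_poa(a, a / mu, alpha, beta, mu))   # equals 2(a^2 + 1 - a)
```

At \(\hat{\tau} = a/\mu\) the function reduces to \(2(a^2 + 1 - a)\), and at \(\hat{\tau} = \tau_0\) it returns 2, consistent with the linear interpolation described in the proposition.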

6.2 Temporal segmentation: multiple segments

Suppose that we segment the population into n parts, with each allowed to be served only after a certain time. We shall not go into a detailed analysis of this problem, but rather use the insight from the two segment case, so that we divide the population into n equal segments, of size \(\frac{1}{n}\) each, and eliminate their interaction by allowing the i-th segment to be served only after the previous ones are expected to have finished, namely at time \(\frac{i-1}{n\, \mu}\) for i = 1,2,...,n. Then the equilibrium cost for customers getting served in a slot \(\big(\frac{i-1}{n \mu}, \frac{i}{n \mu}\big)\) would equal \(\frac{ i\beta}{n\mu}\). As this pertains to a 1/n proportion of the population, the total cost equals

$$ \frac{\beta}{n\mu} \left(\frac{1}{n} + \frac{2}{n} + ... + 1\right) =\frac{\beta}{2\mu}\frac{n+1}{n} \,. $$

Comparing with Proposition 1, it is immediately seen that the PoA equals \(\frac{n+1}{n}\), which approaches the optimal value of 1 as n increases.

6.3 Priority queueing

Another way to achieve PoA equal to \(\frac{n+1}{n}\) is through dividing the population into n separate segments and assigning different priorities to them. Specifically, suppose that the population is divided into n segments with (a i : i ≤ n) denoting the respective proportions (the cost function is identical for each segment). The population segment with lower index is given priority over the segment with higher index. Then, in equilibrium customers arrive in disjoint intervals, customers of segment 1 arrive first uniformly in the interval \(\big[-\frac{\beta a_1}{\alpha \mu}, \frac{a_1}{\mu}\big]\) and are served by the server in the interval \([0,\frac{a_1}{\mu}]\). Similarly, customers of segment j ≥ 2 arrive uniformly in the interval \(\big[ \sum_{i=1}^{j-1}\frac{a_i}{\mu}- \frac{\beta a_j}{\alpha \mu}, \sum_{i=1}^{j}\frac{a_i}{\mu}\big]\) and are served in the interval \(\big[\sum_{i=1}^{j-1}\frac{a_i}{\mu}, \sum_{i=1}^{j}\frac{a_i}{\mu}\big]\).

The cost incurred by segment j equals \(\beta \sum_{i=1}^{j}\frac{a_i}{\mu}\), so that the overall price of anarchy equals

$$ 2 \left [\sum\limits_{j=1}^{n} a_j (\sum\limits_{i=1}^{j}{a_i}) \right ]. $$

Through simple optimization, it can be seen that this is minimized by setting \(a_j= \frac{1}{n}\) for each j so that the PoA equals \(\frac{n+1}{n}\) as in Section 6.2.
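The optimization step can be made concrete by noting that \(2\sum_j a_j \sum_{i\le j} a_i = \big(\sum_j a_j\big)^2 + \sum_j a_j^2 = 1 + \sum_j a_j^2\) when the shares sum to one, which is minimized by equal shares. The Python sketch below evaluates the PoA expression exactly with rational arithmetic; the function name is an assumption.

```python
from fractions import Fraction


def priority_poa(shares):
    """PoA of the priority scheme: 2 * sum_j a_j * (a_1 + ... + a_j)."""
    if sum(shares) != 1:
        raise ValueError("segment proportions must sum to 1")
    cumulative = 0
    total = 0
    for a in shares:
        cumulative += a          # a_1 + ... + a_j
        total += a * cumulative
    return 2 * total


for n in (1, 2, 3, 4):
    equal = [Fraction(1, n)] * n
    print(n, priority_poa(equal))            # (n + 1) / n

# An unequal split into two segments does worse than the equal split.
print(priority_poa([Fraction(1, 4), Fraction(3, 4)]))
```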

6.4 Charging tariffs

Recall that in Section 6.1 in the two population setting, we obtained the best PoA when we divided the populations in equal parts and allowed the second population to come after time \(\frac{1}{2 \mu}\). Then, the cost to each customer in the first population was \(\frac{\beta}{2 \mu}\) less than that of customers in the second population. This suggests a procedure for implementing discriminatory pricing.

For brevity, we restrict our discussion to the case where customers joining the service facility queue by time \(\frac{1}{2 \mu}\) have to pay a constant tariff p while the customers joining the service facility queue after this time pay no tariff. We refer to the former as population 1 and the latter as population 2. We assume here that demand of one unit is fixed and is unaffected by the pricing strategy of the service provider. Again, we allow population 2 to queue up before time \(\frac{1}{2 \mu}\) separately and join at the end of the service facility queue at time \(\frac{1}{2 \mu}\). In this case, they are served after population 1 customers at the service facility queue at that time, if any, and in their order of arrival amongst population 2. We further assume that the tariff collected is returned to the society so this does not enter into the price of anarchy calculations. We now discuss the scenario \(p=\frac{\beta}{2 \mu}\) that corresponds to minimum PoA. The discussion of the remaining two cases \(p>\frac{\beta}{2 \mu}\) and \(p<\frac{\beta}{2 \mu}\) is relegated to Appendix A.

6.4.1 \(p=\frac{\beta}{2 \mu}\)

In this scenario, the first population arrives uniformly between \(\big[-\frac{\beta}{2\alpha \mu}, \frac{1}{2 \mu}\big]\) at rate μm, and the other between \(\big[\frac{1}{2 \mu}-\frac{\beta}{2\alpha \mu}, \frac{1}{ \mu}\big]\) at the same rate. The cost incurred by both the populations is \(\frac{\beta}{\mu}\): For the first population it is \(\frac{\beta}{2\mu}\) from waiting and time to service and another \(\frac{\beta}{2\mu}\) from the tariff for coming early.

Thus, a customer is indifferent to coming as part of population 1 or 2. The revenue collected by the service provider from tariffs equals \(\frac{\beta}{4\mu}\). The PoA, as before, equals 3/2. See Fig. 3 for an illustration of this scenario.

Fig. 3 The dotted line denotes the queue profile before differential pricing. After differential pricing the darkened line denotes the queue profile of population 1 that pays β/(2μ) more than population 2 whose queue profile is shown using the lighter line. The cost to customer joining either of the two populations equals β/μ

It is easily seen that by having n − 1 separate tariffs so that customers served in the interval \(\big(\frac{i}{n \mu}, \frac{i+1}{n\mu}\big)\) for (i = 0,1,2, ..., n − 1) are charged amount \(\frac{\beta}{\mu} \frac{n-i-1}{n}\), we can achieve PoA equal to \(\frac{n+1}{n}\).
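The n-slot tariff schedule can be checked with exact arithmetic. In the following Python sketch, the delay cost of customers served in slot i is taken to be \(\frac{(i+1)\beta}{n\mu}\), as in Section 6.2 (with slots indexed from 0), and tariffs are excluded from the social cost as assumed above. The function names are illustrative assumptions.

```python
from fractions import Fraction


def tariff_scheme(n, beta, mu):
    """Return [(delay_cost, tariff)] for slots i = 0..n-1, with Lambda = 1."""
    beta, mu = Fraction(beta), Fraction(mu)
    rows = []
    for i in range(n):
        delay_cost = beta * (i + 1) / (n * mu)        # as in Section 6.2
        tariff = beta / mu * Fraction(n - i - 1, n)   # zero for the last slot
        rows.append((delay_cost, tariff))
    return rows


def summarize(n, beta, mu):
    """Return (PoA excluding tariffs, total revenue) for n equal slots."""
    rows = tariff_scheme(n, beta, mu)
    share = Fraction(1, n)                            # population mass per slot
    social_cost = sum(share * delay for delay, _ in rows)
    revenue = sum(share * tariff for _, tariff in rows)
    optimal_cost = Fraction(beta) / (2 * Fraction(mu))
    return social_cost / optimal_cost, revenue


for delay, tariff in tariff_scheme(4, 1, 1):
    print(delay, tariff, delay + tariff)   # the last column is beta/mu in every slot
print(summarize(2, 1, 1))                  # PoA 3/2 and revenue beta/(4*mu)
```

Every customer faces the same total of delay cost plus tariff, which is what makes customers indifferent between slots.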

7 Numerical experiments

In our analysis of the single class customer setting, we derived the unique equilibrium arrival profile for an asymptotic fluid regime in which the number of customers increases to infinity. We refer to this as the asymptotic equilibrium arrival profile. When the number of customers is finite, the associated equilibrium arrival profile may be more intricate, and determining it is an interesting subject for future research. In this section we numerically test the efficacy of the asymptotic equilibrium profile in the fixed N customer setting for a simple example, to get a sense of its closeness to equilibrium in the finite-N queue as N increases.

We consider the case where there are N single class customers with linear costs that follow two variants of the asymptotic equilibrium strategy: In Case 1, the customers select their arrival times by sampling from a uniform distribution over their support. In Case 2, the customers arrive at deterministic evenly spaced intervals. As pointed out in Section 2, both cases represent a finite-sample approximation to the uniform fluid distribution. To further contrast the two cases, we assume that customer service times are exponentially distributed in Case 1, while they are uniformly distributed with lower variance in Case 2.

In both cases, we then plot the expected cost incurred by a tagged customer as a function of her arrival time for increasing values of N. We observe that the resulting cost (suitably normalized) converges to a constant as N increases. This convergence is faster in Case 2, where the system is less noisy. This suggests that for reasonable values of N, the asymptotic equilibrium arrival profile may be close to an actual equilibrium arrival profile, although as mentioned earlier, further research is needed to establish this.

Case 1

We set the linear cost coefficients α = 2 and β = 1. The customer service times are exponentially distributed with rate μ = 1. Each arrival selects her arrival time as uniformly distributed in the interval \(N \times \big[-\frac{\beta}{\alpha \mu}, \frac{1}{\mu}\big]\). Customers are served on a first-come first-served basis. We use simulation to estimate the expected waiting time, and hence the expected cost, of a tagged customer that arrives at each of the times \(N \times \big[-\frac{\beta}{\alpha \mu}, 0, \frac{0.5}{\mu}, \frac{0.8}{\mu}, \frac{0.95}{\mu}, \frac{1}{\mu}\big]\). The cost of the customer is normalized by dividing by N. Figure 4 shows the normalized expected cost for the tagged customer as a function of her normalized arrival time (arrival time divided by N) for N = 10, 50, 100, 500, 1,000 and 10,000. Ten thousand independent simulation replications are conducted to estimate the expected waiting time in each configuration. Typically, the 95% confidence width of the resulting estimator is within 0.5% of the value of the estimator. When N = 10,000 and the customer arrives at time \(N \times\frac{0.95}{\mu}\) or at \(N \times \frac{1}{\mu}\), this ratio was below 3%, again for 10,000 replications.

Fig. 4

We consider N single-class customers with α = 2, β = 1, and service times exponentially distributed with rate μ = 1. Customers' arrival times are uniformly distributed over \(N \times \big[-\frac{\beta}{\alpha \mu}, \frac{1}{\mu}\big]\). The graph shows the expected cost of a customer arriving to this queue at different times. Cost and time are normalized by dividing by N.

Note that the normalized expected cost of the tagged customer trivially equals 1 for any arrival time in \(N \times \big[-\frac{\beta}{\alpha \mu}, 0\big]\). As the graph shows, for later arrivals this cost is higher than 1 and increases as the arrival time increases to \(\frac{N}{\mu}\). However, for large N (for instance, N = 1,000) the cost more or less stabilizes at 1.

Intuitively, this can be understood by recalling the well-known Lindley recursion

$$ W_{n+1} = \max (W_n + S_n - I_{n+1}, 0), $$
(27)

where \(W_n\) denotes the waiting time of customer n in a first-come first-served queue, \(S_n\) denotes this customer's service time, and \(I_{n+1}\) denotes the inter-arrival time between customers n and n + 1. In our model, all customers who arrive before time zero wait until time zero, when the system initiates service. Lindley's recursion is then valid for all customers who arrive after time zero.

Note that, if in our simulations, we set

$$ W_{n+1} = W_n + S_n - I_{n+1}, $$
(28)

for all arrivals after time zero, then it is easily seen that the resulting normalized expected cost is 1 for an arrival at any time in \(N \times \big[0, \frac{1}{ \mu}\big]\). However, the expected waiting time (and hence the expected cost) is higher under Eq. 27, because Eq. 27 assigns a larger waiting time than Eq. 28 whenever an arrival finds the queue empty.
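One way to see why the linear recursion yields a normalized cost of exactly 1 is sketched here. It rests on two assumptions that are ours rather than stated in the text: the tagged customer faces N other customers, and the β term is charged on the time at which her service starts. Under Eq. 28 the first customer starts service at time zero and every later service starts when the previous one ends, so a tagged customer with K customers ahead of her starts service at time \(T = S_1 + \dots + S_K\). For a tagged arrival at time t in the support, K is binomial with mean \(N \big(t + \frac{N\beta}{\alpha\mu}\big) \big/ \frac{N(\alpha+\beta)}{\alpha\mu}\) and is independent of the service times, so

$$ E[T] = \frac{E[K]}{\mu} = \frac{\alpha \mu t + N \beta}{(\alpha+\beta)\mu}, \qquad \alpha \big(E[T] - t\big) + \beta E[T] = (\alpha+\beta) E[T] - \alpha t = \frac{N\beta}{\mu}. $$

With β = 1 and μ = 1 the expected cost equals N for every such t, that is, 1 after dividing by N.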

The difference between the two expected costs (one computing the waiting time with Lindley's recursion, the other with the linear recursion) is small when the probability of the queue emptying between time zero and the tagged customer's arrival is small. This probability is clearly small when the tagged customer arrives close to time zero (as many customers are waiting for service at time zero) and increases as her arrival time approaches N/μ. It can easily be shown that for a given ε ∈ (0,1), as N becomes large, the probability of the queue becoming empty in the interval \(\big[0, \frac{N(1-\epsilon)}{\mu}\big]\) goes to zero, and hence the normalized cost stabilizes to 1 monotonically as N increases.

Note that for finite N, under a symmetric equilibrium strategy, the tagged customer must see a constant cost at all times in the support of the other customers' arrival distribution. Figure 4 suggests that, to achieve this, customers must put relatively less weight towards the end of their support than under the asymptotic equilibrium strategy.

Case 2

Here the customers arrive at deterministic, equally spaced times: customer i, for i = 1, 2, ..., N, arrives at

$$ \left (-\frac{N \beta}{\alpha \mu} + \frac{1}{2 \mu}\frac{(\alpha+\beta)}{\alpha} + \frac{(i-1)}{\mu}\frac{(\alpha+\beta)}{\alpha} \right ). $$

All parameter values are as in Case 1. The service times are assumed to be uniformly distributed on [1/2, 3/2] (so their variance equals 1/12, compared with a variance of 1 in Case 1). This may be more realistic in many applications (such as concert or cafeteria queues) where the service times show little variability. Figure 5 shows the normalized expected cost for the tagged customer as a function of her normalized arrival time, as in Case 1. As expected, the convergence to 1 is much faster in this case. Note that for small values of N, the normalized cost may actually be less than 1 when the tagged customer arrives just before the next arrival in the deterministic arrival grid. Also, for N = 10, the normalized cost of the tagged customer decreases between normalized arrival times 0.95 and 1. To understand this, note that the last arrival in the deterministic arrival grid occurs at normalized time 0.925. Proximity to this arrival leads to a longer wait, and hence a higher overall cost, for a tagged customer arriving at normalized time 0.95 than for one arriving at normalized time 1.
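The deterministic grid and the resulting cost of a tagged customer can be sketched as follows. This is an illustrative sketch, not the code behind Figure 5: the function names are hypothetical, the cost is assumed to be α times the time spent in queue plus β times the service start time, and the service-time range [1/2, 3/2] has mean 1/μ only for μ = 1.

```python
import random

def arrival_grid(n, alpha=2.0, beta=1.0, mu=1.0):
    """Arrival times of customers 1..n on the equally spaced grid."""
    spacing = (alpha + beta) / (alpha * mu)
    first = -n * beta / (alpha * mu) + spacing / 2
    return [first + i * spacing for i in range(n)]

def tagged_cost_grid(n, t, alpha=2.0, beta=1.0, mu=1.0, rng=random):
    """One replication: cost of a tagged customer arriving at time t."""
    free_at = 0.0  # the server opens at time zero
    for a in arrival_grid(n, alpha, beta, mu):
        if a >= t:
            break  # later grid arrivals are served after the tagged customer
        # Service times uniform on [1/2, 3/2], as in Case 2 (mean 1).
        free_at = max(free_at, a) + rng.uniform(0.5, 1.5)
    start = max(free_at, t)
    # Assumed cost: alpha * (time in queue) + beta * (service start time).
    return alpha * (start - t) + beta * start
```

For N = 10, `arrival_grid(10)` runs from -4.25 to 9.25, so the last grid arrival sits at normalized time 0.925, as noted above. Averaging `tagged_cost_grid(10, 9.5)` and `tagged_cost_grid(10, 10.0)` over many replications gives a way to compare the costs at normalized times 0.95 and 1.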

Fig. 5

We consider N single-class customers with α = 2, β = 1, and service times uniformly distributed on [1/2, 3/2]. Customers arrive at deterministic, equally spaced times. The graph shows the expected cost of a customer arriving to this queue at different times. Cost and time are normalized by dividing by N.

8 Conclusion

In this paper we considered the queueing problem that may arise in settings such as concerts, movie theaters, cafeterias, DMV offices, Black Friday shopping queues, etc., where a large number of customers may queue up before a facility that opens for service at a particular time. The customers strategically select their arrival time distributions to trade off waiting time in queue against costs due to late arrival. We developed a queueing framework for this problem and identified its fluid limit. We observed that the fluid limit allows a great deal of tractability in analyzing the strategic arrival problem faced by each customer. We identified the unique arrival profile for each customer class in equilibrium, and showed that the price of anarchy equals 2 in the single-class model, while it varies around this value in the multiclass case. We further discussed structural changes in the queueing discipline and simple pricing schemes that can be used to reduce the price of anarchy. We also demonstrated through a simple numerical example that the proposed asymptotic equilibrium arrival profiles may be close to equilibrium in the finite-N queue, for N reasonably large.

As part of future work, we plan to study the equilibrium properties of the fluid model under more general cost functions, and to study the model introduced here under the diffusion limit. Extension to multi-server queueing networks would also be of interest in many applications, particularly communication networks. We hope that this analysis motivates further research in the strategic analysis of queueing systems.