1 Introduction

Stochastic Flow Models (SFMs) capture the dynamic behavior of a large class of hybrid systems (see Cassandras and Lafortune 2009). In addition, they are used as abstractions of Discrete Event Systems (DES), for example when discrete entities accessing resources are treated as flows. The basic building block in a SFM is a queue (buffer) whose fluid content is dependent on incoming and outgoing flows which may be controllable. By connecting such building blocks together, stochastic flow networks can be generated which are encountered in application areas such as manufacturing systems (Armony et al. 2015), water resources (Anderson et al. 2015), chemical processes (Yin et al. 2013), communication networks (Cassandras et al. 2002) and transportation systems (Geng and Cassandras 2015). Figure 1 shows a two-node SFM, in which an on-off switch controls the outgoing flow for each node. When the switch at the output of node 1 is turned on, a “flow burst” is generated to join the downstream node 2. Flow models commonly assume that this flow burst can instantaneously join the downstream queue, thus ignoring potentially significant delays before this can occur. Incorporating such delays through more accurate modeling is challenging but crucial in better evaluating the performance of the underlying system and seeking ways to improve it.

Fig. 1
figure 1

A two-node SFM

Control mechanisms used in SFMs often involve gradient-based methods in which the controller uses estimates of the performance metric sensitivities with respect to controllable parameters so as to adjust the values of these parameters and improve (ideally, optimize) performance. Infinitesimal Perturbation Analysis (IPA) is a method of general applicability to stochastic hybrid systems (see Cassandras et al.2010; Wardi et al. 2010) through which gradients of performance measures can be estimated with respect to several controllable parameters based on directly observable data. The applications of IPA and its advantages have been reported elsewhere (e.g., Cassandras et al. 2010; Fleck et al. 2016) and are summarized here as follows: (i) IPA estimates are shown to be unbiased under very mild conditions (Cassandras et al. 2010). (ii) IPA estimators are robust with respect to the stochastic processes involved. (iii) IPA is event-driven, hence scalable in the number of events in the system, not the (much larger) state space dimensionality. (iv) IPA possesses a decomposability property (Yao and Cassandras 2011), i.e., IPA state derivatives become memoryless after certain events occur. (v) The IPA methodology can be easily implemented on line, allowing us to take advantage of directly observable data.

While IPA has been extensively used in SFMs, the effect of delays between adjacent nodes, as described above, has not been studied to date. Therefore, the first contribution of this paper is to incorporate delays in the flow bursts that are created by on-off switching control (see Fig. 1) into the standard SFM and to develop necessary extensions to IPA for such systems. Extending the original model introduced in Chen and Cassandras (2018), we include the possibility of multiple flow bursts arising between two nodes. While in Chen and Cassandras (2018) the capacity between two nodes was assumed to be infinite, the second contribution of this paper is to allow for a finite capacity and analyze the effect of possible blocking that may arise. The third contribution is to include an application of SFMs with delays and blocking to the Traffic Light Control (TLC) problem in transportation networks.

The rest of the paper is organized as follows. In Section 2, we extend the standard multi-node SFM to include delays. In Section 3 we adapt this model to the TLC problem by explicitly modeling the delay experienced by vehicles moving from one intersection to the next. This allows us to introduce two new cost metrics for congestion that incorporate the effect of delays. In Section 4, we carry out IPA for the TLC problem with delays and in Section 5 we extend our analysis to include blocking effects. In Section 6, we provide simulation examples comparing performance results first between a model considering traffic delays and one which does not, showing that the former achieves improved performance. We then include blocking effects and show how their inclusion in optimizing traffic light settings further improves performance.

2 Stochastic flow models with delays

Consider a two-node SFM as in Fig. 1 and let {αi(t)} and {βi(t)}, i = 1,2, be the incoming flow and outgoing flow processes respectively. We emphasize that these are both treated as stochastic processes. We define x(t) = [x1(t),x2(t)], where \(x_{i}(t)\in \mathbb {R}^{+}\) is the flow content of node i (we assume that all variables are left-continuous). The dynamics of this SFM are

$$ \dot{x}_{i}(t)=\left\{ \begin{array}{l} 0\\\\ \alpha_{i}(t)-{\upbeta}_{i}(t) \end{array} \right. \begin{array}{l} \text{if }\ x_{i}(t)=0,\text{ }\alpha_{i}(t)\leq{\upbeta}_{i}(t)\\ \text{or }x_{i}(t)=c_{i},\text{ }\alpha_{i}(t)\geq{\upbeta}_{i}(t)\\ \text{otherwise} \end{array} $$
(1)

where ci is the content capacity of i. The physical meaning of xi(t) depends on the application context. For example, in water resources (Anderson et al. 2015), xi(t) may be the water volume of a tank i, αi(t) the incoming water rate to tank i and βi(t) the outgoing water rate from tank i. In the TLC problem, xi(t) is defined as the queue length of road i, αi(t) is the incoming traffic flow rate and βi(t) is the outgoing traffic flow rate from road i.

βi(t) is defined as

$$ {\upbeta}_{i}(t)=\left\{ \begin{array}{l} h_{i}(t)\\ 0 \end{array} \right. \begin{array}{l} \text{if }\ G_{i}(t)=1\\ \text{otherwise} \end{array} $$
(2)

in which hi(t) is the instantaneous outgoing flow rate at node i, and Gi(t) ∈{0,1}, i = 1,2 is a switching controller. We also define a clock state variable zi(t) for each switching controller Gi(t):

$$ \begin{array}{@{}rcl@{}} \dot{z}_{i}(t) & =&\left\{ \begin{array}{l} 1\\ 0 \end{array} \right. \begin{array}{l} \text{if }\ \ G_{i}(t)=1\\ \text{otherwise} \end{array}\\ z_{i}(t^{+}) & =&0\text{ \ \ if }G_{i}(t)=1\ \text{ and }\ G_{i}(t^{+} )=0 \end{array} $$
(3)

Here we define zi(t+) to be the value of zi(t) just after a discontinuity occurs at time t. This notation applies to all variables in the rest of the paper which may experience discontinuities when certain events take place. Thus, when G1(t) = 1,t ∈ [t1,t2), \(G_{1}(t_{1}^{-})=0\), a flow burst is created at node 1 (when x1(t1) > 0). In general, several such flow bursts may be created over (t1,t2], depending on the values of α1(t), h1(t),t ∈ (t1,t2]. In SFMs studied to date, the delay incurred by any such flow burst being transferred between nodes has been ignored, assuming instead that it instantaneously joins the queue at node 2. Under this assumption,

$$ \alpha_{2}(t)=\left\{ \begin{array}{l} \alpha_{1}(t)\\ {\upbeta}_{1}(t) \end{array} \right. \begin{array}{l} \text{if }\ \ x_{1}(t)=0,\text{ }\alpha_{1}(t)\leq{\upbeta}_{1}(t)\\ \text{otherwise} \end{array} $$

In what follows, we extend the SFM to include the aforementioned delay which depends on when a flow burst actually joins the downstream queue, an event that we need to carefully specify. While a flow burst is in transit between nodes 1 and 2, let x12(t) be its size, i.e., the flow volume in transit before it joins x2(t). For simplicity, we assume that each flow burst is maintained during this process (i.e., the burst may not be separated into two or more sub-bursts). We will use L to denote the physical distance between nodes 1 and 2.

Predicting the time when the first flow burst actually joins queue 2 is complicated by the fact that x2(t) evolves while this burst is in transit. This is illustrated through the example in Fig. 2 which we will use to describe the evaluation of this time through a sequence of events denoted by {J1,…,JK} with associated event times {σ1,…,σK}. We define J0 to be the event when the flow burst leaves node 1, i.e., the occurrence of a switch from G1(t) = 0 to Gi(t) = 1, and let σ0 be its associated occurrence time.

Fig. 2
figure 2

Typical evolution of a flow burst in transit

Therefore, an estimate of the time when the flow burst joins the tail of queue 2 is given by

$$ \sigma_{1}=\sigma_{0}+[L-x_{2}(\sigma_{0})]/v(\sigma_{0}) $$

where v(σ0) is the “speed” of the flow burst which we assume to be constant and, for notational simplicity, set it to v(σ0) = 1 (it will become clear in the sequel that this can be relaxed and treated as random in the context of IPA). Thus, we define J1 to be the event at time σ1 when the flow burst covers the distance Lx2(σ0). In general, however, \(x_{2}(\sigma _{1})\leq {x} _{2}(\sigma _{0})\equiv \bar {x}_{2}(\sigma _{1})\), i.e., the estimate\(\bar {x}_{2}(\sigma _{1})\) of x2(σ1) is based on the assumption that x2(t) remains unchanged over (σ0,σ1). This is illustrated in the example of Fig. 2, where \(\dot {x} _{2}(t)=-{\upbeta }_{2}(t)<0\) for some t ∈ (σ0,σ1). Therefore, unless x2(σ1) = x2(σ0), we repeat at t = σ1 the same process of estimating the time of the next opportunity that the flow burst might join queue 2 at time σ2 to cover the distance \(\bar {x}_{2}(\sigma _{1})-x_{2}(\sigma _{1})\) and define this potential joining event as J2. This process continues until event JK occurs at time σK, the last event in the sequence {J1,…,JK} when \(\bar {x}_{2}(\sigma _{K})=x_{2}(\sigma _{K})\).

Note that JK may occur either when \((i) \bar {x}_{2}(\sigma _{K} )=x_{2}(\sigma _{K})>0\), in which case the estimate \(\bar {x}_{2}(\sigma _{K})\) incurs no error because x2(σK) = x2(σK− 1), i.e., the queue length at node 2 remained unchanged because β2(t) = 0 for t ∈ [σK− 1,σK], or \((ii) \bar {x}_{2}(\sigma _{K})=x_{2}(\sigma _{K})=0\), in which case the flow burst joins node 2 while this queue is empty. Since in practice the queues and flow bursts may consist of discrete entities (e.g., vehicles), we define event JK as occurring when \(\bar {x}_{2}(t)-x_{2}(t)\leq \epsilon \) for some predefined fixed small 𝜖, i.e., a flow burst joins the downstream queue whenever it is sufficiently close to it. The following lemma asserts that the event time sequence {σ1,…,σK} is finite.

Lemma 1

Under the assumption that JK is defined through \(\bar {x}_{2}(\sigma _{K})-x_{2}(\sigma _{K})\leq \epsilon \), the number of events K in {J1,…,JK} is bounded. Moreover, its event time σK is also bounded.

Proof

Observe that x2(t) ≤ L, since the content of queue 2 is limited by the physical distance L. In addition, \(\bar {x}_{2} (t)-x_{2}(t)>\epsilon \) prior to event JK. It follows that KL/𝜖. Moreover, in the worst case, a flow burst travels the finite distance L to find x2(σK) = 0, therefore, σKσ0 + Lx2(σ0). □

We now formalize the dynamics of the flow transit process described above. First, the dynamics of \(\bar {x}_{2}(t)\), the estimated queue length when an event Jk occurs, are given by

$$ \begin{array}{@{}rcl@{}} \dot{\bar{x}}_{2}(t) & =&0\\ \bar{x}_{2}(t^{+}) & =&x_{2}(t)\text{ \ if }t=\sigma_{k},\text{}k=1,...,K. \end{array} $$
(4)

with \(\bar {x}_{2}(\sigma _{1})=L-x_{2}(\sigma _{0})\) and σ0 defined above as the occurrence time of a switch from G1(t) = 0 to Gi(t) = 1. The dynamics of x12(t) are given by

$$ \begin{array}{@{}rcl@{}} \dot{x}_{12}(t) & =&\left\{ \begin{array}{l} 0\\ \alpha_{1}(t)\\ h_{1}(t) \end{array} \right. \begin{array}{l} \text{if }\ G_{1}(t)=0\\ \text{if }\ x_{1}(t)=0,\text{ }\alpha_{1}(t)\leq{\upbeta}_{1}(t)\\ \text{otherwise} \end{array} \\ x_{12}(t^{+}) & =&0\text{ \ \ if }t=\sigma_{K} \end{array} $$
(5)

The dynamics of x2(t) are no longer described by Eq. 1, since an increase in the queue content is only updated when a flow burst joins queue 2 at time σK. Instead, they are given by

$$ \begin{array}{@{}rcl@{}} \dot{x}_{2}(t) & =&\left\{ \begin{array}{l} -{\upbeta}_{i}(t)\\ 0 \end{array} \right. \begin{array}{l} \text{if }\ x_{2}(t)>0\ \text{ and }\ G_{2}(t)=1\\ \text{otherwise} \end{array}\\ x_{2}(\sigma_{K}^{+}) & =&x_{2}(\sigma_{K})+x_{12}(\sigma_{K}) \end{array} $$
(6)

Note that in Eqs. 4 and 5 the values of event times {σ1,…,σK} are still unspecified. In order to provide this specification, we define

$$ \delta_{12}(t)=\bar{x}_{2}(t)-x_{2}(t) $$

to be the distance between the head of the flow burst and the tail of x2(t). Then, observe that

$$ \sigma_{k}=\sigma_{k-1}+\tau(\delta_{12}(\sigma_{k-1})) $$

where τ(r) is the time to complete a distance r ∈ (0,L] and k = 1,...,K − 1. Similar to the clock zi(t) in Eq. 3 that dictates the timing of the controlled switching process, we associate a clock z12(t) to the timing of events in {J1,…,JK} as follows:

$$ \begin{array}{@{}rcl@{}} \dot{z}_{12}(t) & =&\left\{ \begin{array}{l} 1\\ 0 \end{array} \right. \begin{array}{l} \text{if }\ {\delta_{12}(t)>0}\\ \text{otherwise} \end{array} \\ z_{12}(t^{+}) & =&0\text{ \ \ if }z_{12}(t)=\tau({\delta_{12}(t)}) \end{array} $$
(7)

with an initial condition z12(σ0) = 0 and

$$ \begin{array}{@{}rcl@{}} \dot{\delta}_{12}(t) & =&0\\ \delta_{12}(t^{+}) & =&\left\{ \begin{array}{l} L-x_{2}(t)\\ \bar{x}_{2}(t)-x_{2}(t) \end{array}\right. \begin{array}{l} \text{if }\ t=\sigma_{0}\\ \text{if }\ t=\sigma_{k},k=1,...,K. \end{array} \end{array} $$
(8)

Note that δ12(t) is piecewise constant and updated only at the times when events J0,J1,…,JK take place ending with δ12(t+) = 0 when event JK occurs, i.e., the flow burst joins queue 2. The values of τ(δ12(t)) in Eq. 7 are given by the time required for the flow burst to travel a distance \({\delta _{12}(t)=}\bar {x}_{2}(t)-x_{2}(t)\) with speed v(σ0) which we assumed earlier to be constant and set to v(σ0) = 1. Thus, τ(δ12(t)) = δ12(t). Finally, note that in this modeling framework, we assume that x2(t) is observable at event times σ0,σ1,…,σK when events J0,J1,…,JK take place.

As a final step, we generalize the model in Chen and Cassandras (2018) to include multiple flow bursts that may be generated in an interval (t1,t2] such that G1(t) = 1 for t ∈ [t1,t2), \(G_{1}(t_{1}^{-})=0\). Thus, we denote by \({J_{k}^{n}}\) the k th event for the n th flow burst to (potentially) join queue 2 and extend δ12(t) to \(\delta _{12}^{n}(t)\), σk to \({\sigma _{k}^{n}}\), and x12(t) to \(x_{12}^{n}(t)\), n = 1,2,… Also, we define Ji,j as an event such that the i th flow burst merges with the j th burst at time τi,j. For simplicity, we use ym(t) to represent \(x_{12}^{m}(t)\). We then have:

$$ \begin{array}{@{}rcl@{}} \dot{x}_{12}^{n}(t) & =&\left\{ \begin{array}{l} \alpha_{1}(t)\\ h_{1}(t)\\ \\ 0 \end{array} \right. \begin{array}{l} \text{if }\ n=1,x_{1}(t)=0,\text{ }\alpha_{1}(t)<{\upbeta}_{1}(t)\\ \text{if }\ n=1,G_{1}(t)=1\\ x_{1}(t)=0,\text{ }\alpha_{1}(t)\geq{\upbeta}_{1}(t)\text{ or }x_{1}(t)>0 \\ \text{otherwise} \end{array} \end{array} $$
(9)
$$ \begin{array}{@{}rcl@{}} x_{12}^{n}(t^{+}) & =&0\text{ \ \ if }t={\sigma_{K}^{n}}\text{ or } t=\tau_{n,n-1}\\ x_{12}^{n}(t^{+}) & =&x_{12}^{n}(t)+x_{12}^{n-1}(t)\text{ \ \ if } t=\tau_{n+1,n} \end{array} $$
(10)
$$ \begin{array}{@{}rcl@{}} \dot{\delta}_{12}^{n}(t) & =&0\\ \delta_{12}^{n}(t^{+}) & =&\left\{ \begin{array}{l} L-x_{2}(t)\\ \bar{x}_{2}^{n}(t)-x_{2}(t)\\ \delta_{12}^{n}(t)-y^{m}(t) \end{array} \right. \begin{array}{l} \text{if }\ t={\sigma_{0}^{n}}\\ \text{if }\ t={\sigma_{k}^{n}},k>0\\ \text{if }\ t={\sigma_{K}^{m}},m=1,\ldots,n-1 \end{array} \end{array} $$
(11)
$$ \begin{array}{@{}rcl@{}} \dot{\bar{x}}_{2}^{n}(t) & =&0\\ \bar{x}_{2}^{n}(t^{+}) & =&\left\{ \begin{array}{l} x_{2}(t)\\ \bar{x}_{2}^{n}(t)+y^{m}(t) \end{array} \right. \begin{array}{l} \text{if }\ t={\sigma_{k}^{n}},k\geq0\\ \text{if }\ t={\sigma_{K}^{m}},m=1,\ldots,n-1 \end{array} \end{array} $$
(12)

with the obvious generalizations of Eqs. 4-8.

The generalized SFM with delays is shown in Fig. 3. We define a series of servers dn, \(n\in \{j\in \mathbb {Z}:j=1,\ldots ,N\}\) to describe the flow transit delay between SFM where yn(t) is the content of dn. Here, N is the total number of servers required depending on a specific application. For example, in the two-intersection traffic system discussed in the next section, we set N = ⌈L/Lv⌉ where L is the physical distance between intersections and Lv is the length of a vehicle. When a new flow burst leaves server 1, the controlled switching process checks whether y1(t) = 0 to initiate a flow burst. If y1(t) > 0, it checks yj(t) for j ≥ 2 until yj(t) = 0 for some j. For example, in Fig. 3, if servers d1 and d2 are non-empty (dark color), and d3 is empty (light color), the new flow burst will join server d3 until y1(t) = 0. The first flow burst will leave server d1 when event \({J_{K}^{1}}\) occurs and joins x2(t). The flow burst in server dn will leave when either one of two events occurs, defined as follows: (1) Jn,n− 1 occurs when the n th flow burst joins the (n − 1)th burst. (2) \(E_{d_{n-1}}\) occurs when yn− 1(t) = 0.

Fig. 3
figure 3

Two-node SFM with delay

SFM Events

The hybrid system with dynamics given by Eqs. 1-8 defines the SFM with transit delays. To complete the model, we define next the event set associated with all discontinuous state transitions in Eqs. 1-8. As in prior work using SFMs, we observe that the sample path of any queue content process in our model can be partitioned into Non-Empty Periods (NEPs) when xi(t) > 0, and Empty Periods (EPs) when xi(t) = 0. Let us define the start of a NEP at queue i as event Si (S12 for queue 12) and the end of a NEP at queue i as event Ei (E12 for queue 12). In Eq. 1, observe that S1 is an event that can be induced by either an event such that α1(t) −β2(t) switches from ≤ 0 to > 0 or by an event which switches the value of β1(t); moreover, in Eq. 2, the value of β1(t) switches when an event occurs such that G1(t) changes between 0 and 1. In Eq. 6, S2 may also be induced by event Jk if it occurs when x2(t) = 0. Finally, in Eq. 5, S12 is induced by the same events that induce S2, while E12 is induced by JK since that causes the end of the flow burst that created x12(t) > 0. To sum up, assuming for now that \(c_{i}=\infty \) in Eq. 1, there are five events that can affect any of the processes {x1(t)}, {x2(t)} and {x12(t)}:

  1. 1.

    Ei: xi(t) switches from > 0 to = 0, thus ending a NEP at queue i.

  2. 2.

    Γi: αi(t) −βi(t) switches from ≤ 0 to > 0.

  3. 3.

    Jk: z12(t) = τ(δ12(t)) representing a potential joining of the flow burst x12(t) with x2(t) if δ12(t+) > 0, or the actual joining if δ12(t+) = 0.

  4. 4.

    C2Oi: Gi(t) switches from 1 (Closed switch) to 0 (Open switch).

  5. 5.

    O2Ci: Gi(t) switches from 0 (Open switch) to 1 (Closed switch).

We can now identify the event set that affects the dynamics of the three queue content processes:

$$ \begin{array}{@{}rcl@{}} {\Phi}_{1} &=&\{S_{1},E_{1},O2C_{1},C2O_{1}\}\\ {\Phi}_{2} &=&\{S_{2},E_{2},O2C_{2},C2O_{2},J_{k}\},\text{ }\\ {\Phi}_{12} &=&\{S_{12},E_{12},E_{1},C2O_{1},J_{k}\} \end{array} $$
(13)

Note that this SFM model can be extended to any network of queues with possible delays by identifying queues with dynamics of type (1) or (6) or (5). Finally, to facilitate readability in what follows, we summarize in Table 1 all frequently used notation.

Table 1 List of frequently used notation

3 Multi-intersection traffic light control with delays

An application of the SFM with delays arises in the Traffic Light Control (TLC) problem in transportation networks, which consists of adjusting green and red signal settings in order to control the traffic flow through an intersection and, more generally, through a set of intersections and traffic lights in an urban roadway network. The ultimate objective is to minimize congestion in an area consisting of multiple intersections. Many methods have been proposed to solve the TLC problem, including expert systems, genetic algorithms, reinforcement learning and several optimization techniques; a more detailed review of such methods may be found in Fleck et al. (2016). Perturbation analysis methods were used in Head et al. (1996) and Fu and Howell (2003). IPA was used in Panayiotou et al. (2005) and Geng and Cassandras (2012) for a single intersection and extended to multiple intersections in Geng and Cassandras (2015) and to quasi-dynamic control schemes in Fleck et al. (2016). However, all this work to date has assumed that vehicles moving from one intersection to the next experience no delay. In this section, we formulate the TLC problem by including delays as in Section 2 and derive an IPA-based controller to optimize selected performance metrics (cost functions). By including delays, we will see that we can define new metrics which capture “congestion” in traffic systems much more accurately.

As in Section 2, let {αi(t)} and {βi(t)}, i = 1,…,4, be the incoming and outgoing flow processesrespectively at all four roads shown in Fig. 4, where we now interpret αi(t) as the random instantaneous vehicle arrival rate at time t. We define the controllable parameters 𝜃i to be the durations of the GREEN light for road i = 1,…,4. Thus, the state vector is x(𝜃,t) = [x1(𝜃,t),x2(𝜃,t),x3(𝜃,t),x4(𝜃,t),x12(𝜃,t)] where xi(𝜃,t) is the content of queue i and x12(𝜃,t) is the content of the road between intersections I1 and I2. To maintain notational simplicity, we will assume in our analysis that (A1) There is no more than one traffic burst in queue 12 at any one time, (A2) The speed of a traffic burst v1(t) between intersections is constant, and (A3) There is no traffic coupling between I1 and I2. Assumptions (A1) and (A2) simplify the analysis and can be easily relaxed since our model can deal with multiple flow bursts as shown in Section 2. Assumption (A3) means that the distance between I2 and I1 is sufficiently large and is also made to simplify the model; this assumption will be relaxed in Section 5.

Fig. 4
figure 4

Two traffic intersections

We define clock state variables zi(t), i = 1,…,4, which are associated with the GREEN light cycle for queue i based on Eq. 3 where the controller Gi(t) is now the traffic light state, i.e., Gi(t) = 0 means that the traffic light in road i is RED, otherwise, it is GREEN. Accordingly, the departure rates and the queue content dynamics xi(t), i = 1,…,4, are given by Eqs. 1-6.

In order to provide the dynamics of x2(t) and x12(t), we will make use of our analysis in Section 2. In particular, let σ0 be the time when a positive traffic flow is generated from queue 1 and enters queue 12, i.e., the light turns from RED to GREEN for road 1 and x1(σ0) > 0. Invoking (8), we define δ12(t) to be the distance between the head of the “transit queue” 12 and the tail of queue 2. Thus, \(\delta _{12}(\sigma _{0}^{+})=L-x_{2}(\sigma _{0})\). We also associate a clock to this queue, denoted by z12(t), which is defined by Eq. 7 and initialized at z12(σ0) = 0. Finally, τ(δ12(t)) in Eq. 7 in the TLC context is given by τ(δ12(t)) = δ12(t)/v1.

Recall that a Jk event represents a potential joining of the flow burst from I1 with queue 2. The actual joining event occurs when δ12(t+) = 0 from its initial value δ12(σ0+) = Lx2(σ0). Adapting (8) and (4) to the TLC setting we get the dynamics of δ12 and \(\bar {x}_{2}(t)\), while the dynamics of x2(t) and x12(t) are given by Eqs. 6 and 5 respectively.

SFM Events

We apply the event set defined in Section 2 where we use G2Ri (traffic light i changes from GREEN to RED) to replace C2Oi and R2Gi to replace O2Ci. Thus, the set Φ1 in Eq. 13 is used as Φi for i = 1,3,4 for the corresponding queues in Fig. 4. Figure 5 shows the hybrid automaton model for queue 2 in terms of its six possible modes depending on x2(t), G2(t) and δ12(t). Similar models apply to the remaining processes, all of which are generally interdependent (e.g., in Fig. 5, some reset conditions involve x12(t)).

Fig. 5
figure 5

Stochastic Hybrid Automaton model for x2(t)

Cost Functions

The objective of the TLC problem is to control the green cycle parameters 𝜃i, i = 1,…,4, so as to minimize traffic congestion in the region covered by the two intersections in Fig. 4. In Geng and Cassandras (2012) and Fleck et al. (2016), the average total weighted queue length over a fixed time interval [0,T] is used to capture congestion:

$$ F(\theta;x(0),z(0),T)=\frac{1}{T}\sum\limits_{i=1}^{5}{{\int}_{0}^{T}}w_{i}x_{i} (\theta,t)dt. $$
(14)

where wi is the weight associated with queue i. For convenience, we will refer to Eq. 14 as the average queue cost function; with a slight abuse of notation we have re-indexed x12(t) as x5(t). However, this may not be an adequate measure of “congestion”. For instance, it is possible that the average queue lengths over [0,T] are relatively small, while reaching large values over small intervals (peak periods during a typical day). Thus, instead of restricting ourselves to Eq. 14, we define next two new cost functions.

1.Average weightedPth power of the queue lengths over a fixed interval [0,T), where P > 1. The sample function for this metric is

$$ F(\theta;x(0),z(0),T)=\frac{1}{T}\sum\limits_{i=1}^{5}{{\int}_{0}^{T}}w_{i}x_{i} ^{P}(\theta,t)dt. $$

Observing that xi(𝜃,t) = 0 during an EP of queue i, we can rewrite this as

$$ F(\theta;x(0),z(0),T)=\frac{1}{T}\sum\limits_{i=1}^{5}\sum\limits_{m=1}^{M_{i}}{\int}_{\xi_{i,m}}^{\eta_{i,m}}w_{i}{x_{i}^{P}}(\theta,t)dt, $$
(15)

in which Mi is the total number of NEPs of queue i over a time interval [0,T] and ξi,m, ηi,m are the occurrence times of the m th Si event and Ei event respectively. We also define the cost incurred within the m th NEP of queue i as

$$ F_{i,m}(\theta)={\int}_{\xi_{i,m}}^{\eta_{i,m}}w_{i}{x_{i}^{P}}(\theta,t)dt. $$
(16)

Clearly, when P = 1, Eq. 15 is reduced to Eq. 14. When P > 1, Eq. 15 amplifies the presence of intervals where queue lengths are large. Therefore, minimizing Eq. 15 decreases the probability that a road develops a large queue length. We will refer to this metric Eq. 15 as the power cost function.

2.Averageweighted fraction of time that queue lengths exceed given thresholds over a fixed interval [0,T]. The sample function for this metric is

$$ \begin{array}{@{}rcl@{}} F(\theta;x(0),z(0),T) & =&\frac{1}{T}\sum\limits_{i=1}^{5}{{\int}_{0}^{T}} w_{i}\mathbf{1}[x_{i}(\theta,t)>\zeta_{i}]dt\\ & =&\frac{1}{T}\sum\limits_{i=1}^{5}{{\int}_{0}^{T}}w_{i}r_{i}(\theta,t)dt \end{array} $$
(17)

where ζi is a given threshold and ri(𝜃,t) = 1[xi(𝜃,t) > ζi]. This necessitates the definition of two additional events: Zi is the event such that xi(𝜃,t) = ζi, xi(𝜃,t) < ζi (i.e., the queue content reaches the threshold from below) and \(\bar {Z}_{i}\) is the event such that xi(𝜃,t) < ζi, xi(𝜃,t) = ζi. Observe that \(\dot {r} _{i}(\theta ,t)=0\) with a reset condition ri(𝜃,t+) = 1 if xi(𝜃,t) < ζi, xi(𝜃,t+) = ζi and ri(𝜃,t+) = 0 if xi(𝜃,t) = ζi, xi(𝜃,t+) < ζi. Finally, we use Fi,m(𝜃) as in Eq. 16, for the cost associated with the m th NEP at queue i:

$$ F_{i,m}(\theta)={\int}_{\gamma_{i,m}(\theta)}^{\psi_{i,m}(\theta)}w_{i} r_{i}(\theta,t)dt. $$
(18)

where γi,m, ψi,m are the start and end respectively of an interval such that ri(𝜃,t) = 1.

Optimization

Our purpose is to minimize the cost functions defined in Eqs. 1415 and 17. We define the overall cost function as follows:

$$ H(\theta;x(0),z(0),T)=E[F(\theta;x(0),z(0),T)], $$

in which F(𝜃;x(0),z(0),T) is a sample cost function of the form Eqs. 14 and 15 or Eq. 17. Clearly, we cannot derive a closed-form expression for the expectation above. However, we can estimate the gradient ∇H(𝜃) through the sample gradient ∇F(𝜃) based on IPA, which has been shown to be unbiased under mild technical conditions (Proposition 1 in Cassandras et al. 2010). We emphasize that no explicit knowledge ofαi(t)andhi(t)is necessary to estimate ∇H(𝜃). The IPA estimators derived in the next section only need estimates of αi(τk) and hi(τk) at certain event times τk. Using ∇F(𝜃), we can use a simple gradient-descent optimization algorithm to minimize the associated cost metric through the iterative scheme

$$ \theta_{j,k+1}=\theta_{j,k}-\mu_{k}Q_{j,k}(\theta_{k},x(0),T,\omega_{k}), $$

in which Qj,k(𝜃k,x(0),T,ωk) is an estimator of dH/d𝜃j (in our case, dF/d𝜃j) in sample path ωk and μk is the step size at the k th iteration selected through an appropriate decreasing sequence to guarantee convergence (Fleck et al. 2016).

In the next section, we use the IPA methodology to obtain dF/d𝜃j through the state derivatives \(\frac {\partial x_{i}(\theta ,t)}{\partial \theta _{j}}\).

4 Infinitesimal Perturbation Analysis (IPA)

We briefly review the IPA framework for general stochastic hybrid systems as presented in Cassandras et al. (2010). Let {τk(𝜃)}, k = 1,…,K, denote the occurrence times of all events in the state trajectory of a hybrid system with dynamics \(\dot {x}\ =\ f_{k}(x,\theta ,t)\) over an interval [τk(𝜃),τk+ 1(𝜃)), where 𝜃 ∈Θ is some parameter vector and Θ is a given compact, convex set. For convenience, we set τ0 = 0 and τK+ 1 = T. We use the Jacobian matrix notation:

$$ x^{\prime}(t)\equiv\frac{\partial x(\theta,t)}{\partial\theta},\text{ \ \ \ \ }\tau_{k}^{\prime}\equiv\frac{\partial\tau_{k}(\theta)}{\partial \theta} $$

for all state and event time derivatives. It is shown in Cassandras et al. (2010) that

$$ \frac{d}{dt}x^{\prime}(t)=\frac{\partial f_{k}(t)}{\partial x}x^{\prime }(t)+\frac{\partial f_{k}(t)}{\partial\theta}, $$
(19)

for t ∈ [τk,τk+ 1) with boundary condition:

$$ x^{\prime}(\tau_{k}^{+})=x^{\prime}(\tau_{k}^{-})+[f_{k-1}(\tau_{k}^{-} )-f_{k}(\tau_{k}^{+})]\tau_{k}^{\prime} $$
(20)

for k = 1,...,K. In order to complete the evaluation of \(x^{\prime }(\tau _{k}^{+})\) in Eq. 20, we need to determine \(\tau _{k}^{\prime }\). If the event at τk is exogenous (i.e., independent of 𝜃), then \(\tau _{k}^{\prime }=0\). However, if the event is endogenous, there exists a continuously differentiable function \(g_{k}:\mathbb {R}^{n} \times {\Theta }\rightarrow \mathbb {R}\) such that \(\tau _{k}\ =\ \min \limits \{t>\tau _{k-1}\ :\ g_{k}\left (x\left (\theta ,t\right ) ,\theta \right ) =0\}\) and, as long as \(\frac {\partial g_{k}}{\partial x}f_{k}(\tau _{k}^{-})\neq 0\),

$$ \tau_{k}^{\prime}=-\left[ \frac{\partial g_{k}}{\partial x}f_{k}(\tau_{k} ^{-})\right]^{-1}\left[ \frac{\partial g_{k}}{\partial\theta} +\frac{\partial g_{k}}{\partial x}x^{\prime}(\tau_{k}^{-})\right] $$
(21)

In our TLC setting, we will use the notation

$$ {x}_{i,j}^{^{\prime}}(t)=\frac{\partial x_{i}(\theta,t)}{\partial\theta_{j} },\text{ \ }{z}_{i,j}^{^{\prime}}(t)=\frac{\partial z_{i}(\theta,t)} {\partial\theta_{j}},\text{ \ }{\tau}_{k,j}^{^{\prime}}(t)=\frac{\partial \tau_{k}(\theta)}{\partial\theta_{j}} $$

We also note that in Eqs. 1 and 5, \(\frac {\partial f_{k}(t)}{\partial \theta }=\frac {\partial f_{k}(t)}{\partial x}=0\) and Eq. 19 reduces to

$$ x_{i,j}^{^{\prime}}(t)=x_{i,j}^{^{\prime}}(\tau_{k}^{+}),\text{ \ \ \ } t\in(\tau_{k},\tau_{k+1}] $$
(22)

4.1 State and event time derivatives

We will now apply the IPA (20)-(22) to our TLC setting on an event by event basis for each of the events sets Φi, i = 1,…,4, and Φ12. In all cases, τk denotes the associated event time.

4.1.1 IPA for Event Set \({\Phi }_{i}=\{S_{i},E_{i} ,R2G_{i},G2R_{i}\}\cup \{Z_{i},\bar {Z}_{i}\}\), i = 1,3,4

IPA for these three processes for each of the events in the first set above is identical to that in Geng and Cassandras (2012). Thus, we simply summarize the results here.

  • (1)EventEi: \(x_{i,j}^{^{\prime }}(\tau _{k}^{+})=0\).

  • (2)EventG2Ri: Let ρk be the time of the last R2Gi event before G2Ri occurs. Then, \(\tau _{k,j}^{^{\prime }}=\mathbf {1}[j=i]+\rho _{k,j}^{^{\prime }}\) and

    $$ {x}_{i,j}^{^{\prime}}(\tau_{k}^{+})=\left\{ \begin{array}{l} {x}_{i,j}^{^{\prime}}(\tau_{k})-\alpha_{i}(\tau_{k})\tau_{k,j}^{^{\prime}}\\ {x}_{i,j}^{^{\prime}}(\tau_{k})-h_{i}(\tau_{k})\tau_{k,j}^{^{\prime}} \end{array} \right. \begin{array}{l} \text{if }\ x_{i}(\tau_{k})=0,\text{ }\alpha_{i}(\tau_{k})\leq{\upbeta}_{i}(\tau_{k})\\ \text{otherwise} \end{array} $$
    (23)
  • (3)EventR2Gi: Let ρk be the time of this event and τk be the time of the last G2Ri event before R2Gi occurs. We will use the notation \(\hat {\imath }\) to denote the index of a road perpendicular to i (e.g., \(\hat {1}=3\), \(\hat {2}=4\) in Fig. 4). Then, \(\rho _{k,j}^{^{\prime }}=\mathbf {1} [j=\hat {\imath }]+\tau _{k,j}^{^{\prime }}\) and

    $$ {x}_{i,j}^{^{\prime}}(\rho_{k}^{+})=\left\{ \begin{array}{l} {x}_{i,j}^{^{\prime}}(\rho_{k})+\alpha_{i}(\rho_{k})\rho_{k,j}^{^{\prime}}\\ {x}_{i,j}^{^{\prime}}(\rho_{k})+h_{i}(\rho_{k})\rho_{k,j}^{^{\prime}} \end{array} \right. \begin{array}{l} \text{if }\ x_{i}(\rho_{k})=0,\text{ }\alpha_{i}(\rho_{k})\leq{\upbeta}_{i}(\rho_{k})\\ \text{otherwise} \end{array} $$
    (24)
  • (4)EventSi: If Si is induced by G2Ri, then \({x}_{i,j}^{^{\prime }}(\tau _{k}^{+})={x}_{i,j}^{^{\prime }}(\tau _{k} )-\alpha _{i}(\tau _{k})\tau _{k,j}^{^{\prime }}\). If Si is an exogenous event triggered by Γi, then \({x}_{12,j}^{^{\prime }}(\tau _{k} ^{+})={x}_{12,j}^{^{\prime }}(\tau _{k})\).

For the two new events \(\{Z_{i},\bar {Z}_{i}\}\), we have:

  • (5)EventZi: This is an endogenous event which occurs when gk(x(𝜃,t),𝜃) = xi(τk) − ζi = 0. Applying (21), we have

    $$ {\tau}_{k,j}^{^{\prime}}=\left\{ \begin{array}{l} -{x}_{i,j}^{^{\prime}}(\tau_{k})/\alpha_{i}(\tau_{k})\\ -{x}_{i,j}^{^{\prime}}(\tau_{k})/[\alpha_{i}(\tau_{k})-h_{i}(\tau_{k})] \end{array} \right. \begin{array}{l} \text{if }\ G_{i}(\tau_{k})=0\\ \text{if }\ G_{i}(\tau_{k})=1 \end{array} $$
    (25)

Moreover, based on the definition ri(t) = 1[xi(t) > ζi] in Section 3, \(r_{i}(\tau _{k}^{+})=1\), which implies that \(r_{i,j} ^{^{\prime }}(\tau _{k}^{+})+\dot {r}(\tau _{k}^{+})\tau _{k,j}^{+}=0\). Since \(\dot {r}_{i}(\tau _{k}^{+})=0\), we get \(r_{i,j}^{^{\prime }}(\tau _{k}^{+})=0\).

  • (6)Event\(\bar {Z}_{i}\): Similar to the previous case, gk(x(𝜃,t),𝜃) = xi(τk) − ζi = 0 and applying (21) gives

    $$ {\tau}_{k,j}^{^{\prime}}=-{x}_{i,j}^{^{\prime}}(\tau_{k})/(\alpha_{i}(\tau_{k})-h_{i}((\tau_{k}))) $$
    (26)

In this case, \(r_{i}(\tau _{k}^{+})\equiv 0\), therefore, \(r_{i,j}^{^{\prime } }(\tau _{k}^{+})+\dot {r}_{i}(\tau _{k}^{+})\tau _{k,j}^{+}=0\) and, since \(\dot {r}_{i}(\tau _{k}^{+})=0\), we get \(r_{i,j}^{^{\prime }}(\tau _{k}^{+})=0\).

4.1.2 IPA for Event Set \({\Phi }_{2}=\{S_{2},E_{2} ,R2G_{2},G2R_{2},J_{k}\}\cup \{Z_{2},\bar {Z}_{2}\}\)

Applying IPA for this set and for Φ12 leads to new derivative estimators as detailed next.

  • (1)EventE2: This is an endogenous event ending an EP that occurs when gk(x(𝜃,t),𝜃) = x2(t) = 0 at t = τk. Applying (21) and using Eq. 6, we have \(\tau _{k,j}^{^{\prime }}={x_{2,j}^{^{\prime }}(\tau _{k}^{-})}/{h_{2}(\tau _{k}^{-})}\). It then follows from Eq. 20 that \(x_{2,j}^{^{\prime } }(\tau _{k}^{+})={x_{2,j}^{^{\prime }}(\tau _{k}^{-})-h_{2}(\tau _{k}^{-})} \tau _{k,j}^{^{\prime }}=0\).

  • (2)EventS2: In view of the reset condition in Eq. 6, this event is induced by Jk provided δ12(t+) = 0. As described in Section 2, a sequence of Jk events is initiated when a flow burst is generated at node 1 with associated event times {σ0,σ1,…,σK}. Event S2 is induced by the last occurrence of a Jk event at time σK. Thus, our goal here is to evaluate the IPA derivative \({x}_{2,j}^{^{\prime }}(\sigma _{K}^{+} )\). At first sight, it would appear that this requires the complete sequence \(\{{x}_{2,j}^{^{\prime }}(\sigma _{0}^{+}),\ldots ,{x}_{2,j}^{^{\prime } } (\sigma _{K-1}^{+})\}\) along with event time derivatives \(\{\sigma _{0,j}^{^{\prime }},\ldots ,\sigma _{K-1,j}^{^{\prime }}\}\) from which \({x} _{2,j}^{^{\prime }}(\sigma _{K}^{+})\) can be inferred. However, the following lemma shows that the only information needed from the full sequence of Jk events is \(\sigma _{0}^{^{\prime }}\).

Lemma 2

Let σk, k = 0,1,…,K be the occurrence time of event Jk for a flow burst initiated at σ0. Then,

$$ \sigma_{k,j}^{^{\prime}}=\frac{-1}{{v_{1}}}[{x_{2,j}^{^{\prime}}(\sigma_{k-1})+\dot{x}_{2}(\sigma{_{k-1}})\sigma_{k-1,j}^{^{\prime}}}]+\sigma_{0,j}^{^{\prime}} $$

Proof

Event Jk at t = σk is endogenous and occurs when gk(x(𝜃,σk),𝜃) = z12(σk) − δ12(σk)/v1 = 0. Applying (21) and using Eqs. 7 and 8, we get \(\sigma _{k,j}^{^{\prime }}=\delta _{12,j}^{^{\prime }}(\sigma _{k})/v_{1}-z_{12,j}^{^{\prime }}(\sigma _{k})\). Using Eq. 22, we have \(\delta _{12,j}^{^{\prime }}(\sigma _{k})=\delta _{12,j}^{^{\prime }}(\sigma _{k-1}^{+})\) and it follows that

$$ \sigma_{k,j}^{^{\prime}}=\delta_{12,j}^{^{\prime}}(\sigma_{k-1}^{+} )/v_{1}-z_{12,j}^{^{\prime}}(\sigma_{k}) $$
(27)

Again applying (22) gives \(z_{12,j}^{^{\prime }}(\sigma _{k})=z_{12,j}^{^{\prime }}(\sigma _{k-1}^{+})\). From Eq. 20, in view of Eq. 7, we get, for k = 1, \(z_{12,j}^{^{\prime }}(\sigma _{0}^{+})=-\sigma _{0,j}^{^{\prime }}\). The reset condition in Eq. 8 implies that \(\delta _{12}(\sigma _{0}^{+} )=L-x_{2}(\sigma _{0})\), hence \(\delta _{12,j}^{^{\prime }}(\sigma _{0} ^{+})=-{x_{2,j}^{^{\prime }}(\sigma _{0})-\dot {x}_{2}(\sigma _{0})\sigma _{0,j}^{^{\prime }}}\). Thus, in this case, Eq. 27 gives:

$$ \sigma_{1,j}^{^{\prime}}=\frac{-1}{{v_{1}}}[{x_{2,j}^{^{\prime}}(\sigma_{0})+\dot{x}_{2}(\sigma_{0})\sigma_{0,j}^{^{\prime}}}]+\sigma_{0,j} ^{^{\prime}} $$
(28)

For k > 1, based on the reset condition in Eq. 7, we have \(z_{12}(\sigma _{k}^{+})=0\). Taking the total derivative, we get \(z_{12,j} ^{^{\prime }}(\sigma _{k}^{+})=-\sigma _{k,j}^{^{\prime }}\). The reset condition in Eq. 8 now implies that \(\delta _{12}(\sigma _{k-1} ^{+})=\bar {x}_{2}({\sigma _{k-1}})-x_{2}({\sigma _{k-1}})\), hence

$$ \begin{array}{@{}rcl@{}} \delta_{12,j}^{^{\prime}}(\sigma_{k-1}^{+}) & =&\bar{x}_{2,j}^{^{\prime} }(\sigma_{k-1})+\dot{\bar{x}}_{2}(\sigma_{k-1})\sigma_{k-1,j}^{^{\prime} }\\ && -x_{2,j}^{^{\prime}}(\sigma_{k-1})-\dot{x}_{2}(\sigma_{k-1})\sigma_{k-1,j}^{^{\prime}} \end{array} $$
(29)

Applying (22), we have \(\bar {x}_{2,j}^{^{\prime }} (\sigma _{k-1})=\bar {x}_{2,j}^{^{\prime }}(\sigma _{k-2}^{+})\). Looking at Eq. 4, we have \(\dot {\bar {x}}_{2}(\sigma _{k-1})=0\) and the reset condition implies that \(\bar {x}_{2,j}^{^{\prime }}(\sigma _{k-2} ^{+})=x_{2,j}^{^{\prime }}(\sigma _{k-2})+\dot {x}_{2}(\sigma _{k-2} )\sigma _{k-2,j}^{^{\prime }}\). Thus, returning to Eq. 29, we get

$$ \begin{array}{@{}rcl@{}} \delta_{12,j}^{^{\prime}}(\sigma_{k-1}^{+})&= & x_{2,j}^{^{\prime}} (\sigma_{k-2})+\dot{x}_{2}(\sigma_{k-2})\sigma_{k-2,j}^{^{\prime}}\\ && -x_{2,j}^{^{\prime}}(\sigma_{k-1})-\dot{x}_{2}(\sigma_{k-1})\sigma_{k-1,j}^{^{\prime}} \end{array} $$
(30)

Recalling that \(z_{12,j}^{^{\prime }}(\sigma _{k}^{+})=-\sigma _{k,j}^{^{\prime } }\) and combining (28), (30) into (27), we get

$$ \begin{array}{@{}rcl@{}} \sigma_{k,j}^{^{\prime}} & =&\sigma_{k-1,j}^{^{\prime}}+\frac{1}{{v_{1}} }[x_{2,j}^{^{\prime}}(\sigma_{k-2})+\dot{x}_{2}(\sigma_{k-2})\sigma_{k-2,j}^{^{\prime}}-x_{2,j}^{^{\prime}}(\sigma_{k-1})-\dot{x}_{2}(\sigma_{k-1})\sigma_{k-1,j}^{^{\prime}}]\\ &=&\sigma_{0,j}^{^{\prime}}+\frac{1}{{v_{1}}}[-x_{2,j}^{^{\prime}}(\sigma_{k-1})-\dot{x}_{2}(\sigma_{k-1})\sigma_{k-1,j}^{^{\prime}}] \end{array} $$
(31)

where the last step follows from a recursive evaluation of \(\sigma _{k-1,j}^{^{\prime }}\) using Eqs. 28 and 31 leading to the cancellation of many of the terms above. This completes the proof. □

Let us now focus on event JK at time σK. It follows from the reset condition in Eq. 6 that

$$ {x}_{2,j}^{^{\prime}}(\sigma_{K}^{+})=\left\{ \begin{array}{l} {x}_{2,j}^{^{\prime}}(\sigma_{K})+{x}_{12,j}^{^{\prime}}(\sigma_{K})\\ +h_{2}(\sigma_{K}^{+})\sigma_{K,j}^{^{\prime}}\\ {x}_{2,j}^{^{\prime}}(\sigma_{K})+{x}_{12,j}^{^{\prime}}(\sigma_{K}) \end{array} \right. \begin{array}{l} \text{if }\ G_{2}(\sigma_{K})=1\\ \text{and }x_{2}(\sigma_{K})=0\\ \text{otherwise} \end{array} $$
(32)

Recall that \(\delta _{12}(\sigma _{K}^{+})=0\) in Eq. 32. If G2(σK) = 1 and x2(σK) = 0, then x2(σK− 1) − x2(σK) = 0, hence x2(σK− 1) = 0. It follows from Eqs. 6 and 8 that \(\dot {x}_{2}(\sigma _{K-1})=0\). Based on Case 1 above, we get \({x}_{2,j}^{^{\prime } }(\sigma _{K-1})=0\). Then, from Lemma 2, \(\sigma _{K,j}^{^{\prime }}=\sigma _{0,j}^{^{\prime }}\) and Eq. 32 becomes

$$ {x}_{2,j}^{^{\prime}}(\sigma_{K}^{+})=\left\{ \begin{array}{l} {x}_{2,j}^{^{\prime}}(\sigma_{K})+{x}_{12,j}^{^{\prime}}(\sigma_{K})\\ +h_{2}(\sigma_{K}^{+})\sigma_{0,j}^{^{\prime}}\\ {x}_{2,j}^{^{\prime}}(\sigma_{K})+{x}_{12,j}^{^{\prime}}(\sigma_{K}) \end{array} \right. \begin{array}{l} \text{if }\ G_{2}(\sigma_{K})=1\\ \text{and }x_{2}(\sigma_{K})=0\\ \text{otherwise} \end{array} $$
(33)

We conclude that the state derivative \({x}_{2,j}^{^{\prime }}(\sigma _{K}^{+})\) when event S2 occurs is independent of all event time derivatives \(\sigma _{1,j}^{^{\prime }},\ldots ,\sigma _{K,j}^{^{\prime }}\) and involves only \(\sigma _{0,j}^{^{\prime }}\), evaluated when the associated flow burst is initiated.

  • (3)EventG2R2: This is an endogenous event that occurs when gk(x(𝜃,t),𝜃) = z2(t) − 𝜃2 = 0. Based on Eq. 21, \(\tau _{k,j}^{^{\prime }}=\mathbf {1}[j=2]-z_{2,j}^{^{\prime } }(\tau _{k})\). Let ρk be the last R2G2 before G2R2 occurs. Applying (22), we have \(z_{2,j}^{^{\prime }}(\rho _{k} ^{+})=z_{2,j}^{^{\prime }}(\tau _{k})\), and from Eq. 20 we get \(z_{2,j}^{^{\prime }}(\rho _{k}^{+})=-\rho _{k,j}^{^{\prime }}\). It follows that \(\tau _{k,j}^{^{\prime }}=\mathbf {1}[j=2]+\rho _{k,j}^{^{\prime }}\). Based on Eq. 20, we have

    $$ {x}_{2,j}^{^{\prime}}(\tau_{k}^{+})=\left\{ \begin{array}{l} {x}_{2,j}^{^{\prime}}(\tau_{k})-h_{2}(\tau_{k})\tau_{k,j}^{^{\prime}}\\ {x}_{2,j}^{^{\prime}}(\tau_{k}) \end{array} \right. \begin{array}{l} \text{if }\ x_{2}(\tau_{k})>0\\ \text{otherwise} \end{array} $$
    (34)
  • (4)EventR2G2: Let ρk be the time of this event and τk be the time of the last G2R2 event before R2G2 occurs. Similar to (3) above, we get \(\rho _{k,j}^{^{\prime }}=\mathbf {1}[j=4]+\tau _{k,j}^{^{\prime }}\) and use this value in the expression below which follows from Eq. 20:

    $$ {x}_{2,j}^{^{\prime}}(\rho_{k}^{+})=\left\{ \begin{array}{l} {x}_{2,j}^{^{\prime}}(\rho_{k})+h_{2}(\tau_{k}^{+})\rho_{k,j}^{^{\prime}}\\ {x}_{2,j}^{^{\prime}}(\rho_{k}) \end{array} \right. \begin{array}{l} \text{if }\ x_{2}(\rho_{k})>0\\ \text{otherwise} \end{array} $$
    (35)
  • (5)EventJk: The analysis of this event has already been done in Case (2) above, including Lemma 2.

  • (6)EventZ2: This is an endogenous event which is triggered by Jk: if a traffic burst from node 1 joins x2(t) at t = τk and \(x_{2}(\tau _{k}^{+})>\zeta _{2}\), this results in Z2. Since \(r_{2}(\tau _{k}^{+})=1\) and \(\dot {r}_{2}(t)=0\), we have \(r_{2,j} ^{^{\prime }}(\tau _{k}^{+})=0\).

  • (7)Event\(\bar {Z}_{2}\): This is an endogenous event that occurs when gk(x(𝜃,t),𝜃) = x2(𝜃,t) − ζ2 = 0. Applying (21), we have \(\tau _{k,j}^{^{\prime }}=x_{2}^{^{\prime }}(\tau _{k})/h_{2}(\tau _{k})\). Moreover, \(r_{2}(\tau _{k}^{+})\equiv 0\), therefore, \(r_{2,j}^{^{\prime }}(\tau _{k}^{+})+\dot {r}_{2}(\tau _{k}^{+})\tau _{k,j}^{+}=0\) and, since \(\dot {r}_{2}(\tau _{k}^{+})=0\), we get \(r_{2,j}^{^{\prime }}(\tau _{k}^{+})=0\).

4.1.3 IPA for event set \({\Phi }_{12}=\{S_{12},E_{12} ,E_{1},G2R_{1},J_{k}\}\cup \{Z_{12},\bar {Z}_{12}\}\)

(1) :

EventS12: This event can be either exogenous or endogenous. If x1(τk) > 0 or if x1(τk) = 0,α1(t) > 0, S12 is induced by event R2G1 which is endogenous. Otherwise, S12 is an exogenous event and occurs when G1(τk) = 1 and α1(τk) switches from zero to some positive value.

Case (1a)

S12is induced byR2G1. Referring to our analysis of R2G1 (Case (3) for Φ1), we have already evaluated \(\tau _{k,j}^{^{\prime }}\). Then, applying (20), we get

$$ {x}_{12,j}^{^{\prime}}(\tau_{k}^{+})=\left\{ \begin{array}{l} {x}_{12,j}^{^{\prime}}(\tau_{k})-\alpha_{1}(\tau_{k}^{+})\tau_{k,j}^{^{\prime }}\\ \\ {x}_{12,j}^{^{\prime}}(\tau_{k})-h_{1}(\tau_{k}^{+})\tau_{k,j}^{^{\prime}} \end{array} \right. \begin{array}{l} \text{if }\ x_{1}(\tau_{k})=0 \ \text{ and }\ \\ 0<\alpha_{1}(\tau_{k})\leq{\upbeta}_{1}(\tau_{k})\\ \text{otherwise} \end{array} $$
(36)

Case (1b)

S12is exogenous. In this case, \(\tau _{k,j}^{^{\prime }}=0\) and applying (20) gives \({x}_{12,j} ^{^{\prime }}(\tau _{k}^{+})={x}_{12,j}^{^{\prime }}(\tau _{k})\).

  • (2)EventE12: This event occurs when the traffic burst in queue 12 joins queue 2. This is an endogenous event that occurs when gk(x(𝜃,τk),𝜃) = z12(τk) − δ12(τk) = 0 and \(\delta _{12}(\tau _{k}^{+})=0\). When this happens, it follows from the reset condition in Eq. 5 that \(x_{12,j}^{^{\prime }}(\tau _{k}^{+})=0\).

  • (3)EventE1: This is an endogenous event that occurs when gk(x(𝜃,t),𝜃) = x1(t) = 0. Applying (21), we get \(\tau _{k,j}^{^{\prime }}=-\frac {{x}_{1,j}^{^{\prime }}(\tau _{k})}{\alpha _{1}(\tau _{k})-h_{1}(\tau _{k})}\). Thus, using Eq. 20, we get

    $$ \begin{array}{@{}rcl@{}} {x}^{^{\prime}}_{12,j}(\tau^{+}_{k}) & =&{x}^{^{\prime}}_{12,j}(\tau_{k})+(h_{1}(\tau_{k})-\alpha_{1}(\tau_{k})) \tau^{^{\prime}}_{k,j}\\ & =&{x}^{^{\prime}}_{12,j}(\tau_{k})+{x}^{^{\prime}}_{1,j}(\tau_{k}) \end{array} $$
    (37)
  • (4)EventG2R1: This is an endogenous event that occurs when gk(x(𝜃,t),𝜃) = z1(t) − 𝜃1 = 0. It was shown under the analysis for events in Φ1 that for G2R1 we have \(\tau _{k,j}^{^{\prime }}=\mathbf {1}[j=i]+\rho _{k,j}^{^{\prime }}\) where ρk is the time of the last R2G1 event before G2R1 occurs. Using this value, we can evaluate the following which follows from Eq. 20:

    $$ {x}_{12,j}^{^{\prime}}(\tau_{k}^{+})=\left\{ \begin{array}{l} {x}_{12,j}^{^{\prime}}(\tau_{k})+\alpha_{1}(\tau_{k})\tau_{k,j}^{^{\prime}}\\ \\ {x}_{12,j}^{^{\prime}}(\tau_{k})+h_{1}(\tau_{k})\tau_{k,j}^{^{\prime}} \end{array} \right. \begin{array}{l} \text{if }\ x_{1}(\tau_{k})=0\\ \text{and }\alpha_{1}(\tau_{k})\leq{\upbeta}_{1}(\tau_{k})\\ \text{otherwise} \end{array} $$
    (38)
  • (5)EventJk: The analysis of this event has already been done in Case (2) above, including Lemma 2.

  • (6)EventZ12: This is an endogenous event that occurs when gk(x(𝜃,t),𝜃) = x12(𝜃,t) − ζ12 = 0. Applying (21), we have

    $$ \tau_{k,j}^{^{\prime}}=\left\{ \begin{array}{l} -\frac{x_{12,j}^{^{\prime}}(\tau_{k})}{\alpha_{1}(\tau_{k})}\\ \\ -\frac{x_{12,j}^{^{\prime}}(\tau_{k})}{h_{1}(\tau_{k})} \end{array} \right. \begin{array}{l} \text{if }\ x_{1}(\tau_{k})=0\\ \text{and }\alpha_{1}(\tau_{k})\leq{\upbeta}_{1}(\tau_{k})\\ \text{otherwise} \end{array} $$

    Since \(r_{12}(\tau _{k}^{+})=1\) and \(\dot {r}_{12}(t)=0\), we have \(r_{12,j} ^{^{\prime }}(\tau _{k}^{+})=0\).

  • (7)Event\(\bar {Z}_{12}\): This is triggered by event E12 when the traffic burst in queue 12 joins queue 2 and we reset \(x_{12}(\tau _{k}^{+})=0\). Since \(r_{12}(\tau _{k}^{+})=0\) and \(\dot {r} _{12}(t)=0\), we have \(r_{12,j}^{^{\prime }}(\tau _{k}^{+})=0\).

4.2 Cost function derivatives

Returning to Eqs. 1415, and 17, recall that the IPA estimator consists of the gradient formed by the sample performance derivatives \(\frac {\partial F}{ \partial \theta _{j}}\), j = 1,2,3,4, which in turn depend on the state derivatives that we have evaluated in the previous section. The derivation of the IPA estimator for the Average Queue cost function in Eq. 14 is similar to that in Geng and Cassandras (2012) and related prior work and is omitted. Instead, we concentrate on the two new cost functions (15) and (17).

For the Power cost function, we derive \(\frac {\partial F_{i,m}(\theta )}{\partial \theta _{j}}\) from Eq. 16, from which \(\frac {\partial F}{\partial \theta _{j}}\) is obtained by adding over all Mi NEPs of each queue i over [0,T]:

$$ \begin{array}{@{}rcl@{}} \frac{\partial F_{i,m}(\theta)}{\partial\theta_{j}} & =&Px_{i,j}^{^{\prime} }(\theta,t){\int}_{\xi_{i,m}(\theta)}^{\eta_{i,m}(\theta)}w_{i}x_{i} ^{P-1}(\theta,t)dt\\ & =&P[x_{i,j}^{^{\prime}}(\xi_{i,m}^{+}){\int}_{\xi_{i,m}(\theta)}^{t_{i,m}^{1} }w_{i}x_{i}^{P-1}(\theta,t)dt\\ && +\sum\nolimits_{j=2}^{J_{i,m}}x_{i,j}^{^{\prime}}((t_{i,m}^{j})^{+}){\int}_{t_{i,m}^{j-1}}^{t_{i,m}^{j}}w_{i}x_{i}^{P-1}(\theta,t)dt\\ & &+x_{i,j}^{^{\prime}}((T_{i,m})^{+}){\int}_{T_{i,m}}^{\eta_{i,m}}w_{i} x_{i}^{P-1}(\theta,t)dt], \end{array} $$

where \(t_{i,m}^{j},j=1,...,J_{i,m}\) is the occurrence time of the j th event in the m th NEP of queue i and \(t_{i,m}^{J_{i,m}}\) is simplified as Ti,m. The state derivative is determined on an event-driven basis using \(x_{i,j}^{^{\prime }}(\tau _{k}^{+})\) corresponding to the event occurring at time τk; for instance, if G2R1 occurs at node 1, then Eq. 23 is invoked with i = 1.

For the Threshold cost function, we know that \(r^{^{\prime }}(\theta ,t)=0\) and it follows from Eq. 18:

$$ \begin{array}{@{}rcl@{}} \frac{\partial F_{i,m}(\theta)}{\partial\theta_{j}} & =&{\int}_{\gamma_{i,m}(\theta)}^{\psi_{i,m}(\theta)}w_{i}r_{i,j}^{^{\prime}}(\theta ,t)dt-w_{i}r_{i}(\theta,\gamma_{i,m}^{+})\gamma_{i,m,j}^{^{\prime}}+w_{i}r_{i}(\theta,\psi_{i,m}^{-})\psi_{i,m,j}^{^{\prime}}\\ & =&w_{i}(\psi_{i,m,j}^{^{\prime}}-\gamma_{i,m,j}^{^{\prime}}), \end{array} $$

Note that in this case the derivative depends only on \(\psi _{i,m,j}^{^{\prime }}, \gamma _{i,m,j}^{^{\prime }}\), the event time derivatives in Eqs. 25 and 26 for i = 1,3,4 and the corresponding event time derivatives in Cases (6), (7) for each of sets Φ22 and Φ12.

5 TLC with blocking between intersections

In this section, we relax the assumption made in Chen and Cassandras (2018) to include blocking effects between intersections. Therefore, we allow for the possibility that the length of the road between two intersections is sufficiently short so that traffic backlog can fill this space up when the incoming traffic flow is large or the green traffic light duration at the downstream intersection is too short. In our modified model, we impose the constraint x2(t) ≤ L. Due to this constraint, there are two new events:

  1. 1.

    B2U: x2(t) switches from the blocking state (x2(t) = L) to the unblocked state (x2(t) < L).

  2. 2.

    U2B: x2(t) switches from the unblocked state (x2(t) < L) to the blocking state (x2(t) = L).

Figure 6 shows how the blocking state (x2(t) = L) introduces new modes for the dynamics of x1(t) and x2(t). In addition to the modes of x1(t) in Eq. 1 and of x2(t) in Eq. 5, additional modes to these dynamics are added as follows:

$$ \dot{x}_{1}(t)=\left\{ \begin{array}{l} \alpha_{1}(t)-{\upbeta}_{2}(t)\\ \alpha_{1}(t) \end{array} \right. \begin{array}{l} \text{if }\ x_{2}(t)=L,G_{2}(t)=1\\ \text{if }\ x_{2}(t)=L,G_{2}(t)=0 \end{array} $$
(39)
$$ \dot{x}_{2}(t)=0\text{ if }x_{2}(t)=L $$
(40)
Fig. 6
figure 6

Traffic Blocking for x2(t)

Events B2U and U2B only influence the dynamics of x1(t) and x2(t). Since IPA is driven by the events affecting the state dynamics, we update the event sets Φ1 and Φ2 as follows and then provide the corresponding IPA derivative estimates:

$$ \begin{array}{@{}rcl@{}} {\Phi}_{1} & =&\{S_{1},E_{1},R2G_{1},G2R_{1}\}\cup\{U2B,B2U\}\\ {\Phi}_{2} & =&\{S_{2},E_{2},R2G_{2},G2R_{2},J_{k}\}\cup\{U2B,B2U\} \end{array} $$

5.1 IPA for Event Set {S1,E1,R2G1,G2R1}∪{U2B,B2U}

Except for the new events U2B and B2U, the IPA derivative estimates for all other events in Φ1 are the same as in Section 4. Therefore, we only consider how \({x}_{1,j}^{^{\prime }}\), j = 1,…,4, are affected by these events.

  • (1)EventU2B: This can only be triggered by event Jk. When a flow burst joins x2(t), it is possible that x2(t) = L. The guard condition ensuring U2B is triggered by Jk is x12(t) + x2(t) = L when Jk occurs at time t = τk. It follows from the dynamics of x1(t) that

    $$ {x}_{1,j}^{^{\prime}}(\tau_{k}^{+})=\left\{ \begin{array}{l} {x}_{1,j}^{^{\prime}}(\tau_{k})-{\upbeta}_{1}(\tau_{k})\tau_{k,j}^{^{\prime}}\\ {x}_{1,j}^{^{\prime}}(\tau_{k})-\alpha_{1}(\tau_{k})\tau_{k,j}^{^{\prime}}\\ {x}_{1,j}^{^{\prime}}(\tau_{k})\\ +({\upbeta}_{2}(\tau_{k})-{\upbeta}_{1}(\tau_{k}))\tau_{k,j}^{^{\prime}}\\ {x}_{1,j}^{^{\prime}}(\tau_{k})\\ +({\upbeta}_{2}(\tau_{k})-\alpha_{1}(\tau_{k}))\tau_{k,j}^{^{\prime}} \end{array} \right. \begin{array}{l} \text{if }\ G_{2}(\tau_{k})=0,x_{1}(\tau_{k})>0\\ \text{if }\ G_{2}(\tau_{k})=0,x_{1}(\tau_{k})=0\\ \\ \text{if }\ G_{2}(\tau_{k})=1,x_{1}(\tau_{k})>0\\ \\ \text{if }\ G_{2}(\tau_{k})=1,x_{1}(\tau_{k})=0 \end{array} $$
    (41)
  • (2)EventB2U: This event can be triggered by events which influence the incoming and outgoing flows of x2(t). We consider each of them as follows and provide the associated IPA state derivative.

  • (2a)B2U is triggered by event R2G2. Guard conditions are needed for this case to be true, i.e., x1(t) > 0, β2(t) > β1(t) and G1(t) = 1 or G1(t) = 0. If x1(t) > 0, β2(t) > β1(t) and G1(t) = 1, the outgoing flow from the downstream node 2 exceeds the incoming flow to node 2 from the upstream node 1 to decrease x2(t) when event R2G2 occurs. If G1(t) = 0, the outgoing flow from the downstream node 2 decreases x2(t) when event R2G2 occurs since there is no incoming flow from node 1. Let τk be the event time of R2G2 which was analyzed in Section 4.1. We have

    $$ {x}_{1,j}^{^{\prime}}(\tau_{k}^{+})=\left\{ \begin{array}{l} {x}_{1,j}^{^{\prime}}(\tau_{k})+{\upbeta}_{1}(\tau_{k})\tau_{k,j}^{^{\prime}}\\ {x}_{1,j}^{^{\prime}}(\tau_{k}) \end{array} \right. \begin{array}{l} \text{if }\ G_{1}(\tau_{k})=1,x_{1}(\tau_{k})>0\ \text{ and }\ {\upbeta}_{2}(\tau_{k})>{\upbeta}_{1}(\tau_{k})\\ \text{otherwise} \end{array} $$
    (42)
  • (2b)B2U is triggered by event G2R1. The guard condition for this case is G2(t) = 1 which releases the traffic backlog from queue 2. Let τk be the event time of G2R1 which was analyzed in Section 4.1. We have

    $$ {x}_{1,j}^{^{\prime}}(\tau_{k}^{+})= {x}_{1,j}^{^{\prime}}(\tau_{k})-{\upbeta}_{2}(\tau_{k})\tau_{k,j}^{^{\prime}} $$
    (43)
  • (2c)B2U is triggered by event E1. The guard condition is α1(t) −β2(t) ≤ 0 and G2(t) = 1. Since x1(t) = 0, it is easy to show that \({x}_{1,j}^{^{\prime }}(\tau _{k}^{+})=0\).

  • (2d)B2U is triggered by an event such that β1(t) −β2(t) switches from a non-negative value to a negative value. The guard condition is x1(t) > 0 and G2(t) = 1. Let τk be the associated event time. Since this event is exogenous, we must have \(\tau _{k}^{^{\prime }}=0\) which leads to \({x}_{1,j}^{^{\prime }}(\tau _{k}^{+} )={x}_{1,j}^{^{\prime }}(\tau _{k})\).

    (2e)B2U is triggered by an event such that α1(t) −β2(t) changes from a non-negative value to a negative value. The guard conditions are x1(t) = 0 and G2(t) = 1. Along the same lines as (2d), we have \({x}_{1,j}^{^{\prime }}(\tau _{k}^{+})=0\).

5.2 IPA for event set {S2,E2,R2G2,G2R2,Jk}∪{U2B,B2U}

Since the IPA derivative evaluation for events {S2,E2,R2G2,G2R2,Jk} is given in Section 4, we only consider here events U2B and B2U in the following.

  • (1)EventU2B: The analysis of this event was shown above. When this event occurs, we must have x2(t) = L. Since \(\dot {x}_{2}(t)=0\), we have \({x}_{2,j}^{^{\prime }}(\tau _{k}^{+})=0\).

  • (2)EventB2U: Let τk be the event time of B2U. The analysis of this event is the same as that for x1(t) and omitted here. We derive only the state derivative \(x_{2,j}^{^{\prime }}(\tau _{k}^{+})\).

  • (2a)B2U is triggered by event R2G2. It follows from the dynamics of x2(t) that

    $$ {x}_{2,j}^{^{\prime}}(\tau_{k}^{+} )={x}_{2,j}^{^{\prime}}(\tau_{k})+{\upbeta}_{2}(\tau_{k})\tau_{k,j}^{^{\prime} } $$
  • (2b)B2U is triggered by event G2R1. It follows from the dynamics of x2(t) that

    $$ {x}_{2,j}^{^{\prime}}(\tau_{k}^{+} )={x}_{2,j}^{^{\prime}}(\tau_{k})+{\upbeta}_{2}(\tau_{k})\tau_{k,j}^{^{\prime} } $$
  • (2c)B2U is triggered by event E1. We have

    $$ {x}_{2,j}^{^{\prime}}(\tau_{k}^{+} )={x}_{2,j}^{^{\prime}}(\tau_{k})+{\upbeta}_{2}(\tau_{k})\tau_{k,j}^{^{\prime} } $$
  • (2d)B2U is triggered by an event such that β1(t) −β2(t) changes from a non-negative value to a negative value. Since \(\tau _{k}^{^{\prime }}=0\), we have \({x}_{2,j}^{^{\prime }}(\tau _{k} ^{+})={x}_{2,j}^{^{\prime }}(\tau _{k})\).

  • (2e)B2U is triggered by an event such that α1(t) −β2(t) changes from a non-negative value to a negative value. We have \({x}_{2,j}^{^{\prime }}(\tau _{k}^{+})={x}_{2,j}^{^{\prime }}(\tau _{k})\).

6 Simulation results

In this section, we use the derived IPA estimators in order to optimize the green light cycles in the two-intersection model of Fig. 4. We emphasize that this model is simulated as a Discrete Event System (DES) with individual vehicles rather than flows, so that the resulting estimators are based on actual observed data. This is made possible by the fact that all SFM events in the sets Φi, i = 1,…,4, and Φ12 coincide with those of the DES, therefore they are directly observable along with their occurrence times.

For simplicity, we assume that all vehicle arrival processes are Poisson (recall, however, that IPA is independent of these distributions) with rates \(\tilde {\alpha _{i}},i=1,3,4\), and that the vehicle departure rate hi(t) on each non-empty road is constant. We also set the length of each vehicle as unit 1 which does not change the nature of TLC when we initiate L and queue length xi(t), i = 1,2,3,4,12. In Geng and Cassandras (2015), only one controllable parameter per intersection was considered by setting \(\theta _{i}+\theta _{\hat {\imath }}=C\). Here, we relax this constraint. Moreover, we limit each controllable parameter so that 𝜃i ∈ [𝜃i,min,𝜃i,max] where 𝜃i,min and 𝜃i,max are given lower and upper bounds, respectively, for feasible green light cycle values. In our simulations, αi(τk) is estimated through Na/tw by counting the number of arriving vehicles Na over a time window [0,tw] and hi(t) is estimated using the same method as in Fleck et al. (2016). Three sets of simulations are presented below, one for each of the three cost metrics in Eqs. 1415 and 17.

6.1 Simulation results under the no traffic blocking assumption (A3)

In this section, we maintain Assumption (A3) and ignore any possible blocking effects.

1. Average Queue Cost Function

We minimize metric (14), over [0,T]. All three arrival processes are Poisson with rates \(\tilde {\alpha }=[0.41,0.45,0.32]\) and the departure rates at roads 1,2,3,4 are [1.2,1.3,1.2,1.1]. We choose T = 1000s, wi = 1 and 𝜃i ∈ [10,50] for all i, and the initial 𝜃i values are [40,20,20,40]. Figure 7 shows the optimal cost (averaged over 10 sample paths) considering the transit delay in the SFM between intersections (red curve) and ignoring this delay (blue curve) as a function of L. In this case, delay has no effect on the long term total average queue length, as expected. However, this metric may not accurately capture traffic congestion.

Fig. 7
figure 7

Comparison of Optimal Average Queue Cost vs L

2. Power Cost Function, P = 2

For the same settings as before and a quadratic queuing cost, Fig. 8 shows how this cost function and the associated controllable parameters converge when L = 100, achieving a 40% cost decrease. In the left plot of Fig. 9, we use the SFM both including the transit delay and ignoring this delay in order to compare the optimal costs under these two models. Clearly, including delays in our IPA estimators for L > 0 achieves a lower cost, with the gap increasing as L increases.

Fig. 8
figure 8

Optimal Power Cost Function and Controllable Parameters vs Iterations

Fig. 9
figure 9

Comparison of Optimal Power Cost with/without delay vs L

3. Threshold Cost Function

For the same settings and a common threshold ζi = 25 for all i and with L = 35, Fig. 10 shows how this cost function and the associated controllable parameters converge, with the cost converging to its zero lower bound, therefore, in this case we see that our approach reaches the global optimum. In the right plot of Fig. 9, we apply the SFM considering both the transit delay between intersections and ignoring this delay so as to compare the resulting optimal costs. Once again, including delays achieves a lower cost, with the gap increasing as L increases.

Fig. 10
figure 10

Optimal Threshold Cost Function and Controllable Parameters vs Iterations

In Fig. 11, we provide histograms of the queue contents when L = 35. On the left, the controllable parameters are at their initial values [40,20,20,40] and we can see that queues 2, 3, and 12 frequently exceed the threshold. Under the optimal solution on the right, we observe that no queue ever exceeds the threshold over [0,T], hence the optimal cost 0 is obtained. Moreover, note that the probabilities that x2(t) = 0 and x3(t) = 0 significantly increase indicating a much improved traffic balance.

Fig. 11
figure 11

Distribution of queue lengths when L = 35

6.2 Simulation results with traffic blocking included

In this section, we incorporate the effect of traffic blocking between intersections using the IPA derivative estimates derived in Section 5, so as to demonstrate improved traffic light assignments and result in performance improvements as well. As in the previous section, we consider in what follows each of the three cost functions we have defined.

1. Average Queue Cost Function

All three arrival processes are Poisson with rates \(\tilde {\alpha }=[0.43, 0.49, 0.34]\) and the departure rates at roads 1,2,3,4 are [1.2,1.3,1.2,1.1]. We choose T = 1000s, wi = 1 and 𝜃i ∈ [10,50] for all i, and the initial 𝜃i values are [43,22,24,41]. Figure 12 shows the comparison between the method with and without considering traffic blocking when L = 25. The left plots (1) provide results when IPA estimates consider traffic blocking while the right plots (2) are when IPA estimates are used without considering traffic blocking. The figures with sub-labels (a), (b) and (c) respectively show how the cost function, the blocking frequency, and the associated controllable parameters converge. The plots with sub-label (d) show the vehicle queue length in each road after convergence.

Fig. 12
figure 12

Comparison of Average Cost when traffic blocking is ignored and when it is included when L = 25

In this case, since the short length L = 25 allows for several instances of blocking as seen in the plot (2c), using the IPA estimates in Section 5 achieves a 33% cost decrease and eliminates all blocking. On the other hand, when L = 100, Fig. 13 shows that the IPA estimates considering and ignoring traffic blocking achieve almost the same cost since the distance between intersections is large enough to ensure almost no occurrence of event U2B.

Fig. 13
figure 13

Comparison of Average Cost when traffic blocking is ignored and when it is included when L = 100

2. Power Cost Function, P = 2

. For the same settings as before, Fig. 14 compares the results obtained for IPA estimates with and without considering traffic blocking when L = 30. Using IPA estimates which account for blocking achieves a 75% cost decrease while again eliminating all blocking as shown in plots (1b) and (2b). When L = 100, Fig. 15 shows that IPA estimates considering and ignoring traffic blocking have the same performance because there is no occurrence of event U2B.

Fig. 14
figure 14

Comparison of Power Cost when traffic blocking is ignored and when it is included when L = 30

Fig. 15
figure 15

Comparison of Power Cost when traffic blocking is ignored and when it is included when L = 100

3. Threshold Cost Function

All three arrival processes are Poisson with rates \(\tilde {\alpha }=[0.53,0.49,0.34]\) and the departure rates at roads 1,2,3,4 are [1.2,1.3,1.2,1.1]. We choose T = 1000s, wi = 1 and 𝜃i ∈ [10,50] for all i, and the initial 𝜃i values are [43,22,24,41]. A common threshold ζi = 25 is set for all i. Figure 16 shows that IPA estimates which include the effect of traffic blocking achieve a significant cost decrease relative to those ignoring traffic blocking. The frequency of blocking also converges to zero in the latter case, although the convergence rate is relatively low. This behavior is caused by the nature of the threshold cost function whose objective is to reduce the frequency that a queue length is larger than a threshold which is along the same lines as the objective to reduce the frequency of blocking.

Fig. 16
figure 16

Comparison of Threshold Cost when traffic blocking is ignored and when it is included when L = 35.

7 Conclusions and future work

We have extended Stochastic Flow Models (SFMs) to allow for delays which can arise in the flow movement, as well as for possible blocking effects when the distance between nodes is relatively small. We have applied this framework to the multi-intersection traffic light control problem by including transit delays for vehicles moving from one intersection to the next and derived the IPA derivative estimators for this extended SFM for three congestion cost metrics with respect to the controllable green/red cycle lengths, including two new cost metrics that better capture congestion. We have also derived IPA derivative estimators that include possible blocking effects due to the coupling between two nodes. Simulation results show that the inclusion of delays and blocking effects in our analysis leads to improved performance relative to models that ignore delays and/or blocking. Future work aims at demonstrating how this optimization framework scales up when the system considered includes a large number of nodes taking advantage of the event-driven nature of IPA whose complexity, therefore, scales with the number of events and not the (much larger) dimensionality of state space of a network.