1 Introduction

Quality of Service (QoS) [41] in telecommunication systems is directly related to the network performance of the underlying routing systems. QoS is defined as the collective effect of service performance that determines the degree of satisfaction of a user of the service. In the quest for quality, researchers are trying to maximize the QoS of real-time embedded systems, including IP (Internet Protocol) routers. A router is a specific case of a soft real-time embedded system, and scheduling is a crucial, integral part of modern IP routers. Optimally scheduling the different tasks in a multitasking computing system is vitally important: system performance depends critically on the processor time allocated to each process, and this allocation determines whether a high system QoS can be guaranteed. QoS is of prime concern in designing state-of-the-art real-time embedded systems such as routers because it addresses key attributes (parameters) such as sources of errors, packet loss rate (PLR), latency (the sum of mean waiting time and service time), resource availability, end-to-end delay, jitter (delay variation), throughput and fair bandwidth allocation. A rigorous probabilistic framework for a novel optimal intelligent embedded computing scheduler, QUEST (quality-of-service enhanced stochastic), for IP routers is presented here for the first time. Two major gaps in scheduler research have been identified. One is the starvation of low-priority processes. The other is the poor performance of the premier earliest deadline first (EDF) scheduler at heavy traffic loads. Addressing these two problems motivated the present research. In the EDF scheduler and its variants, the rise of the mean waiting time to an unacceptably high level at heavy loads is a long-standing problem, which has been solved in this work by explicitly focusing on the heavy-load zone (utilization close to 100%).

1.1 Scheduling attributes

The proposed QoS-enhanced intelligent stochastic packet scheduler, QUEST, for IP routers is based on pre-emptive scheduling but it differs from the conventional schedulers in that it is probabilistic in nature in order to keep the utilization fixed in a fair way. The scheduler offers the following unique advantages:

  1. (i)

    Higher priority processes cannot monopolize the processor and the lower priority processes do not starve. Lower priority processes acquire a guaranteed minimum amount of processor time due to the pre-designed distribution of individual process utilization. This justifies that the scheduler is fair in nature and eliminates the problem of priority starvation.

  2. (ii)

    The scheduler is an adaptive and re-configurable one. A machine-learning feedback controller is used to implement this adaptability and re-configurability. This feedback-controller with the help of run-time cache-miss and deadline-miss error feedbacks learns and takes corrective decisions to maximize the system QoS.

  3. (iii)

    The objective is to maximize the system QoS, subject to the constraint that utilization is kept at 100%. An optimum utilization close to 100% is enforced. In this scheduling scheme, process utilization, U i for a process P i , is expressed as,

    $$ {U}_i=\frac{T_i}{D_i} $$
    (1)

    where T i is the fraction of time spent for execution of process P i. D i is denoted as the deadline of the process P i . The state probability vector of process utilization ratio of n processes running in a system can be expressed as,

    $$ \prod =\left[{U}_1:{U}_2:\dots :{U}_{n-1}:{U}_n\right] $$
    (2)

    The proposed scheduler is dynamic priority based. In Section 6.3, it is demonstrated that ∑U i  = 1, which indicates that the processor utilization is 100%. Hence, the scheduler is optimally schedulable [22].

  4. (iv)

    Last, QUEST is strongly immune to hacking because the scheduler is random in nature and therefore the next process to be executed cannot be predicted a priori.

In practice, for end-to-end QoS-sensitive multimedia traffic, which must be delivered on time, the process utilization for the different classes of multimedia traffic is tailored so that a guaranteed minimum amount of processor attention is maintained for each class. The multimedia embedded (router) applications considered in this paper are Voice over Internet Protocol (VoIP) and Internet Protocol Television (IPTV), which are real-time traffic, and web browsing using the Hyper Text Transfer Protocol (HTTP), which is best-effort network traffic. The processes for these traffic classes follow a long-tailed Pareto distribution of process utilization ratio. In the proposed service-differentiated scheduling model, a target process utilization ratio is achieved and maintained as per the designer’s requirement. As a practical case, the process utilization ratio, U i , for three processes has been provisioned in the ratio 80:16:4.

1.2 System quality of service (QoS)

Delivering QoS means guaranteeing given service parameters within certain bounds for connections made over a network [5]. The most dominant QoS parameter in a router is the packet loss rate (PLR) [36], which may arise from different errors in system activities, such as deadline misses, L1 and L2 cache misses [28] and page faults. This paper focuses on the two most important QoS metrics, namely PLR and mean waiting time (related to system latency). Practical cache miss error probabilities lie in the range 10^−2 to 10^−1 [32]. Practical deadline miss error probabilities lie in the range 0.013 to 0.12 [18]. For practical real-time tasks, the deadline varies in the range of 10–300 ms [2, 30].

2 Related work

Several distinguished studies deal with QoS metrics for scheduling multimedia traffic in routers. In routers, the simplest scheduler, first-come first-served (FCFS), receives packets from all input traffic classes. Packets are assigned to a single queue upon arrival and are serviced on a first-come, first-served basis. An FCFS scheduler cannot differentiate multimedia traffic classes, and packets may be dropped if the queue is full. Cristofaro et al. [8] have presented a detailed comparative analysis of QoS attributes for VoIP and video conferencing traffic with different queueing policies. However, the study has no focus on PLR. Using FCFS and Earliest Deadline First (EDF) schedulers, Saleh and Dong [29] have studied three QoS metrics, namely miss ratio, delay and average buffer size. The authors have demonstrated the efficiency of the EDF scheduler in providing QoS guarantees in a hybrid network, while showing that FCFS schedulers are more efficient than EDF for serving best-effort data traffic. However, the research does not address the priority starvation of the lower priority traffic class, the re-configurability of the scheduler or process utilization. In [15], the authors have proposed an analytical model for priority queueing systems under heterogeneous long range dependent self-similar and short range dependent Poisson traffic. The proposed model cannot guarantee a steady state process utilization ratio.

Toral-Cruz et al. [33] have analyzed two QoS parameters of VoIP traffic, namely jitter and packet loss rate. The studies have revealed that VoIP jitter can be modeled by self-similar processes with short or long range dependence. However, the work does not concentrate on maximizing the QoS metrics. Rikli et al. [27] have evaluated various queueing disciplines, such as fair queueing (FQ), priority queueing (PQ), custom queueing (CQ) and low-latency queueing (LLQ), in IP routers to meet the end-to-end QoS requirements of various traffic classes. When the number of high priority traffic sources increases, the authors propose to meet the target QoS requirements either by changing the prioritization scheme at the switching routers in favour of the priority classes or by allocating more bandwidth. However, the scheme cannot eliminate the problem of priority starvation for low priority best-effort traffic classes, and the allocation of bandwidth is not dynamic.

Ghazela and Saïdaneb [13] have proposed a queuing delay control and adjustment method, which guarantees the required QoS in terms of the per-service traffic flow authorized for real-time multi-service traffic. This method deals with how to hold the queuing delay at the specified waiting delay by adjusting the arrival probability, so that the QoS delay for real-time services may be guaranteed. However, the scheme has no provision for reconfiguring the scheduler, and it does not deal with the dominant QoS metric, PLR.

In [21], the authors have demonstrated a reconfiguration-aware real-time scheduling mechanism under QoS constraints in which only VoIP traffic has been considered. Further, neither an explicit mechanism to enhance the system QoS nor the supporting queueing theory is mentioned. Greco et al. [14] have contributed a multitasking, pre-emptive RTOS environment in a stochastic scheduling domain. Although the model is based on a Markov chain, it does not address state estimation by machine learning. Further, the scheduler is not re-configurable.

Based on the literature survey, it is observed that dynamically optimizing the system QoS of a multitasking scheduler in IP routers on the basis of a Markov chain model has not been specifically addressed. The search technique used to find the global minimum value of PLR in the search space is novel in this work. Several approaches have been proposed based on real-time pre-emptive scheduling algorithms, for example static priority scheduling (rate monotonic, RM) and dynamic priority scheduling (earliest deadline first, EDF, and its variants). In these scheduling mechanisms, lower priority processes are over-penalized because their execution is suspended by higher priority processes. When EDF is used in the dynamic environment of real-world applications, processes in an overloaded system miss deadlines frequently, resulting in very low throughput. EDF is unsuitable for real-time packet network traffic because all traffic classes receive the same miss rate irrespective of deadline requirements and traffic characteristics. Further, EDF does not honour class differentiation for traffic and therefore fails to comply with the service level agreements (SLAs) with client processes. Last, EDF and its variant A-EDF are deadline driven, and process utilization has no explicit focus. The root of the problem can be traced to the deterministic and deadline-driven mode of operation. By taking an alternative route here, namely non-deterministic stochastic and utilization (load)-driven operation, the bottleneck has been circumvented.

These problems have been solved through the proposed scheduling framework. Here, a non-deterministic optimal scheduler, QUEST, which is random in nature has been implemented so that the highest priority process does not dominate the processor execution time and the problem of starvation of the low priority process never occurs. QUEST is strictly traffic class-sensitive and fully conforms to SLAs. Additionally, QUEST is a deadline-aware utilization-driven scheduling scheme.

The rest of this paper is organized as follows. Sections 3 and 4 present the proposed system model and formulate the scheduling mechanism and queue management, respectively. Section 5 presents the simulation methodology, followed by simulation results in Section 6. Dynamic global optimization and re-configurability of the scheduler are described in Section 7. Section 8 reports run-time estimation of the transition probability matrix (TPM) by machine learning, together with the stability and accuracy of that estimation. A comparative performance analysis of QUEST is given in Section 9. The test-bed implementation of QUEST is presented in Section 10. Finally, the conclusion is stated in Section 11.

3 Proposed system model

The design has been implemented for three classes (multimedia traffic flows): VoIP, IPTV and HTTP. A finite-state machine (FSM) based on a Markov chain model for the scheduler is reported in this paper. A Markov model is a stochastic model in which the probability that a random variable, X, takes on the value x n+1 at time step (n + 1) is entirely determined by its state value in the previous time step n and is independent of its state values in earlier time steps n − 1, n − 2, etc. Each process in this scheme is modelled as a particular Markov state. The processes are characterized by their state probabilities (p ij ), which are defined as the probabilities of processes remaining in their own states (p ij , i = j) or making transitions to other states (p ij , i ≠ j). In this scheme, the class processes settle to a steady state probability distribution as time evolves.

The underlying model behind this scheduling framework is a Hidden Markov Model (HMM). Finding the most likely (ML) path to the desired final steady state probability vector (string) is a heuristic process, and the HMM problem is NP-hard [23]. The initial Markov TPM parameters (matrix elements) are therefore calculated a priori using the machine learning Metropolis-Hastings algorithm stated in Algorithm 1 [7]. The Metropolis-Hastings algorithm is a Markov Chain Monte Carlo (MCMC) method; it is applied here with the constraints that the diagonal elements of the TPM lie in the range [0.4–0.9] and the non-diagonal elements lie in the range [0.01–0.6] [34], for which faster convergence has been observed [34]. Because of the Markovian property, the target steady state probability distribution can be generated. The corresponding TPM is estimated by maximum likelihood. To support the above proposition in an embedded computing environment in a router, the desired steady state probability distribution, ξ:φ:ϊ (where ξ + φ + ϊ = 1), for three processes representing their corresponding three classes has been considered.

Algorithm 1 Metropolis-Hastings algorithm

The sample value of each random variable is first initialized. Each iteration of the algorithm then consists of three steps: first, a proposal sample y sample is generated from the proposal distribution p(y (i) |y (i-1) ); second, based upon the proposal distribution and the full joint density π(∙), the acceptance probability is computed using the acceptance function α(y sample | y (i-1) ); third, the candidate sample is accepted with probability α, or rejected with probability (1-α). For the multimedia IP traffic considered in this work, the desired (fractal Pareto type) steady-state distribution is of the order of 0.80: 0.16: 0.04, as justified later in Section 4 with Table 1. So ξ = 0.8, φ = 0.16 and ϊ = 0.04 are considered. An initial approximate 3 × 3 Transition Probability Matrix (TPM), ‘T’, is estimated by using the machine learning Metropolis-Hastings algorithm to provision a steady state distribution of process utilization ratio of 0.80: 0.16: 0.04. ‘T’ is stated in Eq. (3).

$$ T=\begin{pmatrix} 0.90 & 0.08 & 0.02 \\ 0.39 & 0.56 & 0.05 \\ 0.42 & 0.18 & 0.40 \end{pmatrix} $$
(3)
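The three Metropolis-Hastings steps described above can be illustrated with a minimal Python sketch. It samples a three-state chain whose stationary distribution is the target 0.80 : 0.16 : 0.04. The symmetric uniform proposal, the function name `mh_state_shares`, the number of steps and the seed are assumptions made for illustration; this is not the authors' Algorithm 1, and it does not reproduce the constrained TPM of Eq. (3).

```python
import random


def mh_state_shares(target, steps, seed=1):
    """Run a discrete Metropolis-Hastings chain and return the visit share of each state."""
    rng = random.Random(seed)
    n = len(target)
    state = 0
    visits = [0] * n
    for _ in range(steps):
        # Step 1: draw a candidate from a symmetric (uniform) proposal distribution.
        candidate = rng.randrange(n)
        # Step 2: acceptance probability; proposal terms cancel because it is symmetric.
        alpha = min(1.0, target[candidate] / target[state])
        # Step 3: accept with probability alpha, otherwise stay in the current state.
        if rng.random() < alpha:
            state = candidate
        visits[state] += 1
    return [v / steps for v in visits]


shares = mh_state_shares([0.80, 0.16, 0.04], steps=200_000)
print([round(s, 3) for s in shares])
```

With enough steps, the visit shares should settle near the target ratio, which is the property that lets a designer choose the steady state distribution in advance.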

The state probability vector, ∏, is treated as the process utilization ratio, as discussed earlier. Ignoring a priori information, an initial unbiased state probability vector, ∏0 = 1/3[1 1 1], is applied and the estimated final state probability vector, ∏f, is obtained as [0.79829: 0.16154: 0.040071], as illustrated in Fig. 1a. Figure 1b indicates that, even when an initial biased state probability vector, ∏0 = [0.1 0.5 0.4], is applied, the estimated final state probability vector is ∏f = [0.79837: 0.16155: 0.040075], approximately the same as in Fig. 1a.

Fig. 1 Confirmation of convergence of three states. a ∏0 = 1/3[1 1 1], b ∏0 = [0.1 0.5 0.4]

The result confirms that a final practical process utilization ratio, ∏u = [U 1 : U 2 : U 3 ], i.e. [0.80: 0.16: 0.04] for three processes of the corresponding classes, has been achieved irrespective of the initial distribution. It is to be noted that the specific value of U i achieved here is under the designer’s control. In general, any target value of ∏f, for example [0.81 0.13 0.06] or [0.65 0.25 0.10], can be achieved as per the designer’s requirement because the Metropolis-Hastings algorithm can generate any arbitrary desired steady state distribution [7].
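The independence from the initial vector can be checked numerically by repeatedly applying the matrix of Eq. (3) to both starting vectors used in the text. The following is a minimal sketch; the function name `evolve` and the choice of 50 steps are assumptions for illustration.

```python
# Transition probability matrix T from Eq. (3).
T = [
    [0.90, 0.08, 0.02],
    [0.39, 0.56, 0.05],
    [0.42, 0.18, 0.40],
]


def evolve(pi0, tpm, steps):
    """Apply pi <- pi * T repeatedly and return the final state probability vector."""
    pi = list(pi0)
    n = len(tpm)
    for _ in range(steps):
        pi = [sum(pi[i] * tpm[i][j] for i in range(n)) for j in range(n)]
    return pi


unbiased = evolve([1 / 3, 1 / 3, 1 / 3], T, steps=50)
biased = evolve([0.1, 0.5, 0.4], T, steps=50)
print([round(p, 4) for p in unbiased])
print([round(p, 4) for p in biased])
```

Both runs should approach roughly 0.798 : 0.162 : 0.040, since the stationary vector of a matrix like this one does not depend on the starting point.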

4 Scheduling mechanism and queue management

The multi-service packet scheduling (PS) scheme, QUEST, shown in Fig. 2, accepts three different classes of incoming multimedia traffic: VoIP, IPTV and HTTP. Traffic streams are classified by a classifier and fed to three distributed FIFO queues, Q1, Q2 and Q3, for VoIP, IPTV and HTTP, respectively. Migration of traffic among the queues is not allowed. The proposed model is defined as M/BP/1/./QUEST. In this underlying model, ‘M’ denotes traffic arrivals, which are of Markovian type, modelled as a Markov-modulated Poisson process (MMPP). In real-world applications, this scheme is a fair approximation of a large number of independent memory-less events [20]. Further, according to recent approaches [1], for a settled system, incoming traffic streams defined by different distributions converge to a Poisson distribution as time evolves. ‘BP’ refers to the service time distribution, which is of Bounded Pareto type. ‘1’ indicates a single processor. The incoming processes are scheduled and executed by the QUEST scheduler in a preemptive-resume manner. The service of each traffic class is associated with a defined QoS Class Identifier (QCI) value. The QCI defines the performance objective of the class, and a lower QCI value denotes a more restrictive service in terms of performance. The deadlines for VoIP, IPTV and HTTP are set as stated in Table 1. These values have been chosen from acceptable practical deadlines [6, 31] in real-world applications.

Fig. 2 Illustration of M/BP/1/./QUEST model. Qi: Ready queues, Wi: Waiting queues, EQ: Expired queue

Table 1 Service models parameters

Let Pi denote the representative process for the corresponding class Ci. The priorities assigned to processes are inversely proportional to their deadlines [19]. Therefore, the priority of execution of the processes is kept in the order P 1  > P 2  > P 3 and the process utilization ratio is provisioned as [0.8:0.16:0.04]. In this scheduling policy, a clock interrupt generates the timing slices or quanta. After each slice, the next process is picked up from the ready queue. The scheduler runs through the ready queue, selects a process to execute depending on the outcome of a random number generator, runs it through the time slice and eventually places the finished process in an expired queue. For practical real-time tasks, deadlines are in the range of 10–300 ms [2, 30]. Assuming a uniform burst time, which is made possible by traffic conditioning algorithms such as token bucket and leaky bucket, the process utilization (U i ) [22, 37] of the system is expressed in Eq. (4).

$$ \sum_{i=1}^{3}{U}_i={T}_B\left(\frac{1}{D_1}+\frac{1}{D_2}+\frac{1}{D_3}\right)\le 1 $$
(4)

In this scheme, T B denotes the burst time (service time) and D i denotes the deadline of process P i . For D 1  = 20 ms, D 2  = 100 ms and D 3  = 400 ms, the burst time is calculated as T B  = 16 ms. Allowing 4 ms of timing jitter (T J ) gives the required time quantum (T Q ): T Q  = T B  + T J  = 20 ms. In this framework, the time quantum, T Q , is set at 20 ms so that pre-emption does not result in deadline misses. In practice, this 20 ms time quantum is acceptable because it is at least equal to the minimum process deadline of 20 ms, which is required for the highest priority VoIP traffic (process P 1 ) to avoid context switching. Thus, a burst time of 16 ms keeps the system utilization at 100%. Although three traffic classes have been considered to demonstrate the concept, the framework is general and can be extended to any number of processes because it is based on a Markov model.
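The burst-time arithmetic follows from setting the sum in Eq. (4) equal to 1 and solving for T B. A short sketch makes the numbers explicit; the dictionary and function names are assumptions for illustration, while the deadlines and the 4 ms jitter allowance are the values quoted above.

```python
# Deadlines in milliseconds, as quoted in the text.
deadlines_ms = {"VoIP": 20.0, "IPTV": 100.0, "HTTP": 400.0}


def max_burst_time(deadlines):
    """Largest uniform burst time T_B for which sum(T_B / D_i) does not exceed 1."""
    return 1.0 / sum(1.0 / d for d in deadlines)


t_b = max_burst_time(deadlines_ms.values())                  # about 16 ms
utilization = {name: t_b / d for name, d in deadlines_ms.items()}
t_q = t_b + 4.0                                              # quantum = burst time + jitter

print(round(t_b, 6), {k: round(u, 4) for k, u in utilization.items()}, round(t_q, 6))
```

The per-class utilizations T B / D i come out at about 0.80, 0.16 and 0.04, which is the provisioned ratio and sums to full utilization.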

4.1 QUEST scheduling algorithm

Algorithm 2 gives a formal description of the proposed scheduling algorithm.

Algorithm 2 QUEST scheduling algorithm

Algorithm 2 clearly indicates that QUEST is a true dynamic-priority scheduler because the next process to be executed depends purely on the outcome of the random number generator decided at run-time and may not have the highest priority among the pending processes.
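The random selection step can be sketched as follows. This is not a transcription of Algorithm 2: it assumes that the process served in the next quantum is drawn from the row of the matrix in Eq. (3) that corresponds to the process served in the current quantum, and it ignores queues, pre-emption and deadlines. The function name `simulate_quanta`, the seed and the number of quanta are hypothetical.

```python
import random

# Transition probability matrix T from Eq. (3); states are VoIP, IPTV, HTTP.
T = [
    [0.90, 0.08, 0.02],
    [0.39, 0.56, 0.05],
    [0.42, 0.18, 0.40],
]


def simulate_quanta(tpm, quanta, seed=7):
    """Pick the process for each time quantum by sampling the TPM row of the current state."""
    rng = random.Random(seed)
    states = range(len(tpm))
    state = 0
    served = [0] * len(tpm)
    for _ in range(quanta):
        # The random draw, not a fixed priority order, decides which process runs next.
        state = rng.choices(states, weights=tpm[state])[0]
        served[state] += 1
    return [s / quanta for s in served]


shares = simulate_quanta(T, quanta=100_000)
print(dict(zip(["VoIP", "IPTV", "HTTP"], (round(s, 3) for s in shares))))
```

Over many quanta, the share of processor time per class should stay close to the provisioned ratio, with the lowest priority class still receiving a non-zero share.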

4.2 Mean waiting time

Let a random variable X take a value x in the interval [l, q]. The probability density function of the Bounded Pareto distribution of queue service time is given by

$$ {f}_x(x)=\frac{\theta\, {l}^{\theta}\, {x}^{-\left(\theta +1\right)}}{1-{\left(\frac{l}{q}\right)}^{\theta}},\qquad l\le x\le q $$
(5)

where θ is the shape parameter, l and q denote minimum and maximum IP data file sizes, respectively.

The second moment of this distribution is calculated as,

$$ {E}_X\left({x}^2\right)=\int_{l}^{q}{x}^2\,{f}_x(x)\, dx=\frac{\theta\, {l}^{\theta}}{1-{\left(\frac{l}{q}\right)}^{\theta}}\cdot \frac{{l}^{2-\theta}-{q}^{2-\theta}}{\theta -2} $$
(6)

The second moment of the service time distribution, E[X 2 ] is calculated as,

$$ E\left[{X}^2\right]=\frac{E_X\left({x}^2\right)}{L_C^2} $$
(7)

where L C is the link capacity of the system.

From queueing theory, the mean waiting time, W S , without using a stochastic admission controller can be expressed as

$$ {W}_S=\frac{\lambda E\left[{X}^2\right]}{2\left(1-\rho \right)} $$
(8)

where ρ is the normalized load of the system and λ is the traffic arrival rate, expressed as the number of incoming packets per second.
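The chain from the Bounded Pareto density to the mean waiting time can be traced with a short numerical sketch. The closed form used below is obtained by integrating x² f x (x) from Eq. (5) over [l, q] for θ ≠ 2, and it is cross-checked against a midpoint-rule integration. The parameter values (θ = 1.5, l = 1, q = 100, and the arguments passed to `mean_waiting_time`) are hypothetical and are not taken from the paper's simulation tables.

```python
def bounded_pareto_second_moment(theta, l, q):
    """Closed-form integral of x^2 * f_x(x) over [l, q], valid for theta != 2."""
    norm = theta * l**theta / (1.0 - (l / q) ** theta)
    return norm * (l ** (2.0 - theta) - q ** (2.0 - theta)) / (theta - 2.0)


def numeric_second_moment(theta, l, q, steps=200_000):
    """Midpoint-rule integration of x^2 * f_x(x), used only as a cross-check."""
    norm = theta * l**theta / (1.0 - (l / q) ** theta)
    h = (q - l) / steps
    total = 0.0
    for k in range(steps):
        x = l + (k + 0.5) * h
        total += x * x * norm * x ** (-(theta + 1.0))
    return total * h


def mean_waiting_time(arrival_rate, file_second_moment, link_capacity, load):
    """W_S = lambda * E[X^2] / (2 * (1 - rho)), with E[X^2] = E_X(x^2) / L_C^2."""
    service_second_moment = file_second_moment / link_capacity**2
    return arrival_rate * service_second_moment / (2.0 * (1.0 - load))


closed = bounded_pareto_second_moment(1.5, 1.0, 100.0)
numeric = numeric_second_moment(1.5, 1.0, 100.0)
print(closed, numeric)
print(mean_waiting_time(arrival_rate=2.0, file_second_moment=0.1, link_capacity=1.0, load=0.5))
```

Because of the 1/(1 − ρ) factor, the waiting time given by Eq. (8) grows without bound as the normalized load approaches 1, which is the heavy-load behaviour the paper targets.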

4.3 Packet loss rate (PLR)

PLR is expressed as PLR = (Ns − Nr)/Ns, where Ns and Nr denote the number of packets sent and the number of packets received, respectively. In this work, the packet loss rate (PLR) is expressed as the root mean squared error, P e,rms, of the L1 cache miss, L2 cache miss and deadline miss errors of the system. P e,rms is stated in Eq. (9). The L1 cache miss error, L2 cache miss error and deadline miss error are denoted by C L1 , C L2 and D e , respectively.

$$ {P}_{e, rms}=\sqrt{C_{L1}^2+{C}_{L2}^2+{D}_e^2} $$
(9)

For each of the three processes: VoIP, IPTV, HTTP, the above r.m.s error is calculated from Eq. (9) and substituted in the second row of error probability matrix, E, given in Eq. (10).
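As a worked instance of Eq. (9), the sketch below evaluates P e,rms at the lower and upper ends of the practical error ranges quoted in Section 1.2. Pairing the range endpoints in this way, and the function name `plr_rms`, are assumptions made for illustration; the paper does not state which individual error values were combined for each class.

```python
import math


def plr_rms(l1_miss, l2_miss, deadline_miss):
    """Combine the three error probabilities as in Eq. (9)."""
    return math.sqrt(l1_miss**2 + l2_miss**2 + deadline_miss**2)


low = plr_rms(0.01, 0.01, 0.013)   # lower ends of the quoted ranges
high = plr_rms(0.1, 0.1, 0.12)     # upper ends of the quoted ranges
print(round(low, 4), round(high, 4))
```

The two results, roughly 0.019 and 0.185, bracket the order of magnitude of per-class error probabilities that Eq. (9) can produce from the quoted ranges.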

5 Simulation methodology

For simulation, an initial model is characterized by two matrices: (i) the TPM, ‘T’, stated in Eq. (3) for the Markov model considered (here a three-state model), and (ii) ‘E’, an error probability matrix given in Eq. (10). Practical values of the cache miss error [32] and deadline miss error [18] rates have been taken.

$$ E=\begin{pmatrix} 0.98 & 0.9 & 0.8 \\ 0.02 & 0.1 & 0.2 \end{pmatrix} $$
(10)

The three elements in the second row of Eq. (10) represent the error probabilities of the processes of the corresponding classes, and the elements in the first row indicate the probabilities of correctness. The simulation framework has been developed using a discrete event simulator, DEVS suite [9], and MATLAB R2015b (version 8.6) on a computer with an Intel i3 2.5 GHz CPU and 4 GB RAM running Windows 7. The Monte Carlo method has been applied for confirmation. The system environment used for simulation is listed in Table 2.

Table 2 Simulation parameters

6 Simulation results

In this section simulation results are presented.

6.1 Waiting time of individual process

The waiting time for each class of traffic in this simulation is plotted against increasing normalized load in Fig. 3.

Fig. 3 Waiting time comparison for different processes for QUEST

6.2 Comparative performance analysis of mean waiting time

In this subsection, the mean waiting time of QUEST is compared with that of current state-of-the-art scheduling algorithms, namely deferred pre-emption (DP) [4], earliest deadline first (EDF) [37] and accuracy-aware EDF (A-EDF) [24], as illustrated in Fig. 4.

Fig. 4 Mean waiting time with increasing load

Figure 4 shows that QUEST has the lowest mean waiting time at higher normalized loads, a 23% improvement over the best competing scheduler, A-EDF. A stochastic admission controller [12], whose use is permissible in QUEST, keeps the mean waiting time low even at high traffic loads close to 100%. EDF and its variant A-EDF, on the other hand, are not stochastic, which precludes the use of such admission controllers. Therefore, for EDF, the mean waiting time can be low only for loads below about 80% [12], which conflicts with the original objective of 100% utilization. Without a stochastic admission controller, the mean waiting time rises more steeply at high load, as happens with EDF and A-EDF in Fig. 4. Furthermore, Rate Monotonic (RM) and DP (Fig. 4) are static priority scheduling algorithms and therefore experience a significant rise in mean waiting time with increasing normalized traffic load.

6.3 Steady state probability analysis and system stability

Simulations were performed considering random arrival of processes with the given error vector. The error vector provides error positions in 2000 sequences (iterations). The probability of finding the processor in a given state is calculated from ‘T’ and the error probability is obtained from ‘E’.

As shown in Fig. 5, Process P1 (VoIP), Process P2 (IPTV), and Process P3 (HTTP) achieve steady state probabilities of 0.796, 0.161 and 0.043, respectively. The PLR (denoted as P e ) thus obtained is 0.0045 (Fig. 6), which is acceptable because it falls within the standard PLR threshold of 1% [17].

Fig. 5 Convergence of State Probability Vector ∏

Fig. 6 P e converges to a steady state with number of increasing iterations

Thus, the lowest priority traffic, HTTP, secures a guaranteed 4.3% process utilization, which validates the claim that low-priority process starvation is eliminated. Simulations were also performed to calculate the packet loss rate, P e . The results show that, as the number of sequences (iterations) increases, P e settles to a steady state value (Fig. 6). This validates the treatment of the processes as stable Markov states and establishes system stability.

7 Dynamic global optimization and re-configurability of QUEST

PLR is to be minimized to optimize system performance. Because the load varies, the pre-allocated state transition probabilities of matrix ‘T’ cannot by themselves provision the maximum QoS. This problem is solved by re-configuring the matrix ‘T’ using reconfiguration (tuning) parameters Δ1, Δ2 and Δ3, as stated in Eq. (11).

$$ {T}_{recon}=\begin{pmatrix} 0.90-2{\Delta}_1 & 0.08+{\Delta}_1 & 0.02+{\Delta}_1 \\ 0.39+{\Delta}_2 & 0.56-2{\Delta}_2 & 0.05+{\Delta}_2 \\ 0.42+{\Delta}_3 & 0.18+{\Delta}_3 & 0.40-2{\Delta}_3 \end{pmatrix} $$
(11)

Driven by the feedback controller shown in Fig. 7, these reconfiguration parameters bring the PLR to a minimum value and hence the QoS back to its maximum value.
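The structure of Eq. (11) guarantees that every row still sums to 1, because 2Δ i is removed from the diagonal entry and Δ i is added to each of the two off-diagonal entries. A minimal sketch of this construction follows; the function name `reconfigure`, the range check and the example Δ values are assumptions for illustration and are not tuned values from the paper.

```python
# Base transition probability matrix T from Eq. (3).
BASE_T = [
    [0.90, 0.08, 0.02],
    [0.39, 0.56, 0.05],
    [0.42, 0.18, 0.40],
]


def reconfigure(base, deltas):
    """Build T_recon for a three-state chain: diagonal - 2*delta_i, off-diagonals + delta_i."""
    n = len(base)
    out = []
    for i, delta in enumerate(deltas):
        row = [base[i][j] - 2.0 * delta if i == j else base[i][j] + delta
               for j in range(n)]
        # Every entry must remain a valid probability after the shift.
        if any(p < 0.0 or p > 1.0 for p in row):
            raise ValueError(f"row {i + 1} leaves [0, 1] for delta = {delta}")
        out.append(row)
    return out


t_recon = reconfigure(BASE_T, [0.025, 0.02, 0.0])  # illustrative deltas only
for row in t_recon:
    print([round(p, 3) for p in row], round(sum(row), 6))
```

The range check is worth keeping in any tuning loop, since a Δ i of large magnitude can push an entry of its row outside [0, 1].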

Fig. 7 Feedback control system for re-configuring the QUEST scheduler

In reality, the processor usage allotment to all processes is dynamic over time and event-driven. The system QoS is dynamically monitored by the scheduler using a feedback controller with the help of a decision-making unit (DMU), and the necessary corrective actions are implemented.

The feedback controller serves two purposes in QUEST. It improves the performance of QUEST irrespective of internal and external uncertainties. Further, it automatically reconfigures the scheduler on the fly to run within a user-defined range.

The error feedback controller is used to reconfigure QUEST by suitably tuning the Δ i values. The 3D contour plot of PLR (denoted as P e ) as a function of Δ1 and Δ2 (with Δ3 = 0) is shown in Fig. 8. Similarly, P e can be plotted as a function of (Δ2, Δ3) and of (Δ1, Δ3). It has been noted that P e reaches its global minimum of 0.001 when Δ1, Δ2 and Δ3 are kept at 0.025, −0.09 and 0, respectively.

Fig. 8 Re-configuration space of P e vs. Δ 1, Δ 2; Δ 3 = 0

8 Run-time estimation of TPM by machine learning

Machine learning algorithms are used to learn knowledge or properties from data in order to optimize a performance criterion. Recently, many state-of-the-art machine learning algorithms have been developed and applied in diverse fields. In [39], the authors have presented an automated and accurate classification method based on eigenbrains and machine learning to detect Alzheimer’s disease (AD) subjects and AD-related brain regions using 3D MR images. Zhang and Wang [38] have proposed a novel AD detection method based on displacement field (DF) estimation between a normal brain and an AD brain. The DF was treated as the AD-related feature set, reduced by principal component analysis (PCA) and finally fed into three classifiers: support vector machine (SVM), generalized eigenvalue proximal SVM (GEPSVM) and twin SVM (TSVM). J. K. Williams [16] has applied the random forest algorithm to diagnose aviation turbulence. Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. In [3], the authors have proposed a methodology for multi-label classification via multi-target regression in a streaming setting.

In [10], the authors have presented a theoretical and empirical analysis of support vector machine methods for multiple instance classification. The support vector machine is a supervised machine learning algorithm that can be used for both classification and regression. In [11], Elghazel et al. have studied unsupervised feature selection with ensemble learning. Ensemble learning is a machine learning approach that uses more than one model to make a prediction. The underlying idea is that the collective opinion of many is more likely to be accurate than that of one. A prediction is made from the combined outcomes of the models, which can be combined by averaging, by taking the most frequent outcome or by weighted averaging. Ensemble learning attempts to find a trade-off between variance and bias. K-means clustering is an unsupervised machine learning algorithm that deals with the clustering of data. Using training data, the model finds the best structures and forms clusters. Wang et al. [35] have modified the MinMax k-means algorithm based on PSO to determine the parameters with which the algorithm attains the lowest clustering errors.

Because the QUEST scheduling mechanism is re-configurable in nature, the specific values of the TPM parameters at a given time during system operation are uncertain. Therefore, it is essential to estimate the TPM parameters (elements of the matrix ‘T’) dynamically during operation. The TPM parameters are estimated by a forward-backward machine-learning algorithm, which learns at run-time from the observed error patterns (sequences) that serve as training data. Here, for a given Δ i , Algorithm 3 is applied to estimate the TPM parameters. The flowchart of the algorithm is illustrated in Fig. 9.

Fig. 9

Flowchart of algorithm 3: Forward-backward machine-learning

In this algorithm, $p_{ij}$, $e_{jk}$ and $n$ denote the transition probability, the probability of error and the iteration index, respectively. Let $\ddot{w}_i(t)$ and $\ddot{w}_j(t+1)$ denote the current state and the next state of the FSM, respectively. The visible error pattern is represented by S = [010 20 ..1000 30 1..0 44 000001...], where the elements of this pattern are denoted by $S_k$ and 1s represent errors.

$$ {p}_{i j}= P\left[{\ddot{w}}_j\left( t+1\right)|{\ddot{w}}_i(t)\right] $$
(12)

and

$$ {e}_{j k}= P\left[{S}_k(t)|{\ddot{w}}_j(t)\right] $$
(13)
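To make the roles of $p_{ij}$ and $e_{jk}$ in Eqs. (12) and (13) concrete, the following minimal sketch generates a visible error pattern from a two-state model. The matrices `P` and `E`, the initial distribution `pi` and the state interpretation (a mostly error-free state and an error-prone state) are hypothetical values chosen for illustration only; they are not the TPM used in QUEST.

```python
import random

def sample_error_pattern(P, E, pi, T, seed=0):
    """Draw T visible symbols (0 = no error, 1 = error) from a hidden Markov model.

    P[i][j] plays the role of p_ij (state transition probability),
    E[j][k] plays the role of e_jk (probability of symbol k in state j).
    """
    rng = random.Random(seed)
    states = range(len(P))
    state = rng.choices(states, weights=pi)[0]          # hidden start state
    pattern = []
    for _ in range(T):
        state = rng.choices(states, weights=P[state])[0]              # Eq. (12)
        symbol = rng.choices(range(len(E[state])), weights=E[state])[0]  # Eq. (13)
        pattern.append(symbol)
    return pattern

# Hypothetical parameters: state 0 is nearly error-free, state 1 is error-prone.
P = [[0.95, 0.05], [0.30, 0.70]]
E = [[0.99, 0.01], [0.60, 0.40]]
pi = [1.0, 0.0]

pattern = sample_error_pattern(P, E, pi, T=200, seed=1)
```

Only `pattern` would be observable at run-time; the sequence of hidden states is not, which is why the parameters have to be estimated from the error pattern.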

The computation starts with initial estimates of $p_{ij}$ and $e_{jk}$ and iteratively calculates improved values until the convergence criterion $\dot{c}$ is met. In this estimation, $x_i(t)$ is the probability that the scheduler is in state $\ddot{w}_i(t)$ and has generated the error sequence up to step $t$. Similarly, $y_i(t)$ is the probability that the model is in state $\ddot{w}_i(t)$ and will generate the rest of the error sequence. An improved value can be calculated by defining $z_{ij}(t)$, the probability of a transition between $\ddot{w}_i(t-1)$ and $\ddot{w}_j(t)$, given that the model generated the entire visible training sequence $S^T$ by any path. $z_{ij}(t)$ is defined as follows:

$$ {z}_{i j}(t)=\frac{p_{i j}{e}_{j k}{x}_i\left( t-1\right){y}_j(t)}{P\left({S}^T|\dot{c}\right)} $$
(14)

where $P(S^T|\dot{c})$ denotes the probability that the model generated the sequence $S^T$. Let $p'_{ij}$ be the estimate of the probability of a transition from $\ddot{w}_i(t-1)$ to $\ddot{w}_j(t)$. The value of $p'_{ij}$ is the ratio between the expected number of transitions from $i$ to $j$ and the total expected number of transitions from $i$.

$$ {p}_{ij}^{\prime }(t)=\frac{\sum_{t=1}^{T}{z}_{ij}(t)}{\sum_{t=1}^{T}\sum_k{z}_{ik}(t)} $$
(15)

Similarly, an improved estimate $e'_{jk}$ can be calculated:

$$ {e}_{jk}^{\prime }(t)=\frac{\sum_{\substack{t=1 \\ s(t)={s}_k}}^{T}\sum_l{z}_{jl}(t)}{\sum_{t=1}^{T}\sum_l{z}_{jl}(t)} $$
(16)

The estimates of $p_{ij}$ and $e_{jk}$ are updated repeatedly using Eqs. (15) and (16) until the change between successive iterations falls below the convergence criterion $\dot{c}$. In this estimation, $\dot{c}$ has been set at 0.001.
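The re-estimation loop of Eqs. (14) to (16) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the initial state distribution `pi`, the starting matrices `P0` and `E0`, the training pattern `S` and the iteration cap `max_iter` are hypothetical; no numerical scaling is applied, so it is only suitable for short sequences; and the emission update accumulates the occupancy of the state that emits the symbol at step $t$ (the sum over the incoming index of $z$), which is one common indexing convention for this update.

```python
import math

def forward_backward(S, P, E, pi):
    """Return forward x, backward y and the sequence likelihood P(S)."""
    n, T = len(P), len(S)
    x = [pi[:]] + [[0.0] * n for _ in range(T)]
    for t in range(1, T + 1):
        for j in range(n):
            x[t][j] = E[j][S[t - 1]] * sum(x[t - 1][i] * P[i][j] for i in range(n))
    y = [[1.0] * n for _ in range(T + 1)]
    for t in range(T - 1, -1, -1):
        for i in range(n):
            y[t][i] = sum(P[i][j] * E[j][S[t]] * y[t + 1][j] for j in range(n))
    return x, y, sum(x[T])

def transition_posteriors(S, P, E, pi):
    """z[t-1][i][j] follows Eq. (14): posterior of an i -> j transition at step t."""
    n, T = len(P), len(S)
    x, y, likelihood = forward_backward(S, P, E, pi)
    z = [[[x[t - 1][i] * P[i][j] * E[j][S[t - 1]] * y[t][j] / likelihood
           for j in range(n)] for i in range(n)] for t in range(1, T + 1)]
    return z, likelihood

def reestimate(S, P, E, pi):
    """One update of P (Eq. 15) and E (Eq. 16 analogue); also returns log P(S)."""
    n, m = len(P), len(E[0])
    z, likelihood = transition_posteriors(S, P, E, pi)
    new_P = []
    for i in range(n):
        out = [sum(zt[i][j] for zt in z) for j in range(n)]
        new_P.append([v / sum(out) for v in out])
    new_E = []
    for j in range(n):
        occ = [sum(zt[l][j] for l in range(n)) for zt in z]  # occupancy of j per step
        new_E.append([sum(o for o, s in zip(occ, S) if s == k) / sum(occ)
                      for k in range(m)])
    return new_P, new_E, math.log(likelihood)

def baum_welch(S, P, E, pi, tol=0.001, max_iter=500):
    """Iterate until the largest parameter change drops below tol."""
    history = []
    for _ in range(max_iter):
        new_P, new_E, log_lik = reestimate(S, P, E, pi)
        history.append(log_lik)
        change = max(abs(a - b) for old, new in ((P, new_P), (E, new_E))
                     for ro, rn in zip(old, new) for a, b in zip(ro, rn))
        P, E = new_P, new_E
        if change < tol:
            break
    return P, E, history

# Hypothetical training pattern and starting estimates.
S = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0] * 3
P0 = [[0.7, 0.3], [0.4, 0.6]]
E0 = [[0.9, 0.1], [0.5, 0.5]]
pi = [0.5, 0.5]
P_hat, E_hat, history = baum_welch(S, P0, E0, pi)
```

The list `history` holds the log likelihood before each update, so it can be used to check that the likelihood does not decrease from one iteration to the next.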

8.1 Stability and accuracy of run-time TPM estimation

As the process load varies on demand within the system, the PLR changes accordingly. Therefore, the elements of the error probability matrix ‘E’ also change with time and iterations. After 900 iterations, the system simulates the newly estimated model with the modified TPM. In this learning, the forward-backward algorithm is guaranteed to converge to a (local) maximum of the log likelihood, as shown in Fig. 10.

Fig. 10

Plot of log likelihood with respect to the number of iterations

This convergence signifies the stability of the system. The accuracy of the proposed scheduler is validated by comparing the run-time error patterns for the initially assumed TPM and for the estimated (regenerated) one. These patterns are illustrated in Fig. 11.

Fig. 11

$\Pr(0^{m}|1)$ for the initial model and for the newly estimated (regenerated) model

The two run-time error patterns are almost identical, confirming accuracy of the proposed model.
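A comparison of this kind can be sketched as follows, assuming that Pr(0^m|1) denotes the probability that an error is followed by at least m consecutive error-free symbols (the usual error-free run distribution; this reading is an assumption, since the definition is not restated here). The two short patterns and the function names are hypothetical.

```python
def error_free_run_distribution(pattern, max_m):
    """Estimate Pr(0^m | 1) for m = 0..max_m from a binary error pattern.

    Only runs of zeros enclosed between two errors are counted, so a
    trailing run cut off by the end of the pattern does not bias the result.
    """
    ones = [t for t, s in enumerate(pattern) if s == 1]
    gaps = [b - a - 1 for a, b in zip(ones, ones[1:])]  # zeros between errors
    if not gaps:
        raise ValueError("need at least two errors in the pattern")
    return [sum(g >= m for g in gaps) / len(gaps) for m in range(max_m + 1)]

def max_deviation(dist_a, dist_b):
    """Largest pointwise gap between two run distributions."""
    return max(abs(a - b) for a, b in zip(dist_a, dist_b))

# Hypothetical patterns standing in for the initial and the regenerated model.
initial = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1]
regenerated = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1]

dist_initial = error_free_run_distribution(initial, max_m=4)
dist_regenerated = error_free_run_distribution(regenerated, max_m=4)
deviation = max_deviation(dist_initial, dist_regenerated)
```

A small value of `deviation` indicates that the two models produce statistically similar error patterns, which is the sense in which the curves in Fig. 11 are compared.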

9 System performance analysis of QUEST

The run-time PLRs for individual traffic flow in QUEST are illustrated in Fig. 12.

Fig. 12

PLR for each multimedia IP traffic

The figure shows that, as the normalized load increases, VoIP traffic in QUEST has the lowest PLR compared with IPTV and HTTP. The run-time PLR rises fastest for HTTP traffic.

A comparative analysis of the PLR (denoted here as $P_e$) for current state-of-the-art scheduling algorithms, namely earliest deadline first (EDF), deferred preemption (DP) and accuracy-aware EDF (A-EDF), against QUEST for increasing normalized load is illustrated in Fig. 13.

Fig. 13

PLR for DP, EDF, A-EDF, QUEST

The L1 and L2 cache miss errors and the deadline miss errors for the aforementioned scheduling algorithms, with typical cache sizes of L1 = 32 KBytes and L2 = 256 KBytes at a normalized load of 0.9, are depicted in Fig. 14.

Fig. 14

Cache and deadline miss errors for DP, EDF, A-EDF, QUEST

It is observed from Figs. 13 and 14 that the QUEST scheduler outperforms the other scheduling schemes and offers the lowest PLR. The PLR is reduced by 37% in QUEST compared to A-EDF, with fewer cache and deadline misses. For QUEST, the improvement is due to the use of a Hidden Markov Model (HMM) filter (Baum-Welch based), which is a probabilistic model applicable to finite and discrete process states. In contrast, A-EDF uses a Kalman filter for process state estimation. The Kalman filter is a special case of HMM applicable only to continuous and infinite states under a linear state-space model, which is not valid in digital embedded systems. Further, the Kalman filter assumes Gaussian noise, whereas the HMM filter makes no such assumption and is thus more general and accurate. Furthermore, EDF and A-EDF have no explicit control over utilization, leading to unacceptably high deadline miss rates at heavy loads. In stark contrast, QUEST enforces utilization close to 100%, yielding fewer deadline misses even at heavy loads. This conclusively establishes QUEST’s superiority over EDF and A-EDF.
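As an illustration of what filtering over finite, discrete states looks like, the sketch below performs one recursive belief update of a discrete HMM filter: predict with the transition matrix, then correct with the probability of the observed symbol. This is the textbook forward-filter recursion, shown under the assumption that it is representative of the filter referred to above; the matrices and the two-state interpretation are hypothetical.

```python
def hmm_filter_step(belief, P, E, symbol):
    """Update the state belief after observing one symbol (0 or 1).

    belief[i] is the current probability of being in state i,
    P[i][j] the transition probability, E[j][k] the symbol probability.
    No Gaussian or linearity assumption is involved: the belief is an
    arbitrary probability vector over a finite set of states.
    """
    n = len(P)
    predicted = [sum(belief[i] * P[i][j] for i in range(n)) for j in range(n)]
    unnormalized = [predicted[j] * E[j][symbol] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical model: state 0 rarely produces errors, state 1 often does.
P = [[0.9, 0.1], [0.2, 0.8]]
E = [[0.95, 0.05], [0.4, 0.6]]

belief = [0.5, 0.5]
belief_after_error = hmm_filter_step(belief, P, E, symbol=1)
```

After an observed error the belief shifts toward the error-prone state, which is the kind of discrete state estimate a scheduler could act on.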

10 Test-bed implementation for QUEST

The performance of the proposed QUEST scheduler was validated on NetFPGA® [25], a renowned open platform for high-performance networking routers based on field-programmable gate array (FPGA) hardware. The platform was customized to implement the reconfigurable scheduler QUEST in the experimental setup shown in Fig. 15.

Fig. 15

Test-bed implementation of QUEST

QUEST was implemented in a router placed between an ISP gateway and a multiport switch. The NetFPGA® router was connected to the Internet at a speed of 10 Mbps. Three classes of multimedia IP traffic, namely VoIP (Skype), IPTV (live streaming) and HTTP (web browsing), were scheduled and executed according to QUEST. Three laptops were used, one for receiving each class of traffic, and a Paessler PRTG® network monitor [26] console was connected to the multiport switch to monitor the performance of QUEST. The trace of the run-time processor utilization over a continuous monitoring period of 1 h 20 min is depicted in Fig. 16. The router was switched on at 12:15 PM. The scheduler adapted itself and reached a steady-state processor utilization very close to 100% at 12:37 PM. A utilization very close to 100% (within the range of 91 to 97%) was then maintained continuously for 58 min, except at 1:35 PM, when the utilization fell below 90%.

Fig. 16

Trace of processor utilization using Paessler® PRTG network monitor on 23rd March, 2017

The individual process utilization monitored for each class of multimedia IP traffic is depicted in Fig. 17.

Fig. 17

Time trace of process utilization ratio for VoIP, IPTV and HTTP

The experimental results indicate that a steady-state process utilization ratio on the order of 80:16:4 was achieved for VoIP, IPTV and HTTP traffic. The mean waiting time (in ms) and the run-time PLR (expressed as a percentage) of QUEST for varying load over a continuous observation period of 55 min are depicted in Fig. 18.
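To show what the 80:16:4 ratio reported above means in terms of processor shares, the following sketch converts the ratio into fractions and checks that a simple weighted random choice of the next class to serve approaches those fractions over many slots. This is only an illustration of a probabilistic allocation with a fixed target ratio; it is not a description of QUEST's internal mechanism, and the slot count and function names are hypothetical.

```python
import random

# Target ratio taken from the reported steady state; the dict layout is illustrative.
RATIO = {"VoIP": 80, "IPTV": 16, "HTTP": 4}

def target_shares(ratio):
    """Convert an integer ratio into fractions that sum to 1."""
    total = sum(ratio.values())
    return {name: weight / total for name, weight in ratio.items()}

def simulate_slots(ratio, slots, seed=0):
    """Pick a traffic class per time slot with probability proportional to its weight."""
    rng = random.Random(seed)
    names = list(ratio)
    picks = rng.choices(names, weights=[ratio[n] for n in names], k=slots)
    return {name: picks.count(name) / slots for name in names}

shares = target_shares(RATIO)
observed = simulate_slots(RATIO, slots=20000)
```

Because every class has a non-zero selection probability, even the lowest-weight class keeps receiving service in the long run.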

Fig. 18

Trace of PLR and mean waiting time for QUEST with varying load

It is clear from the figure that the maximum value of packet loss rate for QUEST is 0.49%, which is within the standard PLR threshold of 1%. The maximum value of mean waiting time is 8.2 ms which is less than the minimum deadline of 20 ms.

A detailed characterization of the experiment is too long to include here and is better suited to a separate, forthcoming publication.

11 Conclusion

This paper presents QUEST, a novel re-configurable QoS-enhanced intelligent real-time packet scheduler for multimedia IP traffic in routers. To the best of our knowledge, machine learning algorithms were used for the first time to design a QoS-maximized optimal fair stochastic packet scheduler that dynamically optimizes the system QoS during run-time. In stark contrast to the schedulers available in the literature, this scheduler was shown to maximize the system QoS while keeping utilization fixed at close to 100%. QUEST addresses the poor performance of the premier EDF scheduler at heavy loads. Its other unique advantages, namely avoiding priority starvation and arbitrary pre-programming of the process utilization ratio, were validated with rigorous simulations. The performance of the scheduler was analyzed using the two most important QoS metrics, namely packet loss rate and mean waiting time (related to system latency). Simulation results indicate that the performance of the proposed scheduler is substantially superior to that of current state-of-the-art scheduling algorithms. An improvement of 37% in PLR and an improvement of 23% in mean waiting time were obtained over the competing scheduler A-EDF. The accuracy of QUEST was further established by comparing the run-time error patterns for the initial and the estimated TPM, which were found to be almost identical. A design for QUEST’s implementation in a NetFPGA® router has been presented. An extension to fuzzy queueing systems is underway and will be published in forthcoming papers. The dynamic optimization presented in Section 7 can be further improved by applying stochastic computational intelligence algorithms such as simulated annealing (SA) and particle swarm optimization (PSO) [40].