1 Introduction

The cloud computing paradigm has brought a regime change to many industries: it transformed the way computing, storage and network infrastructures are utilized and provided a platform to cope with the growth of huge data volumes, especially for data-intensive computations and large-scale data storage. The paradigm evolved into a utility model in which computation, storage and networking are delivered to consumers as on-demand services. It initially emerged as virtual infrastructure (IaaS) for enterprises, but later grew into a full computing platform on which applications can be developed and software deployed using the services offered by the cloud provider. Today cloud computing is used across many sectors, including but not limited to healthcare, education, entertainment, government organizations, multimedia, transport, IoT, AI and ML. Services related to IoT, AI, ML and image processing require infrastructure with large processing capacity, since they involve multimedia data that must be processed accurately, and scheduling such multimedia workloads is a challenge in the cloud paradigm. All of the above domains consume cloud services under different service models governed by a Service Level Agreement (SLA); the SLA depends on the user or organization and the services to which they subscribe. It is the responsibility of the cloud provider to render services according to the agreement, and SLA violations on the provider side must be avoided. Because many users access virtual resources simultaneously, handling all these requests and assigning virtual resources according to the SLA is a challenging task; the cloud provisions resources to users automatically, without human intervention, and this provisioning of virtual resources to tasks is handled by a scheduler.

The effectiveness of the cloud computing paradigm depends mainly on how well the scheduler manages tasks and maps them onto suitable virtual resources. Scheduling also affects parameters such as energy consumption and SLA violations, which in turn create problems for both the cloud provider and the user. If the scheduler maps tasks to virtual resources poorly, makespan increases, execution times grow, and the quality of service degrades. A task that takes a long time to execute may also incur high energy consumption, which further degrades the quality of service. Finally, if a task cannot complete within the stipulated time, or if a user keeps accessing a virtual resource after its provisioned period has ended, the SLA is violated; such violations arise from improper scheduling of tasks to virtual resources and are a problem for the cloud provider. Many authors have used heuristic techniques [4, 12, 17, 22] and nature-inspired approaches [1,2,3] to tackle task scheduling, but an effective scheduler that maps dynamic tasks onto virtual resources while minimizing energy consumption, makespan and SLA violations is still lacking. Therefore, we use a machine learning technique, Q-learning based on reinforcement learning, to solve the task scheduling problem with a focus on multimedia tasks. Tasks are fed to the task manager, their priorities are calculated, and they are then passed to the Q-learning model, which takes decisions based on the upcoming tasks and the tasks already running on the virtual machines. Tasks already running on VMs are consolidated or migrated according to the tasks arriving at the cloud console and the decision taken by the ML model employed in the scheduling algorithm, while minimizing makespan, SLA violation and energy consumption.

1.1 Motivation and contributions

The cloud paradigm emerged as a utility computing approach in which computing, storage and network infrastructure are delivered to cloud users as a utility. When these services are provided with ease and seamless access, many users are attracted to the paradigm. End users around the world, working in different sectors, consume cloud services according to their requirements. Providing cloud services to all users without interruption is a huge challenge because cloud resources are heterogeneous in nature and incoming requests are both diverse and uncertain. Assigning virtual resources to user requests therefore requires an efficient scheduling approach that handles requests and maps them onto virtual resources while maintaining quality of service and avoiding SLA violations. This motivates our research in this area of cloud computing. We evaluate the primary parameters that influence the performance of the cloud model: makespan, the time taken to execute a task on a VM; SLA violation, measured against the agreement made between the cloud user and the provider for the services; and energy consumption, the energy consumed by VMs during computation and idle time. The objective of our research is to optimize all these parameters without violating the stated constraints.

The contributions of this article are as follows:

  1. A scheduling algorithm is proposed that employs an ML technique to take decisions dynamically according to upcoming and existing tasks.

  2. A Deep Q-Learning network model, an ML technique based on reinforcement learning, is used and integrated into the scheduling module.

  3. Extensive simulations are carried out on CloudSim. A random workload is considered first, and the efficacy of our algorithm is then tested using the HPC2N and NASA parallel worklogs, evaluating makespan, energy consumption and SLA violation.

  4. The experimental results show that the proposed DRLBTSA approach is superior to the existing Round Robin, FCFS, Earliest Deadline First, RATS-HM and MOABCQ algorithms.

The remainder of the paper is organized as follows: existing state-of-the-art approaches are presented and compared in Section 2; the problem formulation and the proposed ML-based methodology are discussed in Sections 3 and 4; the simulation results are discussed in Section 5; and the conclusion is given in the last section of the article.

2 Related works

The authors of [5] formulated a resource allocation and security mechanism that uses a hybrid ML approach, the RATS-HM technique. The work was carried out in three stages. In the first stage, a Cat Swarm optimization technique was used to address makespan and throughput. In the second stage, a DNN was used to address metrics such as bandwidth and load on resources for efficient allocation of resources to tasks. Finally, in the third stage a security authentication scheme was implemented to protect data stored in the cloud. CloudSim [7] was used as the simulation toolkit to assess performance against the FCFS and RR algorithms, and the results showed that the proposed RATS-HM mechanism surpasses the existing techniques for the mentioned parameters. The authors in [26] proposed a task scheduling model for large-scale cloud computing systems that addresses task execution delay and resource utilization. The chosen methodology is an ML approach based on reinforcement learning, and four techniques were used to build the scheduling mechanism: RL, DQN, RNN-LSTM and DRL-LSTM. Matlab was used for simulation, with a real-time dataset taken from a Google cluster as input. Among all techniques, DRL-LSTM performed best when compared with RR, PSO and SJF for the above-mentioned metrics.

The authors in [14] devised a scheduling mechanism, AIRL, based on reinforcement learning to schedule time-sensitive requests in the cloud. The main objectives of AIRL are to minimize request response time and maximize the success rate of user requests. AIRL was compared against different schedulers, namely RR, earliest, random and DQN, and the simulation results show a clear advantage over these baselines. In [8], the authors proposed a scheduling algorithm that addresses QoS, VM cost, success rate and response time. The framework uses a DQN model based on reinforcement learning. The experiments were conducted on a real-time cloud and evaluated against the random, RR and earliest schedulers; the simulation results show that the DQN approach outperforms these algorithms for the mentioned parameters. In [32], a scheduling framework was formulated to minimize the execution time and waiting time of tasks. The authors used an ML technique, CDDQLS, based on reinforcement learning. The simulation was carried out on CloudSim with deadline and resource constraints, and CDDQLS was evaluated against Random, Time-shared and Space-shared algorithms, showing a clear improvement over them. In [10], a task scheduling model was formulated to minimize makespan. It uses a DQN, an ML approach that applies a reinforcement learning strategy to schedule tasks. Experiments were conducted in MATLAB and compared against the HEFT and CPOP algorithms; the results show that makespan is greatly reduced compared with these baselines.

In [33], a scheduling scheme was designed to minimize makespan using a machine learning model, QL-HEFT, which combines Q-Learning and the HEFT algorithm. The process is carried out in two stages: in the first phase, tasks are sorted by Q-Learning to obtain an effective task allocation; in the second phase, processor allocation is performed based on HEFT. The scheme was implemented on CloudSim and compared with the existing HEFT and CPOP algorithms, showing an improvement in makespan. In [9], a dynamic task scheduling model was proposed that aims to minimize energy consumption and CPU utilization. It is modeled using Q-learning, an ML approach, and operates in two phases: in the first phase, incoming tasks are assigned to VMs in the cloud using an M/M/S queuing mechanism; in the second phase, tasks are allocated to the corresponding VMs based on the decisions of Q-Learning. The approach was implemented on CloudSim and evaluated against Random and Fair schedulers. In [25], a trust-aware scheduling mechanism was developed to minimize makespan, improve QoS and address security challenges in the cloud environment. The work was done in three phases: computation of trust levels of VMs, computation of task priorities, and careful scheduling of tasks based on these conditions. It was implemented on a Hadoop cluster, using real workload traces collected from the Google cloud platform, and evaluated against the PSO, SJF and RR algorithms; the results show that the trust-aware scheduler performs better than the existing approaches.

In [34], a scheduling technique was formulated to optimize significant QoS parameters, modeled by DQTS, a combination of Q-learning and a deep neural network. It was implemented on WorkflowSim, with the initial workload generated randomly and synthetic datasets used thereafter. Evaluated against existing models, the results show an improvement in load balancing. In [28], an edge-computing-based task scheduling algorithm was developed to maximize task satisfaction degree and success ratio. It was modeled with DRL to solve task scheduling and resource allocation, implemented in Python, and compared with the FCFS and SJF state-of-the-art algorithms; the results show that the above-mentioned parameters were improved considerably. In [19], DeepJS, a job scheduling mechanism, was developed to improve makespan and address scheduling issues in cloud datacenters. It uses reinforcement learning integrated with a bin-packing algorithm, was simulated on CloudSim with real-world workload traces, and was compared against existing heuristic-based models; the results show that DeepJS converges faster and minimizes makespan compared with the other approaches. In [36], the authors formulated a QoS-aware scheduler targeting response time, VM utilization and the distribution of user requests among VMs. It was modeled using deep reinforcement learning and implemented in a customized simulation environment. Real-world NASA workload traces were used for simulation, and the scheduler was evaluated against RR, FF, random, earliest and best-fit approaches; the results show that the DRL approach reduces average response time by 40% and improves the success rate to 93% compared with the baseline mechanisms. The authors in [11] formulated an effective scheduling mechanism in a fog environment that aims to reduce service delay and computational cost. It was modeled by combining Deep Q-Learning and double Q-Learning, implemented on iFogSim, and evaluated against the FF, GS and RS algorithms; the energy and cost metrics show a large improvement over the existing algorithms. In [35], a workflow scheduling technique was proposed to address makespan and cost. It employs a multi-agent DQN model based on reinforcement learning, with time and cost as rewards. It was implemented on a real-time cloud environment, AWS, and extensive experiments show a substantial improvement in the above-mentioned parameters. The authors in [27] developed an energy-efficient task scheduler that uses a RANN model; a GA was used to generate a dataset of 18 million instances. It was implemented in MATLAB and evaluated against existing approaches, outperforming them in makespan, energy consumption, required active racks and execution overhead. In [5], an efficient resource allocation scheme with lightweight authentication was developed. The hybrid mechanism, RATS-HM, consists of three steps: ICS-TS, which optimizes the makespan of the scheduling mechanism; GO-DDN, a deep neural network, for efficient allocation of resources; and finally a lightweight authentication mechanism. Extensive experiments were conducted on CloudSim, and the results show that RATS-HM allocates resources effectively to users while satisfying deadline constraints. In [16], a workload balancing strategy was proposed that addresses cost, degree of imbalance and resource utilization. The scheduling strategy, MOABCQ, adds Q-learning to a modified ABC approach. Extensive simulations were conducted on CloudSim using real-time workload datasets and synthetic workloads, and MOABCQ shows a significant improvement over the existing approaches. In [29], the authors used a deep reinforcement learning approach to propose an energy-aware task scheduling model that minimizes energy consumption and makespan while improving resource utilization; compared against state-of-the-art approaches, the deep reinforcement learning approach performed best for the specified parameters. In [37], a task scheduling approach was proposed that addresses energy concerns in datacenters for real-time workloads, using a DRL methodology for energy-aware scheduling. Extensive simulations revealed that the energy-aware scheduling mechanism handles real-time jobs in datacenters while minimizing energy consumption and improving the QoS delivered by the cloud provider.

As summarized in Table 1, the existing scheduling algorithms use different variations of reinforcement learning and address the metrics listed there. Despite this, task scheduling is still ineffective, and we therefore use a Deep Q-Learning network to schedule tasks effectively: task priorities are computed and tasks are scheduled according to the decisions of the ML model (DQN), addressing makespan, SLA violation and energy consumption.

Table 1 Existing Task scheduling mechanisms using ML Techniques

The next section precisely defines the problem and presents the proposed system architecture in detail.

3 Problem definition and proposed system architecture

This section presents the problem definition.

Definition

Assume we have K tasks, indicated as tK = {t1, t2, t3, …, tK}, n VMs indicated as VMn = {VM1, VM2, VM3, …, VMn}, p physical hosts indicated as Hp = {H1, H2, H3, …, Hp} and q datacenters indicated as DCq = {DC1, DC2, DC3, …, DCq}. The scheduling problem is to schedule the K tasks onto the n VMs residing on the p physical hosts, which in turn reside in the q datacenters. The priorities of incoming tasks are computed before scheduling and fed to the DQN model, which takes scheduling decisions based on the upcoming tasks and the tasks currently running on the underlying resources, minimizing makespan, SLA violation and energy consumption. Table 2 lists the notation used in the proposed architecture.

Table 2 Notations used in Proposed System Architecture
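To make the above definition concrete, the following minimal Python sketch models the entities involved (tasks, VMs, hosts, datacenters). It is illustrative only: the field names such as length_mi, pes and mips are our assumptions, not the paper's notation, and the sample values are arbitrary.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    tid: int
    length_mi: int          # task length in million instructions (assumed unit)
    deadline: float         # deadline dl^t for the task

@dataclass
class VM:
    vid: int
    pes: int                # number of processing elements (pr^no)
    mips: int               # MIPS per processing element (pr^mips)
    queue: List[Task] = field(default_factory=list)

@dataclass
class Host:
    hid: int
    vms: List[VM]

@dataclass
class Datacenter:
    did: int
    hosts: List[Host]

# The scheduling problem: map K tasks onto n VMs residing on p hosts in q datacenters.
tasks = [Task(tid=i, length_mi=1000 * (i + 1), deadline=50.0) for i in range(5)]
vms = [VM(vid=j, pes=2, mips=1000) for j in range(3)]
datacenter = Datacenter(did=0, hosts=[Host(hid=0, vms=vms)])
```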

The optimal task scheduling architecture is shown in Fig. 1; it receives diverse requests from different users simultaneously. After tasks are submitted to the cloud interface application, the task manager collects the requests and calculates the priority of every task based on task length and the processing capacities of the resources. The prioritized tasks are then fed to the DQN model integrated with the scheduling module and, following the recommendations of the DQN-based scheduler, the tasks are mapped appropriately onto the VMs. The scheduler first places the prioritized tasks in the execution queue and then dispatches them to the VMs. In this architecture, at every time interval T the scheduler keeps track of upcoming requests and consults the resource manager about the virtual resources; based on the upcoming requests, the requests executing on the VMs and the available virtual resources, the scheduler takes a dynamic decision, i.e. mapping a task to a new VM, mapping a task to an existing VM, or migrating existing tasks to other VMs if a VM has sufficient storage and processing capacity to accommodate the running tasks. We use the Deep Q-Learning model to schedule tasks intelligently under these conditions at every time interval T; it passes its scheduling decisions to the scheduler, which takes care of placing tasks appropriately onto the VMs. The main aim of this scheduler is to map tasks to VMs effectively based on their priorities while minimizing makespan, SLA violation and energy consumption. To evaluate task priorities, the dependencies of the tasks and the overall load on the VMs must first be determined. The overall load on the VMs is given by eq. 1.

$$lo^{VM}=\sum lo^{n}$$
(1)
Fig. 1
figure 1

Proposed optimal task scheduling Architecture

Where \(lo^{n}\) indicates the load on each of the n currently running VMs.

After calculating the current load on the VMs, and since all VMs run on p physical hosts, the overall load on the hosts is calculated using eq. 2.

$$lo^{H_p}=lo^{VM}/\sum H_{p}$$
(2)

Where \(lo^{H_p}\) indicates the load on the p physical hosts, \(lo^{VM}\) the current load on all VMs, and \(H_{p}\) the hosts.

After calculating the load on the VMs and physical hosts, a threshold value is identified: since the cloud computing paradigm is dynamic and the VMs must process a huge number of requests in a balanced manner, the load balancer has to adapt to the requests arriving at the VMs. To incorporate a load balancer into our model, a threshold value is therefore calculated. The threshold must be dynamic, because cloud workloads are not static and depend on parameters such as upcoming requests and the capacity of the existing resources. The threshold value in this model is calculated using eq. 3.

$$tr^{p}=\frac{\sum_{i=1}^{p} lo_{i}^{H_p}}{p}$$
(3)

Where \(tr^{p}\) is the dynamic threshold value used in our work and \(lo_{i}^{H_p}\) is the load on the i-th of the p physical hosts. The threshold value changes continuously because the cloud workload is dynamic, and based on it the utilization of the hosts is classified as under-utilized, balanced or over-utilized. Host utilization is determined using eqs. 4, 5 and 6.

Eq. 4 identifies over-utilized hosts.

$$VM_{n}>tr^{p}-\sum lo^{VM}$$
(4)

Eq. 5 identifies under-utilized hosts.

$$VM_{n}<tr^{p}-\sum lo^{VM}$$
(5)

Eq. 6 identifies balanced hosts.

$$VM_{n}=tr^{p}-\sum lo^{VM}$$
(6)

Using eqs. 4, 5 and 6 and the dynamic threshold value, the utilization level of each host is determined. Next, to schedule the workload appropriately over the cloud resources (VMs), the processing power of a resource is calculated as in eq. 7: it is defined as the product of the number of processing elements in the VM and the number of instructions processed per second (MIPS) by the VM.

$$pr_{ca}^{VM}=pr^{no}\ast pr^{mips}$$
(7)

Eq. 7 gives the processing capacity of a particular VM among the n VMs considered in our architecture; the overall processing capacity of all VMs is then calculated using eq. 8.

$$ovr_{vm}^{pr}=\sum pr_{ca}^{VM}$$
(8)

After calculating the processing capacities of the VMs, the priorities of incoming requests are calculated based on dependencies or inter-dependencies, task size, the resources required by the request and other parameters. The length of a task is calculated using eq. 9.

$$t_{k}^{l}=t_{mips}\ast t_{pr}$$
(9)

After calculating the task length, the priority of an incoming task at the scheduler is calculated using eq. 10.

$$t^{prio}=\frac{t_{k}^{l}}{pr_{ca}^{VM}}$$
(10)

Based on their priorities, tasks are moved to the execution queue and mapped by the scheduler to appropriate VMs. For this scheduling model we also consider a deadline constraint: each task should complete its execution before its deadline dlt.
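A minimal Python sketch of eqs. 1-10 is given below, assuming that per-VM load is available as a numeric value and that a VM's capacity is the product of its processing elements and MIPS rating; the helper names and sample values are illustrative, not the paper's implementation.

```python
def total_vm_load(vm_loads):
    """Eq. 1: lo^VM = sum of the loads of the n running VMs."""
    return sum(vm_loads)

def host_load(total_load, num_hosts):
    """Eq. 2: lo^{H_p} = lo^VM divided over the p physical hosts."""
    return total_load / num_hosts

def dynamic_threshold(host_loads):
    """Eq. 3: tr^p = average load over the p physical hosts."""
    return sum(host_loads) / len(host_loads)

def host_utilization(vm_load, threshold, total_load):
    """Eqs. 4-6: classify a host as over-, under- or balanced-utilized."""
    ref = threshold - total_load
    if vm_load > ref:
        return "over-utilized"
    if vm_load < ref:
        return "under-utilized"
    return "balanced"

def vm_capacity(num_pes, mips):
    """Eq. 7: pr_ca^VM = processing elements * MIPS."""
    return num_pes * mips

def overall_capacity(capacities):
    """Eq. 8: ovr_vm^pr = sum of the per-VM processing capacities."""
    return sum(capacities)

def task_length(task_mips, task_pes):
    """Eq. 9: t_k^l = t_mips * t_pr."""
    return task_mips * task_pes

def task_priority(length, capacity):
    """Eq. 10: t^prio = task length / VM processing capacity."""
    return length / capacity

# Illustrative usage with assumed numbers
vm_loads = [0.6, 0.4, 0.8]
lo_vm = total_vm_load(vm_loads)
tr_p = dynamic_threshold([host_load(lo_vm, 2), host_load(lo_vm, 2)])
cap = vm_capacity(num_pes=2, mips=1000)
print(host_utilization(vm_loads[0], tr_p, lo_vm))
print(task_priority(task_length(task_mips=4000, task_pes=2), cap))
```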

In this research work, our focus is on makespan, SLA violation and energy consumption. To evaluate makespan, the execution time must be calculated first, since makespan is determined by how long each task takes to run on a given VM. The execution time of a task is calculated using eq. 11.

$$et_{t_k}=\frac{et_{t}}{pr_{ca}^{VM}}$$
(11)

Every task placed in the execution queue is assigned a VM based on resource availability, which in turn depends on when the VM finishes its current work. The finish time of a task is therefore calculated using eq. 12.

$$ft^{t_k}=\sum VM_{n}+et_{t_k}$$
(12)

In this model, we assume that each task must complete its execution within the specified deadline. Therefore, for every task scheduled in this model the finish time must be less than or equal to its deadline, as expressed in eq. 13.

$$ft^{t_k}\le dl^{t}$$
(13)

Having defined the deadline constraint, execution time and finish time, we now calculate makespan, which any scheduling mechanism seeks to minimize. It is defined over the execution times of the tasks running on the virtual resources and is calculated as follows.

$$m^{k}=\max \left( ft^{VM_n}\right)$$
(14)
$$\min ft\left(t_{k},VM_{n}\right)=\sum\nolimits_{i=1}^{k}\sum\nolimits_{j=1}^{n}\delta_{ij}\, ft\left(t_{i},VM_{j}\right)$$
(15)

In eq. 15, δij is set to 1 if task ti is assigned to VMj and to 0 otherwise.

Thus, makespan is calculated from eqs. 14 and 15.
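The makespan computation of eqs. 11-14 can be sketched as follows. The sketch assumes execution time is the task length divided by the VM's processing capacity and, purely for illustration, places each task on the VM with the earliest finish time; in the proposed system the placement is decided by the DQN scheduler, not by this greedy rule.

```python
def execution_time(task_length, vm_capacity):
    """Eq. 11 (assumed reading): et = task length / VM processing capacity."""
    return task_length / vm_capacity

def makespan(task_lengths, vm_capacities):
    """Eqs. 12-14: accumulate per-VM finish times, makespan = max finish time."""
    finish = [0.0] * len(vm_capacities)           # current finish time of each VM
    for length in task_lengths:
        # illustrative earliest-finish-time placement (not the DQN decision)
        j = min(range(len(vm_capacities)),
                key=lambda v: finish[v] + execution_time(length, vm_capacities[v]))
        finish[j] += execution_time(length, vm_capacities[j])   # eq. 12
    return max(finish)                            # eq. 14

print(makespan(task_lengths=[4000, 8000, 2000, 6000],
               vm_capacities=[1000, 2000]))
```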

Our next focus is to minimize energy consumption in the cloud computing paradigm, one of its most significant and impactful parameters. Processing huge workloads requires large-scale infrastructure, which increases energy consumption as well as CO2 emissions [21, 38] and damages the environment. In the cloud model, energy consumption depends on the energy consumed during computing time and during idle time. In this model, the energy consumption of a VM is calculated as follows: any VM is either in the active state, executing instructions, or in the idle state, as represented in eq. 16.

$$VM_{n}=\begin{cases}\gamma_{n} & \text{Active state of VM}\\ \tau_{n} & \text{Idle state of VM}\end{cases}$$
(16)

The energy consumption of all n VMs is calculated using the following equations.

$$e_{VM_n}^{con}=ft_{n}\ast \gamma_{n}+\left(m^{k}-ft_{n}\right)\ast \tau_{n}$$
(17)
$$\min_{act}^{con}=\left(e^{mx}-e^{mn}\right)\ast res^{util}+e^{mn}$$
(18)

The energy consumption of the datacenter is calculated as follows.

$$e^{con}=\sum e_{VM_n}^{con}+\min_{act}^{con}$$
(19)

Thus, energy consumption in the cloud is calculated from eqs. 16, 17, 18 and 19.
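A hedged sketch of the energy model of eqs. 17-19 is given below; the active and idle power figures (γ, τ) and the e^mx, e^mn and utilization values used in the example are assumed placeholders, not measurements from the paper.

```python
def vm_energy(finish_time, makespan, active_power, idle_power):
    """Eq. 17: ft * gamma (active) + (makespan - ft) * tau (idle)."""
    return finish_time * active_power + (makespan - finish_time) * idle_power

def active_min_energy(e_max, e_min, resource_util):
    """Eq. 18: (e^mx - e^mn) * res^util + e^mn."""
    return (e_max - e_min) * resource_util + e_min

def datacenter_energy(vm_finish_times, makespan, active_power, idle_power,
                      e_max, e_min, resource_util):
    """Eq. 19: sum of per-VM energy plus the minimum active-state energy."""
    per_vm = sum(vm_energy(ft, makespan, active_power, idle_power)
                 for ft in vm_finish_times)
    return per_vm + active_min_energy(e_max, e_min, resource_util)

# Illustrative usage with assumed power figures (gamma, tau, e^mx, e^mn)
print(datacenter_energy(vm_finish_times=[10.0, 14.0], makespan=14.0,
                        active_power=0.9, idle_power=0.2,
                        e_max=250.0, e_min=120.0, resource_util=0.7))
```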

Our next focus is to minimize SLA violation in cloud computing. The Service Level Agreement is important from the perspective of both the user and the cloud provider: if the system does not operate according to the SLA, problems arise for both parties. It is therefore important that our scheduler does not violate the SLA made between the user and the cloud provider. We define SLA violation in cloud computing using eq. 20.

$$SLAV=\frac{1}{p}\sum\nolimits_{i=1}^{p}\frac{T_i^{over}}{T_i^{active}}\ast \frac{1}{n}\sum\nolimits_{j=1}^{n}\frac{T_j^{pd}}{T_j^{cpc}}$$
(20)

Where p indicates the number of physical hosts, \({T}_i^{over}\) the total time for which host i is overloaded, \({T}_i^{active}\) the amount of time host i is in the active state, \({T}_j^{pd}\) the estimated performance degradation of VM j, and \({T}_j^{cpc}\) the CPU capacity requested by VM j during its specified time.
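Eq. 20 can be computed directly as in the sketch below; the timing and CPU figures in the example are assumed placeholders.

```python
def sla_violation(overload_times, active_times, perf_degradations, cpu_requests):
    """Eq. 20: per-host overload-time term multiplied by per-VM degradation term."""
    p = len(overload_times)                      # number of physical hosts
    n = len(perf_degradations)                   # number of VMs
    host_term = sum(o / a for o, a in zip(overload_times, active_times)) / p
    vm_term = sum(d / c for d, c in zip(perf_degradations, cpu_requests)) / n
    return host_term * vm_term

# Illustrative usage with assumed timings (seconds) and CPU figures (MIPS)
print(sla_violation(overload_times=[120.0, 60.0], active_times=[3600.0, 3600.0],
                    perf_degradations=[50.0, 80.0, 30.0],
                    cpu_requests=[1000.0, 2000.0, 1500.0]))
```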

4 Methodology

This section describes the methodology used to design our scheduling algorithm. We use a reinforcement learning approach [31], which takes adaptive decisions. It is a machine learning approach that considers inputs and gives decisions based on the history of previous events; over time it learns from previous decisions and adapts. Any reinforcement learning approach has three basic parts. (1) Input and output states: the data supplied to the model is the input, and the output state represents the outcome the algorithm produces from that input. (2) Rewards: the outcome produced by the algorithm is scored with a positive or negative reward. (3) The artificial intelligence framework: it takes a decision based on the input supplied to the algorithm and produces outcomes from which rewards, good or bad, are generated. If the reward generated at time T is positive, the framework retains it and proceeds to the next state to make even more optimal decisions; if the reward is negative, it learns from that experience and tries to improve its decision making in the next state.
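These three parts correspond to the standard agent-environment interaction loop, sketched generically below; the toy environment and random policy are illustrative assumptions and are unrelated to the cloud scheduler itself.

```python
import random

def run_episode(env_step, choose_action, actions, initial_state, steps=10):
    """Generic RL loop: observe state, act, receive reward, move to next state."""
    state, total_reward = initial_state, 0.0
    for _ in range(steps):
        action = choose_action(state, actions)        # AI framework picks an action
        next_state, reward = env_step(state, action)  # environment returns outcome
        total_reward += reward                        # positive or negative reward
        state = next_state
    return total_reward

# Toy environment: reward is higher when the action matches the state parity
toy_env = lambda s, a: ((s + 1) % 4, 1.0 if a == s % 2 else -1.0)
random_policy = lambda s, acts: random.choice(acts)
print(run_episode(toy_env, random_policy, actions=[0, 1], initial_state=0))
```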

The complete functionality of the proposed methodology is presented below in the form of pseudo code.

figure a

Proposed Methodology Pseudo Code

Time complexity of the proposed Methodology:

The total time complexity of the proposed methodology depends on the time complexities of its individual components, which are analyzed below:

  1. Collect requests and calculate task priorities: the time complexity of this component is O(n), where n is the number of requests received from the cloud interface application.

  2. Feed task priorities to the DQN model: the time complexity of this component depends on the implementation of the DQN model. In general, a single forward pass through a neural network with n layers is O(n), since each layer involves a matrix multiplication.

  3. Schedule tasks onto VMs: the time complexity of this component depends on the implementation of the scheduler and the DQN model. Scheduling tasks with a DQN model is typically costlier than traditional scheduling algorithms such as round robin, as it involves training a neural network on a large dataset. The cost of adding tasks to the execution queue and dispatching them onto VMs depends on the implementation of the execution queue and the VM manager.

  4. Keep track of upcoming requests and resources every time interval T: the time complexity of this component is O(1), since it only involves fetching the upcoming requests and virtual resources from the resource manager.

  5. Make dynamic decisions: the time complexity of this component depends on the number of decisions to be made and the cost of each decision. In general it is O(m), where m is the number of decisions to be made.

Overall, the time complexity of the algorithm can be approximated as O(n + f + s + m), where n is the number of requests received, f is the cost of feeding the task priorities to the DQN model, s is the cost of scheduling tasks onto VMs, and m is the cost of making dynamic decisions.

The reinforcement learning approach uses an agent to take decisions based on the inputs given to the system. A machine learning model of this kind consists of different states: the agent takes specific actions based on the input, which generate rewards that are either good or bad, and these rewards inform the decision taken by the algorithm in the next state. Reinforcement learning works on these rewards, with the agent trying to choose the next action based on the reward generated in the current state. The approach learns from previous states, whether the reward was good or bad, and refines its decisions over time. This adaptive nature, i.e. self-learning based on previous states, is one of the advantages of the reinforcement learning approach.

Figure 2 illustrates task scheduling using deep reinforcement learning: the agent learns from the history of incoming user request sequences, i.e. it is trained on the previous user requests or tasks arriving at the cloud console. A prioritized user request is given as input to the agent, which makes a scheduling decision based on the current situation in the cloud environment. The outcome of executing the task or user request (in our study, the resulting makespan, energy consumption and SLA violation) serves as the reward for the agent. If the reward is bad, the agent improves its decision by updating the parameters of the model; if the reward is good, it is stored in the current state and used the next time the agent makes a decision in a subsequent state.

Fig. 2
figure 2

Deep reinforcement learning technique for scheduling the tasks

Within reinforcement learning, we use Q-learning for our scheduling model [30]. Q-learning is a powerful technique because it needs no prior knowledge of the system: it makes decisions based on past actions stored in a Q-function over state-action pairs, denoted q(S, A). The Q-values are updated using eq. 21.

$$q\left(S^{t},A^{t}\right)\leftarrow q\left(S^{t},A^{t}\right)+\sigma \ast \left[re^{t}+\pounds \ast \max_{A} q\left(S^{t+1},A\right)-q\left(S^{t},A^{t}\right)\right]$$
(21)

Where σ is the learning rate, whose value lies in (0,1), re^t is the reward for taking action A^t in state S^t, and £ is the discount factor, whose value also lies in (0,1).

At every iteration, the Q-learning model checks the rewards and updates its decisions according to eq. 21. In classical Q-learning, all Q-values are stored in a Q-table, but applying this classical model to a problem such as scheduling in cloud computing makes adaptive and optimal decisions difficult, because the number of states and actions in the task scheduling problem is comparatively high. We therefore combine a deep neural network with reinforcement learning, which suits our scheduling problem, and use a Deep Reinforcement Learning model [39] to tackle scheduling in cloud computing. Deep Reinforcement Learning has already proven itself in various scheduling techniques in cloud computing, as described in [15, 26]. The main reason for using this scheduling model is to obtain a smart scheduler: no prior knowledge is given to the agent, and the algorithm must take a decision when real-time data is given as input. As in Q-learning, it consists of different states and is described by an action space and a state space.
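For reference, the tabular update of eq. 21 can be written as in the sketch below (σ is the learning rate and £ the discount factor); in the proposed approach the Q-table is replaced by a deep network, so this is only the classical rule that the DQN approximates.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions,
             learning_rate=0.1, discount=0.9):
    """Eq. 21: q(S,A) <- q(S,A) + sigma * [re + L * max_a q(S', a) - q(S,A)]."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + discount * best_next
    q[(state, action)] += learning_rate * (td_target - q[(state, action)])

# Illustrative usage: states and actions are plain integers, q is a table
q_table = defaultdict(float)
q_update(q_table, state=0, action=1, reward=-5.0, next_state=1, actions=[0, 1, 2])
print(q_table[(0, 1)])
```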

4.1 Action space

The action space is formed by the n VMs considered in this work. All incoming requests are first fed to the task manager; the priority of each task is then calculated and passed to the scheduler, whose DQN model takes the decision and sends tasks to the execution queue according to their priorities. Based on the scheduler's decision, tasks are then executed on the VMs in the order in which they enter the queue. Thus, the action space in our model is defined as follows.

$$A=\left[VM_{1},VM_{2},VM_{3},\dots, VM_{n}\right]$$
(22)

4.2 State space

This subsection defines the state space, which consists of the state of a task at a specific time and the state of the VM at the moment the task arrives.

Let us assume that a task t arrives at time T and is represented as Tt. The state of this task can then be represented as follows.

$$S_{Tt}=S_{t}\cup S_{Tt}^{VM}$$
(23)

Where St is the state of task t at time T and \(S_{Tt}^{VM}\) is the state of the VM onto which task t arrives at time T.

$$S_{Tt}=\left[t_{k}^{l},t^{prio},et_{t_k},ft^{t_k},m^{k},e_{VM_n}^{con}, SLAV\right]$$
(24)

Where \(t_{k}^{l}\) is the length of the k tasks, t^{prio} their priorities, \(et_{t_k}\) their execution times, \(ft^{t_k}\) their finish times, m^k their makespan, \(e_{VM_n}^{con}\) the energy consumption of the n VMs, and SLAV the SLA violation.
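As a small sketch, the state vector of eq. 24 can be assembled as below; the numeric values are placeholders, and any normalization of the features is not specified in the paper.

```python
def build_state(task_length, priority, exec_time, finish_time,
                makespan, energy, slav):
    """Eq. 24: state = [t_k^l, t^prio, et, ft, m^k, e^con, SLAV]."""
    return [task_length, priority, exec_time, finish_time, makespan, energy, slav]

state = build_state(task_length=8000, priority=4.0, exec_time=4.0,
                    finish_time=12.0, makespan=14.0, energy=13.4, slav=0.02)
```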

4.3 Reward function

The aim of this study is to find an optimal mapping between cloud resources and mixed tasks with the help of our DRLBTSA scheduler so as to optimize the significant QoS parameters: energy consumption, time and SLA violation. Our reward function is therefore defined in terms of minimizing the metrics considered in this work, as follows.

$$re=\min \left(m^{k},e_{VM_n}^{con}, SLAV\right)$$
(25)

4.4 Training the agent

When incoming tasks arrive at the scheduler, the DRLBTSA agent must make a decision in the current state by considering the priorities of the tasks and the resources available on the physical hosts, and map tasks to VMs accordingly. For this to happen, our DQN model is trained so that initially it maps a task to a VM under the above conditions with a probability ρ, whose value decays towards zero over time. The DQN agent therefore explores randomly at first and later gives its decisions based on the q-values previously stored in the q-table. To achieve this, experience replay and fixed q-target values are used in the algorithm. As the agent takes decisions over time it gains experience, which is represented here as experience replay. Whenever the agent takes a decision it receives a reward, represented as re^t, and the subsequent state is represented as S^(t+1). These experiences are stored in a memory called the replay memory, indicated as ω; the values stored in the replay memory are (A^t, S^t, re^t, S^(t+1)), its capacity is denoted Mω, and batches drawn from it are denoted Gω. In every iteration, called a batch, the values in the replay memory are updated. In our work, the entire training was done offline; once training is complete, the agent is intelligent enough to take decisions in a smart way. We used 50 neurons in the hidden layers of the DQN model, set the scheduling decision time of the agent to 10 ms, set the learning frequency to f = 1, and denote the agent's learning time as ε. The proposed DRLBTSA algorithm is shown below.

figure b

Proposed DRLBTSA: Deep Reinforcement Learning Based Task Scheduling Algorithm in Cloud Computing.

The flow of the above algorithm is as follows. First, all parameters such as batch size, replay memory, learning frequency, learning rate and discount factor are initialized, and the q-function over the state and action spaces is initialized to zero. Next, for each event, i.e. for every incoming task arriving at the cloud platform, the task priority is calculated. For every event, the scheduler must choose from the action space (the VMs) for the corresponding state space (the tasks) based on task priority and the availability of resources on the physical hosts, and make a decision accordingly. Tasks are then scheduled to the corresponding VMs according to their priorities: with a random probability if it is the first task being scheduled, and otherwise according to a decision drawn by DRLBTSA from the existing q-table available to the agent. Once an action is chosen for a state, a reward is generated; in our work it corresponds to the minimization of makespan, energy consumption and SLA violation, and the reward value is calculated using eq. 25. If the reward is positive, the reward score is improved; if it is negative, the scheduler must improve based on its experience. When positive rewards are encountered, makespan, energy consumption and SLA violation are updated as the minimized values of the current state. Rewards, whether positive or negative, are stored in the replay memory. After evaluating the reward, the agent updates its state to the next state using eq. 21. This process continues until the last state, i.e. the last task, is encountered (Fig. 3).
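The training procedure described above (ε-greedy exploration, replay memory ω with capacity Mω, mini-batches Gω and fixed Q-targets) follows the generic offline DQN loop sketched below. The sketch is not the DRLBTSA pseudo code: q_net, q_target, update_net, env_reset and env_step are assumed callables supplied by the caller, and details such as the 50 hidden neurons and the 10 ms decision interval live inside those components.

```python
import random
from collections import deque

def train_agent(env_reset, env_step, q_net, q_target, update_net, num_actions,
                episodes=50, batch_size=32, memory_cap=10_000,
                eps_start=1.0, eps_end=0.05, eps_decay=0.995,
                target_sync=100):
    """Offline DQN training loop with experience replay and a fixed target network.

    q_net(state) and q_target(state) return a list of Q-values (one per action);
    update_net(batch, q_net, q_target) performs one TD update step (eq. 21).
    All of these are assumptions supplied by the caller, not the paper's code."""
    memory = deque(maxlen=memory_cap)      # replay memory (omega) with capacity M
    eps, steps = eps_start, 0
    for _ in range(episodes):
        state, done = env_reset(), False
        while not done:
            # epsilon-greedy: explore with probability rho, which decays towards 0
            if random.random() < eps:
                action = random.randrange(num_actions)
            else:
                action = max(range(num_actions), key=lambda a: q_net(state)[a])
            next_state, reward, done = env_step(state, action)
            memory.append((state, action, reward, next_state, done))
            if len(memory) >= batch_size:
                batch = random.sample(memory, batch_size)   # mini-batch G
                update_net(batch, q_net, q_target)          # TD update via eq. 21
            if steps % target_sync == 0:
                q_target = q_net            # refresh fixed targets (sketch only;
                                            # in practice copy the network weights)
            state, steps = next_state, steps + 1
            eps = max(eps_end, eps * eps_decay)
    return q_net
```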

Fig. 3
figure 3

Flowchart of proposed approach

5 Simulation set up and experimental results

This section discusses the simulation setup and the results of our work. The entire simulation was conducted on the CloudSim [7] simulator. We used real-time parallel worklogs from HPC2N [13] and NASA [23], both from high performance computing clusters, as input to our algorithm. In addition, we fabricated several datasets of our own with different distributions, which are explained in detail below.

5.1 Configuration settings for simulation

Our simulations were carried out extensively on CloudSim [7]. We first fabricated datasets in which tasks follow different distributions and fed these workload distributions to the scheduler; to test the efficiency of our approach we then used the HPC2N [13] and NASA [23] worklogs from high performance computing clusters. The fabricated datasets follow uniform, normal, left-skewed and right-skewed distributions, represented as d1, d2, d3 and d4 respectively; the HPC2N and NASA worklogs are represented as d5 and d6. A uniform distribution means all task sizes are distributed equally. A normal distribution contains mostly medium-sized tasks with fewer small and large tasks. A left-skewed distribution contains more small tasks and fewer large tasks, while a right-skewed distribution contains fewer small tasks and more large tasks. We intentionally fabricated these distributions to verify how our algorithm behaves with different types of tasks, and we finally supplied the HPC2N and NASA parallel worklogs (d5 and d6) as input to check its efficiency on real-time workloads. After supplying the workloads, we evaluated DRLBTSA against the existing baseline algorithms RR, FCFS, Earliest Deadline First, RATS-HM and MOABCQ. The standard configuration settings for our simulation were taken from [20]. Table 3 lists the configuration settings used in the simulation, and Table 4 lists the parameter settings of the compared approaches and the proposed DRLBTSA.
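As an illustration of how the fabricated distributions d1-d4 could be generated, a hedged sketch is given below; the task-length range and the distribution parameters are assumptions, since the paper does not list the exact sampling parameters.

```python
import random

def fabricate_workload(kind, num_tasks=1000, low=1_000, high=100_000, seed=1):
    """Generate task lengths (in MI) with the distribution shapes used for d1-d4."""
    rng = random.Random(seed)
    mid = (low + high) / 2
    tasks = []
    for _ in range(num_tasks):
        if kind == "uniform":          # d1: all task sizes equally likely
            x = rng.uniform(low, high)
        elif kind == "normal":         # d2: mostly medium-sized tasks
            x = rng.gauss(mid, (high - low) / 8)
        elif kind == "left_skewed":    # d3: more small tasks, fewer large (paper's definition)
            x = low + (high - low) * rng.betavariate(2, 5)
        elif kind == "right_skewed":   # d4: fewer small tasks, more large (paper's definition)
            x = low + (high - low) * rng.betavariate(5, 2)
        else:
            raise ValueError(kind)
        tasks.append(int(min(max(x, low), high)))
    return tasks

d1 = fabricate_workload("uniform")
d3 = fabricate_workload("left_skewed")
```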

Table 3 Configuration Settings in Simulation
Table 4 Parameter Settings for various algorithms

5.2 Calculation of makespan

Makespan is evaluated using the configuration settings in Table 3, with different workloads given to the DRLBTSA scheduler. We supplied the d1, d2, d3, d4, d5 and d6 datasets and evaluated makespan on each of them, comparing against the existing baseline algorithms RR, FCFS and Earliest Deadline First. DRLBTSA was run for 100 iterations. Table 5 reports the resulting makespan values.

Table 5 Evaluation of makespan

Table 5 shows the makespan for the fabricated datasets d1, d2, d3 and d4 with different distributions and for the HPC2N [13] and NASA [23] workloads. It clearly shows that our DRLBTSA algorithm reduces makespan compared with the RR, FCFS, EDF, RATS-HM and MOABCQ algorithms.

Figure 4 and Table 5 show that the proposed DRLBTSA approach, evaluated against the RR, FCFS, EDF, RATS-HM and MOABCQ algorithms, reduces makespan compared with all of them.

Fig. 4
figure 4

Calculation of makespan using DRLBTSA

5.3 Calculation of energy consumption

Energy consumption is evaluated using the configuration settings in Table 3, with different workloads given to the DRLBTSA scheduler. We supplied the d1, d2, d3, d4, d5 and d6 datasets and evaluated energy consumption on each of them, comparing against the existing baseline algorithms RR, FCFS, EDF, RATS-HM and MOABCQ. DRLBTSA was run for 100 iterations. Table 6 reports the resulting energy consumption values.

Table 6 Evaluation of Energy Consumption

Table 6 shows the energy consumption for the fabricated datasets d1, d2, d3 and d4 with different distributions and for the HPC2N [13] and NASA [23] workloads. It clearly shows that our DRLBTSA algorithm reduces energy consumption compared with the RR, FCFS, EDF, RATS-HM and MOABCQ algorithms.

Figure 5 and Table 6 show that the proposed DRLBTSA approach, evaluated against the RR, FCFS, EDF, RATS-HM and MOABCQ algorithms, reduces energy consumption compared with all of them.

Fig. 5
figure 5

Calculation of Energy Consumption using DRLBTSA

5.4 Calculation of SLA violation

SLA violation is evaluated using the configuration settings in Table 3, with different workloads given to the DRLBTSA scheduler. We supplied the d1, d2, d3, d4, d5 and d6 datasets and evaluated SLA violation on each of them, comparing against the existing baseline algorithms RR, FCFS, EDF, RATS-HM and MOABCQ. DRLBTSA was run for 100 iterations. Table 7 reports the resulting SLA violation values.

Table 7 Evaluation of SLA Violation

Table 7 shows the SLA violation for the fabricated datasets d1, d2, d3 and d4 with different distributions and for the HPC2N [13] and NASA [23] workloads. It clearly shows that our DRLBTSA algorithm reduces SLA violation compared with the RR, FCFS, EDF, RATS-HM and MOABCQ algorithms.

Figure 6 and Table 7 show that the proposed DRLBTSA approach, evaluated against the RR, FCFS, EDF, RATS-HM and MOABCQ algorithms, reduces SLA violation compared with all of them.

Fig. 6
figure 6

Calculation of SLA Violation using DRLBTSA

5.5 Results discussion

In this section, we discuss the results for the evaluated parameters and their improvement over the existing algorithms, i.e. RR, FCFS, EDF, RATS-HM and MOABCQ. The improvement of DRLBTSA over the baseline algorithms for makespan, energy consumption and SLA violation is shown in Tables 8, 9 and 10.

Table 8 Improvement of makespan over existing algorithms
Table 9 Improvement of Energy Consumption over existing algorithms
Table 10 Improvement of SLA Violation over existing algorithms

Table 8 shows that the proposed DRLBTSA improves makespan over the compared algorithms under varying workloads.

Table 9 shows that the proposed DRLBTSA improves energy consumption over the compared algorithms under varying workloads.

Table 10 shows that the proposed DRLBTSA improves SLA violation over the compared algorithms under varying workloads. In Tables 8, 9 and 10, improvement means the minimization of the parameters considered in our work.

6 Conclusion and future work

Scheduling diverse workloads over the cloud paradigm is a challenging issue due to the dynamism and heterogeneity of cloud computing, and mapping tasks to the right VMs is very difficult. Many authors have proposed scheduling mechanisms to map tasks to VMs, but there is still scope for research on mapping tasks to appropriate VMs. Scheduling in the cloud model is a highly dynamic scenario, as miscellaneous workloads request resources in a multi-tenant environment and demands must be met according to the available processing capacities. To map every task onto a suitable VM, we proposed the DRLBTSA approach, which finds the optimal resources by considering task priorities while minimizing makespan, SLA violation and energy consumption. We used a machine learning model, a DQN, which is a variant of deep reinforcement learning, to solve the task scheduling problem. We carried out extensive simulations on CloudSim, using fabricated datasets with different distributions and real-time parallel worklogs from HPC2N and NASA as input, and evaluated our algorithm against the existing baseline algorithms FCFS, RR, EDF, RATS-HM and MOABCQ over 100 iterations. The results show that the proposed DRLBTSA outperforms the baseline algorithms for the above-mentioned parameters. In future work, we will test the efficacy of DRLBTSA by deploying it on OpenStack, generating real-time workloads in the OpenStack environment to evaluate our scheduler.