1 Introduction

In recent years, cloud computing has become one of the fastest-growing technologies in IT, providing services to users from anywhere at any time [1]. It uses the internet and remote servers to manage applications and data. In a cloud environment, the Service Level Agreement (SLA) governs how resources are shared worldwide so that cloud users receive prompt service [2]. Load balancing is one of the major and most challenging problems in the cloud environment; its aim is to ensure that all computing resources are distributed effectively in order to increase resource utilization [3]. These resources are offered on demand to meet the SLA's requirements. For each application in the data center, the service providers should guarantee Quality of Service (QoS) while maintaining server utilization and energy efficiency. In the cloud, dynamic resources can be managed effectively using virtualization technologies (VM resources) that balance the workload of the entire system and schedule and allocate resources efficiently [4, 5].

The three important stakeholders of the cloud are end users, the Cloud Provider (CP), and the Cloud Developer (CD) [6]. Cloud deployments are commonly classified as private, public, and hybrid, and the commonly used service models are infrastructure, software, and platform. Cloud users must agree to the SLA defined by the CP before utilizing the services. To share resources dynamically, a Cloud Service Broker (CSB) agent is required between the cloud resources and the Cloud Service Providers (CSPs); the CSB helps select the appropriate data center to meet user requirements. Cloud providers are responsible for delivering cloud services to the users. Finally, the CD manages and satisfies the requirements of both the users and the providers [7,8,9].

In CC, many challenging issues affect the performance of cloud services across multiple nodes, but workload balancing and service brokering are the major challenges [10]. Researchers have therefore concentrated on these specific problems and developed several techniques.

The Cloud-based Multimedia Load Balancing (CMLB) approach [11] is one of the effective load-balancing methods for cloud-based multimedia systems. It fully considers the network conditions and the load on all servers, which makes it suitable for resource allocation and planning, but its inefficient use of the optimized resources results in low throughput. LB ant colony optimization [12] is a hybrid algorithm that dynamically balances the workload of the complete system while decreasing the makespan of a set of tasks; however, user tasks cannot adapt to heterogeneous processing. To overcome this problem, a cluster-based LB method [13] was introduced that performs well on heterogeneous nodes by considering the resource demands of each task and reduces overhead by clustering the machines, but it does not strike a balance between the data center, the SLA, and energy consumption. Dynamic Resource Allocation (DRA) with skewness effectively allocates resources and improves resource utilization [14]. Several issues are related to dynamic resource provisioning in cloud computing, such as resource allocation [15], profit maximization and price discrimination [16], scheduling [17], and load balancing [18]. The CSP provides resources to users on a pay-per-use basis, and different resource allocation algorithms are used to allocate resources efficiently and meet user expectations [19]. To meet user needs, Software as a Service (SaaS) providers lease resources from Infrastructure as a Service (IaaS) providers, which affects service quality because of their variable performance [20]. LB is used to confirm that all computing resources are allocated in an efficient and fair manner and ensures that no single node is overloaded; this becomes a challenge for the CP when there is a large number of Cloud Users (CUs) [21].

1.1 Problem statement

In cloud computing, when user demands arrive at the data center, the requests must be allocated to different VMs and should be distributed equally across the CC system. However, this still presents numerous problems related to performance unpredictability, resource sharing, execution time, the energy efficiency of resources, and many others. To address these conditions, we use a prediction method and load balancing within a dynamic resource allocation strategy. With prediction in the LUA, the cloud provider can easily adjust and organize more resources to meet user demand at low cost and can eliminate wasted, unused resource capacity. The CSB in the GUA has become an interesting area of research in the CC field because it can guarantee throughput, makespan, and resource use while minimizing execution time and cost in the cloud environment.

To overcome the above issues, this work focuses mainly on three factors: load balancing, resource allocation, and scheduling in VMs. Load balancing distributes the load over the nodes of the distributed system and improves resource utilization and task response time; resource allocation assigns resources and satisfies user expectations efficiently with low allocation time; and scheduling maps each task to the most suitable resource. We propose the MADRL-DRA and DOLASB algorithms to address all three factors, improve the performance of optimal multi-cloud service, minimize the cost of client requests, and develop the CSB policy. The main objectives of our work are to design a prediction model that predicts activity from client requests and to design a scalable, model-independent, and adaptive method that achieves LB and the desired degree of scheduling. With this model, we can reduce the users' cost and response time and increase throughput and resource utilization.

In the rest of the paper, we discuss the dynamic resource provisioning approach, which is an effective method of allocating resources in response to workload changes. Resource allocation is performed by the MADRL-DRA method for load detection and the DOLASB method for service optimization. The experimental results show that the proposed method increases energy efficiency and throughput and reduces the execution time and cost of user tasks compared with other approaches.

1.1.1 The contributions of this proposed work are as follows

  • A dynamic resource provisioning model called MADRL is developed to predict customer requests, and DRA provisions the cloud service based on LB to decrease response time, makespan, and the cost of customer requests.

  • The DOLASB method minimizes the cloud customers' cost while at the same time generating a profit for the CSB.

  • The combined BD-MIP algorithm achieves an optimal solution for the multi-service configuration, VM allocation, and cloud service broker optimization problems.

The remainder of the manuscript is organized as follows: Sect. 2 presents related work on LB, scheduling, and resource allocation. Sect. 3 describes the proposed method in two parts: MADRL-DRA for LB and the DOLASB-based CSB model. Sect. 4 presents the experimental results and analysis of the proposed method. Finally, Sect. 5 concludes the work based on the obtained results.

2 Related works

The surveyed papers related to LB, resource allocation, and scheduling are discussed below.

2.1 Load balancing (LB)

Zhao et al. [22] introduced an LB approach based on Bayes' theorem and clustering (LB-BC) that focuses mainly on the problem of selecting the physical hosts that will execute the requested activities. Such algorithms do not ensure high execution efficiency for subsequent tasks even though they can achieve high resource utilization. LB-BC achieves long-term LB, and the Bayes step is integrated with the clustering process to find the optimal physical host. This model minimizes the number of failed activities, but the throughput of the service is low.

Paya and Marinescu [23] used an energy-saving model to scale and balance the application load in the cloud. This model establishes the optimal energy regime and adjusts the number of servers operating within it; however, the energy, server, and throughput gains are only slight.

Chen et al. [24] described the Min–Min algorithm, which begins with the set of all pending jobs. First, the completion time of each job on each node is calculated; the job with the minimum completion time is then selected and mapped to the corresponding node, and that node's ready time is updated. This process is repeated until all unassigned jobs are assigned. The algorithm favors jobs with the shortest execution time but can lead to starvation, and it is not suitable for a dynamic environment.
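To make the steps concrete, the following is a minimal sketch of the Min–Min heuristic as summarized above; the task and node structures are illustrative assumptions, not taken from [24].

```python
# Minimal sketch of the Min-Min heuristic described above (illustrative only).
def min_min_schedule(exec_time, ready_time):
    """exec_time[j][n]: estimated run time of job j on node n.
    ready_time[n]: time at which node n becomes free."""
    unassigned = set(range(len(exec_time)))
    schedule = {}
    while unassigned:
        # completion time of each pending job on its best node
        best = {
            j: min((ready_time[n] + exec_time[j][n], n) for n in range(len(ready_time)))
            for j in unassigned
        }
        # pick the job with the overall minimum completion time
        job, (finish, node) = min(best.items(), key=lambda kv: kv[1][0])
        schedule[job] = node
        ready_time[node] = finish      # update the node's ready time
        unassigned.remove(job)         # repeat until all jobs are mapped
    return schedule

# Example: 3 jobs on 2 nodes
print(min_min_schedule([[4, 6], [2, 3], [8, 5]], [0.0, 0.0]))
```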

2.2 Scheduling

Gill and Buyya [25] introduced a framework called Self-Management of Cloud Resources for Execution of Clustered Workloads (SCOOTER) that efficiently plans the resources provided by the cloud and manages the SLA by considering self-management properties. Improving the QoS parameters can increase the quality of the cloud service. The performance of CC was evaluated with parameters such as energy consumption, cost, SLA violation rate, and resource use.

Singh and Chana [26] introduced an automatic resource planning framework based on fuzzy-logic energy-aware scheduling, which is used to plan cloud resources in the data center. The framework has been validated in a CloudSim-based simulation environment but offers lower CC service performance than other existing methods.

Ma et al. [27] presented a new scheduling algorithm based on a GA for task scheduling. Task scheduling and resource allocation are two important factors in CC. For planning, this model considers four aspects: task completion time, task cost, reliability, and bandwidth. The model also adapts the crossover and mutation operations with rules that increase service quality, but the GA does not work well on problems in which the resources are strictly bound.

Liu et al. [28] introduced an Ant Colony Optimization Algorithm (ACOA) to solve the combinatorial optimization problem behind load balancing. It not only balances the load but also minimizes the makespan, under the assumption that all tasks are mutually independent and computationally intensive. The algorithm completes the scheduling process by simulating the foraging behavior of ants. Initially, the ants choose random paths; when they reach the desired goals, they compute the fitness of each path and deposit pheromone on it in proportion to the fitness of the cloud service. Finally, the pheromone levels and the ants' choices are updated so that the ants concentrate on high-fitness paths and reach the optimal solution as often as possible.
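For illustration only, the compact sketch below captures the pheromone-biased construction and update loop in the spirit of the ACO scheduling described above; the parameter names, the cost matrix, and the makespan objective are assumptions, not the authors' implementation.

```python
# Compact, illustrative ACO-style task-to-VM assignment loop (not from [28]).
import random

def aco_assign(costs, ants=10, iters=50, rho=0.5, q=1.0):
    """costs[t][v]: cost (e.g., expected run time) of task t on VM v."""
    n_tasks, n_vms = len(costs), len(costs[0])
    tau = [[1.0] * n_vms for _ in range(n_tasks)]      # pheromone trails
    best, best_cost = None, float("inf")
    for _ in range(iters):
        for _ in range(ants):
            # each ant builds an assignment, biased by pheromone / cost
            assign = [
                random.choices(range(n_vms),
                               weights=[tau[t][v] / costs[t][v] for v in range(n_vms)])[0]
                for t in range(n_tasks)
            ]
            cost = max(
                sum(costs[t][v] for t, v in enumerate(assign) if v == vm)
                for vm in range(n_vms)
            )                                           # makespan of this assignment
            if cost < best_cost:
                best, best_cost = assign, cost
            for t, v in enumerate(assign):              # deposit pheromone
                tau[t][v] += q / cost
        tau = [[(1 - rho) * p for p in row] for row in tau]   # evaporation
    return best, best_cost

print(aco_assign([[4, 6], [2, 3], [8, 5], [3, 4]]))
```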

2.3 Resource allocation

Wei et al. [29] introduced an Imperfect Information Stackelberg Game (IISG) using the HMM method. It is used for the resource distribution function, which is modeled as a Stackelberg game, and it maximizes the revenue of both the supplier and the resource requester. Initially, the current bid of the service provider is predicted by the HMM. To ensure an optimal return for the infrastructure provider, a DRA model was proposed that supports synchronous deployment for many service providers and resources.

Pillai and Rao [30] proposed a mechanism for allocating resources in the cloud that relies on the coalition-formation and uncertainty principles of game theory. This approach avoids inter-programming complexities by solving the coalition-formation optimization problem.

Peng et al. [31] introduced a framework for monitoring, analyzing, and improving system performance. The authors implemented a neural network to transform a simulation task with an abstract description into specific resource requirements based on their qualities and quantities. A new mathematical model was introduced to represent the complex allocation of resources in a multi-tenant cloud environment, and a GA was employed to attain optimal resource allocation.

Shojafar et al. [32] introduced a new energy-efficient paradigm for adaptive resource management in fog computing that provides real-time cloud services to vehicular clients. They developed cognitive-computing-inspired scheduling, which tunes the resource configuration and the input and output traffic of the virtualized fog platform. The main aim is to increase overall communication performance and improve energy efficiency in fog computing while meeting the QoS requirements. The results show a decrease in the energy consumed by the overall NEtFc and TCP/IP connections. A review of popular LB and service brokering approaches is given in Table 1.

Table 1 Comprehensive review of previous literature based on LB and service brokering in CC

3 Proposed approach for dynamic provisioning of resources

3.1 System description

In this work, the MADRL-DRA and DOLASB methods are used as the resource management scheme in CC. The proposed CC architecture consists of the users, a multi-agent system [a local user agent (LUA) and a global user agent (GUA)], and the cloud provider. First, the LUA predicts the activities of all users using MADRL-DRA to allocate the VMs and performs LB based on the service request description. The GUA then schedules the tasks for the CSB using DOLASB. The system architecture is shown in Fig. 1.

Fig. 1
figure 1

System architecture of the proposed method

3.2 Service of local user agent (LUA)

Initially, prediction is performed in the LUA at the customer level without burdening the GUA. The customer provides a service request description consisting of the QoS parameters, the number of requested VMs, the VM configuration, the type of application, and the minimum required SMI score. The LUA decides the number of resources actually needed without causing SLA violations, avoiding the over-provisioning problem. Each agent monitors the customers assigned to it and builds a resource usage table by monitoring the required cloud resources. When a new request arrives from a customer, the agent recreates the customer's history so that the request better matches the actual service. The agent uses the MADRL model to recreate cloud client requests by predicting the amount of resources that would be wasted given the customer demand; from the MADRL model we thus obtain a candidate set of cloud services and resources. The requests produced by this recreation phase are then sent to the GUA.

3.2.1 Multi agent deep reinforcement learning—Dynamic Resource Allocation (MADRL-DRA) algorithm

The MADRL-DRA algorithm is used to predict and allocate customer data at the LUA level without burdening the GUA. The algorithm is mainly used to balance the task load. It contains two components: a prediction component and a matching component. The MADRL-DRA scheme is shown in Fig. 2.

Fig. 2
figure 2

Flowchart of proposed MADRL-DRA

3.2.1.1 Prediction component

The prediction component is a hybrid model that uses previous historical data and environmental behavior. It evaluates the environment's history to provide an assessment of future performance. It consistently achieves higher accuracy than other time-series approaches and provides a data window of a specific length for tracking model changes and feeding the matching component.
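As an illustration only, a simple exponential-smoothing predictor can stand in for this history-based prediction step; the paper does not publish its exact model, so the smoothing parameter and the demand series below are assumptions.

```python
# Illustrative stand-in for the history-based prediction component.
def predict_next_demand(history, alpha=0.5):
    """history: past resource demands (e.g., requested VMs per interval)."""
    if not history:
        return 0.0
    estimate = history[0]
    for observed in history[1:]:
        # blend the previous estimate with the newest observation
        estimate = alpha * observed + (1 - alpha) * estimate
    return estimate

# Example: demand history of a single customer, in VMs per interval
print(predict_next_demand([4, 5, 7, 6, 8]))
```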

3.2.1.2 Matching and detection component (MDC)

The matching and detection component is primarily used to detect whether the predicted data are correct. If a prediction is unsuccessful, the component supplies corrected predicted data, which are then assigned through the VM allocation function. We consider a data center with 10 machines.

Virtual machine The VM data values are created based on the priority field. The created tasks are constrained by the SLA and used to build new forecast data. Scheduling of the data is performed by several algorithms: the VM monitoring algorithm, the LB method, and the estimation of the CSP. These algorithms run in the DRA controller, which is responsible for the proper functioning of the component and for minimizing SLA damage. The pseudo code of this process is shown in Table 2.

Table 2 Pseudo code of the proposed method in LUA for predicting and load balancing all user activities

When a service request arrives at the cloud planner, the scheduler divides the tasks according to their function and then creates a work frame based on their order of priority. This assignment is represented by a VM policy, which is used to share the VMs' functional power for operating the host; the VM allocation is shared according to the CPU significance field. Once the appropriate list of VMs is created, the scheduler determines whether the load is balanced across the cloud services so that each cloud center carries a balanced load. If no suitable solution exists in the cloud data center, the scheduler checks whether any new VM resources are available in the cloud service; if so, it automatically predefines the resource sizes and launches new VMs for the service provisioning requests.

If the available resources cannot provide extra VMs because the resource allocation list is exhausted, the service request is kept in the scheduling queue until a VM obtains adequate resources. This approach helps improve response time and overall performance, and these advantages keep SLA damage low. A minimal sketch of this decision flow is given below.
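```python
# Minimal sketch of the LUA scheduling flow described above; the data
# structures, the 10-machine limit, and the capacity threshold are assumptions.
from collections import deque

def schedule_request(request, vms, pending, cpu_capacity=1.0, max_vms=10):
    """request: (task_id, cpu_demand); vms: list of dicts with 'id' and 'load'."""
    task_id, demand = request
    # prefer the least-loaded VM that can still accept the task (load balancing)
    candidates = [vm for vm in vms if vm["load"] + demand <= cpu_capacity]
    if candidates:
        target = min(candidates, key=lambda vm: vm["load"])
        target["load"] += demand
        return f"task {task_id} -> vm {target['id']}"
    # no balanced placement: launch a new VM if the data center still has room
    if len(vms) < max_vms:
        vms.append({"id": len(vms), "load": demand})
        return f"task {task_id} -> new vm {len(vms) - 1}"
    # otherwise keep the request queued until a VM obtains adequate resources
    pending.append(request)
    return f"task {task_id} queued"

vms, pending = [{"id": 0, "load": 0.7}, {"id": 1, "load": 0.4}], deque()
print(schedule_request((1, 0.5), vms, pending))
print(schedule_request((2, 0.9), vms, pending))
```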

3.3 Service of global broker agent

The GUA consists of several main parts: the optimization algorithm, the service catalog, the SMI calculator, the distribution plan, and the CPs for task planning. When a task arrives at the cloud system from the LUA, the Task Allocation Platform (TAP) selects a host to which the user task is assigned. The following steps are performed in the global broker agent:

  1. First, the CPs are ordered based on their SMI characteristics using the weighted sum model. The GUA then chooses the suppliers that meet the minimum SMI.

  2. Second, the proposed algorithm generates the optimal solution, which is used to reduce the cost of the infrastructure.

The TAP then uses these probabilities to select the host that best accepts the task and assigns it within the DRA planning model.

3.3.1 Quality of service and the Service Measurement Index (SMI)

The SMI score is a hierarchical model used for the various services in CC, organized into categories and attributes. The attributes are rated from zero to ten, with zero being the lowest SMI score. The formulas for assessing the measures and attributes are defined by the CSMIC, which also provides a performance indicator, the Key Performance Indicator (KPI), used for multi-cloud services. CSPs satisfy the SMI type of consumer demand, and the attribute values defined by the CSMIC are evaluated in the cost-optimization program. Based on the KPI indicator, the SMI divides the attributes into the following categories.

  • Accountability This attribute is a collection of QoS measures that capture the particular characteristics of a CSP; these properties may be independent of the cloud service itself. It is used to evaluate CSP compliance, accountability, and security.

  • Usability This attribute reflects how quickly a service can be adopted in the cloud. Its performance depends on learnability, installability, and accessibility.

  • Agility In CC, this attribute is more advanced than the other attributes measured by the SMI. It denotes the ability of customers to quickly change vector, strategy, or tactics with minimal losses.

  • Performance The SMI serves various organizations (business and academic) in assessing CSPs. This attribute directly reflects the behavior of the provided services, represented by their accuracy and stability.

  • Security and privacy These attributes are very important in CC because securing the hosted data is a major problem for CSPs. This is a multi-dimensional service attribute covering services, physical facilities, service data, and access control.

  • Financial In the SMI, the cost of the CSP is weighed against the quality metrics, because cloud user cost is a serious concern for IT organizations.

The SMI performance value in CC is measured in two steps, namely the weight calculation and the SMI score calculation, described below (a brief code sketch of the aggregation follows the list):

  (i) Calculation of relative weight This calculation is used to weight the SMI attributes. The weights are assigned either according to a predefined structural system or as relative weights based on consumer preferences; for customization, the user can either apply a pairwise evaluation method or provide direct arbitrary weights. The local significance vector is calculated from the pairwise-comparison elements to obtain the best solution using Eq. (1),

    $$Pu = \lambda_{\hbox{max} } u$$
    (1)

    where \(\lambda_{\hbox{max} }\) denotes the principal eigenvalue of the pairwise-comparison matrix P and u denotes the SMI weight vector.

    The customer-assigned weight values are compared with the other results to obtain normalized weights. If \(vu_{y}\) is the weight assigned to category y, the SMI category weight \(u_{y}\) is measured as follows:

    $$u_{y} = \frac{{vu_{y} }}{{\sum\nolimits_{y} {vu_{y} } }},\forall y$$
    (2)

    where \(vu_{y}\) denotes the weight allotted to category y, and y ranges over the set of SMI categories.

    Let \(vu_{yx}\) represent the user-allotted weight for attribute x in category y and \(vu_{yn}\) the weight for measure n; \(u_{x}\) and \(u_{q}\) then denote the corresponding calculated and normalized weights.

  (ii) Calculation of SMI score Let \(P_{qx}\) represent attribute x of the SMI for cloud service provider q, and let \(P_{qxn}\) denote the score of the lower-level SMI measure n under attribute x. With \(u_{xn}\) as the normalized weight of each measure, the attribute score is calculated by Eq. (3):

    $$P_{qx} = \sum\limits_{n} {u_{xn} } P_{qxn} ,\,\forall (q,x)$$
    (3)

    Let \(P_{qy}\) represent the SMI score of category y for service provider q. It is evaluated by weighting the attribute scores \(P_{qx}\) of the attributes x belonging to category y:

    $$P_{qy} = \sum\limits_{x} {u_{yx} } P_{qx} ,\forall \left( {q,y} \right)$$
    (4)

    The total SMI score over all categories for each CP is evaluated by Eq. (5),

    $$Rp_{q} = \sum\limits_{y} {u_{y} } P_{qy} ,\forall q$$
    (5)

    Let \(Rp_{q}\) represent the complete SMI score for CP q; the weighted method combines the individual category scores \(P_{qy}\) into the total score \(Rp_{q}\).
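As an illustration, the weighted aggregation of Eqs. (2)–(5) can be sketched as follows; the category names, raw attribute scores, and weights below are invented for the example.

```python
# Illustrative sketch of the weighted SMI aggregation in Eqs. (2)-(5).
def normalize(weights):
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}          # Eq. (2)

def smi_score(category_weights, attribute_weights, attribute_scores):
    """attribute_scores[y][x]: raw 0-10 score of attribute x in category y."""
    u_y = normalize(category_weights)
    total = 0.0
    for y, attrs in attribute_scores.items():
        u_x = normalize(attribute_weights[y])
        p_qy = sum(u_x[x] * attrs[x] for x in attrs)            # Eqs. (3)-(4)
        total += u_y[y] * p_qy                                  # Eq. (5)
    return total

scores = {"performance": {"accuracy": 8, "stability": 6},
          "financial": {"cost": 7}}
weights_cat = {"performance": 2, "financial": 1}
weights_attr = {"performance": {"accuracy": 1, "stability": 1},
                "financial": {"cost": 1}}
print(smi_score(weights_cat, weights_attr, scores))   # total SMI score Rp_q
```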

3.3.2 Optimization algorithms

In the optimization algorithm, the physical resources of the multi-cloud are optimized using MIP and the Benders decomposition algorithm. First, the linear optimization problem is solved by MIP; this formulation is used to minimize the cost of resources in the multi-cloud. The Benders decomposition algorithm then employs a number of solvers to resolve the modelling problems and to solve the complicated mathematical problems in the MIP formulation.

3.3.2.1 Mixed integer programming algorithm (MIP)

The MIP algorithm is used to solve small instances of the optimization problem with a mathematical model; it can optimize both the subproblems and the fixed problems. To simplify the problem, we first set the initial values of the binary variables and, with these values fixed, treat the remaining nonlinear problems. We then set the configuration rate and frequency for all nonlinear problems and, second, adjust the frequency and configuration rate for the continuous service process. Finally, each fixed problem is evaluated using the rate and frequency values to find the best overall solution.

The general form of the objective function is given in Eq. (6):

$$\sum\limits_{iqm} {C_{iqm}^{o} } a_{iqm} + \sum\limits_{iqm} {C_{iqm}^{f} } a_{iqm} + \sum\limits_{q} {E_{q} } c_{q}$$
(6)

where \(C_{iqm}^{o}\) denotes the pay-per-use price of VM class i offered by cloud provider q at data-center location m, and \(C_{iqm}^{f}\) denotes the corresponding flat-rate price. \(E_{q}\) denotes the fixed cost charged by CP q for additional cloud services, \(c_{q}\) indicates whether CP q is selected, and \(a_{iqm}\) denotes the number of VMs provisioned. The goal is to minimize the overall cost of distributing the VMs among the various CPs, subject to the demand constraint in Eq. (7):

$$\sum\limits_{q} {a_{iqm} } \ge F_{im} ,\forall (i,m)$$
(7)

where \(a_{iqm}\) denotes the number of VMs of class i provisioned and \(F_{im}\) denotes the number of VMs of class i required by users for execution at data-center location m. The supply constraint limiting the VMs to the capacity offered by each CP is given by,

$$\begin{gathered} a_{iqm} \le B_{iqm} ,\forall \left( {i,q,m} \right) \hfill \\ \sum\limits_{q} {c_{q} } \le n \hfill \\ \end{gathered}$$
(8)

where \(c_{q}\) indicates whether CP q is selected (with at most n providers chosen) and \(B_{iqm}\) denotes the maximum capacity of VM class i offered by CP q at data-center location m. The compliance guarantees, i.e., the legal and regulatory provisions offered by the supplier, are enforced by Eq. (9).

$$c_{jqm} \ge vc_{jm} ,\forall (j,q,m)$$
(9)

where \(c_{jqm}\) indicates the compliance of CP q at location m with regulation j, and \(vc_{jm}\) denotes the minimum required compliance level in the SMI categories.

The minimum and maximum numbers of VMs that a CP can provide at location m are limited by Eq. (10), and the SMI category scores are constrained by Eq. (11).

$$\underline{{N_{jqm} }} c_{q} \le a_{jqm} \le \overline{{N_{jqm} }} c_{q} ,\forall (j,q,m)$$
(10)
$$P_{qz} \ge v_{z} ,\forall (q,z)$$
(11)

where \(P_{qz}\) denotes the SMI score of category z for cloud service provider q. Equations (12) and (13) ensure that the CSP's SMI ratings and characteristics exceed the consumer's requirements.

$$p_{qb} \ge v_{b} ,\forall (q,b)$$
(12)

where \(p_{qb}\) denotes the SMI score of category b for cloud service provider q.

$$Rp_{q} \ge v_{sp} ,\forall q$$
(13)

where \(Rp_{q}\) denotes the total SMI score of cloud provider q and \(v_{sp}\) represents the minimum SMI score required of a CP. Equation (14) restricts the provisioning variables to non-negative integer values, and the binary selection variables are defined by Eqs. (15) and (16).

$$a_{iqm} \in {\rm M}_{0} ,\forall (i,q,m)$$
(14)
$$c_{q} \in \left\{ {0,1} \right\},\forall q$$
(15)
$$c_{iqm} \in \left\{ {0,1} \right\},\forall (i,q,m)$$
(16)

Standard MIP modelling languages are used to solve the optimization problem with either local solvers or commercial cloud solvers. This model is joined with the current CB through a Web-service-based interface. An illustrative sketch of this formulation is given below.
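As an illustration only, the following sketch expresses the core of Eqs. (6)–(8) with the open-source PuLP modeller; all prices, demands, and capacities are invented, coupling the capacity to provider selection is an added modelling assumption, and the SMI and compliance constraints of Eqs. (9)–(16) are omitted for brevity.

```python
# Illustrative PuLP sketch of the VM-provisioning MIP in Eqs. (6)-(8).
import pulp

classes, providers, locations = ["small", "large"], ["cp1", "cp2"], ["eu", "us"]
keys = [(i, q, m) for i in classes for q in providers for m in locations]
price_o = {k: 0.10 for k in keys}               # pay-per-use price C^o
price_f = {k: 0.06 for k in keys}               # flat-rate price C^f
fixed_cost = {"cp1": 5.0, "cp2": 4.0}           # E_q
demand = {("small", "eu"): 3, ("small", "us"): 2,
          ("large", "eu"): 1, ("large", "us"): 2}   # F_im
capacity = {k: 4 for k in keys}                 # B_iqm

prob = pulp.LpProblem("vm_provisioning", pulp.LpMinimize)
a = pulp.LpVariable.dicts("a", keys, lowBound=0, cat="Integer")   # VMs provisioned
c = pulp.LpVariable.dicts("c", providers, cat="Binary")           # provider selected

# Objective, Eq. (6): on-demand cost + flat-rate cost + fixed provider cost
prob += (pulp.lpSum(price_o[k] * a[k] for k in keys)
         + pulp.lpSum(price_f[k] * a[k] for k in keys)
         + pulp.lpSum(fixed_cost[q] * c[q] for q in providers))

# Demand, Eq. (7): enough VMs of each class at each location
for (i, m), f_im in demand.items():
    prob += pulp.lpSum(a[(i, q, m)] for q in providers) >= f_im

# Capacity, Eq. (8): a provider only supplies VMs if it is selected (assumption)
for (i, q, m), b in capacity.items():
    prob += a[(i, q, m)] <= b * c[q]

prob.solve()
print(pulp.LpStatus[prob.status], pulp.value(prob.objective))
```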

3.3.2.2 Benders decomposition method (BD)

This approach is used to solve the complexity of the mathematical programming problem. The MIP problem is divided into a Main Problem (MP) and Sub-Problems (SPs): the main problem contains the integer variables and the sub-problems contain the continuous variables, and the MP and the many sub-problems can be solved in parallel. Classic Benders decomposition solves a sequence of sub-problems and the MP; it retains the variables and constraints of the MIP, and the algorithm iterates between the main problem and the sub-problems.

The first MP, M(b, n = 0), contains no Benders cuts (i.e., n = 0) and is given as follows:

$$\sum\limits_{q} {E_{q} } c_{q}$$
(17)

where \(E_{q}\) is the static user cost of CP q and \(c_{q}\) indicates whether CP q is selected. The part of the problem to be dualized (i.e., the linear formulation), equivalent to the inner problem, can be written as follows.

$$\sum\limits_{iqm} {a_{iqm} } C_{iqm}^{o} \, + \,\sum\limits_{iqm} {a_{iqm} } \,C_{iqm}^{f}$$
(18)

where \(a_{iqm}\) is a continuous variable, \(C_{iqm}^{f}\) denotes the flat rate of the VMs, and \(C_{iqm}^{o}\) denotes the pay-per-use cost of the VMs.

By introducing the dual variables \(\sigma_{im}\) and \(\pi_{iqm}\) corresponding to the two constraints, the dual formulation S(σ, π|c) of the problem can be written as follows.

$$\sum\limits_{im} {\sigma_{im} } F_{im} + \sum\limits_{iqm} {\pi_{iqm} } B_{iqm}$$
(19)

where \(B_{iqm}\) is generated using pseudo-random values and \(F_{im}\) denotes the number of VMs of class i required by users for execution at location m. At each iteration, the cut obtained from the objective of the sub-problem P(π|b, σ) is added to the MP, as expressed in Eq. (20), where l denotes the auxiliary variable that bounds the sub-problem cost.

$$\sum\limits_{im} {\sigma_{im} } F_{im} + \sum\limits_{iqm} {\pi_{iqm} } B_{iqm} \le l$$
(20)

The objective of the main problem is then augmented with this auxiliary variable, as given in Eqs. (21) and (22).

$$\sum\limits_{q} {E_{q} } c_{q} + l$$
(21)

where \(E_{q}\) is the static user cost of CP q, \(c_{q}\) indicates whether CP q is selected, and l is the auxiliary variable bounding the sub-problem cost.

$$\sum\limits_{im} {\sigma_{im} } F_{im} + \sum\limits_{iqm} {\pi_{iqm} } B_{iqm} \le l,\forall y$$
(22)

The MP N(c, l) is completed by adding the Benders cuts y generated so far to the initial MP N(c, l = 0).

Finally, the job characteristics dynamically arrange the virtual resources and increase resource utilization. Jobs are executed in priority order, and the set of VM resources is assigned from the highest-priority to the lowest-priority job. When a high-priority job arrives, it is allowed to run on a VM; if no suitable VM is available, the algorithm converts a low-priority job into a lease-type job, and the high-priority work is allowed to run on the resources preempted from it. Once the other jobs on the VMs complete, the initially suspended lease-type work is restarted and its result is sent to the CP.

Table 3 shows the scheduling and optimization for the VMs, and Fig. 3 shows the steps for optimizing the MIP problem in CC. Here, we use the Benders decomposition algorithm within MIP to solve the scheduling problem much faster than other methods. It separates the MIP problem into two parts: (i) the master problem (the optimization problem) and (ii) the sub-problem (the decision problem). In the optimization problem we consider makespan, response time, and user cost, and in the decision problem we consider the number of tasks, the processors, and the memory size; this is the central problem when scheduling activities on the VMs. In Benders decomposition, the master problem generates a solution in the form of binary variables and passes it to the sub-problem. If the sub-problem is infeasible or unbounded, a Benders cut is generated and added to the MP to drive it towards a complete solution. This process is repeated until the upper bound (Ub) and the lower bound (Lb) meet, providing an optimal solution to the MIP problem. The goal is to find feasible solutions that reduce the makespan or minimize the total user cost. A minimal sketch of this iteration loop follows.
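The sketch below is an abstract illustration of the Benders iteration described above; the master and sub-problem solvers are placeholders, not the authors' implementation.

```python
# Abstract sketch of the Benders decomposition loop (BD-MIP) described above.
def benders_loop(solve_master, solve_subproblem, tol=1e-6, max_iter=100):
    """solve_master(cuts) -> (binary_solution, lower_bound)
    solve_subproblem(binary_solution) -> (cut, upper_bound_candidate)"""
    cuts, lb, ub = [], float("-inf"), float("inf")
    y = None
    for _ in range(max_iter):
        y, lb = solve_master(cuts)                # master: choose providers (binary vars)
        cut, ub_candidate = solve_subproblem(y)   # subproblem: price the VM placement
        ub = min(ub, ub_candidate)
        if ub - lb <= tol:                        # bounds meet -> optimal for the MIP
            return y, ub
        cuts.append(cut)                          # add the Benders cut and iterate
    return y, ub
```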

Table 3 Pseudo code of the proposed method in GUA for scheduling and optimizing the VM activities
Fig. 3
figure 3

Optimization using bender decomposition for MIP problem (BD-MIP)

3.4 Complexity analysis

This section describes the complexity of load balancing and service brokering in CC. To analyse the complexity of the MADRL-DRA and DOLASB methods for LB and the CSB, let m be the total number of VMs assigned in CC and n the number of user tasks in the data center. The user task allocation process starts in Algorithm 2 (step 4) and is executed in priority order, so the priority allocation runs in \(O\left( m \right)\) time, and the final complexity of task allocation with the proposed method is \(O\left( {mn^{2} } \right)\). This allocation process is called by Algorithms 2 and 3 in Table 2.

After the activities are assigned, the VMs are scheduled according to Algorithm 1 (Table 3), step 2, and the time required for LB in CC is analysed. Based on the LB processing, the time is expressed as \(\theta (n) = \left[ {\frac{n}{2}} \right] + (mn) + T\), where T is the total time for the LB computation, m is the total number of VMs assigned in CC, and n is the number of users in the data center. Because \(\theta\) is assessed for the worst and best cases, the lowest-order terms are neglected, so the complexity of scheduling is \(O\left( {\frac{n}{2}} \right)\). The complexity of the proposed BD-MIP in Algorithm 3 (Table 3) is determined by step 5 (the algorithm is bounded by the ordering of the tasks), which requires \(O\left( {n\log n} \right)\) time for sorting. The overall time complexity of our proposed method is therefore \(O\left( {\log \frac{mn}{2}} \right)\).

3.5 Statistical analysis

The statistical analysis of the proposed method is performed with the ANOVA test, which is also used to analyse the mean and variance of the other existing methods. To evaluate the statistical performance of the proposed method, we take three measures, namely the Standard Deviation \(\left( \sigma \right)\), the Maximum Load \(\left( {L_{\hbox{max} } } \right)\), and the Minimum Load \(\left( {L_{\hbox{min} } } \right)\), described below:

3.5.1 Standard deviation \(\left( \sigma \right)\)

The standard deviation measures the spread of the set of workloads around the average workload \(\left( \mu \right)\).

$$\sigma = \sqrt {\frac{1}{m - 1}\sum\limits_{j = 1}^{m} {\left( {\mu - Wl_{j} } \right)^{2} } }$$
(23)

where m denotes the total number of VMs and \(\mu\) represents the average workload, defined as the ratio between the number of user tasks and the number of VMs.

3.5.2 Maximum load \(\left( {L_{\hbox{max} } } \right)\)

The maximum load over all VMs is given by the following expression:

$$l_{\hbox{max} } = \hbox{max} \left( {Wl_{j} } \right),1 \le j \le m$$
(24)

where a highly balanced workload should have \(l_{\hbox{max} } = \mu\).

3.5.3 Minimum load \(\left( {L_{\hbox{min} } } \right)\)

The minimum load over all VMs is given by the following expression:

$$l_{\hbox{min} } = \hbox{min} \left( {Wl_{j} } \right),1 \le j \le m$$
(25)

where a highly balanced workload should have \(l_{\hbox{min} } = \mu\).
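For illustration, the sketch below computes the three balance metrics of Eqs. (23)–(25) for a made-up per-VM workload vector; the sample values are not from the experiments.

```python
# Illustrative computation of the balance metrics in Eqs. (23)-(25).
import statistics

def balance_metrics(workloads):
    """workloads[j]: number of tasks currently placed on VM j."""
    mu = sum(workloads) / len(workloads)              # average workload
    sigma = statistics.stdev(workloads)               # Eq. (23), sample std dev
    return {"mu": mu,
            "sigma": sigma,
            "l_max": max(workloads),                  # Eq. (24)
            "l_min": min(workloads)}                  # Eq. (25)

print(balance_metrics([12, 9, 15, 10, 14]))   # ideally l_max == l_min == mu
```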

4 Experimental and result

This section discusses the experiments designed to evaluate LB, resource allocation, and planning in CC. The proposed technique is analysed using basic parameters such as makespan, throughput, power consumption, and resource utilization; these metrics are used to assess the operation of the proposed algorithms. The framework provides a unified view of the cloud management configuration, and the job planner can reduce the user requests generated for VM events. We chose a work plan of 500 workloads that reflects the demand for cloud resources. The virtual users' requests are processed in this test to assign the pre-estimated number of desired VMs. Tasks are sent to the VMs for execution, and jobs are executed in priority order: job prediction is based on the deadline, and a low-priority activity is executed only after the more important jobs. If there are errors in the results, they are returned to the user.

The GUA metrics, evaluated in the CloudSim environment, are accuracy, average response time, and blocking probability; these are used to measure the performance of the CSB and of resource allocation. The LB metrics of the LUA are the evaluation interval, response time, the number of task migrations, and the degree of imbalance.

In this paper, our simulation includes a number of data centers and a number of VMs running on physical hosts. Table 4 shows the system parameter settings. We used the two algorithms, in the LUA and in the GUA, to examine how data execution time and response time vary as the scenario changes with an increasing number of activities, and we simulated both scheduling cases.

Table 4 Simulation parameter settings

The local user agent handles a large number of user tasks, each carrying different information in the description part of the service request. The cloud service description includes the data or accessibility zones, the VMs, historical data, optimization criteria, the required level of the SMI attributes, and so on. These tasks are split into 70% for training and 30% for testing. The MADRL algorithm is used to predict user demand based on the attributes described in the table above.

Because real user tasks were unavailable, synthetic data were created using the uniform distribution function available in the programming language. For validation, the number of CPs is loaded with a set of providers; i denotes the set of Virtual Machines (VMs), m denotes the set of cloud locations, and set j is loaded with the number of locations across nine VMs. The VM user demand parameters are shown in Table 5.

Table 5 The parameter of user task (VM) demand \(B_{iqm}\)

The parameter \(B_{iqm}\) is drawn from a distribution between the lower and upper bounds. The service-provider parameter \(p_{qz}\) of each user task is generated from a normal distribution between the upper and lower bounds, and the cost of each user task is determined using a uniform distribution between the lower and upper bounds.

The optimal Benders decomposition solution of Eq. (17) is computed over both the static and the dynamic variables for the deployment plan given in Table 6. If \(a_{iqm}\) is positive, the user task cost is automatically reduced for the non-basic variables and the constraint is tightened; if \(a_{iqm}\) is negative, the basic variables are non-binding and the constraints are relaxed.

Table 6 The implementation plan of the VM

Table 7 shows the objective results of the test experiments conducted with different numbers of VMs and CPs and different numbers of user tasks, with lengths of 1246–10,000 MI, in dynamic cloud environments. The user cost decreases and the processing speed increases as the number of VMs in the LUA grows.

Table 7 Objective results of test experiments

The comparison of the time complexity of our proposed method is shown in Table 8. The task allocation, LB, scheduling, and CSB optimization algorithms have lower time complexity than the other existing algorithms. The complexity of the existing algorithms (Proactive SSLB, RALBA, and METC) depends on the number of iterations required in the CC process [38, 34, 42]. Our proposed method has lower time complexity because the existing algorithms scale polynomially with the numbers of users and tasks.

Table 8 Performance evaluation on computational complexity

4.1 Comparison of different scheduling algorithm

This section discusses various task scheduling algorithms and the related parameters given in Figs. 4, 5, 6, 7, 8, 9 and 10. We compare both dynamic and static scheduling algorithms against the proposed algorithm. The performance comparison is given in Table 9.

Fig. 4
figure 4

Comparison of makespan between the proposed method and the existing methods

Fig. 5
figure 5

Comparison of resource utilization between the proposed technique and the existing methods

Fig. 6
figure 6

The energy efficiency is compared with the existing algorithms

Fig. 7
figure 7

Evaluation of waiting time between the proposed approach and the existing method

Fig. 8
figure 8

Comparison of throughput between the proposed method and existing methods

Fig. 9
figure 9

Execution time is compared with some other existing methods

Fig. 10
figure 10

Comparison of energy consumption between the proposed method and existing methods

Table 9 Comparison based on Performance Evaluation of a various algorithm

4.1.1 Makespan of CS

The makespan measures the total time of the job scheduling performed in the global agent. The makespan result is compared with prior algorithms and is computed as follows,

$$Makespan\, = \,final\,execution\,time\,{-}\,Starting\,time$$
(26)

Figure 4 compares the makespan of the proposed method with existing algorithms. For a fixed set of tasks, the proposed method achieves a makespan of 550 s, which is better than the 788 s of the existing ETVMC method [39]. We can see that the proposed method offers comparatively better performance than the ETVMC algorithm for all types of workload.

4.1.2 Resource utilization of CS

Utilization is measured from the memory and CPU utilization in the CC. During this process, resource utilization increases as the number of workloads increases. The equation for resource utilization is given below:

$$Resource\,utilization\, = \,\sum\limits_{i = 1}^{n} {\frac{Actual\,time\,spent\,by\,resource\,to\,execute\,workload}{Total\,resource\,up\,time}}$$
(27)

The utilization performance is shown in Fig. 5. The resource utilization of the proposed approach (99%) is better than that of the existing Min–Min and Max–Min approaches, at 70% and 73% respectively; the proposed method is thus 29 points higher than the existing Min–Min algorithm [40]. The resource utilization of the VMs is smooth when using the proposed technique, indicating that the workload is most consistent across all VMs with high utilization. The disadvantage of Min–Min is that when the number of VMs is high the utilization rate is high, but when the number of VMs is low the utilization also falls; the behavior of the Max–Min algorithm is the opposite. Therefore neither the Min–Min nor the Max–Min algorithm performs well for dynamic CC.

4.1.3 Energy efficiency of CS

The energy efficiency of the proposed planning is evaluated for real-time independent aperiodic requests, with the aim of saving energy in the cloud center and successfully increasing the workload level. It is given as follows,

$$Energy\,efficiency\, = \,\sum\limits_{i = 1}^{n} {\frac{Number\,of\,workload\,executed\,in\,a\,datacenter}{Total\,energy\,consumed\,to\,executed\,those\,workload}}$$
(28)

Figure 6 shows that the energy efficiency of DRA is 32% when compared with existing methods such as ORRA, OLS-OSPE, and EAR-OSPE [41]; the proposed technique is 10% higher than the previous algorithms.

4.1.4 Waiting time of CS

The waiting time is calculated as the average of the total waiting times of all activities, as shown in Eq. (29). It includes the time required to map the activities to VMs and the VM migration time.

$$Waiting \, time\, = \,\sum\limits_{j = 1}^{m} {\frac{{WP_{j} - WQ_{j} }}{m}}$$
(29)

where \(WP_{j}\) is the start time of workload execution, \(WQ_{j}\) denotes the workload submission time, and \(m\) represents the number of workloads.

Figure 7 compares the waiting time of the proposed method with two other methods. The waiting time of the DRA algorithm is 75 s, while Max–Min takes 299 s and Min–Min 301 s; the waiting-time behavior of the three algorithms diverges as the number of tasks increases [42]. In this comparison, the proposed concept has the lowest waiting time. The existing methods have two disadvantages: first, the Min–Min method shows rising waiting times when the jobs become very demanding; second, Max–Min performs better than Min–Min but still worse than the proposed method, because its waiting time deteriorates as the task level increases.

4.1.5 Throughput of CS

In CC, work efficiency is measured by throughput, which relates the total number of completed jobs to the workload execution time. It is defined as follows;

$$Throughput\, = \,\frac{Total\,execution\,\,time}{Total\,number\,of\,workload}$$
(30)

Figure 8 compares the throughput of the proposed work with existing algorithms. The proposed method achieves a higher value (5) than the existing PSO (2.9) and TRS (1.5) methods. The throughput of the proposed method is compared with the Eco-aware online algorithm, PSO, and TRS; the performance of these previous methods depends on the energy level of the cloud service in the data center [43]. The Eco-aware online algorithm admits incoming actions within the selected range when the grid value is insignificant within the corresponding delay limits, increasing the energy available in the cloud data center for every time interval. The PSO algorithm is simple and fast, but it requires selective jobs because the center has limited energy, and the other existing algorithms also face limited energy in the cloud data center. The comparison therefore shows that DRA performs better than the other algorithms.

4.1.6 Execution time of CS

The execution time of the proposed model is reduced, which increases the throughput of the activity-planning structure. This metric is defined as the time taken to finish the task scheduling within a particular time period. It is analysed by,

$$Execution \, time\, = \,\sum\limits_{j = 1}^{m} {\frac{{WA_{j} - WQ_{j} }}{m}}$$
(31)

where m represents the number of workloads, \(WA_{j}\) represents the completion time of the workload, and \(WQ_{j}\) is the submission time of the workload.

Figure 9 compares the execution time of the proposed and existing methods. The average execution time of DRA is 100 s, of HYSARC 480 s, and of CMMS 500 s. The existing methods converge slowly because they get trapped in local optima [44]. Our proposed model requires a lower execution time than the existing algorithms, which are greedy methods whose strategies lead to poorer output compared with the proposed method.

4.1.7 Energy consumption of CS

Energy consumption is the amount of energy used; the aim is to minimize the energy consumed in CC. In the CC environment, energy consumption is affected when the workload increases and the energy level decreases, which causes cloud-service problems during the cloud brokering period. The energy consumption is calculated by Eq. (32).

$$Energy \, consumption\, = \,\left( {PC_{\hbox{max} } - PC_{\hbox{min} } } \right) \times U_{t} + PC_{\hbox{min} }$$
(32)

where \(PC_{\hbox{max} }\) represents the maximum power consumption at high load, \(PC_{\hbox{min} }\) denotes the lowest power consumption in idle mode, and \(U_{t}\) is the resource utilization at time t.

Figure 10 shows that the energy consumption of DRA decreases to 3.8 × 10^7 J, whereas the existing methods rise to 5.0 × 10^7 J [45]. The energy consumption of the proposed method is compared with Adaptive Heuristic for Dynamic VM Consolidation–Median Absolute Deviation (AHDVC-MAD) and Adaptive Heuristic for Dynamic VM Consolidation–Interquartile Range (AHDVC-IQR), which are among the outstanding heuristic-based dynamic consolidation algorithms. The comparison shows that fixed VM resource algorithms have markedly more expensive allocation policies than the dynamic consolidation algorithms, and the DRA algorithm is the most outstanding owing to a significant decrease in the level of SLA violations.
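For illustration only, the sketch below evaluates the scheduling metrics of Eqs. (26), (27), (29), and (31) on a hypothetical per-workload trace; the record fields and numbers are assumptions, not the experimental data.

```python
# Illustrative computation of the scheduling metrics in Eqs. (26)-(31).
def scheduling_metrics(trace, resource_up_time):
    """trace: list of dicts with submit (WQ), start (WP), finish (WA) times."""
    m = len(trace)
    makespan = max(w["finish"] for w in trace) - min(w["start"] for w in trace)    # Eq. (26)
    utilization = sum(w["finish"] - w["start"] for w in trace) / resource_up_time  # Eq. (27)
    waiting = sum(w["start"] - w["submit"] for w in trace) / m                     # Eq. (29)
    execution = sum(w["finish"] - w["submit"] for w in trace) / m                  # Eq. (31)
    return {"makespan": makespan, "utilization": utilization,
            "avg_waiting": waiting, "avg_execution": execution}

trace = [{"submit": 0, "start": 1, "finish": 5},
         {"submit": 0, "start": 2, "finish": 7},
         {"submit": 3, "start": 5, "finish": 9}]
print(scheduling_metrics(trace, resource_up_time=10.0))
```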

4.2 Statistical analysis of variance (ANOVA) significant test

The statistical analysis is performed to estimate the significance of the results by conducting an ANOVA test on the experimental data. The results of the tests are given in Tables 10, 11, and 12, which show that the experiments based on the standard deviation \(\left( \sigma \right)\), the Maximum Load \(\left( {l_{\hbox{max} } } \right)\), and the Minimum Load \(\left( {l_{\hbox{min} } } \right)\) achieve a confidence level of 97%. The analysis is used to find the mean of the experimental results over the different groups of VMs and to state whether the algorithms show any statistical difference among the VMs. The ANOVA test involves two hypotheses, the null hypothesis (H0) and the alternative hypothesis (H1), defined as:

$$H_{0} = \mu_{1} = \mu_{2} = \mu_{3} = \ldots = \mu_{n}$$
(33)
$$H_{1} :\,the\,group\,means\,are\,not\,all\,equal$$
(34)

In the ANOVA test, differences in the measured values are judged statistically significant from the F statistic and the probability value P. If the F statistic is less than the F critical value, the null hypothesis cannot be rejected, meaning that the mean values of the VM groups are considered equal; if the F statistic is greater than the F critical value, the null hypothesis is rejected in favour of the alternative hypothesis. We conduct the ANOVA test for each VM group on different workload values (500–1500), assuming alpha = 0.05. Based on this test, we can conclude whether there is a significant difference between the groups of VMs. The VMs are analysed with respect to the standard deviation, the minimum load, and the maximum load, evaluated from Eqs. (23), (24), and (25). In Tables 10, 11, and 12, SS indicates the sum of squares, \(df\) the degrees of freedom, and \(MS\) the mean square.
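As an illustration of this test, the sketch below runs a one-way ANOVA on per-policy standard-deviation samples with scipy.stats.f_oneway; the sample values are made up, not the measurements in Tables 10–12.

```python
# Illustrative one-way ANOVA over three load-balancing policies (made-up data).
from scipy.stats import f_oneway

random_lb   = [4.1, 3.8, 4.5, 4.0]   # sigma per workload level, Random policy
round_robin = [3.2, 3.0, 3.5, 3.1]   # sigma per workload level, RR policy
proposed    = [1.1, 1.0, 1.3, 1.2]   # sigma per workload level, proposed method

f_stat, p_value = f_oneway(random_lb, round_robin, proposed)
alpha = 0.05
print(f"F = {f_stat:.2f}, P = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0: the group means are not all equal")
else:
    print("Fail to reject H0: no significant difference detected")
```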

Table 10 ANOVA test for random, RR and proposed algorithm in terms of standard deviation \(\left( \sigma \right)\) of VM
Table 11 ANOVA test for random, RR and proposed algorithm in terms of maximum load \(\left( {l_{\hbox{max} } } \right)\) of VM
Table 12 ANOVA test for random, RR and proposed algorithm in terms of Minimum Load \(\left( {l_{\hbox{min} } } \right)\) in VM

As seen in Tables 10, 11, and 12, the F statistic is greater than the F critical value, so the null hypothesis is rejected, and the probability of the F statistic (the P value) is less than the alpha value (0.05).

This outcome shows that the mean values of the compared algorithms are not equal, as reported in Table 13, which implies that the performance of the proposed algorithm is better than that of the other algorithms [34, 38].

Table 13 Comparison of statistical analysis for proposed and existing method

5 Conclusion

In this work, a new dynamic resource provisioning approach is used to reduce the problems that occur in cloud services. The MADRL-DRA algorithm is used in the local user agent to forecast the activities of the users' requests based on the service request description and in the DRA controller to reduce SLA damage; using these predicted data, the LB process is performed. The result of the local agent is then used to schedule the available tasks in the global broker agent, between the CP and the CU, using the DOLASB method. The results and experimental analyses are simulated on the CloudSim platform. In the local broker agent, forecast accuracy and average response time are the performance metrics used to evaluate task prediction; makespan, response time, workload imbalance, and the number of task migrations are the metrics for LB; and the allocated resources are evaluated using throughput, usage, time, and energy. We provide a comparison with state-of-the-art methods to demonstrate the importance of the proposed method.

Our proposed method uses a reinforcement learning approach and an optimal load-aware scheduling algorithm for LB and service brokering. This work can be extended with a new secure scheduling scheme with NUMA-aware scheduling; in particular, a principled way to characterize workloads and map resources is needed. The proposed method uses a symmetric approach based on cost analysis of the VMs, and an improved version can be investigated in future work.