1 Introduction

Cloud computing refers to applications delivered as services over the Internet (Software as a Service, SaaS) together with the datacenter hardware and system software that provide those services. When the service itself is what is sold, cloud computing is also called utility computing. Amazon Web Services, Microsoft Azure, and Google App Engine are a few examples of cloud service providers. Using the cloud, end users can access data anytime, anywhere [1, 2].

One of the best features of cloud computing is resource elasticity, which lets business customers scale their resource utilization up and down as needed without investing in licensed software and infrastructure. Three cloud service models are widely used: Software as a Service (SaaS), which provides all support online and therefore needs no installation at client sites; Infrastructure as a Service (IaaS), which provides infrastructure such as storage, network, and CPUs on demand on a rental basis; and Platform as a Service (PaaS), which is a set of tools and services designed to make coding and deploying applications quick. Four cloud deployment models are discussed in the literature: private, public, community, and hybrid cloud [3].

One of the key features of cloud computing is virtualization technology, which abstracts the physical infrastructure through a virtual machine monitor (VMM), or hypervisor [4]. The VMM maps virtual machines (VMs) to physical resources and allows several VMs, or guest operating systems, to share a single physical machine (PM) securely and fairly.

Virtualization also helps to relocate VMs onto a minimum number of PMs, so that the number of active PMs can be reduced. This approach is called server consolidation. When some servers receive many user requests while all other servers in the datacenter receive only a few, the system is unable to balance the load; this is known as the load balancing problem [5, 6]. It can be solved by scheduling requests in a sequence that uses resources efficiently. In this regard, a good task and VM scheduler can improve resource utilization and avoid load imbalance [7]. Our proposed system addresses this issue and provides better load stability.

The rest of the paper is organized as follows: Sect. 2 describes related work. Our contributions are presented in Sect. 3. Section 4 presents the proposed system architecture along with the algorithms used to handle the system efficiently. Section 5 describes the experimental results and discussion, and Sect. 6 gives the conclusion and future work.

2 Related work

This section reviews the resource allocation and scheduling techniques that motivate our proposed work.

Sonkar et al. [8] studied several resource allocation and scheduling techniques and proposed an optimal resource management strategy that improves the overall utilization of server resources and avoids load imbalance. Providing good QoS to users is a critical challenge for cloud-based companies because workload demand varies over time. Calheiros et al. [9] addressed this issue by introducing an autoregressive integrated moving average (ARIMA) model, and assessed the accuracy of its future workload predictions using real traces of requests to web servers. da Rosa Righi et al. [10] presented an elasticity model for high performance computing applications in the cloud, called Auto-elasticity. Xiao [11] proposed a resource allocation system that avoids overloading the system while minimizing the number of servers used, and introduced the concept of skewness, which measures the uneven utilization of a server.

Farahnakian et al. [12] presented a dynamic virtual machine (VM) consolidation method called Utilization Prediction aware VM Consolidation (UP-VMC). To consolidate VMs onto the least number of active physical machines, the method considers both the current and the future utilization of resources. Walsh et al. [13] presented a distributed architecture, implemented an accurate model of a data center, and demonstrated how utility functions can enable a collection of autonomic elements to continually optimize the use of computational resources in a dynamic and heterogeneous environment. Li [14] presented an adaptive resource allocation strategy for pre-emptible jobs that takes updated actual task execution into account. In this approach, static resource allocation relies on static task scheduling generated offline, which can be done using adaptive list scheduling and adaptive min–min scheduling strategies. Best fit and worst fit dynamic VM scheduling techniques were proposed by Rathor et al. [15]. In best fit, the VM is placed on the host that supports full utilization of its resources; a binary search is used to find the best-fit host, which reduces the VM allocation time. In the worst fit VM allocation strategy, if the first PM does not have sufficient resources, the system powers on a new PM and allocates the VM to that machine. For cloud environments in which both the number of VMs and the number of requests increase or decrease, the VM scheduling algorithm Artificial Bee Colony Longest Job First (ABC_LJF) [16] was proposed. This technique is suited to maintaining system stability and to scheduling so as to prevent system crashes. Hu et al. [17] proposed a load balancing scheduling technique for VM resources in the cloud computing environment that uses a tree structure and a genetic algorithm for scheduling. The system addresses load imbalance in cloud computing by considering historical data and the current state of the system in advance of their effect on performance.

Panchal et al. [18] discussed the Weighted Active Monitoring VM load balancing algorithm, which can be implemented in the CloudSim tool. This algorithm is proposed for the datacenter to balance application requests effectively among the existing virtual machines by allocating a certain weight to each, in order to improve performance parameters such as replication and data processing time. First come first serve (FCFS) scheduling, discussed by Yuan et al. [19], is one of the simplest scheduling algorithms: jobs are executed on a first in, first out basis. The disadvantage of this method is that it is non-preemptive, so the average waiting time is high. Round robin (RR) scheduling, discussed by Raj [20], is a preemptive scheduling algorithm in which each task is given a fixed time quantum for execution. If a task does not finish within its quantum, it has to wait for its next turn. Context switching is used to remember the state of the preempted task. In the Generalized Priority algorithm proposed by Cao et al. [21], jobs are given priorities for execution. The VMs are also prioritized, and the highest priority is given to the VM with the highest Million Instructions per Second (MIPS) rating. This scheme performs better than the RR and FCFS scheduling techniques.

Li et al. [22] presented the EnaCloud technique, in which an application is encapsulated in a VM. This technique supports application scheduling and live migration to reduce the number of active machines and thereby save energy. Li et al. [23] proposed a balanced system that handles VM demands in real time and provides assurance of better resource utilization. As an outcome of high resource utilization, the total number of active physical machines and their energy consumption can be reduced. Beloglazov et al. [24] proposed an effective resource management strategy that continuously consolidates VMs using live migration and powers off idle nodes to decrease power consumption while maintaining the required Quality of Service. Buyya et al. [25] focused on improving dynamic resource provisioning and allocation systems that take into account the interaction among the various data center infrastructures and work holistically to increase data center energy efficiency and performance.

3 Contribution

The contributions of this work are as follows:

  • A new algorithm is proposed that uses VM capacity and execution time for load prediction and performance improvement.

  • VM clustering and optimization algorithms are proposed to improve job sequencing performance.

  • The VM clustering algorithm reduces the time needed to search for a VM and resolves the load imbalance found in traditional methods.

  • The optimization algorithm reduces the time needed to select a suitable VM for a job.

4 System model

Based on the identified research gap, the proposed system is designed to predict load from the execution time of each task on a virtual machine and to sequence the tasks with an optimal sequencing algorithm so as to improve service response time.

The implementation of the proposed work follows the architecture shown in Fig. 1.

Fig. 1 System workflow

4.1 System workflow

In our proposed work, input parameters are taken from the CloudSim tool and then processed in four stages: federation → clustering → job scheduling → prioritization.

The system workflow is explained as follows:

Load prediction is performed by applying the optimal job and VM sequencing algorithm. First, the inputs are retrieved from CloudSim and stored in the cloud database. The VM capacity is then calculated as (number of processing elements) × (million instructions per second) + bandwidth, and the created VMs are assigned to their respective datacenters. The next step is federation, in which the datacenters are grouped based on cost and Million Instructions per Second (MIPS). Clustering based on VM capacity is then applied within each federation. After clustering, the jobs are initialized, and each job is assigned to a matching VM cluster based on its capacity. Within the VM cluster, jobs are assigned to a VM based on the VM's available space and execution time. The execution time of a virtual machine is calculated from the jobs assigned to it.

In our system, each VM executes its jobs in sequence: the job with the lowest execution time is executed first, and jobs with higher execution times are executed subsequently. In this way the proposed system performs optimal job sequencing. Further, based on the total time each VM requires to execute all of its jobs, the VMs are also scheduled in sequence within the VM cluster, i.e., the VM with the least execution time has first priority, and the subsequent VMs are then scheduled for job execution. Finally, the load is predicted based on the utilized and remaining capacity of each VM.

The sequencing algorithm is implemented in the following steps.

  1. Retrieve the inputs from the CloudSim tool and store them in the database.

  2. Calculate the VM capacity as \(\text{VM capacity} = (\text{nofpe} \times \text{MIPS}) + \text{bw}\), where nofpe is the number of processing elements and bw is the bandwidth.

  3. The created VMs belong to their respective datacenters, i.e., every datacenter has a number of virtual machines. These virtual machines are therefore categorized by datacenter (DC).

  4. Federation is implemented in this work. Federation simply means grouping the DCs with respect to their cost, because in real-world scenarios users sometimes need services, or a certain MIPS rate, according to the amount they are willing to pay. This work therefore classifies the DCs with respect to their cost and MIPS into the following federation categories:

    • High MIPS with low cost

    • High MIPS with medium cost

    • Medium MIPS with medium cost

    • Low MIPS with medium cost

  5. As a result of DC federation, each virtual machine also belongs to the federation group of its DC.

  6. Clustering is then applied based on VM capacity within each federation group. For example, if federation1 contains 2 DCs, where DC1 has 10 VMs and DC2 has 7 VMs, then these 17 VMs belong to federation1 and are clustered based on their capacity.

  7. After the VMs are clustered, the input jobs are initialized.

  8. Based on each job's capacity, a matching VM cluster is assigned. Within the VM cluster, the job is then assigned to a VM with respect to the VM's available space and execution time.

  9. The execution time of a virtual machine is calculated from the execution times of its assigned jobs. For example, if job1, job2, and job3 are assigned to VM1 and their execution times are 2, 3, and 4, then the execution time of VM1 is 2 + 3 + 4 = 9.

  10. Based on their execution times, the VMs are prioritized within every VM cluster. For example, if VM1, VM2, VM3, and VM4 belong to the same cluster and their execution times are 8, 6, 7, and 11, respectively, they are prioritized as VM2, VM3, VM1, VM4.

  11. After the jobs are assigned, the utilized capacity is calculated to predict the load on each virtual machine.

Combining all of the above steps yields the proposed optimization algorithm.
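The capacity formula of step 2 and the sequencing rules of steps 9 and 10 can be sketched in a few lines of Python. This is only an illustrative sketch, not the CloudSim implementation used in this work: the `VM` class, its field names, and the sample values for `pes`, `mips`, and `bw` are assumptions made for the example, while the execution times are the ones quoted in steps 9 and 10.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    # Hypothetical container for the quantities named in steps 2, 9 and 10.
    name: str
    pes: int = 1        # number of processing elements (nofpe)
    mips: float = 0     # million instructions per second
    bw: float = 0       # bandwidth
    job_times: dict = field(default_factory=dict)  # job name -> execution time

    @property
    def capacity(self):
        # Step 2: VM capacity = (nofpe x MIPS) + bw
        return self.pes * self.mips + self.bw

    @property
    def execution_time(self):
        # Step 9: the VM execution time is the sum of its jobs' execution times
        return sum(self.job_times.values())

    def job_order(self):
        # Within a VM, the job with the lowest execution time runs first
        return sorted(self.job_times, key=self.job_times.get)


def prioritize(vms):
    # Step 10: within a cluster, the VM with the least execution time goes first
    return [vm.name for vm in sorted(vms, key=lambda vm: vm.execution_time)]


# Step 9 example: jobs with execution times 2, 3 and 4 on VM1
vm1 = VM("VM1", pes=2, mips=1000, bw=500,
         job_times={"job1": 2, "job2": 3, "job3": 4})
print(vm1.capacity)        # 2500
print(vm1.execution_time)  # 9

# Step 10 example: execution times 8, 6, 7 and 11 in one cluster
cluster = [VM("VM1", job_times={"j": 8}), VM("VM2", job_times={"j": 6}),
           VM("VM3", job_times={"j": 7}), VM("VM4", job_times={"j": 11})]
print(prioritize(cluster))  # ['VM2', 'VM3', 'VM1', 'VM4']
```

Sorting by total execution time keeps the priority rule in one place, so the same ordering logic serves both jobs within a VM and VMs within a cluster.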

4.2 Algorithms

The proposed system is implemented using the following algorithmic strategies. Algorithm 1 clusters the datacenters. The clustering is based on the datacenter cost and MIPS parameters, and the outcome of the algorithm is that the datacenters are grouped into four clusters.

figure a

Algorithm 1 can be explained as follows:

Initially, the MIPS executed and the cost of the MIPS executed are calculated for each datacenter. Lower and upper threshold values are set for MIPS, and likewise for cost. Based on these thresholds, the MIPS and cost values are each divided into three ranges: Low, Medium, and High. The datacenters are then clustered in the following order:

Cluster 1: Maximum number of MIPS executed at minimum cost → F1

Cluster 2: Maximum number of MIPS executed at medium cost → F2

Cluster 3: Medium number of MIPS executed at medium cost → F3

Cluster 4: Low number of MIPS executed at medium cost → F4
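As a small illustration of the four groups listed above, the mapping from a datacenter's MIPS level and cost level to its federation can be written as a lookup table. The names `FEDERATIONS` and `federation_of` are hypothetical, and returning `None` for combinations that the text does not list is an assumption of this sketch rather than a rule stated for Algorithm 1.

```python
from typing import Optional

# (MIPS level, cost level) -> federation group, as listed in the text.
FEDERATIONS = {
    ("high", "low"): "F1",
    ("high", "medium"): "F2",
    ("medium", "medium"): "F3",
    ("low", "medium"): "F4",
}


def federation_of(mips_level: str, cost_level: str) -> Optional[str]:
    # Combinations the text does not list are left unassigned (None);
    # this fallback is an assumption of the sketch.
    return FEDERATIONS.get((mips_level, cost_level))


print(federation_of("high", "low"))    # F1
print(federation_of("low", "medium"))  # F4
print(federation_of("low", "high"))    # None
```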

For example, consider a list of 5 datacenters with MIPS values List = 10, 20, 30, 40, 50.

The highest MIPS value is 50, so max = 50 and max/3 = 50/3 = 16.66, which is rounded up to 17, so val1 = 17.

Then val2 = val1 × 2 = 17 × 2 = 34, and val3 = max = 50.

The threshold values are therefore 17, 34, and 50.

Applying them to the list:

In group 1: (< val1)

Element = {10}; //Low MIPS

In group 2: (>= val1 && < val2)

Element = {20, 30}; //Medium MIPS

In group 3: (>= val2)

Element = {40, 50}; //High MIPS

As described in the above example, the created datacenters are listed and grouped into three groups, namely low, medium, and high, based on their MIPS rate.

In the same way, the datacenters are also grouped by cost into three groups: low, medium, and high.
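The threshold arithmetic of the example above can be reproduced with a short Python sketch. The function name `three_way_split` is hypothetical, and reading "round off" as rounding up (`math.ceil`) is an assumption chosen because it matches both worked examples in this section.

```python
import math


def three_way_split(values):
    """Split values into low/medium/high using ceil(max/3) and twice that value."""
    val1 = math.ceil(max(values) / 3)  # assumption: "round off" means round up
    val2 = val1 * 2
    groups = {"low": [], "medium": [], "high": []}
    for v in values:
        if v < val1:
            groups["low"].append(v)      # group 1: < val1
        elif v < val2:
            groups["medium"].append(v)   # group 2: >= val1 and < val2
        else:
            groups["high"].append(v)     # group 3: >= val2
    return (val1, val2), groups


thresholds, groups = three_way_split([10, 20, 30, 40, 50])
print(thresholds)  # (17, 34)
print(groups)      # {'low': [10], 'medium': [20, 30], 'high': [40, 50]}
```

The same helper applies unchanged to datacenter cost values and to the VM capacity list used in the Algorithm 3 example, where it gives thresholds 284 and 568.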

figure b

Algorithm 2 is designed for VM placement: a VM is placed on a physical machine after checking the capacity of the physical machines in the clustered datacenters.

figure c

Algorithm 3 is best explained with the following example.

The proposed system has 4 clusters of datacenter groups. In each datacenter group, the VMs are listed, the capacity of each VM in the cluster is taken, and the maximum VM capacity is computed.

Ex: suppose there are four datacenter groups F1, F2, F3, F4.

Consider that F1 has 10 VMs placed in it.

Compute the capacity of each VM, and let the VM capacities in vmlist be

F1 = list {100, 550, 850, 752, 150, 658, 200, 300, 400, 685}

The maximum in the list is 850.

The threshold is 850/3 = 283.3333 (rounded up to 284).

Threshold 1 = 284;

Threshold 2 = (284*2) = 568;

Threshold 3 = 850;

Condition 1: (< threshold1)

F1_C1 = list {100, 150, 200}

Condition 2: (>= threshold1 && < threshold2)

F1_C2 = list {300, 400, 550}

Condition 3: (>= threshold2 && <= maximum value)

F1_C3 = list {658, 685, 752, 850}

The same procedure is repeated for all the federated groups of datacenters.

figure d

In Algorithm 4, the job size is considered as the parameter, and the conditional probability is calculated for all the jobs. The conditions are:

  1. job size > 4000

  2. job size < 4000 and ≥ 2000

  3. job size < 2000

The probability value is calculated for the jobs that satisfy each condition.

When there are two conditions, a and b, the probabilities P(a) and P(b) are computed. The conditional probability is then

$$P(a \mid b) = \frac{P(ab)}{P(b)}, \quad \text{or simply} \quad P(a \mid b) = \frac{P(a) \times P(b)}{P(b)} = P(a).$$
(1)

The simplified form holds when a and b are independent, so that P(ab) = P(a) × P(b).

In our implementation, the execution time must therefore be computed based on the job size.

The system groups the job sizes into < 2000, 2000–4000, and > 4000.

In the first and the third case, only p(a) is needed.

Case 1: probability of jobs with job size < 2000.

$$\text{Therefore, } p(\text{job} < 2000) = \frac{\text{no. of jobs} < 2000}{\text{total no. of jobs}}.$$
(2)

Case 3: probability of jobs with job size > 4000.

$$\text{Therefore, } p(\text{job} > 4000) = \frac{\text{no. of jobs} > 4000}{\text{total no. of jobs}}.$$
(3)

In case 2, however, there are 2 conditions: jobs > 2000 and jobs < 4000.

$$\text{Therefore, } p(\text{jobs} > 2000 \text{ and jobs} < 4000) = \frac{p(\text{jobs} > 2000) \times p(\text{jobs} < 4000)}{p(\text{jobs} < 4000)}.$$
(4)

Based on this conditional probability, the execution time of a job is estimated under the three possible conditions.
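Equations (2) and (3) are relative frequencies, so they can be sketched directly in Python. The example below is a simplified illustration under stated assumptions: the function names and the job sizes are made up, a job of exactly 4000 is placed in the middle group, and the middle group is counted as a relative frequency rather than evaluated through Eq. (4).

```python
from collections import Counter

BUCKETS = ("<2000", "2000-4000", ">4000")


def bucket(job_size):
    # Boundaries follow the three job-size groups in the text; placing a size
    # of exactly 4000 in the middle bucket is an assumption of this sketch.
    if job_size < 2000:
        return "<2000"
    if job_size <= 4000:
        return "2000-4000"
    return ">4000"


def bucket_probabilities(job_sizes):
    # p(bucket) = number of jobs in the bucket / total number of jobs
    counts = Counter(bucket(size) for size in job_sizes)
    total = len(job_sizes)
    return {name: counts[name] / total for name in BUCKETS}


sizes = [500, 1500, 2500, 3000, 3500, 4500, 6000, 1000]  # made-up job sizes
print(bucket_probabilities(sizes))
# {'<2000': 0.375, '2000-4000': 0.375, '>4000': 0.25}
```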

Jobs are assigned to the VMs by the broker associated with them. A broker can assign jobs only to the VMs that are allocated to it.

5 Performance analysis

In practice, datacenter capacity is very large, and it is not feasible to create real datacenters and virtual machines for the experiments. The system is therefore implemented using the CloudSim tool on a machine with an i3 processor, 4 GB RAM, and a 1 TB HDD.

Figure 2 illustrates the total time needed by a VM to execute all of its jobs. The VMs are scheduled in sequence within the VM cluster, i.e., the VM with the least execution time has first priority, and the subsequent VMs are then scheduled for job execution. Jobs within a VM are sequenced in the same way: for example, if job 19 has an execution time of 671 and job 0 has an execution time of 658, the proposed system gives first priority to the job with the shorter execution time, i.e., job 0, followed by job 19, as shown in Fig. 2.

Fig. 2 Displaying priority of jobs

The load is then predicted based on the utilized and remaining capacity of each VM, as shown in Fig. 3. The load of each virtual machine is calculated from its total capacity and its utilized capacity, i.e., Load (L) = (total capacity − utilized capacity)/total capacity × 100.

Fig. 3 Displaying predicted load
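The load formula stated above is a one-line computation. The following sketch applies it as written; the function name `load_percent`, the input check, and the sample capacities are assumptions made for illustration.

```python
def load_percent(total_capacity, utilized_capacity):
    # Load (L) = (total capacity - utilized capacity) / total capacity x 100,
    # exactly as the formula is stated in the text.
    if total_capacity <= 0:
        raise ValueError("total_capacity must be positive")
    return (total_capacity - utilized_capacity) / total_capacity * 100


# Made-up capacities: a VM of capacity 2000 with 500 units in use
print(load_percent(2000, 500))  # 75.0
```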

Figure 4 shows the graph for the first parameter, the accuracy of VM placement in the different federated clusters. A maximum accuracy of 93% is achieved in placing the VMs in the federated datacenters. When there are 55 virtual machines to be allocated to the federated datacenters (FDC), the VM placement algorithm chooses the best datacenter for each of them. Placement accuracy is thus the number of VMs correctly placed in the best datacenters relative to the total number of virtual machines. This measure is computed for every federated datacenter.

Fig. 4 Graph for accuracy of VM placement

The second parameter is load stabilization, or balancing, which is the rate of jobs correctly migrated to the best virtual machines relative to the total number of jobs that have to be migrated because of overload or underload. This measure is computed while varying the number of virtual machines. Load stabilization should be high; a low value indicates that the proposed methodology cannot stabilize, or migrate, the unallocated jobs effectively. If the number of virtual machines is small, the stabilization rate may be low. When the number of virtual machines is considerably large, the stabilization rate is high. As shown in Fig. 5, with 2500 virtual machines the load stability percentage is 25%, which shows that our system outperforms the other methods.

Fig. 5 Graph for load stabilization: no. of VMs vs load stabilization in %

The third parameter is response time. Figure 6 compares the response times of the VMs. The response time is the time taken to complete the execution of the jobs allocated to the virtual machines. A datacenter can hold 'n' virtual machines. The response time increases as the number of virtual machines increases. An increase in the number of virtual machines indicates an increase in the resources available for executing jobs in the datacenter. As shown in Fig. 6, the proposed system needs only 25 ms to respond with 3500 virtual machines, whereas the Greedy algorithm takes 50 ms for the same 3500 VMs. According to Fig. 6, the proposed system is more effective than all the other systems compared.

Fig. 6 Graph for number of VMs vs response time

Figure 7 shows the graph of response time as the number of jobs increases. Job scheduling time increases with the number of jobs, but when the number of virtual machines is increased at the same time, the job scheduling time decreases. This is because, with only a minimal number of virtual machines, job allocation involves longer waiting and execution times. The proposed system takes only 110 ms for 7000 jobs, whereas ABC_FCFS takes 200 ms to respond to the same 7000 jobs. We compared the proposed system with three existing methods and conclude that it outperforms all of them.

Fig. 7 Graph for increased number of jobs vs response time

Figure 8 shows the reduction in the response time of the virtual machines when the number of jobs to be executed is fixed and the number of virtual machines is increased in each case. Only 40 ms is required as response time for 21,000 virtual machines. The jobs therefore take very little time to execute when more VMs are provided (shown by the pink line in Fig. 8). Here, execution time refers to the time taken to complete the allocated task. The execution time of our proposed framework is lower than that of the other existing systems because of our two major contributions. First, the federation of datacenters groups similar datacenters and helps allocate similar jobs to a datacenter. Second, our job scheduling model effectively predicts the best jobs to place on the best virtual machine for speedy execution.

Fig. 8 Graph for increased no. of VMs vs response time

6 Conclusion and future work

This paper presented a load prediction algorithm that is based on the execution time of each task on a virtual machine and that sequences the tasks with an optimal sequencing algorithm to improve service response time. The paper also presented a VM placement technique that reduces the time needed to place a virtual machine on a physical machine in order to respond to a user's request. Finally, the paper discussed a job priority algorithm, based on job execution time, that improves system performance. The proposed scheduling strategy was simulated using the CloudSim tool.

The superior performance of our model is established by four parameters: high accuracy of virtual machine placement, high rate of load stabilization, minimal job scheduling time, and minimal response time. The results for all of these parameters show that the proposed model outperforms the existing methods.

As future work, we plan to design a system that reduces energy consumption. To this end, a novel energy-efficient resource management model can be designed to handle resource scheduling and to minimize the energy used by cloud data centers for computational work.