1 Introduction

Computer science evolves continually in response to market demands, increasingly complex applications, and usability requirements. Cloud computing is one of the latest trends in the IT industry and one of its fastest growing technologies, with applications in many domains such as military science, advanced quantum machines, management science, e-commerce, data simulation in the biological sciences, and many other subject areas. Formally, it is a platform that delivers the hardware and software resources end users require over an active Internet connection. It is an extension of conventional systems such as distributed, grid, and parallel computing [1], and it works on the principle [2] “Pay and Use the Resource”. According to NIST [3], the definition of cloud computing is “Cloud Computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with the minimum management effort or service provider interaction”. Cloud computing is comparable to distributed computing over a network and can execute multiple programs or applications concurrently. The two major focuses [4] of the cloud computing platform are virtualization and abstraction. Cloud computing models fall into two major classes, deployment models and service models [5], as shown in Fig. 1. Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS) belong to the service model category, while public, private, community, and hybrid clouds belong to the deployment model category. Cloud computing is a prominent technology, but several issues and challenges [6] must be addressed. Some of them are performance, bandwidth cost, portability and interoperability, availability and reliability, scalability and elasticity, what to migrate, virtualization, security and privacy, and service delivery and billing.

Fig. 1 Cloud computing categorization [7]

Cloud computing has three characteristics [9]. First, it relies on virtualization technology to provide access to services and to deploy resources quickly. Second, it delivers its facilities as services that are designed for huge amounts of information and are accessed via the Internet. Third, its resources can be dynamically extended and customized according to users' needs, and users are charged only for what they have used. Importantly, users do not need to control and manage these resources individually, which reduces the end user's dependence on IT expertise and processing.

The general framework of cloud computing, shown in Fig. 2, consists of five layers: the User Infrastructure Layer, Cloud Application Layer, Platform Layer, Unified Resource Layer, and Infrastructure Layer. These layers are fundamental for designing the cloud domain [8] and are described as follows:

Fig. 2 Cloud architecture [8]

  • User Infrastructure Layer: This layer represents the front end of the cloud. It contains tablets, PCs, mobile devices, servers, etc., which end users (cloud users) use to access the services of the cloud computing system. The Internet plays a vital role here because cloud users connect to the cloud through it.

  • Cloud Application Layer: This layer covers the application software and resources that are directly available to cloud users. The CSP (Cloud Service Provider) collects these resources and deploys them under an ‘on-demand access’ or ‘pay as you go’ model. Applications such as customer relationship management (CRM), e-services, and e-research [8] can be deployed here by cloud users.

  • Platform Layer: This layer provides platform-level services for development, hosting, deployment, and management, all delivered at the user level. Its major services are resource management, load balancing, scheduling, and service discovery.

  • Unified Resource Layer: This layer contains pods, virtual machines, and logical storage. It abstracts the physical resource layer.

  • Infrastructure Layer: This layer comprises the physical resources, i.e. a huge number of servers (host machines) and physical storage devices. Their number depends on the size of the cloud data center.

Several challenges remain to be solved [6], such as privacy, legal issues, vendor lock-in, open standards, security, IT governance, consumer and storage, performance interference and noisy neighbors, and load balancing.

2 Load Balancing in Cloud Computing

Cloud computing is trending and its user base is growing rapidly, which increases the load on resources such as virtual machines. Load balancing refers to balancing this load among machines that are overloaded, idle, or underloaded.

Load balancing is the even distribution of load among the available machines, where the load is the total allotment of tasks to a virtual machine. A good load balancing approach improves not only overall performance but also the performance metrics, also called quality of service (QoS) parameters, i.e. makespan, response time, resource utilization, throughput, etc. Load balancing leads to better user satisfaction and ensures that cloud resources are highly utilized. It also provides a substantial improvement in performance and maintains system stability in adverse situations such as system failure or overloaded machines.

LB is classified into hardware load balancing and software load balancing. Hardware-based load balancing is performed by a hardware load balancer that is placed in front of the servers and directs all requests to them. The routing path is chosen based on system performance, i.e. utilization of memory, CPU, and VMs; a routing manager directs and distributes the server load based on server resources. In software-based load balancing, the service executes on each machine in a cluster. If any machine goes down or fails, another machine in the cluster takes its place, alters the communication path among the available machines, and takes on the extra load. This removes the single point of failure of the cluster.

The basic architecture and working process of load balancing in the cloud domain are shown in Fig. 3 [11].

Fig. 3 Basic architecture of load balancing

As shown in the architecture in Fig. 3, end users submit requests over the Internet to access their tasks on virtual machines (VMs) through the application window. The cloud service provider (CSP) collects all requests and passes them to the cloud manager. The cloud manager is associated with the load balancer, which uses load balancing algorithms to compute the status of the unassigned VMs. Based on this information, the load balancer identifies each VM as idle, overloaded, or underloaded and distributes the load among the available VMs accordingly.
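The status check and dispatch step described above can be sketched as follows. This is only an illustration: the `VM` class, the utilization thresholds, and the "least utilized, non-overloaded VM" rule are assumptions introduced here, not the algorithm of any surveyed method.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds: the text does not fix numeric cut-offs.
UNDERLOADED_MAX = 0.3   # utilization below this (but above 0) is "underloaded"
OVERLOADED_MIN = 0.8    # utilization at or above this is "overloaded"

@dataclass
class VM:
    name: str
    capacity: float      # maximum amount of work the VM can hold
    load: float = 0.0    # amount of work currently assigned

    @property
    def utilization(self) -> float:
        return self.load / self.capacity

def status(vm: VM) -> str:
    """Classify a VM as idle, underloaded, balanced, or overloaded."""
    u = vm.utilization
    if u == 0.0:
        return "idle"
    if u < UNDERLOADED_MAX:
        return "underloaded"
    if u >= OVERLOADED_MIN:
        return "overloaded"
    return "balanced"

def dispatch(task_length: float, vms: List[VM]) -> VM:
    """Assign the task to the least utilized VM that is not overloaded."""
    # Fall back to all VMs if every one of them is overloaded.
    candidates = [vm for vm in vms if status(vm) != "overloaded"] or vms
    target = min(candidates, key=lambda vm: vm.utilization)
    target.load += task_length
    return target

vms = [VM("vm1", 100, 90), VM("vm2", 100, 10), VM("vm3", 100, 0)]
chosen = dispatch(20, vms)
```

With these illustrative numbers, the idle `vm3` receives the task and becomes underloaded, while the overloaded `vm1` is never considered.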

2.1 Classification of Load Balancing (LB) Approaches

This section discusses the classification [27] of LB approaches, which is based on two factors: system load and system topology. The classification is shown in Fig. 4.

Fig. 4 Classifications of load balancing approaches

  • System Load: This class of load balancing is divided into centralized, distributed, and mixed approaches, described as follows:

Centralized Approach: A central node of the whole network controls and manages the allocation of resources.

Distributed Approach: Here, each node collects load information from other nodes and then creates its own load vector independently. These local load vectors drive the local decisions.

Mixed Approach: It is a combination of the centralized and distributed approaches and provides the benefits of each.

  • System Topology: This class of load balancing is also divided into three parts: the static, dynamic, and adaptive approaches, described as follows:

Static Approach: A static algorithm suits a homogeneous and stable environment, where it delivers better results. If attributes change dynamically during execution, this approach cannot meet the requirements because it lacks flexibility.

Dynamic Approach: The dynamic approach accounts for conditions both before and during execution, because it is more flexible and can handle different kinds of attributes in the system.

Adaptive Approach: In cloud computing, when the system changes frequently, the adaptive approach provides better performance. This kind of approach distributes the system load satisfactorily by changing its parameters, and also its algorithms, dynamically.

2.2 Challenges of Load Balancing

There are two important parties in the cloud computing environment: the cloud service providers (CSPs) and the end users who use cloud resources. The challenges [28] these parties face while using cloud services are shown in Fig. 5 and briefly described in Table 1.

Fig. 5 Load balancing challenges

Table 1 Details of challenges in load balancing

3 Quality of Service (QoS) Parameters

This section discusses various parameters [12] that affect load balancing in the cloud computing environment. These parameters are used to allocate tasks onto virtual machines and also help to compare the performance of the various models in the cloud computing environment. The two major parameters are makespan and energy consumption. Other parameters are reliability, fault tolerance, associated cost, migration time, response time, throughput, thrashing, accuracy, scalability, and predictability. These parameters are also known as Quality of Service (QoS) parameters and can be expressed as follows:

  I. Turnaround Time (TT) [13]: The duration between the submission time and the finishing time of the tasks of a given workflow executed on the cloud platform. It is formulated as:

$$TT=\max\{AFT_{i,j}\}$$
(1)
  II. Actual Finishing Time (AFT) [13]: The AFT of task \(C_i\) on virtual machine \(V_j\) is computed from the earliest start time (EST) of \(C_i\) and the time \(C_i\) needs on \(V_j\), and is formulated as:

$$AFT_{i,j}=EST_i+\frac{C_i^{length}}{V_j^{cap}}$$
(2)

where \(EST_i\) is the earliest start time of task \(C_i\), \(C_i^{length}\) is the length of task \(C_i\), and \(V_j^{cap}\) is the capacity of virtual machine \(V_j\).
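As a small worked example of Eqs. (1) and (2), the following Python sketch computes the AFT of three task assignments and the resulting turnaround time. The numeric values and the units (task length in million instructions, VM capacity in MIPS) are assumptions chosen for illustration; the text does not specify units.

```python
from typing import List

def actual_finish_time(est: float, length: float, capacity: float) -> float:
    """Eq. (2): AFT = EST + task length / VM capacity."""
    return est + length / capacity

def turnaround_time(afts: List[float]) -> float:
    """Eq. (1): TT is the largest AFT over all task/VM assignments."""
    return max(afts)

# Illustrative assignments: (EST in s, length in MI, VM capacity in MIPS)
assignments = [(0.0, 4000, 1000), (0.0, 3000, 500), (4.0, 2000, 1000)]
afts = [actual_finish_time(est, length, cap) for est, length, cap in assignments]
tt = turnaround_time(afts)
```

Here the three tasks finish at 4 s, 6 s, and 6 s, so the turnaround time is 6 s.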

  III. Average Response Time (\(RT_{avg}\)) [13]: The response time of a task is the difference between its earliest start time (\(EST_i\)) and its arrival time (\(AT_i\)). The average response time \(RT_{avg}\) over all \(n\) tasks is calculated as:

$$RT_i=EST_i-AT_i$$
(3)
$$RT_{avg}=\frac{\sum_{i=1}^{n}RT_i}{n}$$
(4)
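Eqs. (3) and (4) translate directly into code. The sketch below uses made-up start and arrival times; the function name is introduced here for illustration.

```python
from typing import List

def average_response_time(est: List[float], at: List[float]) -> float:
    """Eqs. (3)-(4): mean of RT_i = EST_i - AT_i over all tasks."""
    rts = [start - arrival for start, arrival in zip(est, at)]
    return sum(rts) / len(rts)

est = [0.0, 2.0, 5.0]   # earliest start times (illustrative)
at = [0.0, 1.0, 2.0]    # arrival times (illustrative)
rt_avg = average_response_time(est, at)
```

The individual response times are 0, 1, and 3 time units, giving an average of 4/3.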
  IV. Average Utilization (\(U_{avg}\)) [13]: Utilization of each VM can be defined as the utilization of the system. Resource utilization is one of the major factors in the associated cost (\(C_a\)). It is computed as:

$$U_j=\frac{MAT_j}{TT}$$
(5)

where \(MAT_j\) is the last computed machine available time of \(V_j\) and \(TT\) is the turnaround time.

The average utilization over the \(m\) virtual machines is calculated as:

$$U_{avg}=\frac{\sum_{j=1}^{m}U_j}{m}$$
(6)
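A minimal sketch of Eqs. (5) and (6) follows, assuming the machine available times and the turnaround time are already known. The values are illustrative only.

```python
from typing import List

def utilizations(mat: List[float], tt: float) -> List[float]:
    """Eq. (5): U_j = MAT_j / TT for every VM."""
    return [m / tt for m in mat]

def average_utilization(u: List[float]) -> float:
    """Eq. (6): mean utilization over all m VMs."""
    return sum(u) / len(u)

mat = [6.0, 3.0, 4.5]   # last machine available time of each VM (illustrative)
tt = 6.0                # turnaround time (illustrative)
u = utilizations(mat, tt)
u_avg = average_utilization(u)
```

With these numbers the per-VM utilizations are 1.0, 0.5, and 0.75, and the average utilization is 0.75.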
  V. Migration Cost (\(MIG_{cost}\)) [14]: The time that elapses while resources are migrated from one host to another. The migrated resource can be either a task or a VM. A high migration time (\(M_t\)) degrades load balancing and makespan. One of the major factors behind migration is a VM crashing during the execution of a task. The migration cost of task \(C_i\), i.e. the cost of transferring it from one virtual machine to another, is denoted as:

$$MIG_{cost}=\frac{C_i^{size}}{BW_k}$$
(7)

where \(C_i^{size}\) is the size of task \(C_i\) and \(BW_k\) is the bandwidth of the \(k\)th virtual machine.

  VI. Estimated Computation Time (ECT) [15, 16, 17, 18]: A scheduling attribute used for the allotment of task \(C_i\) onto a virtual machine, arranged as follows:

$$ECT_{ij}=\left[\begin{array}{cccc}ECT_{11}&ECT_{12}&\cdots&ECT_{1n}\\ECT_{21}&ECT_{22}&\cdots&ECT_{2n}\\\vdots&\vdots&\ddots&\vdots\\ECT_{m1}&ECT_{m2}&\cdots&ECT_{mn}\end{array}\right]$$
(8)

where \(ECT_{ij}\) is the time of job/task \(C_i\) on virtual machine \(V_j\).

  VII. Average ECT (\(ECT_{avg}\)) [13, 15]: The average ECT of task \(C_i\) is the ratio of the sum of its ECT over all machines to the total number of machines \(M_{total}\), i.e.

$$ECT_{avg}=\frac{\sum_{j=1}^{M_{total}}ECT_{i,j}}{M_{total}}$$
(9)
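To make Eqs. (8) and (9) concrete, the sketch below stores an ECT matrix as a list of rows, one row per task and one column per virtual machine, and averages each row. The matrix values are invented for illustration.

```python
from typing import List

def average_ect(ect: List[List[float]]) -> List[float]:
    """Eq. (9): for each task (row), average its ECT over all machines."""
    return [sum(row) / len(row) for row in ect]

# ect[i][j] is the ECT of task C_i on virtual machine V_j (illustrative values).
ect = [
    [4.0, 8.0, 6.0],
    [3.0, 6.0, 3.0],
]
ect_avg = average_ect(ect)
```

For this matrix the average ECT is 6.0 for the first task and 4.0 for the second.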
  VIII. Critical Path (CP) [15, 19, 20]: The critical path from \(C_{entry}\) to \(C_{exit}\) is computed as:

$$CP=\max_{path\in C_i}\left\{length(path)\right\}$$
(10)
$$\text{where } length(path)=\sum_{C_i\in C}ECT_{avg}(C_i)+\sum_{e\in E}D_T(C_i,C_j)$$
(11)
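Eqs. (10) and (11) amount to a longest-path search on the task graph. The following sketch is one way to compute it; it reads \(D_T(C_i,C_j)\) as the data-transfer time on the edge from \(C_i\) to \(C_j\), which is an assumption because the text does not define \(D_T\). The four-task graph and all weights are hypothetical.

```python
from functools import lru_cache

# Average ECT of each task (node weight) and D_T of each edge (edge weight).
ect_avg = {"A": 3.0, "B": 2.0, "C": 4.0, "D": 1.0}
d_t = {("A", "B"): 1.0, ("A", "C"): 2.0, ("B", "D"): 1.0, ("C", "D"): 0.5}

# Successor lists derived from the edge set.
succ = {}
for pred, nxt in d_t:
    succ.setdefault(pred, []).append(nxt)

@lru_cache(maxsize=None)
def longest_from(task: str) -> float:
    """Length of the longest path that starts at `task` and ends at an exit task."""
    tails = [d_t[(task, nxt)] + longest_from(nxt) for nxt in succ.get(task, [])]
    return ect_avg[task] + (max(tails) if tails else 0.0)

cp = longest_from("A")   # "A" is the entry task of this example graph
```

In this example the path A, C, D has length 3 + 2 + 4 + 0.5 + 1 = 10.5, which exceeds the length 8 of the path A, B, D, so the critical path length is 10.5.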
  IX. Earliest Start Time (EST) [15, 21]: It is defined as follows:

$$EST(C_i,V_j)=\begin{cases}0 & \text{if } C_i\in C_{entry}\\ \max\limits_{C_j\in pred(C_i)}\left\{EFT(C_j,V_j)+MET(C_i)+D_T(C_i,C_j)\right\} & \text{otherwise}\end{cases}$$
(12)

  X. Minimum Execution Time (MET) [15, 21]: It is calculated as follows:

$$MET(C_i)=\min\left\{ECT(C_i,V_m)\right\}$$
(13)

  XI. Earliest Finish Time (EFT) [15, 22]: It is computed as follows:

$$EFT(C_i,V_j)=ECT_{ij}+EST(C_i,V_j)$$
(14)
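The recurrences in Eqs. (12) to (14) can be evaluated by visiting tasks in topological order. The sketch below follows the equations as written and makes two assumptions that the text leaves open: each task already has a fixed VM assignment (`vm_of`), and the EFT of a predecessor is taken on the VM that predecessor is assigned to. The three-task graph and all values are hypothetical.

```python
# ect[task][vm]: ECT of the task on each of two VMs (illustrative values).
ect = {"A": [2.0, 3.0], "B": [4.0, 2.0], "C": [3.0, 3.0]}
preds = {"A": [], "B": ["A"], "C": ["A", "B"]}
d_t = {("A", "B"): 1.0, ("A", "C"): 0.5, ("B", "C"): 1.0}
vm_of = {"A": 0, "B": 1, "C": 0}   # hypothetical fixed task-to-VM assignment

def met(task: str) -> float:
    """Eq. (13): minimum ECT of the task over all VMs."""
    return min(ect[task])

est, eft = {}, {}
for task in ["A", "B", "C"]:       # a topological order of the graph
    if not preds[task]:
        est[task] = 0.0            # Eq. (12), entry task
    else:
        # Eq. (12), otherwise-branch: max over predecessors
        est[task] = max(eft[p] + met(task) + d_t[(p, task)] for p in preds[task])
    # Eq. (14): EFT = ECT on the assigned VM + EST
    eft[task] = ect[task][vm_of[task]] + est[task]
```

For this graph the start times are 0, 5, and 11 and the exit task C finishes at 14.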
  XII. Load Balance Level (\(LB_l\)) [13]: The load balance level is the deviation of the utilization of the available virtual machines from the average utilization, calculated as:

$$LB_l=\sqrt{\sum_{j=1}^{m}\frac{(U_j-U_{avg})^2}{m}}$$
(15)
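Eq. (15) is the population standard deviation of the per-VM utilizations, so a value of 0 means a perfectly even load. A short sketch with illustrative utilizations:

```python
import math
from typing import List

def load_balance_level(u: List[float]) -> float:
    """Eq. (15): root of the mean squared deviation of U_j from U_avg."""
    m = len(u)
    u_avg = sum(u) / m
    return math.sqrt(sum((x - u_avg) ** 2 for x in u) / m)

uneven = load_balance_level([1.0, 0.5, 0.75])   # utilizations differ
even = load_balance_level([0.5, 0.5, 0.5])      # perfectly balanced
```

The uneven case gives \(\sqrt{0.125/3}\approx 0.204\), while the even case gives 0.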
  XIII. Other Cost-related Parameters [14]: Total gain represents the economic aspect, a core concern for both the cloud users of the system and the cloud service provider. It is related to cost and is calculated as:

$$TG=\sum_{i=1}^{n}Profit_i-\sum_{i=1}^{n}Loss_i$$
(16)
$$TL=\sum_{i=1}^{n}Loss_i-\sum_{i=1}^{n}Profit_i$$
(17)

where TG is the total gain and TL is the total loss.

$$Profit_i=(\text{Deadline of } C_i-AFT_{i,j})\times\text{Cost of } V_j$$
(18)
$$Loss_i=(AFT_{i,j}-\text{Deadline of } C_i)\times\text{Cost of } V_j$$
(19)
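The sketch below evaluates Eqs. (16) to (19) for three tasks. It assumes that a task contributes to profit only when it meets its deadline and to loss only when it misses it; the text does not state this condition, but without it Eqs. (18) and (19) would simply cancel each other. All numbers are illustrative.

```python
from typing import List, Tuple

def gain_and_loss(tasks: List[Tuple[float, float, float]]) -> Tuple[float, float]:
    """tasks holds (deadline, AFT, cost of the assigned VM); returns (TG, TL)."""
    # Assumption: Eq. (18) applies when the deadline is met, Eq. (19) otherwise.
    profit = sum((d - aft) * cost for d, aft, cost in tasks if aft <= d)
    loss = sum((aft - d) * cost for d, aft, cost in tasks if aft > d)
    return profit - loss, loss - profit   # Eqs. (16) and (17)

tasks = [(10.0, 8.0, 2.0), (5.0, 6.0, 3.0), (7.0, 7.0, 1.0)]
tg, tl = gain_and_loss(tasks)
```

The first task earns a profit of 4, the second incurs a loss of 3, and the third finishes exactly on its deadline, so TG = 1 and TL = -1.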
  XIV. Makespan (\(M_s\)) [17, 20]: The total time a job takes from its start state to its end state, i.e. until all tasks have finished executing on the available VMs. It should be minimized and is computed as:

$$M_s=\min\left\{EFT(C_{exit},M)\right\}$$
(20)
  XV. Waiting Time (Wt) [12]: Waiting time refers to the total amount of time a ready task waits for the CPU to be assigned.

$$Wt=TAT-BT$$
(21)

where TAT stands for turnaround time and is calculated as the difference between completion time and arrival time, and BT refers to burst time, the net amount of time the CPU takes to execute the whole task.
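As a worked example of Eq. (21), the sketch below runs three tasks on a single CPU in first-come-first-served order and derives each waiting time from TAT and BT. The FCFS policy and the task values are assumptions made for the example; the text does not tie waiting time to a particular scheduling policy.

```python
# (arrival time, burst time) of each task, already sorted by arrival.
tasks = [(0, 5), (1, 3), (2, 8)]

clock = 0
waits = []
for arrival, burst in tasks:
    start = max(clock, arrival)     # CPU may still be busy with earlier tasks
    completion = start + burst
    clock = completion
    tat = completion - arrival      # turnaround time
    waits.append(tat - burst)       # Eq. (21): Wt = TAT - BT
```

The tasks complete at times 5, 8, and 16, so their waiting times are 0, 4, and 6.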

  XVI. Response Time (\(R_p\)) [24]: An efficient makespan depends on response time. It is the time the system takes to acknowledge a task after the task is submitted, and it is the sum of the transmission time (\(T_t\)), service time (\(S_t\)), and waiting time (\(W_t\)) of task \(t\) in the system, i.e.

$$R_p(t)=T_t+S_t+W_t$$
(22)
  XVII. Reliability (Re) [12]: The major factor in reliability is system configuration. It can be improved by making other resources available in case any system fails during the execution of a job. The stability of a system also depends on the reliability metric.

  XVIII. Throughput (\(T_p\)) [26]: The number of tasks executed per unit time by a resource, i.e. a VM. System performance is measured in terms of throughput, which should be maximized. It is expressed as:

$$T_p\propto\frac{1}{M}$$
(23)

  XIX. Thrashing (Ts) [12]: It is related to system resources such as memory, i.e. the paging system. In cloud computing, it occurs when VMs spend more time on migration than on executing tasks. It disturbs the proper scheduling of tasks on VMs.

  XX. Accuracy (Acc) [12]: The correctness of the result of task execution. It is an important factor in today’s technology world and slightly decreases the makespan.

  XXI. Predictability (Pb) [12]: It is an important factor for load balancing and makespan. It determines how tasks are allocated, executed, and completed on the available virtual machines.

  XXII. Scalability (Sc) [12]: It is an important feature of any computer system. It defines the capacity of the system to cope with overload caused by the size of the tasks or an increased number of tasks in the system.

  XXIII. Energy Consumption (Ec) [29]: It is also an important metric for cloud computing. The primary objective is to minimize energy consumption. The major consumers of energy are personal terminals, networking nodes, and local servers.

  XXIV. Fault Tolerance (Ft) [12]: Ft is the capability and mechanism of a system to keep providing regular services when one or more system elements have failed. It is somewhat costly.

  XXV. Speedup (\(S_{speed}\)) [23, 24, 25]: It is calculated as the ratio of the minimum total ECT of all the tasks \(C_i\) to the scheduling length of a given DAG on a \(V_j\).

$$S_{speed}=\frac{\min\left[\sum_{j=1}^{m}ECT_{i,j}\right]}{\text{scheduling length}}$$
(24)
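The sketch below illustrates Eq. (24) under one possible reading: the numerator is the smallest total ECT obtained by running all tasks sequentially on a single VM, and the denominator is the length of the parallel schedule. This reading is an assumption, since the index ranges in Eq. (24) are ambiguous. The ECT values and the schedule length are illustrative.

```python
from typing import List

def speedup(ect: List[List[float]], schedule_length: float) -> float:
    """Minimum sequential time over all VMs divided by the schedule length."""
    num_vms = len(ect[0])
    # Total ECT of all tasks on VM j, minimized over j.
    sequential = min(sum(row[j] for row in ect) for j in range(num_vms))
    return sequential / schedule_length

# ect[i][j]: ECT of task C_i on VM V_j (illustrative values).
ect = [[4.0, 8.0], [3.0, 6.0], [5.0, 2.0]]
s_speed = speedup(ect, schedule_length=8.0)
```

Running every task on the first VM takes 12 time units and on the second takes 16, so with a schedule length of 8 the speedup is 12 / 8 = 1.5.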

4 Critical Study and Analysis of Load Balancing Methods

This section studies load balancing methods and is divided into four subsections: a brief explanation of the LB methods, followed by comparisons based on the simulators and tools used, on the environments, and on the evaluation parameters. Twenty load balancing methods are considered in this study.

4.1 Load Balancing Methods

This section covers twenty load balancing methods, giving the name of each article. The algorithms are designated as load balancing methods (LBM), i.e. LBM1, LBM2, …, LBM20, and a brief explanation of all twenty is given in Table 2.

Table 2 Summary of LB algorithm

4.2 Simulator Tools & Techniques

Simulator tools are the core solution in the process of deployment, because the performance of an application can be determined with a simulator. Using these tools, users and service providers can obtain performance reports for the relevant application at no cost. Some of the most popular simulators are discussed here:

  a. CloudSim: The University of Melbourne, Australia, developed the CloudSim simulator [10]. It is open source and works on Unix/Linux and Windows. The first version of CloudSim was launched in 2009, and more advanced versions followed.

  b. Cloud Analyst: It is an extended version of the CloudSim simulator. It provides a high level of flexibility and configuration in simulation [49]. Simulations can also be saved in different file formats [50]. Parameters can be changed easily, without touching the code and without repeated execution for the same or altered parameters.

  c. GridSim: It is another simulation tool. It is based on SimJava2 [51], a general-purpose discrete event simulation package implemented in Java. Its components communicate with each other through the message-passing operations defined by SimJava.

  d. Matlab: It is a high-level programming language used by scientists, researchers, and engineers to perform array and matrix calculations directly. Matlab can run anything from a simple program to a complex one. Because of its large collection of functions and tools, the desired simulator can be built in Matlab. Such a simulator lets users execute their code and obtain appropriate results. Nazi TabatabaeiYazdi et al. used Matlab to create a multicloud simulator for their experiment [52].

  e. Mininet: Mininet is an emulator used to test how software interacts with the underlying hardware, or how hardware and software work together. Mininet [53, 54] is a network emulator that runs a collection of switches, end hosts, links, and routers on a single Linux kernel. It supports various profiling tools, including iperf and perf [54].

  f. Java: Many categories of users have learned and used Java in preference to other languages. Compared with C++, Java has no limitation as a language for simulation. Its most important characteristic is the lack of pointers, which allows more optimization of both parallel and sequential code [54, 55]. The C language can also be used for coding blocks during simulation because it is known as the mother of all programming languages and is also suitable for traditional algorithms (see Table 3).

Table 3 Critical analysis based on simulator used

4.3 Simulation Environment

The computing environment describes the computer system, host (server), or workstation and the type of operating system, applications, and peripherals it uses. In cloud computing, the computing environment is either homogeneous or heterogeneous, as follows:

  a. Homogeneous Environment: In this environment [56], the processor hardware and the software modules (operating system, compiler) use the same storage representation and produce the same results for operations on floating point numbers when these numbers are communicated between processors. The communication layer therefore assures exact transmission of floating point values. It is a low-power consumption platform.

  b. Heterogeneous Environment: In this environment [56], different processor architectures and different storage representations are used. Results may differ from one processor to another depending on floating point precision, so the communication layer does not assure exact transmission. It is a high-power consumption platform.

4.4 Comparison of Load Balancing Methods Based on QoS Parameters

QoS parameters are the attributes of a load balancing method that indicate the efficiency of its algorithms. Table 4 summarizes the environments used, and Table 5 presents the performance metrics of the load balancing algorithms.

Table 4 Critical analysis based on environment used
Table 5 Critical analysis based on performance metrics of load balancing Algorithm

5 Conclusion

Numerous algorithms have been suggested to solve the LB problem. This critical and comprehensive review gives researchers scope for advancing LB algorithms for the cloud computing environment (CCE). The paper will help in identifying research problems aimed at further improving different QoS parameters. It presented various LB techniques across different environments, simulators and tools, and QoS parameters, i.e. waiting time, response time, throughput, reliability, energy consumption, etc. These parameters are crucial for efficient LB algorithms and play a vital role in selecting and designing new LB algorithms in future work.