1 Introduction

Distributed computer systems are one of the main results of networking technology which are made from the connection of heterogeneous and homogeneous computing resources and high performance computing systems [13]. Users send their requests to the distributed system and the operating system distributes the jobs among the computers to respond in minimum time without informing the users about this distribution.

Since the entering tasks are based on random and irregular patterns and due to the heterogeneity of resources, it is conceivable that a computer resource is overloaded in one site of the distributed system while in another site there is an idle resource. Thus, resource management and load balancing are vital challenges in designing distributed systems to utilize complete computing power of resources [47].

Load balancing algorithms are divided into two categories of static and dynamic schemes. The static schemes do not use any system information and simply utilize an average behavior of the system. In contrast, dynamic schemes consider the instantaneous system states for distributing jobs. Despite dynamic schemes having more precision compared to static schemes, the overhead costs of dynamic schemes from exchanging necessary system information causes static schemes to have equal or even better performance than dynamic schemes [2, 8, 9]. Thus, this paper focuses on static load balancing problem.

Jobs in a distributed system can be categorized based on their resource usage characteristics and ownership. Based on the number of job classes considered, we have a single-class or multi-class (multi-user) job distributed system. In this paper, we consider the static load balancing problem for multi-user jobs in a distributed computer system that consists of heterogeneous host computers (nodes) interconnected by a communication network.

One of the main objectives of load balancing problem is minimizing the makespan, which means the maximum load over all the machines of the distributed system [911]. The load balancing schemes which aim to minimize the makespan are desired to achieve a balanced allocation of jobs over all machines based on their computational power to minimize the computation time of all the jobs. Thus, it can be said that the aim of minimizing the makespan is the fair allocation of jobs at the level of end-system and in proportion to the computational powers of the machines. Because of the responsibility of distributed system in processing users’ jobs, minimizing the expected response time of users’ jobs must be considered in designing a solution for the load balancing problem.

For balanced allocation of users’ jobs among machines and achieving approximately equal expected response time for all of the users, a measure has been introduced call the fairness index [12]. A fairness index approximately equal to 1 means an equitable allocation of users’ jobs among the machines or a fair allocation of jobs at the level of end-user. The proposed load balancing scheme is focused on minimizing two measures of makespan and fairness index, since its objective is establishing fairness in distributing jobs at the level of both end-system and end-user.

In this paper, it is assumed that the loads of the entry jobs to the distributed system are arbitrarily divisible. These loads have the property that all jobs in the load demand an identical type of processing. This type of load can be partitioned into any number of load fractions. Also, the load fractions may have precedence relations or not. The loads that have no precedence relations can be independently processed. It is assumed that the entry loads to the distributed system are of this type. There are various examples applications of this kind of loads, such as image processing applications, signal processing applications, matrix computations, etc. [1316].

In a distributed system, each user only focuses on his/her jobs and desires to have more computing power. Thus, users have a selfish behavior in distributing their jobs and do not care about the performance of other users [17]. On the other hand, performance of the job allocation for each user depends on the allocation of other users’ jobs. So, if a user knows the allocation of other users’ jobs, can achieve to a better allocation of his/her jobs.

Game theory is a method for modeling a phenomenon when decision-makers interact. A game is composed of three main components: players, strategies of each player, and the players’ preferences [1820]. Originally, this theory was presented for modeling problems in the field of economics. However, this theory also has been applied to problems in the field of computer networking, especially to problems of resource allocation [21, 22] and also distributed systems [2, 7, 2326].

In all of the game-theoretic models, the basic entity is a player. The selected decision by the player in each step of the game is the strategy profile of the player in that step and the total strategy profile of all players is termed the overall strategy profile of the game. A utility function is defined for each player of a game-theoretic model which is an estimate of the obtained payoff for the player based on the overall strategy profile of the game. It is assumed that the players are rational, i.e. based on the selected decision of other players, the player will take a decision that increases his/her payoff. So, each player attempts to earn a higher payoff. In non-cooperative game models, the selected strategy by each player is independent of the others. However, in the cooperative games, the players cooperate in order to adopt a strategy. Since, in each step players adopt a strategy that maximizes their payoff, a solution of a game is an equilibrium. This equilibrium is an overall strategy profile of the game where each player can’t earn a better payoff by changing his/her strategy if the strategy profile of the other player(s) are fixed. So, the equilibrium is stable and the players have no incentive to change their strategy profile. The determination of the utility function for the players is an important part of a game model, because the players select their strategy profiles based on the payoff that is obtained from their utility function. So, the solution of the game is the result of this function [17, 18].

If the above mentioned components of the game theory are determined for a system, then the system can be modeled using this theory. Modeling a problem using game theory and applying rules among agents help in controlling agents’ behavior. Every efficient and optimal solution of the load balancing problem depends on the allocated load of users’ job among system resources. Because of the dependency of users’ allocating decisions on the result of load balancing, this type of problem can be modeled using game theory [17].

In this paper the load balancing problem is modeled as a non-cooperative game among users of the distributed system. Because of the selfish nature of users in the distributed system, the load balancing game is modeled in such a way that the decision of each user, besides considering the user’s expected response time, attempts to balance workload based on the current computational power of the system.

In distributed systems, each user desires to execute his/her jobs in minimum time and utilize the maximum computing power of system resources. If the system is load balanced during the job execution then these goals are achieved. Thus, game theory is used to model the load balancing problem and the game solution is used for the job allocation that leads to load balanced job execution.

Since all players are trying to achieve their goals and the game ends when all players are reached quantitatively their payoff, it can be said that a game-theoretic load balancing scheme tries to establish fairness in the obtained payoff by all of the players. Since, the proposed load balancing scheme both minimizes the users’ expected response time and balances the distribution of the loads among all machines, we conclude that this method offers quality of service guarantees at the both user and system levels. It is for this reason that the utility function of each user is estimated by the combination of two parameters: the expected response time and the unbalanced allocation error.

The strategy profile of the game is also Pareto optimal. This is the case when there is no strategy profile that Pareto dominate it [17, 18, 27]. Based on [20, 28] in a multi-agent system there is no guarantee that the Nash equilibrium of a game is the Pareto-efficient solution. It is also notable that certain equilibriums in multi-agent system may not be stable [28]. Thus, in this paper, evolutionary algorithms are applied to approximate the optimal or near-optimal solution of the game based on the concept of Nash equilibrium. Although finding Pareto optimal solution is not guaranteed, simulation experiments show appropriate distribution of the workload for the proposed load balancing scheme.

The paper is structured as follows. The considered distributed computer system model and components of the proposed non-cooperative load balancing game are described in Sect. 2. Section 3 describes the methodology of approximating optimal or near-optimal distribution of the workload using genetic algorithm and an introduced hybrid population-based simulated annealing algorithm. Performance evaluation of the proposed scheme and simulation results are presented in Sect. 4. Finally, Sect. 5 presents the conclusions of the paper.

1.1 Related work

Many studies exist about the load balancing problem, in which different methods such as deterministic or heuristic approaches are used [5, 25, 29, 30]. However, there exist some studies that use the non-cooperative approach for load balancing in distributed systems [2, 25].

Most of the above used the global approach, where the focus is on minimizing the expected response time of the entire system over all the jobs. The schemes that implement the global approach determine a load allocation to obtain a system-wide optimal response time and the fairness of allocation was not considered.

Since game theory seeks for an equilibrium for all the players of the game, using this theory helps in finding a fair load balancing solution. Thus, game-theoretic load balancing schemes are fair, however there is no guarantee for fairness for methods that are not based on game theory [17, 25].

In [25], Grosu and Chronopoulos proposed an algorithm for solving the load balancing problem in distributed computer systems based on game theory that modeled the system as a non-cooperative game among users. Optimal allocation of users’ jobs is the focus in this scheme. The expected response time of each user is considered as his/her payoff. In this scheme, the best response of players is deterministically estimated by the proposed method. In [31], Son et al. proposed a load balancing model for future internet. In this research the static load balancing problem is modeled as a non-cooperative game among users and as a cooperative game among the processors. The objective of the introduced load balancing game is minimizing the overall expected response time. The cooperative game tries to establish fairness between processors and the aim of the non-cooperative game is establishing fairness between the users. In [32], a class-based multipath routing algorithm is proposed to support Quality of Service. This algorithm is designed to have traffic load information in one stable period as guidance for controlling the traffic forwarding in the next stable period. In this algorithm the resources on multipath can be selected by the path selection function to achieve load balancing, increase network throughput and reduce the queuing delay. In a study investigated by Kołodziej and Xhafa the task scheduling problem is modeled as a non-cooperative game in a Grid system [24]. In this scheme, security parameters are considered as a measure in the performance of solution. A genetic algorithm is utilized for solving the game model. In [23], Rzadca and Trystram studied resource management in computing Grids by analyzing the selfish load balancing problem. The problem is modeled as a game among clusters. Players’ strategies are modeled in a way that the clusters are promoted to cooperate in sending their jobs to each other. In [5], a dynamic agent-based method is presented by Eftekhari et al. for solving the load balancing problem. Some local and global agents are considered in the distributed system. Load balancing is established by using message passing via a threshold-based mechanism. In a study conducted by Mani [30], a solution has been represented for solving the static load balancing problem in distributed computing systems using real-coded genetic algorithm. A mixture of expected response time of jobs and communication delay parameters are considered in obtaining the solutions. A genetic algorithm converges to an optimal or near-optimal distribution of jobs among machines by iterating on an initial population of the allocation probabilities of jobs.

1.2 Our contribution

The contribution of this paper is twofold. Firstly, it is to introduce the new parameter of unbalanced allocation error to estimate the utility function of each player, which helps in allocating users’ jobs to be more balanced and be based more on the current computation power of system resources. Using this parameter helps to model the game in a way that each player considers the system performance besides user performance in his/her utility function. Thus, the load balancing game can consider performance at both system level and user level.

Because of the selfish nature of users in the distributed system, several non-cooperative load balancing schemes have been introduced the past. The novelty of the proposed scheme is that it considers both end-user and end-system in solving the load balancing problem while the players of the game are only the users. However, in the previously proposed and studied load balancing schemes, the players that are considered as an entity of the system (i.e. users, machines, communication system, etc.) concentrate on a strategy that only increases their own payoff.

Fairness is a major issue in many modern utility computing systems such as Amazon Elastic Compute Cloud [33] and Sun Grid Compute Utility [34] where users pay a price for the compute capacity they actually consume. So, the proposed load balancing scheme can be applied to the cloud computing.

Secondly, it is to introduce a new hybrid population-based simulated annealing algorithm for approximating the optimal or near-optimal solution of the game using the concept of Nash equilibrium. This algorithm tries to control the exploration and exploitation of the problem space [3537]. Since the dimensions of the load balancing problem increase depending on the number of users and computers of the distributed system, evolutionary algorithms can provide an effective method for optimum search over solution pool.

2 Non-cooperative load balancing game

2.1 Distributed system model

It is assumed that each computer of the distributed system is modeled as an M/M/1 queue, which means the arrival rate of jobs to the computer is according to the Poisson distribution, and their processing time are according to the exponentially distribution. All computers are connected together by a communication network. The considered parameters for describing the system model in this paper are as follows [25, 30, 38].

  • m: Number of users

  • n: Number of computers

  • \(\mu _i \): The average processing rate of the computer i

  • \(\lambda _{j }\): The average job generation rate of user j

Fig. 1
figure 1

A general model of the distributed computer system

Figure 1 is shows a general model of a distributed computer system in which the computers are modeled as M/M/1 queues and computer i has the Poisson arrival rate of jobs with an average rate of \(\Phi _\mathrm{i}\) [3942].

2.2 Load balancing game model

In distributed systems, each user desires to execute his/her jobs in minimum time and utilize maximum computing power of system resources. Because of this selfish manner of users, the load balancing problem in this paper is modeled as a non-cooperative game among users of the distributed system.

To model this problem by game theory, the strategy of each player is considered as the fractions of the jobs that he/she sends to the computers of the distributed system for processing. If p\(_\mathrm{ji}\) represents a fraction of user j’s job which is allocated to computer i, the strategy profile of player j is as the vector \(\hbox {p}_\mathrm{j} = (\hbox {p}_\mathrm{j1}, {\ldots },\hbox {p}_\mathrm{jn}\)), and the overall strategy profile of the load balancing game is as the vector P = (\(\hbox {p}_{1},{\ldots }\hbox {p}_\mathrm{m}\)). So, the overall strategy profile of the game is an \(\hbox {m}\times \hbox {n}\) matrix where the jth row of this matrix is represents the strategy profile of the jth player. Thus, user j will send his/her jobs to computer i by the rate of p\(_\mathrm{ji}\lambda _\mathrm{j}\). Figure 2 represents a general model of the expressed load balancing problem in a distributed system.

Fig. 2
figure 2

A general model of the load balancing problem

The proposed load balancing scheme can be applied to the cloud computing. Cloud computing is an emerging commercial infrastructure paradigm. This technology promises to eliminate the need for maintaining expensive computing facilities by companies and institutes. Clouds are becoming an alternative to clusters and parallel production environments for scientific computing applications [4345]. Cloud systems provision on demand to users a cluster of computers (virtual machines). Such clusters could be homogeneous or heterogeneous distributed systems where the resources are geographically distributed. An example of such clouds is Amazon EC2 [33] and GoGrid [46]. We can make the realistic assumption that the user job request scheduling is served by software agents. They are responsible to negotiate the service level agreement (SLA) with the cloud authority. So, we consider a game theory to model this process which must be (approximately) optimal (in some sense) to both the user and the system. At the user end we assume that there exists a scheduler which packages the job request in proper way for the cloud system. So we can assume that jobs streams are created by arbitrarily dividing the entire computation tasks.

By considering the afore-mentioned model of the distributed system and the load balancing game model, some initial preferences must be contemplated in order to have legitimate and valid strategies for each player. These initial preferences are as follows [2, 25, 30, 47].

  1. I.

    \(\sum \limits _{j=1}^m {\lambda _j } \,\langle \,\sum \limits _{i=1}^n {\mu _i,\,} \,i=1,\ldots ,n,\,\,\,j=1,\ldots ,m\)

  2. II.

    \(p_{ji} \ge 0,\,\,i=1,\ldots ,n,\,\,\,j=1,\ldots ,m\)

  3. III.

    \(\sum \limits _{i=1}^n {p_{ji} } =1,j=1,\ldots ,m\)

  4. IV.

    \(\sum \limits _{j=1}^m {p_{ji} } \lambda _j \langle \mu _i,\,i=1,\ldots ,n\)

Inequality I states that the total arrival rate of jobs must be less than the total processing rate of the system to assure the processing of all jobs. Constrains II and III are guarantees that all of the user’s job must be allocated to the computers. Inequality IV shows that the job arrival rate of users to each resource must be less than the processing rate of the resource.

In the introduced load balancing scheme the users are considered as players of the game. Each user desires to execute more amounts of his/her workload to achieve more payoff, without considering the payoff of the other players or the payoff of the system. To enforce each player to care about improving the performance of the system besides increasing the user’s payoff, the utility function of players is determined by a combination of two parameters. The first parameter is the expected response time of user that is about the user’s payoff and the second parameter is about the unbalanced allocation error of users’ jobs whose purpose is to obtain a balanced allocation of jobs among machines. The following explains a way of estimating the utility function of each player.

According to [2, 25], if P represents the strategy profile of the load balancing game, the expected response time of computer i, is computed as follows.

$$\begin{aligned} E_i (p)=\frac{1}{\mu _i -\sum \limits _{j=1}^m {p_{ji} \lambda _j } } \end{aligned}$$
(1)

Since user j sends fraction p\(_{ji}\) of his/her jobs to computer i, the overall expected response time of user j will be calculated as follows [2, 25].

$$\begin{aligned} D_j (p)=\sum \limits _{i=1}^n {\frac{p_{ji} }{\mu _i -\sum \limits _{k=1}^m {p_{ki} \lambda _k } }} \end{aligned}$$
(2)

The second parameter considered in calculating the players’ payoff is the unbalanced allocation error of users’ jobs. Using this parameter helps each user to distribute jobs based on the current computational power of computers. Thus, declaring this parameter in the game model will lead to consideration of system performance in the utility function of the players. To estimate the unbalanced allocation error, firstly, the mean utilization of users’ jobs of the computational power of system must be computed. The computer executing the maximum load of user j’s jobs is considered as C, which is found as follows.

$$\begin{aligned} \frac{p_{jC} \lambda _j }{\mu _C -\sum \limits _{\begin{array}{l} l=1 \\ l\ne j \\ \end{array}}^m {p_{lC} \lambda _l } }\ge \,\frac{p_{ji} \lambda _j }{\mu _i -\sum \limits _{\begin{array}{l} k=1 \\ k\ne j \\ \end{array}}^m {p_{ki} \lambda _k } }\,\,,\,\,\forall i=1,\ldots ,n \end{aligned}$$
(3)

Let \(U_{ji}\) be the utilized computing power of computer i by user j in proportion to all the computers in the distributed system which is computed based on the ratio of transmitted load of user j to computer i in proportion to the transmitted load of this user to computer C, i.e.:

$$\begin{aligned} U_{ji} =\,\frac{p_{ji} (\mu _C -\sum \limits _{\begin{array}{l} l=1 \\ l\ne j \\ \end{array}}^m {p_{lC} \lambda _l } )}{p_{jC} (\mu _i -\sum \limits _{\begin{array}{l} k=1 \\ k\ne j \\ \end{array}}^m {p_{ki} \lambda _k } )} \end{aligned}$$
(4)

Thus, the average utilized computing power of all n computers in the distributed system for user j, is estimated by the following equation.

$$\begin{aligned} \bar{U}_j =\frac{\sum \limits _{i=1}^n {U_{ji} } }{n} \end{aligned}$$
(5)

The unbalanced allocation error of user j’s jobs among n computers is calculated by the mean square error of \(U_{ji}\), and \(\overline{U_j }\), in the following way.

$$\begin{aligned} \Delta _j =\frac{\sum \limits _{i=1}^n {(U_{ji} -\overline{U_j } )^2} }{n} \end{aligned}$$
(6)

Since the goal of load balancing is optimum allocation of users’ jobs among system resources and maximizing the system performance, players also must follow this objective, the second parameter of unbalanced allocation error is applied for this reason.

Let \(\alpha \) and \(\beta \) be fixed numbers in the interval of [0, 1]. We will use them to express trade-off between the D\(_\mathrm{j}\) and \(\Delta _\mathrm{j}\). By the linear combination of the two parameters D\(_\mathrm{j}\) and \(\Delta _\mathrm{j}\), the payoff of player j is estimated as follows.

$$\begin{aligned} F_j =\alpha \times D_j +\beta \times \Delta _j \end{aligned}$$
(7)

In other words, the greater \(\alpha \) implies the more attention of the player is paid to the end-user’s performance (i.e. user’s expected response time), and the greater \(\beta \) implies the more attention is paid to the system performance. Since, for the function F\(_\mathrm{j}\), the closer the value of the \(\alpha \) to 1, the more concentration of the load balancing game to the end-user for the expected response time of their jobs. Also, the closer value of the \(\beta \) to 1, the more concentration of the load balancing game on the performance of the machines of the distributed system. Choosing the appropriate values for the parameters of \(\alpha \) and \(\beta \) depends on the objectives that are determined in designing the distributed system. Since \(F_{j}\) represents the payoff of player j, the player prefers a strategy with lower \(F_{j}\) value between the strategy profiles \(p_{j}\) and \(p_{j}^{*}\), in order to receive a higher payoff.

3 Solving the load balancing game

In multi-agent systems (MAS), it may be that certain equilibriums not be stable [28]. Moreover, there is no guarantee for Pareto-efficiency of the Nash equilibrium. Due to these reasons and according to many published studies on evolutionary algorithms as a search methods to find the Nash equilibrium in game theory [4851], in this paper efforts have been made to approximate the optimal or near-optimal solution of the load balancing game using evolutionary algorithms (EA) and the concept of Nash equilibrium.

Nash equilibrium of a game is a strategy profile in which no player can increase his/her payoff while other players’ strategies are fixed. Thus, if the strategy of player j is represented as \(p_{j}^{*}\) and other players strategies are shown as \(p_{-j}^{*}\), then the Nash equilibrium of the game is the strategy profile \(p^{*}\) which that has the following conditions [18].

$$\begin{aligned} F_j (p_{-j}^*,p_j^*)\mathop \ge \limits _j F_j (p_{-j}^*,p_j )\,\,\,\,\,\,for\,all\,p_{ji} \in p_j \end{aligned}$$
(8)

In this paper two methods based on evolutionary algorithms are introduced for solving the load balancing game. The first method is based on the genetic algorithm and the second one is an introduced algorithm called SAGA that uses the combination of a genetic algorithm and simulated annealing for solving the load balancing game.

Fig. 3
figure 3

The manner of applying evolutionary algorithm in the load balancing game

The general procedure for solving the load balancing game using evolutionary algorithms is the same. At first, an initial population of players’ strategy profiles are generated randomly based on the preferences mentioned in Sect. 2. The initial population is the fractions of users’ jobs which are allocated to the computers. So, according to the size of the population, several m\(\times \)n matrices as overall strategy profiles of the game will produce randomly the initial population.

In each iteration of the algorithm, for each player j, the evolutionary algorithm focuses only on the strategy profile of this player in order to approximate the optimal or near-optimal response of player j in proportion to the fixed strategies of other players. In other words, in iteration j of the whole procedure of the algorithm, operators of the EA work only on the strategy profiles of player j that are the jth row of the m\(\times \)n overall strategy profile matrices, while all other players play their last approximated near-optimal response strategies. It should be noted that, up to that moment, the last strategy profile of other players is the best approximated near-optimal strategy profile for them. The evolutionary algorithm iterates until the termination condition is satisfied for all the players.

As it was mentioned above, in the introduced general platform for solving the load balancing game using the evolutionary algorithm, all of the players except the player j selects a fixed strategy and the player j tries to select his/her best strategy according to the best strategy that is approximated for other players. This procedure of approximating the optimal strategy profile corresponds to the concept of estimating the Nash equilibrium in non-cooperative games.

At the end of the jth iteration the approximated near-optimal response of player j will be substituted in the overall strategy profile of the game. Figure 3 represents a general model for applying the evolutionary algorithm in the load balancing game. Table 1 shows the pseudo code of estimating the solution of the proposed game-theoretic load balancing algorithm.

Table 1 Estimating the solution of the proposed load balancing game

The termination condition is the same for all the evolutionary algorithms that are used in this paper. The evolutionary algorithm terminates when no player can reach further payoff regarding his/her past estimated near-optimal response, meaning that all players have selected their approximated Pareto optimal strategies. If \(F_{j}^{itr}\) represents the payoff of player j in iteration itr of the evolutionary algorithm, then \(F_{j}^{itr-1}- F_{j}^{itr}\) represents the improvement of player j’s payoff. The evolutionary algorithm terminates when the sum of square deviation of all players’ payoffs is less than a small number \(\varepsilon \), i.e.:

$$\begin{aligned} \sum \limits _{j=1}^m {( {F_j^{itr-1} -F_j^{itr} })} ^2\prec \,\,\varepsilon \end{aligned}$$
(9)

The type of genetic algorithm considered in this paper is the canonical genetic algorithm [35, 52], because of its simple operators and its appropriate speed which are evidenced in the simulations results. The operators of the genetic algorithm are simple crossover and swap mutation, introduced by Mani et al. [30] and the tournament selection operator [52, 53].

3.1 Hybrid population-based simulated annealing algorithm (SAGA)

Because of the computational complexity of load balancing problem, the hybrid population-based simulated annealing algorithm (SAGA) is introduced to estimate the optimal or near-optimal response of players. This algorithm is a mixture of a simulated annealing and a genetic algorithm.

This algorithm which is based on the stochastic gradient descent process that is used in the simulated annealing algorithm tries to perform more exploration in the initial runs in order to search better the complex space of the load balancing problem. The algorithm tries to gradually increase the amount of exploitation based on the functionality of the genetic algorithm to approximate a better solution based on the explored spaces. The procedure of the algorithm is explained as follows.

Similar to the procedure mentioned in Sect. 3, at first an initial population of solutions is generated randomly. Then SAGA starts with an initial temperature. To estimate the solution of players the algorithm is iterate for each player. In each iteration of SAGA, the population is sorted in ascending order of the fitness value of the player who corresponds to the iteration number (i.e. \(F_{j})\). This process of sorting the strategy profiles is performs based on the fitness function of the player for controlling the exploration and exploitation. Strategy profile of the population may consists of a different strategy profile for player j. Since all of the strategy profiles of the population have been sorted based on the player j’s fitness; in iteration j and for each overall strategy profile in the population the following process will be followed for player j.

The fitness value of the strategy profile of the game is compared with the fitness of the last strategy profile, which is the worst for player j. This difference value is shown by \(\Psi \). Performing the crossover or mutation operation on player j’s strategy profile is determined by a random amount of \(\delta \) which is generated with a normal distribution. If the amount of \(\delta \) is less than \(\exp (\frac{-k\Psi }{T})\), swap mutation is performed, otherwise simple crossover is performed. In here, k is a fixed number to scale the \(\Psi \), and T is the current temperature of hybrid population-based simulated annealing (SAGA).

After applying this procedure on all individuals of the population, the temperature decreases based on a linear relation according to the simulated annealing algorithm [54, 55], and the procedure will repeat until the termination condition is satisfied. So, in high temperatures of the SAGA, the probability to performing mutation is more than that of crossover; when the temperature decreases, the probability of crossover is increased and the probability of mutation is decreased, and so on. This operation of the SAGA algorithm shows that in high temperatures and for the strategy profiles of player j that have lowest fitness, more exploration will be done. In contrast, by decreasing the temperature for the strategy profiles of player j that have greater fitness, the probability of performing exploitation will be higher than exploration which is help the algorithm to converge to the near-optimal solution by searching the explored spaces. Table 2, shows the “Optimal Solution Approximation” procedure of Algorithm1 by applying the hybrid population-based simulated annealing algorithm.

4 Simulation results

The proposed load balancing algorithm has been simulated in a distributed computer system. The simulation results are presented in this section. The main parameters studied in the simulations are makespan and fairness index. The fairness index indicates the equality of users’ job allocation in the system [12], which is calculated as follows.

$$\begin{aligned} fi(D)=\frac{\left( {\sum \limits _{j=1}^m {D_j } }\right) ^2}{m\sum \limits _{j=1}^m {D_j^2}} \end{aligned}$$
(10)

In here, vector D is the expected response time of system users and is represented as \(\hbox {D} = (\hbox {D}_{1},{\ldots }, \hbox {D}_{j},{\ldots }, \hbox {D}_\mathrm{m}\)). The range of the fairness index is [0, 1]. The closer to 1 the fairness index is implied that the more equitable is the distribution of users’ jobs in the system.

Simulations are done on a system with 10 heterogeneous computers which are distributed among 15 users. Table 3 shows the configuration of the simulated distributed system. The first row of Table 3 shows the number of computers with their processing rates which are represented in the corresponding entry of the second row.

Table 2 Using SAGA to estimate the solution of the load balancing game
Table 3 Configuration of the simulated distributed system

The ratio of the total arrival rate of jobs to the aggregate processing rate of the system is defined as the system utilization. This parameter demonstrates the utilization of the processing rate of the machines in the distributed system by the entry jobs. The default value of the system utilization in the simulations was set to 70 %.

The genetic algorithm (GA) and the introduced hybrid population-based simulated annealing algorithm (SAGA) are applied to the proposed load balancing scheme, which are named GALB and SAGALB, respectively. The input parameters of the genetic algorithm and simulated annealing are determined in the simulations and they depend on the assumptions that are considered in designing the distributed system.

4.1 Performance evaluation

In this section we study the performance of the proposed load balancing scheme. To indicate the fair allocation of users’ jobs, the expected response time of users’ approximated optimal strategies in GALB and SAGALB in proportion to the different number of computers in the distributed system, is represented in Table 4. Users’ estimated expected response times are close to zero, which shows the near-optimal system allocation of users’ jobs over computers. This parameter is a portion of utility function of the players in the game. Also, the similarity of users’ expected response times indicates the fair distribution of jobs and fair utilization of system resources by users.

Table 5 represents the expected response time of computers for different numbers of users in the distributed system for GALB and SAGALB. According to Table 5, computers with similar processing rate achieve the same expected response time, which is more obvious for GALB. These results indicate that the expected response time is independent of the allocation of jobs to these computers [2].

Table 4 Expected response time of users in GALB and SAGALB, for different numbers of computers (n)

The introduced parameter for solving the load balancing game is the unbalanced allocation error. To demonstrate the impact of this parameter, the variation of the makespan for different values of \(\beta \) and the constant value of \(\alpha \)=0.3 is shown in Fig. 4.

As it is concluded from Fig. 4, by increasing the parameter of \(\beta \), i.e., the impact of the unbalanced allocation error on payoff of the players, the makespan decreases. Since, the greater value for the coefficient of the parameter of unbalanced allocation error (i.e. \(\beta \)) in the utility function, the more concentration of the players on distributing the loads based on the computational power of the machines. In other words, by increasing the parameter of \(\beta \), the load balancing game will care more about the performance of the end-system.

To investigate the good fitting of utility function of players the performance of the system is studied for different values of \(\alpha \) and \(\beta \). According to Eq. (7), by increasing the amount of \(\alpha \), the players will be more concerned about minimizing the users’ expected response time. As a result, more effort from players to achieve lower expected response time will increase the fairness index. Also, increasing \(\beta \) leads to more attention of the players to the performance of the end-system which results in the makespan being minimized.

Table 5 Expected response time of computers in GALB and SAGALB, for different numbers of users (m)

Since both measures of makespan and fairness index have been evaluated in the simulations to consider the performance of both end-system and end-user, the values of the parameters of \(\alpha \) and \(\beta \) are determined based on these two measures by trial and error. Fairness index and makespan of the distributed system for five different values of \(\alpha \) and \(\beta \) are presented in Table 6.

It can be observed from Table 6 that in most cases increasing \(\beta \) in the GALB and SAGALB schemes leads to the makespan and fairness index being decreased, in contrast to \(\alpha \) being increased. Since the optimal solution is approximated, results may be associated with small deviations. So, GALB and SAGALB result in lower values of makespan in (\(\alpha \)=0, \(\beta \)=1), and higher values in (\(\alpha \)=1, \(\beta \)=0). However the Results of GALB have some tolerance in terms of the fairness index. Since the values of \(\alpha \)=0.3 and \(\beta \)=0.7, have shown near-optimal results for both GALB and SAGALB schemes, these values are chosen in the simulations by trial and error.

However, the measures of makespan and fairness index determine the degree of importance of optimum service to the users and increasing the performance of the machines for the distributed system. Thus, determining the best values for these parameters may not be possible empirically. In fact, estimating the optimum values for these parameters requires the introduction of a new optimization problem.

4.2 Comparison results

The proposed load balancing scheme is compared with two other schemes [5, 25], which are named as Thresh and Non-Coop, respectively. Since Non-Coop [25] is a fair and static scheme and because the dynamic nature of the Thresh [5] scheme, these schemes are selected to be compared with the proposed scheme. Makespan and fairness index of GALB and SAGALB are compared to Non-Coop and Thresh schemes, in terms of the number of different computers and users in the distributed system in Figs. 5 and 6, respectively. The processing rate of the computers that are added to the distributed system is in the interval of the processing rate of the default machines which is shown in Table 3.

Figure 5 indicates the performance of the proposed schemes by increasing the number of computers in the distributed system. It is concluded that GALB and SAGALB result in a more acceptable makespan compared to Non-Coop and Thresh. Also, by increasing the number of computers, which enlarges the size of the distributed system, although the values of the makespan will gradually decrease but the superiority of the proposed methods is still preserved. The performance of the GALB and SAGALB schemes are almost the same in terms of the makespan, however the SAGALB shows a better performance for different sizes of the distributed computer system.

Fig. 4
figure 4

The variation of makespan by changing the coefficient of the unbalanced allocation error parameter

Table 6 Makespan and fairness index for different values of \(\alpha \) and \(\beta \)
Fig. 5
figure 5

Makespan and fairness index in proportion to the number of computers

Fig. 6
figure 6

Makespan and fairness index in proportion to the number of users in the distributed system

Fig. 7
figure 7

Makespan and fairness index in proportion to the system utilization

The fairness index of the proposed schemes of GALB and SAGALB is nearly equivalent to the Non-Coop and close to 1.0, for different number of computers which indicates the fairness in distribution of jobs. The fairness index of the Thresh gradually decreases which shows the inability of this scheme in distributing the jobs fairly.

As concluded from Fig. 6, in terms of makespan, the results of the proposed schemes show significant improvements compared to Non-Coop and Thresh schemes. For small number of users in the distributed system (i.e. less than 20 users), the performance of the GALB and SAGALB schemes will be the same in terms of the makespan, however by increasing the number of users, the performance of the GALB will gradually become a little better than SAGALB scheme. Moreover, by increasing the number of users, the fairness index of the proposed schemes are remains close to 1.0. However, the performance of the Thresh scheme in terms of the fairness index gradually decreases as the number of users increase.

The performance of the proposed schemes is evaluated for different values of system utilization to assess these schemes by varying the total job arrival rate to the system. Figure 7 shows the makespan and fairness index of the load balancing schemes for different values of system utilization that are varied from 0.1 to 0.7.

As demonstrated in Fig. 7, when the system utilization is 0.1, the makespan of the GALB, SAGALB and Thresh schemes are almost the same and the Non-Coop scheme has a higher makespan. By increasing the system utilization and the total job arrival rate to the distributed system, the GALB and SAGALB schemes shows a better performance in terms of the makespan in comparison to the other schemes and the performance of the Thresh scheme is decreases. Although the performance of the GALB and SALB are almost the same for different values of the system utilization but the GALB presents a better performance in some cases in terms of the makespan. The superiority of these schemes in comparison to the other schemes is maintained by increasing the system utilization. The fairness index of the proposed schemes is similar to the Non-Coop scheme and will be close to 1.0 by increasing the system utilization. However, for the Thresh scheme, the fairness index is decreases gradually by increasing the system utilization, which shows that the Thresh is not a fair load balancing scheme.

The simulation results demonstrate the ability of the introduced evolutionary algorithms in approximating the optimal or near-optimal solution for the proposed load balancing game. The appropriate results for makespan and fairness index indicate that the estimated solution shows a good performance for both end-user and end-system.

5 Conclusion and future work

In this paper a method is proposed for load balancing in distributed systems that modeled the load balancing problem as a non-cooperative game among users. The strategy of each player is the fractions of his/her job that is sent to the computers. The preferences of players are a combination of two parameters as user’s expected response time and an introduced parameter named unbalanced allocation error. The unbalanced allocation error parameter help to distribute the workload based on the current computing power of the machines. This leads to each player preferring the strategy profile that besides maximizing user’s utility by minimizing his/her expected response time, maximizes system performance by minimizing the makespan, too. The combination of these two parameters will make the load balancing scheme to care about both of the end-user and end-system.

Since there is no guarantee for Pareto-efficiency of the Nash equilibrium, the optimal solution in the proposed game model, is approximated based on the concept of Nash equilibrium by a genetic algorithm, and a proposed hybrid population-based simulated annealing algorithm (SAGA).

The simulation experiments show that the introduced load balancing scheme, besides fairly distributing users’ jobs over computers, results in a suitable and near-optimal makespan.

The values of the coefficients of \(\alpha \) and \(\beta \) in the utility function of the players is determined empirically. In fact the value of these coefficients depends on the objectives that are determined in designing the distributed system. So, estimating the optimal or near-optimal values for these coefficients in the introduced load balancing scheme needs the definition of a new optimization problem which will study in further works.