1 Introduction

Software-Defined Networking (SDN) is a network architecture that decouples the data plane from the control plane and manages the network with a global view [1]. Owing to its centralized control, open interfaces and network programmability, SDN can improve network performance and has been deployed in the Internet and in data center networks [2]. However, a single SDN controller does not have enough capacity to process the increasing number of traffic flows and applications in large-scale networks [3]. The studies in [4,5,6] propose realizing a logically centralized controller with multiple physically distributed controllers to improve the scalability and reliability of the control plane. Specifically, a network is partitioned into several domains, each of which has one domain controller that manages its switches and flow requests [7]. Controllers exchange domain information with each other to maintain a consistent network view. As traffic varies in the network [8], controllers in different domains may handle different numbers of flow requests, and a static mapping between switches and controllers may result in unbalanced load allocation: hot controllers with insufficient control capability and cold controllers with low resource utilization [9].

Dynamic switch migration is an elastic control approach to solve the problem of unbalanced load distribution on controllers. It migrates the control of switches from overloaded controllers to underloaded controllers. However, existing schemes only focus on the balancing performance of control load on controllers but ignore migration efficiency, which may lead to high migration costs, increase control overheads, and squander network resources. In this paper, we propose Efficiency-Aware Switch Migration (EASM) to achieve good load balancing performance on the controllers and low migration costs. The main contributions of this paper are summarized as follows:

  • We identify the inefficient migration problem of existing switch migration schemes and use analysis and examples to explain the undesirable results they cause.

  • We propose EASM for effective switch migration. EASM consists of three algorithms. EASM-1 calculates trigger factors to measure the load balancing performance of controllers. If the trigger factor exceeds a threshold, EASM-2 selects migrating switches by solving the migration efficiency problem, which characterizes load balancing rate and migration cost simultaneously. EASM-3 changes the mapping relationship between switches and controllers.

  • We evaluate the performance of EASM against baseline schemes. The results show that, when controller load imbalance occurs, EASM reduces the controller response time by about 21.9% through efficient migration, improves the controller throughput by 30.4% on average, decreases the migration cost and migration time, and achieves better load balancing performance.

The rest of the paper is organized as follows. Section 2 illustrates the motivation. Section 3 gives an overview of the EASM strategy. Sections 4 and 5 detail the two components of EASM: load balancing judgement and switch migration design. The simulation results are presented and analyzed in Section 6. Section 7 reviews the related work. Section 8 concludes this paper.

2 Motivation

Switch migration is usually used to adjust the distribution of controller loads by migrating switches from an overloaded controller to an underloaded controller. However, existing migration designs struggle to achieve both good load balancing performance and low migration cost. In this section, we illustrate the problem through the example in Fig. 1 and compare an existing solution with our scheme.

Fig. 1 A motivating example for comparison

Figure 1 shows an SDN using three distributed controllers. In the figure, the network consists of Domain1, Domain2 and Domain3; each domain has several switches and is controlled by its domain controller. Table 1 shows the flow arrival rate of each switch at time T. At time T, we use the total flow rate of a domain to represent the load of its controller, and use the normalized load variance to represent the load balancing rate (LBR), as shown in Eq. (1), where Li is the load of the ith controller and \( \overline{L} \) is the average load of the n controllers. LBR represents the degree of closeness to the ideal load distribution: the higher the LBR, the more balanced the load distribution.

$$ LBR=\frac{1}{n}\cdot \frac{\sum \limits_{i=1}^n\left({L}_i-\overline{L}\right)}{\sqrt{\sum \limits_{i=1}^n{\left({L}_i-\overline{L}\right)}^2}} $$
(1)
Table 1 The flow arrival rate of switches in the network

In Fig. 1a, the controllers are initialized with unbalanced loads, and the controller loads of the three domains and the resulting load balancing rate are computed as follows.

  • Initial network state under Load Imbalance (LI):

$$ {\displaystyle \begin{array}{l}{Load}_{LI}\left({c}_1\right)=30+30+30=90 KB/s\\ {}{Load}_{LI}\left({c}_2\right)=30+30+40+50=150 KB/s\\ {}{Load}_{LI}\left({c}_3\right)=30+40=70 KB/s\\ {}{LBR}_{LI}=0.641\end{array}} $$

In Fig. 1a, c1 and c3 are underutilized controllers, while c2 is an overloaded controller. Existing Switch Migration (ESM) follows OpenFlow 1.3 [10], in which a controller can take one of three roles: master, equal and slave. The master controller processes Packet-in requests sent from switches; equal and slave controllers are used as backups. Each switch connects to one master controller and several slave controllers.

Figure 1b shows the result of using an Existing Switch Migration (ESM) scheme [11]. In the figure, c3 has the lightest load and s7 has the highest flow rate (50KB/s), so ESM migrates s7 from c2 to c3 to balance the controller loads. After the migration completes, the controller loads and load balancing rate are updated as follows.

  • Existing Switch Migration (ESM)

$$ {\displaystyle \begin{array}{l}{Load}_{ESM}\left({c}_1\right)=30+30+30=90 KB/s\\ {}{Load}_{ESM}\left({c}_2\right)=30+30+40=100 KB/s\\ {}{Load}_{ESM}\left({c}_3\right)=30+40+50=120 KB/s\\ {}{LBR}_{ESM}=0.852\end{array}} $$

Switch migration itself incurs a migration cost in the network. We approximate the migration cost MC by the product of the flow rate and the hop count.

$$ {MC}_{ESM}=50\times 4=200 KB/s $$

Existing Switch Migration incurs high migration costs, which aggravate the burden on the overloaded controller. The larger the value of MC, the lower the controller throughput. If switch migration considers both the load balancing rate and the migration cost, the controller performance will be better.

Figure 1c shows the migration result of EASM. By simultaneously considering the load balancing rate and the migration cost, EASM migrates s6, a switch with both a high flow rate and fewer migration hops, from c2 to c3 to balance the controller loads. After the migration completes, the controller loads and load balancing rate are updated as follows.

  • Efficiency-aware Switch Migration (EASM)

$$ {\displaystyle \begin{array}{l}{Load}_{EASM}\left({c}_1\right)=30+30+30=90 KB/s\\ {}{Load}_{EASM}\left({c}_2\right)=30+30+50=110 KB/s\\ {}{Load}_{EASM}\left({c}_3\right)=30+40+40=110 KB/s\\ {}{LBR}_{EASM}=0.915\\ {}{MC}_{EASM}=40\times 3=120 KB/s\end{array}} $$

Table 2 shows the comparison of the three scenarios. Compared with ESM, the EASM scheme not only improves the load balancing rate but also reduces the migration cost. The example shows that the load balancing rate and the migration cost must be considered jointly in switch migration in order to improve controller performance. We solve three problems in this paper: (i) determining which switches should be migrated; (ii) building the migration efficiency model to select migrating switches and controllers; (iii) efficiently implementing switch migration.

Table 2 The comparison of three scenes
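
The load and migration cost arithmetic of the motivating example can be reproduced with a short Python sketch. The flow rates are those used in the load computations above; since Table 1 and Fig. 1 are not reproduced here, the switch labels other than s6 and s7 are assumed for illustration, as are the helper names `controller_loads` and `migrate`. LBR is not computed here.

```python
# Flow arrival rates (KB/s) per domain. Switch labels other than s6 and s7
# are assumptions made for illustration.
domains = {
    "c1": {"s1": 30, "s2": 30, "s3": 30},
    "c2": {"s4": 30, "s5": 30, "s6": 40, "s7": 50},
    "c3": {"s8": 30, "s9": 40},
}


def controller_loads(domains):
    """Controller load = total flow rate of the switches in its domain."""
    return {c: sum(rates.values()) for c, rates in domains.items()}


def migrate(domains, switch, src, dst, hops):
    """Move `switch` from `src` to `dst` on a copy of `domains`.

    Returns (new_domains, migration_cost), where the migration cost is
    approximated as flow rate x hop count, as in the text.
    """
    new = {c: dict(rates) for c, rates in domains.items()}
    rate = new[src].pop(switch)
    new[dst][switch] = rate
    return new, rate * hops


loads_li = controller_loads(domains)
esm, mc_esm = migrate(domains, "s7", "c2", "c3", hops=4)    # ESM moves s7
easm, mc_easm = migrate(domains, "s6", "c2", "c3", hops=3)  # EASM moves s6
loads_esm = controller_loads(esm)
loads_easm = controller_loads(easm)

print("LI  :", loads_li)
print("ESM :", loads_esm, "MC =", mc_esm)
print("EASM:", loads_easm, "MC =", mc_easm)
```

The sketch makes the trade-off explicit: moving s6 leaves c2 and c3 with equal loads while crossing fewer hops with a lower flow rate.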

3 EASM overview

In this section, we first introduce the notations in this paper and then describe the design of EASM.

3.1 Notations

We model the SDN as an undirected graph G = (V, E), where V and E are the node set and the link set, respectively. We assume that all controllers are deployed optimally in the topology [12], and each controller manages some switches. The primary notations used in this paper are listed in Table 3.

Table 3 Notations

3.2 EASM strategy

EASM implements switch migration from the perspective of migration efficiency to improve the load balancing rate and reduce migration costs. Figure 2 shows the overall design of EASM, which consists of two modules: (1) load balancing judgement; (2) switch migration design. In the load balancing judgement module, EASM measures the controller loads, builds the load difference matrix, and uses the trigger factor to judge whether the controller loads are balanced. In the switch migration design module, EASM follows the migration efficiency model to build the migration mapping from three elements (emigration controller, migrating switch, and immigration controller) and implements efficient switch migration. We present the details of load balancing judgement and switch migration design in Sections 4 and 5, respectively.

Fig. 2 The overall design of EASM

4 Load balancing judgement

In this section, we first compute the controller’s load by combining different network overheads. Then, we introduce an effective mechanism to judge controller load balancing and design a load imbalance detection algorithm.

4.1 Controller’s load

In SDN, the controller load mainly comes from three parts: data interaction overhead for traffic transmission, routing path installation for new flows, and state synchronization overhead for maintaining the global view among controllers.

Data interaction overhead

To achieve centralized control, domain controllers send information to and receive information from switches, including the flow table and traffic data of each switch. We formulate the data interaction overhead of controller cm, Fdata(cm), as follows:

$$ {F}_{data}\left({c}_m\right)=\nu \cdot \sum \limits_{s_i\in E\left({c}_m\right)}{h}_{im}\cdot J\left({s}_i,{c}_m\right) $$
(2)

where ν is the average rate of polling one switch, which depends on the number of links; him is the hop between si and cm; J(si, cm) is the connection relationship of si and cm.

Routing formulation overhead

When a switch receives a new flow, it sends a Packet-in request to its controller to ask for the flow’s routing path. Upon receiving the request, the controller calculates a path for the flow and establishes it by installing flow entries in the switches on the path. Figure 3 shows the routing formulation of a controller. In the figure, a new flow destined for s5 arrives at s2. s2 sends a request to its master controller c1, and c1 must process the Packet-in message and establish the path.

Fig. 3 Routing formulation in the distributed network

Therefore, for cm, the routing formulation overhead Frouting(cm) contains two parts: Packet-in processing fpacket(cm) and flow table distribution ftable(cm),

$$ {f}_{packet}\left({c}_m\right)={P}_{packet}\cdot \sum \limits_{s_i\in S}\sum \limits_{c_m\in C}{h}_{im}\cdot J\left({s}_i,{c}_m\right) $$
(3)
$$ {f}_{table}\left({c}_m\right)=\sum \limits_{s_i\in S}\sum \limits_{c_n,{c}_m\in C}{\alpha}_{s_i}\cdot {h}_{mn}\cdot {h}_{im}\cdot J\left({c}_m,{c}_n\right) $$
(4)
$$ {F}_{routing}\left({c}_m\right)={f}_{packet}\left({c}_m\right)+{f}_{table}\left({c}_m\right) $$
(5)

where Ppacket is the average size of Packet-in sent by switch; \( {\alpha}_{s_i} \) is the average flow rate of switch si; him is the hop between si and cm; J(cm, cn) is the connection relationship of cm and cn.

State synchronization overhead

In the multi-controller SDN network, synchronization messages are sent between controllers to maintain the global network view, producing state synchronization overhead Fstate(cm),

$$ {F}_{state}\left({c}_m\right)={\zeta}_{sync}\cdot \sum \limits_{c_m,{c}_n\in C}J\left({c}_m,{c}_n\right)\cdot {h}_{mn} $$
(6)

where ζsync is the size of a synchronization packet; J(cm, cn) is the connection relationship of cm and cn; hmn is the hop count between cm and cn.

The controller load is therefore a linear combination of the three overheads in the network [12]. The computation of the load L(cm) is shown in Eq. (7).

$$ L\left({c}_m\right)={\sigma}_1\cdot {F}_{data}\left({c}_m\right)+{\sigma}_2\cdot {F}_{routing}\left({c}_m\right)+{\sigma}_3\cdot {F}_{state}\left({c}_m\right) $$
(7)
$$ \sum \limits_{i=1}^3{\sigma}_i=1 $$
(8)

where σ1, σ2 and σ3 are the weights of the data interaction, routing formulation and state synchronization overheads, respectively.
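
As a minimal sketch of Eqs. (2), (7) and (8), the following Python code aggregates the three overheads into one controller load. The function names, the hop counts, the routing and synchronization overhead values, and the weights (0.5, 0.3, 0.2) are hypothetical values chosen for illustration; only ν = 15KB/s is taken from the parameter settings in Section 6.1.

```python
import math


def data_interaction_overhead(nu, hops_to_controller):
    """Eq. (2): polling rate nu times the summed hop counts h_im of the
    switches connected to c_m (those with J(s_i, c_m) = 1)."""
    return nu * sum(hops_to_controller)


def controller_load(f_data, f_routing, f_state, weights):
    """Eq. (7): weighted sum of the three overheads.

    Eq. (8) requires the three weights to sum to 1.
    """
    s1, s2, s3 = weights
    if not math.isclose(s1 + s2 + s3, 1.0):
        raise ValueError("weights must sum to 1 (Eq. 8)")
    return s1 * f_data + s2 * f_routing + s3 * f_state


# Hypothetical inputs: three connected switches at 1, 2 and 3 hops.
f_data = data_interaction_overhead(nu=15.0, hops_to_controller=[1, 2, 3])
load = controller_load(f_data, f_routing=50.0, f_state=10.0,
                       weights=(0.5, 0.3, 0.2))
print(f_data, load)
```

Rejecting weights that violate Eq. (8) keeps loads computed for different controllers comparable.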

4.2 Effective mechanism

We design a simple but effective mechanism to determine whether the controller loads are balanced in the network.

Firstly, we generate the load difference matrix:

$$ {D}_{M\times M}=\left[\begin{array}{cccc}d\left({c}_1,{c}_1\right)& d\left({c}_1,{c}_2\right)& \dots & d\left({c}_1,{c}_M\right)\\ {}d\left({c}_2,{c}_1\right)& d\left({c}_2,{c}_2\right)& \dots & d\left({c}_2,{c}_M\right)\\ {}\vdots & \vdots & \ddots & \vdots \\ {}d\left({c}_M,{c}_1\right)& d\left({c}_M,{c}_2\right)& \dots & d\left({c}_M,{c}_M\right)\end{array}\right] $$
(9)

where \( d\left({c}_m,{c}_n\right)=\frac{L\left({c}_m\right)}{L\left({c}_n\right)} \), which measures the load difference between controllers cm and cn as the ratio of their loads.

For a given load difference matrix, the balancing judgement is shown in Eq. (10),

$$ \exists {c}_m,{c}_n\in C,\kern0.3em {\delta}_{mn}=\left|d\left({c}_m,{c}_n\right)-d\left({c}_n,{c}_m\right)\right|>\Lambda $$
(10)

where δmn is the trigger factor. If δmn is larger than the threshold Λ, there is controller load imbalance in the network, and we need to carry out switch migration at the moment.

Equation (11) shows the computation of the threshold,

$$ \Lambda =\frac{\mathit{\max}{D}_{M\times M}-\mathit{\min}{D}_{M\times M}}{\mathit{\max}{D}_{M\times M}} $$
(11)

where \( \max {D}_{M\times M} \) and \( \min {D}_{M\times M} \) represent the maximum and the minimum load difference in the matrix, respectively.

Example

Based on the load imbalance scenario in Fig. 1a, we use an example to illustrate the validity of our proposed balancing judgement mechanism.

In Fig. 1a, LLI(c1) = 90KB/s, LLI(c2) = 150KB/s, and LLI(c3) = 70KB/s. Thus, we can get the load difference matrix:

$$ {D}_{3\times 3}=\left[\begin{array}{ccc}d\left({c}_1,{c}_1\right)& d\left({c}_1,{c}_2\right)& d\left({c}_1,{c}_3\right)\\ {}d\left({c}_2,{c}_1\right)& d\left({c}_2,{c}_2\right)& d\left({c}_2,{c}_3\right)\\ {}d\left({c}_3,{c}_1\right)& d\left({c}_3,{c}_2\right)& d\left({c}_3,{c}_3\right)\end{array}\right]=\left[\begin{array}{ccc}1.0& 0.6& 1.3\\ {}1.7& 1.0& 2.1\\ {}0.9& 0.5& 1.0\end{array}\right] $$

We get δ21 = 1.1, δ23 = 1.6, δ13 = 0.4 and Λ = 0.7. Because both δ21 and δ23 are larger than Λ, we conclude that there is load imbalance. Applying the same method to the scenarios in Fig. 1b and c gives the same result. We can see that our balancing judgement mechanism is valid.

4.2.1 Load imbalance detection algorithm

We design a load imbalance detection algorithm (EASM-1) for this mechanism, described as follows. In the SDN, controllers exchange information with switches and compute the aggregated load L(cm) and the load difference d(cm, cn) (Line 1). We then obtain the load difference matrix DM × M (Line 2). Next, we compute the trigger factor δmn for each pair of controllers in DM × M and compare it with the threshold Λ (Line 5). If δmn exceeds the threshold, we conclude that there is load imbalance in the network (Line 6). All trigger factors that exceed the threshold are added to a new set TF (Line 7). The pseudo-code of the algorithm is shown in Table 4.

Table 4 Load imbalance detection

In EASM-1, Line 1 obtains the aggregated loads and load differences, with time complexity O(M); Line 2 constructs the load difference matrix, with time complexity \( O(M^2) \); Line 5 computes the trigger factor, with time complexity O(N); Lines 6 to 11 generate TF, with time complexity O(M(M − 1)). Thus, the overall time complexity of EASM-1 is \( O(M^2) \).
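
Because the pseudo-code in Table 4 is not reproduced here, the following Python sketch shows one way the detection steps described above could be written. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, loads are assumed to be strictly positive, and each unordered controller pair is examined once. It works with unrounded ratios, so the printed values need not match the rounded figures quoted in the example of Section 4.2.

```python
from itertools import combinations


def load_difference_matrix(loads):
    """d(c_m, c_n) = L(c_m) / L(c_n) for every ordered controller pair."""
    return {m: {n: loads[m] / loads[n] for n in loads} for m in loads}


def detect_imbalance(loads):
    """Return (threshold, tf).

    threshold follows Eq. (11); tf maps each controller pair whose trigger
    factor (Eq. 10) exceeds the threshold to that trigger factor.
    """
    d = load_difference_matrix(loads)
    values = [d[m][n] for m in d for n in d[m]]
    threshold = (max(values) - min(values)) / max(values)
    tf = {}
    for m, n in combinations(loads, 2):
        delta = abs(d[m][n] - d[n][m])
        if delta > threshold:
            tf[(m, n)] = delta
    return threshold, tf


# Controller loads of Fig. 1a in KB/s.
loads = {"c1": 90.0, "c2": 150.0, "c3": 70.0}
threshold, tf = detect_imbalance(loads)
print(round(threshold, 3), {pair: round(v, 3) for pair, v in tf.items()})
```

For the loads of Fig. 1a, the pairs involving c2 exceed the threshold while the pair (c1, c3) does not, which agrees with the conclusion of the example.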

5 Switch migration design

In this section, we will determine the migrating objects for switch migration, including emigration controller, migrating switch and immigration controller. Then, we implement dynamic migration decision according to the presupposed migration triplet.

5.1 Migrating objects determination

With the help of load difference matrix and trigger factor, we have detected the load imbalance in the network. Next, we will determine migration objects, including emigration controller, migrating switch and immigration controller.

(1) Emigration controller

In order to balance controller loads quickly, we set the overloaded controller as the emigration controller. Therefore, according to the characteristic of the constructed load difference matrix, we compute the trigger factors between all controllers. Meanwhile, we get controller cm and cn from δmn if δmn > Λ. In particular, we assume L(cm) > L(cn), and then set cm as the emigration controller.

Through traversing the entire load difference matrix, we can get several emigration controllers, and all of them will be stored in emigration controller set CEM.

(2) Migrating switch and immigration controller

By analyzing and comparing the cases in the motivation (Section 2), we find that the selection of migrating switches and immigration controllers has a large influence on the load balancing rate and migration costs. Therefore, we introduce the migration efficiency model, which characterizes the load balancing rate and the migration cost simultaneously, to optimally select the migrating switches and the immigration controllers.

Firstly, we give the definitions and computations of the migration cost and load balancing rate.

Definition 1

Migration cost. When switch si is migrated from controller cm to cn, it will generate migration cost \( {MC}_{c_m,{c}_n}^{s_i} \), including migrating request (the front part of Eq. 12) and load change (the latter part of Eq. 12). Next, we will analyze the migration cost in detail.

Migrating a switch from one controller to another consumes network resources, and we define this consumption as the migration cost. The migration cost of a switch consists of two parts: (i) migrating request cost and (ii) load change cost. They are detailed below:

(i) Migrating request cost. During a migration, the switch first sends a communication packet, similar to a Packet-in packet, to the immigration controller to request migration. The cost of this procedure is the migrating request cost. Concretely, when switch si is migrated from controller cm to controller cn, the migrating request cost is computed as PPacket ⋅ ∑ J(si, cn), where PPacket is the average size of a Packet-in sent by the switch, and J(si, cn) is a binary variable that describes the connection relationship between switch si and controller cn. The connection relationship first appears in Section 3.1. We calculate the value of J(si, cm) based on the physical connection between si and cm: J(si, cm) = 1 means si connects with cm; otherwise J(si, cm) = 0.

(ii) Load change cost. If the immigration controller accepts the migrated switch, the switch’s traffic is handled by that controller. This changes the loads of the controllers, and the cost of this procedure is the load change cost. Here, we assume that the controller’s load depends only on the switch traffic and the length of the path from the switch to the controller. The load change cost is computed as \( {\alpha}_{s_i}\cdot \mid {h}_{in}-{h}_{im}\mid \), where \( {\alpha}_{s_i} \) is the average flow rate of switch si; hin is the length of the path between si and cn; him is the length of the path between si and cm.

Based on the above analysis, we formulate the migration cost in Eq. (12).

$$ {MC}_{c_m,{c}_n}^{s_i}={P}_{Packet}\cdot \sum J\left({s}_i,{c}_n\right)+{\alpha}_{s_i}\cdot \left|{h}_{in}-{h}_{im}\right| $$
(12)
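
Eq. (12) can be sketched as a small Python function. The function name and the numeric inputs in the usage lines are hypothetical; the summed connection indicator ∑ J(si, cn) is passed in as a single value, which is an interpretation of the summation in Eq. (12) rather than something the text specifies.

```python
def migration_cost(p_packet, j_sum, alpha_i, h_in, h_im):
    """Eq. (12): migrating request cost plus load change cost.

    p_packet : average Packet-in size P_Packet
    j_sum    : summed connection indicator, sum of J(s_i, c_n)
    alpha_i  : average flow rate of the migrating switch s_i
    h_in     : path length between s_i and the immigration controller c_n
    h_im     : path length between s_i and the emigration controller c_m
    """
    request_cost = p_packet * j_sum
    load_change_cost = alpha_i * abs(h_in - h_im)
    return request_cost + load_change_cost


# Hypothetical values: one connection to c_n, flow rate 40, hops 3 vs. 1.
mc = migration_cost(p_packet=30, j_sum=1, alpha_i=40, h_in=3, h_im=1)
print(mc)
```

With these inputs the request cost is 30 and the load change cost is 40 × 2 = 80. A switch whose path length to the new controller equals its path length to the old one contributes only the request cost.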

Definition 2

Load balancing rate. This paper computes the controller load variance as load balancing rate, and \( \overline{L} \) is the average load of controllers. Before migrating the switch, we get load balancing rate:

$$ \eta =\frac{1}{M}\cdot \sum \limits_{m=1}^M{\left(L\left({c}_m\right)-\overline{L}\right)}^2 $$
(13)

After the switch has migrated, η, \( {\overline{L}}^{\ast } \), L(cm) and L(cn) are updated, and the results are shown in Eq. (14) to Eq. (16).

$$ {\eta}^{\ast }=\frac{1}{M}\cdot \sum \limits_{m=1,m\ne n}^M\left[{\left({L}^{\ast}\left({c}_m\right)-{\overline{L}}^{\ast}\right)}^2+{\left({L}^{\ast}\left({c}_n\right)-{\overline{L}}^{\ast}\right)}^2\right] $$
(14)
$$ {L}^{\ast}\left({c}_m\right)=L\left({c}_m\right)-{\alpha}_{s_i}\cdot {h}_{im}\kern0.1em $$
(15)
$$ {L}^{\ast}\left({c}_n\right)=L\left({c}_n\right)+{\alpha}_{s_i}\cdot {h}_{in} $$
(16)
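
The effect of one migration on the load variance can be sketched in Python as follows. The function names are hypothetical, η* is computed here simply as the variance of the updated loads (one reading of Eq. (14)), and the hop counts are set to 1 as an assumption so that the load change equals the flow rate, as in the example of Section 2.

```python
def load_variance(loads):
    """Eq. (13): variance of the controller loads around their mean."""
    mean = sum(loads.values()) / len(loads)
    return sum((load - mean) ** 2 for load in loads.values()) / len(loads)


def loads_after_migration(loads, src, dst, alpha_i, h_im, h_in):
    """Eqs. (15)-(16): only the emigration controller `src` and the
    immigration controller `dst` change their loads."""
    updated = dict(loads)
    updated[src] = loads[src] - alpha_i * h_im
    updated[dst] = loads[dst] + alpha_i * h_in
    return updated


loads = {"c1": 90.0, "c2": 150.0, "c3": 70.0}        # Fig. 1a, KB/s
eta = load_variance(loads)
new_loads = loads_after_migration(loads, "c2", "c3",
                                  alpha_i=40.0, h_im=1, h_in=1)
eta_star = load_variance(new_loads)
print(round(eta, 2), new_loads, round(eta_star, 2))
```

In this setting, migrating the 40KB/s switch from c2 to c3 reduces the variance by more than a factor of ten.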

Definition 3

Migration efficiency. We define the ratio of the change in load balancing rate to the migration cost as the migration efficiency \( {\tau}_{s_i{c}_n} \). The higher \( {\tau}_{s_i{c}_n} \), the better the controller performance after migration.

$$ {\tau}_{s_i{c}_n}=\frac{\left|{\eta}^{\ast }-\eta \right|}{MC_{c_m,{c}_n}^{s_i}} $$
(17)
$$ \forall {s}_i\in S,{c}_j\in C,\kern0.3em J\left({s}_i,{c}_j\right)=\left\{0,1\right\} $$
(18)
$$ \forall {s}_i\in S,\sum \limits_{c_m\in C}J\left({s}_i,{c}_m\right)=1\kern0.1em $$
(19)
$$ \exists {c}_m\in C,\kern0.3em L\left({c}_m\right)\le {\Omega}_m $$
(20)

Equation (18) restricts each connection variable to a binary value. Equation (19) requires that each switch connects to exactly one controller. Equation (20) states that not all controllers can be overloaded at the same time.
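
A minimal Python sketch of Eq. (17) and of the constraints in Eqs. (18) to (20) is given below. The function names, the nested-dictionary layout of J, and all numeric values are assumptions made for illustration.

```python
def migration_efficiency(eta, eta_star, mc):
    """Eq. (17): change in load variance per unit of migration cost."""
    return abs(eta_star - eta) / mc


def check_constraints(j, loads, capacity):
    """Check Eqs. (18)-(20) for a switch-to-controller mapping.

    j        : {switch: {controller: 0 or 1}}
    loads    : {controller: current load L(c_m)}
    capacity : {controller: load limit Omega_m}
    """
    binary = all(v in (0, 1) for row in j.values() for v in row.values())
    one_controller = all(sum(row.values()) == 1 for row in j.values())
    some_spare = any(loads[c] <= capacity[c] for c in loads)
    return binary and one_controller and some_spare


# Hypothetical mapping: s6 and s7 are both managed by c2.
j = {"s6": {"c2": 1, "c3": 0}, "s7": {"c2": 1, "c3": 0}}
loads = {"c2": 150.0, "c3": 70.0}
capacity = {"c2": 120.0, "c3": 120.0}

ok = check_constraints(j, loads, capacity)
tau = migration_efficiency(eta=100.0, eta_star=40.0, mc=120.0)
print(ok, tau)
```

Here c2 exceeds its assumed capacity, but Eq. (20) still holds because c3 does not.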

Based on the migration efficiency model, we design the selection of migrating switches and immigration controllers.

  • Migrating switch selecting

The migrating switch si is selected from the switch set Γ(cm) managed by the emigration controller cm, and the selection considers the following conditions. First, cm prefers to migrate a switch with high migration efficiency to relieve its load. Moreover, from the perspective of delay, cm preferentially gives up a switch that is far from it. Therefore, we select the migrating switch based on a selection probability, as shown in Eq. (21) and Eq. (22).

$$ {s}_i=\underset{\Gamma \left({c}_m\right)}{\arg}\max {\rho}_{s_i} $$
(21)
$$ {\rho}_{s_i}={\tau}_{s_i{c}_n}\cdot \frac{\left|\overline{L}-\left(L\left({c}_m\right)\cdot \left|\Gamma \left({c}_m\right)\right|-{\alpha}_{s_i}\right)\right|\cdot {e}^{\left(\max {h}_{im}\right)}}{e^{\sum \limits_{s_i\in \Gamma \left({c}_m\right)}\left(\max {h}_{im}\right)}} $$
(22)
  • Immigration controller selecting

Before the migrating switch si is moved to one of its slave controllers, EASM first checks whether the migration would cause a new overloaded controller. If so, this controller is abandoned as a candidate. Therefore, under the guidance of the migration efficiency model, cn is selected as the immigration controller according to Eq. (23) and Eq. (24).

$$ {c}_n=\arg \max \left\{{\Phi}_{\mathrm{n}}\right\} $$
(23)
$$ {\Phi}_n=\gamma \cdot \left[{\Omega}_n-L\left({c}_n\right)-{\alpha}_{s_i}\right]+\left(1-\gamma \right)\cdot {\tau}_{s_i{c}_n} $$
(24)

where Φn represents the weighted sum of remaining processing capacity and the migration efficiency, and γ is the corresponding weight.

Based on the above computation, we can get several immigration controllers, and they are saved in immigration controller set CIM.
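
The scoring in Eqs. (23) and (24) can be sketched in Python as an exhaustive search over candidate controllers. This is a simplified illustration: the function name, the candidate values and γ = 0.5 are hypothetical, a candidate is treated as overloaded when its load plus the switch's flow rate exceeds its capacity (an assumption), and the simulated annealing step used in EASM-2 is not modelled.

```python
def select_immigration_controller(candidates, alpha_i, gamma=0.5):
    """Pick the candidate that maximizes Phi_n (Eqs. 23-24).

    candidates : {name: (capacity Omega_n, load L(c_n), efficiency tau)}
    alpha_i    : average flow rate of the migrating switch
    gamma      : weight between residual capacity and migration efficiency
    """
    best, best_score = None, float("-inf")
    for name, (capacity, load, tau) in candidates.items():
        residual = capacity - load - alpha_i
        if residual < 0:
            # Accepting the switch would overload this controller: skip it.
            continue
        score = gamma * residual + (1 - gamma) * tau
        if score > best_score:
            best, best_score = name, score
    return best, best_score


# Hypothetical candidates for a switch with flow rate 40.
candidates = {
    "c1": (200.0, 90.0, 2.0),
    "c3": (200.0, 70.0, 3.0),
    "c4": (100.0, 90.0, 9.0),   # highest efficiency, but would overload
}
chosen, score = select_immigration_controller(candidates, alpha_i=40.0)
print(chosen, score)
```

The example shows why the overload check comes first: c4 has the highest efficiency but cannot absorb the switch, so it is never scored.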

5.1.1 Optimal object selection algorithm

Based on the known load conditions, EASM-2 selects the optimal migration objects. First, for any δmn in TF, if L(cm) > L(cn), we set cm as the emigration controller and add it to the set CEM (Line 3 to Line 5). Then, we compute the migration cost and the load balancing rate to obtain the migration efficiency (Line 8). The migrating switch is selected according to the maximum selection probability (Line 10). The selection of the immigration controller is optimized by the simulated annealing (SA) method. The initial temperature decreases to a moderate stage until the system reaches a balance point where no more changes are required (Line 16 to Line 18). In the next stage, the search starts from a lower temperature and allows the model to move toward a better solution (Line 19 to Line 22). The selected immigration controller is added to the set CIM (Line 25). All migration objects are thus determined. The pseudo-code of the algorithm is shown in Table 5.

Table 5 Optimal object selection

In EASM-2, Lines 1 to 8 select the emigration controller, with time complexity O(2(M − 1)). Line 10 computes the migration efficiency, with time complexity O(M ⋅ N). The time complexity of selecting the migrating switch is O(M). After Line 11, the SA method is applied to select the immigration controller, with time complexity O(M ⋅ k), where k depends on the initial temperature and the cooling rate. Therefore, the overall time complexity of EASM-2 is O(M ⋅ N).
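
Since Table 5 is not reproduced here, the following Python sketch shows a generic simulated annealing loop over a discrete set of candidate controllers. It is not the authors' two-stage procedure: the function name, the initial temperature, the cooling rate, the stopping temperature, the fixed random seed and the example scores are all assumptions made for illustration.

```python
import math
import random


def anneal_select(candidates, score, t0=10.0, cooling=0.9, t_min=1e-3, seed=0):
    """Return the best-scoring candidate visited by a simple SA search.

    candidates : list of candidate controller names
    score      : function mapping a candidate to the value to maximize
    """
    rng = random.Random(seed)
    current = rng.choice(candidates)
    best = current
    t = t0
    while t > t_min:
        proposal = rng.choice(candidates)
        delta = score(proposal) - score(current)
        # Always accept improvements; accept a worse candidate with
        # probability exp(delta / t), which shrinks as t cools down.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current = proposal
            if score(current) > score(best):
                best = current
        t *= cooling
    return best


# Hypothetical scores for two candidate controllers.
phi = {"c1": 36.0, "c3": 46.5}
chosen = anneal_select(list(phi), phi.get)
print(chosen)
```

With only a handful of controllers an exhaustive argmax is cheaper; an annealing search becomes relevant when the candidate space, for example joint switch and controller choices, is too large to enumerate.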

5.2 Migration decision formulation

By determining the migration objects, we have acquired the elements required for switch migration. However, the migration objects are not in one-to-one correspondence: one emigration controller may migrate switches to multiple immigration controllers. Therefore, to ensure well-organized and efficient switch migration, we introduce a triplet to represent the precise mapping between migration objects.

Definition 4

Migration Triplet. For any switch migration, the migration triplet is defined as [cm, si, cn], where the three elements form the determined migration mapping. cm is the emigration controller, selected from CEM; si is the migrating switch, selected from Γ(cm); cn is the immigration controller, selected from CIM. Once [cm, si, cn] is constructed, cm must migrate si to cn.

In practice, multiple switches may need to be migrated in the network, so all triplets form a set Tr, which must be updated in real time after every switch migration. In this way, migration disorder is prevented effectively. When all migrating switches have been migrated to their target controllers, we check again whether the controller loads are balanced. According to the result, we either exit EASM or return to module 2 (switch migration design) until the load balancing requirement is met (∀cm, cn ∈ C, δmn < Λ).

As a dynamic balancing method, switch migration consumes control resources. In particular, when the load balancing condition improves, we must reduce the occurrence of migration to avoid unnecessary consumption of controller resources. Therefore, to achieve better migration effects, we adjust the threshold Λ according to Eq. (11) after each network update. The larger Λ is, the lower the migration frequency.

In particular, if there are multiple migrating objects with the same migration efficiency and multiple immigration controllers, we migrate those switches to their closest controllers, that is, each switch is migrated to the controller with the minimum path length among all candidates. If the path lengths are also the same, EASM randomly selects a switch to migrate.

5.2.1 Dynamic migration decision algorithm

In EASM-3, we first select elements from the migration object sets to construct the triplet set Tr (Line 2), which contains a series of migration mappings. For each [cm, si, cn], we migrate si from cm to cn (Line 4 to Line 5) and then shrink Tr (Line 8). Once Tr is empty, we update the network state and the threshold Λ to complete the dynamic switch migration (Line 10). The pseudo-code of the algorithm is shown in Table 6.

Table 6 Dynamic migration decision

EASM-3 mainly executes switch migration and updates controller states, and its time complexity depends on the number of triplets. The overall time complexity of EASM-3 is O(M + N).
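
The triplet-driven migration loop can be sketched in Python as follows. Table 6 is not reproduced here, so this is an illustration under stated assumptions: the function name and the dictionary that maps each switch to its master controller are hypothetical, and the actual role change on the OpenFlow channel is not modelled.

```python
def apply_triplets(mapping, triplets):
    """Apply migration triplets [c_m, s_i, c_n] one by one.

    mapping  : {switch: current master controller}
    triplets : list of (emigration controller, switch, immigration controller)
    Returns a new mapping; the pending list shrinks until it is empty.
    """
    result = dict(mapping)
    pending = list(triplets)
    while pending:
        src, switch, dst = pending.pop(0)
        if result.get(switch) != src:
            raise ValueError(f"{switch} is not currently managed by {src}")
        result[switch] = dst
    return result


# Hypothetical state: s6 and s7 are managed by c2; only s6 is migrated.
mapping = {"s6": "c2", "s7": "c2", "s8": "c3"}
new_mapping = apply_triplets(mapping, [("c2", "s6", "c3")])
print(new_mapping)
```

Checking that the switch is still managed by the stated emigration controller before each move is one way to guard against the migration disorder mentioned above.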

6 Evaluation

6.1 Simulation setting

In this section, we evaluate the performance of EASM in the experimental environment shown in Fig. 4, which is described as follows.

Fig. 4 The experimental topology

(1) Experimental platform

We select OpenDaylight [13] as the experimental controller and use Mininet [14], developed at Stanford University, as the test platform. OpenDaylight is written in Java and supports multiple versions of the OpenFlow protocol. The physical devices are five servers with the same configuration (Intel Core i7, 3.5 GHz, 4 GB RAM). The operating system is Ubuntu 16.04 and the development kit is Java 8. EASM is implemented in the application layer of the OpenDaylight controller. To avoid performance interference between OpenDaylight and Mininet on a single server, we run OpenDaylight with EASM on four servers (No. 1 to 4) and install Mininet on one server (No. 5). All servers are connected by an H3C S1016 switch.

(2) Topology selecting

We select well-known network topologies to make the experiments more persuasive. First, we demonstrate the validity of EASM on Internet2 OS3E [15] with 34 nodes and 42 links. Then, we select several topologies from Topology Zoo [16] to evaluate the load balancing performance and the topological adaptability.

(3) Parameters setting

We use Iperf [17] to generate TCP flows to simulate the distribution of network traffic. The average flow request rate is 200KB/s. The controller capacity is limited to 5MB, and ν = 15KB/s, Ppacket = 30Byte, ζsync = 18Byte. Because the link bandwidth is finite, we set the number of switches managed by one controller to range from 5 to 20 [18].

(4) Simulation comparison

To verify the performance of EASM, we compare it with the other three strategies.

  • No Switch Migration (NSM): the connections between switches and controllers are static.

  • Closest Switch Migration (CSM): the overloaded controller randomly migrates the switches into the closest underloaded controller to solve the load imbalance [11].

  • Maximum Utilization Switch Migration (MUSM): like the typical switch migration scheme, it migrates switch into the controller that has maximum residual capacity [19].

  • Efficiency-aware Switch Migration (EASM): the switch migration is implemented according to migration efficiency to improve load balancing rate and reduce the migration cost.

The evaluation metrics include controller response time, controller throughput, migration cost and migration time, and load balancing rate.

6.2 Result analysis

6.2.1 Controller response time

Controller response time is one of the evaluation metrics. When load imbalance occurs, the controller response time increases significantly. In the experiment, we vary the flow request counts to overload some controllers and observe the change in controller response time. The flow request counts of OS3E are shown in Fig. 5, and each simulation lasts 12 hours.

Fig. 5 Flow request counts in OS3E

The average controller response time of the four strategies is shown in Fig. 6. NSM shows the most drastic fluctuation as the flow request counts change. CSM and MUSM have a smaller fluctuation range, and EASM has the smallest fluctuation. The reasons are as follows. Because NSM does not perform switch migration during load imbalance, it has the largest difference in response time between normal and overloaded controllers. Although CSM and MUSM adopt switch migration to balance controller loads and lower the response time, nearest migration in CSM easily causes new load imbalance after migration, and MUSM lacks global planning. EASM analyzes the composition of controller loads in detail and constructs the load difference matrix to avoid the local optimum problem, which ensures efficient migration and reduces controller loads quickly. Compared with the other strategies, EASM reduces the average controller response time by at least 21.9%.

Fig. 6. Average controller response time

The cumulative distribution function (CDF) of controller response time is shown in Fig. 7. Because of its migration model, EASM is less sensitive to changes in flow request counts than the other strategies.

Fig. 7. CDFs for different strategies
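For readers who want to reproduce a CDF plot of this kind from raw measurements, the empirical CDF at a threshold t is simply the fraction of samples that do not exceed t. The sketch below assumes the response times are available as a plain list; the sample values and the helper name `cdf_at` are made up for illustration and are not taken from the experiments.

```python
from bisect import bisect_right


def cdf_at(sorted_samples, t):
    """Empirical CDF: fraction of samples that are <= t.

    `sorted_samples` must be sorted in ascending order.
    """
    return bisect_right(sorted_samples, t) / len(sorted_samples)


# Made-up controller response times in milliseconds.
response_ms = sorted([12.0, 15.0, 15.0, 20.0, 48.0])

for t in (15.0, 20.0, 50.0):
    print(t, cdf_at(response_ms, t))
```

A strategy whose curve rises to 1 at a smaller response time keeps more requests below that latency, which is how the curves of the different strategies can be compared.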

6.2.2 Controller throughput

Based on the flow request counts in Fig. 5, we use the average controller throughput to reflect the load condition: the higher the throughput, the better the controller performance. The experimental result is shown in Fig. 8.

Fig. 8. Controller throughput

Because of its static switch-controller connections, NSM has the lowest throughput, below 2000 packets/s. The other three strategies migrate switches when a controller is overloaded, so their average controller throughput is clearly higher. CSM and MUSM have similar throughputs, close to 3000 packets/s. Unlike their single-criterion migration decisions (the shortest distance in CSM and the largest capacity in MUSM), EASM uses the migration efficiency model to improve the load balancing rate while reducing the migration cost. The design of the triplet also enables concurrent and coordinated migrations. As a result, the average controller throughput of EASM reaches about 3660 packets/s, an increase of 30.4% on average over the other migration strategies.

6.2.3 Migration cost and migration time

In this experiment, we exclude NSM because it does not perform switch migration, and record the migration cost and migration time of CSM, MUSM and EASM only. As shown in Fig. 9, MUSM has the highest migration cost, while CSM and EASM have similar costs. In terms of migration time, CSM is the longest, MUSM is second and EASM is the shortest.

Fig. 9. Migration cost and migration time

Several reasons explain this result. First, CSM has a lower migration cost because it migrates to the closest controller and requires fewer interactions between the migration objects. However, because it concentrates on migration distance and ignores controller capacity, the immigration controller can easily become a new overloaded controller. CSM must then migrate switches again, which makes its migration time the longest. Second, MUSM selects the controller with the maximum residual processing capacity as the immigration controller, but does not consider the additional network cost. Third, EASM optimizes the migration objects based on migration efficiency: it uses the selection probability to determine the migrating switches and chooses the immigration controllers by simulated annealing. Both operations reduce migration costs effectively. The sequential migration process of EASM also lowers the migration time.
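To make the simulated annealing step more concrete, the sketch below assigns a set of migrating switches to immigration controllers with a generic annealing loop. It is only an illustration under stated assumptions: the objective (variance of controller loads plus total hop cost) is a stand-in for the migration efficiency model, and the function names, parameters (`steps`, `t0`, `alpha`, `seed`) and input values are hypothetical.

```python
import math
import random


def assignment_cost(assign, switch_loads, ctrl_loads, hop_cost):
    """Stand-in objective: load variance across controllers plus total hop cost."""
    loads = list(ctrl_loads)
    for s, c in enumerate(assign):
        loads[c] += switch_loads[s]
    mean = sum(loads) / len(loads)
    variance = sum((x - mean) ** 2 for x in loads) / len(loads)
    return variance + sum(hop_cost[s][c] for s, c in enumerate(assign))


def anneal(switch_loads, ctrl_loads, hop_cost,
           steps=2000, t0=1.0, alpha=0.995, seed=0):
    """Search for a low-cost switch-to-controller assignment."""
    rng = random.Random(seed)
    n_ctrl = len(ctrl_loads)
    current = [rng.randrange(n_ctrl) for _ in switch_loads]
    current_cost = assignment_cost(current, switch_loads, ctrl_loads, hop_cost)
    best, best_cost = list(current), current_cost
    temp = t0
    for _ in range(steps):
        # Neighbour move: reassign one randomly chosen switch.
        cand = list(current)
        cand[rng.randrange(len(cand))] = rng.randrange(n_ctrl)
        cand_cost = assignment_cost(cand, switch_loads, ctrl_loads, hop_cost)
        delta = cand_cost - current_cost
        # Accept improvements always; accept worse moves with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= alpha  # geometric cooling schedule
    return best, best_cost


# Made-up, normalized inputs: two migrating switches, three candidate controllers.
switch_loads = [0.2, 0.3]
ctrl_loads = [0.1, 0.5, 0.4]
hop_cost = [[0.1, 0.3, 0.2],
            [0.2, 0.1, 0.3]]

best, best_cost = anneal(switch_loads, ctrl_loads, hop_cost)
print(best, round(best_cost, 4))
```

Accepting some worse moves at high temperature is what lets the search leave a locally good but globally poor assignment, which a purely greedy rule such as nearest-controller selection cannot do.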

6.2.4 Load balancing rate

First, to verify the comprehensiveness of the EASM strategy, we select the Internet2 OS3E as the experimental topology and construct a scenario with four migrating switches that have the same migration efficiency and four immigration controllers. Figure 10 shows the load balancing rate of EASM. EASM clearly maintains a high and stable average load balancing rate with little fluctuation. This is because the design of EASM considers different scenarios and is generally applicable to the network. In particular, when switches have the same migration efficiency, EASM still migrates efficiently by selecting according to the minimum path length among controllers. Therefore, EASM has good general applicability to SDN networks.

Fig. 10. Load balancing rate under multiple migrating objects with the same migration efficiency and multiple immigration controllers

Further, we validate the load balancing performance of EASM on other topologies selected from Topology Zoo, with gradually increasing network scale. The results are shown in Fig. 11, where the load balancing rate is normalized to make the comparison clearer. We observe that the load balancing rate of EASM is higher than that of the other three strategies and hardly changes as the topology grows. This is because the migration efficiency and the triplet in EASM enable efficient migration planning and fast switch migration. Therefore, EASM is better able to maintain a high load balancing rate and can adapt to different network topologies.

Fig. 11. Load balancing rate in different topologies

7 Related work

There is a large body of related work on controller load balancing. Here we review only the most closely related mainstream solutions.

Controller deployment scheme

The original SDN architecture relies on a single centralized controller, which suffers from low processing performance and poor scalability, so researchers have proposed multi-controller deployments such as HyperFlow [4], Onix [5] and Kandoo [6]. To balance the loads of distributed controllers, the authors of [20] first consider controller deployment and optimize the locations of controllers based on the average delay and the maximum delay. This method also introduces deployment instances to analyze the distribution of loads. In [12], the authors design a Pareto-based Optimal COntroller (POCO) placement, which makes a trade-off among controller performance, failure tolerance and load balancing. In [21], the authors propose dynamic controller planning with load regulation; this architecture can adaptively adjust the number of active controllers and minimize the flow setup time. However, it must collect traffic information periodically and redistribute load across the entire control plane. In [22], the authors consider the usage of control resources and design JumpFlow to reduce flow table usage and the ratio of average control messages. In [23], the authors propose an efficient online algorithm for dynamic SDN controller assignment that mainly considers response time and maintenance cost. The problem is solved by a hierarchical two-phase algorithm that integrates key concepts from both matching theory and coalitional games.

Switch migration scheme

Under the OpenFlow 1.3 protocol [10], a switch can be connected to multiple controllers at the same time: multiple controllers in the equal state, multiple controllers in the slave state, and at most one controller in the master state. Each controller may communicate its role to the switch via a role request message, and the switch must remember the role of each controller connection. The subdomain controller holds the master role for the switches in its subdomain, but those switches can set the controllers of other subdomains as slaves. Therefore, several studies address controller load balancing from the perspective of switch migration. In [11], the authors design the ElastiCon architecture with double threshold values; ElastiCon migrates a switch to the closest neighbor controller. In [24], the authors propose switch migration based on controller clustering and divide the whole network into multiple domains. Controller load is dynamically allocated among the clusters by switch migration, and the method also supports failover and controller backup. In [19], switch migration is formulated as a maximum resource utilization problem, and a distributed hopping algorithm based on the Log-Sum-Exp function is designed to approximate the optimal objective; it runs on each controller independently. In [25], the authors introduce load variance-based synchronization (LVS) to improve the load balancing performance in multi-controller, multi-domain networks. LVS conducts state synchronization among controllers if and only if the load of a specific server or subdomain exceeds a certain threshold. In [26], by constructing game-playing fields, the authors design a decision-making mechanism based on zero-sum game theory to re-elect a new controller as the master of the switches. In [27], the authors propose BalCon (Balanced Controller), an algorithmic solution designed to reduce the load imbalance among SDN controllers through proper SDN switch migrations. However, BalCon is only suitable for small-scale networks.
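The role mechanism described at the start of this subsection is what makes switch migration possible, and it can be sketched as a small role table kept by the switch. This is a simplified model rather than a protocol implementation: the class and method names are hypothetical, and details of real role request messages such as the generation identifier are omitted.

```python
from enum import Enum


class Role(Enum):
    EQUAL = "equal"
    MASTER = "master"
    SLAVE = "slave"


class SwitchRoleTable:
    """Per-switch record of the role held by each controller connection."""

    def __init__(self):
        self.roles = {}  # controller id -> Role

    def connect(self, controller_id):
        # A newly connected controller starts in the equal role.
        self.roles[controller_id] = Role.EQUAL

    def role_request(self, controller_id, role):
        if role is Role.MASTER:
            # Keep at most one master: demote the current master to slave.
            for other, held in self.roles.items():
                if held is Role.MASTER and other != controller_id:
                    self.roles[other] = Role.SLAVE
        self.roles[controller_id] = role


table = SwitchRoleTable()
table.connect("c1")
table.connect("c2")
table.role_request("c1", Role.MASTER)  # c1 is the subdomain controller
table.role_request("c2", Role.SLAVE)   # c2 belongs to a neighbouring subdomain

# Migration: c2 takes over control of the switch from the overloaded c1.
table.role_request("c2", Role.MASTER)
print(table.roles)
```

Seen this way, migrating a switch amounts to a change of master role on a connection that already exists, so the switch never loses contact with the control plane during the handover.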

8 Conclusion

In this paper, we make the first attempt to optimize the switch migration process by introducing migration efficiency, and propose the Efficiency-Aware Switch Migration (EASM) strategy for balancing multi-controller loads. The essence of EASM is to migrate switches with consideration of both the load balancing rate and the migration cost, so as to improve migration efficiency and balance controller loads. Simulation results show that EASM simultaneously achieves low controller response time, high controller throughput, low migration cost and a better load balancing rate. In the future, we will improve EASM in the following aspects: (1) deploying EASM in a large-scale test bed, (2) studying controller reliability, and (3) considering the security of switch migration.