1 Introduction

Mobile ad-hoc networks (MANETs) (Torkestani and Meybodi 2011; Shadi 2015) for internet of things (IoTs) is the group of communication devices that are able to exchange information with each other even if the infrastructure is not fixed. These devices have the ability to forward the data on behalf of one another and can move in any direction. These are low cost and flexible devices with ease of plug and play benefits. As we know from the IP subnet concept in the network (Xie et al. 2013), the devices in MANETs for IoTs can be divided into clusters that are related to each other. The information about network devices and its associated links is maintained within each cluster. In this way, the cluster is assumed to be a single node (logical) when the whole network is managed. The network layer only needs to maintain information about these devices (clusters). in this way, the clustering mechanism brings the less control overhead messages. The selection of an optimal cluster head is an NP-hard problem (4).

One of the emerging issue in cluster based MANETs for IoTs is the design and implementation of a clustering scheme that is able to communicate data towards destination with less overhead (Qayyum et al. 2015). There are two concerns that make the clustering problem in MANETs for IoTs a hot area of research. Firstly, the management of MANETs for IoTs can be carried out efficiently with clustering mechanism. The number of connected devices in MANETs for IoTs may be more than hundreds or even thousands. With large number of communication devices, the number of control packets may very high when the flat network architecture (Yang and Sun 2016; Singh et al. 2014; Zhao et al. 2013a, b) is designed. The scalability issue may arise with flat network infrastructure when the size of the network becomes large. The scalability issue in MANETs for IoTs is more challenging as compared to wired networks due to the mobility of network devices. Hence, it is very important to manage the network in an efficient manner. The most useful method to manage the MANETs for IoTs is the clustering (Sumalatha et al. 2015). Secondly, the clustering mechanism provides as base for the solution of different issue that may arise in MANETs for IoTs e.g. routing, intrusion detection system, topology control etc. handling these issues require a well-structured clustering scheme.

The ultimate goal of clustering scheme in MANETs for IoTs is the discovery of a set of connected devices that are able to cover all the static and non-static nodes deployed in the network. A device can be part of only one cluster at any specific time. In cluster based MANETs for IoTs, all the clusters may not have a cluster head (CH). The CHs have the advantage of easy network management, most of the clustering schemes discussed in the literature have a strong focus on the election of CHs. The formation and maintenance of clusters should be in such a way that both the power consumption and bandwidth usage should decrease. The decrease in cost should be in both cluster formation and maintenance. If the cost is not minimized, the clustering structure may become more expensive than conventional schemes. Optimization techniques (Xibin et al. 2013) such as genetic algorithm, ant colony optimization, particle swarm optimization etc can be applied in order to form balanced and stable clusters with low cost.

The important application requirement like the stability and load balancing of clustering algorithm is considered in this paper. Balance clusters are achieved when the number of nodes a cluster head serve should approximately be same. The size of the backbone network is minimized to achieve stable clusters. The equality in terms of workload for all the cluster heads can be assured. In this way, each cluster will uniformly consume its battery power and thus the network lifetime is increased as a result of the balance cluster structure. The authors of (Yang and Sun 2016) mentioned that, discovering an optimum cluster headset with more than one parameter are NP-hard. The existing search methods for example hill climbing, often fails to optimize the non-linear multi model functions. So, a random search mechanism might be essential. One most popular random search optimization method is a genetic algorithm (GA) that works on the principles of evaluation, i.e. survival of the fittest. GA finds optimal solutions, but it suffers from local optimum problem. In general, a hybrid algorithm based on honey bees, GA and Tabu search is used to find a near to optimal solution regarding a fitness function for NP-hard problems.

However, as we study the stable, balanced clustering in a frequently changing network setting, the CP turns out to be a dynamic optimization problem (DOP). In current years, the significance of proposing artificial intelligence algorithms for DOPs in real world applications has attracted a large number of researchers (Yang and Sun 2016). Whenever a topology change occurs in MANET, the simple way of addressing the DOPs is to reset clustering from the start at the cost of clustering overhead.

In this study, we introduce a hybrid dynamic optimization algorithm known as genetic Tabu Bee clustering algorithm (GTBC) to address dynamic clustering problems in MANET. This algorithm is based on honey bee algorithm. The benefits of genetic algorithm (GA) and Tabu search (TS) are also utilized. In this scheme, the bee algorithm accomplishes a kind of neighbourhood search joint with a random search.

As stated earlier, a main weakness of existing GA based clustering schemes are getting stuck in a local optimum. We employed different characteristics of three meta heuristic algorithms and propose a new hybrid algorithm able to improve the results of existing schemes.

The main contributions of this work are:

  1. (i)

    We formulate the dynamic clustering problem in a MANET to the dynamic optimization problem that results in an objective function based on node degree, residual energy; neighbour’s behaviour and relative mobility.

  2. (ii)

    Some useful parameters are identified that play an effective role in the cluster formation process. These are: node connectivity index, remaining energy, relative mobility (nodes velocity and direction relative to its neighbours), neighbour’s quality, communication workload, trust and reputation.

  3. (iii)

    The proposed algorithm forms stable and balanced clusters that decrease the topology changes, prolong the network lifetime and reduce clustering overhead.

  4. (iv)

    Features of the honey bee algorithm, genetic algorithm and tabu search are used to form good quality clusters that results in improved performance.

The remaining paper is organized in the following way. The existing research is discussed in Sect. 2. The cluster formation is presented in Sect. 3. The design of GBTC algorithm for IoTs is described in Sect. 4. Section 5 is all about experimental setup and simulation results. At last, Sect. 6 includes some useful discussion for future researchers and conclusion of the article.

2 Literature review

In this section, clustering schemes are classified into different categories such as Energy efficient clustering, cooperative and trust based clustering, mobility aware and stable clustering, swarm intelligence and PSO based clustering etc. The objectives and limitations of the existing work are summarized below.

2.1 Energy efficient clustering

The objective of this scheme is to reduce the energy consumption throughout the cluster formation process. It may be achieved by reducing the re-clustering or re-affiliation of nodes and the network lifetime should be increased. The network overhead should be decreased. In some proposals, the transmission range of mobile nodes is adjusted to reduce the energy depletion of nodes.

In a paper (30), the authors proposed a cluster based hierarchal routing protocol in large scale MANET. The cluster heads are selected on the basis of node degree and link expiration time. A proactive mechanism is used to control the cluster heads and reactive mechanism is used to control other nodes. The authors claim that a dominant set of nodes is selected as a cluster headset. The cluster headset is responsible for intra and inter cluster heads communication.

Design a hierarchal clustering protocol and corresponding protocol are proposed. It jointly utilizes the features of table driven and on-demand routing. The combined weight metric (node degree and link expiration) is used to select the cluster head. The authors claim that less number of clusters are formed in no time. The simulation is performed on the basis of cluster lifetime, end to end delay, delivery ratio and routing overhead.

Findings: the mobility of nodes is not considered during cluster formation and unstable clusters may be formed. The ignorance of node remaining energy may increase the frequency of topology changes. The control message overhead may be high as non-optimal nodes may be selected for the role of cluster heads.

The authors of papers (Mohammed Tarique and Tepe 2009; João Trindade and Vazão 2014) did not mention the clustering process. The authors of papers (Shadi 2015, 4; Basurra et al. 2014) use low mobility metric for cluster head selection and if the neighbour nodes are moving with high speed then the clustering will create more overhead than flat based routing approach. Similarly, relative mobility of neighbour nodes is not considered during cluster formation in the papers cited in this section. A node with high energy and low mobility will be selected as cluster head in paper (Basurra et al. 2014), but the node may have very few neighbours or even no neighbours and thus will create more overhead.

The authors (Shadi 2015, 4; Mohammed Tarique and Tepe 2009) did not mention the mobility model used in the clustering process. Most of the papers assume random mobility in this section and their protocols may degrade performance in the absence of random movements. The simulation parameters and performance evaluation used in papers under consideration has no common objective. The simulation tools used are different. The objectives of the proposed work did not reflect in the simulation study in many papers.

2.2 Mobility based clustering

In this scheme, the node mobility is the key parameter while forms cluster based MANET. The nodes move from one location to another location either with random mobility model or in a statistical fashion. The node’s future mobility pattern based on the heuristic may also have a strong impact on the clustering protocols. The mobility, speed and direction known as relative mobility should be considered during cluster formation. If not, the nodes with the same speed to its neighbours, but different mobility direction may be selected as cluster head and will result in inefficient clustering structure and re-clustering will create more overhead.

The mobility aware clustering algorithms discussed in this section and some others not mentioned here have serious limitations.

Findings: In some algorithms, the nodes with low mobility are considered good candidates for cluster heads like (Torkestani and Meybodi 2011). Some authors assume predefined cluster heads (Xie et al. 2013). In some algorithms (Torkestani and Meybodi 2011; Xie et al. 2013, 5; Cai et al. 2015; Neethu and Singh 2015), the number of neighbours are ignored during cluster formation. The nodes with maximum neighbours results in a more stable clustering structure. The remaining energy of nodes is very important to become a cluster head. Since the cluster heads manage the clusters and consume more energy than ordinary nodes. The authors of (Torkestani and Meybodi 2011; Xie et al. 2013; Robert and Chriqi 2012; Lyes Dekar and Kheddouci 2008) etc. did not consider the remaining energy of nodes during clustering process. The node’s neighbour also plays an important role in stable clustering. Good quality neighbours (with relative mobility) results stable clusters and require low energy and bandwidth resources. The authors in (Torkestani and Meybodi 2011; Xie et al. 2013) did not mention the neighbours method and its quality. Some authors assume predefined clusters. The protocol may stop functioning when the size of the network changes. Some authors like (Torkestani and Meybodi 2011) did not explain properly the clustering process.

2.3 Stable clustering

To reduce the ripple effect of re-clustering, the stable and long life clusters should be designed. The stable clusters may be designed in such a way that the nodes energy remaining, and mobility should be taken into account. When the cluster members leave and join the cluster quickly, the re-clustering or re-affiliation algorithm should be called repeatedly. The communication and computation overhead should increase. To address the issue of re-clustering, some proposals are discussed and criticized in this section.

Findings: The authors of papers (Zhao et al. 2013a, b) made an attempt to form stable clusters, but the power consumption and remaining battery do not guarantee stable clusters. The node’s remaining energy is very important to form stable clusters, but this important metric is ignored in (Seungjin and Yoo 2013; Rawashdeh and Mahmud 2012). The nodes with high energy are suitable for cluster head nodes to form stable clusters. Similarly the node’s mobility is ignored in paper (Guizani et al. 2015; Ahmad et al. 2012) but the nodes with high mobility or different mobility direction may be selected as cluster head that result unstable clusters. The number of clusters must be kept low for stable cluster architecture. The nature of routing is not explained in the paper (Guizani et al. 2015, 21). The nodes with more sustainable power are suitable to become cluster heads, but the remaining energy of nodes is ignored. With one hop clustering, the number of clusters is very large and small size clusters can be formed so to form stable and large size clusters, multi hop clustering should be used. The papers (Zhao et al. 2013a, b; Ahmad et al. 2012) assume one hop clustering mechanism.

2.4 Optimization and swarm intelligence based clustering

When the number of nodes in a MANET is large, the partition of the nodes into the different non overlapping cluster becomes an optimization problem. Once the parameters are optimized, it can be used for different scenarios. In this section, different optimization based approaches to ad-hoc clustering is discussed and analyzed.

Findings: The mobility is an important parameter in clustering algorithm and is ignored in (Singh et al. 2014; Xibin et al. 2013; Keerthipriya 2015). When the relative mobility of nodes is considered as in (Keerthipriya 2015), the stable clusters are formed and the riffle effect of re-clustering and node re-affiliation is reduced. To form long life clusters in MANET, the node connectivity with its neighbours should be high and is ignored in (Zahidi et al. 2013). The papers (Singh et al. 2014; Keerthipriya 2015; Zahidi et al. 2013; Bednarczyk et al. 2015) did not mention the mobility model used in the simulation study. The simulation tool used for validation is not mentioned in paper (Xibin et al. 2013; Keerthipriya 2015). Some papers like (Gurpreet et al. 2014) uses throughput, end to end delay to form stable clusters and the cluster life time is not measured. The performance results of the proposed method are not satisfactory. The simulation matrices used in the simulation are not explained as in (Singh et al. 2014; Zahidi et al. 2013).

2.5 Load balancing through evolutionary algorithms clustering

Load balancing in a MANET is an important research area. To balance the energy consumption in the network, the nodes in the network will serve for the whole life of the network. The nodes in the network guarantee the connectivity, as the energy of a single node may not drain quickly. This section introduces some recent research to balance the load in the MANET through clustering.

Findings: The nodes remaining energy plays an important role in optimized clustered architectures and is not deliberated in (Yang and Sun 2016). The mobility of nodes to balance the energy consumption in the network, the relative mobility of neighbour nodes must be taken into account. The papers (Yang and Sun 2016; Hui Cheng and Yang 2011; Pandi Selvam and Pandi; Selvam 2012) did not assume the node movements during individual encoding. The cluster headset represents an individual. There is no re-clustering mechanism in a paper (Yang and Sun 2016) and the cluster head role for the whole network lifetime may drain its energy quickly. The node’s mobility has a strong influence on network topology and is not considered in this paper. The re-clustering mechanism is initiated after some time in (Sujoy Sett and Parag Kumar Guha Thakurta, Effect of Optimal Cluster Head Placement in MANET through Multi Objective GA, IEEE International Conference on Advances in Computer Engineering and Applications (ICACEA) 2015) but the time based re-clustering and re-affiliation did not guarantee good performance in some scenarios.

2.6 QoS based clustering

Quality of service (QoS) based clustering can be defined as meeting some provisional conditions when clustering the network and transmitting data packets from source to destination. The requirements depend on the application, but some general requirements are jitter, delay and packet loss. The paper (Ahmad et al. 2012), proposed a cluster based solution to transmit high priority data on shortest paths without any delay to achieve QoS and normal data is aggregated to save energy and increase network lifetime. Query based top-k query routing mechanism is used to achieve QoS in paper (Yang and Sun 2016). The cluster based ring clustering algorithm is proposed to reduce the time and message complexities. Some authors focus on the achievable rate in the intra-clustering model. The mobility of nodes during the cluster formation process needs alerting consideration, but the protocols reviewed in this section did not consider node movements. The topology maintenance will be increased with unstable and low quality clustering structure.

Author of the paper (Ahmadi et al. 2015) proposed a route selection scheme for real time data transmission. The routes to destination are calculated using cellular automata and genetic algorithm is used to select the optimal path for real time data. The parameters, i.e. delay and energy are considered to select the QoS shortest paths. The simulation results are compared with AODVand some other schemes. The main objective of the proposed scheme is forward the real time packets on QoS shortest paths. The shortest paths are first calculated and the best path is selected via genetic algorithm.

Findings: the proposed algorithm is compared with AODV and many other state of the art routing scheme are proposed in recent years. The GA is used to select the QoS shortest path and running GA require a large number of computations to find a solution and may compromise the QoS requirements. The load on communication link is not considered during a route optimization and hence a shortest route with heavy load may be selected for real time traffic.

2.7 Selected clustering schemes

Finally, the clustering schemes considered for simulation and comparison in this paper are:

A study of stable data transmission using hierarchical share group in MANETs (Yang and Sun 2016)—the focus of this paper is the formation of hierarchal sharing group structure (HSG). The HSGC decrease the traffic load and the reformation overhead by continuous streaming service among the mobile nodes through a secure connection. The nodes are divided into clusters and the node in a cluster with highest degree or maximum number of neighbours is selected as sharing group leader.

Objectives (i) In this research, the hierarchical sharing group mechanism is used to minimize the traffic load of the network by stable streaming service among mobile nodes. (ii) To minimize the traffic on the network and share files quickly, sub sharing groups are formed with neighbouring mobile nodes. (iii) Relay of streaming data to nodes in the sharing group is the responsibility of sharing group header and has a module to manage the monitoring information of each node.

Findings. (i). The node with highest connectivity is elected as sharing group header (SGH) and its energy main drain quickly due to high load for the whole life of the network. (ii) A low energy node may be selected as a group leader and the network may stop functioning after a short time. (iii) A group leader may leave the group due to mobility and the authors proposed no such solution for this problem. (iv) Mobility of group members is not considered and mentioned in this paper.

In ANTALG (Singh et al. 2014), source nodes are selected for transmission randomly from a set of nodes to save the time and to form the pheromone Table. The highest pheromone value obtained from iteration is used. This value helps in achieving local maxima. The global pheromone value is updated by using the computation in the beginning.

Objectives (i) The shortest paths are identified by considering the quality of the links. (ii) The paths which are used for future packets and the routing policy changes in a stochastic manner due to the movement of ant agents and packets through the network. (iii) The reduction in route discovery overhead is resulted with random selection of the end nodes. (iv) The probability of most promising route selection is increased with global updating of the pheromone. (v) Fast packet delivery is ensured by minimization of the End-to-End delay.

Findings. (i). Dynamic Changes in topology are not addressed in this paper. (ii) It produces more overhead packets when the mobility is high. (iii) It fails to work when the network size grows significantly. (iv) The network lifetime decreases as local changes appear globally.

Clustering optimization (Zhao et al. 2013a, b)—in MANETs clustering, the communication work load of node to become the CH is assumed in this paper. The algorithm optimizes the node degree, the lifetime of the cluster, the energy consumption and the communication work load. Regardless of whether a node is cluster head or an ordinary node, it must participate in communication. In this paper, the node degree, energy and the remaining lifetime of a node and communication load are optimized to form balanced clusters.

Objectives (i) If a node is selected as a CH, it will make more communication in the cluster, but that is only an overhead on top of its own communication workload. If all other criteria are equal, we should choose a node with smaller communication workload to serve as cluster head. That will minimize the overall communication workload of the cluster head node. (ii) A new clustering scheme is proposed which strikes a balance between communication workload, node degree, power consumption, and the remaining lifetime of cluster heads. (iii) We present an algorithm based on simulated annealing, combined with the idea of dominating solution to handle multiple optimization objectives.

Findings (i) This paper did not mention when the re-clustering mechanism will be activated. (ii) It is assumed that every scenario has a GPS device which is not practical in most scenarios. (iii) A node p with high energy is picked which takes information about other nodes location violates the ad-hoc nature. (iv) It fails to perform well when the nodes in the network have the same energy.

3 The proposed multi objective clustering problem formulation

In multi objective optimization, the problem under consideration has multiple objectives that are maximized or minimized concurrently. The solution of each problem must satisfy a number of constraints. We have multi-dimensional search space in multi objective optimization. As we study the stable clustering in a constantly altering network setting, the CP turns out to be a dynamic optimization problem (DOP). So an objective/fitness function is required to check the validity of the solution for the CP. This section, map the clustering problem in MANET to dynamic optimization problem. We first demonstrate our dynamic network model and then formulate the clustering problem. Initially, we assume that MANET is an undirected and connected topology graph G (V, E), where V denotes the set of nodes and E denotes the set of communication links between nodes that are within the radio transmission range of each other.

The CP can be informally defined as follows. In the beginning, we are given a number of nodes interconnected with each other; our task is to elect a cluster headset from the given nodes. The size of the cluster headset should be kept as minimum as possible and the number of member nodes a cluster head serve should be approximately the same. The basic structure of MANET after cluster formation is depicted in Fig. 1. Some nodes go to sleep state and some sleeping nodes are scheduled to wake up stochastically or periodically to save the energy. Due to the node movement from one location to other or energy depletion, the topology changes dynamically. The focus of this work is to find a cluster headset quickly when a topology change occurs.

Fig. 1
figure 1

The basic structure of GBTC Network

To find the cluster headset, the CHs for the set are selected on the basis of its weight.

The following parameters are taken into consideration to calculate the fitness xi of a communication device i for cluster headset.

  1. (i)

    Node energy (Enode): The node with maximum energy is the most suitable candidate for the role of cluster head. The communication device with minimum energy has a low probability or no chance to become a cluster head.

  2. (ii)

    Node degree (Dnode): The communication device with maximum number of neighbours is the best candidate to become a cluster head. The communication device with low degree is the worst candidate for the role of cluster head.

  3. (iii)

    Node mobility (Mnode): The communication devices with relative mobility are more suitable candidates for the role of cluster heads and the communication device with different mobility is not suitable to play the role of a cluster head. The mobility of a communication device can be calculated by observing the speed and direction of the communication device.

    Node speed (Snode): The communication device with approximate similar speed to its neighbours is the best candidate for the cluster head while the communication devices with different speed are the worst candidates for the role of cluster heads.

    Node direction (Dinode): The communication device with same mobility direction to its neighbours is a suitable candidate while the communication device with different mobility direction is the worst candidate to become a cluster head.

Hence, \({M_{node}}={S_{node}}+Di_{node}.\)

  1. (iv)

    Node quality (Qnode): The neighbour with one hop distance is considered a good neighbour. The neighbour two hops away is considered satisfactory neighbour and the communication devices with more than two hop distance are not considered the neighbours.

The parameters discussed above can be used to calculate the fitness of a communication device to become CH. In this way, the weight xi of a communication device i can be calculated by:

$${x_i}=\left( {{E_{node}}+{Q_{node}}+{D_{node}}+{M_{node}}} \right).$$
(1)

Here, Enode is the remaining energy of a communication device, Dnode designate the number of neighbours of communication device i, Mnode is the mobility of a communication device and Qnode represent the quality of the neighbours. The ability of a communication device to become the CH is calculated (optimized) in such a way that the Euclidean distance of each CH to other CH should be approximately the same. Here, the problem of grouping n ad-hoc network communication devices into a k number of non overlapping clusters is measured and considered. Particularly the problem is stated by the Eq. 2 below as in (Ahmad et al. 2017):

$$Minimize~F\left( {W,C} \right)=~\mathop \sum \limits_{{i=1}}^{v} \mathop \sum \limits_{{j=1}}^{k} \begin{array}{*{20}{c}} {{w_{ij}}{{\left( {{x_i} - {c_j}} \right)}^2}} \\ {} \end{array},$$
(2)
$$Subject~to~~~\mathop \sum \limits_{{j=1}}^{k} \begin{array}{*{20}{c}} {{w_{ij}}=1 \quad i=1, \ldots, v} \\ \end{array},$$
$${w_{ij}}=0~\;or\;~1 \quad i=1,~ \ldots ,~v,~j=1, \ldots ,~k.$$

The total number of communication devices in MANET for IoTs is represented by n in the above equation. k is the total number of CHs that need to to selected from all communication devices (unknown or known), xi ∈ ln (i = 1,...,n) is the weight of device i in MANET for IoTs. The average fitness of cluster head is represented by cj ∈ ln (j = 1,...,k) and can be calculated as:

$$Cj=~\frac{1}{{{N_j}}}~~~\mathop \sum \limits_{{j=1}}^{k} {w_{ij}}{x_i}.$$
(3)

Here, xi represent the size of the jth cluster (food source), the relationship weight of a communication device i with cluster j is represented by wij, if the node i is assigned to a cluster j, the value of wij may be 0 or 1 depends on the assignment of a device to a cluster. If the device is part of the cluster j, the value of wij will be 1 and vice versa.

The onlooker bees observe the dance of the employed bees in order to analyse the amount and direction of nectar. in this way, the selection of new cluster head is carried out on the basis of its probability associated with quantity of nectar. The probability that onlooker bees visit the CH can be calculated as:

(4)

The quantity of nectar at any location i is represented by in the equation and fs represent the number of total cluster heads. The onlooker bees search for the quantity of nectar at the neighbourhood site or food source in the jurisdiction of network device i by the following formula.

$$F{S_{i~}}\left( {x+1} \right)=F{S_i}\left( x \right)+~{\alpha _{ij}} \times {\varvec{v}}{\varvec{a}}{\varvec{r}}.$$
(5)

In equation above, the patch size of the neighbourhood for the jth food location is \({\alpha _{ij}},\) the accepted value of \({\varvec{v}}{\varvec{a}}{\varvec{r}}\) is in the range (Torkestani and Meybodi 2011) and is a random uniform variable. The fitness value is then calculated after neighbourhood search. The value of new solution should not exceed the region edge. The assessment of new solutions is made on the basis of nectar quantity. In order to memorize the new food location and forget the old position, the employed bee check the nectar amount in the candidate location if the value of the fitness function allows. The quantification of both (old and new) candidate positions is carried out on the basis of nectar amount. High nectar value represent best site while low nectar value represent the worst site. The quantity of nectar in a specific site and its location provides a base for the probable elucidation of the clustering problem in MANET for IoTs. The quantity can be calculated by:

(6)

The function that needs to be minimized is represented by the equation:

$$c{f_i}=~\frac{1}{k}\mathop \sum \limits_{{j=1}}^{k} d\left( {x,~C{H_j}} \right)$$
(7)

Here, d represents the distance of a node (cluster head) j with all its one and two hop neighbours. The CHj is the jth cluster head in the cluster headset and x is an arbitrary node that represents the neighbours of the jth cluster head.

4 Proposed genetic bee tabu clustering (GBTC)

The GBTC performs its operations in the following manner. First, the weights of the nodes are calculated on the basis of different metrics using Eq. 1. Second, a set of nodes (scout bees) with higher weights is selected for the role of cluster heads. Third, the fitness of the cluster headset is evaluated using Eq. 2. Fourth, The steps (step 3 to step 8) in algorithm 1 are performed. Once the cluster headset is obtained, the information is broadcasted to all the nodes in the network in step five. The nodes affiliation as cluster head or a member or border node is identified in the next step. The change in topology is handled in the maintenance phase. The proposed method works in two phases.

4.1 Cluster setup phase

GBTC exploits the search ability of the HBA, tabu search and GA to solve the local optimization problem of the genetic algorithm. More precisely, the job is to find the suitable cluster heads set cj (j = 1,2,3,…,k) Such that minimize the fitness function (Eq. 2). Where, k represents the number of clusters in the network. The process of this algorithm is presented in the form of pseudo code in Algorithm 1. The working mechanism of the algorithm is further explained below.

A number of parameters are required for the proper functioning of the algorithm. These are: Number (n) of nodes or bees in the network, (k) number of clusters in the network, (c) scout bees or cluster headset, (m) number of bees calculated by m = n − c, (max) stopping criteria, (pm) the probability of mutation in GA and (s) the size of Tabu List.

The network is the group of n nodes (bees in this case). Initially, the population of c scout bees is employed in this algorithm. Each individual scout (bee) represents a possible cluster head. A set of cluster heads (scout bees) is selected on the basis of its weights at the start. The fitness of the scout bees is calculated based on the objective function. The cluster heads are then selected based on the optimal solution.

The relationship among each node and all cluster heads are identified to find the number of nodes a cluster head will serve (i.e. The CH closest to the node). The initial clusters can be formed in this way. In step 3 of algorithm 1, the fitness of a cluster headset can be calculated by the objective function (Eq. 2). The selection of a bee for the neighbourhood search is carried out in step 5.i.a.

figure b

The bees with higher weights are selected for the neighbourhood search in step 5.i.a. The weight value of each bee can be used to select the suitable cluster heads or scout bees. On the other hand, fitness values are used to find the possibility of the bee being selected. The differential employment in conjunction with scouting is a primary operation of the GBTC bee algorithm. The neighbourhood search can be carried out using Eq. 8 below.

$${{\text{c}}_{\text{j}}}+{\text{1}}=cj+{\text{ceil}}\left[ {{\text{rnd}} \times {{\text{n}}_{\text{s}}}} \right].$$
(8)

Here, cj is the current cluster head and cj + 1 is another site in cj neighbourhood and ns is the neighbourhood size. The weight of the employed bees will be evaluated in step 5.i.b. The bee with the highest weight value is selected to form part of the next bee population. A genetic algorithm is used to assign the remaining bees in step 5.ii.a.

Two bee sets are selected to generate the offspring’s using crossover operator. The mutation is applied on new offspring’s. The fitness of offspring’s is evaluated when it is not in Tasu list. Finally, the bee is added to tabu list for the next generation. The offspring will be inserted at the top of the list and if the list is full, items are pushed to vacate a new position. The mutation and crossover operations are applied with probability pm. Length of the tabu list will be constant.

When the offspring’s are not in the list they are added to tabu list, fitness is evaluated and offspring’s are assigned to the next generation. This process is repeated until all the bees are generated for the next generation.

The bee colony will be generated in two ways. One part of the colony is generated through the fitness evaluation of the scout bees. Tabu search and genetic algorithm generate the other part. In GBTC, the global and local search is performed simultaneously. The algorithm becomes more powerful by using crossover and mutation operators. The crossover has the capability to randomly exchange solutions in an organized way in the hop that good solutions generate better ones. The diversity in population is achieved through mutation. In order to prevent premature convergence, the GBTC uses crossover and mutation operators to increase the diversity of the population in each iteration.

4.1.1 Initialization

All nodes in the network have a unique id in the initial state and are known as ordinary nodes. Every node maintains information about their neighbours in the form of a table with two columns: neighbour id and its type. The cluster headset is calculated using algorithm 3.1 discussed above. The cluster headset information is then broadcast to all the nodes in the MANET. The node launches initialization process when it notices that its id is within the cluster headset.

  1. (1)

    When a node encounter it’s id in cluster headset, it broadcasts this information to its neighbours.

  2. (2)

    The nodes receiving a beacon message from the cluster head mark itself as its member.

  3. (3)

    The process is repeated till all nodes change its state from ordinary node to cluster head or member node.

  4. (4)

    A node in the communication range of two or more cluster heads mark itself as boarder node. The information is then transmitted to the potential cluster heads.

The objective of this work is to decrease the number of nodes involved in the routing backbone network and equal the number of nodes a cluster head manage.

4.2 Maintenance phase

As mentioned earlier, the topology changes in MANET are very frequent due to the mobility or failure of a node. The re-clustering will be required in these networks when the intra-cluster communication links break. The power of the proposed method arises from the fact that part of this algorithm runs independently on each node locally. In previous clustering algorithms, the normal operations must be stopped during re-clustering process.

In our algorithm, the cluster maintenance process is initiated locally on a single node or group of nodes when a topology change occurs whereas the others carry on their usual process. The optimal cluster heads are again generated (re-clustering) after some predefined time for the whole network without interrupting the normal operations. The nodes in a MANET are free to move and a node may leave one cluster and joins another. Due to highly dynamic topology, the cluster membership changes very frequently. Another benefit of this algorithm is that, each node quickly converges to the new cluster head before the optimal cluster headset is calculated in the next round. This is due to the fact that in the initial clustering, the optimal probability of the applicant CHs grows relative to their fitness value (Eq. 1), therefore in the absence of the ideal CH, algorithm quickly converges to the sub optimal CH which have a high fitness value (Eq. 1).

4.2.1 Joining the cluster

When a new node joins the network or an existing node receives no response from its cluster heads it send a JREQ message to its one hop neighbours. When a cluster head receives JREQ message, it replies by sending JREP back to the node. The reply message contains the information about its relative mobility and ID. The sender node (new nodes or node not affiliated to any CH) then joins the nearest cluster head and send the cluster head select message to the CH. If a node receives more than one JREP message from CHs, then the node chooses the CH with minimum relative mobility, and sends it cluster head select message.

4.2.2 Leaving the cluster

When a node wants to leave the cluster or network, two methods are used depends on either the node is a member or cluster head. The nodes first send LREQ message to the neighbours. The nodes receiving the message LREQ checks either the node are cluster head or member. If the leaving node is a cluster head, the neighbour nodes, then form a cluster and a node with high fitness (nectar) calculated using (Eq. 1) is selected as cluster head. Each node calculates its fitness individually as the fitness algorithm runs on each host locally. So, re-clustering can be applied in part of the network as partial clustering. The nodes in the cluster are re-affiliated to the new cluster head. When a cluster member wishes to leave the cluster, no action is required.

4.2.3 Inter-CH communication

The inter cluster head communication in the proposed scheme is reduced by minimizing the number of clusters in the whole network as shown in Sect. 5.1. The cluster heads are spread all over the network and special reuse of the frequency is possible. Non-overlapping clusters are formed and the cluster head nodes have multiple interfaces with different communication ranges like communication between cluster heads on long range and communicate with member nodes on a short range. The proposed scheme can also be applied when the nodes have single interface. In this case, the communication between cluster head that are not within the direct communication range can be carried out by relaying packets on boarder nodes.

5 Experiment evaluation

In this paper, the performance evaluation of HSGC (Yang and Sun 2016), OCMANET (Zhao et al. 2013a, b), ANTALG (Singh et al. 2014) and GBTC is carried out in EstiNet 9.0 (Shie-YuanWang et al. 2013). The number of communication devices deployed in the region of size 1000 m × 1000 m are between 50 and 100. The range of node speed is between 1 and 80 km/h. The IEEE standard for communication adopted in simulation is 802.11p (1609.x). The CSMA/CA (carrier sense multiple access with collision avoidance) mechanism is used for sharing the medium. Two ray ground propagation model is adopted during the experiment. The communication device’s channel capacity is set to 2 MB per second. The omnidirectional antenna is embedded in each communication device with a range between 100 m to 200 m. 512 bit packets are allowed to travel in the network. The communication devices have the potential to maintain information about their neighbours and have built in queue for store and forward purpose. The CBR (constant bit rate) source is used to generate the traffic and is set to 20 packets per second. The duration of a single test/experiment is 20 min. The simulation is repeated for 100 different runs and the results are averaged.

The performance of GBTC, HSGC (Yang and Sun 2016), OCMANET (Zhao et al. 2013a, b) and ANTALG (Singh et al. 2014) is evaluated for different metrics in this section.

5.1 Number of clusters

In this subsection, a number of simulation tests are carried out the number of clusters formed in different runs. The experiment is repeated for different size network ranges between 50 and 500. The size of the network is increased or decreased by 50 incremental/decremented steps. The mobility model adopted is random and hence, the communication devices may move in any direction. The network area is set to 1000 m × 1000 m. The radio range of each network device is fixed to 100 m. The maximum speed limit is set to 70 km/h.

In Fig. 2, the results are averaged on the basis of cluster head (no of clusters) against the size of the network (number of communication devices) and are presented in the form of a graph. As shown in the Fig. 2, the number of clusters surges when we increase the number of network devices. The effect of network size is observed in all algorithms as shown in the graph. By comparing the curves in the graph, we observe that the number of clusters formed with HSGC is lower as compared to OCMANET. HSGC has the worst performance in this regard. The main reason is that, these algorithms give less importance to this metric as discussed in Sect. 2. The best algorithm seen in the graph is our proposed GBTC for IoTs. In GBTC, the initial cluster headset is selected on the basis of node weight. The solution is further optimized by evaluating the distance between all cluster heads.

Fig. 2
figure 2

Average number of cluster vs MANET size (tx range 100 m)

The nearby cluster heads are discarded and smaller clusters are merged. Hence, the average number of clusters in GBTC is minimized. The results clearly demonstrate that GBTC scheme for IoTs perform well as compared to existing clustering algorithms under observation.

As shown in the graph, ANTALG perform better as compared to OCMANET and HSGC. The radio range is set to 200 m and a series of experiments are carried out in this setting. All the findings obtained during the experiment are depicted in Fig. 3. High transmission range significantly decreases the number of clusters as shown in the curves. The reason behind this significant change is that when the transmission range o a communication device become high, it will cover a larger area and hence a large number of communication devices lies within the range. Hence, the size of backbone network decreases because larger network is covered by a small set of CHs. Similar to the results shown in Fig. 2, the graph in Fig. 3 clearly indicate that the proposed GBTC algorithm for IoTs outperform ANTALG, HSGC and OCMANET.

Fig. 3
figure 3

Average number of cluster vs MANET size (tx range 200 m)

5.2 Cluster lifetime

In this subsection, the influence of network device’s speed on the lifetime of clusters is studied. The network devices with a high mobility ratio may decrease the lifetime of the cluster. The number of communication devices deployed in this experiment is 50. The radio range of communication devices is fixed to 100 m. The communication devices are allowed to move at the speed in the range between 1 and 80 km/h. The results obtained during individual experiments are averaged. The unit used to measure the lifetime of the cluster is seconds. In Fig. 4, the average life of the cluster is presented. The performance of the proposed GBTC clustering mechanism for IoTs is higher than others in terms of cluster stability as shown in Fig. 4. Some existing work on clustering problem assume constant mobility characteristics of communication devices while they vary with time. The behavior of communication devices during their movement cannot be predicted in the long run in these schemes. In our proposed GBTC for IoTs, the duration of cluster members association with its CH is long. The reason behind this advantage is the consideration of communication devices movement direction and speed formally known as relative mobility during the CH selection process. The communication devices which are relatively more stable are the most suitable candidates for the role of CHs. The arcs depicted in Fig. 4 shows that OCMANET form clusters for short duration. The performance of HSGC is better than OCMANET but is graded after ANTALG. The most unstable clusters are formed when we run OCMANET. This algorithm does not consider the relative mobility of communication devices when forming the clusters and hence, the stability metric is compromised.

Fig. 4
figure 4

Nodes speed vs cluster duration (trx range 100 m)

The radio transmission range of all communication devices is set to 200 m in the next experiment and the results are shown in Fig. 5. The other parameters are kept constant as in Fig. 3. The lifetime of the network for different transmission ranges is studied. As a result, we conclude that the increase in transmission range will increase the lifetime of the cluster for all the schemes under consideration. The cluster head neighbours stay within the neighbourhood for a long time. One simple way that prolongs the life of clusters in to increase the transmission range. Our proposed GBTC for IoTs results better than ANTALG, HSGC and OCMANET as shown in Fig. 5.

Fig. 5
figure 5

Nodes speed vs cluster duration (trx range 200 m)

5.3 Re-affiliation rate

Multiple simulation tests have been performed to evaluate the effect of communication devices speed and its radio range on re-affiliation rate. Re-affiliation rate is the study of communication device’s behavior in response to topology changes or in other words, it is the frequency that a communication device leaves a cluster and/or joins another cluster in a specific time interval. The results obtained during these experiments are presented in this subsection.

Communication devices may re-affiliate to a new cluster due to two reasons. Firstly, if the cluster head leaves its role as a CH due to any reason (may be due to energy depletion, mobility etc). secondly, the communication devices itself leaves its place and may move to another location. So the communication devices will not be associated to the cluster head any more. With high speed MANETs for IoTs, the re-affiliation rate will probably increase because the network devices may leave its CH in no time and may join another CH. The random waypoint mobility model is set to all the experiments in this subsection. The simulation area is set to 1000 m × 1000 m. the number of communication devices deployed in the network is 50. The transmission range of all the communication devices is set to 100 m. The communication devices are moving within the range 1–80 km/h.

The graph presented in Fig. 6 show the average re-affiliation rate of communication devices with diverse speed. Here, the speed of communication devices greatly affects the re-affiliation rate. All the algorithms under consideration have high re-affiliation rate when the mobility of communication devices increases. It is worthy to mention that GBTC for IoTs have the lowest re-affiliation rate amongst all others as shown in Fig. 6. The results discussed in the previous section also clearly indicate that our proposed GBTC algorithm for IoTs may perform well. The re-affiliation rate of GBTC for IoTs is lower because the relative mobility of communication devices and some other important metrics are considered during the cluster formation process. When the lifetime of the cluster is long, the re-affiliation rate may be low. The relative mobility of communication devices is taken into consideration in OCMANET and hence it performs well after GBTC for IoTs. The members of a cluster in OCMANET stay connected with its CH for a long time and its re-affiliation rate will be low. ANTALG perform worse among all existing algorithms under consideration with respect to re-affiliation ratio. In this algorithm, the mobility of communication devices was not considered during the cluster head selection process, and this results unstable clusters in the network.

Fig. 6
figure 6

Re-affiliation rate with different speed

The outcome presented in Fig. 7 show the results of experiments when the radio range of communication devices is changed in every test. Here, the communication devices are moving at a constant speed of 40 km/h. The radio transmission range of communication devices vary between 20 and 200 m. The transmission range is changed by 30 incremental steps in each test. The results shown in Fig. 7 demonstrate that the radio transmission range has a similar effect on all the schemes tested during simulation including the proposed GBTC for IoTs. When the transmission range of communication devices is high, it will cover a larger area and the probability that a device may leave the current cluster and may join another cluster will be low. The performance of our proposed GBTC for IoTs is satisfactory for all the values of transmission ranges as compared to other schemes. In GBTC for IoTs, the speed and direction of communication devices is taken into account during the cluster formation process and stable clusters are obtained as a result.

Fig. 7
figure 7

Re-affiliation rate vs radio transmission range

5.4 Control message overhead

The number of packets that are exchanged between communication devices within the network per unit time during cluster formation is termed as control message overhead. Each clustering scheme needs some messages that must be transmitted to the clusters in MANET for IoTs. The control overhead of some protocols under consideration and GBTC for IoTs is presented in this subsection. The protocols used for comparision are OCMANET, ANTALG, and HSGC. In this series of experiments, 50 communication devices are deployed in the network. The area for simulation is set to 1000 m × 1000 m. The RWP mobility model is adopted. The radio transmission range of communication devices is random between 10 and 80 km/h for each experiment. Later on, the transmission range is set to 200 m. Figures 8 and 9 show the average of results obtained during a series of simulation experiments. Among all others, OCMANET perform worse in terms of control message overhead. The graph clearly shows that the curve is on high scale in case of OCMANET. The effect of the increase in mobility speed can be observed in all the schemes selected for simulation. When the mobility of communication devices becomes high, the control overhead increases. The control overhead may be minimized when the transmission range of communication devices is set to high values. High transmission range of communication devices may consume more energy as compared to devices with low transmission modules. The lifetime of the clusters may be long with high transmission ranges, but the overall network lifetime may decrease. In the proposed GBTC for IoTs, the remaining power of communication devices is taken into consideration during the formation of clusters.

Fig. 8
figure 8

Node speed vs control overhead (trx range 100 m)

Fig. 9
figure 9

Node speed vs control overhead (trx range 200 m)

In GBTC, the suitability of a node is checked before its selection for the role of a cluster head and the cluster head play its role for a long time hence, the control message overhead is minimized. The factors that reduce the control overhead include long life cluster heads, mobility consideration, high energy nodes selection, etc.

As shown in Figs. 8 and 9, the proposed GBTC algorithm for IoTs has the lowest control overhead while the OCMANET has the highest overhead. In the following paragraph, the reasoning on why GBTC performs well is described. As discussed in earlier sections. GBTC algorithm for IoTs consider the relative mobility of communication devices during cluster formation. This may result stable clusters in the network. Once stable clusters are formed, the re-affiliation rate will ultimately become lower. Similarly, if there is low re-affiliation, then the control messages overhead will become lower because the re-clustering procedure will be called less frequently. In this way, the proposed GBTC algorithm for IoTs outperform all existing schemes under consideration.

5.5 Load balancing

In this section, the quantification for load distribution is calculated which shows that how much a cluster is balanced. It is very difficult to equally distribute perfectly the number of nodes in every cluster throughout the lifetime of the network, but ideally every cluster must have equal number of nodes all the time. This is due to the fact that changes in topology are very frequent in MANETs. To calculate the load balance factor, we need to calculate the cardinality of cluster heads.

As discussed in (Ahmad et al. 2017), the load balancing is defined as the converse of the variance of the cardinality of the clusters. Hence,

$${\text{Load balancing factor}}={\text{1}}/{\text{C}}{{\text{H}}_{\text{n}}}{\sum _{\text{I}}}{\left( {{{\text{x}}_{\text{i}}}-\upmu } \right)^{\text{2}}},$$

where CHn is the total number of CHs, the cardinality of cluster i is represented by xi, and the average number of CH neighbours is represented by µ = N − CHn / CH that should be the total number of nodes in the network. The distribution of load will be better if the equation value is high. The value of the equation will tend to infinity for ideal balanced networks.

The graph in Fig. 10 shows the Load balance factor of the HSGC, ANTALG, OCMANET and our proposed GTBC. The value of the equation is calculated for network size having 50 nodes and with varying transmission ranges. The graphs show that GTBC gives more balanced clusters than others under consideration.

Fig. 10
figure 10

Load balancing factor vs transmission range (10–100 m)

5.6 Computational overhead

In this set of simulation experiments, the average number of instructions executed per unit time is studied. The results are shown in the form of graph in Fig. 11. The computational overhead is tested fro different size networks. Some tests are carried out for different transmission ranges. The speed of communication devices is set to 50 km/h. once again, with high ratio transmission range values, neighbourhood configurations become long and hence it results in slower execution. A similar effect is seen when the number of communication devices is increased in the same proportion and the neighbourhood contains a large number of communication devices. The frequency of re-clustering procedure initiation may become high with an increase in the number of network devices. To conclude, the number of communication devices deployed in the network is a monotonically increasing function of the number of instructions executed per unit time.

Fig. 11
figure 11

Number of nodes vs number of instructions with a transmission range 100 m

6 Conclusion and future work

In this paper, a new hybrid algorithm based on Tabu list, honey bee, a genetic algorithm for clustering problem is proposed. We first formulate the clustering problem in a MANET to the dynamic optimization problem. Second, a number of matrices are identified that must be considered during the cluster head election process. Third, a multi objective algorithm for stable and balanced cluster formation is presented. The algorithm is compared with other existing techniques and the results show that the proposed algorithm outperforms in terms of cluster lifetime, re-affiliation rate and control overhead. The beauty of the proposed algorithm is that, it considers relative mobility, i.e. node speed and direction to form the stable clusters. For load balancing, the connectivity of mobile nodes is considered during cluster formation. To evenly distribute the energy consumption in the network, the remaining energy of nodes is considered during the cluster head selection process. Finally, to further enhance the quality of clusters, the behavior of neighbour nodes during cluster formation is considered.

Our proposed genetic bee Tabu clustering (GBTC) forms stable and balance clusters efficiently. The algorithm does not suffer from premature convergence. In order to prevent premature convergence, the GTBC uses crossover and mutation operators to increase the diversity of the population in each iteration. The crossover has the capability to randomly exchange the solutions in an organized way with the hope that good solutions generate better ones. The diversity in population is achieved through mutation.

The algorithm can be improved further by considering more matrices i.e. trust reputation, communication load, etc. during the cluster head selection process. More simulation can be conducted to evaluate the network lifetime, the number of alive nodes operating in the network at a specific time. The algorithm can be modified and will be applied to wireless sensor networks and vehicular ad hoc networks.