1 Introduction

Wireless sensor network (WSN) is the group of wireless nodes that are often randomly deployed in a targeted area over vigorously changing environments. These nodes can sense, process and forward data to neighboring nodes and base station (BS). Moreover, these small devices have limited capabilities such as small memory, low computation, low processing and most importantly small power unit. SNs are scattered over a large geographic area containing hundreds of nodes to monitor a target region. A tiny, low cost device having sensors on board, connected wirelessly with self-organizing capability, can be connected to the Internet for controlling and monitoring environments, homes, offices, cities etc. [1]. These SNs can be arranged anywhere on the ground, on the roads, underwater, on bodies, in the air, inside buildings and even in vehicles. Researchers have shown great interest in developing WSN communication capabilities and have used sensors in a variety of other technologies such as personal digital assistants (PDA), vehicular ad hoc networks (VANETs), mobile phones, and Internet of Things (IoTs) [2,3,4,5]. As the sensed information has to be transferred to BS for required action, therefore efficient routing mechanism is required for transmission of data to BS [1,2,3,4].

In WSN, to efficiently utilize the available resources especially battery, different hierarchical techniques have been proposed. The aim of such techniques is to obtain energy efficiency and maximize network lifespan. In hierarchical routing, clustering is the most widely used technique to achieve these goals. Clustering schemes by design eliminate the redundant messages in formation of efficient clusters and intelligent selection/reselection of the CH. In literature, researchers have proposed various clustering protocols, but issues such as optimizing energy efficiency and load balancing require further research. Moreover, topology construction is also vital to distributing nodes uniformly in the clusters or grids in case of grid-based approaches to make the network efficient. The periodic reformation of clusters and reselection of CH results in excessive energy consumption that could lead to poor network performance [6].

Routing in WSNs is more challenging than mobile ad hoc network or VANETs, as WSN has resource constraints [1]. Therefore, to meet the challenges in WSN, new routing mechanisms are being developed keeping in view the application requirements and the underlying network architecture. Due to frequent topological changes in the network, maintaining routes is a major issue and if not carefully handled may result in high energy consumption.

In order to achieve prolong system lifespan and minimum energy consumption, various routing techniques have been introduced in the literature. Furthermore, they can be broadly categorized into different classes: network structure, topology based, reliable routing scheme, and communication model scheme [1]. The network structure is further categorized into flat and hierarchical protocols. In flat networks, all SNs cooperate with each other through multihop routing in which each node has the same role. In hierarchical approaches, nodes are clustered into groups, and, by some criteria, a cluster head (CH) is selected that is responsible for routing. In hierarchical routing, usually two-layer approach is used, where one layer is used for sensing the physical environment and the other is used for routing. The low energy sensor nodes (SNs) are used for sensing while high energy SNs are often used for collecting, aggregating, and forwarding data [6]. Clustering approach is the most widely used technique for energy efficiency to achieve scalability and effective communication. Cluster-based hierarchical approaches have some advantages such as increased scalability; efficient data aggregation and efficient channel bandwidth utilization. The main problem of clustering is non uniform clustering which leads to high energy dissipation of SN, increase in total energy consumption, and poor network connectivity [6].

Metaheuristic approach is the preferable optimization scheme to enhance performance parameters in hierarchical clustering protocols. In the proposed work, a biologically inspired distributed clustering approach is presented. A novel protocol is proposed here with different objectives: select the best possible node to be the CH using bat flower pollinator (BFP) based on real time information, and enhance stability period and lifetime by reducing energy consumption in the form of reducing internal overhead and cost of processing energy. In addition, fuzzy logic system (FLS) has been a dominant topic in intelligent systems research. As an extension of the well-known ordinary fuzzy set (type-1 fuzzy sets), the concept of type-2 fuzzy sets (T2FS) is used in this research. Type-2 fuzzy logic (T2FL) model can handle the uncertainty environment more accurately than type-2 fuzzy logic model (T1FL) because the membership degrees of T2FL are themselves fuzzy sets. In this paper, we propose a novel clustering protocol using interval type-2 fuzzy sets (IT2FS) along with BFP for CH selection. To inherit the strength of these two methods, we combine IT2FLS and BFP into one methodology. In this protocol, the large computation such as for acquiring fuzzy rules for CH selection and hence clustering is undertaken by a sink node while the small one is computed by nodes in a distributed way.

The rest of the paper is organized as follows: in Sect. 2, the relevant works are presented. In Sect. 3, Bat Flower pollination algorithm is outlined. Sections 4 and 5 focused on the network model and the proposed methodology respectively in detail. The performance analysis of proposed algorithm is described in Sect. 6. Finally, in Sect. 7 the conclusion of our work is summarized.

2 Related Work

As pointed out earlier, the nodes are tiny and energy constrained. In this context, it is desirable to form a group of devices and these group of devices communicate among themselves through a group leader that ensures the minimum communication overhead and energy-efficient execution using clustering. In literature, various attempts have been made to achieve energy efficiency through different clustering techniques by addressing the problems of efficient cluster formation, even distribution of load, CH selection and reselection, and cluster reformation; few of them are discussed here.

Heinzelman et al. [6] presented LEACH that is the first energy-efficient routing scheme and is still used as a state-of-the art protocol in WSN. The basic idea of LEACH is to choose CH among a number of nodes by rotation so that energy dissipation from communication can be spread to all nodes in a network. LEACH has some disadvantages such as probabilistic approach using random number for CH selection, which might result in suboptimal CH node thus resulting in high energy consumption. Furthermore, the dynamic clustering overhead and non-uniform distribution of CH will consume more energy and lead to poor network performance. In LEACH, it is the possibility that a SN with less energy can be elected as CH; if that happened then it will become dead and consequently, that cluster becomes inaccessible. LEACH assumes even consumption of energy for every CH and does not guarantee proper CH distribution.

LEACH-centralized (LEACH-C) [7] is the modified LEACH in which clusters are created by BS. BS receives all the information regarding residual energy and location of each SN. By doing so, BS determines the number of CH and arranges network into various clusters. However, due to lack of coordination among SNs, the CH count varies in each round. CHs are selected from the set of nodes to ensure that they should have sufficient energy. The BS selects lowest energy routing paths and forwards the information of clustering and CH to all nodes using a minimum spanning tree approach. However, due to centralized approach communication overhead will increase in the reselection of CH, because reselection decision has to be made by BS. In addition, every cluster will send request; thus energy consumption will be high.

Manjeshwar and Agrawal [8] proposed an event-driven clustering approach called TEEN. In this protocol, the sensed information is communicated to BS only if some event occurs, which is based on soft and hard thresholds. The disadvantage of this approach is that the node responds only if the change in the attributes crosses these threshold values, which makes the approach less applicable in dynamic environment, as the selection of two threshold values is very sensitive and difficult for real applications. The user may keep on waiting to get response and does not get any information about the status of the node, which makes this approach not suitable for the applications, where the periodic updates are required. Later, this approach has been enhanced and proposed as adaptive TEEN (APTEEN) [9]. APTEEN combines event-driven approach of TEEN and periodic approach of LEACH to address the problems occurring in TEEN. APTEEN is good for periodic applications, but the complexity of the approach increases due to inclusion of extra threshold function and count time.

Stable Elections Protocol (SEP) is introduced to enhance stability and lifetime of heterogeneous WSN (HWSN) by Smaragdakis et al. [10]. Two level energy nodes are introduced in this protocol. Normally, the advanced node has higher energy level backup than normal node. In this scheme, CH election is based on available initial energy with SNs. The advanced node has more chance to get selected as CH than normal node. In the study, increasing percentage of advanced nodes and hence probability of CH selection improves performance in the form of network lifetime, which also improves throughput of the network.

Enhanced-SEP (E-SEP) [11] has introduced three-level communication hierarchy. E-SEP distributes SNs into three categories, where advance nodes have higher energy followed by intermediate and normal nodes respectively. By using an extra level of heterogeneity as compared to SEP [11], up to some extent energy dissipation is reduced. Multihop routing with stable election protocol (MR-SEP) [12] enhances SEP routing protocol by partitioning the network field into multiple layers of clusters. CHs are selected and member SNs join CHs in each layer according to their distance from CHs. For the transmission of data, CHs in each layer collect data from member nodes and collaborate with the CHs of adjacent layers. The CHs in upper layer perform as super CHs for the CHs of the lower layer. By adopting multiple layering approaches CHs are evenly distributed in the network field and multi-hop strategy has increased the stability period but did not get any convincing improvement in the overall network lifetime.

DEEC represents efficient clustering within multi-heterogeneous network environment, in which residual energy of each node is responsible for its probability index [13]. However, in complete distributed cluster-formation, nodes have fewer opportunities to share the global knowledge in the whole network. In DEEC, average numbers of CHs per round are much more as compared to other protocols, which brings additional direct transmissions towards BS which dissipate more energy resources. DEEC have the same problem as SEP protocol as to punish advanced node badly and need to kept global knowledge of network.

Kang et al. [14] proposed a protocol named LEACH with distance thresholds (LEACH-DT) for CHs selection. In LEACH-DT, CH selection probability depends upon its distance as a parameter from BS. BS determines the distance amongst all nodes and calculates the probability function. This information is broadcast by BS to all nodes. SNs make the decision on the basis of following information about CH selection without any centralized control. LEACH-DT also proposed multi-hop routing where SNs are divided into different groups based on their distances from BS. Using multi-hop transmission, energy consumption is reduced to some extent in which the data is communicated from distant groups to the closer ones.

Cluster chain weighted metrics (CCWM) [15] achieve energy efficiency and increase network performance based on weighted metrics. A set of CHs is selected depending on these metrics. Member nodes use direct communication for transferring data towards their respective CHs. A routing chain of elected CHs is constructed for inter clusters communication and each CH forwards data to its neighboring CH until it reaches BS. However, due to non-optimized CH election, the reselection of CH results in network overheads. Moreover, intra cluster communication is direct which leads to uneven energy consumption.

A variable size clustering scheme called Unequal Clustering Size (UCS) [16] assumes that the sensing field is circular and is divided into two layers. Clusters in layer one have the same shape and size while layer two clusters have different shape and size. To keep the energy consumption minimum, the CHs are positioned somewhere or near to the center of a cluster. Area covered by the clusters can be altered in each layer by changing radius of a layer near to BS and hence will change density of a particular cluster. This model works well in homogenous networks and provides balanced energy consumption through unequal clustering approach especially for network that deals with large amount of data.

Tarhani et al. [17] introduced SEECH protocol suitable for periodic data transmission applications. It makes use of distributed approach in which CHs and relay nodes are selected separately [17]. The reason for different CH and relay node selection is to mitigate the energy burden of CHs. SEECH protocol performed well for large scale WSNs in comparison to LEACH.

Mittal and Singh [18] presented a reactive cluster-based approach called DRESEP which is ideal for event based applications. Mittal et al. [19] proposed SEECP in that CHs are chosen in deterministic manner based on remaining energy of SNs to reduce the energy variance among sensors.

The objectives of hierarchal routing protocols are saving the dissipated energy, ensuring the network connectivity, and prolonging the lifetime of the WSNs. These objectives can be achieved via finding the optimal head nodes and hence forming optimum clusters in the WSNs. This is a difficult problem and can be considered as a Nondeterministic Polynomial (NP) optimization problem. To solve and find optimal solutions for this problem, researchers have developed the cluster based routing schemes with the optimization algorithms to achieve extended network lifespan [20,21,22,23,24,25,26,27,28,29,30,31]. ERP [22], EAERP [23], SAERP [24] and STERP using DE [25], HSA [26], SMO [27] and GA [28, 30] are recently developed optimization algorithms based clustering protocols. EAERP restructured substantial features of EAs that assures extended stability period and prolonged lifetime. ERP overcame the shortcomings of HCR algorithm [21] by improving the cluster quality of network. SAERP based routing schemes (DESTERP, HSSTERP, SMSTERP and GASTERP) are inspired from SAERP to achieve extended stability period [25,26,27,28, 30].

Kim et al. [32] proposed CHEF algorithm in that tentative CHs are elected randomly in each round. The elected CH then uses two fuzzy parameters which are local distance and energy level. Local distance is basically the sum of all distances from neighbouring nodes. By using fuzzy if–then rules, each CH determines its chance value and then advertises it. CH having greater chance value will be selected as CH and will advertise itself so that member nodes can join it. CHEF improves network lifetime, but due to periodic messages it adds network overhead and unnecessary traffic load. Furthermore, CH election process is expensive in terms of energy consumption as it is performed in the entire network that results in high energy consumption.

Sert et al. [33] proposed MOFCA protocol in which CHs are selected using fuzzy logic approach. It is designed primarily for two major factors, first it should be energy efficient and second it should be lightweight for real time implementation scenario. MOFCA selects the tentative CHs on the basis of node competitive radius. This protocol considers three parameters for the election of CHs: node’s distance from the BS, node’s remaining energy and node density. It uses fuzzy logic for calculation of above said parameters. Three fuzzy inputs are used for measurements of tentative CHs competitive radius. The main objective of MOFCA is to overcome the hotspot problem, which arises due to multi-hop communication.

In literature, there are lots of routing protocols existing but all are associated with one or few type of parameters. Researchers and scientists have been demanding a generic communication protocol which is energy efficient, prolonged network lifetime, scalable, stable and load balanced. The current research work is motivated by the above stated facts and demands. In addition, network formation and CH selection are two phases of network organization in cluster based WSNs. A number of cluster-based WSN protocols using nature-inspired optimization methods are proposed in the literature. Such cluster based routing schemes attempt to optimize either cluster formation or optimal CH selection in order to achieve the energy efficiency. There is a requirement to consider both the aspects of network organization, i.e. optimal cluster formation and optimal CH selection using an efficient optimization technique.

In present work, a newly proposed algorithm named bat flower pollinator (BFP) has been exploited using interval type-2 Fuzzy logic for solving the above said problems in clustering algorithm. BFP algorithm [34] has inherent properties of both bat algorithm (BA) [35] and flower pollination algorithm (FPA) [36]. This algorithm is designed to eradicate the inherent drawbacks of BA and FPA and has been effectively applied for antenna design applications [37]. The algorithm is good at both exploration and exploitation. In this research, a clustering algorithm attempts to optimize the cluster formation and CH selection simultaneously keeping into account energy efficiency to find nearly optimal solutions for the organization of the cluster based WSNs.

3 Bat Flower Pollinator (BFP)

The BFP is a recently introduced algorithm designed by hybridizing BA and FPA [34]. It is designed by integrating the concepts of BA and FPA and is used for optimizing the problem under test. Here the basic parameters of BA and FPA such as frequencies, location and velocity updation is performed by both flower pollinators and bats respectively to attain optimal solution. Both these algorithms work in tandem to generate two new solutions respectively and then the best among these solutions is chosen as the current best solution as shown in Fig. 1. The process of choosing new solution is a continuous process and is achieved using large number of iterations or generation with new solution being generated over continuous generations and the optimum one is retained at the end of generations. BFP algorithm is distributed into three major stages and is elaborated as;

Fig. 1
figure 1

Flow code of BFP algorithm [34]

First Stage BFP algorithm in its initial stage consists of a population of N bats and flower pollinators (where the total population is denoted by BF), initialized as

$$BF_{i,j} = BF_{\hbox{min} ,j} + a_{b} *\left( { BF_{\hbox{min} ,j} - BF_{\hbox{max} ,j} } \right)$$
(1)

where i ∈ {1,…BF}, j ∈ {1,…D}, BFi,j is the ith solution in the jth dimension; \(a_{b}\) is random number between [0,1]; BFmin,j, BFmax,j are lower bound and upper bound respectively. D is dimension of the problem under test. Note that a single population is initialized for both the algorithms that is half for BA and half for FPA. After the initialization of population, the fitness of best possible solution is taken into consideration as the initial best solution for the bats (Bi) and flower pollinators \(\left( {FP_{i} } \right)\).

Middle Stage In this stage, one solution for BA and one solution for FPA is generated using their respective solutions from the whole population. The new solution is case of BA is generated by using [38]:

$$f_{i} = f_{\hbox{min} } + \left( {f_{\hbox{max} } - f_{\hbox{min} } } \right)\beta$$
(2)
$$v_{i}^{t} = v_{i}^{t - 1} + \left( {x_{i}^{t - 1} - x^{*} } \right) f_{i}$$
(3)
$$x_{i}^{t} = x_{i}^{t - 1} + v_{i}^{t}$$
(4)

Here vi and xi are the velocity and position of the solutions respectively of the d-dimensional ith bat, \(\beta \in \left[ {0, 1} \right]\) is a uniformly distributed random number, x* is current best solution, fmax and fmin are the upper and lower limits of the problem. The local search phase is controlled by using local random walk and the new solution for this case is given by

$$x_{{\text{new}}} = x_{{\text{old}}} + \varepsilon A^{t}$$
(5)

where \(\varepsilon \in [ - 1,1]\) and At is the loudness and it increases with increase in the pulse rate. Generally Ao=0 and Amin= 0. When Amin=0, bats stop echolocation and find their prey. This can be generalized as:

$$A_{i}^{t + 1} = \alpha A_{i}^{t} , r_{i}^{t + 1} = r_{i}^{0} \left[ {1 - \exp ( - \gamma t)} \right]$$
(6)
$$A_{i}^{t} \to 0, \;r_{i}^{t} \to r_{i}^{0} , \quad {\text{as}}\;t \to \infty \;\forall 0 < \alpha ,\;\gamma < 1$$
(7)

where α and γ have constant values.

The reason for the use of BA in this case is that this algorithm is good in controlling the extent of exploration and exploitation that is the algorithm has a balanced global and local search phase. The major reason for this is the frequency based tuning using automatic switching. The major parameters that help in achieving this property are pulse rate and loudness. All these properties are combined to achieve better optimality and hence achieve a perfect combination of exploration and exploitation.

FPA also consists of a combination of local and global phase but here these phases are referred to as global and local pollination phase. Both these phases has the same significance and applicability of better exploration and exploitation respectively. The global pollination phase is intended for exploration and the general equation for this phase is given by:

$$o_{i}^{t + 1} = o_{i}^{t} + \alpha L(\lambda )\left( {R_{*} - x_{i}^{t} } \right)$$
(8)

where \(o_{j}^{t}\) is a random solution corresponding to tth iteration, \(R_{*}\) is the current best solution of the respective iteration, \(\alpha\) is the scaling factor and it controls the step size \(L(\lambda )\) by employing Lévy flight. This Lévy flight component because of its heavy tailed structure, helps the algorithm in improving the exploration capabilities and is generated by

$$L\sim\left\{ {\frac{\lambda \varGamma (\lambda )\sin (\pi \lambda /2)}{\pi } \frac{1}{{s^{1 + \lambda } }},\quad (s \gg s_{0} > 0)} \right.$$
(9)

where Γ(λ) is gamma function and its step size is given by s > 0 and s0\(\gg\) 0. The second phase in FPA us local pollination phase and the new solution for this phase is generated by

$$o_{i}^{t + 1} = o_{i}^{t} + \varepsilon (o_{j}^{t} - o_{k}^{t} )$$
(10)

where \(o_{j}^{t}\) and \(o_{k}^{t}\) are two random solutions from the population and ε in [0, 1]. Both the global and local pollination phases are controlled by using a third parameter namely switching probability. The global pollination phase is meant for improving the exploration properties where as local pollination phase is used for improving the exploitation properties. The switching probability helps in maintaining a balance between the two phases and an optimal value of this parameter helps in providing a perfect combination of exploration and exploitation to find the global optimal solution. In some cases, adaptive values of switching probabilities have also been used. But for present case, we will be using a probability value of 0.7. At the end of this phase, BA and FPA solutions are evaluated.

Final Stage The best fit solutions (\(B_{i}^{best}\)) and (\(FP_{i}^{best}\)) from both individual algorithms are compared and best among them is retained. This process is repeated for every iteration and the best solution after fulfilling the termination criteria is considered as the optimal solution. This helps both flower pollinators and bats to search for mutual solutions in collaboration with each other and this solution thus produced help the algorithm in getting out of local minima and achieve the global optimal result. Overall we can say that, when one algorithm get stuck, other algorithm move them toward their global optima. Thus combined properties of both BA and FPA helps the BFP algorithm in attaining the best optimal solution.

Also the algorithm overcome the inherent drawbacks of BA algorithm. As BA is found to converge prematurely and it has been found that it gets stuck while solving higher dimension problems [38]. So to overcome this problem, FPA has been used. Thus we can say that when BA is limited in scope, FPA helps in providing better and diverse solutions and hence helping BA to reach global solution. Similarly when FPA is limited and gets stuck, BA helps in achieving the global minima. Thus we can say that an overall balance is achieved by both flower pollinators and bats for solving the problem under test.

To solve the above said CH selection (a binary) problem for WSN, binary version of basic BFP is required. In this paper, when the position of BF is updated, the equation used to discrete the position is as follows:

$$Flag_{i} (j) = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {if\; \left( {BF_{i} (j) \ge 0.5} \right)} \hfill \\ {0,} \hfill & {otherwise} \hfill \\ \end{array} } \right.$$
(11)

where \(BF_{i} (j)\) indicates the jth position of ith BF.

4 Radio Energy Dissipation Model

Power consumption is a factor in designing WSN protocols as SNs are highly energy limited. SNs consume energy for data processing, wireless communication and/or sensing data. The network energy is consumed on both sides of the communication (sender and receiver) according to a wireless energy consumption model as shown in Fig. 2. The model consists of two parts reflecting transmission and reception as given in Eqs. 12 and 14 respectively. SN needs to consume energy \(E_{TX}\) to run the transmitter circuit and \(E_{amp}\) to activate the transmitter amplifier, whereas a receiver consumes \(E_{RX}\) power for running the receiver circuit. Energy consumption in wireless communication also depends on message length (\(l\)). Thus, the transmission cost for a \(l\)-bit message is calculated as:

$$E_{TX} = \left\{ {\begin{array}{*{20}l} {lE_{elec} + l\varepsilon_{friis\_amp} d^{2} ,} \hfill & {if\,\, d < d_{0} } \hfill \\ {lE_{elec} + l\varepsilon_{two\_ray\_amp} d^{4} ,} \hfill & {if\,\, d \ge d_{0} } \hfill \\ \end{array} } \right.$$
(12)

where \(d\) is the transmitter–receiver separation and \(d_{0}\) is crossover distance and is given by:

Fig. 2
figure 2

Radio energy dissipation model [18]

$$d_{0} = \sqrt {\varepsilon_{friis\_amp} /\varepsilon_{two\_ray\_amp} }$$
(13)

The term \(E_{elec}\) signifies the per-bit energy consumed for transmission. The parameters \(\varepsilon_{friis\_amp}\) and \(\varepsilon_{two\_ray\_amp}\) represent radio amplifier energy consumption at transmitter end in free space and two-ray ground propagation models, respectively.

The energy consumption for l-bit information reception is given by:

$$E_{RX} = lE_{elec}$$
(14)

where \(E_{elec}\) is the per-bit energy cost for reception.

5 Network Model

In the proposed network model, we use three-level node energy heterogeneity. The network model consists of n randomly deployed SNs on M × M sensing layout. It is predefined as all the nodes including BS are static on post deployment. BS is at the center of network field. Communication links between each other are assumed to be symmetric. The CH on behalf of sensing network is responsible to forward collected data directly or using multi-hop to BS. Every node has the capability to sense, aggregate, and forward the data. Nodes namely normal, advanced, and super with increasing order of energy level as presented in [27, 28, 30]. Hence, in the proposed protocol, we have three-level energy value of heterogeneity as \(E_{0}\), \(E_{Adv}\), and \(E_{Super}\) for energy of normal, advanced and super nodes, respectively. Nodes with percentage population factor as \(a\) for advanced nodes with total \(n\) nodes, which is equipped with energy factor greater than \(m\) times than normal node. Super nodes are equipped with energy incrementing factor as \(m_{0}\), with percentage population factor \(a_{0}\) with respect to total n nodes. Individual node details in the form of equation are presented as follows.

Initial energy for normal nodes is \(E_{0}\) with population count as \(n(1 - a - a_{0} )\). Subsequently, advanced and super node populations are \(na\) and \(na_{0}\), respectively. Energy available with advanced and super node is \(E_{Adv}\), and \(E_{Super}\), respectively. Energy calculations for respective nodes are as follows:

$$E_{Normal} = nE_{0} (1 - a - a_{0} )$$
$$E_{Adv} = nE_{0} \left( {1 + m} \right)a$$
$$E_{Super} = nE_{0} \left( {1 + m_{0} } \right)a_{0}$$

Above equations present energy available with all three types of nodes, i.e. normal, advanced and super nodes respectively. Total initial energy proposed in the network model is calculated as \(E_{Total}\):

$$E_{Total} = E_{Normal} + E_{Adv} + E_{Super} = nE_{0} \left( {1 + ma + m_{0} a_{0} } \right)$$
(15)

Hence, from (15), we hereby conclude that if we add heterogeneity level up to level-2, available energy increased by factor \(\left( {1 + ma} \right)\), and if it is increased to level 3, then energy increased by factor \(\left( {1 + ma + m_{0} a_{0} } \right)\).

6 Proposed Clustering Algorithm

Hierarchical-based routing protocols generally follow a layer-based architecture, where CH election and cluster formation is accomplished in one layer and routing is performed in another layer. In this work, in order to deal with uncertainties during CH selection, interval type-2 FIS has been adopted along with BFP to provide the chance value for each node. The process of proposed protocol named Fuzzy type-2 BFP based STERP (FBFPSTERP), is separated into rounds comprising of set-up and steady-state phase as shown in Fig. 3.

Fig. 3
figure 3

Operation of proposed FBFPSTERP protocol

In set-up phase, the selection of CHs is performed by BS using FIS based BFP from the alive SNs having remaining energy higher than a threshold energy level as shown in Fig. 4. We use a vector as a bat flower pollinator to represent an answer to the low-energy clustering problem. Every solution might depict a number of arbitrarily chosen CH nodes. The low-energy clustering problem is transformed into finding out a CH node set in the prospect solution space using the BFP. In this way, a potential solution will be shown as a string. Therefore, binary encoding is used on every individual. Every single SN within the system contains Boolean factors denoting whether the corresponding SN is selected as the CH node or not.

Fig. 4
figure 4

CH Election Algorithm using BFP with FIS

Let \(X_{i} = \left( {X_{i1} , X_{i2} , \ldots , X_{in} } \right)\) represent the ith population vector of n SNs, where \(X_{i} (j) \in \left\{ {0, 1} \right\}\). Alive SNs and CH nodes are represented by 0 and 1 respectively. For instance, assuming a solution is (0, 1, 0, 0, 0, 0, 0, 1). There are 8 sensors in the region and Nos. 2 and 8 nodes are selected as CH nodes.

Such encoding is adequate and powerful since it contains the whole searching region. Moreover, the length of population vector is the number of design variables. As an example, for WSN with 100 SNs, the entire string length is 100. Here the lookup region is the searching area of all solutions, which meet the CH proportion constraint. That is, the ith component matches the ith SN. As stated, the number of all combinations is 2100.

The initial population of bats and flower pollinators is given by

$$BF_{i} (j) = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {if\;(rand \le p) } \hfill \\ {0,} \hfill & {otherwise} \hfill \\ \end{array} } \right.$$
(16)

where p is percentage of CH selection, and rand is a uniform random number.

In order to quantify the effectiveness of each individual in the CH selection problem using FIS (in the next section), the fitness value of each individual is evaluated. The population goes through various operators (see Eqs. 110) to create an evolved population.

Finally, the fittest vector is used to seed the next phase where the non CH nodes are associated to their CHs to form clusters. This process is repeated iteratively until the termination condition occurs. The network can be clustered using Fuzzy type-2 based BFPSTERP as shown in Algorithm 1.

figure b

6.1 CH Selection Using T2FLS

With a CH selection method in a centralized way, the energy consumption of WSN is huge since the sink node needs to transmit the information of selected CHs to all the nodes in each round. Moreover, the lifetime of the whole network depends on the energy consumption related to the distance and residual energy of each node. It is hard to describe the exact mathematical model of the relationship between the network lifetime and the nodes parameters. The T2FLS does not need an exact mathematical model of the system as well, and it can make decisions even if there is insufficient data. Thus, the T2FLS is strongly recommended for WSN due to its low computational complexity and its easy application in a distributed way with low cost compared to other methods. Therefore, we develop a T2FLS to select the CHs.

Depending on the received data from SNs (id, location and residual energy), BS utilizes BFP to choose CHs based on chance value of each node using T2FLS. The node that has chance value larger than other SNs is elected as CH. The whole process for CH selection is demonstrated in Fig. 4. A T2FLS (Fig. 5a) is very similar to a type-1 FLS (T1FLS) as shown in Fig. 5b, the major structure difference being that the defuzzifier block of a T1FLS is replaced by the output processing block in a T2FLS, which consists of type-reduction followed by defuzzification [39, 40].

Fig. 5
figure 5

Probabilistic model for CH selection. a Using Fuzzy type-2 and b using Fuzzy type-1

Time Division Multiple Access (TDMA) schedules are formed by CH for accompanying member nodes in the cluster. Member nodes communicate with CH in the allotted time slots and remain in sleeping mode during unallocated time slots.

As shown in Fig. 5, remaining energy (RE), node centrality (NC), and distance to BS (D) are three input variables for T2FLS and CH election probability of a node is the only output parameter, named chance. The possibility of node to be elected as CH is more for higher values of chance.

The universal of discourse of the variables RE, NC, D and fit are [0…1], [0…1], [0…1], and [0…1], respectively.

Based on the application scenario considered, the T2FLS input set of values include RE, NC and D. The linguistic variables of the input values are considered as low, medium, and high for RE, close, reachable, and distant for NC, and nearby, average, and far for distance to BS as shown in Fig. 6a–c. The number as well as the limits of the membership functions (MFs) can be adjusted. Applying these characteristics to fuzzy logic the resulting proposed T2FLS includes the following set of input fuzzy variables:

$$\begin{aligned} & {\text{Residual}}\;{\text{energy}}\;RE \in \left\{ {{\text{low}},{\text{medium}},{\text{high}}} \right\}, \\ & {\text{Node}}\;{\text{centrality}}\;NC \in \left\{ {{\text{close}},{\text{reachable}},{\text{distant}}} \right\}, \\ & {\text{Distance}}\;{\text{to}}\;{\text{BS}}\;DBS \in \left\{ {{\text{nearby}},{\text{average}},{\text{far}}} \right\}, \\ \end{aligned}$$
(17)

and probability of a CH candidate election chance is the resulting output, shown in Fig. 6d.

$$chance \in \left\{ {{\text{very}}\;{\text{low(VL}}),\;{\text{low(L}}),\;{\text{rather}}\;{\text{low(RL}}),\;{\text{medium}}\;{\text{low(ML}}),\;{\text{medium(M)}},\;{\text{medium}}\;{\text{high(MH}}),\;{\text{rather}}\;{\text{high(RH}}),{\text{high(H}}),\;{\text{very}}\;{\text{high(VH)}}} \right\}.$$
(18)
Fig. 6
figure 6

Fuzzy sets for input and output variables

A complete set of the fundamental rules (3 × 3 × 3 = 27) has been extracted and defined so as to train the proposed T2FLS (Table 1). The following set of rules represent all the possible combinations of the input linguistic variables.

Table 1 Fuzzy inference rules

A node that has higher residual energy, closer node centrality and nearer to BS has more chances to be elected as a CH. An optimistic illustration of the problem is rule 19 in Table 1, while the contradictory one is rule 9.

Steady-State Phase Steady-state or transmission phase represents the actual communication of environmental reports from the network field. The proposed protocol adopts unique data transmission phase to achieve energy efficiency, reliability, and minimum delay transmission. For intra-cluster communication, the protocol adopts event based reactive short range single-hop transmissions [41], while multi-hopping is implemented for inter-cluster communications.

Intra-Cluster Data Transmission Phase During this data transmission phase, energy of active nodes will be modified according to the dissipated energy required for sensing, packets transmission, receiving, and aggregation. Energy is dissipated in sensing and to transmit the message packets from non CH to their associated CHs and, hence, the energy of the CMs will be modified. For CH nodes, energy is modified while performing packets receiving and aggregation. Thus, in this phase, the energy of the CM and CH nodes can be formally modified according to the following:

$$E(node_{j} ) = \left\{ {\begin{array}{*{20}l} {E(node_{j} ) - E_{sensing} - E_{{TX_{{{\text{node}}_{\text{j}} ,CH_{k} }} }} ,} \hfill & {if\;sensed\;value \ge Threshold} \hfill \\ {E(node_{j} ) - E_{sensing} ,} \hfill & {if\;sensed\;value < Threshold} \hfill \\ \end{array} } \right.$$
(19)
$$E\left( {CH_{k} } \right) = E\left( {CH_{k} } \right) - \left( {E_{RX} + E_{DA} } \right)$$
(20)

where \(E(node_{j} )\) and \(E\left( {CH_{k} } \right)\) denotes current energy of sensor \(j\) and CH \(k\) respectively, \(E_{{TX_{{node_{a} ,node_{b} }} }}\) is energy expenditure for transmitting data from \(node_{a}\) to \(node_{b}\), \(E_{RX}\) is energy dissipated for reception of data and \(E_{DA}\) is the energy dissipation in data aggregation.

Inter-Cluster Data Transmission Phase During this phase, the energy of CHs in the network will be consumed according to the dissipated energy required for packets transmission to BS. Energy is also dissipated for relay CHs to receive the message packets from distant CHs and data aggregation, and transmission to BS. Thus, in this phase, the energy of the CH nodes can be formally modified according to the following:

$$E(CH_{k} ) = \left\{ {\begin{array}{*{20}l} {E(CH_{k} ) - E_{{TX_{{CH_{k} ,CH_{R} ,}} }} ,} \hfill & {if\;d_{{CH_{k} ,BS}} \ge R} \hfill \\ {E(CH_{k} ) - (E_{RX} + E_{DA} + E_{{TX_{{CH_{k,} BS}} }} ),} \hfill & {if\;d_{{CH_{k} ,BS}} < R} \hfill \\ \end{array} } \right.$$
(21)

where \(CH_{R}\) is the relay CH that lie within the radius \(R\) centered on BS.

7 Simulation Results

This section evaluates the performance analysis over simulations experiments of FBFPSTERP and its competitive protocols. MATLAB is used to simulate existing protocols along with of FBFPSTERP which uses the BFP algorithm for CHs selection in order to reduce energy conservation of SNs. Simulative results under conventional setting have been observed in order to maintain the realistic outcomes. Random distribution of nodes is initialized in specific settings in different scenario. In homogeneous setup, 100 nodes are placed within a 100 m ∗ 100 m area having initial energy \(E_{0}\), with BS located at (50, 50). Advanced and super nodes are set to 20% and 10% of total nodes having initial energy \(2E_{0}\) and \(3E_{0}\) respectively for heterogeneous setup. Initial network parameters and linear radio model’s specifications are given in Table 2. Performance of FBFPSTERP has been critically analyzed after testing experiments in presence of existing state of the art routing protocols like LEACH, SEP-E, HCR, ERP, DRESEP, HSTERP and DETERP. The parameters setting for simulated protocols are given in Table 3.

Table 2 Network parameters
Table 3 Parameters setting for simulated algorithms

Figure 7 shows the alive nodes over the communication rounds for homogeneous setup. FBFPSTERP improves the network lifespan, as it considers remaining energy of SNs for CH election. In FBFPSTERP, the node with the highest remaining energy and shortest distance to BS has the best chance to become the CH. Therefore the nodes transmission radius decreases as its distance to BS decreases, which leads to creating smaller cluster sizes with shorter sensors chain length. Hence, the energy consumed due to intra-cluster data processing and long distances to respective CHs will be reduced. In addition, the multi-hop routing in inter-cluster is adopted in an optimum manner on the basis of distance of CH to BS as a factor.

Fig. 7
figure 7

Number of survival nodes changes with rounds updating for homogeneous setup for E0 = 1 J

Figure 8 shows the energy consumption for FBFPSTERP is less than its comparative. The reduction in average energy consumption at each round is justified for the following reasons: (1) the node which is rich in energy and minimizes the intra-cluster communication cost becomes the CH for a cluster. (2) The CHs are not fixed and gets updated on every re-clustering based on SNs that have higher energy and relatively nearby to BS.

Fig. 8
figure 8

The residual energy of the network changes over time for homogeneous setup

Tables 4, 5 and 6 present the round history of dead nodes for homogeneous setup for \(E_{0}\) = 0.25 J, 0.5 J and 1 J respectively. For the total network lifetime [i.e., the round until last node dead (LND)] and the stability period [i.e., the round until first node dead (FND)], the proposed protocol proves very favorable against all other protocols.

Table 4 Comparison on network lifetime for homogeneous setup for \(E_{0} =\) 0.25 J
Table 5 Comparison on network lifetime for homogeneous setup for \(E_{0} =\) 0.5 J
Table 6 Comparison on network lifetime for homogeneous setup for \(E_{0} =\) 1 J

The behaviour of FBFPSTERP for heterogeneous setup is shown in Figs. 9 and 10, and the statistics are given in Tables 7, 8 and 9.

Fig. 9
figure 9

Number of survival nodes changes with rounds updating for heterogeneous setup for \(E_{0} =\) 1 J

Fig. 10
figure 10

The residual energy of the network changes over time for heterogeneous setup

Table 7 Comparison on network lifetime for heterogeneous setup for \(E_{0} =\) 0.25 J
Table 8 Comparison on network lifetime for heterogeneous setup for \(E_{0} =\) 0.5 J
Table 9 Comparison on network lifetime for heterogeneous setup for \(E_{0} =\) 1 J

Table 10 presents the number of rounds taken for FND, HND and LND together with stability and instability periods of competitive algorithms for \(E_{0}\) = 1 J. There is an improvement of 156.47%, 179.8%, 1318.6% and 18.14% in stability period for FBFPSTECP as against LEACH, HCR, ERP and DRESEP respectively. Also, the instability period reduces to 2.68%, 48.74%, 22.16% and 58.68% in comparison with LEACH, HCR, ERP and DRESEP respectively for homogeneous setup. Similarly, for heterogeneous setup, there is a significant progress in stability period for FBFPSTERP as compared to competitive protocols.

Table 10 Comparison of network lifespan of simulated algorithms along with stability and instability periods for \(E_{0} =\) 1 J

To evaluate the effect of node density in each approach, the number of nodes is varied from 100 to 500 and the results are demonstrated as given in Tables 11 and 12. In dense networks, more CHs help to maximize the network lifetime. The performance of FBFPSTERP is much better than other protocols for varying node density. By increasing the node density, the resulting network has better lifetime as multi-hop communication in inter-cluster is utilized in an optimum manner.

Table 11 Effect of node density on the performance of FBPSTERP for homogeneous setup
Table 12 Effect of node density on the performance of FBFPSTERP for heterogeneous setup

8 Conclusion

Due to advancement in the technology and need for machine-to-machine connectivity, WSN overplays the role compared to other wireless networks. In this context, different applications based on WSNs need to be executed efficiently in terms of energy and communication. To achieve this, there is a need to collaborate among various devices at various levels, which can be achieved by the grouping of these devices, that is, through the clustering. In this view, we have presented a new clustering protocol for randomly distributed WSNs to improve the network lifetime and stability period. This protocol finds optimum number of CHs using Fuzzy type-2 based BFP using sensor parameters such as remaining energy, node centrality and distance to BS. To increase the lifetime of the WSN, threshold-based data transmission algorithm is employed in inter-cluster communication. We have extended our protocol by multi-hop routing scheme in which only CH closest to BS transmits its data directly to BS to reduce energy dissipated by CHs located far away from BS. The simulation results show that FBFPSTERP has better energy-efficiency and prolonged lifetime in different sizes of networks compared with other existing protocols.