1 Introduction

Wireless sensor networks (WSNs) are formed by a collaboration of sensors through data sensing, processing, and wireless communication among the sensor nodes. These networks are organized for sensing event-driven information and transmitting it to the base station (BS) for in-depth evaluation [1]. WSNs have delivered beneficial outcomes in several applications such as environmental monitoring, surveillance missions, health monitoring, home automation, target tracking, traffic monitoring, fire management, agriculture monitoring, industrial failure detection, and energy management [2]. WSNs are often deployed in the form of thousands of nodes in remote and hostile areas which are inaccessible or unsafe for humans. Therefore, the formation of autonomous and energy efficient network among the sensor nodes becomes vital to ensure prolonged network lifetime and controlled energy depletion [3].

Energy efficiency is directly related to effective data routing wherein cluster of nodes is formed to reduce the energy consumption and control overhead while limiting the interference among the sensor nodes [4]. Generally, the energy is consumed during data sensing, processing, and transmission. Among these activities, data transmission consumes the most energy [5]. Thus, efficient data forwarding and processing techniques must be developed to extend the network lifetime. One possible solution is to use in-network data aggregation schemes [4]. This approach reduces a significant number of data packets transmitted during the network operation by aggregating data at intermediate nodes and thus helps in bandwidth and energy savings. Data aggregation involves combining data from various sources so that aggregated information is received at the base station and circulation of redundant information is eliminated. For execution of common tasks, the nodes within the network must communicate with each other or through intermediate nodes [6].

To develop a data aggregation scheme, three main constituents of data aggregation should be considered. First, the aggregation function used by the protocol. Second, data aggregation scheduling which defines the waiting period before a node aggregates and forwards the received data. Third, the routing scheme which defines the routing protocol used to send the aggregated data towards the sink by generating a network structure [1]. This paper focuses on the routing scheme of data aggregation which potentially optimizes the routing procedure by utilizing the available processing capability of the intermediate sensor nodes. The aggregation task in our network is achieved by formation of tree-based data aggregation in a three-level hierarchy. This reduces the processing and communication cost for randomly distributed nodes.

The rest of this paper is organized as follows: in Sect. 2, the related works are discussed. In Sect. 3 and 4, flower pollination algorithm (FPA) and enhanced FPA (EFPA) algorithms are outlined respectively. In Sect. 5, the proposed methodology and strategy are presented in detail. Section 6 discusses the performance of the proposed algorithm by comparing it with other approaches. Finally, in Sect. 7, conclusions are drawn and possible future directions are described.

2 Related Work

The main operational sustainability concern in WSN is its energy resource constraint. A great number of energy efficient routing protocols have been proposed in recent years for WSNs based on the network organization and the routing protocol operations. Some of these focused on minimizing the communication distance to reduce the energy consumption and a handful of them focused on fair energy distribution to avoid the routing hole (hot spot) problems [1,2,3,4].

Based on the logical structure, the routing protocols are divided into two categories. The first category is flat routing, in which the role of each node in the network is same and there are no special nodes. The advantage of this type of protocols is their robustness. The other category is hierarchical-based routing. One of the most classical paradigms of hierarchical-based routing is the clustering, in which cluster is an infrastructure and nodes play different roles. This method involves division of network into small sets of nodes called clusters. Within each cluster, the hierarchy is divided into a cluster head (CH) node and member nodes [3]. The data from the member nodes is collected by the CH. Then, the data is aggregated and forwarded to the upstream node. LEACH [6] and HEED [7] are two classic models of clustering. They differ in the selection method of CH. LEACH is formulated on the assumption that energy of all nodes is equal during the election while HEED considers the variation of energy in nodes to optimize the network lifetime.

Hierarchal routing protocols are energy efficient compared with flat routing protocols. The advantage of hierarchal approach is to control the data duplication and is best suited for data aggregation. Several hierarchal based energy efficient routing protocols have been referred to in the literature such as TEEN [8], APTEEN [9], SEP [10], DEEC [11], LEACH-DT [12] etc. LEACH [6], TEEN [8] and APTEEN [9] protocols are based on random CH selection. These protocols select CH based on a random acquired value. If this value is less than a certain threshold, the nodes will be the CH. Because of the randomness during the selection, the selected CH is prone to be distributed improperly and unevenly. This could cause the uneven distribution of the traffic flow in different cluster head nodes. One of the direct consequences is that some CHs exhaust energy, therefore, the performance of the entire network is affected. Many LEACH-type schemes are applied in homogeneous WSNs. In homogenous sensor networks, sensor nodes cannot adapt to the presence of heterogeneity when the network is in operation. As a result, these nodes which consume more energy will die first, and as a result LEACH-type protocols turn out to be unstable. A number of variants to LEACH protocol are stated in literature based on heterogeneity [10,11,12,13,14], distance-based thresholds [12], multi-hopping [14, 15], deterministic CH selection [16] and reactive protocol [8, 9, 17, 18].

To get better performance, stable election protocol (SEP) [10] is proposed to maintain the hierarchical routing in WSNs where two types of nodes have their own election probability. Kumar et al. [13] proposed an energy-efficient heterogeneous clustered (EEHC) protocol in the heterogeneous model. The nodes in the network are categorized into three levels according to their initial energy. EEHC is based on SEP, and the three types of nodes in EEHC have their own election probability to be CHs within a fixed time to keep the system stable. Mittal and Singh [17] proposed a reactive cluster-based routing protocol named distance-based residual energy efficient SEP (DRESEP) optimal for event driven applications like battlefield surveillance, forest fire detection and health care monitoring. It considers residual energy of nodes and their distances from BS as CH nomination parameters. Dual-hop communication protocol is used for data transmission between CH and BS (inter-cluster routing). Data aggregation is performed by CH in each cluster for the purpose of saving the residual energy. Mittal et al. also proposed a stable energy efficient clustering protocol (SEECP) [18] in which CHs are selected in deterministic fashion based on residual energy of nodes in order to balance the load effectively among nodes.

Researchers have combined the clustering scheme with the biologically inspired routing scheme to achieve longer lifetime [19,20,21,22,23,24,25,26]. Evolutionary algorithms (EAs) are used to handle the cluster-based problem to optimize energy consumption and prolong lifetime of network with heterogeneity. Evolutionary based clustered routing protocol (ERP) [21], energy-aware ERP (EAERP) [22], stable-aware ERP (SAERP) [23] and threshold-sensitive energy efficient routing protocols (TERP) using DE [24], HSA [25] and SMO [26] are some of the recently developed EA based clustering algorithms. EAERP redesigned some significant features of EAs, which can assure longer stable period and extend the lifetime with efficient energy dissipation. ERP overcame the shortcomings of hierarchical cluster-based routing (HCR) algorithm [20] by uniting the clustering aspects of cohesion and separation error. SAERP combined the main idea of SEP and EAs with an aim to increase the stability period of the network. In TERP, threshold decision based communication and dual-hop communication is employed for intra-cluster and inter-cluster communication respectively in order to improve the network lifetime. Each of these routing schemes (DETERP, HSTERP and SMSTERP) demonstrated their advantages in prolonging the lifetime of HWSNs [24,25,26].

In the recent past, the routing was emphasized with clustering but nowadays tree based routing algorithms gain attention due to inherent property of efficient routing by using different tree branching techniques. A tree-based routing protocol establishes and maintains a shared routing tree to deliver data from a source to receivers of group. Tree based protocols gives the high data forwarding efficiency and low robustness. In the chain based approach (another class of hierarchical-based routing), all the nodes are placed in the chain type fashion, one node associates with other node next to it. In the chain based approach token passing scheme is used, finally whole the data which is gathered and aggregated is send to the leader node which fuses the data and sends it to the BS [27]. It is well explained in Power-efficient gathering in sensor information systems (PEGASIS) approach.

Tree-based clustering (TBC) protocol is considered to be an improvement to LEACH [28]. It forms several clusters likewise in LEACH, and every cluster has cluster members associated with CH. A routing tree is constructed with the help of nodes within the cluster and CH is chosen as root of it. Distance information between the member nodes and root is used by the CH for the tree configuration. Every node is location-aware, so the distance between root and itself can be estimated by it. Every cluster is divided into some levels. To determine the level the distance between the node and root is considered. At level-0 root is set and a node in level L(i) will choose the node in the level L(i)-1or the node nearest to itself as its parent node. In two neighboring levels, the data is transferred simultaneously between the nodes, and each node will fuses the received data and send it to its parent node.

PEDAP [29] is a tree based routing protocol that makes use of minimum spanning tree. Minimum spanning tree leads to loop free topology which costs minimum for the transmitting of data. Likewise PEGASIS, PEDAP also fuses the data and having almost the same network assumptions. In PEDAP, BS has to build the topography which consumes a large amount of energy wastage. It is because if the topography is built by BS, it should send a lot of information to the selected sensor nodes, including the TDMA slot information for each node. It also gives the proper information about which nodes will act as child nodes and parent nodes. This exchange of information will cause a large amount of energy wastage and also leads to long delay.

In [30], a tree based power saving routing protocol is constructed in which two parameters are considered to control the tree construction and these are: the maximum capacity of a node to have s number of children, and the maximum tree depth. In the construction of tree, before sending a message to the sink, a sink rooted tree must be constructed. To assign a logical ID for each node, a new addressing scheme is used. Nodes depth and neighbors depth is calculated by each node.

Tree-based Efficient Protocol for Sensor Information (TREEPSI) is proposed in that a root node is selected before data transmission [31]. To build tree path two ways are defined. One is path computation centrally by a sink, afterwards it is broadcasted. Second, the root node visits other nodes by employing a standard tree algorithm at the very first stage known as initial stage. In data transmission phase, the leaf node collects the data and this collected data is aggregated and send by root node to sink. This process continues until the root node dies and the new node is selected. The communication distance is reduced by using TREEPSI as compared to PEGASIS that permits it to scale back power consumption as much as 30% to PEGASIS.

A General Self-organizing Tree Branching Energy Balancing Protocol (GSTEB) is introduced with an aim to achieve a longer network lifetime for different applications in WSN environment [32]. In each round, BS assigns a root node according to predetermined criterion and broadcasts its identity as well as coordinates to all the sensor nodes. GSTEB can reconstruct the routing tree with lesser delay and low energy consumption. The GSTEB includes four phases: Initial phase, Tree constructing phase, Self-Organized data collecting and transmitting phase and Information exchange phase. The delay in GSTEB is shorter than that of the PEGASIS and PEDAP. This is because, Tree constructing phase in GSTEB consumes little energy and causes a shorter delay because the topography is built by self-Organizing in that each node is able to choose its parent simultaneously.

The literature presented above reveals the fact that the key objective of routing protocols is to prolong the network lifetime by applying efficient cluster-based, chain-based or tree-based algorithms. However, in each of these algorithms, CHs or parent nodes bear additional workload contributed by their member nodes or leaf nodes. In this paper, an energy efficient tree-based routing algorithm is introduced to overcome the load balancing shortcomings of WSNs. In this, leaf nodes use threshold decision based reactive strategy for data transmission to parent node. Designing energy efficient tree-based routing algorithm is an NP-hard problem. Flower pollination algorithm (FPA) is recently developed heuristics that mimics the pollination process of flowers [33] and has been successfully applied for antenna design problems [34, 35]. But it has been proved quantitatively and qualitatively in [36] that FPA has a very limited scope and its deeper analysis should be done to make it as a state-of the art algorithm. This analysis paves way for researchers to design new algorithm to improve the basic FPA as per their problem requirement. In present work, a newly proposed enhanced FPA (EFPA) [35] has been exploited to the best of its potential for solving the above load balancing problem in tree-based routing algorithm.

3 Flower Pollination Algorithm

FPA is a novel bionic evolutionary algorithm that was first proposed by Yang [33]. The inspiration for FPA comes from the natural pollination process that takes place in flowering plants. In FPA, each flower stands for a feasible solution and the objective function value is regarded as the fitness value. To mimic the pollination process, there are two different pollination methods for each flower to choose: global or local pollination with switch probability p. The specific process is described as follows.

3.1 Global Pollination Phase

For each individual, a random number rand is generated. If rand <  p, then global pollination should be carried out. In the global pollination process, each flower updates its position according to the following equation [33]:

$$X_{i}^{t + 1} = X_{i}^{t} +\upgamma\,L\left( \lambda \right)\left( {X_{best}^{t} - X_{i}^{t} } \right)$$
(1)

where \(X_{i}^{t}\) and \(X_{i}^{t + 1}\) are the old and new positions of ith flower, respectively, \(X_{best}^{t}\) is the best flower at current iteration t, which has the best fitness value in the whole population, and \(\gamma\) is the scaling factor that controls the step size of global pollination. Parameter L, the Lévy flight, is used as the strength of pollination in the basic FPA, and the step size (\(\lambda\)) obeys the Lévy distribution:

$$L\sim \frac{{\lambda \varGamma \left( \lambda \right){ \sin }\left( {\pi \lambda /2} \right)}}{\pi } \frac{1}{{s^{1 + \lambda } }},\quad (s \gg s_{0} > 0) .$$
(2)

where Γ(\(\lambda\)) is the gamma function.

3.2 Local Pollination Phase

If rand > p, the local pollination process should be carried out. Flower \(X_{i}^{t}\) obtains its new position \(X_{i}^{t + 1}\) according to the difference between its old position and the position of two neighboring flowers \(X_{p}^{t}\) and \(X_{q}^{t}\). This process is considered as local search, and the updating equation [33] is defined as

$$X_{i}^{t + 1} = X_{i}^{t} + r\left( {X_{p}^{t} - X_{q}^{t} } \right)$$
(3)

where \(r\) is drawn from a uniform distribution [0, 1], and it is considered as a local random walk.

After pollination is completed, the new individuals update their positions by comparing fitness values. If the fitness of \(X_{i}^{t + 1}\) is better than that of \(X_{i}^{t}\), the new position of ith flower will be replaced by \(X_{i}^{t + 1}\). Otherwise, ith flower remains at \(X_{i}^{t}\) .

4 Enhanced Flower Pollination Algorithm

FPA due to its linear nature has gained interest among researchers and large number of improvements in its basic form as been done since its inception. Enhanced flower pollination algorithm was one such algorithm proposed in the recent past [35]. This algorithm aims at achieving enhanced performance by balancing the exploration and exploitation capabilities of the basic FPA. Exploration has been enhanced by inclusion of heavy tailed Cauchy based global pollination, exploitation has been improved by modified local pollination inspired from experience of current best solution and a balance between both of these phenomenon is achieved by use of dynamic switch probability. The major modifications are discussed as below:

In global pollination phase, instead of Lѐvy flight based step size, highly directed and heavy tailed Cauchy based step size is used. This step size due to its heavy tailed distribution is better at exploring the search space and also allows larger mutations owing for better convergence. The Cauchy based step size is generated from Cauchy distribution function given by

$$d = \frac{1}{2} + \frac{1}{\pi }{ \arctan }\left( {\frac{\delta }{g}} \right)$$
(4)

The Cauchy density function is given by

$$f_{{Cauchy\left( {0,g} \right)}} \left( d \right) = \frac{1}{\pi }\frac{g}{{g^{2} + x^{2} }}$$
(5)

where g is the scaling parameter and its value is taken as 1, d is a random number in the range of [0, 1] and \(\delta\) is the Cauchy random operator. This Cauchy distributed general equation of global pollination phase becomes:

$$X_{i}^{t + 1} = X_{i}^{t} + C\left( \delta \right)\left( {X_{best}^{t} - X_{i}^{t} } \right)$$
(6)

The local pollination phase has been enhanced by use of two random solutions from the search space to update the current solution. In terms of pollinators, the fitness of new solution is improved by experience of current best solution and local random solutions. Here if the fitness of new solution is not better than the old solution, the new position is updated based on the previous solution. The equation of local pollination phase is given by:

$$X_{i}^{t + 1} = X_{i}^{t} + a\left( {X_{best}^{t} - X_{i}^{t} } \right) + b\left( {X_{k}^{t} - X_{l}^{t} } \right)$$
(7)

where \(a \;and\; b \in \left[ {0, 2} \right]\) are uniformly distributed random variables, \(X_{j}^{t} \;and\; X_{k}^{t}\) are two random solutions with respect to kth and lth flower pollinator where \(k \ne l\). This phase, enhances the exploitative tendencies of FPA.

4.1 Dynamic Switch Probability

A balance between the exploration via global pollination and exploitation via local pollination is achieved by use of dynamic switch probability and is taken in the range of \(p\;\epsilon \; \left[{0,1} \right]\). The general equation of switch probability is given by:

$$p = p - \frac{maxiter - t}{maxiter} \times \left( {0.1} \right)$$
(8)

where t is the current iteration and maxiter is the maximum number of iterations or generation. This equation decreases the p value linearly with respect to iterations. This p value ensures that more extensive exploration at the initial stages and intensive exploitation occurs toward the end of iterations. The pseudocode for EFPA is given in Pseudocode 1.

Parent node selection in tree-based WSNs is a binary-coded problem, therefore, the basic EFPA cannot be used to tackle this problem. Inspired from the basic EFPA algorithm, binary version is used for binary optimization problems. When the position of moth is updated, to determine whether a node will be selected as parent node, a static threshold of 0.5 is used, as the following equation is used to discrete the position:

$$Flag_{i} \left( j \right) = \left\{ {\begin{array}{*{20}l} {1,} \hfill & { if \left( {X_{i} \left( j \right) \ge 0.5} \right)} \hfill \\ {0,} \hfill & {otherwise} \hfill \\ \end{array} } \right.$$
(9)

where \(X_{i} \left( j \right)\) indicates the \(j\)th position of \(i\)th flower.

figure d

5 Proposed Protocol

In hierarchical-based routing protocols, sensor nodes perform different tasks such as sensing, processing, transmitting, and receiving. Some of these sensor nodes called CHs or parent nodes, are responsible for collecting and processing data and then forwarding it to sink. The task of other nodes called member nodes or leaf nodes is to sense the sensor field and transmit the sensing data to the head nodes. Hierarchical based routing is a two-layer architecture, where head nodes selection is performed in the first layer and the second layer is responsible for routing.

In this work, EFPA based threshold-sensitive energy-efficient tree-based routing protocol (EFTETRP) is proposed. The aim of EFTETRP is to attain an extended network lifetime for various applications. The protocol process is divided into rounds consisting of the following four phases [32]: (a) Initial phase, (b) Tree constructing phase, (c) Data transmitting phase and (d) Information exchange phase.

5.1 Initial Phase

In initial phase, BS broadcasts a message to all nodes in order to make them aware about the timings and time slot length. When all the nodes receives this message from BS they will construct their residual energy level. Node then sends this packet in a circle having radius \(R_{t}\) in their allotted time slot. The neighbor nodes who can receive the information can store this information in their memory. The nodes that are not the neighbors in range \(R_{t}\) can put their radios off. Nodes used to maintain a table in their memory which can store the information regarding their neighbor nodes. After this each node will sends its information to its neighbors so that neighbors can receive this information and store it in their memory. Each node will maintain two tables in its memory which contains the information of its neighbors as well as neighbor’s neighbor information.

5.2 Tree Constructing Phase

After each and every round of the algorithm, BS will choose parent nodes from the sensor nodes with some predefined parameters termed as fitness function using EFPA. BS broadcasts parent nodes ID and coordinates to each and every sensor node. Each node will select its parent node such that the difference between the node and its parent node should be shorter than the distance between itself and BS. If the above said situation is not satisfied then the node will choose BS as its parent node only. The node which is not having any child node will consider itself as a leaf node and hence the process of transferring data will start from this node.

5.3 Data Transmitting Phase

After the construction of the routing tree, data is collected and processed to generate a data packet (DATA_PKT) that is transmitted to BS. Every node selects its parent by considering energy as well as distance optimal values. There may be many leaf nodes sharing one parent node in the same time slot. If all the leaf nodes try to send the data to the parent node at the same time, the data messages may interfere and cause routing overhead and thus decrease throughput. To avoid collisions, Code Division Multiple Access (CDMA) or Time Division Multiple Access (TDMA) techniques may be applied. The operation is divided into several timeslots, at the starting of every round. For example, in the ith timeslot, the node having its node ID \(i\) will turn on its radio and receives messages from the BS. Same approach is used by BS to construct the routing tree in each round. After that BS tells the sensor nodes when to receive or transmit the data. Leaf nodes send their sensed data to its parent node when the threshold criteria is satisfied [24,25,26]. Parent node receives the data and send the aggregated data to BS. When the BS fetches all the data from the sensor nodes, the next phase will start (Fig. 1).

Fig. 1
figure 1

EFTETRP process

5.4 Information Exchange Phase

In this phase, BS collects the energy and coordinate information of all the sensor nodes. For each round, BS builds the routing tree and network schedule by using coordinates and energy information. This information is very necessary for calculating the topology for the upcoming round in advance. As every node need to transmit data in each round. So energy of a node will be exhausted after some rounds which can influence the network topography. The BS exchange information by sending DATA_PKT to sensor nodes and in return receives CTRL_PKT from them.

5.5 Proposed Fitness Function

Let’s assume network consists of N sensor nodes which is divided into K number of branches, the number of candidate parent node (PN) is denoted by M (generally greater than K), there can be C KM ways of clustering.

Fitness function is defined as:

$$f = \alpha f_{1} + \beta f_{2} + \gamma f_{3}$$
(10)

Here \(\alpha , \beta ,\gamma \;{\sf C}\!\!\!\!\raise.8pt\hbox{=} \;\left[ {0,1} \right],\; \alpha + \beta + \gamma = 1\)

In the fitness function, \(f_{1}\) is the ratio of the total initial sum of energy of all the sensor nodes in the network to the sum of the energy of parent nodes in the present round. Here, \(f_{1}\) makes sure that the parent nodes are selected from sensor nodes that have enough energy to achieve its task.

$$f_{1} \left( {p_{j} } \right) = \frac{{\mathop \sum \nolimits_{i = 1}^{N} E\left( {n_{j}} \right)}}{{\mathop \sum \nolimits_{i = 1}^{N} E\left( {PN_{j,k} } \right)}}$$
(11)

\(f_{2}\) is to minimize the maximum Euclidean average distance that shows how far a parent node is from its leaf nodes.

$$f_{2} \left( {p_{j} } \right) = \begin{array}{*{20}c} {max} \\ {\forall k = 1,2,3 \ldots N} \\ \end{array} \;\frac{{\mathop \sum \nolimits_{\forall ni \in Cpj,k}^{N} d\left( {n_{i} , PN_{j,k} } \right)}}{{BP_{j,k} }}$$
(12)

Here \(d\left( {n_{i}, PN_{j,k} } \right) = \mathop {min}\nolimits_{\forall k = 1,2,3 \ldots N} \;\left\{ {d\left( {n_{i} , PN_{j,k} } \right)} \right\}\) is the Euclidean distance which is smallest of all from the node \(n_{i}\) to the PN \(PN_{j,k}\) and \(BP_{j,k}\) is the number of nodes in the branch Bk what the particles belong to \(P_{j}\) and this \(f_{2}\) the average Euclidean distance every PN to its all sensor nodes in the branch is smaller.

\(f_{3}\) is the distance ratio of the average distance from the PN to the BS and the Euclidean distance from the BS to the centre of the network.

$$f_{3} (p_{j} ) = \frac{{\sum\nolimits_{{i = 1}}^{N} {d(BS,\;PN_{{j,k}} )} }}{{K*d(BS,\;NC)}}$$
(13)

In \(f_{3} ,\;\sum\nolimits_{i = 1}^{N} d \left( {BS, PN_{j.k } } \right)\) is the Euclidean distance from BS to every PN and \(d\left( {BS, NC} \right)\) is the distance of BS from the centre of the network. \(f_{2}\) makes the whole network area divided into uneven PNs, the size of the PN near the BS is smaller as compare to the others.

6 Simulation Results

MATLAB is used for designing the network scenario which executes the EFPA algorithm for tree formation and parent nodes selection in order to reduce energy conservation of sensor nodes. The simulation results of EFTETRP protocol has been analyzed with respect to the performance metrics such as energy efficiency, total network residual energy and network lifetime against LEACH, SEP-E, HCR, ERP, DRESEP, HSTERP and DETERP protocols. The network characteristics used for the protocol simulations are summarized in Table 1.

Table 1 Parameters used in MATLAB simulation

Simulation results are produced by deploying 100 nodes randomly within a 100 m × 100 m area. In homogeneous setup, the network consists of 100 nodes having initial energy \(E_{0}\), in which the deployment area is 100 m × 100 m and BS is located at (50, 50). For heterogeneous setup, advanced and super nodes are set to 20% and 10% of the total nodes respectively. The advanced nodes and super nodes have initial energy 2 times and 3 times greater than normal nodes (having initial energy \(E_{0}\)) respectively. Equal weights are assigned in the fitness function of (10) for EFTETRP. The parameters setting for simulated protocols are given in Table 2.

Table 2 Parameters setting for simulated protocols

The performance of EFTETRP protocol is explored in terms of network lifetime and stability period (the time internal or the rounds before the first node dead) against the competitive algorithms. The simulation results of competitive protocols for homogeneous set up with initial energy \(E_{0}\) = 1 J are shown in Fig. 2. It shows the relation between alive nodes and number of rounds, from which it is obvious that EFTETRP enhances the entire network lifetime to some extent. The main reason for this result is that in EFTETRP, parent nodes and leaf nodes are optimally selected by making use of EFPA on the basis of energy and distance between parent nodes and BS. These measures reduce and balance the energy consumption of the whole network in comparison to HSTERP and DETERP. From the point of view of the individual, the energy of each node is saved and used efficiently, so that the lifetime of the network is extended. In addition, the proposed protocol is event driven protocols in that data transmission is possible when certain conditions are satisfied.

Fig. 2
figure 2

Number of alive nodes per round for simulated protocols for homogeneous setup for \(E_{0} = 1\) J

Figure 3 shows average energy remaining in the network per round for homogeneous set up with initial energy \(E_{0}\) = 1 J. Energy analysis demonstrates that EFTETRP have consumed less energy at each round during the network operation in comparison to competitive protocols. The reason for this phenomenon is that the uniform distribution of the trees is ensured in EFTETRP, which is conducive to balancing the energy depletion of the entire network.

Fig. 3
figure 3

Average energy per round for simulated protocols for homogeneous setup for \(E_{0} = 1\) J

The simulations are performed to check the performance of the proposed algorithm with varying initial energy of nodes. Tables 3, 4 and 5 present the round history of dead nodes for homogeneous setup for \(E_{0}\) = 0.25 J, 0.5 J and 1 J respectively. For the total network lifetime [i.e., the time until last node dead (LND)] and the stability period [i.e., the time until first node dead (FND)], the proposed protocol proves very favorable against all other protocols.

Table 3 Round history of dead nodes for homogeneous setup for \(E_{0} = 0.25\) J
Table 4 Round history of dead nodes for homogeneous setup for \(E_{0} = 0.5\) J
Table 5 Round history of dead nodes for homogeneous setup for \(E_{0} = 1\) J

The behaviour of EFTETRP for heterogeneous setup is shown in Figs. 4 and 5, and the statistics are given in Tables 6, 7 and 8.

Fig. 4
figure 4

Number of alive nodes per round for simulated protocols for heterogeneous setup for \(E_{0} = 1\) J

Fig. 5
figure 5

Average energy per round for simulated protocols for heterogeneous setup for \(E_{0} = 1\) J

Table 6 Round history of dead nodes for heterogeneous setup for \(E_{0} = 0.25\) J
Table 7 Round history of dead nodes for heterogeneous setup for \(E_{0} = 0.5\) J
Table 8 Round history of dead nodes for heterogeneous setup for \(E_{0} = 1\) J

Table 9 presents the number of rounds taken for FND, HND and LND together with stability and instability periods of competitive algorithms for \(E_{0}\) = 1 J. There is an improvement of 167.74, 180, 128.70 and 17.83% in stability period for EFTETRP as against LEACH, HCR, ERP and DRESEP respectively. Similarly, for heterogeneous setup, there is a significant progress in stability period for EFTETRP as compared to competitive protocols.

Table 9 Comparison of network lifetime of simulated protocols together with stability and instability periods for \(E_{0} = 1\) J

To evaluate the effect of node density in each approach, the number of nodes is varied from 100 to 500 with initial energy 1 J. The same parameters as that in 100 nodes scene are used to create the simulation model, and the results are demonstrated as given in Tables 10 and 11 for homogeneous and heterogeneous setups respectively. In dense networks, more parent nodes help to maximize the network lifetime. The performance of EFTETRP is much better than other approaches for varying node density. By increasing the node density, the resulting network has better lifetime. The performance of EFTETRP denotes that the reliability, the stability, and the scalability of proposed algorithm is especially excellent and is suitable to large scale WSNs.

Table 10 Effect of node density on the performance of EFTETRP for homogeneous setup
Table 11 Effect of node density on the performance of EFTETRP for heterogeneous setup

Although improvements are made in some performance metrics, there are some limits in using this algorithm yet. EFTETRP belongs to the reactive routing algorithm so that this protocol is most appropriate when an event or threshold based monitoring by the sensor network is needed. Another limitation of EFTETRP is that delay in generating the data transmission process from a node to BS is bit higher in comparison to competitive algorithms. This is because data fusion is implemented at each parent node in the path toward BS.

7 Conclusion

Energy management, stability period and network lifetime optimization are major design issues in the research of clustering protocols for WSNs. This work focuses on energy conservation in each sensor node by using EFPA based parent node selection in a tree-based routing algorithm. The parent node is selected using EFPA based on residual energy of nodes and their distance to BS. To increase the lifetime of the WSN, threshold-based data transmission algorithm is employed. The performance metrics such as network lifetime, residual energy and total energy consumption are evaluated and compared with competitive clustering and routing methodology. The simulation outcome shows that EFTETRP gives improved performance in terms of the total energy expenditure and network lifetime of WSN.