1 Introduction

A wireless sensor network (WSN) consists of a large number of tiny, low-power, low-cost, spatially distributed, communicating nodes (Ramar and Rubasoundar 2015; Tan and Körpeoǧlu 2003; Kumar and Rajkumar 2014; Ali et al. 2019; Marappan and Rodrigues 2016; Goyal and Tripathy 2012; Zytoune et al. 2010; Kia and Hassanzadeh 2019; Al-Karaki and Kamal 2004). These sensor nodes are connected wirelessly, organize themselves, and cooperate to perform application-oriented tasks (Priyadarshi et al. 2010, 2017, 2020; Rawat and Chauhan 2018a, b). Broadly, a WSN comprises three subsystems: sensing, processing, and communication (Panag and Dhillon 2018; Hani and Ijjeh 2013; Singh et al. 2017; Aderohunmu and Deng 2009; Priyadarshi et al. 2018). In the sensing subsystem, sensor nodes acquire data by sensing physical parameters such as pressure, temperature, and humidity. An analog-to-digital converter (ADC) converts the sensed analog signal into a digital signal and passes it to the processing subsystem. In addition, the sensing subsystem can perform amplification and filtering, depending on the type of sensors and the data acquisition strategy used (Zaki et al. 2014; Huang et al. 2018; Jana 2015; Osamy et al. 2018; Zhao et al. 2018; Azad and Sharma 2013). The processing subsystem turns the raw data into useful information. The communication subsystem sends the processed data to a high-end processing center called the base station (BS) (Priyadarshi and Gupta 2019; Priyadarshi et al. 2019; Kumar et al. 2020; Soni et al. 2020; Priyadarshi et al. 2019). The BS may be linked to the end user via a wired connection or the Internet (Li and Li 2018; Darabkh et al. 2018; Muthukumaran et al. 2018; Hu et al. 2008; Singh and Sharma 2015; Dietrich et al. 2009). Depending on the application requirements, nodes may or may not be equipped with a mobilizer (for mobile networks) and a location-finding system such as GPS (Global Positioning System).
Clustering can be applied in a WSN to boost network performance (Rawat et al. 2020; Priyadarshi and Bhardwaj 2020; Priyadarshi et al. 2018, 2019, 2020; Sharma et al. 2020). The nodes are organized into clusters, and each cluster is assigned a leader called the cluster head (CH) (Goyal and Tripathy 2012; Rawat et al. 2020; Priyadarshi and Bhardwaj 2020; Priyadarshi et al. 2018, 2019, 2020; Sharma et al. 2020; Batra and Kant 2016; Mosavvar and Ghaffari 2019; Karray et al. 2014; Hadi 2019). The CH handles all cluster-related activities, such as data collection and data forwarding (Priyadarshi et al. 2018a, b, 2019). Depending on the application, the nodes in the network may have equal or different energy (Batra and Kant 2016; Singh and Verma 2017; Zhang and Chen 2019; Smaragdakis et al. 2004; Dhand and Tyagi 2016; Anurag et al. 2020).

In this paper, a three-level heterogeneous clustering scheme is proposed. The heterogeneous network assigns different energy levels to different groups of nodes. The proposed protocol selects the CH from among the heterogeneous nodes using parameters such as energy (average and residual), round number, and CH probability. The proposed technique prolongs the network lifespan by selecting the most eligible node as CH using the threshold and other energy criteria (Fig. 1).

Fig. 1 Clustering in WSN

2 Routing issues in WSN

Routing techniques help improve the performance of a WSN. Routing involves several issues that must be handled to achieve this improvement (Ali et al. 2019). Some of the issues related to routing in WSN are described below.

2.1 Resource constraints

Sensor nodes are battery powered, so battery capacity is a critical resource of a WSN. In addition, sensor nodes have limited memory and processing capabilities. These constraints impose many challenges on WSN protocol design. In particular, routing algorithms should be energy efficient to prolong the network lifetime.

2.2 Scalability

The network size and node density depend on the application's requirements. Network performance should not degrade significantly when the node count or network size changes. Routing algorithms designed for WSN should therefore be scalable.

2.3 Fault tolerance

In a WSN, the topology changes frequently because of battery drainage, node failures caused by environmental conditions, hardware failures, and the addition or removal of sensor nodes. Fault tolerance is the ability of the network to function properly in the presence of faulty nodes. The required level of fault tolerance varies with the application. Therefore, fault tolerance should be addressed during protocol design.

2.4 Joint signal processing

Data sensed by the sensor nodes may be redundant. Joint signal processing techniques such as data aggregation, data fusion, and beamforming help reduce the amount of data communicated to the BS. This saves energy and can also convert the data into meaningful information. Routing protocols that use joint signal processing increase the network lifetime considerably.

2.5 Security

For some applications, such as enemy tracking and security surveillance, a secure connection is a significant concern. In these applications, sensor nodes may be physically damaged, or a malicious node may be placed in the network to gain unauthorized access to data. Because of resource constraints, existing security techniques create overhead and hamper network performance. Therefore, routing protocols should be secure, lightweight, and energy efficient.

2.6 Localization and coverage

Certain applications require sensor nodes to be installed in hostile and unattended environments. For reliable delivery of data to the BS, nodes should be connected and accurately localized by the Global Positioning System (GPS) or by some other location-finding method. Several routing techniques assume that nodes are equipped with a GPS device or another location-finding technique.

2.7 QoS

The interpretation of quality of service (QoS) for WSNs is highly application dependent. QoS requirements can be seen from two perspectives: application specific and network specific. Application-specific QoS parameters are observation accuracy, coverage, number of alive nodes, and exposure. Network-specific QoS parameters depend on the data delivery model, i.e., query-driven, event-driven, or continuous delivery. QoS support in WSN faces many challenges, such as resource constraints, redundant data, heterogeneity, and scalability. Since QoS support is critical for WSNs, protocol design should also consider QoS control and assurance mechanisms (Fig. 2).

Fig. 2 Routing issues in WSN

3 Related work

In Gnanambigai et al. (2013), the sensing area is divided into quadrants to perform the clustering. Clusters are created in the four quadrants, and the CHs of the different clusters communicate by exchanging request packets. This approach results in increased delay and congestion. The CH role is not rotated in each round; the CH is changed only when its energy falls below a defined level. The quadrants are partitioned using the locations of the nodes.

In Heinzelman et al. (2000), the clustering strategy is divided into two stages. In the first stage, called the setup stage, the nodes are divided into clusters and a cluster head (CH) is nominated in each cluster. The CH collects information from the nodes of its cluster and aggregates it for transmission to the BS. In the next stage, i.e., the steady stage, the information gathered by the CH is transferred to the base station (BS). This approach does not take node energy into account during CH selection and uses a random selection policy for clustering, which makes it less efficient in terms of energy utilization.

In Smaragdakis et al. (2004), the authors proposed a heterogeneous clustering approach for electing the CH. The nodes are separated into categories on the basis of their energies, and some nodes in the network are given additional energy. These extra-energy nodes help increase the life of the network. The approach uses a probability model for nominating the CH. However, it does not consider the energy of the nodes when executing the clustering phases, which results in a reduced network lifespan.

In Qing et al. (2006), CH selection is performed using the energies (average and initial) of the nodes to enhance network performance. This approach uses a heterogeneous network with different types of nodes having dissimilar energies. The CH selection process in this protocol uses probability and node energy. The nodes produce redundant data while sensing the environment, which wastes network energy. The CH selected by the probability method acts as an aggregator and aggregates the gathered data to reduce this redundancy.

In Salim et al. (2014), the protocol works on reducing the energy gap between the CH and the cluster nodes. The network tasks are distributed among all the nodes and the CH. The uniform distribution of energy among the sensors increases network performance. Scalability issues and message overhead make this protocol less efficient.

In Lei et al. (2008), each CH selects other CHs as relays to communicate the data. The criteria for relay selection are distance to the BS and energy. This approach works better for large sensing areas. Because relays are chosen on the basis of energy and distance, they help distribute energy consumption uniformly.

To address intra-cluster communication in large networks, the concept of the far zone was introduced in Katiyar et al. (2011). The role of zone leader is assigned to higher-energy sensors. The approach addresses the problem of uneven cluster size, but it faces scalability and complexity issues while performing network actions.

4 Energy model of WSN

The energy model used for the proposed approach describes the energy spent on communicating data, i.e., transmission and reception. The model is divided into two parts, one for sending data and the other for receiving data. The sender and the receiver are separated by a distance d (Priyadarshi et al. 2019; Smaragdakis et al. 2004; Katiyar et al. 2011). The equations for the transmission and reception of data are given as:

$$ E_{trans} \left( {k,d} \right) = \left\{ {\begin{array}{*{20}l} {kE_{elec} + k\varepsilon_{efs} d^{2} ,\quad d < d_{o} } \hfill \\ {kE_{elec} + k\varepsilon_{amp} d^{4} ,\quad d \ge d_{o} } \hfill \\ \end{array} } \right. $$
(1)
$$ E_{receive} \left( k \right) = kE_{elec} $$
(2)
$$ d_{o} = \sqrt {\frac{{\varepsilon_{efs} }}{{\varepsilon_{amp} }}} $$
(3)

Table 1 lists the symbols used in the energy model of the proposed protocol along with their meanings (Fig. 3).

Table 1 Parameters of energy model
Fig. 3 Energy model of WSN
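As a minimal sketch, Eqs. (1) to (3) can be written in Python as follows. The numeric constants are values commonly used with this first-order radio model and are assumptions here, since the entries of Table 1 are not reproduced in this text. The function names are illustrative only.

```python
import math

# Assumed illustrative constants (not taken from Table 1 of this paper).
E_ELEC = 50e-9        # J/bit, electronics energy per bit
EPS_EFS = 10e-12      # J/bit/m^2, free-space amplifier coefficient
EPS_AMP = 0.0013e-12  # J/bit/m^4, multipath amplifier coefficient


def crossover_distance(eps_efs=EPS_EFS, eps_amp=EPS_AMP):
    """Eq. (3): threshold distance d_o = sqrt(eps_efs / eps_amp)."""
    return math.sqrt(eps_efs / eps_amp)


def transmit_energy(k, d, e_elec=E_ELEC, eps_efs=EPS_EFS, eps_amp=EPS_AMP):
    """Eq. (1): energy in joules to transmit k bits over distance d metres."""
    if d < crossover_distance(eps_efs, eps_amp):
        return k * e_elec + k * eps_efs * d ** 2   # free-space branch
    return k * e_elec + k * eps_amp * d ** 4       # multipath branch


def receive_energy(k, e_elec=E_ELEC):
    """Eq. (2): energy in joules to receive k bits."""
    return k * e_elec


if __name__ == "__main__":
    print("d_o =", crossover_distance())
    print("send 4000 bits over 50 m:", transmit_energy(4000, 50))
    print("receive 4000 bits:", receive_energy(4000))
```

With these assumed constants the crossover distance is a little under 88 m, so a 50 m link falls in the free-space branch of Eq. (1) and a 100 m link in the multipath branch.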

5 Proposed protocol

Clustering in WSN has a substantial effect on extending the network lifespan. The nodes are divided into groups, which makes the collection and transmission of data efficient. The sensors in each cluster sense information in the working area and forward it to the CH. The CH transports the gathered data to the BS. CH selection therefore plays a crucial part in determining network performance. In this paper, a clustering methodology based on three-level heterogeneity is proposed. The sensors are classified into three node sets (normal, middle, and super) according to their energies. CH selection in the proposed protocol relies on the three-level heterogeneity model and the energy of the heterogeneous nodes. The clustering procedure begins with the generation of a random number in the range [0, 1]. The random value is compared with the threshold for CH selection. Nodes that satisfy the threshold condition take the CH role. The threshold is applied at all levels of heterogeneity. The three levels improve network efficiency and help the network survive for a longer time. Figure 4 shows the steps followed by the proposed scheme.

Fig. 4 Flowchart of proposed protocol

The basic threshold for the proposed scheme is given as (Tables 2, 3):

$$ {\text{T}}\left( n \right) = \left\{ {\begin{array}{*{20}l} {\frac{{P_{optim} }}{{1 - P_{optim} \left[ {\left( {r\bmod \frac{1}{{P_{optim} }}} \right)} \right]}},\quad if\,n \in G} \hfill \\ {0,\quad otherwise} \hfill \\ \end{array} } \right. $$
(4)
$$ {\text{Normal}}\;{\text{nodes}}\;{\text{energy}}\quad E_{norma} = E_{o} $$
(5)
$$ {\text{Super}}\;{\text{nodes}}\;{\text{energy}}\quad E_{sup} = E_{o} \left( {1 + x} \right) $$
(6)
$$ {\text{Middle}}\;{\text{nodes}}\;{\text{energy}}\quad E_{mid} = E_{o} \left( {1 + w} \right) $$
(7)
Table 2 Symbols and their meaning of Eq. 4
Table 3 Symbols and their meaning of Eqs. 5, 6 and 7

The extra energy of the heterogeneous nodes raises the total energy of the network. The proposed approach considers the energy of the nodes in CH selection. The threshold function used in the proposed method takes into account the round number, the node energy, and the CH selection probability when choosing the CH among the different node types.
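The basic threshold of Eq. (4) and the initial energies of Eqs. (5) to (7) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: the function names are hypothetical, `in_G` stands for the condition n ∈ G, and 1/P is rounded to an integer epoch length so that the modulo operates on whole rounds.

```python
def basic_threshold(p_optim, r, in_G):
    """Eq. (4): T(n) for round r; zero when the node is not in the set G."""
    if not in_G:
        return 0.0
    epoch = round(1 / p_optim)  # assumed: 1/P treated as a whole number of rounds
    return p_optim / (1 - p_optim * (r % epoch))


def initial_energy(kind, e_o, w, x):
    """Eqs. (5)-(7): initial energy of a normal, middle, or super node."""
    factors = {"normal": 1.0, "middle": 1.0 + w, "super": 1.0 + x}
    return e_o * factors[kind]


if __name__ == "__main__":
    # Threshold grows within an epoch and resets when r is a multiple of 1/P.
    for r in (0, 5, 9, 10):
        print(r, basic_threshold(0.1, r, True))
    # Eq. (11) holds whenever x > w > 0.
    print([initial_energy(k, 0.5, 1.0, 2.0) for k in ("normal", "middle", "super")])
```

The threshold rises toward 1 as the epoch progresses, so nodes remaining in G become increasingly likely to be selected before the epoch resets.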

The CH selection probabilities of the three types of heterogeneous nodes (normal, middle, and super) are given as (Table 4):

$$ P_{nml} = \frac{{P_{optim} }}{{\left( {1 + a*w + x*t} \right)}} $$
(8)
$$ P_{mid} = \frac{{P_{optim} }}{{\left( {1 + a*w + x*t} \right)}}*\left( {1 + w} \right) $$
(9)
$$ P_{sup} = \frac{{P_{optim} }}{{\left( {1 + a*w + x*t} \right)}}*\left( {1 + x} \right) $$
(10)
Table 4 Symbols and their meaning of Eqs. 8, 9, and 10
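Equations (8) to (10) can be sketched as a small Python helper. Since Table 4 is not reproduced in this text, the sketch assumes that `a` and `t` denote the fractions of middle and super nodes and that `w` and `x` are the extra-energy factors of Eqs. (6) and (7). The function name is hypothetical.

```python
def ch_probabilities(p_optim, a, w, t, x):
    """Eqs. (8)-(10): weighted CH probabilities for the three node types.

    Assumed meanings: a, t are the fractions of middle and super nodes;
    w, x are their extra-energy factors.
    """
    denom = 1 + a * w + x * t
    return {
        "normal": p_optim / denom,
        "middle": p_optim * (1 + w) / denom,
        "super": p_optim * (1 + x) / denom,
    }


if __name__ == "__main__":
    p = ch_probabilities(p_optim=0.1, a=0.2, w=1.0, t=0.1, x=2.0)
    print(p)
    # Under the assumed meanings, the population-weighted mean equals P_optim.
    mean = 0.7 * p["normal"] + 0.2 * p["middle"] + 0.1 * p["super"]
    print(mean)
```

Under that reading of the symbols, the weighting keeps the expected number of CHs per round unchanged while giving higher-energy nodes a proportionally larger chance of selection.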

The energy of a super node is higher than that of the middle and normal nodes. Because of the heterogeneous character of the nodes, the initial energy assigned to the nodes differs.

$$ E_{sup} > E_{mid} > E_{norma} $$
(11)

\( E_{sup} \), \( E_{mid} \), and \( E_{norma} \) are the energies of super, middle, and normal nodes, respectively. The clustering phase of the proposed protocol begins with the generation of a random value, which is then compared with the threshold.

$$ rand\left( n \right) \le T\left( n \right) $$
(12)

The threshold for the three-level heterogeneous nodes is represented as (Table 5):

$$ T\left( n \right) = T\left( n \right)*\frac{r}{{E_{average} }}*\frac{1}{P}*E_{remaining} $$
(13)
Table 5 Symbols and their meaning of Eqs. 14, 15, and 16

The threshold function (normal node) is calculated as:

$$ {\text{T}}\left( {norma} \right) = \left\{ {\begin{array}{*{20}l} {\frac{{P_{nml} }}{{1 - P_{nml} \left[ {\left( {r\bmod \frac{1}{{P_{nml} }}} \right)} \right]}}*\frac{r}{{E_{average} }}*\frac{1}{{P_{nml} }}*E_{remaining} ,\quad if\;n \in G} \hfill \\ {0,\quad otherwise} \hfill \\ \end{array} } \right. $$
(14)

The threshold function (middle node) is calculated as:

$$ {\text{T}}\left( {mid} \right) = \left\{ {\begin{array}{*{20}l} {\frac{{P_{mid} }}{{1 - P_{mid} \left[ {\left( {r\bmod \frac{1}{{P_{mid} }}} \right)} \right]}}*\frac{r}{{E_{average} }}*\frac{1}{{P_{mid} }}*E_{remaining} ,\quad if\;n \in G} \hfill \\ {0,\quad otherwise} \hfill \\ \end{array} } \right. $$
(15)

The threshold function (super node) is calculated as:

$$ {\text{T}}\left( {sup} \right) = \left\{ {\begin{array}{*{20}l} {\frac{{P_{sup} }}{{1 - P_{sup} \left[ {\left( {r\bmod \frac{1}{{P_{sup} }}} \right)} \right]}}*\frac{r}{{E_{average} }}*\frac{1}{{P_{sup} }}*E_{remaining} ,\quad if\,n \in G} \hfill \\ {0,\quad otherwise} \hfill \\ \end{array} } \right. $$
(16)

If a node does not satisfy the threshold criterion in a round, the next node is given the opportunity, and the same procedure is repeated for that node.
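The per-level thresholds of Eqs. (14) to (16) and the comparison of Eq. (12) can be sketched in Python as a literal reading of the formulas. Several points are assumptions, because Table 5 is not reproduced in this text: the average energy is taken over the alive nodes, 1/P is rounded to an integer epoch length, and the node records (dictionaries with `id`, `p`, `energy`, and `in_G` keys) and function names are hypothetical. The threshold is not clamped, so it can exceed 1, in which case the node is always selected.

```python
import random


def level_threshold(p_level, r, e_average, e_remaining, in_G=True):
    """Eqs. (14)-(16): threshold for a node whose CH probability is p_level."""
    if not in_G:
        return 0.0
    epoch = round(1 / p_level)  # assumed integer epoch length
    base = p_level / (1 - p_level * (r % epoch))
    return base * (r / e_average) * (1 / p_level) * e_remaining


def elect_cluster_heads(nodes, r, rng=random):
    """Eq. (12): a node becomes CH when its random draw is <= its threshold."""
    alive = [n for n in nodes if n["energy"] > 0]
    if not alive:
        return []
    # Assumed: E_average is the mean residual energy of the alive nodes.
    e_average = sum(n["energy"] for n in alive) / len(alive)
    heads = []
    for n in alive:  # nodes are examined one after another
        t = level_threshold(n["p"], r, e_average, n["energy"], n["in_G"])
        if rng.random() <= t:
            heads.append(n["id"])
    return heads


if __name__ == "__main__":
    demo = [
        {"id": 0, "p": 0.07, "energy": 0.5, "in_G": True},   # normal
        {"id": 1, "p": 0.14, "energy": 1.0, "in_G": True},   # middle
        {"id": 2, "p": 0.21, "energy": 1.5, "in_G": True},   # super
    ]
    print(elect_cluster_heads(demo, r=3, rng=random.Random(1)))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing protocol variants.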


6 Simulation and results

The proposed protocol is compared with existing approaches to analyze its performance in terms of several parameters (network life, stability period, and CH count). The proposed approach is simulated in MATLAB. The parameters used for the simulation are shown in Table 6.

Table 6 Simulation parameters

Figure 5 shows the number of surviving nodes in different rounds. The energy of the network is depleted as the various network activities are performed. The proposed approach uses the energy levels of the nodes and the CH probability in the clustering phases to help the network last for more rounds. The network survived for 3073, 3134, 3475, and 5743 rounds in LEACH, SEP, DEEC, and the proposed protocol, respectively. The graph demonstrates the efficacy of the proposed protocol in improving the network life.

Fig. 5 Network lifetime

Figure 6 shows the number of dead nodes in different rounds to illustrate the stability of the network. The proposed protocol takes advantage of the node energy and CH probability to extend the stability region of the network. The stability region is maintained for 1037, 1278, 1820, and 2746 rounds in LEACH, SEP, DEEC, and the proposed protocol, respectively. The results demonstrate the competence of the proposed protocol in making the network more stable and efficient.

Fig. 6 Dead nodes
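The round counts reported for Figs. 5 and 6 can be turned into relative gains with a few lines of Python. The sketch below uses only the figures quoted in the text; the helper name `relative_gain` is illustrative.

```python
def relative_gain(baseline, proposed):
    """Percentage increase of the proposed protocol over a baseline."""
    return 100.0 * (proposed - baseline) / baseline


# Round counts quoted in the text for network lifetime and stability region.
LIFETIME = {"LEACH": 3073, "SEP": 3134, "DEEC": 3475}
STABILITY = {"LEACH": 1037, "SEP": 1278, "DEEC": 1820}
PROPOSED_LIFETIME = 5743
PROPOSED_STABILITY = 2746

if __name__ == "__main__":
    for name, rounds in LIFETIME.items():
        print(f"lifetime gain over {name}: "
              f"{relative_gain(rounds, PROPOSED_LIFETIME):.1f}%")
    for name, rounds in STABILITY.items():
        print(f"stability gain over {name}: "
              f"{relative_gain(rounds, PROPOSED_STABILITY):.1f}%")
```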

Figure 7 shows the number of CHs formed in different rounds. The number of CHs generated by the proposed scheme is higher than in the other existing approaches. Selecting the CH using the energy parameters makes the CH count uniform across the network. The nodes are given an equal opportunity to compete for the CH role, and each node's chance of winning the CH election depends on its energy and probability values.

Fig. 7 CH count

Figure 8 compares the performance of the different protocols with the proposed approach. The figure examines the status of the nodes in different rounds. It shows, for each protocol, the round numbers at which the first, half, and last node (fnd, hnd, and lnd) lose all their energy and die. The proposed technique achieves better results in each case, which demonstrates its efficacy in improving network performance.

Fig. 8 Network performance
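The fnd, hnd, and lnd metrics can be extracted from a per-round count of alive nodes, as in the following Python sketch. The input format (a list with one alive-node count per round, starting at round 1) and the function name are assumptions for illustration, and hnd is taken as the first round in which at least half of the nodes are dead.

```python
def lifetime_milestones(alive_per_round, total_nodes):
    """Return (fnd, hnd, lnd): rounds at which the first node, half of the
    nodes, and the last node are dead. A value is None if never reached."""
    fnd = hnd = lnd = None
    for rnd, alive in enumerate(alive_per_round, start=1):
        dead = total_nodes - alive
        if fnd is None and dead >= 1:
            fnd = rnd
        if hnd is None and dead >= total_nodes / 2:
            hnd = rnd
        if dead >= total_nodes:
            lnd = rnd
            break  # no nodes left, so later rounds add nothing
    return fnd, hnd, lnd


if __name__ == "__main__":
    # Toy trace for a 4-node network (made-up numbers, not simulation output).
    print(lifetime_milestones([4, 4, 3, 2, 2, 0], total_nodes=4))
```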

7 Conclusion

The proposed protocol performs clustering under a three-level heterogeneity model. The nodes are categorized into three types according to their energy levels, and the different node energies help extend the lifespan of the network. Nodes of all types are given the chance to become the CH. The proposed protocol focuses primarily on efficient CH selection to improve system performance. It uses probability-dependent thresholds and the energy of the sensors for CH selection. The three-level heterogeneity model applied to clustering helps the network stay alive for longer, which improves the overall performance of the network.