1 Introduction

A wireless sensor network (WSN) is a finite set of sensor nodes distributed over an environment. Over the past decade, WSNs have been used in applications such as medical tracking, environmental sensing, military operations, security surveillance, underground mining, image tracking and many other areas [1]. A WSN consists of a large number of low-power, low-cost sensor nodes. Each sensor node is made up of four essential components: a processing unit, a sensing unit, a storage unit, and a transceiver unit. Sensor nodes are deployed randomly in the field or in places that are hard to reach, such as hilly terrain, dense forests and disaster areas [2]. Data are sent from the sensor nodes to the base station (BS) using single-hop or multi-hop routing. One of the critical challenges is that the sensor nodes are deployed over an extensive range, so sharing information between the sensor nodes and the sink node is costly in terms of energy consumption. Research has shown that clustering methods save node energy and thereby extend the network lifetime of WSNs [3].

In the clustering method, sensor nodes are consolidated into groups called clusters. Every cluster has a coordinator, referred to as the cluster head (CH), and the remaining sensor nodes in the cluster act as cluster members (CMs) [4]. Each CH gathers information from the sensor nodes of its own cluster. After gathering, each CH sends the information to the sink node using single-hop or multi-hop routing [5]. A well-known technique used in clustering protocols is data aggregation, which removes duplicate information and thereby reduces energy consumption. The objective of the clustering method is to minimize energy consumption and extend the network lifetime of the sensor network. Another challenging issue in WSNs is recharging the sensor node batteries, because the sensor nodes are often deployed in remote places [6].

Researchers have addressed the finite lifetime of the sensor node with a technique called energy harvesting (EH). EH is an alternative solution that recharges the node battery from ambient sources such as solar, wind, mechanical and thermal energy [7, 8]. This novel category of networks, EH-WSNs, can in principle supply a sensor node with energy indefinitely. Predicting the ambient resources is a significant issue for EH-WSNs, and prediction models fall into two types [9]. The first type predicts from previously observed information; the second is based on weather forecasting, which relates several weather metrics to different solar intensities [10, 11]. Furthermore, the design of EH-WSNs is very different from that of conventional WSNs. EH-WSNs are more useful and economical in terms of network lifetime. Another important issue in EH-WSNs is energy management, because EH-WSNs differ from traditional network systems in how energy is stored [12].

A unique characteristic of EH-WSNs is the energy storage class, which collects energy from external sources. Each sensor node has a rechargeable battery, which stores energy through a chemical reaction, and a capacitor, so that the stored energy prolongs the life of the network. However, the energy stored by a sensor node depends on the geographic area, for example on the solar elevation angle, the sunshine duration and other related weather conditions [9]. Fig. 1 shows the basic architecture of a clustering protocol.

Fig. 1 Basic architecture of the clustering protocol in WSNs

Therefore, it is essential to design an effective clustering-based routing algorithm for EH-WSNs that takes the above considerations into account, reduces energy consumption and maximizes network lifetime.

The proposed algorithm, NEHCP, achieves a longer network lifetime than existing work, namely SEED (sleep awake energy-efficient distributed), hybrid unequal clustering layering protocol (HUCL) and CEEC (centralized energy-efficient cluster). The experiments with NEHCP show that nodes stay alive longer, more packets are sent to the BS, and the battery lifetime is extended by the harvesting system. The main contributions of this paper can be summarized as follows:

  • We propose a new clustering protocol, called NEHCP, for EH-WSNs.

  • The proposed work is divided into three parts: the initial phase, the set-up phase, and the data transmission phase.

  • The CH is selected using the maximum residual energy and EH-rate for the current round.

  • After the CH is selected, data are sent to the BS using single-hop or multi-hop routing.

  • The proposed method gives better performance in terms of network lifetime for EH-WSNs.

The remainder of this paper is organized as follows. Sect. 2 reviews related work on clustering and routing. Sect. 3 presents the system model for the NEHCP method. Sect. 4 describes the proposed algorithm. Sect. 5 describes the experimental results and compares them with existing algorithms, and Sect. 6 concludes the paper.

2 Related work

Several clustering protocols have been proposed for WSNs. One of the best-known is Low Energy Adaptive Clustering Hierarchy (LEACH) [13]. LEACH is divided into two phases, the set-up phase and the steady phase. The set-up phase covers cluster formation and CH selection, which is based on the residual energy of the sensor nodes and uses a probability function. Stable Election Protocol (SEP) is a heterogeneous clustering protocol that extends LEACH [14]. In SEP, sensor nodes are divided into two types, normal nodes and advanced nodes. The authors of [15] introduced the hybrid unequal clustering layering protocol (HUCL), which combines static and dynamic clustering. The centralized energy-efficient cluster (CEEC) method targets a heterogeneous network [16]. The whole network is divided into three regions: low energy, medium energy, and high energy.

In [17], the authors proposed an energy-aware routing algorithm (ERA) based on an energy-aware clustering method. ERA is divided into two parts, cluster formation and routing. In [18], the sleep awake energy-efficient distributed (SEED) technique was introduced for large sensor network systems. SEED uses a heterogeneous network model. The authors of [19] developed the energy-aware unequal clustering (EAUCF) protocol to reduce hot-spot problems in the network. The authors of [20] proposed distributed energy-efficient heterogeneous clustering (DEEHC) to increase the quality of service of the sensor network. The multi-objective fuzzy clustering algorithm (MOFCA) was shown to reduce the hot-spot problem and the energy-hole problem [21].

Distributed unequal clustering with the fuzzy method (DUCF) is an efficient unequal clustering protocol [22]. It balances energy distribution in the sensor network, and the CH is selected using a fuzzy approach. The authors of [23] applied an unequal clustering mechanism (UCM) with the multi-objective immune algorithm (MOIA). This model adjusts intra-cluster and inter-cluster energy consumption in the sensor network. In [24], flow-balanced routing (FBR) was proposed to increase the power efficiency of the network while preserving coverage. FBR uses multi-hop routing to improve network lifetime. The discrete particle swarm optimization (DPSO) protocol was used for EH-WSNs [25]. This model is based on an EH centralized clustering protocol and increases the network life of the sensor nodes using EWMA as the prediction model.

The authors of [26] proposed the novel energy-efficient clustering (NEEC) protocol, a hybrid methodology that combines distributed and centralized approaches. The NEEC algorithm is divided into three parts: the handling phase, the set-up phase and the data transmission phase. The authors of [27] applied a dynamic clustering algorithm to EH-WSNs. This protocol uses relay nodes to support the CH, which is beneficial because sending the aggregated data to the BS consumes more energy. In [28], the energy neutral clustering (ENC) protocol was presented, which is based on distributed clustering. ENC introduces a novel cluster head group (CHG) technique. The clusters are of equal size in this method. The CHG technique helps reduce the number of cluster re-formations. The authors of [29] presented the balanced power-aware clustering and routing protocol (BPA-CRP) to increase energy efficiency among the sensor nodes.

3 System model

Here, we state the assumptions of the system model for EH-WSNs on which the proposed algorithm relies. The model is divided into three parts: the EH-WSNs model, the radio energy model and the energy harvesting model. The notations and their meanings are shown in Table 1.

Table 1 Notations

3.1 EH-WSNs model

The following assumptions are made for the proposed algorithm in EH-WSNs:

  • A set of N sensor nodes is deployed in a large two-dimensional \(A\times A\) area.

  • The sensor nodes are deployed in a random manner.

  • The network is homogeneous with respect to the initial energy \(E_o\).

  • All nodes and the BS are stationary.

  • Data transmission from CM to CH and from CH to the sink node uses TDMA and CSMA.

  • Each sensor node has a harvesting device module.

  • Each sensor node has the same amount of energy \(E_0\) at the time of deployment.

  • All sensor nodes can harvest energy from an ambient resource; here, a solar device is assumed.

  • The EH-rate of every sensor node is different and lies in \([0,U_{max}]\).

  • The energy consumption rate is greater than the EH-rate in most rounds.

  • All operations of the EH-WSN are divided into rounds r.

In the EH system, energy prediction is a significant issue because ambient resources are uncontrollable, so energy management is needed in EH-WSNs from the outset. A well-known technique for predicting the uncontrolled energy resource is the Exponentially Weighted Moving Average (EWMA). The EWMA technique divides each day into time slots of identical duration, as shown in Fig. 2. The model relies on the daily pattern of sunlight: the diurnal graph of Kansal [9] shows the EH-rate collected from sunlight. The EWMA prediction model keeps a historical summary of previous days under different seasonal conditions. Eq. (1) combines the harvested energy (H) and the estimated energy (E) using a weighting factor \(0<\alpha <1\). When \(\alpha\) is high, the most recently harvested energy receives a low weight, and vice versa. In this equation, d denotes the current day and n the time slot.

$$\begin{aligned} E(d, n)=\alpha E(d-1, n)+(1-\alpha )H(d-1, n) \end{aligned}$$
(1)
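The update in Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `ewma_predict`, the four-slot day and the energy values are hypothetical.

```python
def ewma_predict(prev_estimate, prev_harvest, alpha=0.5):
    """Eq. (1): E(d, n) = alpha * E(d-1, n) + (1 - alpha) * H(d-1, n) for every slot n."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if len(prev_estimate) != len(prev_harvest):
        raise ValueError("both days must have the same number of slots")
    return [alpha * e + (1.0 - alpha) * h
            for e, h in zip(prev_estimate, prev_harvest)]


# Hypothetical data: four slots per day, energy in joules.
estimate_yesterday = [0.0, 2.0, 4.0, 1.0]   # E(d-1, n)
harvest_yesterday = [0.0, 4.0, 2.0, 1.0]    # H(d-1, n)
estimate_today = ewma_predict(estimate_yesterday, harvest_yesterday, alpha=0.5)
```

With a larger `alpha` the estimate for today follows the older estimate more closely and reacts more slowly to yesterday's measured harvest.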
Fig. 2 Diurnal graph of the solar harvesting rate [9]

3.2 Radio energy model

The radio energy model of the WSN is the same as in LEACH and has two channel models, free-space and multi-path fading. Both are used to calculate the energy dissipated from the battery when sending or receiving packets in the sensor network. The energy consumed depends on two factors. The first is the medium in which the sensor nodes are deployed, such as hilly, foggy or clear terrain, and the second is the distance between the nodes. If the distance between nodes is less than the threshold value \(d_o\), the free-space model applies; otherwise, the multi-path model applies.

$$\begin{aligned} d_o&= {} \sqrt{\frac{\epsilon _{fs}}{\epsilon _{mp}}} \end{aligned}$$
(2)
$$\begin{aligned} E_{T_x}&= {} \left\{ \begin{matrix} lE_{elec}+l\epsilon _{fs}\times d^2, &{} if &{} d<d_o\\ lE_{elec}+l\epsilon _{mp}\times d^4, &{} if &{} d\ge d_o \end{matrix}\right. \end{aligned}$$
(3)

Eq. (2) gives the distance threshold; the value of \(d_o\) is 87 m, the same as in the LEACH model. In Eq. (3), l is the packet length in bits and \(E_{elec}\) is the energy consumed per bit by the electronic circuitry of the sensor node, which covers amplification, filtering and modulation. The parameters \(\epsilon _{fs}\) and \(\epsilon _{mp}\) are the amplifier coefficients of the free-space and multi-path models respectively; which one applies depends on the distance between the sender and the receiver node. Eq. (3) has two cases. The first gives the transmission energy within a cluster, that is, from CM to CH, and the second gives the energy dissipated from the CH to the sink node. The radio energy consumed to receive a packet of l bits is given by Eq. (4):

$$\begin{aligned} E_{R_x}=l \times E_{elec} \end{aligned}$$
(4)

Eq. (5) gives the energy for aggregating a packet of l bits.

$$\begin{aligned} E_{agg}=l \times E_{DA} \end{aligned}$$
(5)

where \(E_{DA}\) is the data aggregation energy per bit.
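Eqs. (2) to (5) translate directly into code. The sketch below is illustrative only: the numerical constants are commonly used LEACH-style values and are assumptions, since the text above states only the resulting threshold of about 87 m.

```python
import math

# Assumed LEACH-style constants (not stated in the text above).
E_ELEC = 50e-9        # J/bit, electronics energy
EPS_FS = 10e-12       # J/bit/m^2, free-space amplifier coefficient
EPS_MP = 0.0013e-12   # J/bit/m^4, multi-path amplifier coefficient
E_DA = 5e-9           # J/bit, data aggregation energy

D_O = math.sqrt(EPS_FS / EPS_MP)  # Eq. (2), distance threshold in metres


def tx_energy(l_bits, d):
    """Eq. (3): energy to transmit l_bits over distance d."""
    if d < D_O:
        return l_bits * E_ELEC + l_bits * EPS_FS * d ** 2   # free-space case
    return l_bits * E_ELEC + l_bits * EPS_MP * d ** 4       # multi-path case


def rx_energy(l_bits):
    """Eq. (4): energy to receive l_bits."""
    return l_bits * E_ELEC


def agg_energy(l_bits):
    """Eq. (5): energy to aggregate l_bits."""
    return l_bits * E_DA
```

With these assumed constants, `D_O` evaluates to roughly 87.7 m, which agrees with the 87 m threshold quoted for the LEACH model.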

3.3 Energy harvesting model

The most important technique used in EH-WSNs is the EH model, which is derived from the EWMA prediction model. The model determines the actual amount of energy that the sensor nodes gather from the solar panel and add to their batteries. The energy profile of a sensor node depends on the diurnal pattern of successive days. The energy of the ith node at the beginning of the rth round, \(E(i,r)\), is written as:

$$\begin{aligned} E(i,r)&= {} E_{resi}(i,r)+E_{har}(i,r-1) \end{aligned}$$
(6)
$$\begin{aligned} E_{har}(i,r-1)&= {} \eta _i\varDelta _t \end{aligned}$$
(7)
$$\begin{aligned} \eta _i&= {} rand(P_{h,min}(r-1),P_{h,max}(r-1)) \end{aligned}$$
(8)

where \(E_{resi}(i,r)\) is the residual energy of the ith sensor node at the start of the rth round (without taking EH into consideration) and \(E_{har}(i, r-1)\) is the energy it harvested during the \((r-1)\)th round. Further, \(\eta _i\) is the EH-rate of the ith sensor node during the \((r-1)\)th round, a random variable that follows a uniform distribution between \(P_{h, min}(r-1)\) and \(P_{h, max}(r-1)\). The EH-rate \(\eta _i\) depends on the weather conditions: if it is sunny, the EH-rate is high, which in turn affects the network lifetime of the sensor nodes. \(\varDelta _t\) is the duration of the round. \(P_{h,min} (r-1)\) and \(P_{h,max} (r-1)\) are the lowest and highest EH-rates of the sensor nodes during the \((r-1)\)th round respectively.
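A minimal sketch of Eqs. (6) to (8) follows. The function names, the harvesting bounds and the round duration are hypothetical values chosen for illustration; only the structure of the calculation comes from the equations.

```python
import random


def harvested_energy(p_min, p_max, delta_t, rng):
    """Eqs. (7)-(8): draw the EH-rate uniformly and multiply by the round duration."""
    eta = rng.uniform(p_min, p_max)   # Eq. (8), EH-rate in J/s
    return eta * delta_t              # Eq. (7), harvested energy in J


def energy_at_round_start(e_resi, e_har_prev):
    """Eq. (6): residual energy plus the energy harvested in the previous round."""
    return e_resi + e_har_prev


rng = random.Random(42)               # seeded only to make the sketch repeatable
e_har = harvested_energy(0.0, 0.002, delta_t=20.0, rng=rng)
e_now = energy_at_round_start(0.45, e_har)
```

Passing the random generator explicitly keeps the harvesting draw reproducible, which is convenient when comparing protocols over the same weather trace.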

4 Proposed work

In this section, we discuss the characteristics of the EH-WSN system and the relationship of the EH-rate among the sensor nodes over successive days. Fig. 3 shows the behaviour of the EH-rate in particular time slots of the day. It is divided into three parts: the EH components, the time slots of solar light intensity, and the deployed sensor nodes. Fig. 3a shows the EH components, namely the solar panel, the EH module and the storage class. The storage class combines a rechargeable battery and a super-capacitor, and is sometimes referred to as hybrid energy storage. Fig. 3b shows the solar light intensity over the day, which determines the harvesting rate of the corresponding deployed sensor nodes in Fig. 3c. Fig. 3c shows the EH-rate of the respective sensor nodes in successive time slots of the day; the charging rate depends entirely on the weather conditions.

Fig. 3 Basic components of the EH system

Here, each sensor node is connected to a small solar panel, followed by a low-power EH circuit and a rechargeable battery, so that the node is fully self-powered and the network lifetime is extended. The power density of the solar panel is \(15\,{\mathrm{{mW/cm}}}^2\), and the output voltage depends on the EH-rate. The consumed energy is calculated for a low-power WSN module that operates from 1.2 to 2.1 V. The following formula gives the energy consumption of an EH sensor node [30].

$$\begin{aligned} E_{EH-sensor}&= {} V_{cc} \times t_{active} \times I_{\alpha }+V_{cc}\times t_{operation}\times I_{\beta } \end{aligned}$$
(9)
$$\begin{aligned} t_{active}&= {} \frac{t_{operation}}{t_{sleep}}\times t_{duration} \end{aligned}$$
(10)

where \(V_{cc}\) is the supply voltage of the node, \(I_{\alpha }\) is the current in active mode and \(I_{\beta }\) is the current in sleep mode. Eq. (9) gives the energy consumption of the sensor module, and Eq. (10) gives the time spent in the active state. Eq. (11) is used to calculate the energy level \(E_{battery}\) of the battery of sensor node i in round r.

$$\begin{aligned} \begin{aligned} E_{battery}(i, r)&=E_{battery}(i, r-1)-E_{active}(i, r-1)\\&\quad -E_{sleep}(i, r-1)+E_{har}(i, r-1) \end{aligned} \end{aligned}$$
(11)

An essential consideration of the energy budget is that the energy consumption rate is greater than or equal to the EH-rate in most rounds r of the network. The harvesting system is very dynamic because it depends on the weather conditions, which is a fundamental issue for the NEHCP algorithm. Eq. (12) expresses the energy budget for the entire sensor network in terms of the energy consumption rate and the EH-rate over the complete number of rounds \(r_{n}\).

$$\begin{aligned} \sum _{i=1}^{n}E_{c}(i, r_{n})\ge \sum _{i=1}^{n}E_{har}(i, r_{n}-1) \end{aligned}$$
(12)

where \(E_{c}\) is the energy consumption rate and \(E_{har}(i, r_{n}-1)\) is the EH-rate of the sensor nodes. Eq. (11) is vital to the energy budget of the EH-WSN.
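The battery update of Eq. (11) and the budget condition of Eq. (12) can be checked numerically as in the sketch below. The per-node energy values are hypothetical and serve only to show how the two equations fit together.

```python
def battery_update(e_prev, e_active, e_sleep, e_har):
    """Eq. (11): battery level after one round of consumption and harvesting."""
    return e_prev - e_active - e_sleep + e_har


def budget_holds(consumed, harvested):
    """Eq. (12): network-wide consumption is at least the network-wide harvest."""
    return sum(consumed) >= sum(harvested)


# Hypothetical two-node network, energies in joules.
consumed = [0.021, 0.031]    # active plus sleep energy per node
harvested = [0.010, 0.000]   # energy harvested per node in the previous round
levels = [
    battery_update(0.5, 0.020, 0.001, 0.010),
    battery_update(0.5, 0.030, 0.001, 0.000),
]
```

In this example the budget condition holds, so both battery levels decrease over the round even though the first node harvested some energy.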

NEHCP is a hierarchical clustering and routing protocol designed to maximize energy efficiency in terms of energy consumption. Clustering-based routing methods can generally be classified into centralized and distributed protocols. In this paper, we propose a distributed algorithm. The algorithm is divided into three phases: the initial phase, the set-up phase and the data transmission phase. The entire process of the NEHCP method is shown in Fig. 4. In the following sections, we also derive several lemmas on the complexity of the algorithm, with theoretical proofs.

Fig. 4 Block diagram of the NEHCP algorithm

4.1 Initial phase

This phase is divided into three parts: information sharing, distance calculation, and radius calculation.

4.1.1 Information sharing

The information sharing part establishes the characteristics of the network by collecting information from the sensor nodes, which are deployed at different locations. Each sensor node broadcasts a message to every other sensor node, so that the nodes learn each other's locations and how they operate in the proposed method.

4.1.2 Distance calculation

After node deployment and information sharing, the BS broadcasts a message in the sensor network so that the distance from the BS to every node can be calculated. The calculation is based on the received signal strength, and it establishes the maximum and minimum distances from the BS. Using Eq. (13), the distance span is calculated as:

$$\begin{aligned} d_{s}=d_{max}-d_{min} \end{aligned}$$
(13)

where \(d_s\) is the difference between the distances of the farthest and nearest nodes from the BS, \(d_{max}\) is the farthest distance from the BS, and \(d_{min}\) is the nearest distance from the BS.

4.1.3 Radius calculation

This step is important because the size of a cluster is directly proportional to the distance of its CH from the BS. Unequal clustering makes the clusters near the BS smaller, which prevents the hot-spot problem in the network. Eq. (14) gives the radius of the ith node:

$$\begin{aligned} R_c(i)=\left( 1-\rho \frac{d_{max}-d_{(i,BS)}}{d_{s}} \right) \times R_{l_{max}} \end{aligned}$$
(14)

where \(R_c(i)\) is the competition radius of node i and \(\rho\) is a weight factor in [0, 1]. \(R_{l_{max}}\) is the maximum competition radius, and \(d_{(i,BS)}\) denotes the distance between the ith node and the BS. \(d_{s}\) is the difference between the distances of the farthest and nearest nodes from the BS.
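Eqs. (13) and (14) can be evaluated as in the following sketch. The function name and the sample distances, weight factor and maximum radius are hypothetical; they are chosen so that the effect of unequal clustering is easy to read off.

```python
def competition_radius(d_i_bs, d_min, d_max, rho, r_max):
    """Eqs. (13)-(14): radius shrinks linearly as the node gets closer to the BS."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    d_s = d_max - d_min                                   # Eq. (13)
    if d_s <= 0:
        raise ValueError("d_max must exceed d_min")
    return (1.0 - rho * (d_max - d_i_bs) / d_s) * r_max   # Eq. (14)


# Hypothetical network: nearest node 20 m and farthest node 120 m from the BS.
r_far = competition_radius(120.0, 20.0, 120.0, rho=0.5, r_max=40.0)
r_mid = competition_radius(70.0, 20.0, 120.0, rho=0.5, r_max=40.0)
r_near = competition_radius(20.0, 20.0, 120.0, rho=0.5, r_max=40.0)
```

The farthest node keeps the full radius of 40 m, while the nearest node gets half of it, so clusters close to the BS are smaller and their CHs have more energy left for relaying.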

4.2 Set-up phase

The set-up phase is divided into three parts: advertisement of the node, CH selection, and cluster formation.

4.2.1 Advertisement of the node

In this phase, each sensor node holds three attributes: node ID, EH-rate and residual energy. Each node broadcasts a message containing these three attributes to the other nodes in the network in order to decide the CH for the current round. In addition, each sensor node maintains two tables to record these attributes. The first table contains the node's own attributes, such as ID, range and location. The second table keeps the attributes of neighbouring nodes that may become CH in each round.

4.2.2 CH selection

In the NEHCP algorithm, the CH is elected based on the sum of the residual energy and the EH-rate of each sensor node i for the current round, which starts at time t; nodes with the highest sum are favoured. The probability \(p_{i}(t)\) is chosen such that the expected number of CH nodes in this round is \(N\times P\), that is, k:

$$\begin{aligned} E[\#CH]=\sum _{i=1}^{N}p_{i}(t)\times 1 =N \times P \end{aligned}$$
(15)

The probability in Eq. (15) is given by

$$\begin{aligned} p_{i}(t)=\frac{E_{resi}(i,r)+E_{har}(i,r-1)}{\sum _{j=1}^{N}\left( E_{resi}(j,r)+E_{har}(j,r-1)\right) } \end{aligned}$$
(16)

where N is the number of sensor nodes, and \(E_{resi}(i, r)\) and \(E_{har}(i, r-1)\) are the residual energy and EH-rate of node i in round r respectively. The factor 1 in Eq. (15) comes from the indicator function \(C_{i}(t)\), which determines whether sensor node i is a CH in the current round: \(C_{i}(t)=1\) if node i is a CH for the current round, and 0 otherwise. P is the percentage of CHs per round among the sensor nodes. Eq. (16) thus governs the CH selection process for N sensor nodes.
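The energy-weighted election can be sketched as follows. The scaling by k is an assumption of this sketch: it mirrors the multiplication by k in the proof of Lemma 3 and the corresponding LEACH formulation, and it is what makes the probabilities add up to k as Eq. (15) requires. The function names and energy values are hypothetical.

```python
import random


def ch_probabilities(e_resi, e_har, k):
    """Eq. (16) scaled by k (assumption): p_i = min(k * E_i / E_total, 1)."""
    energy = [r + h for r, h in zip(e_resi, e_har)]
    total = sum(energy)
    return [min(k * e / total, 1.0) for e in energy]


def elect_cluster_heads(probabilities, rng):
    """Each node independently becomes CH with its own probability."""
    return [i for i, p in enumerate(probabilities) if rng.random() < p]


# Hypothetical four-node network, energies in joules, k = 2 expected CHs.
p = ch_probabilities([0.5, 0.4, 0.3, 0.2], [0.0, 0.1, 0.1, 0.0], k=2)
heads = elect_cluster_heads(p, random.Random(7))
```

Here the probabilities are approximately 0.625, 0.625, 0.5 and 0.25, so nodes with more residual plus harvested energy are more likely to serve as CH. The actual number of CHs in a single round is random and only equals k on average.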

4.2.3 Cluster formation

Clusters are formed around the CHs selected using Eq. (16). Each CH broadcasts an advertisement message (ADV) within the range \(R_{c}\) using a non-persistent carrier sense multiple access (CSMA) MAC protocol. The message is small and contains the node ID and a header that distinguishes it as an announcement. Each non-CH node determines its cluster for this round by choosing the CH that requires the minimum communication range. Each node then transmits a join request message (join-REQ) to its chosen CH using the non-persistent CSMA MAC protocol. After cluster formation, the CH sets up a TDMA schedule for transferring information from the CMs to the CH. The schedule is based on the number of sensor nodes in the cluster: with many CMs, collisions become likely, and the TDMA schedule prevents collisions within the cluster. TDMA also provides a sleep-awake protocol, in which each CH allocates a time slot to each of its CMs. After completing its transmission, a CM goes into sleep mode, which saves the energy of the node. This process is sometimes known as the duty cycle method.
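The joining rule and the TDMA slot assignment can be illustrated with a short sketch. It simplifies the protocol in two ways that are assumptions of the example: every node is taken to hear at least one ADV message, and the CSMA contention itself is not modelled. The node positions are hypothetical.

```python
import math


def form_clusters(positions, head_ids):
    """Each non-CH node joins the CH at the shortest distance (minimum range)."""
    clusters = {h: [] for h in head_ids}
    for node, pos in enumerate(positions):
        if node in clusters:
            continue  # CHs do not join another cluster
        nearest = min(head_ids, key=lambda h: math.dist(pos, positions[h]))
        clusters[nearest].append(node)
    return clusters


def tdma_schedule(members):
    """The CH gives each CM its own slot index; a CM can sleep outside its slot."""
    return {node: slot for slot, node in enumerate(members)}


# Hypothetical layout: nodes 0 and 2 were elected CH in this round.
positions = [(0, 0), (1, 0), (10, 0), (9, 1), (4, 1)]
clusters = form_clusters(positions, head_ids=[0, 2])
schedule = tdma_schedule(clusters[0])
```

Nodes 1 and 4 join CH 0 and node 3 joins CH 2; the schedule of CH 0 then gives node 1 the first slot and node 4 the second.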

figure a

4.3 Data transmission phase

After the set-up phase is complete, data transmission starts according to the TDMA schedule set by the CH. This phase is divided into two sub-phases, intra-cluster communication and inter-cluster communication. During intra-cluster transmission, the CH collects data from all its CMs using a minimum amount of energy. The radio of each CM can be turned off until its allocated transmission time, which minimizes the energy consumption of the node. The CH keeps the signals received from the CMs. When all the information has been received, the CH executes a signal processing function to compress the information into a single signal. This process is called data aggregation and removes duplicate or redundant data. In the inter-cluster phase, the CH sends the aggregated information to the BS using single-hop or multi-hop routing. A well-known approach for transferring data from the CH to the BS combines CSMA with a fixed spreading code. When a CH has data to transfer, it senses the channel; if the channel is busy, it waits. Otherwise, the CH immediately sends the data to the BS using the BS spreading code.

Eq. (17) gives the energy for data transmission from the CH to the BS, and Eq. (18) gives the energy for data transmission from a CM to the CH. Each CH dissipates energy in receiving the signals from its CMs, aggregating them, and transmitting the aggregated signal to the BS. Since the BS is far from the sensor nodes, the multi-path model (\(d^{4}\) power loss) is used.

$$\begin{aligned} \begin{aligned} E_{CH}&=lE_{elec}\left( \frac{N}{k}-1\right) +lE_{DA}\frac{N}{k}+lE_{elec}\\&\quad +l\epsilon _{mp}\times d_{toBS}^4 \end{aligned} \end{aligned}$$
(17)

where l is the number of bits in the data packet, \(E_{DA}\) is the data aggregation energy per bit, k is the number of clusters, and \(d_{toBS}\) is the (large) distance from the CH to the BS.

Each CM transmits its data to the CH, and the distance between a CM and its CH is generally short. Hence the free-space model (\(d^{2}\) power loss) is used.

$$\begin{aligned} E_{CM}=lE_{elec}+l\epsilon _{fs}\times d_{toCH}^{2} \end{aligned}$$
(18)

where \(d_{toCH}\) is the distance from the CM to the CH.
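A worked evaluation of Eqs. (17) and (18) is sketched below. The radio constants are assumed LEACH-style values and the network size, packet length and distances are hypothetical; none of these numbers are taken from the text above.

```python
# Assumed LEACH-style constants (not stated in the text above).
E_ELEC = 50e-9        # J/bit
EPS_FS = 10e-12       # J/bit/m^2
EPS_MP = 0.0013e-12   # J/bit/m^4
E_DA = 5e-9           # J/bit


def ch_energy(l_bits, n_nodes, k, d_to_bs):
    """Eq. (17): receive from N/k - 1 CMs, aggregate N/k signals, send to the BS."""
    per_cluster = n_nodes / k
    return (l_bits * E_ELEC * (per_cluster - 1)
            + l_bits * E_DA * per_cluster
            + l_bits * E_ELEC
            + l_bits * EPS_MP * d_to_bs ** 4)


def cm_energy(l_bits, d_to_ch):
    """Eq. (18): free-space transmission from a CM to its CH."""
    return l_bits * E_ELEC + l_bits * EPS_FS * d_to_ch ** 2


# Hypothetical case: 100 nodes, 5 clusters, 4000-bit packets.
e_ch = ch_energy(4000, n_nodes=100, k=5, d_to_bs=100.0)
e_cm = cm_energy(4000, d_to_ch=20.0)
```

Under these assumptions a CH spends about 4.9 mJ per frame and a CM about 0.2 mJ, which illustrates why the CH role is rotated towards nodes with more residual and harvested energy.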

figure b

Lemma 1

The time complexity of NEHCP is \({\mathcal {O}}(N^{2})\) for the entire network with N nodes.

Proof

The method has several contributions to its time complexity. First, the distance between every pair of the N sensor nodes is calculated using the Euclidean distance, which takes \({\mathcal {O}}(N^{2})\) time. Another important parameter of the NEHCP method is the number of rounds, which depends on the energy status, that is, the initial battery energy \(E_{0}\) and the EH-rate. The rounds contribute \({\mathcal {O}}(N_{r})\) to the complexity of the entire algorithm, where \(N_{r}\) is the number of rounds with N nodes and k is the number of clusters built per round r. Every round generates approximately \({\mathcal {O}}(\log {}N)\) CHs, which gives \({\mathcal {O}}(N_{r}\log {}N)\) for N sensor nodes. The total time complexity of NEHCP is therefore \({\mathcal {O}}(N_{r}\log {}N)+{\mathcal {O}}(N^{2})={\mathcal {O}}(N^{2})\). \(\square\)

Lemma 2

The message exchange complexity of NEHCP is \({\mathcal {O}}(1)\) for each sensor node and \({\mathcal {O}}(N)\) for the entire network with N nodes.

Proof

During cluster formation, each node broadcasts either a CH advertisement message or a JOIND message. The message exchange complexity between a CH and a CM is therefore \({\mathcal {O}}(1)\), that is, constant time. For the entire network of N sensor nodes, cluster formation can take \(N-1\) message exchanges. Therefore, the complete NEHCP process takes \({\mathcal {O}}(N)\) for cluster formation, which is the worst case for message exchange. \(\square\)

Lemma 3

The expected number of CH nodes is k among the N sensor nodes in the NEHCP algorithm.

Proof

The proof follows from the probability function used for CH selection, which reflects how many times a sensor node has been elected as CH in the past. This probability function can be expressed as:

$$\begin{aligned} p_{i}(t)=\frac{E_{resi+har}^{i}(t)}{E_{total}(t)} \end{aligned}$$

where \(E_{resi+har}^{i}(t)\) represents the energy of node i. The total energy of the sensor nodes is defined as:

$$\begin{aligned} E_{total}(t)=\sum _{i=1}^{N}E_{resi+har}^{i}(t). \end{aligned}$$

Using this probability function, the sensor nodes favour higher-energy nodes as CH, and the expected number of CHs is \(N\times P\), that is, k. Eq. (15) can therefore be written as:

$$\begin{aligned} E[\#CH]=\sum _{i=1}^{N}p_{i}(t)\times 1 =k \end{aligned}$$

Expanding the sum over the N sensor nodes,

$$\begin{aligned}&= {} \left( \frac{E_{resi+har}^{1}(t)}{E_{total}}+\frac{E_{resi+har}^{2}(t)}{E_{total}}+\cdots +\frac{E_{resi+har}^{N}(t)}{E_{total}} \right) k\\&= {} k \end{aligned}$$

\(\square\)

Lemma 4

There is at most one CH within each cluster radius \(R_{c}\).

Proof

A common requirement of clustering mechanisms is that each cluster has only one CH. The proof follows from Eq. (16): each selected sensor node broadcasts a \(Head-msg\) within its own communication range \(R_{c}\). After receiving the \(Head-msg\), a node sends a JOIND message and becomes a CM of that CH. Therefore, no other node has the chance to become CH within the range \(R_{c}\).

Definition 1

Given the maximum initial energy of the sensor node battery, \(I_{max}\), and the maximum harvested energy, \(h_{max}\), the threshold value on the battery capacity of the sensor node is given by [31]:

$$\begin{aligned} Th_{battery}=I_{max}+2\times h_{max} \end{aligned}$$

Definition 2

The EH-rate depends on the weather conditions at a certain time of the day \(D_{t}\). It takes values between two extremes, \(h_{r}\in P_{max}\) and \(h_{r}\in P_{min}\). This condition holds for certain hours of the day; according to EH theory, if the day \(D_{t}\) is sunny, the EH-rate increases, \(D_{max}(h^{+}_{r})\); otherwise it decreases, \(D_{min}(h^{-}_{r})\).

4.3.1 Theoretical analysis of the energy consumption rate for the NEHCP algorithm

The energy consumption rate is a very important parameter of EH-WSNs because the lifetime of the sensor network depends entirely on the initial battery energy \(E_o\) and the EH-rate. Moreover, a cluster routing protocol involves duplex communication, and the network is busy transferring packets between CM and CH and between CH and BS. Energy must therefore be used carefully. For the analysis, we assume that N sensor nodes are deployed in an \(A\times A\) square area. If there are k clusters, each cluster has on average \(\frac{N}{k}\) nodes (one CH and \(\frac{N}{k}-1\) CMs).

The total energy of NEHCP has two parts: the initial energy of the sensor node battery \(E_o\) and the harvested energy. The battery of a sensor node, battery(t), has a finite capacity \(c_{max}\) and must satisfy \(0\le battery(t)\le c_{max}, \forall t\ge 0\). Energy is dissipated in rounds. One round combines the set-up phase and the data transmission phase, so energy is dissipated in every round, and the number of rounds is:

$$\begin{aligned} r=\frac{E_{total}}{E_{round}} \end{aligned}$$
(19)

where r is the number of rounds and \(E_{round}\) is the energy dissipated in one round.

The energy dissipated by a CM when it communicates with its CH, which mostly consists of sending and receiving data, is:

$$\begin{aligned} E_{CM}=2lE_{elec}+l\epsilon _{fs}\times d_{toCH}^2 \end{aligned}$$
(20)

where l is the number of bits in the packet (data transmitted in a wireless network are organized into frames) and \(d_{toCH}\) is the distance from the CM to the CH.

The energy consumption of NEHCP for data transmission is divided into two parts, intra-cluster and inter-cluster data transmission. The intra-cluster transmission phase is made up of three components, as given in Eq. (21).

$$\begin{aligned} E_{Intra-clustering}&=E_{CMtoCH}+E_{CH-Reception} \\&\quad +E_{DA} \end{aligned}$$
(21)
$$\begin{aligned} E_{CMtoCH}&= {} \sum _{i=1}^{k}\sum _{j=1}^{m_{i}}E_{Tx}(CM_{j}, CH_{i}) \end{aligned}$$
(22)

Eq. (22) gives the energy consumed in transmission between the CMs and their respective CHs, where k is the number of clusters and \(m_{i}\) is the number of CMs in the ith cluster of the network. \(E_{Tx}(CM_{j}, CH_{i})\) represents the transmission energy from \(CM_{j}\) to its CH.

$$\begin{aligned} E_{CH-Reception}=\sum _{i=1}^{k}m_{i}\times E_{Rx} \end{aligned}$$
(23)

Equation (23) gives the energy consumed at the CHs when receiving from their CMs, where \(m_{i}\) is the number of CMs in the ith of the k clusters and \(E_{Rx}\) is the receiving energy.

$$\begin{aligned} E_{DA}=\sum _{i=1}^{k}l\times m_{i}\times E_{DataBit} \end{aligned}$$
(24)

where l is the number of bits and \(E_{DataBit}\) is the energy consumed to aggregate a single bit of data.
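Equations (21)–(24) can be evaluated with a short Python sketch. It assumes the common first-order radio model, \(E_{Tx}=lE_{elec}+l\epsilon _{fs}d^2\) and \(E_{Rx}=lE_{elec}\), and the constant values below are illustrative placeholders rather than the simulation parameters of this paper. The input format (one list of CM-to-CH distances per cluster) is also an assumption made for the example.

```python
E_ELEC = 50e-9     # J/bit, electronics energy (assumed illustrative value)
EPS_FS = 10e-12    # J/bit/m^2, free-space amplifier energy (assumed value)
E_DATABIT = 5e-9   # J/bit, aggregation energy per bit (assumed value)


def e_tx(l, d):
    """Transmission energy for l bits over distance d (free-space model)."""
    return l * E_ELEC + l * EPS_FS * d ** 2


def e_rx(l):
    """Reception energy for l bits."""
    return l * E_ELEC


def intra_cluster_energy(clusters, l):
    """Evaluate Eq. (21) for a list of clusters.

    clusters[i] is a list holding the distance (in metres) from each CM
    of cluster i to its CH, so len(clusters[i]) plays the role of m_i.
    """
    # Eq. (22): every CM transmits l bits to its own CH.
    e_cm_to_ch = sum(e_tx(l, d) for dists in clusters for d in dists)
    # Eq. (23): each CH receives one l-bit packet from each of its m_i CMs.
    e_ch_reception = sum(len(dists) * e_rx(l) for dists in clusters)
    # Eq. (24): each CH aggregates l bits per CM.
    e_da = sum(l * len(dists) * E_DATABIT for dists in clusters)
    return e_cm_to_ch + e_ch_reception + e_da
```

For example, `intra_cluster_energy([[10.0, 20.0], [5.0]], 4000)` evaluates two clusters with two CMs and one CM, respectively, for 4000-bit packets.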

The next equation gives the energy for inter-cluster data transmission. The NEHCP algorithm uses single-hop transmission from the CH to the BS, with no intermediate node (Fig. 5). Therefore, the energy consumption between the CH and the BS is

$$\begin{aligned} E_{Inter-clustering}=E_{Tx}(CH_{i}, BS) \end{aligned}$$
(25)

Assume that the area occupied by each cluster is approximately \(\frac{A^2}{k}\). In general, if the node distribution over an arbitrary region is \(\rho (x,y)\), then the expected squared distance from the sensor nodes to the CH is

$$\begin{aligned} E[d_{to CH}^2]=\int \int (x^2+y^2)\rho (x,y)dxdy \end{aligned}$$
(26)

To simplify Eq. (26), assume that each cluster is a circle of radius \(R=\frac{A}{\sqrt{\varPi k}}\) and that \(\rho (x,y)\) is constant with respect to r and \(\varTheta\). If the density of the sensor nodes is uniform over the cluster area, then

$$\begin{aligned} E[d_{to CH}^2]=\frac{1}{2\varPi }\frac{A^2}{k} \end{aligned}$$
(27)
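The step from Eq. (26) to Eq. (27) can be worked out in polar coordinates. The sketch below assumes that the CH sits at the centre of a circular cluster and that \(\rho\) is normalised as a probability density over the cluster, so that \(\rho =\frac{1}{\varPi R^2}\); this normalisation is an assumption made here to make the intermediate steps explicit.

$$\begin{aligned} E[d_{to CH}^2]&=\int _{0}^{2\varPi }\int _{0}^{R} r^2\,\rho \,r\,dr\,d\varTheta =\rho \,\frac{\varPi R^4}{2} \\&=\frac{1}{\varPi R^2}\,\frac{\varPi R^4}{2}=\frac{R^2}{2}=\frac{1}{2\varPi }\frac{A^2}{k} \end{aligned}$$

The last equality uses \(R^2=\frac{A^2}{\varPi k}\), which follows from equating the area of the circle, \(\varPi R^2\), with the cluster area \(\frac{A^2}{k}\).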

Therefore, from Eqs. (26) and (27), substituting the expected squared CM-to-CH distance for \(d^2\) in the free-space model gives

$$\begin{aligned} E_{CM}=lE_{elec}+l\epsilon _{fs}\times \frac{1}{2\varPi }\frac{A^2}{k} \end{aligned}$$
(28)

The energy dissipated by one cluster while sending a frame is

$$\begin{aligned} E_{cluster}=E_{CH}+\left( \frac{N}{k}-1\right) E_{CM} \end{aligned}$$
(29)

In addition, Eq. (29) can be written as

$$\begin{aligned} E_{cluster}=CH_i+\sum _{j=1}^{k_{j}}CM_{j} \end{aligned}$$
(30)

Then, for k clusters, the energy dissipated in one round is

$$\begin{aligned} E_{k,cluster}=k\times E_{cluster} \end{aligned}$$
(31)

The total energy dissipated to send a frame from the CMs to the BS via the CHs is

$$\begin{aligned} E_{total} =&\,l\left( 2E_{elec}N+E_{DA}N+\epsilon _{mp}d_{to BS}^4+E_{elec}N \right. \\&\left. +\,\epsilon _{fs}\frac{1}{2\varPi }\frac{A^2}{k}N\right) \end{aligned}$$
(32)
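As a rough illustration, Eq. (32) can be transcribed term by term into Python. The function and parameter names are hypothetical, `d_to_bs` is read as the CH-to-BS distance in the multipath term, and `e_da` is treated as a per-bit aggregation energy, as its position inside the bracket of Eq. (32) suggests. No parameter values from the paper are assumed.

```python
import math


def frame_energy_eq32(l, n, k, area_side, d_to_bs,
                      e_elec, e_da, eps_fs, eps_mp):
    """Energy to send one frame from the CMs to the BS, following Eq. (32).

    l         -- bits per frame
    n         -- number of sensor nodes N
    k         -- number of clusters
    area_side -- side A of the square deployment area, in metres
    d_to_bs   -- distance from the CH to the BS, in metres
    """
    # Expected squared CM-to-CH distance from Eq. (27).
    d2_to_ch = area_side ** 2 / (2 * math.pi * k)
    return l * (
        2 * e_elec * n            # first electronics term of Eq. (32)
        + e_da * n                # data aggregation term
        + eps_mp * d_to_bs ** 4   # multipath amplifier term towards the BS
        + e_elec * n              # second electronics term of Eq. (32)
        + eps_fs * d2_to_ch * n   # free-space amplifier term, CM to CH
    )
```

Because the free-space term scales with \(\frac{A^2}{k}\), increasing the number of clusters k lowers this term when all other inputs are held fixed.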

The above equation gives the energy consumed to transmit a single frame. The total energy dissipated by each sensor node per round depends on the average number of frames in a round, \(N_{frames/round}\), as follows:

$$\begin{aligned} E_{CH/round}&= {} N_{frames/round}\times E_{CH/frame} \end{aligned}$$
(33)
$$\begin{aligned} E_{CM/round}&= {} N_{frames/round}\times E_{CM/frame} \end{aligned}$$
(34)

where \(E_{CH/frame}\) is the energy a CH uses to receive all the signals from its CMs, aggregate them and send the result to the BS, and \(E_{CM/frame}\) is the energy a CM uses to transmit its signal to the CH.

Fig. 5 Run-time evaluation of the NEHCP algorithm in initial stage \(EH-WSN\#1\)

5 Simulation and evaluation of the NEHCP method

In this section, we present a MATLAB simulation of the proposed algorithm and its results. We compare the NEHCP against other protocols, namely CEEC, HUCL, and SEED. The experiments cover the run-time behavior of the NEHCP algorithm, network lifetime, network stability period versus network instability period, cumulative data packets sent, and the relationship between the EH-rate and the energy-consumption rate.

The sensor network is evaluated in two configurations, \(EH-WSN\#1\) and \(EH-WSN\#2\). The first network consists of 100 nodes deployed in a \(200\,{\mathrm{m}}\times 200\,{\mathrm{m}}\) area, and the second contains 200 nodes deployed in a larger \(500\,{\mathrm{m}}\times 500\,{\mathrm{m}}\) area. In this paper, most of the reported experiments concern \(EH-WSN\#1\), using the settings given in Tables 2 and 3.

Table 2 Number of sensor nodes and area size
Table 3 Simulation parameter used for the NEHCP

5.1 Run-time evaluation of the NEHCP algorithm

In the simulations, the NEHCP method outperforms the compared methods. The additional performance comes from the solar energy system, because each sensor node is equipped with a harvesting module. Each sensor node gains some extra energy from the environment in every round. This increases the lifetime of the sensor node, because the lifetime depends entirely on the residual energy: the more residual energy a node has, the more rounds it can complete, and the more rounds are completed, the more packets are received at the BS. Figures 5 and 6 show the run-time behavior of the NEHCP algorithm: Fig. 5 shows the initial stage and Fig. 6 shows the network after a number of rounds.

Fig. 6 Run-time evaluation of the NEHCP algorithm after round stage \(EH-WSN\#1\)

5.2 Network lifetime

There are different metrics for defining the network lifetime of WSNs. We define the network lifetime as the time until the death of the first node. Table 4 shows the rounds at which the first node dies (FND), half of the nodes die (HND) and the last node dies (LND) in the \(EH-WSN\#1\) network. The NEHCP gives better results than the other methods: Table 4 reports the average results of all the methods, and among them the NEHCP algorithm performs best. The lifetime percentages of CEEC, HUCL, SEED, and NEHCP are 37.7%, 45.7%, 61.8% and 71.8%, respectively.

Table 4 Lifetime of the sensor node at the different round in the EH-WSN#1

Equations (7) and (8) above model the weather conditions and the amount of energy harvested in different rounds; the effect of both equations is a key factor. Another important factor of the NEHCP is the relation between the energy-consumption rate and the EH-rate, which is given in Eq. (12). In most rounds the energy-consumption rate is greater than or equal to the EH-rate; therefore, after a long time the sensor nodes begin to die. Figures 7 and 8 show the number of rounds versus the energy level in EH-WSN#1 and EH-WSN#2, respectively. In EH-WSN#2 the number of rounds decreases because more sensor nodes are deployed, which increases the energy consumption. The energy levels are 0.25 J, 0.5 J, 0.75 J and 1 J, and each energy level yields a different lifetime of the sensor network. Figure 9 shows the number of alive nodes after different numbers of rounds in the EH-WSN#1 network.

Fig. 7 Lifetime of the sensor at different energy level in the \(EH-WSN\#1\)

Fig. 8 Lifetime of the sensor at different energy level in the \(EH-WSN\#2\)

Fig. 9 Number of alive sensor nodes per round in the \(EH-WSN\#1\)

5.3 Network stability period versus network instability period

This relation is significant for the sensor network because it identifies the range over which the sensor nodes deliver their maximum performance. The network stability period is the time from the start of operation until the death of the first node; this period is referred to as the stable region of the sensor network. The network instability period is the time interval from the death of the first sensor node until the death of the last node; this period is known as the unstable region. Table 5 shows the percentage of dead nodes in particular rounds. Figures 10 and 11 show the stable and unstable regions in the EH-WSN#1 and EH-WSN#2 networks, respectively. Furthermore, Fig. 12 indicates that while the sensor network is in the stable region it delivers its maximum throughput; otherwise, the throughput decreases. Figures 10 and 11 show that in the stable region all the methods behave the same in the initial rounds, but after some rounds the stability decreases. After a few rounds, CEEC and SEED differ slightly, while the NEHCP performs better in each round. In addition, network stability is always higher in the EH-WSN system.
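The stability and instability periods defined above can be computed from per-node death rounds with a small Python sketch. The function name and the input format (one death round per node) are assumptions for illustration, and the sketch presumes that every node eventually dies, which may not hold when the EH-rate keeps some nodes alive indefinitely.

```python
def lifetime_metrics(death_rounds):
    """Derive FND, HND, LND and the stable/unstable periods.

    death_rounds -- the round at which each node died, one entry per node
                    (hypothetical input; all nodes are assumed to die).
    """
    ordered = sorted(death_rounds)
    n = len(ordered)
    fnd = ordered[0]                 # first node dead
    hnd = ordered[(n + 1) // 2 - 1]  # round by which half the nodes are dead
    lnd = ordered[-1]                # last node dead
    return {
        "FND": fnd,
        "HND": hnd,
        "LND": lnd,
        "stable_period": fnd,          # start of operation until first death
        "unstable_period": lnd - fnd,  # first death until last death
    }
```

For instance, four nodes dying at rounds 120, 300, 250 and 400 give a stable period of 120 rounds and an unstable period of 280 rounds.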

Table 5 Percentage of the sensor nodes dead in the round EH-WSN#2
Fig. 10 Different position of the dead node in \(EH-WSN\#1\)

Fig. 11 Different position of the dead node in \(EH-WSN\#2\)

Fig. 12 Number of dead sensor nodes per round \(EH-WSN\#1\)

5.4 Cumulative data packets sent to the BS

Data packets are sent to the BS frequently in the clustering method. The maximum number of data messages sent to the BS depends on the message size, the residual energy of the sensor nodes and the data aggregation method. The message size includes the packet header size and the control message size. The more residual energy a node has, the more packets it can send to the BS. The maximum number of packets transferred depends on the energy level of the network, which is calculated as given in Eq. (11). Figures 13 and 14 show how data are transferred to the BS; the NEHCP algorithm gives better performance in these figures. Equation (35) gives the total number of packets received at the BS.

$$\begin{aligned} P_{total}= \sum _{i=1}^{r}p_{i} \end{aligned}$$
(35)

where r is the number of rounds and \(p_{i}\) is the number of packets sent to the BS in round i.
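Equation (35) is a running sum over rounds. The following Python sketch reads the summand as the number of packets delivered to the BS in each round, which is an interpretation made here, and returns the cumulative count after every round, that is, the kind of curve plotted against the round number. The function name and input list are hypothetical.

```python
from itertools import accumulate


def cumulative_packets(packets_per_round):
    """Return the cumulative number of packets at the BS after each round.

    packets_per_round -- packets delivered in rounds 1..r (hypothetical data).
    The last element of the result equals P_total in Eq. (35).
    """
    return list(accumulate(packets_per_round))
```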

Fig. 13 Number of packets sent to the BS per round \(EH-WSN\#1\)

Fig. 14 Number of packets sent to the BS versus dead nodes \(EH-WSN\#1\)

5.5 Relationship between energy harvesting rate and energy consuming rate

This is a significant consideration for the algorithm because it shows the relation between the EH rate and the energy-consumption rate of the network. Figure 15 shows the harvesting system operating in the daytime, where the energy-consumption rate is generally greater than the EH rate. However, this depends on the daytime sunlight: as the light intensity increases, the relation may be reversed, so in some rounds the energy-consumption rate may be lower than the EH rate, depending on the weather conditions. Figure 16 represents the harvesting system at night or in cloudy conditions, when the EH rate is very low or almost zero. For both conditions, Eq. (12) confirms that without sunlight the EH system in EH-WSNs cannot harvest energy.

Fig. 15 Relation between EH-rate versus energy consuming rate in day time \(EH-WSN\#1\)

Fig. 16 Relation between EH-rate versus energy consuming rate in night time \(EH-WSN\#1\)

6 Conclusion

The main objective of a clustering protocol is to reduce the energy-consumption rate in WSNs, and a lot of work has been done on clustering routing protocols for WSNs to prolong network lifetime. To address this energy-consumption problem, a new clustering technique, referred to as the NEHCP algorithm, has been developed. The proposed work differs from other clustering methods in its use of solar energy, which increases the lifetime of the sensor network and allows the NEHCP to outperform the other clustering protocols. The NEHCP is divided into three phases, namely the initial phase, the set-up phase and the data transfer phase. The total time complexity of the NEHCP algorithm is \({\mathcal {O}}(N^2)\), and the message complexity of building the CHs is \({\mathcal {O}}(N)\) in the worst case. In the performance evaluation, the lifetime percentages of CEEC, HUCL, SEED and NEHCP are 37.7%, 45.7%, 61.8% and 71.8%, respectively. Further, this research area can be expanded through the use of efficient harvesting devices in future EH-WSNs.