1 Introduction

1.1 Background

WSN is an infrastructure-less network which consists of low cost, compact sensor nodes to monitor the physical environment. Hundreds to thousands of sensor nodes are deployed randomly in the field to be sensed. WSN plays a vital role in tracking and surveillance applications like environmental monitoring, weather forecasting, precision agriculture, natural disaster prevention, disaster management, border surveillance, smart cities, etc [1]. In the real scenario, it is used to observe various physical parameters such as temperature, pressure, moisture level, gas, acoustics, vibrations etc [2]. The sensor node in WSN is made up of sensors, microcontroller, communication unit and power supply. The sensor unit observes the environment, collects data, processes it and transmits to other sensor nodes via communication unit [3]. However, the nodes are constrained in energy, bandwidth, memory and computation. In addition, there are some typical issues like security, fault tolerance, connectivity, coverage, synchronization, scheduling, and localization problem. Generally, sensor nodes are deployed in harsh environments, where it is very hard or not possible to recharge or replace batteries [4]. The communication cost is thousand times more than sensing and computation cost [5]. Because of limited and non-rechargeable batteries, the available energy of the node should be utilized efficiently. Considering the characteristics of WSN, energy efficiency is the major issue which influences the overall performance of the network. Therefore, the design of WSN protocols should be simple, energy-efficient and flexible to varying environmental conditions.

Various clustering and routing protocols with different aspects have been developed in the literature to form an energy efficient WSN [6]. Clustering technique partitions the network and groups the nearby nodes into clusters. A leader called CH will be selected from the nodes and the remaining nodes are referred as cluster members [7]. The process of forming clusters with the same number of nodes in a network is known as equal clustering whereas with the uneven number of nodes is known as unequal clustering [8]. From each cluster, a Cluster Head (CH) will be selected based on some criteria. The CH is responsible for three operations: receiving data from cluster members, aggregating it and transmitting it to BS. The CH also acts as a relay node for other CHs to transmit data to BS. The general system model of clustering is shown in Fig. 1. Equal clustering works well and produces better results only when the distribution of nodes is uniform. Due to random deployment of the nodes, uniform distribution is highly improbable. This leads to unequal energy utilization of the nodes, especially CHs located closer to BS. When multi-hop transmission is employed, CHs nearer to BS act as relays for far CHs. Hence, CHs nearer to BS exhausts its energy and die earlier than CHs farther to BS. It disrupts the network connectivity and network lifetime is significantly minimized. This issue is referred to as hot spot problem and it can be eliminated by the process of unequal clustering. The architecture of unequal clustering is shown in Fig. 2. In unequal clustering, clusters located near BS are smaller in size and the cluster size increases when the distance from CHs to BS increases. Smaller clusters imply lesser number of cluster members and less intra-cluster traffic. Hence, it spends less energy for intra-cluster communication and preserves more energy for inter-cluster routing. It produces balanced clusters and makes all CHs to spend equal amount of energy throughout the network. It leads to uniform energy dissipation and prolongs the network lifetime.

Fig. 1
figure 1

Architecture of clustering in WSN

Fig. 2
figure 2

Architecture of unequal clustering in WSN

Generally, nodes in the WSN periodically transmit data to BS and it is well suited for periodic data monitoring applications. In time critical situations, the sudden and rapid changes in the physical environment lead to numerous data transmission and energy consumption is significantly affected. To overcome this issue, reactive protocol is introduced where data will be transmitted when the threshold value is crossed. In reactive protocol, data will be transmitted only when the sensed value exceeds the threshold value. The threshold value includes hard threshold value and soft threshold value. Once the CHs are selected, it broadcast two threshold values (hard threshold and soft threshold values) to its cluster members. When the sensed value exceeds the hard threshold value, the cluster members will transmit the data to the CH. Thus, the hard threshold tries to reduce the number of transmissions by allowing the nodes to transmit only when the sensed value is in the range of interest. The soft threshold further reduces the number of transmissions by eliminating all the transmissions which are slightly or no change in the sensed value from the hard threshold value. The soft threshold can be varied, depending on the criticality of the sensed attribute and the target application. A smaller value of the soft threshold gives a more accurate picture of the network, at the expense of increased energy consumption. Thus, the user can control the trade-off between energy efficiency and accuracy. Though reactive protocol works well in some situations, the overall picture of the network cannot be determined by reactive protocols. So, hybrid protocol is introduced where reactive data as well as periodic data will be transmitted to attain the overall picture of the network. In addition to hybrid manner, periodic data transmission with longer time interval is introduced to effectively utilize the available energy. Routing also plays a major role to achieve energy. From earlier days, ants have been used in routing process to find the shortest path from nest to source. Ants follow the pheromone trails laid by other ants and the mechanism is termed as stigmergy. The mapping of ant behavior to the computer system is defined as Ant Colony Optimization (ACO) [9]. As the data transmission from CHs to BS is an NP-hard problem, ACO provides the best solution to solve this problem. For proper load balancing and to utilize uniform energy consumption, unused paths are also used in addition to shortest paths. ACO finds the possible and shortest from CHs to BS. The new routing strategy involves the transmission of reactive data using shortest path and periodic data with longer time intervals are transmitted via unused path.

1.2 Author’s contributions

In this paper, we address the hot spot problem with efficient clustering and routing strategies. This paper proposes a fuzzy logic and ACO based Combined Unequal Clustering, Routing and Hybrid protocol. Proper selection of CHs and unequal cluster formation are done with the help of fuzzy logic. Fuzzy logic uses residual energy, distance to Base Station (BS), distance to its neighbors, node degree and node centrality as input parameters. The inclusion of five input parameters results to highly energy-efficient WSN. The routing issue is addressed by ACO technique. The proposed method uses ACO based routing for inter-cluster routing from various CHs to BS. To reduce the number of transmissions, threshold based data transmission is adopted in addition to periodic data transmission with longer time intervals. For proper load balancing, we design a new routing strategy where threshold data transmission uses the shortest path and the periodic data transmission uses unused paths. Furthermore, we proposed a cluster maintenance phase which balances the energy dissipation and hot spot problem is solved. We perform an extensive simulation on the proposed method under different deployment scenarios with first order radio energy model. A comparison is also made with respect to energy consumption, residual energy, packets to BS and network lifetime. In short, our main contributions are listed below:

  • Fuzzy logic based cluster construction and CH selection

  • ACO based inter-cluster routing

  • Data transmission in a hybrid manner (Threshold based and periodic data transmission with longer time intervals)

  • A new routing strategy which exploits all possible paths.

  • Cross level cluster maintenance phase

  • Performance evaluation and comparison with some state of art approaches

1.3 Structure of the paper

The remainder of the paper is structured as follows. A brief survey of related works is given in Section 2. The system model is explained in Section 3. Section 4 presents the proposed protocol in detail. The simulation results are discussed in Section 5. The paper is concluded and future work is given in Section 6.

2 Related work

Clustering and routing are the most important optimization problems in WSN. Many of the clustering based routing techniques have been investigated and a lot of protocols have been developed. Computational intelligence techniques like neural networks, reinforcement learning, swarm intelligence approaches, evolutionary algorithms and fuzzy logic are used to address several design issues in WSN include CH selection, routing, security, data aggregation, synchronization. As the nodes are interdependent to each other with inter-related metrics, cluster formation using predefined rules is not highly applicable. Fuzzy logic is highly useful in situations where the degree of uncertainty is high [10, 11]. In the recent years, a large number of researchers are carried out in swarm intelligence based optimization techniques because of its efficiency in solving numerous optimization problems in various domains. In this section, Hierarchical clustering based routing protocols such as classical approaches, swarm intelligence techniques and fuzzy logic approaches are discussed in detail.

2.1 Classical approaches

Heinzelman et al. [7] presented a first clustering approach named LEACH (Low Energy Adaptive Clustering Hierarchy), which is most popular and highly cited clustering technique. It is a distributed protocol where the CHs are selected randomly in every round. The CHs receives the data from its cluster members, aggregate into a single packet and forwards to BS in a single hop manner. Random election of CHs leads to selecting a sensor node as CH even if its residual energy is very low. This is the major drawback of LEACH protocol. In addition, single hop data transmission is impractical in large scale WSNs. It makes the sensors die earlier and significantly minimizes the network lifetime. A modified version of LEACH named LEACH-C (Centralized LEACH) is proposed [12]. At the beginning, BS receives sensor node’s residual energy and location information. As LEACH-C is a centralized protocol, BS executes the clustering process. BS selects some nodes as candidate nodes based on the residual energy and final CHs are selected by simulated annealing algorithm. This results of better selection of CHs and reduces energy consumption than LEACH. But, the use of direct data transmission is still a drawback.

Manjeshwar and Agrawal [13] presented the first reactive protocol presented called TEEN (Threshold Sensitive Energy Efficient sensor Network). It is useful in time critical applications where the nodes will send data when the sensed value crosses a predefined threshold value. It eliminates periodic transmission which in turns reduces the number of data transmissions and preserves energy; it is not practically useful in all situations. When the threshold value is not reached, the nodes will not transmit data and the user fails to observe the current scenario in the sensing field. Another proactive protocol called DEEC (Distributed Energy-Efficient Clustering) is proposed for multi-level heterogeneous networks which consist of advanced nodes and normal nodes. The CHs are selected based on their initial energy and remaining energy and the nodes with greater energy level will become a CH. Another reactive protocol HEER (Hybrid Energy Efficient Reactive) is proposed [14], which works in a similar way as DEEC. In addition, threshold based data transmission is used. CH transmits two threshold values and the nodes will transmit data when the sensed data is higher than the threshold value. Usage of threshold based data transmission reduces the energy consumption and enhances the stability period and network lifetime. But, it also suffers from the same drawbacks of TEEN. To eliminates the drawbacks of TEEN, [15] proposed a hybrid protocol called APTEEN. In APTEEN, the nodes will send data periodically in an energy-efficient way as well as it responds immediately to the changes in the physical environment. The user can also request query to enquire data about past, present and future values. This is useful for time critical applications and real time warnings. It minimizes the total energy consumption and longevity of the network [16].

2.2 Evolutionary and swarm intelligence approaches

Swarm intelligence based clustering protocols organize the sensor nodes into clusters to reduce the energy consumption in the entire network. For N sensor nodes, there are totally 2N-1 solutions and in every solution the node can become CH or cluster member. Hence, clustering is said to be an NP-hard problem. Evolutionary Algorithms (EA) have been successfully employed to various N-P-hard problems. In this section, EA technique addresses clustering problem are discussed in detail.

ABC algorithm is derived from the foraging behavior of honeybee and is used to address the clustering problem [17]. It is a centralized technique where the BS executes the clustering and routing process at the initialization phase itself [18]. Another work is presented in [19] to solve the clustering and routing problem. This paper proposes a clustering scheme based on PSO with an efficient particle encoding scheme. It works in a centralized manner and the derived fitness function produces balanced clusters in the network. Another work based on PSO algorithm called PSO-C protocol. It is a location aware, centralized protocol which produces clusters based on energy and distance. A cost function is derived with an aim of reducing the intra-cluster distance with optimized energy consumption. The performance of PSO-C performs well than Genetic Algorithm and k-means based clustering algorithm. As PSO-C is a location aware protocol, it is not practical in random deployment scenarios. Cluster based routing protocol named Bee-C [20] is presented based on Honey Bee Mating Optimization (HBMO). It is a multi-level protocol which uses the reproductive nature of honeybees. This protocol is designed for continuous data dissemination and it fails to produce better results in large scale WSN. A modified version of Bee-C called Bee-Sensor-C [21] is proposed for event driven sensor networks. This protocol uses dynamic clustering and on demand multi-path routing technique. When an event has occurred, sensors nearer to that event collect and transfer data. In the way, the clusters are constructed among the nearest sensors and the respective routing will be done. It increases the overhead in large scale WSN and in situations where the occurrence of the event is high. GCA (Genetic Clustering Algorithm) [22] forms clusters dynamically with an aim of maximizing the lifetime and minimizing the energy consumption. A multi-objective fitness function is proposed which reduces the CHs count and the distance from the nodes to CHs. Though fewer clusters lead to energy efficiency, it results to hot spot problem. Jin et al. [23] proposed a genetic algorithm based clustering technique to select CHs properly and reduces the distance between clusters. It uses a binary coding scheme to represent chromosomes. Another genetic algorithm based clustering technique is proposed [24] with a modified fitness function. This fitness function is derived from the standard deviation of distance between clusters, energy consumption and the number of data transmissions. ERP (Evolutionary based Clustered Routing Protocol) is also an evolutionary based protocol to solve clustering and routing problem in WSN [25]. A new fitness function is derived by combining cohesion and separation errors in the view of clustering. It maximizes the network lifetime but the stability period is significantly reduced. It is not useful in situations where the feedback from the WSN should be highly reliable.

Though the equal clustering technique balances the energy consumption within a cluster, it fails to balance the energy consumption between clusters. The inter-cluster multi-hop data transmission increases the burden of the CHs near the BS and results in hot spot problem. To alleviate this issue, various unequal clustering techniques have been developed and found in the literature [26]. EBUC (Energy Balanced Unequal Clustering) has been developed for periodic data gathering applications [27]. It is a centralized protocol and PSO algorithm runs at the BS. PSO algorithm selects the candidate CHs and constructs clusters of unequal sizes. Greedy based multi-hop routing protocol is employed for data transmission between clusters. The relay node is chosen based on residual energy and distance to BS. This protocol maximizes lifetime and reduces the rate of dead nodes. A Genetic Algorithm based Energy-Efficient Adaptive clustering Protocol (GAEEP) is proposed to improve the stability period of WSN [28]. GAEEP operates in two phases: setup phase and steady state phase. In the first phase, BS runs GA and cluster construction takes place. In steady state phase, inter-cluster data transmission takes place. This protocol uses TDMA schedule for intra-cluster communication and CDMA schedule is used for inter-cluster communication. It is found to be more energy-efficient and reliable than existing protocols in both homogeneous and heterogeneous environments. LEACH-LPR [29] is a dynamic distributed clustering protocol based on LEACH. Selection of CHs is based on residual energy, distance to BS and neighborhood density. In LEACH-LPR, GA is used for optimization and it reduces the computational overhead of evolutionary algorithms.

2.3 Fuzzy approaches

In the recent days, fuzzy logic gains more interest in the design of cluster based routing approaches in WSN. Many protocols use fuzzy logic in situations where the network is unstable and uncertainties are high. As fuzzy logic is flexible, fault tolerant, less complex and requires less computational resources, various fuzzy logic based protocols have been developed to address clustering problem. Molay et al. [30] developed a fuzzy based clustering technique with energy level, concentration and centrality as input variables. Concentration refers to the number of nodes in the vicinity and centrality refers to the rate of closeness to the center of the cluster. This technique works in a similar way as LEACH except the process of CH selection. In this approach, BS calculates the chance of becoming CHs using Mamdani fuzzy inference system and the nodes with higher chance will be elected as CHs. LEACH-FL is a modified version of [31]. This method uses residual energy, distance to BS and node density as input variables. It produces better results than other algorithms. CHEF is a distributed fuzzy based technique which does not require global information about the network [32]. CHEF increases the network lifetime with the local election of CH with two input variables such as residual energy and nearby distance. Apart from the changes in fuzzy input variables, the procedure is similar as LEACH.

The above mentioned techniques and several existing protocols which adopt equal clustering protocol fail to eliminate hot spot problem. Here, fuzzy based unequal clustering protocols are briefly explained. EAUCF (Energy Aware Fuzzy Unequal Clustering Algorithm) is proposed to maximize the stability and network lifetime [33]. It is a distributed approach which determines the competition radius using two input variables such as residual energy and distance to BS. Another fuzzy based clustering scheme called IFUC (Improved Fuzzy Unequal Clustering Algorithm) to maximize the lifetime and eliminated the hot spot problem [34]. It uses three fuzzy input variables such as residual energy, distance to BS and node density to calculate the chance of becoming CH and cluster radius. This technique uses ACO based inter-cluster data transmission from CHs to BS. It also chooses relay node based on the transmission cost and degree of energy consumption. FBUC [35] is also a distributed fuzzy based clustering technique which is an improved version of EAUCF. Here, tentative CHs are selected based on probabilistic method and competition radius is calculated by fuzzy logic. The fuzzy input parameters are residual energy, node degree and distance to BS to determine cluster radius; residual energy and node degree are used to finalize CHs from tentative CHs. Another distributed clustering technique is DUCF presented in [36]. It uses fuzzy logic for CH selection and cluster size determination. Residual energy, node degree and distance to BS are used as fuzzy input parameters for CH selection and calculation of competition radius. It balances the load among clusters and reduces energy consumption using multi-hop data transmission between clusters.

CRT2FLACO is a type 2 fuzzy based clustering protocol which also uses ACO for inter-cluster routing. It uses three fuzzy input parameters including residual energy, neighboring nodes and distance to sink. Multi-hop data transmission using ACO is employed for actual data transmission from CHs to sink. FAMACROW [37] is also a fuzzy and ACO based technique to address clustering and routing problem. It is a cross-level hierarchical protocol with residual energy, node degree and quality of communication link. ACO selects the relay node with residual energy, distance to BS, queue length and delivery likelihood. Usage of link quality and the delivery likelihood increases the reliability of the protocol.

The proposed system differs from state of art approaches in the following ways:

  • As the CH selection process depends on many overlapping metrics. The proposed system uses five fuzzy input parameters. Existing techniques use two or three fuzzy input variables.

  • TEEN and APTEEN only uses threshold based data transmission. The proposed method also employs threshold based data transmission in addition to periodic data transmission. But, none of the techniques has been proposed to transmit data periodically with longer time intervals to achieve energy efficiency and also attain overall picture of the network

  • None of the existing routing techniques exploits unused paths in addition to shortest paths.

  • Cross-level cluster maintenance phase has been introduced for uniform load distribution and maximum network lifetime.

Table 1 summarizes the existing fuzzy based clustering protocols with important features and compared with the proposed protocol.

Table 1 Summary of reviewed fuzzy approaches compared with proposed method

3 System model

3.1 Network model

A WSN is considered with N number of sensor nodes deployed randomly in the sensing region. Some assumptions are made and are listed below.

  • Nodes and BS are static after being deployed.

  • All nodes are homogeneous.

  • Nodes are not aware of the location

  • Received Signal Strength Indicator (RSSI) is used to calculate the distance to BS from sensor nodes.

  • Node death is because of energy depletion only.

  • Sensor nodes can vary the transmission power using power control according to the distance of the receiver node.

3.2 Energy model

A simplified first order radio model is considered as the energy model of the WSN [18]. The energy consumed for transmitting and receiving an l -bit packet over a distance d is given in (1) and (2).

$$ E_{TX}\left( l, d\right)\,=\,\left\{\!\begin{array}{cl} l \!\times\! E_{elec} \,+\, l \!\times\! \varepsilon_{fs} \!\times\! d^{2} & \quad if \ d \!\le \! d_{0} \\ l \!\times\! E_{elec} \,+\, l \!\times\! \varepsilon_{mp} \!\times\! d^{4} & \quad if \ d \!>\! d_{0} \end{array}\right. $$
(1)
$$ {E}_{RX}\left( l \right) = l \!\times\! E_{elec} $$
(2)

where E e l e c is the amount of energy dissipation in transmitter or receiver, d 0 is the threshold distance which is calculated by \(d_{0}=\sqrt {\varepsilon _{fs} \left / \varepsilon _{mp}\right .}\). Based on the communication distance d, free space (ε f s ) or multipath fading (ε m p ) is used in the transmitter amplifier.

4 FUCHAR protocol

4.1 Overall operation

The proposed method is a hybrid (proactive and reactive) unequal clustering protocol and the overall operation is depicted in Fig. 3. The proposed method operates in three phases: cluster construction phase, data transmission phase and cluster maintenance phase. There are two levels in cluster construction phase namely CH selection and cluster formation. In cluster construction phase, BS runs fuzzy logic to produce balanced clusters. Balanced clusters are achieved by the equal distribution of CHs in the entire network. Proper selection of CHs is a very important task in clustered WSN. For CH selection, fuzzy logic with five input variables namely residual energy, distance to BS, distance to its neighbors, node degree and node centrality are used. The output parameters are Probability of becoming CHs and cluster size. In existing methods, two or three fuzzy input variables are used to elect CHs. But, residual energy, distance to Base Station (BS), distance to its neighbors, node degree and node centrality are overlapping metrics and plays a vital role in clustered WSN. Residual energy indicates the total remaining energy of the sensor node and distance to BS refers to the distance from sensor nodes to BS. The CHs located closer to BS should be smaller size clusters than those farther from it. Distance to its neighbors is the measure of distance between the sensor node and its neighboring nodes. Node degree refers to the number of neighboring nodes in the vicinity. The higher value of node degree indicates the larger number of neighbors and intra-cluster communication will be high. So, the cluster size should be smaller for nodes with higher node degree. Node centrality is also important because it shows how much the sensor node is centrally located to communicate with other nodes. Nodes with a higher value of centrality have the higher chance of becoming CH because it reduces intra-cluster communication distance. The inclusion of these five input parameters is highly important to select CHs and cluster size efficiently. Nodes with the higher probability of becoming CH will be final CHs and the neighboring nodes with the communication radius of the CH join the cluster.

Fig. 3
figure 3

Working flow of the proposed method

In data transmission phase, ACO based inter-cluster multi-hop routing takes place. Most of the existing protocols transmit data periodically in a proactive manner. It increases the number of data transmission and the sensed data will be highly correlated. To improve energy efficiency, threshold based data transmission (reactive) is proposed. But, it fails to attain the overall picture of the network. The proposed method transmits data in two situations: threshold based and periodic data transmission with longer time intervals. Using this method, the number of data transmission is reduced and at the same time overall picture of the scenario is also attained. The proposed method introduced a new routing strategy based on ACO. For uniform energy dissipation, it exploits unused routes in addition to shortest routes. To achieve proper load balancing, threshold data transmission uses shortest paths and the periodic data transmission uses unused paths. Finally, cluster maintenance phase uses cross-level data transmission to enhance the network lifetime significantly.

4.2 Fuzzy logic based clustering mechanism

Fuzzy logic consists of four steps:

  1. i

    Fuzzification of input variables: Convert the crisp input and maps them to appropriate linguistic variable

  2. ii

    Membership Functions; Triangular and Trapezoidal membership function

  3. iii

    Fuzzy decision blocks/ Rule base: The rule base is a set of if-then rules which relates the input and output fuzzy parameters using linguistic variables.

  4. iv

    Defuzzification: It transforms the fuzzy output probability to a crisp value.

The input parameters and the linguistic variables to select CH and cluster size are tabulated in Table 2.

Table 2 Input variables and their linguistic variables

The trapezoidal membership function represents boundary variables (low, high, near, far, few, many, very low, very high, very small and very large) and triangular membership represents middle variables. The membership functions of the input and output variables are shown in Figs. 3456789 and 10. The triangular and trapezoidal membership functions can be represented in (3) and (4).

$$ \mu_{A1}\left( x \right)= \left\{\begin{array}{cl} 0 &\quad x\le a1 \\ \frac{x-a1}{b1-a1}&\quad a1\le x\le b1 \\ \frac{c1-x}{c1-b1}&\quad b1\le x\le c1 \\ 0 &\quad c1\le x \end{array}\right. $$
(3)
$$ \mu_{A1}\left( x \right)= \left\{\begin{array}{cl} 0&\quad x\le a1 \\ \frac{x-a2}{b2-a2}&\quad a2\le x\le b2\\ 1&\quad b2\le x\le c2\\ \frac{d2-x}{d2-c2}&\quad c2\le x\le d2\\ 0&\quad d2\le x \end{array}\right. $$
(4)

The fuzzy if-then rules for the selection of CHs and cluster size are tabulated in Table 3. A rule can be expressed in (5).

$$\begin{array}{@{}rcl@{}} && Rule\left( i\right) \ IF \ x_{1} \ is \ {A_{1}^{i}} \ AND \ x_{2}is {A_{2}^{i}}\ AND\ x_{3}is {A_{3}^{i}} \ AND\\ &&\qquad x_{4}is {A_{4}^{i}}\ AND\ x_{5}is {A_{5}^{i}}THEN \ y_{1}is B_{1}^{i}\ AND \ y_{2}is {B_{2}^{i}}\\ \end{array} $$
(5)

where i is the i th rule in the fuzzy rule, A 1, A 2,...., A 5 is the corresponding fuzzy set of x1, x2 .., x5. The rule base contains 243 rules and is generated based on a Madame Inference system which is simple and yields better results.

Fig. 4
figure 4

Membership function for RE

Fig. 5
figure 5

Membership function for DBS

Fig. 6
figure 6

Membership function for DNN

Fig. 7
figure 7

Membership function for ND

Fig. 8
figure 8

Membership function for NC

Fig. 9
figure 9

Membership function for PBCH

Fig. 10
figure 10

Membership function for cluster size

Table 3 Fuzzy rule base table

Centroid of Area (COA) method is employed for defuzzification which is represented in (6).

$$ COA= \frac{\int {\mu_{A}\left( x \right).xdx}} {\int {\mu_{A}\left( x \right).dx}} $$
(6)

After the computation of Probability of becoming CH, each node sends a CH_CANDITATE_MSG to its neighboring nodes. The message contains the ID and probability of becoming CH. The nodes receive the message and the node with higher probability will select them as CH and broadcast its status CH_WON to its neighbors. Some of the nodes may receive many CH WON from its neighbors. In these situations, the nodes will choose the nearest CH by sending CH_JOIN message. On receiving the message, CH checks the space for new cluster members. When the number of available cluster members is lesser than the cluster size, it will accept the new cluster member by sending CM_ACCEPT or CH_REJECT message will be sent. The rejected node requests the next nearer CH apart from the previous one and this process continues until it joins to a cluster. When no CH accepts as a cluster member, it gets selected itself as CH. Hence, no isolated nodes will present in the network.

4.3 ACO based routing mechanism

In this section, the process of finding an optimized path using ACO is explained.

  1. 1.

    Each CH sends its route information (node_id, residual energy and distance) to its neighbor CHs. The CHs stores the details in a routing table.

  2. 2.

    To find a route from CHs to BS, an ant is placed in every CH at regular time intervals. The ant chooses the relay node based on the probabilistic approach and the probability function is represented in (7)

    $$ {P}_{ij}=\frac{{(\tau_{ij})}^{\alpha} {(\eta_{ij})}^{\beta}}{{\sum}_{j\in N} {{(\tau_{ij})}^{\alpha} {(\eta_{ij})}^{\beta} }} $$
    (7)

    where, τ i j represents the pheromone information from source CH node to the next CH to be selected which is given in (8).

    $$ \tau_{ij}= \frac{1}{d_{ij}} $$
    (8)

    where d ij is the distance from source node i to its associated cluster head.

    The distance d i j is calculated as Euclidian distance which is given below

    $$ {d}_{ij}=\sqrt {{(S\left( i \right).xd-s\left( j \right)xd)}^{2}-{(S\left( i \right).yd-s\left( j \right)yd)}^{2}} $$
    (9)

    η i j is the heuristic information which indicates the energy level of nodes and is given by:

    $$ {\eta}_{ij}=\frac{E_{0}-E_{residual}}{\sum\nolimits_{k\in N} E_{k}} $$
    (10)

    where E 0 is the initial energy and E r e s i d u a l is the remaining energy of the node. The α and β are two parameters that regulate the relative weight of the pheromone trail and heuristic value.

  3. 3.

    For threshold based data transmission, the node with the higher probability is chosen as the relay node in the path from the CHs to BS. For periodic data transmission, the node with the minimum probability is chosen as the relay node in the path from the CHs to BS.

4.4 Cluster maintenance phase

Cluster maintenance phase is much important to balance the load among clusters. After some rounds, the clusters closer to BS are burdened with more inter-cluster traffic and exhaust their energy quickly. So, a cluster maintenance phase is needed for uniform distribution of load, eliminate hot spot problem and maximizes the network lifetime. In the proposed method, cluster maintenance phase operates on two levels: CH rotation in the cluster and cross-level data transmission. CH rotation takes place when the residual energy of the CH comes under a threshold value (15% of initial value). When the residual energy crosses a threshold value, new CH will be elected based on the probability of becoming CH. For uniform distribution of load and make all CHs consume an equal amount of energy, cross-level data transmission is used. When 15% of nodes served as CHs in a cluster, the BS is aware that less number of nodes is left to serve as CHs. BS broadcasts a message to next level CHs to transmit data directly (shown in Fig. 11). This process continues to all next level CHs located farther from BS. This results to uniform energy dissipation and enhances the network lifetime significantly.

Fig. 11
figure 11

The architecture of cross-level data transmission

5 Performance evaluation

In this section, we propose a performance evaluation of FUCHAR, which include the explanation of various performance metrics and the simulation according to these evaluation metrics. There are various metrics to analyze the performance of the clustering protocols. To investigate the effectiveness of FUCHAR, the following measures are used.

  • Energy consumption: This metric is used to evaluate the amount of energy spent by each node during the simulation time.

  • First Node Die (FND): This metric define/represents the number of rounds after the first node dies. It is useful to know the amount of time that all sensor nodes in WSN are fully operative.

  • Half Node Die (HND): This metric represent the number of rounds after 50% of the node dies in the network. It is useful to know the amount of time that half of the nodes in the network dies but still it operates.

  • Last Node Die (LND): This metric represent the number of rounds after that all nodes in the network dies. It is useful to know the amount of time that the network becomes inoperative.

  • Number of alive nodes per round: This metric reflects the total number of nodes that have not yet exhausted all of their energy.

5.1 Simulation setup

The proposed method is evaluated by conducting a series of simulation experiments in different scenarios. To be closer with the reality, three scenarios are used according to the location of BS. The three scenarios are denoted by S1, S2 and S3 respectively and is shown in Figs. 1213 and 14 respectively.

  • Scenario 1 (S1) - BS is located at the center of the sensing field.

  • Scenario 2 (S2) - BS is located in the edge of the sensing field.

  • Scenario 3 (S3) - BS is located away from the sensing field.

Fig. 12
figure 12

S1: BS at the center of sensing field

Fig. 13
figure 13

S2: BS at the edge of sensing field

Fig. 14
figure 14

S3: BS far away from the sensing field

These simulations were run using MATLAB (R2013a). In this simulation, 300 nodes are randomly deployed in the sensing area of 100x100m. For better comparison with other protocols, the energy consumption due to communication is computed by the first order energy model with the following parameters Eelec =  50nJ/bit, Efs =  10pJ/bit/m2, Emp =  0.001310pJ/bit/m 4 and EDA =  5nJ/bit/signal. The simulation parameters are tabulated in Table 4. To verify the effectiveness of the proposed algorithm, its performance is compared with well- known clustering protocols for WSN:

  • LEACH protocol which is the most popular and classic clustering protocol

  • TEEN is the first reactive protocol used in the time critical application.

  • DEEC is the hybrid clustering protocol which incorporates the features of TEEN.

  • EAUCF is the fuzzy based clustering protocol commonly used for comparison purposes.

Table 4 Simulation parameters

5.2 Average energy consumption

To check the energy efficiency of these protocols, the average energy consumption of three scenarios is measured and is shown in Figs. 1516 and 17. The energy consumption is calculated by the average energy spent by every individual sensor node in 2500 rounds respectively. FUCHAR achieves high energy efficiency than other methods. The energy saving in FUCHAR algorithm is because of fuzzy based intelligent CH selection and cluster size which assures the node with high residual energy, more number of neighboring nodes, high node centrality and lesser distance to BS and its neighbors. This in turn, minimizes the energy required for intra-cluster communication. The energy required by inter-cluster communication is reduced by the use of ACO based routing strategy. LEACH achieves poor performance due to the following factor: CHs are selected randomly and does not depend on any standard metrics such as residual energy, distance, etc. In addition, single hop data transmission also results to more energy dissipation. Next, TEEN achieves lesser energy consumption than LEACH but it fails to achieve better results than EAUCF and proposed method. As TEEN is a reactive protocol, it sends data when an event occurs. It reduces the number of data transmission and leads to lesser energy consumption. But, the random selection of CHs makes TEEN performance lag behind EAUCF and proposed method. Though EAUCF uses fuzzy logic for CH selection, tentative CHs are selected using probabilistic model. Thus, nodes with lesser residual energy, far distance and less degree become CHs.

Fig. 15
figure 15

S1: Average energy consumption

Fig. 16
figure 16

S2: Average energy consumption

Fig. 17
figure 17

S3: Average energy consumption

5.3 Number of data transmissions

The performance of WSN can also be analyzed based on the number of useful data transmissions takes place from CHs to BS over a time period. The performance depends on the number of useful data transmission and it does not depend on the number of data transmissions. The aggregated data from CHs finds useful until half of the nodes in the network dies. The proposed method delays HND as long as possible with the new routing strategy and cluster maintenance phase. The number of data transmission for various protocols is shown in Fig. 18. the proposed method operates in a hybrid manner, it transmit data when threshold level is reached and also periodically at longer time intervals. So, the proposed method has lesser number of data transmissions than LEACH and EAUCF. From Fig. 18, LEACH exhibits poor performance in all different scenarios than other protocols. LEACH does not control number of cluster members and it leads to decrease in data transmission from each sensor node to BS. TEEN transmit data only in situations where the sensed data reached a threshold value, it results in a lesser number of data transmissions. As the proposed method transmits data in a hybrid manner, the number of transmissions will be higher than TEEN but lesser than other protocols. And, the proposed method instructs CHs to restrict the number of cluster members based on the fuzzy output size.

Fig. 18
figure 18

Packets to BS before HND

5.4 Network lifetime

There are several ways to represent the lifetime of WSN. In this work, the lifetime is defined as the number of rounds completed till the nodes die. As the nodes in the nearby area results to same data, the death of the first node does not influence the entire operation of the network but the data quality starts to degrade. When half of the node dies (HND) in a network, the data quality will become worse. When the last node dies in the network, the network stops operating. To analyze the lifetime of WSN, the first node die (FND), half node die (HND) and last node die (LND) are considered. The FND and HND of 3 scenarios are shown in Figs. 19 and 20 respectively and the values are tabulated in Table 5. The number of alive nodes over several rounds for three scenarios is shown in Figs. 2122 and 23 respectively From fig, it is clear that the proposed method delays the death of the first node rounds in scenario 1 when compared to LEACH, TEEN and EAUCF. In scenario 1, the FND of LEACH occurs at 802 rounds and FUCHAR at 1647 rounds. In scenario 2, the FND of LEACH occurs at 542 rounds and FUCHAR at 1379 rounds. In scenario 3, the FND of LEACH occurs at 452 rounds and FUCHAR at 1246 rounds. The proposed method significantly increases the network lifetime due to (i) Proper CH selection and cluster size based on five variables (ii) data transmission in both proactive and reactive manner reduces the number of transmissions and conserve energy (iii) Exploitation of unused routes by ACO based routing strategy produces uniform distribution of load; (iv) cluster maintenance phase enhances the network lifetime and it make all the CHs spend equal amount of energy. Overall, the proposed system achieves energy efficiency and prolongs the network lifetime.

Fig. 19
figure 19

First node die (FND) for S1, S2 and S3

Fig. 20
figure 20

Half Node Die (HND) for S1, S2 and S3

Fig. 21
figure 21

S1: Number of alive nodes

Fig. 22
figure 22

S2: Number of alive nodes

Fig. 23
figure 23

S3: Number of alive nodes

Table 5 FND and HND for different scenarios

6 Conclusion

This paper proposes a Fuzzy logic based Unequal Clustering, and Ant Colony Optimization (ACO) based Routing, Hybrid (FUCARH) protocol for WSN to eliminate hot spot problem and maximize the network lifetime. Unlike the existing clustering protocols, the proposed method uses five input parameters to determine the CHs and cluster size. The proposed method utilizes threshold concept in data transmission in addition to periodic data transmission. We have also introduced new inter-cluster routing strategy for load distribution by transmitting the threshold data in shortest paths and periodic data in unused paths. Furthermore, the cluster maintenance phase makes all CHs to consume an equal amount of energy and it prolongs the network lifetime significantly. The extensive experimentation proves that the proposed method produces better performance than existing algorithms in various scenarios. This method is well suitable for both time critical and real time data gathering applications. In future, the proposed method can be extended for the network with mobile nodes and multiple sinks. Furthermore, we would also focus on the design of fuzzy logic with additional parameters such as link quality, coverage redundancy and related parameters.