
1 Introduction

Recent technological advances in industrial applications of Internet-of-Things (IoT) based wireless communication have significantly aided scientific and research efforts whose main focus is to design systems that are convenient in terms of computational cost, design, and practical deployment. These efforts have enabled the consumer electronics industry to manufacture wireless devices that are inexpensive, small, and capable of using their limited battery resources effectively. Toward this end, sensor-based wireless networks have earned significant attention within the continuing development of information and communication technologies [1]. However, since many of these devices are constrained by the resources available to them, communication overhead and power consumption are critical research areas for the analysis, design, and development of such wireless networks in order to achieve efficient management in IoT.

Wireless sensor networks (WSNs) are made up of small, portable, and energy-constrained sensor nodes deployed over an observation field. These nodes are responsible for transmitting vital information over wireless radio links. Such information may take the form of multi-dimensional signals and is of critical importance in many applications worldwide. Developing such networks demands extensive planning and sound design choices to keep them operational. This has motivated many real-time application scenarios such as environmental monitoring [2], under-water networks [3], battlefield surveillance [4], medical and health-care systems [5, 6], and many more [7,8,9,10,11].

1.1 Underlying Structure of WSNs

Depending on the underlying transmission mode, WSNs can be categorized by two types of communication mechanisms: (1) direct or single-hop, and (2) multi-hop, as shown in Fig. 1. In the former, the nodes transmit their data directly to the base station (BS), also known as the sink. This drains the battery of the nodes quickly and results in an early death of the network, so this type of communication is not recommended for efficient, practical deployments. The latter approach is far more promising: nodes do not need to communicate with the BS directly and can instead relay their data to the BS over multiple hops. This lessens the burden on each node and allows the network to remain stable for a longer period of time.

Fig. 1. Multi-hop vs. single-hop communication mechanism

Similarly, WSNs can also be classified by the type of response the nodes exhibit. Specifically, WSNs can be designed as either proactive or reactive. In the proactive mode, the nodes keep their transmitters continuously active and periodically transmit data regardless of any parameters. Such power-hungry transmissions result in inefficient energy utilization. In the reactive mode, by contrast, the nodes respond only to events, for example when a sensed value exceeds a certain threshold or when a specific event has been triggered. Since the nodes only respond to significant changes and keep their transmitters turned off otherwise, this yields a practically convenient system with an extended network lifespan.

1.2 Research Developments in WSNs

The first step in establishing a WSN is the initialization and distribution of sensor nodes over the observation field. Many researchers have advocated a normal distribution of the statically deployed nodes as the optimal choice (e.g., see [12]). The deployment of nodes in the field is then followed by transmission of the required data. Since these nodes are left unattended with limited resources at hand, efficient utilization of the available resources becomes a key operational factor in forming a robust, standalone network.

To tackle this, many protocols recommend clustering the network area, an approach pioneered by W. B. Heinzelman [13]. Fundamentally, clustering divides the field into multiple smaller observation regions, thereby making resource management a comparatively convenient task [14,15,16,17]. However, it requires a fair election of cluster heads (CHs) in each cluster. These CHs are responsible for data fusion, i.e., they receive data from the normal nodes of their respective clusters and forward it to the BS.

Traditionally proposed protocols for WSNs focus mainly on performance improvements via an effective selection criterion for CHs, the choice of single-hop or multi-hop communication between nodes, and whether the clustering scheme should be static or dynamic. Although an optimal combination of these factors yields interesting results, further gains can be achieved by looking at other aspects of the problem. One important yet rarely discussed aspect is degradation due to the environment. A common form of such inevitable degradation affecting data sent over radio links is additive white Gaussian noise (AWGN). Several existing protocols disregard this issue and focus only on minimizing energy consumption, assuming that the data received by nodes has not experienced any noise added by the environment.

1.3 Notations

In the rest of the paper, we use the following notation. We represent all vectors with lowercase boldface letters (e.g., \(\mathbf{y}\)) and all scalars with lowercase normal-font letters (e.g., y). We reserve uppercase boldface letters (e.g., \(\mathbf{Y}\)) for matrices, whereas calligraphic letters (e.g., \({\mathcal {N}}\)) are used for sets. Additionally, we use \(\mathbf{y}_i\), y(j) and \({\mathcal {N}}_k\) to denote the ith column of matrix \(\mathbf{Y}\), the jth element of vector \(\mathbf{y}\), and a subset of \({\mathcal {N}}\), respectively.

1.4 Recoveries via Sparse Representations

Contrary to the traditional Nyquist-Shannon sampling theorem, where one must sample at no less than twice the signal bandwidth, compressive sensing (CS) has emerged as a highly efficient framework for data acquisition and sensor design. The basic idea is that if the data signal is sparse in a known basis, it can be recovered exactly from far fewer measurements, leading to a significant reduction in the number of measurements that need to be stored.

According to CS, the following model can be used to recover an unknown vector \(\mathbf{v}\) from an under-determined system:

$$\begin{aligned} \mathbf{x}= {\varvec{\varPhi }}\mathbf{v}= {\varvec{\varPhi }}{\varvec{\varPsi }}{\varvec{\theta }}= {\varvec{\varTheta }}{\varvec{\theta }}, \end{aligned}$$
(1)

where \(\mathbf{x}\in {\mathbb {C}^M}\) is the observed signal, \(\mathbf{v}= {\varvec{\varPsi }}{\varvec{\theta }}\in {\mathbb {C}^N}\) is the unknown vector, \({\varvec{\theta }}\in {\mathbb {C}^N}\) is the unknown sparse signal which, for example, a node will collect, representing the projection coefficients of \(\mathbf{v}\) on \({\varvec{\varPsi }}\), and \({\varvec{\varTheta }}= {\varvec{\varPhi }}{\varvec{\varPsi }}\) is an \(M \times N\) reconstruction matrix (\(M < N\)). The measurement matrix \({\varvec{\varPhi }}\) is designed such that the dominant information of \({\varvec{\theta }}\) is captured in \(\mathbf{x}\).

Reconstruction algorithms in CS exploit the fact that many signals are inherently sparse, and therefore proceed by minimizing the \({\ell _0},{\ell _1}\) or \({\ell _2}\)-norm over the solution space. Among these, \(\ell _1\)-norm minimization is the most widely accepted approach due to its ability to recover the sparse estimate \(\hat{{\varvec{\theta }}}\) of \({\varvec{\theta }}\) as follows:

$$\begin{aligned} \hat{{\varvec{\theta }}} = \mathop {{\hbox {arg}}\min }\limits _{{\varvec{\theta }}} {\left\| {{\varvec{\theta }}} \right\| _{{\ell _1}}}, \quad \text {subject to} \quad {\mathbf{x}} \approx {\varvec{\varPhi }}{\varvec{\varPsi }}{\varvec{\theta }}. \end{aligned}$$
(2)
Fig. 2. Sparse model

In practice, however, the inevitable presence of noise in wireless channels must also be dealt with. Consequently, the system is modeled as

$$\begin{aligned} \mathbf{y}= \mathbf{x}+ \mathbf{n}= {\varvec{\varTheta }}{\varvec{\theta }}+ \mathbf{n}, \end{aligned}$$
(3)

where \(\mathbf{y}\in {\mathbb {C}^M}\) is the noisy version of the clean signal \(\mathbf{x}\in {\mathbb {C}^M}\) which is corrupted by the noise vector \(\mathbf{n}\in {\mathbb {C}^M}\) with i.i.d. zero mean Gaussian entries having variance \(\sigma _{\mathbf{n}}^2\), i.e., \(\varvec{\mathbf{n}(.) \sim \mathcal {N}(0, \sigma _{\mathbf{n}}^2\mathbf I )}\). A depiction of this model is shown in Fig. 2.
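For concreteness, the short sketch below simulates the noisy model (3) and recovers a sparse estimate in the spirit of (2). It is a minimal illustration only: the random Gaussian \({\varvec{\varTheta }}\), the synthetic sparse \({\varvec{\theta }}\), and the use of iterative soft-thresholding (ISTA) as the \(\ell_1\) solver are our own assumptions, not the recovery algorithm used later in the paper.

```python
import numpy as np

def ista(Theta, y, lam=0.05, n_iter=300):
    """Iterative soft-thresholding for the l1-regularized surrogate of (2):
    min_theta 0.5*||y - Theta @ theta||^2 + lam*||theta||_1."""
    step = 1.0 / np.linalg.norm(Theta, 2) ** 2          # 1 / Lipschitz constant of the gradient
    theta = np.zeros(Theta.shape[1])
    for _ in range(n_iter):
        grad = Theta.T @ (Theta @ theta - y)             # gradient of the data-fidelity term
        z = theta - step * grad
        theta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return theta

rng = np.random.default_rng(0)
M, N, k, sigma_n = 64, 256, 5, 0.01
Theta = rng.standard_normal((M, N)) / np.sqrt(M)         # M x N reconstruction matrix, M < N
theta_true = np.zeros(N)
theta_true[rng.choice(N, k, replace=False)] = rng.standard_normal(k)
y = Theta @ theta_true + sigma_n * rng.standard_normal(M)  # noisy measurement model (3)
theta_hat = ista(Theta, y)
print("recovery error:", np.linalg.norm(theta_hat - theta_true))
```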

1.5 Contribution

Part of this work has already been published in [18]; this paper is an extended version of that work. In [18], we introduced a novel framework to tackle two major concerns in WSNs: (1) performance optimization via efficient energy utilization, and (2) combating the unavoidable Gaussian noise added as a result of repeated communication among the nodes over the wireless channel. We proposed a fast and low-cost collaborative system based on sparse representations, enriched with a layer-adaptive 3-tier communication mechanism. This is supported by an effective CH election method and a mathematically convenient coverage model that minimizes energy and coverage holes. A computationally efficient implementation of the framework is an added benefit that makes it a preferable choice for real-world applications.

To tackle AWGN, the data is transmitted in spatial-domain form and its sparse estimates are computed later at the receiver side. For better denoising, we let nodes situated within a single hop of each other negotiate mutually to improve collaboration. The denoising is further refined by a specially designed averaging filter.

In this paper, we extend the concept of image denoising in wireless sensor networks. Specifically, we propose a region-growing based denoising mechanism in which the entire image is divided into sub-regions based on pixel intensities and a smoothing filter is applied. Motivated by this, we also extend the framework to color images, where we are especially interested in exploiting the inter-channel correlation of each color image. This information plays a crucial role in identifying noisy components, and thereby helps to discard them. Our proposed protocol has the following salient features:

  • The implementation of a mathematically efficient coverage model along with an adaptive CH election method helps avoid coverage holes to a great extent.

  • Our proposed layer-adaptive 3-tier communication system greatly reduces energy holes.

  • To denoise the data signals, we compute support-independent sparse estimates, which relieves us from first finding the distribution of the sparse representations and hence gives the method a support-agnostic nature.

  • Prior collaboration among the nodes during communication yields a significant reduction in energy consumption.

  • The use of a fast sparse recovery technique keeps the computational complexity of our algorithm low.

  • Exploiting the inter-channel correlation among the red, green and blue channels not only makes the method a suitable choice for denoising color images, but also provides a convenient solution for fast and practical image denoising applications.

The rest of the paper is structured as follows: Sect. 2 presents an overview of the related work in this area. We describe our proposed framework and its complexity in Sects. 3 and 4, respectively, while the results from various simulations are discussed in Sect. 5. Finally, Sect. 6 concludes the paper.

2 Related Work

Presently, researchers are concentrating on technologically enriched tools for optimizing network performance, through which the lifetime of WSNs can be increased. This provides a roadway for scientists working in this domain to propose low-cost, energy-efficient, and optimized algorithms [19].

This includes the trend-setting work of W. B. Heinzelman et al., who proposed a cluster-based, energy-efficient communication protocol for WSNs, namely LEACH [20]. The objective was to reduce energy dissipation by introducing randomly elected CHs, which, however, results in an unbalanced CH distribution. Nevertheless, partitioning the network area into different regions via clustering yielded a significant increase in system lifetime. As a principal competitor, a new reactive protocol named TEEN was proposed by the authors of [21] for event-driven applications. This protocol, even though constrained to temperature-based scenarios only, proposed threshold-aware transmissions, thereby outperforming LEACH in terms of network lifespan.

In contrast to the aforementioned homogeneous WSNs, the authors of [22] and [23] proposed SEP and DEEC, respectively, introducing heterogeneous versions of WSNs by allowing a specific set of nodes, called advanced nodes, to carry higher initial energy than the normal nodes. SEP used energy-based weighted election to appoint CHs in a two-level heterogeneous network, ultimately improving network stability. As a stronger contestant, DEEC deployed multi-level heterogeneity and an improved CH election measure to attain a longer network lifespan than SEP.

A. Khadivi et al. proposed a fault-tolerant, power-aware protocol with static clustering (FTPASC) for WSNs in [24]. The network is partitioned into static clusters, and the energy load is distributed evenly over high-power nodes, resulting in minimized power consumption and increased network lifetime. Another static-clustering based protocol, sparsity-aware energy-efficient clustering (SEEC), is proposed in [25]. This protocol uses sparsity and density search algorithms to classify sparse and dense regions. A mobile sink is then exploited, specifically in sparse areas, to enhance network lifetime.

As opposed to static clustering, the authors of [26] presented a centralized dynamic clustering (CDC) environment for WSNs. In this protocol, the clusters and the number of nodes associated with each cluster remain fixed, and a new CH is chosen in each round of communication between the clusters and the BS. CDC showed better results than LEACH in terms of communication overhead and latency. In a similar fashion, G. S. Tomar et al. proposed an adaptive dynamic clustering protocol for WSNs in [27], which creates a dynamic system that can change its topology according to traffic patterns. A mutual negotiation scheme is used between nodes of different energy levels to form energy-efficient clusters, and CHs are periodically selected based on different node characteristics. Another work proposed cooperative and dynamic clustering to achieve energy efficiency [28]. This framework ensured an even distribution of energy and optimized the number of nodes used for event reporting, thereby showing promising results.

D. Jia et al. tackled the problem of unreasonable CH selection in clustering algorithms [29]. The authors considered dynamic CH selection methods the best remedy to avoid overlapping coverage regions. Their experimental results showed longer network lifetime than LEACH and DEEC. Another energy-efficient cluster-based routing protocol, termed density-controlled divide-and-rule (DDR), is proposed in [30]. The authors addressed the coverage and energy hole problems in clustering scenarios by presenting a density-controlled uniform distribution of nodes and optimal selection of CHs in each round. Similarly, a cluster-based energy-efficient routing protocol (CBER) is proposed in [31]. This protocol elects CHs on the basis of optimal CH distance and the nodes' residual energy. CBER is reported to outperform LEACH in terms of network energy consumption and lifetime.

3 Proposed Framework Design

In this section, we provide the reader with a detailed understanding of our proposed routing protocol. We first discuss the widely accepted radio model for communication among nodes. This is followed by a comprehensive explanation of our adopted network configuration and its operational details for energy efficiency and data denoising.

3.1 Wireless Communication Model

For the transmission and reception of data among sensor nodes over the wireless medium, we assume the simple and most commonly used first-order radio communication model shown in Fig. 3. The figure presents the energy consumed by a node while transmitting and receiving data. A packet traveling over radio waves has to combat degrading factors such as noise, multi-path fading, etc. Thus, we also take into account the \(\varvec{d^2}\) path losses that almost every chunk of data faces. This is expressed mathematically as:

$$\begin{aligned} \varvec{E_{Tx}(k,d)} = {\left\{ \begin{array}{ll} \varvec{k}\times \varvec{(E_{elec}}+ \varvec{\epsilon _{fs}}\times \varvec{d^2),} &{} \varvec{d < d_o}\\ \varvec{k}\times \varvec{(E_{elec}}+ \varvec{\epsilon _{mp}}\times \varvec{d^4),} &{} \varvec{d \ge d_o} \end{array}\right. } \end{aligned}$$
(4)
$$\begin{aligned} \varvec{E_{Rx}(k)} = \varvec{E_{elec}}\times \varvec{k,} \end{aligned}$$
(5)

where \(\varvec{d_o}\) is a reference distance, \(\varvec{k}\) is the number of bits in a packet, \(\varvec{d}\) is the transmission distance, which varies for each node and each transmission, \(\varvec{E_{elec}}\) is the energy used for data processing, \(\varvec{\epsilon _{fs}}\) and \(\varvec{\epsilon _{mp}}\) are channel-dependent loss factorsFootnote 1, \(\varvec{E_{Tx}}\) is the energy used by a node for transmission, and \(\varvec{E_{Rx}}\) is the energy used by a node for data reception. As shown, the \(\varvec{d^r}\) losses may change from \(\varvec{\left. d^r \right| _{r=2}}\) to \(\varvec{\left. d^r \right| _{r=4}}\), forcing a higher value of \(\varvec{E_{Tx}}\). A similar increase is then observed in the \(\varvec{E_{Rx}}\) values when processing highly corrupted data in a noisy environment, as in our case. The commonly used energy dissipation values for a radio channel are presented in Table 1.
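A direct reference implementation of (4)–(5) is sketched below. The numerical constants are the values commonly used with this radio model (and typically listed in tables such as Table 1), and the crossover distance \(\varvec{d_o = \sqrt{\epsilon_{fs}/\epsilon_{mp}}}\) is the usual choice; treat both as assumptions rather than values prescribed by this paper.

```python
import math

E_ELEC = 50e-9        # J/bit, electronics energy (typical value, assumed)
EPS_FS = 10e-12       # J/bit/m^2, free-space amplifier loss (assumed)
EPS_MP = 0.0013e-12   # J/bit/m^4, multi-path amplifier loss (assumed)
D_O = math.sqrt(EPS_FS / EPS_MP)   # crossover distance between the d^2 and d^4 regimes

def energy_tx(k_bits, d):
    """Transmission energy E_Tx(k, d) of Eq. (4)."""
    if d < D_O:
        return k_bits * (E_ELEC + EPS_FS * d ** 2)
    return k_bits * (E_ELEC + EPS_MP * d ** 4)

def energy_rx(k_bits):
    """Reception energy E_Rx(k) of Eq. (5)."""
    return k_bits * E_ELEC

# example: a 4000-bit packet transmitted over 60 m and received at the other end
print(energy_tx(4000, 60.0), energy_rx(4000))
```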

Fig. 3. Radio model

Table 1. Energy dissipation measurements

3.2 Network Configuration

For the configuration model, we use a network consisting of \(\varvec{L}\) randomly deployed nodes. Unlike traditional models, we adopt a circular field and propose an optimized area division via adaptive clustering. For a better understanding, see the network model shown in Fig. 4. Here, for simplicity, we use a field area of \(\varvec{A = \pi 150^2}\text { m}^{\varvec{2}}\), i.e., a diameter \(\varvec{D = 300}\) m, with a total of \(\varvec{L = 100}\) deployed nodes. To avoid the formation of energy holes, and thus the death of the network, we place the BS at the center of the network at coordinates \(\varvec{<i,j> = (0,0)}\). The network field is then clustered into coronas, which are further divided into different sensing-based reporting regions. The number of coronas, represented by \(\varvec{\eta }\), is computed beforehand as a function of the field area \(\varvec{A}\), which itself depends on \(\varvec{D}\), and the number of nodes \(\varvec{L}\). As a sound approximation, we propose \(\varvec{\eta = D/L}\). Hence, in our case we use \(\varvec{\eta } = 3\) coronas, denoted by \(\varvec{\eta _1, \eta _2}\) and \(\varvec{\eta _3}\).
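The configuration step can be sketched as follows. This is a minimal illustration: the uniform random deployment over the disc and the equal-width corona boundaries are our own assumptions for demonstration, not part of the protocol specification.

```python
import numpy as np

D, L = 300.0, 100                 # field diameter [m] and number of nodes
A = np.pi * (D / 2) ** 2          # field area
eta = int(D / L)                  # number of coronas, eta = D / L = 3 here
corona_edges = np.linspace(0.0, D / 2, eta + 1)   # equal-width corona boundaries (assumed)

rng = np.random.default_rng(1)
# uniform random deployment over the disc, with the BS at the centre (0, 0)
r = (D / 2) * np.sqrt(rng.random(L))
phi = 2 * np.pi * rng.random(L)
nodes = np.c_[r * np.cos(phi), r * np.sin(phi)]

# corona index (1 .. eta) of every node, from its distance to the BS
corona_of = np.digitize(np.hypot(nodes[:, 0], nodes[:, 1]), corona_edges[1:-1]) + 1
print({c: int((corona_of == c).sum()) for c in range(1, eta + 1)})
```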

Fig. 4. Network model: (a) network configuration, (b) node deployment

Once the \(\varvec{\eta }\) coronas are formed, the next step is to divide each corona into sensing regions as shown in the figure. For better network performance, the distribution is such that each sensing region in an upper-level corona \(\varvec{\eta _\alpha }\) borders two sensing regions in the lower-level corona \(\varvec{\eta _{\alpha -1}}\). This is shown in Fig. 4(a), where, for example, region \(\varvec{R_7}\) in \(\varvec{\eta _3}\) covers both \(\varvec{R_2}\) and \(\varvec{R_3}\) in \(\varvec{\eta _2}\), hence avoiding coverage holes by satisfying the following expression for a general network configurationFootnote 2:

$$\begin{aligned} \varvec{A} = \varvec{\pi (D/2)^2} = \varvec{\sum _{\alpha =1}^{\eta } A_{\eta _{\alpha }}} = \varvec{\sum _{\alpha } A_{R_{\alpha }},}\,\,\,\varvec{\alpha \in } \mathbb {Z}^+\varvec{,} \end{aligned}$$
(6)

where \(\varvec{A_{\eta _{\alpha }}}\) and \(\varvec{A_{R_{\alpha }}}\) represent the area of each corona and each sensing region, respectively. It is worth noting that we do not further divide the corona surrounding the BS, \(\varvec{\eta _1}\) in our case, to avoid unnecessary and poor use of available resources. Thus, we can safely write:

$$\begin{aligned} \varvec{A_{\eta _{1}}} = \varvec{A_{R_{1}}} = \varvec{\pi \beta ^2,} \text { where }\varvec{\beta \in \mathbb {R}^+.} \end{aligned}$$
(7)

3.3 Nodes Deployment and Layer-Controlled CHs Nomination

Once the network is clustered into coronas and sensing regions, the next step is to distribute the nodes randomly over these regions. To optimize resources, a sensible decision is to deploy a balanced percentage of nodes over the different regions so as to minimize coverage holes and prolong network lifetime. In this scenario, we propose to deploy 20\(\%\) of the nodes in region \(\varvec{R_1}\) and to distribute the remaining 80\(\%\) evenly over regions \(\varvec{R_{2,3, ..., 8,9}}\), as shown in Fig. 4(b). This node deployment always depends on the network field area and the number of nodes. Hence, for any other network configuration, an adjusted percentage can be calculated to optimize communication among nodes and to avoid energy and coverage holes.

Following the deployment of nodes and prior to network initialization, the election of CHs is carried out in all regions \(\varvec{R_{2,3, ..., 8,9}}\). Since the use of CHs in clustering techniques plays an important role in improving network lifespan, an effective criterion for CH election is equally necessary to further improve network performance. The most commonly used measures for electing CHs are residual energy and distance from the BS; we propose a blend of both to increase the life of each node. Furthermore, we introduce a layer-based election of CHs: the election takes place in the lower-level coronas \(\varvec{\eta _{\alpha }}\) first and then moves to the higher-level coronas \(\varvec{\eta _{\alpha +1}}\), since this ordering proved effective for CH election. Thus, in each round, all nodes are assessed based on their residual energies and the top 5\(\%\) of nodes with the highest residual energy in their respective regions are shortlisted. These shortlisted nodes then contest against each other, and the node with the smallest distance to the CHs of the two associated regions in the lower-level corona is elected as CH. The nodes in \(\varvec{\eta _2}\) are evaluated in a similar fashion based on residual energy, but the winner is the shortlisted node with the minimum distance to the center of its respective region.
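A simplified sketch of this election rule is given below. The per-node records (residual energy, position) and the choice of the lower-corona CHs' mean position as the distance anchor are our own illustrative assumptions; for \(\varvec{\eta _2}\) the anchor would simply be the region centre.

```python
import numpy as np

def elect_ch(region_nodes, anchor_xy, top_frac=0.05):
    """Shortlist the top `top_frac` of nodes by residual energy, then elect the
    shortlisted node closest to `anchor_xy` (e.g., the associated lower-corona
    CHs, or the region centre for corona eta_2)."""
    energies = np.array([n["energy"] for n in region_nodes])
    n_short = max(1, int(np.ceil(top_frac * len(region_nodes))))
    shortlist = np.argsort(energies)[::-1][:n_short]          # highest residual energy first
    dists = [np.hypot(region_nodes[i]["xy"][0] - anchor_xy[0],
                      region_nodes[i]["xy"][1] - anchor_xy[1]) for i in shortlist]
    return region_nodes[shortlist[int(np.argmin(dists))]]     # closest shortlisted node wins

# toy example: three nodes in one region, anchor at the region centre
region = [{"id": 0, "energy": 0.48, "xy": (40.0, 10.0)},
          {"id": 1, "energy": 0.50, "xy": (70.0, -5.0)},
          {"id": 2, "energy": 0.49, "xy": (55.0, 2.0)}]
print(elect_ch(region, anchor_xy=(50.0, 0.0))["id"])
```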

3.4 Layer-Adaptive 3-Tier Communication Mechanism

For the transfer of data among the nodes, we propose a layer-adaptive 3-tier architecture. Our communication mechanism uses distance-optimized transmissions to avoid wasting energy, and the nodes follow a multi-hop scheme instead of transmitting the data of interest directly to the BS. In the tier-1 phase, all normal nodes gather data and send it to the nearest CH, which is not necessarily the CH of their own region: we allow nodes in \(\varvec{\eta _\alpha }\) to transmit their data to CHs of \(\varvec{\eta _{\alpha -1}}\) as well. This is why the sensing regions are arranged such that each region in an upper-level corona borders two regions in the lower-level corona. However, a node of a region in \(\varvec{\eta _\alpha }\) cannot transmit to another region on the same corona, i.e., it must either send data to its own CH in \(\varvec{\eta _\alpha }\) or to the nearest CH in one of the two bordering regions on the lower-level corona \(\varvec{\eta _{\alpha -1}}\), as explained in Fig. 5.

Fig. 5. 3-tier communication architecture

In the tier-2 phase of communication, the CHs of \(\varvec{\eta _\alpha }\) aggregate their data and then send it to the CHs of \(\varvec{\eta _{\alpha -1}}\). Note that even though the CH of \(\varvec{R_3}\) receives data from the CHs of both \(\varvec{R_7}\) and \(\varvec{R_8}\), this is a blessing in disguise: as shown in the figure, the CHs of \(\varvec{R_7}\) and \(\varvec{R_8}\) have not received data from all nodes in their regions, since some nodes found another, nearer CH, so these CHs aggregate and forward a comparatively smaller load and do not overburden themselves. Moreover, the CHs change in each round according to the election criterion, which ultimately saves energy. Finally, in the tier-3 phase, all CHs in the lower-level coronas send their data to the BS, completing the data transmission process.
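The tier-1 forwarding rule described above reduces to a nearest-allowed-CH choice, sketched below. The candidate set (own-region CH plus the CHs of the two bordering regions one corona below) follows the description above; the data structures and coordinates are illustrative assumptions.

```python
import numpy as np

def next_hop_ch(node_xy, own_ch, lower_chs):
    """Return the nearest allowed cluster head for a normal node: its own
    region's CH or a CH of one of the two bordering regions in the next
    lower corona (no sideways hops within the same corona)."""
    candidates = [own_ch] + list(lower_chs)
    dists = [np.hypot(node_xy[0] - ch["xy"][0], node_xy[1] - ch["xy"][1])
             for ch in candidates]
    return candidates[int(np.argmin(dists))]

node = (120.0, 35.0)
own_ch = {"region": "R7", "xy": (130.0, 40.0)}
lower = [{"region": "R2", "xy": (80.0, 30.0)}, {"region": "R3", "xy": (95.0, 60.0)}]
print(next_hop_ch(node, own_ch, lower)["region"])   # -> R7, the nearest of the three
```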

3.5 Coverage Model

To reduce coverage holes, we express the coverage of the nodes by a mathematical model. All deployed sensor nodes are represented in set notation as \(\varvec{\kappa = \{\mu _1, \mu _2, \mu _3, ..., \mu _L\}}\). The coverage of one alive node \(\varvec{\mu _\alpha }\) belonging to the set \(\varvec{\kappa }\) can be expressed as a disc centered at \(\varvec{<i_\alpha ,j_\alpha >}\) with radius \(\varvec{h_\alpha }\). We let the random variable \(\varvec{\aleph _\alpha }\) denote the event that a data pixel \(\varvec{<a,b>}\) is within the coverage range of node \(\varvec{\mu _\alpha }\). The likelihood of the event \(\varvec{\aleph _\alpha }\), denoted by \(\varvec{P\{\aleph _\alpha \}}\), is then represented as \(\varvec{P_{cov}\{a,b,\mu _\alpha \}}\) and given as follows:

$$\begin{aligned} \varvec{P\{\aleph _\alpha \}} = \varvec{P_{cov}\{a,b,\mu _\alpha \}}= {\left\{ \begin{array}{ll} \varvec{1,} &{} \varvec{(a-i_\alpha )^2} + \varvec{(b-j_\alpha )^2 \le h_\alpha ^2}\\ \varvec{0,} &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(8)

where the equation states that a data pixel \(\varvec{<a,b>}\) lies within the coverage range of a node \(\varvec{\mu _\alpha }\) if the distance between them is no larger than the coverage radius \(\varvec{h_\alpha }\). Moreover, the events \(\varvec{\aleph _\alpha }\) are stochastically independent of one another, i.e., \(\varvec{h_\alpha }\) and \(\varvec{h_\gamma }\) are not related for \(\varvec{\alpha , \gamma \in [1,L]}\) and \(\varvec{\alpha \ne \gamma }\). This gives us the following equations:

$$\begin{aligned} \varvec{P\{\overline{\aleph _\alpha }\}} = \varvec{1-P\{\aleph _\alpha \}} = \varvec{1-P_{cov}\{a,b,\mu _\alpha \},} \end{aligned}$$
(9)
$$\begin{aligned} \varvec{P\{\aleph _\alpha \cup \aleph _\gamma \}} = \varvec{1-P\{\overline{\aleph _\alpha }\cap \overline{\aleph _\gamma }\}}= \varvec{1-P\{\overline{\aleph _\alpha }\}.P\{\overline{\aleph _\gamma }\},} \end{aligned}$$
(10)

where \(\varvec{P\{\overline{\aleph _\alpha }\}}\) denotes the complement of \(\varvec{P\{\aleph _\alpha \}}\), i.e., the probability that \(\varvec{\mu _\alpha }\) fails to cover the data pixel \(\varvec{<a,b>}\). Importantly, this data pixel is covered if any of the nodes in the set covers it; otherwise a coverage hole forms. Hence, the following expression gives the probability that a data pixel lies within the coverage range of at least one node in the set:

$$\begin{aligned} \varvec{P_{cov}\{a,b,\kappa \}} = \varvec{P\{\bigcup \limits _{\alpha =1}^{L}\aleph _\alpha \}}&= \varvec{1 - P\{\bigcap \limits _{\alpha =1}^{L}\overline{\aleph _\alpha }\},}\nonumber \\&\qquad = \varvec{1 - \prod _{\alpha =1}^{L}(1-P_{cov}\{a,b,\mu _\alpha \}).} \end{aligned}$$
(11)

For further convenience, we express the coverage rate as the fraction of the area under coverage, denoted by \(\varvec{Q}\), relative to the overall area of the observation field, as follows:

$$\begin{aligned} \varvec{P_{cov}\{\kappa \}}=\varvec{\sum _{\alpha =1}^{L}\sum _{\gamma =1}^{L}\frac{P_{cov}\{a,b,Q\}}{A}} \end{aligned}$$
(12)
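A direct numerical sketch of (8)–(12) is given below, evaluating the per-pixel coverage probability and the covered fraction of the field on a discrete grid. The grid resolution, sensing radius, and node placement are illustrative assumptions.

```python
import numpy as np

def p_cov_point(a, b, nodes_xy, h):
    """Eq. (11) for a single data pixel (a, b): 1 - prod(1 - P_cov{a,b,mu_alpha})."""
    covered = (nodes_xy[:, 0] - a) ** 2 + (nodes_xy[:, 1] - b) ** 2 <= h ** 2   # Eq. (8)
    return 1.0 - np.prod(1.0 - covered.astype(float))

def coverage_rate(nodes_xy, h, D, grid=200):
    """Fraction of the circular field of diameter D that is covered (analogue of Eq. (12))."""
    xs = np.linspace(-D / 2, D / 2, grid)
    X, Y = np.meshgrid(xs, xs)
    in_field = X ** 2 + Y ** 2 <= (D / 2) ** 2
    cov = np.zeros_like(X)
    for a, b in nodes_xy:                            # union of coverage over all nodes
        cov = np.maximum(cov, ((X - a) ** 2 + (Y - b) ** 2 <= h ** 2).astype(float))
    return cov[in_field].mean()

rng = np.random.default_rng(2)
r = 150 * np.sqrt(rng.random(100)); phi = 2 * np.pi * rng.random(100)
nodes = np.c_[r * np.cos(phi), r * np.sin(phi)]
print(p_cov_point(0.0, 0.0, nodes, 25.0), coverage_rate(nodes, h=25.0, D=300.0))
```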

3.6 Data Denoising

Having taken care of energy efficiency, the second major problem is retrieving the original data. The received data is generally degraded by AWGN and is of little use unless denoised. For this purpose, we propose denoising the data samples via Bayesian-analysis based sparse recovery techniques. To do so, we take into account the data correlation of adjacent nodes and use it as an important piece of information for collaboration among nodes.

We denoise the data in three stages using a CS-based sparse recovery technique. First, the received data is converted to a sparse domain (e.g., a wavelet transform for image data). This is followed by identifying similar and correlated data from adjacent nodes and assigning weights based on the degree of similarity. Using the equivalent sparse representations of the data samples, the probability of active taps is computed, giving the locations of undesired, corrupted supports [32]. With the help of the correlation information, an averaging-based collaborative step is then performed to remove the unwanted noisy components, as shown in the flowchart in Fig. 6. We denote the initially denoised image by \(\bar{\mathbf{X}}_{\varvec{d}}\).

Finally, we apply a specially developed averaging filter to further smooth the data, as discussed in later sections. This filter fundamentally works by finding similar data samples and averaging them to provide a clean estimate of the data. Using a CS-based pre-determined dictionary, an inverse transform is applied to give back the denoised data in its spatial-domain representation as \(\hat{\mathbf{x}} = {\varvec{\varTheta }}\hat{{\varvec{\theta }}}\).

Similarity via Distance vs. Correlation: To obtain similar and correlated data, we first extract samples from the data, for example overlapping patches or blocks of images. Once the overlapping patches are formed, the next step is to find, for each patch, a certain number of similar patches to be used during collaboration. Grouping patches in this way using a similarity measure has led to a number of significant improvements in a wide range of applications such as signal/image/bio-medical processing, computer vision, machine intelligence, etc. (e.g., see [32,33,34,35,36,37,38,39]).

A number of techniques for similarity-based grouping of patches have been proposed in the literature, including self-organizing maps [40], vector quantization [41], and fuzzy clustering [42]; a review can be found in [43]. Recently developed denoising algorithms use a distance-based measure where the similarity between signals is taken as the inverse of the point-wise distance between them; a smaller distance between the signals thus implies a higher similarity and vice versa. The commonly used distance-based similarity measure is the Euclidean distance, as used by state-of-the-art image denoising algorithms like NL-means [44] and BM3D [45].

However, despite being an effective way of finding similarity, Euclidean-distance based similar-intensity grouping has a limitation: it restricts the number of similar patches that can be found. For instance, even though natural images share some structural similarity, the number of similar patches varies. Consequently, in an image with few similar patches, the collaboration is less effective, which degrades the denoising performance, especially in the case of high noise. This creates a bottleneck specifically for lower-resolution images, where finding similar-intensity patches becomes difficult.

To tackle this and obtain a similarity measure that can be used globally, even for lower-resolution images or images with few similar-intensity patches, novel methods have been proposed that improve collaboration through more efficient grouping of similar patches. For example, the authors of [46] search for similar patches by using not only the patch itself but also the noise, proposing the concept of noise similarity, while the authors of [47] propose sequence-to-sequence similarity (SSS), which is an effective way of preserving edge information.

In our case, we address this problem by introducing intensity-invariant grouping. The idea is to stack all patches that have a similar inherent structure without relying on the intensity values, as shown in Stage 01 of Fig. 6. The correlation coefficient is well suited to this purpose. For two random signals \(\varvec{\mathbf{y}_\alpha }\) and \(\varvec{\mathbf{y}_\gamma }\), the correlation coefficient is given as

$$\begin{aligned} \varvec{r(\mathbf{y}_\alpha ,\mathbf{y}_\gamma )} = \frac{ \varvec{cov(\mathbf{y}_\alpha ,\mathbf{y}_\gamma )} }{\varvec{\sigma _{\mathbf{y}_\alpha }\sigma _{\mathbf{y}_\gamma }}}\varvec{,} \end{aligned}$$
(13)

where \(\varvec{-1 \le r(}\mathbf{y}_{\varvec{\alpha }}\varvec{,}\mathbf{y}_{\varvec{\gamma }}\varvec{) \le 1}\). A value close to \(\varvec{1}\) or \(\varvec{-1}\) indicates strong positive or negative correlation, respectively, while a value close to \(\varvec{0}\) indicates weak correlation.
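A small sketch of intensity-invariant grouping via (13) is shown below. The patch size, step, group size, and the random stand-in image are illustrative assumptions; any real image could be substituted.

```python
import numpy as np

def group_by_correlation(patches, ref_idx, n_similar=8):
    """Rank patches by |r| of Eq. (13) against a reference patch and return the
    indices of the most correlated ones (intensity-invariant grouping)."""
    ref = patches[ref_idx].ravel()
    scores = []
    for p in patches:
        c = np.corrcoef(ref, p.ravel())[0, 1]        # correlation coefficient r
        scores.append(0.0 if np.isnan(c) else abs(c))
    order = np.argsort(scores)[::-1]                 # most correlated first (includes ref itself)
    return order[:n_similar]

# toy example: overlapping 8x8 patches from a random stand-in "image"
rng = np.random.default_rng(3)
img = rng.random((64, 64))
patches = [img[i:i + 8, j:j + 8] for i in range(0, 56, 4) for j in range(0, 56, 4)]
print(group_by_correlation(patches, ref_idx=0, n_similar=5))
```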

Selection of the Measurement Matrix: Since we denoise the image patches using sparse estimates obtained from collaborative filtering in the transform domain, the use of an appropriate measurement matrix or dictionary is also a key step. Generally, the dictionary consists of basis vectors through which any patch can be represented as a linear combination of the basis elements. In our case, we represent a patch using the obtained sparse vector and the dictionary, as already shown in Fig. 2.

Decorrelation of the Measurement Matrix: Each patch can be written as a linear combination of basis elements from the dictionary. The columns of this dictionary are derived from a wavelet basis and are normalized to unit norm. Prior to finding the support sets of \(\widehat{{\varvec{\theta }}}_{\varvec{\alpha }}\) via sparse estimation of the patches, we reduce the correlation between the dictionary columns for robust computation and performance. Specifically, we remove weak supports by rejecting highly correlated columns, since the information they encode can easily be encoded by other columns that correlate with them. We denote this by the decorrelation operator as follows:

$$\begin{aligned} {\varvec{\varTheta }}= \varvec{\varGamma }_{\varvec{\tau }}\varvec{(}{\varvec{\varTheta }}\varvec{')} \end{aligned}$$
(14)

where \(\varvec{\varGamma }_{\varvec{\tau }}\varvec{(.)}\) is the de-correlation operator that removes all the columns of \({\varvec{\varTheta }}\varvec{'}\) with correlation greater than \(\varvec{\tau }\).
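One possible realization of \(\varvec{\varGamma }_{\varvec{\tau }}\varvec{(.)}\) is the greedy column pruning sketched below; the keep-first, drop-later ordering is our own assumption rather than a rule specified in the text, and unit-norm columns are assumed so that the inner product equals the correlation.

```python
import numpy as np

def decorrelate(Theta_prime, tau=0.95):
    """Gamma_tau(.) of Eq. (14): drop every column whose absolute correlation with
    an already-kept column exceeds tau. Columns are assumed to have unit norm."""
    kept = []
    for j in range(Theta_prime.shape[1]):
        col = Theta_prime[:, j]
        if all(abs(col @ Theta_prime[:, k]) <= tau for k in kept):
            kept.append(j)
    return Theta_prime[:, kept]

rng = np.random.default_rng(4)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)                         # normalize columns to unit norm
D = np.c_[D, D[:, :5] + 0.01 * rng.standard_normal((64, 5))]   # append 5 near-duplicate columns
D[:, 128:] /= np.linalg.norm(D[:, 128:], axis=0)
print(decorrelate(D, tau=0.95).shape)                  # the near-duplicates are pruned
```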

Gaussianity Property: It should be noted that the Gaussianity of the noisy data received and then aggregated at the receiver (e.g., CHs or BS) should remain intact. This is because, even though our proposed Bayesian-analysis based denoising algorithm is agnostic to the support distribution of the sparse coefficients, it does require the data samples to be corrupted by collectively Gaussian noise. A concise statement of this is provided in Lemma 1 below to support the accuracy of our denoising algorithm.

Lemma 1

The aggregated data samples received at either the CHs or the BS preserve the Gaussianity property; hence, we can denoise the cumulative version of the AWGN-corrupted data.

Proof

To show this, we consider two independent Gaussian random data samples \(\varvec{P}\) and \(\varvec{Q}\) sent by nodes \(\left. \varvec{\mu _\alpha } \text { and } \varvec{\mu _\gamma ,} \text { both } \varvec{\in \kappa }\right. \). For the data aggregated at a CH, we let \(\varvec{\left. Z=\rho P+\delta Q\right. }\). Without loss of generality, let \(\varvec{\rho } \text { and }\varvec{\delta }\) be positive real numbers, because for \(\varvec{\rho <0}\), \(\varvec{P}\) would be replaced by \(\varvec{-P}\) and we would write \(\varvec{\mid \rho \mid }\) instead of \(\varvec{\rho }\). The cumulative distribution function can be written as:

$$\begin{aligned} \varvec{F_Z(z)}=\varvec{P\{Z\le z\}}&=\varvec{P\{\rho P}+\varvec{\delta Q\le z\}} \nonumber \\&\qquad \qquad \qquad =\varvec{\int \int _{\rho P+\delta Q\le z}\varphi (p)\varphi (q) dpdq} \end{aligned}$$
(15)

where \(\varvec{\varphi (.)}\) represents the standard Gaussian density function. Since the integrand \(\varvec{(2\pi )^{-1}\exp (-(p^2+q^2)/2)}\) possesses circular symmetry, the value of this integral depends only on the distance of the line \(\varvec{\left. \rho p+\delta q=z\right. }\) from the origin. Consequently, using a rotation of coordinates, we can conclude

$$\begin{aligned} \varvec{F_Z(z)}\qquad =\qquad \varvec{\int _{p=-\infty }^{\zeta }\int _{q=-\infty }^{q=\infty }\varphi (p)\varphi (q)dpdq}\qquad =\qquad \varvec{\varDelta (\zeta )} \end{aligned}$$
(16)

where \(\varvec{\zeta }=\varvec{\frac{z}{\sqrt{\rho ^2+\delta ^2}},}\) and \(\varvec{\varDelta (.)}\) denotes the standard Gaussian CDF. Hence, \(\varvec{Z\vert _{L=2}}\) is a zero-mean Gaussian random variable with total variance \(\varvec{\left. \rho ^2+\delta ^2\right. }\).
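A quick numerical sanity check of Lemma 1 (not part of the proof) is sketched below: the weighted sum of independent zero-mean Gaussians should again be zero-mean Gaussian with variance \(\varvec{\rho ^2+\delta ^2}\) when the summands have unit variance.

```python
import numpy as np

rng = np.random.default_rng(5)
rho, delta = 0.7, 1.3
P = rng.normal(0.0, 1.0, 200_000)
Q = rng.normal(0.0, 1.0, 200_000)
Z = rho * P + delta * Q                      # data aggregated at a CH (the L = 2 case)

sigma_z = np.sqrt(rho ** 2 + delta ** 2)     # standard deviation predicted by Lemma 1
# compare empirical quantiles of Z with those of N(0, rho^2 + delta^2)
qs = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
emp = np.quantile(Z, qs)
theo = sigma_z * np.array([-1.2816, -0.6745, 0.0, 0.6745, 1.2816])   # standard normal quantiles
print(np.round(emp, 3), np.round(theo, 3))
```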

Fig. 6. Flowchart of data denoising

Fig. 7. An example of dividing the Cameraman image into 64 intensity groups/bins (left to right): first row, groups 1–8; second row, groups 9–16; third row, groups 17–24; fourth row, groups 25–32; fifth row, groups 33–40; sixth row, groups 41–48; seventh row, groups 49–56; eighth row, groups 57–64

3.7 Region Growing Based Smoothing Filter

As a final step for removing the noisy components from the image, we apply a region growing method on the output image of the previous process. For this image, we store the pixels in a number of bins based on their intensity levels. For instance, we assign group 1 to pixels with intensities in the range 0–3, group 2 to intensities 4–7, and so on. We do this for all pixels and, as a result, create different bins in which the pixels and their locations are stored. We show an example of applying such intensity-leveling to the Cameraman image in Fig. 7, where all intensity groups/bins are displayed as binary images in which the white pixels correspond to the pixels of the Cameraman image belonging to the relevant group.

For each bin, we apply the region growing algorithm to find the connected pixels within that bin, i.e., locally similar-intensity pixels are identified first. Afterwards, if the number of connected pixels in a bin exceeds a certain threshold, we replace those connected pixels by their mean. We repeat this process for all bins, which ultimately provides the region-growing processed image that we denote by \(\bar{\mathbf{X}}_{\varvec{r}}\). Finally, we obtain the denoised image \(\bar{\mathbf{X}}\) as the weighted average of the image \(\bar{\mathbf{X}}_{\varvec{d}}\) from the denoiser and the region-growing processed image \(\bar{\mathbf{X}}_{\varvec{r}}\) as follows:

$$\begin{aligned} \bar{\mathbf{X}} = \varvec{\varrho _1} \bar{\mathbf{X}}_{\varvec{d}} + \varvec{\varrho _2} \bar{\mathbf{X}}_{\varvec{r}}\varvec{,} \end{aligned}$$
(17)

where \(\varvec{\varrho _1}\) and \(\varvec{\varrho _2}\) are the weights which are a function of the noise variance.
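A compact sketch of the binning, region-growing smoothing, and the weighted combination (17) is given below. The bin width, 4-connectivity, size threshold, and the specific dependence of the weights on the noise level are illustrative assumptions; connected components are found here with scipy.ndimage.label.

```python
import numpy as np
from scipy import ndimage

def region_growing_smooth(img, bin_width=4, min_size=20):
    """Bin pixels by intensity, find connected components inside each bin,
    and replace sufficiently large components by their mean intensity."""
    out = img.astype(float).copy()
    bins = (img // bin_width).astype(int)            # intensity group of every pixel
    for g in np.unique(bins):
        mask = bins == g
        labels, n = ndimage.label(mask)              # connected pixels within this bin
        for lbl in range(1, n + 1):
            comp = labels == lbl
            if comp.sum() >= min_size:
                out[comp] = out[comp].mean()         # smooth the connected region
    return out

def combine(X_d, X_r, sigma_n, sigma_max=100.0):
    """Weighted average of Eq. (17); the weights' dependence on sigma_n is assumed."""
    rho2 = min(sigma_n / sigma_max, 1.0)             # lean more on X_r at high noise
    return (1.0 - rho2) * X_d + rho2 * X_r

rng = np.random.default_rng(6)
X_d = rng.random((64, 64)) * 255                     # stand-in for the denoiser output
X_r = region_growing_smooth(X_d)
X_bar = combine(X_d, X_r, sigma_n=25.0)
```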

3.8 Effective Collaboration via RGB Channels of Color Images

As opposed to single-channel grayscale images, color images have three channels, R, G, and B, which provide a more advanced way for the patches to collaborate. Since finding similar patches through more effective approaches is key to such collaboration, the three channels of a color image supply an important piece of information in the form of inter-channel correlation that can be used to identify similar patches.

To understand this, consider the three R, G, and B channels of the standard Mandrill image, shown in Fig. 8 as separate images. Since the additive white Gaussian noise is independent across the three channels, we denoise the color image by denoising each channel separately. Rectangular patches are formed for all three channels. To denoise a patch in a specific channel of the observed color image, once the patches are extracted, similar patches are grouped together by taking into account information from both the reference channel and the other two channels.

Fig. 8. A depiction of collaboration among patches across all three channels

For example, in Fig. 8, to denoise the reference patch, denoted by 'R', from the red channel, similar patches are first grouped from the red channel. This identifies the similar patches and gives a set containing the indices of the similar patches. Using this set from the red channel, the similar patches in the other channels are also identified for this specific patch. The reference patch in the red channel may then collaborate with the patches from all channels. Since the idea is to refine the probabilities of active taps by using sparse vectors that may share the same support, finding similar patches using all three channels can be very effective. The grouped patches from all channels are then used to estimate the sparse vectors, which in turn are used to obtain the denoised patches. These steps are performed for all patches in all three channels, which ultimately provides a denoised color image.
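The cross-channel grouping idea can be sketched as follows, reusing the correlation-based grouping from (13). The H×W×3 array layout, patch size, and the random stand-in image are illustrative assumptions.

```python
import numpy as np

def extract_patches(channel, size=8, step=4):
    """Overlapping size x size patches of a single channel, in raster order."""
    h, w = channel.shape
    return [channel[i:i + size, j:j + size]
            for i in range(0, h - size + 1, step)
            for j in range(0, w - size + 1, step)]

def cross_channel_group(img_rgb, ref_idx, ref_channel=0, n_similar=8):
    """Find the most correlated patches in the reference channel, then reuse the
    same patch indices in the other two channels for joint collaboration."""
    ref_patches = extract_patches(img_rgb[:, :, ref_channel])
    ref = ref_patches[ref_idx].ravel()
    scores = [abs(np.corrcoef(ref, p.ravel())[0, 1]) for p in ref_patches]
    idx = np.argsort(scores)[::-1][:n_similar]       # similar-patch indices from the reference channel
    groups = {}
    for c in range(3):
        patches_c = extract_patches(img_rgb[:, :, c])
        groups[c] = [patches_c[i] for i in idx]      # same positions in all three channels
    return groups

rng = np.random.default_rng(7)
noisy_rgb = rng.random((64, 64, 3))                  # stand-in for a noisy color image
stacks = cross_channel_group(noisy_rgb, ref_idx=0)
```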

4 Computational Complexity

The computational complexity of our proposed framework is dominated by that of the sparse recovery algorithm we use, which fortunately has low computational complexity compared to other similar algorithms for sparse recovery. With the dimensions of our problem at hand, the complexity of estimating one \(\varvec{{\varvec{\theta }}_\alpha }\) via the sparse recovery algorithm is of order \(\varvec{\mathcal {O}(MN^2\varUpsilon )}\), where \(\varvec{\varUpsilon }\) is the expected number of non-zeros, which is generally very small.

5 Results and Discussions

In this section, we compare our proposed scheme with state-of-the-art and traditional routing protocols such as LEACH [20], TEEN [21], SEP [22], DEEC [23] and DDR [30]. We use the values given in Table 1, and our experimentation is divided into two main scenarios: (1) efficient resource utilization, and (2) data denoising. The comparison is carried out over \(\left. \varvec{L = 100, 1000} \text { and } \varvec{10000}\right. \) nodes with the following metrics: stability and instability period, network lifetime, energy consumption, computational complexity, peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) index.

Fig. 9. Stability period

Fig. 10. Instability period

Fig. 11. Energy utilization comparison

Fig. 12. Network lifetime

Fig. 13. Computational overload comparison

A comparison of the stability period for \(\varvec{L = 100}\) is shown in Fig. 9. This figure demonstrates the number of alive nodes over 8000 sensing rounds. It is evident from the figure that our proposed scheme significantly outperforms all the other protocols. The first node die time of our approach is around round 2900, while that of LEACH, TEEN, SEP, DEEC and DDR is around 800, 1900, 1600, 2000, and 1400, respectively. Similarly, Fig. 10 illustrates the all node die time (ADT) of these protocols for \(\varvec{L} = 100\). It can be clearly seen that the ADT of our method is better than that of LEACH, TEEN, SEP, DEEC and DDR by roughly \(\sim \)6390, \(\sim \)5290, \(\sim \)5490, \(\sim \)5190 and \(\sim \)4290 rounds, respectively. Our scheme thus provides the best ADT and, hence, is the most suitable candidate for practical applications.

We provide a comparison of energy-efficient resource utilization in Fig. 11. All protocols start with the same energy levels; however, thanks to the optimized communication method, our scheme demonstrates outstanding results, beating all contestants. In Fig. 12, we compare the network lifetime of our proposed method with LEACH and DDR for \(\left. \varvec{L = 100, 1000} \text { and } \varvec{10000}\right. \). The results confirm that our protocol remains equally competitive in large-scale network scenarios, outperforming each of the traditional methods.

The complexity of our approach is dominated by the communication, yielding a convenient implementation of our method compared with other protocols, as shown in Fig. 13. We compare the computational time consumed by the contestant methods on a \(\left. \text {2.20 GHz Intel Core i7-3632QM}\right. \) machine for different numbers of nodes. The figure demonstrates the robustness of our protocol through its superior performance, making it a highly preferable choice for real-time applications.

Table 2. Comparison of denoising image data samples in terms of PSNR/SSIM

Finally, detailed denoising results for various standard images are shown in Figs. 14, 15, 16, 17, 18, 19 and 20, and summarized in Table 2. We adopt the widely used PSNR and SSIM as evaluation metrics to show that the denoising part of our proposed framework produces equally promising outcomes. The table summarizes the denoising resultsFootnote 3 of a number of images, as PSNR/SSIM, over a range of noise levels, i.e., \(\left. \varvec{\sigma _n = [10,15,20,50,100]}\right. \). We also present extensive denoising results for color images in Fig. 21. As can be seen in these figures and the table, the recovered images are a very good approximation of the original images, thereby verifying the effectiveness of our proposed framework.

Fig. 14. Denoising the \(\varvec{256 \times 256}\) grayscale Lena standard test image over noise \(\varvec{\sigma = [5,10,15,20,25,50,100]}\) when received at a node \(\varvec{\mu _\alpha }\); PSNR [dB] and SSIM results are shown as graphs.

For experimentation, we transmitted various images among the deployed nodes and showed that the images arriving at the receiver suffer from Gaussian noise. The PSNR and SSIM values of the corresponding received noisy images are shown in the table. Compared with these, our denoised images achieve a significant improvement: the noise is removed and the actual data is recovered to a great extent. Consequently, these results confirm that our proposed framework is indeed an effective and robust model for real-time WSN scenarios that outperforms many traditionally proposed routing protocols.

Fig. 15. Denoising the \(\varvec{256 \times 256}\) grayscale Barbara standard test image over noise \(\varvec{\sigma = [5,10,15,20,25,50,100]}\) when received at a node \(\varvec{\mu _\alpha }\); PSNR [dB] and SSIM results are shown as graphs.

Fig. 16. Denoising the \(\varvec{256 \times 256}\) grayscale House standard test image over noise \(\varvec{\sigma = [5,10,15,20,25,50,100]}\) when received at a node \(\varvec{\mu _\alpha }\); PSNR [dB] and SSIM results are shown as graphs.

Fig. 17. Denoising the \(\varvec{256 \times 256}\) grayscale Peppers standard test image over noise \(\varvec{\sigma = [5,10,15,20,25,50,100]}\) when received at a node \(\varvec{\mu _\alpha }\); PSNR [dB] and SSIM results are shown as graphs.

Fig. 18. Denoising the \(\varvec{256 \times 256}\) grayscale Cameraman standard test image over noise \(\varvec{\sigma = [5,10,15,20,25,50,100]}\) when received at a node \(\varvec{\mu _\alpha }\); PSNR [dB] and SSIM results are shown as graphs.

Fig. 19. Denoising the \(\varvec{256 \times 256}\) grayscale Living Room standard test image over noise \(\varvec{\sigma = [5,10,15,20,25,50,100]}\) when received at a node \(\varvec{\mu _\alpha }\); PSNR [dB] and SSIM results are shown as graphs.

Fig. 20. Denoising the \(\varvec{256 \times 256}\) grayscale Mandrill standard test image over noise \(\varvec{\sigma = [5,10,15,20,25,50,100]}\) when received at a node \(\varvec{\mu _\alpha }\); PSNR [dB] and SSIM results are shown as graphs.

Fig. 21. Denoising color images by the proposed color denoising method. 1st column: original images; 2nd and 3rd columns: noisy and denoised images at \(\varvec{\mathcal {N} (0,50)}\); 4th and 5th columns: noisy and denoised images at \(\varvec{\mathcal {N} (0,40)}\); 6th and 7th columns: noisy and denoised images at \(\varvec{\mathcal {N} (0,30)}\).

6 Conclusions

In this work, we presented a framework that ensures energy efficiency and data denoising in a wireless sensor network. Our system is enriched with a layer-adaptive, 3-tier communication mechanism for effective and energy-efficient communication among the nodes, ultimately minimizing energy holes. The presented mathematical coverage model effectively deals with the formation of coverage holes, thereby yielding a robust network. To combat noise in the data, we proposed a collaborative transform-domain denoising algorithm to remove the unwanted components. As shown by extensive simulation results, our framework outperforms traditional algorithms by a significant margin and provides a computationally desirable algorithm for real-time applications.

As a future direction, the current work can be enhanced using recently proposed deep learning models for the denoising task. Specifically, since some traces of noise remain in the data, low-level deep features from convolutional neural networks (CNNs) could effectively represent patches of the corrupted images. Another interesting direction is to extract features in the transform domain and feed them to the CNN instead of the images themselves. This would help the model train well by learning the underlying structures within the images.