1 Introduction

In recent years, wireless sensor networks (WSNs) are increasingly being used to ensure reliable monitoring and analysis of unknown and untested environments. WSNs consist of collections of tiny disposable low-power unattended devices equipped with programmable computing, multiple-parameter-sensing, and wireless communication capacity that can communicate either among each other or directly to a central base station (Al-Karaki and Kamal 2004; Kuorilehto et al. 2008; Swami et al. 2007). These collections can be used to measure ambient conditions and to monitor areas for object or event detection in many applications. Some of the most important applications are: border zone control (BZC), battlefield surveillance, fire prevention/detection, health care applications, environment monitoring, traffic control, etc.

Even though WSNs are being used in many situations and will become ubiquitous and indispensable as the Internet, they have significant operating constraints which open the opportunity to devise novel technical solutions (Al-Karaki and Kamal 2004; Rmer and Friedemann 2004). WSNs are limited in bandwidth, processing and storage capacities, but the most critical is the limitation in energy, which directly affects the network lifetime. In a traditional setting, each sensor is connected to the central base station by a wired connection. This connection is used to send signals controlling the sensor, to supply energy to the sensors, and to receive the information from the sensor. However, in many cases, wired connections are difficult, expensive or even impossible to set up. In such situations, wireless sensors are the most viable solutions. These sensors must be designed in such a way that they are able to communicate with each other and with the base station. For example, in the case of BZC, we can envision a network consisting of hundreds to thousands of sensors which are deployed in a plane. However, BZC security is not feasible if the nodes placement needs to employ a deterministic deployment of sensors to achieve optimal placement (Toumpis and Tassiulas 2006). The sensor constraints combined with a typical deployment of large number of sensors create many challenges to the management of WSNs. Consequently, innovative techniques that reduce energy inefficiency are essential (Sengupta et al. 2013; Rawat et al. 2013).

In the design phase of the WSN, it is critically important to place the sensors and set up routing protocols to maintain connectivity and maximize the network lifetime (Barragan and Gonzalez 2008; Yoon et al. 2007). Since there is little control on the sensor placement, localization in WSNs has been extensively studied (Wang et al. 2008). However, since localization of the network services can consume precious energy resources, we could take a different stance. Here, we are more concerned about harvesting information from the sensors more than the exact localization of a sensor. Hence, we refer to optimal placement in the sense of the distribution or density of sensors over areas or regions, and not on placing sensors at specific geographical coordinates.

In this paper, we propose a strategy for the optimal placement of sensors to maximize the network lifetime under the constraint that connectivity is preserved. In most WSNs, we cannot supply a sensor with energy. As a result, each sensor has to rely on its original (limited) supply of energy for sensing, computing and communication operations. In some cases, it is possible to attach an additional energy source, like solar cells, to the sensor, but this may not be suitable in applications where dependence on environmental conditions (i.e., weather) is not desired. To testify our proposed strategy, a computer-based model is built using OpNet discrete event simulator (riverbed.com 2017). A comparison shows that our optimization strategy outperforms the other strategies or techniques. This is because the energy consumption in the network using our strategy tends to be uniformly distrusted.

2 Sensor placement and routing

In the following discussion, we assume the following characteristics of the network:

  1. 1.

    Homogeneous sensors (same hardware and software features).

  2. 2.

    Fixed messages length and constant channel speed.

  3. 3.

    There are multiple paths from the sensors to the base station.

  4. 4.

    The flow of information is always towards the base station.

In many practical cases, data collection is the principal purpose of a WSN. However, there are many applications where extracting the raw information from the network may be more valuable and/or human interpretation is legally required. Thus, a significant amount of work describing in-network collaborative signal and information processing should be mentioned. Hence, we focus here on the case where sensors must communicate their results to the base station. For a wireless sensor, this implies sending a radio frequency (RF) signal which requires a certain amount of energy as well as processing information. In general, when an RF signal propagates through the air, its power (P) decreases with distance r as \((P/r^2)\).

If we consider a fixed message length, L, and a constant channel speed , C, it results in a constant time, \(t_0\), for the transmission of a signal. This implies that the energy density is directly proportional to the power. Thus, the energy change with the distance, r, is \(E/r^2\). So, to ensure that the signal is detected by the receiver, we must make sure that the signal energy density, (\(E/r^2\)), must exceed some detection threshold \(d_0\). In principle, it is possible to send the signals directly to the central base station. In this arrangement, a sensor located at a distance, r, from the base station must send signals of energy, E, for which the resulting density, \((E/r^2)\), must exceed the sensitivity level, \(d_b\), of the base station (i.e., for which \(E/r^2 = d_b\)). In this arrangement, the energy of each signal must therefore satisfy the inequality \(e_s \ge E_0 \triangleq r^2 \cdot d_b\).

Although the transmission range of the sensors is large and hundreds to thousands sensors in the network are considered, a reasonable size area is monitored, for instance, BZC applications. Thus, for the sensors whose distance r from the central station is reasonably large, the direct-to-base-station arrangement requires a large energy \(E_0\). Since each sensor has a limited supply of energy, this arrangement will drain the sensor energy really fast. To avoid this problem, it is better to set up communications in such a way that each sensor only sends low-energy signals. A sensor which is close to the base station can send the signal directly to it, but sensors which are further away from the base station can only send signals to other sensors as described in Fig. 1. In this scenario, the transmission range of two nodes is represented by a circle; one of them can transmit directly to the base station while the other sends the signal through other sensors. Furthermore, in many monitoring situations, it is necessary to transmit signals to the base station at a high speed. To achieve such a speed, the routed signals follow the shortest path from the sensor to the base station.

Fig. 1
figure 1

Transmission range of sensors

It is worth mentioning that each retransmission drains energy from the sensor and delays the signal. Hence, to speed up the communication and to maximize the lifetime of the sensor network, it is desirable to minimize the number of retransmissions. On the other hand, the shorter the distance covered by each retransmission, the more retransmissions are needed. Thus, to minimize the number of retransmissions, we must make sure that the distance is as large as possible, as depicted in Fig. 2. The retransmission distance is limited by the bounded energy of the transmitted signal and by the limited ability of a sensor to receive signals from a distant sensor. The energy of the signal sent by a sensor is denoted by \(e_s\). Accordingly, the energy density of this signal that changed with distance r is \(e_s/r^2\).

Fig. 2
figure 2

Increasing the distance covered by each retransmission

We define \(r_b\) as the largest distance at which the signal sent by a sensor can be received by the base station. It is easy to see that \(e_s/r_b^2= d_b\). Hence, we can compute the communication range as \(r_b=\sqrt{e_s/d_b}\). Similarly, the sensors located at the distances larger than \(r_b\) cannot directly reach the base station. Henceforth, their signals have to go first to intermediate sensors for retransmission. If \(d_s\) denotes the minimal energy density at which the signal can be received by a sensor, then the largest distance \(r_s\) at which the signal sent by a sensor can be received by other sensors is determined by \(r_s=\sqrt{e_s/d_s}\).

Based on this propagation model, we assume that all sensors within \(r_b\) from the base station establish a direct link with the base station, while all sensors located beyond a radius \(r_b\) transmit their signals to other sensors which are closer to the base station where the inter-sensor distance is approximately \(r_s\). This is illustrated in Fig. 3. The intermediate sensors should then re-transmit signals until they reach the base station, so that the end user can access the information.

Fig. 3
figure 3

Transmission and retransmission of signals

3 Formulation of the optimization problem and related work

Depending on the intended application, two possible formulations of the problem are considered: short-term monitoring and long-term monitoring. In some practical situations, we are interested in reasonably short-term monitoring. In such situations, we know the intended duration of the monitoring period, T, and we aim at minimizing the cost of monitoring. The monitoring cost is defined as the overall number of sensors N under the constraint that within the corresponding placement the sensors are guaranteed to function during the whole period T. In other situations (e.g., in border security), we are interested in long-term monitoring. In such situations, the existing resources do not allow deploy a sensor network that would guarantee to work for the entire intended duration period. Thus, the objective is then to maximize the network lifetime within the resource limitations because of the number of sensors. The resources limitations and energy consumptions in the WSN and their impact on the network lifetime have driven many studies, researches, etc., to enhance the network lifetime and performance (Toumpis and Tassiulas 2006; Zhao and Gurusamy 2008; Xu et al. 2005; Yong et al. 2006; Cheng et al. 2004; Al-Karaki and Gawanmeh 2017). In summary, given N sensors, our goal will be to maximize the lifetime T of the network.

To solve the aforementioned optimization problems, we must therefore first provide models that illustrate the spatial distribution of the sensors deployment. Constantly, we derive a representation of the total number of sensors N and the WSN lifetime T based on these models.

The primary aim of this paper is to derive optimal placement of sensor nodes over large geographical regions. The sensor placement can be characterized by describing the sensors density. In other words, for each spatial location \(\vec {x}\), we can describe the number of sensors in its neighboring area, by a density function \(\rho (\vec {x})\). Suppose we spatially divide the region of deployment to cells, then the overall number of sensors N can be obtained by adding the number of sensors placed on these cells. As we decrease the size of the cells, we can express N as

$$\begin{aligned} N= \int {\rho (\vec {x})\cdot d\vec {x}} \end{aligned}$$
(1)

The sensor network is active if at each location there are active sensors that can still detect the signals emerging at this location. Once at some location \(\vec {x}\) all sensors exhaust their energy, signals generated at this location are no longer detected and thus the network is no longer \(100\%\) efficient. We define the lifetime T of a sensor network as the smallest lifetime of all the sensors which form this network (Schurgers and Srivastava 2001). If we denote the lifetime of a sensor at the location \(\vec {x}\) by \(T(\vec {x})\), then \(T = min_{\vec {x}} T(\vec {x})\).

The main objective of the sensor network is to detect signals. Thus, to determine the sensor location, one must expect the number of signals generated at different locations. The optimization scheme utilized in this work is similar to those presented in Yang et al. (2016a) and Yang et al. (2015) . For instance in Yang et al. (2016b), multi-stratum resource optimization (MSRO) was proposed to maximize radio coverage and meet the QoS requirement for cloud-based radio over optical fiber networks (C-RoFN). Yang et al. (2015) presents a novel cross stratum optimization (CSO) architecture in elastic data center optical interconnection. The CSO architecture can allow global optimization and control across elastic optical network and data center application stratum heterogeneous resources to implement the optical as a service (OaaS). In Yang et al. (2016a), the authors proposed a novel MSRO architecture with network functions virtualization for C-RoFN using software defined control.

Now, lets start by considering an idealized situation where a complete information about the signals is readily available. However, in realistic situation partial knowledge about sensor locations is available. Similarly, for the distribution of sensors, the location of signals is also described by the corresponding density. Namely, it is assumed that for each spatial location \(\vec {x}\), we know the number \(\rho (\vec {x})\) of RF signals (i.e., network packets) per unit area and per moment of time generated in the vicinity of this location.

The main drain on sensor energy is the signal transmission. As was previously mentioned, all sensors in the network are assumed to have the same hardware characteristics. They start with the same initial amount of energy and consume the same amount of energy in every transmission and retransmission. If \(E_s\) denotes the initial amount of energy in a sensor, and \(e_s\) denotes the amount of energy required for a single transmission/retransmission, then each sensor can make \(N_s \triangleq E_S/ e_s\) transmissions/retransmissions during its lifetime. Note that \(N_s\) is an upper bound on transmissions/retransmissions that does not consider acquisition and in-sensor processing. We made this simplification based on the predominance of the radio transmissions on energy consumption.

Now, if the sensor exhausts its battery power, it can no longer be able to transmit/retransmit signals. As a result of this inability, the signals generated in the vicinity of this sensor are no longer detected. Hence, to find out the lifetime \(T(\vec x)\) of a sensor, we must find out the number of signals per second \(S(\vec x)\) that this sensor can detect and/or retransmit. The lifetime can then be determined as the time after which the sensor performs \(N_s\) transmissions (i.e., \(T(\vec x)\cdot S(\vec x)= N_s\)). Thus,

$$\begin{aligned} T(\vec x) = N_{s}/S(\vec x) \end{aligned}$$
(2)

To find the number \(S(\vec x)\) at a point \((\vec x)\), let’s first recall why we need retransmissions in the first place, and how retransmissions are organized. For simplicity, let’s use the coordinate system, whose origin is the base station (i.e., the base station coordinate is \(\vec 0 = (0 , 0)\)). The transmissions are coming from the signals generated in the direct vicinity of \(\vec x\) (i.e., the area radius is \(r_s\)). The retransmissions are coming from all the sensors for which the path to the base station comes through location \(\vec x\). It is worth mentioning that some signals in transmission/retransmission process might be handled by other sensors in its vicinity.

The following analogy can visually clear the aforementioned description. If one places a source of light at the location \(\mathbf {0}\) (i.e., the base station), and assumes that the sensor located at a point \(\vec x\) is solid and not transparent to this light, then the shadow of this sensor consists exactly of the points for which the shortest path to the base station passed through this sensor. In these terms, the number of transmissions and retransmissions performed by the sensor is equal to the total number of signals generated in the vicinity of this sensor and in the area of its shadow, see Fig. 4. In order to describe this shadow in geometric terms, the shortest path on a plane is a straight line. Thus, the shortest path from an arbitrary location \(\vec y\) to the base station \(\mathbf {0}\) is a segment of the straight line connecting \(\mathbf {0}\) and \(\vec y\). The points on this segment can be described by the expression \(\beta \cdot \vec y\) where \(\beta \epsilon [0,1]\). The condition that this shortest path passed through the given sensor location \(\vec x\) means that \(\beta \cdot \vec y= \vec x\) (i.e., \(\vec y = \lambda \cdot \vec x\) for the value \(\beta = \lambda ^{-1}\)). Since \(\beta \le 1\), we have \(\lambda \ge 1\). Vice versa, for each \(\lambda \ge 1\), the shortest path from a point \(\vec y = \lambda \cdot \vec x\) to the base station \(\mathbf {0}\) consists of all points \(\beta \cdot (\lambda \cdot \vec x)\). In particular, for \(\beta = \lambda ^{-1}\), this shortest path contains the location \(\vec x\) of the sensor under consideration.

Fig. 4
figure 4

Sensor at location \(\vec x\) with its shadow

Thus, in geometric terms, locations from which the shortest path to \(\mathbf {0}\) go through the point \(\vec x\) have the form \(\lambda \cdot \vec x\) for \(\lambda \ge 1\). Based on this geometric analysis, one can evaluate the number of transmissions and retransmissions handled by a sensor at a location \(\vec x\). Let \(\vec {r} {\mathop {=}\limits ^{\text {def}}}|\vec x|\) denote the distance from this sensor to the base station and \(\alpha\) be a small angle covered by this vicinity. Now, if we start with the \(r_s\)-vicinity of this sensor, we assume that all the sensors within this angle are at a distance \(\epsilon [r - \frac{r_s}{2}, r + \frac{r_s}{2}]\) from the base station, see Fig. 5. The width of this region is approximately equal to \(\alpha \cdot r\), its length is \(r_s\), and thus its area is \(\alpha \cdot r\cdot r_s\). Since the sensors density in the vicinity of the location \(\vec x\) is \(\rho (\vec {x})\) sensors per unit area, there are \(n= \rho (\vec {x})\cdot \alpha \cdot r\cdot r_s\) sensors in this region. These sensors must handle all the signals from the entire shadow area. The total number of these signals per second can be found by adding up all the signals generated in the area at different distances from the base station.

Fig. 5
figure 5

Vicinity \(r_s\) of the sensor \(\vec x\)

In a zone shown in Fig. 5, where the distances are between R and \(R+dR\) from the base station, the zone width and length are \(\alpha \cdot R\) and dR, respectively. Thus, its area is given by \(\alpha \cdot R\cdot dR\). This area includes a point on the same axis as our original sensor (i.e., a point with coordinates \(\vec x \cdot (R/r)\)). In the vicinity of this point, the number of signals generated per area and per unit of time is equal to \(\rho _{s}(\vec {x}\cdot (R/r))\). Thus, within the entire zone between distances R and \(R+dR\), \(\rho _{s}(\vec {x}\cdot (R/r))\cdot \alpha \cdot R\cdot dR\) signals are generated per unit time. The total number \(S_t\) of signals generated per unit time can be obtained by adding up the amounts from all these zones. Thus, based on this model, this number can be computed as

$$\begin{aligned} S_t = \int _r^\infty \rho _{s} \left( \vec {x}\cdot (R/r)\right) \cdot \alpha \cdot R\cdot dR \end{aligned}$$

Here, \(n=\rho (\vec x)\cdot \alpha \cdot r\cdot r_s\) sensors process S signals per unit time. Thus, every moment of time, a sensor processes on average \(S(\vec x) =S_t/n\) signals. Substituting the expressions for n and S into this formula, we can conclude that

$$\begin{aligned} S(\vec {x}) = \frac{\int _r^\infty \rho _{s} \left( \vec {x}\cdot (R/r)\right) \cdot \alpha \cdot R\cdot dR}{\rho (\vec {x}) \cdot \alpha \cdot r\cdot r_s} \end{aligned}$$

Dividing both numerator and denominator by \(\alpha\), we conclude that

$$\begin{aligned} S(\vec x) = \frac{\int _r^\infty \rho _{s} \left( \vec x \cdot (R/r)\right) \cdot R\cdot dR}{\rho (\vec x)\cdot r\cdot r_s} \end{aligned}$$

With respect to the sensor lifetime \(T(\vec x)\), once we know the average number of signals \(S(\vec x)\), we can then determine this sensor\('\)s lifetime as \(T(\vec x)= N_s/S(\vec x)\). So, the above formula for \(S(\vec x)\) leads to

$$\begin{aligned} T(\vec x)= \frac{N_s\cdot \rho (\vec x)\cdot r\cdot r_s}{\int _r^\infty \rho _{s} (\vec x \cdot (R/r))\cdot R\cdot dR} \end{aligned}$$
(3)

Finally, the lifetime T of a network can be determined as the smallest of the sensors lifetime \(T= min_{\vec x} T(\vec x)\). Substituting (3) into this formula, we conclude that

$$\begin{aligned} T= \min _{\vec x}\frac{N_s\cdot \rho (\vec x)\cdot r\cdot r_s}{\int _{r}^{\infty }\rho _{s} (\vec x \cdot (R/r))\cdot R\cdot dR} \end{aligned}$$
(4)

In summary, we arrive at the following formulations of the sensors placement problem:

  • The first problem: given the lifetime T (described by Eq. 4), we must minimize the overall number of sensors N (as described by Eq. 1).

  • The second problem: given the overall number of sensors N, we need to maximize the lifetime T.

3.1 Solution to the first problem

The first optimization problem is corresponding to short-term monitoring applications, where the monitoring time T is given. The lifetime of every sensor should be equal to the monitoring time. Let’s show that all sensors must have lifetime \(T:T(\vec x)=T\). The main requirement here is that all sensors must be active during the period of time T. This means that at every possible sensor location \(\vec x\), we must have \(T(\vec x)\ge T\). If at some location \(\vec x\), we have \(T(\vec x)> T\). This means that at the end of the monitoring period T, the sensors located at the vicinity of \(\vec x\) are still have the remaining energy. So, fewer sensors could have been placed in the vicinity of the location \(\vec x\) and still provided monitoring for the required period of time. Hence, the situations with \(T(\vec x)> T\) are not optimal. In the optimal sensor placement, all the sensors must indeed have exact same lifetime \(T:T(\vec x)=T\). Therefor; using (3) we have

$$\begin{aligned} T(\vec x)=\frac{N_s\cdot \rho (\vec x)\cdot r\cdot r_s}{\int _{r}^{\infty } \rho _{s}(\vec x \cdot (R/r))\cdot R\cdot dR}= T. \end{aligned}$$
(5)

Using (6), we get an explicit expression for the desired density of sensors \(\rho (\vec x)\)

$$\begin{aligned} \rho (\vec x)= \frac{T}{N_s\cdot r\cdot r_s}\cdot \int _{|\vec x|}^{\infty } \rho _{s} (\vec x \cdot (R/r))\cdot R\cdot dR. \end{aligned}$$
(6)

3.2 Solution to the second problem

Here, we present the solution to the second optimization problem which corresponds to the long-term monitoring applications (e.g., BZC). Let’s assume that in the optimal sensor placement, for two different sensor locations \(\vec x\ne \vec y\), the corresponding sensors have different lifetimes \(T(\vec x)\ne T(\vec y)\). Without loss of generality, we can assume that \(T(\vec x)< T(\vec y)\).

The network is efficient if all the sensors are active. Thus, after the time \(T(\vec x)\), the sensor network is no longer efficient. Since \(T(\vec y)> T(\vec x)\), the sensors located at the vicinity of \(\vec y\) still have remaining energy. This would mean that the sensors located in the area around \(\vec y\) are under-used. This is because there are fewer transmissions and retransmission in this area than in other places. Thus, if possible, we could re-distribute some of these sensors to other areas. This will increase the lifetime of all other sensors and the overall lifetime of the WSNs too.

It is worthy notice that the situations with \(T(\vec x)\ne T(\vec y)\) are not optimal. Henceforth, in the optimal sensor placement, all sensors must have \(T(\vec x)= T(\vec y)\) for some T. To derive the solution to the second optimization problem, similar arguments as in the first optimization one will lead to the same solution (Eq. 5). The only difference is that in the first optimization problem, we knew the lifetime T of all sensors. However, in the second optimization problem T is unknown. To find T, we must use the fact that we are given the overall number of sensors N. As a result, we can conclude that

$$\begin{aligned} N&= \int \rho (\vec x)\cdot d\vec x\\&= \int \frac{T}{N_s \cdot |\vec x| \cdot r_s}{\left( \int _{|\vec x|}^{\infty } \rho _{s} (\vec x \cdot (R/r))\cdot R\cdot dR)\right) } \end{aligned}$$

Constant factors T, \(N_s\), and \(r_s\) can be moved outside the integral. We can define \({r_{0}}{N_{0}}=N\), where \(r_0 =\frac{T}{N_{s}}{r_{s}}\) and

$$\begin{aligned} N_0 \triangleq \int \frac{1}{|\vec x|}\left( \int _{|\vec x|}^{\infty } \rho _{s} (\vec x \cdot (R/r))\cdot R\cdot dR) \right) \end{aligned}$$
(7)

Now, by substituting \(r_0=\frac{N}{N_0}\) into (6), we find that the optimal distribution for the sensor placement is

$$\begin{aligned} \rho (\vec x) = \frac{N}{N_0\cdot |\vec x|}\cdot \int _{|\vec x|}^{\infty } \rho _{s} \left( \vec x \cdot (R/r)\right) \cdot R\cdot dR \end{aligned}$$
(8)

where \(N_0\) is determined by (7).

3.3 Discussion on the solution to the optimization problems

It is worth mentioning that for the second optimization problem, the optimal sensor placement depends only on the non-homogeneity of the signal distribution. For example, if we double the number of signals (i.e., the function \(\rho _s (\vec x)\) is replaced by \(2 \cdot \rho _s (\vec x)\)), then the integral for \(\rho _s (\vec x)\) in (7) will double. In addition, the value \(N_0\) in the denominator will also double. As a result, the optimal sensor density will remain the same. The same observation is true if we multiply the original density function \(\rho _s (\vec x)\) by an arbitrary factor. Furthermore, we can observe the same if we replace the original density function \(\rho _s (\vec x)\) with another density function for which the overall number of signals per unit time is 1 (i.e., \(\int \rho _s (\vec x)\cdot d\vec x = 1\)).

For the first optimization problem, if we increase the values of the signal density by a factor of \(\lambda\), then the resulting values of the optimal sensors density will also multiply by the same factor. Once we have found the general formulas for the optimal sensor placement, we can start with the analysis taking into account information regarding the location and distribution of the signals.

4 Optimal sensor placement under uncertainty

In the previous section, the solution to the optimal sensor placement problem was described under the idealized assumption (i.e., the signal distribution \(\rho _s (\vec x)\) is known). Practically, we rarely know the signal distribution. Usually, in many cases we only know the area where the signal is located. Henceforth, to solve the sensor placement problem, we utilize the maximum entropy approach (Jaynes and Bretthorst 2003; Chokr and Kreinovich 1994). Now, we know how to compute the optimal sensor placement for each signal density \(\rho _s (\vec x)\). In the case of uncertainty, we only have partial information about the signal density. In other words, instead of having a single density function consistent with our knowledge, we might have many functions which are all consistent with our incomplete knowledge. In this case, a reasonable approach is to select the most appropriate representative density function, and to place the sensors based on this selection.

4.1 Maximum entropy approach

Maximum entropy approach was started by P. Laplace. The simplest situation is when we have finitely many (n) alternatives and we have no information about their probabilities. This simple situation is invariant with respect to the arbitrary permutations of the original alternatives. So, it is reasonable to select the probabilities which reflect this symmetry (i.e., equal probabilities \(p_1=\cdots = p_n\)). Since the total probability \(\sum _{i=1}^{p_n}p_i\) must be equal to 1, it can be concluded that \(p_1=\cdots =p_n = \frac{1}{n}\). This idea is called Laplace principle of indifference.

The Laplace’s simple idea can be naturally applied to our uncertainty problem where several possible distributions are consistent with our knowledge. In this case, it is reasonable to view these distributions as equally probable alternatives. The variables are discretized to make sure that the overall number of alternatives is finite. As the discretization constant tends to 0, we can get the distribution of the class of all non-discretized distributions.

Ultimately, only one distribution with a probability of 1 will result. This is the distribution which has the largest possible value of the entropy \({S_e {\mathop {=}\limits ^{\text {def}}} -\int \rho _s (\vec x)\cdot ln(\rho _s (\vec x))\cdot d\vec x}\), where \(\rho _s (\vec x)\) denotes the probability density function, for further details see Klir (2005).

It is known that for a given set A, among all distributions located on this set, the entropy is the largest when this distribution is uniform. So, the maximum entropy approach means that the signals are uniformly distributed on the area A. In other words, it is assumed that different locations may generate the exact same number of signals per second. This assumption makes sense since we know nothing about the difference between the sensors at different locations.

4.2 Derivation of the problem and resulting formula

We would like to describe how the aforementioned assumptions translate into the exact expression for the optimal sensor placement. We firstly need to decide how to represent the domain A. For simplicity, we assume that the region A is convex and bounded. In this case, for each point \(\vec x \epsilon A\) the straight line segment going from the origin \(\mathbf {0}\) towards this point (and further) intersects with the boundary of A at exactly one point. We denote the distance from this point to \(\mathbf {0}\) by \(R_A(\vec x)\), see Fig. 6. Once we know the values \(R_A(\vec x)\) for different locations \(\vec x\), we can uniquely reconstruct the border of the area A independently of its shape and area. These values can be used to represent the area A. Here, (7) leads to

$$\begin{aligned} \rho _s (\vec x) = \frac{N}{2N_0\cdot |\vec x|}\cdot (R_{A}^2(\vec x) - |\vec x|^2) \end{aligned}$$
(9)

where the normalization constant \(N_0\) can be determined from the condition \(\rho _s (\vec x) =1\).

Fig. 6
figure 6

Reconstruction of the border of area A

As an example, we can apply our work if we assume that the area is circle C of a radius \(R_C\). In (9), \(R_A (\mathbf {x})\) is equal to \(R_C\) for all \(\mathbf {x}\). Plug this formula into (7) will give

$$\begin{aligned} \rho _s (\vec x) = \frac{N}{2N_0\cdot r}\cdot (R_C^2 - r^2) \end{aligned}$$

where \(r= |\vec x|\).

Now, the normalization condition leads to

$$\begin{aligned} \int \rho (\vec x)\cdot d\vec x&= \int _{0}^{R_C}\rho (\vec r)\cdot 2\pi r\cdot dr\\&= \int _{0}^{R_c}\frac{\pi \cdot N}{N_0}(R_{C}^2 - r^2)dr = \frac{\pi \cdot N}{N_0}\cdot \frac{2}{3}\cdot R_{C}^3 \end{aligned}$$

given that \(N_0=\frac{2}{3}\pi \cdot R_{C}^3\).

In spite of the uniform distribution of signals, we have a denser distribution of sensors closer to the base station. This is actually very reasonable: while sensors on the periphery only need to transmit their own signals, the sensors near the center also need to retransmit a lot of other signals. As a result, they lose energy faster and so we need more of them. Furthermore, to optimize the number of those sensors our proposed strategy can incorporate with more complex protocols to track the energy status of each sensor to overcome the energy hole problem. The protocol is iterative and in each round the energy status of the sensors is assessed. As a result, sensors with low energy are excluded and then the next iteration works on the sensors that are still having suitable amount of energy for transmission/retransmission process. Therefore, the strategy proposed in this paper is applied to the list of energy-suitable sensors to ensure the network connectivity. This will give our proposed strategy to work with a more general protocols.

5 Simulation using OPNET

The simulation is illustrated using OPNET modeler release 14.5 simulator. This simulator contains a ZigBee model that complies with the accorded ZigBee specifications (Zigbee.org 2017). The routing protocol used by the network layer is an ad-hoc on-demand distance vector (AODV). In order to find the destination device, it broadcasts out a route request (RREQ) to its neighbors. The neighbors then broadcast the RREQ to their neighbors and so on until the destination is reached. Once the destination is reached, it sends its route reply (RREP) via unicast transmission following the lowest cost path back to the source. Once the source receives the RREP, it will update its routing table for the destination address with the next hop in the path and the path cost.

For the simulation, once the network model is specified, the process model needs to be configured in the process editor. In the ZigBee 802.15.4 MAC Layer process model a data rate of 250 kbps is specified, the channel sensing duration and packet reception power Threshold are left as default values. The ACK mechanism is set to disabled since the acknowledgments are considered in the Control Traffic not in the Data Traffic. The Data Traffic is the parameter implied in this study to compare the use of different nodes.

We can modify the parameters in the base station attributes. Here, the coordinates of the base station can be selected. In the MAC parameters, the CSMA/CA parameters are set to default settings, (i.e., the minimum backoff exponent is 3 and the maximum number of backoffs is 4). In the physical layer parameters, the 2450 MHz band is enabled, corresponding to the MICAz mote specifications (Alkaline Technical Information 2017; Crossbow Technology Inc. 2011). The transmit power is set to 0.4 mW. This will enable the outdoor range to reach 100 m. The transmission power and reception power are considered the same. In the network parameters, the Beacon Enabled Network is disabled, this characteristic is not supported by the ZigBee model suite. Mesh routing is enabled to perform better than tree-based routing. This has been proven by several studies. The route discovery timeout is equal 10 s and the packet size is 1024 bits.

5.1 Simulation output parameters

Even though there are several available statistics we can collect in Modeler, in this study only the following measurement will be considered:

  • Total data traffic received: represents the total traffic received by the MAC from the physical layer in bits/s. This includes the retransmissions.

  • Total data traffic sent: traffic transmitted by all the 802.15.4 MACs in the network in bits/s. While computing the size of the transmitted packets for this statistic, the physical layer and MAC headers of the packet are also included. This statistics include all the traffic sent by the MAC via CSMA/CA.

  • Data traffic received per node: traffic received by the MAC from the physical layer in bits/s. This includes the retransmissions.

  • Data traffic sent per node: traffic transmitted by the MAC in bits/s.

Considering these information, it is possible to calculate the energy consumption in each sensor node comprising the network. This means that the sensor energy resources are being drained. It is important to clarify that the considered energy consumption refers only to the energy used to send and receive information, in other words, the energy for communications among sensors and the base station. The energy consumed in data processing and storage is not taken into account in this analysis. Having the energy consumption in each sensor, the distribution of the energy consumption in the entire network can be observed for each scenario. Then, we can determine which scenario will offer the longest lifetime.

5.2 Scenarios

Here, we consider two different scenarios

  1. 1.

    Scenario A (120_Randomized_Placement): This scenario follows a randomized deployment describe in Agrawal (2017). Here, the sensors are randomly placed over a circle with a radius of 250 m, see Fig. 7. As one can observe, the sensor nodes are spread and taking into account the presence of nodes in the whole area. This means that the sensors follow a semi-uniform deployment, but not necessarily in exact coordinates according to a grid. This scenario covers approximately a circular area of 196,350 m\(^2\).

  2. 2.

    Scenario B (120 Optimal_Placement): This scenario is corresponding to the proposed approach. It follows the optimal sensor placement distribution formula obtained in (9). As displayed in Fig. 8, the sensors are deployed over a circle with a radius of 250 m. The coverage in this scenario is similar to that of Scenario A.

Fig. 7
figure 7

Scenario A: 120 randomized placement

Fig. 8
figure 8

Scenario B: 120_optimal_placement

Here, the proposed approach, scenario B, will be simulated to compare with the upper and lower bounds. The upper bound is achieved when full information about the WSN is available including the full capacities of links. Du et al. (2017) achieve upper bound by transforming the network lifetime maximization into a maximum flow problem with cardinality constraint along with vertex capacity constraints. This transformation may lead to high optimization complexity and exchange more information which generates high traffic load and thus more energy consumption. However, the lower bound which produces when the nodes are randomly distributed (uniform placement) requires no information about the network. Our proposed strategy assumes only information about the sensors residual energy. Hence, it will outperform the lower bound and in the meanwhile it is less complicated and requires less information than the upper bound case. The best upper baseline is achieved when the nodes are lined up as shown in Fig. 9. In this figure if we assume that \(C_{ij}\), where i denotes the path and j denotes the hop number, represents the probability of successful packet delivery over hop j in path i. Here, if n6 needs to reach the base station, it can go through path1:\(n1 \longrightarrow n2 \longrightarrow ~~\longrightarrow BS\) or it can take an alternative path using our strategy as follows \(path2: n6 \longrightarrow n4 \longrightarrow n2 \longrightarrow BS\). The average number of transmission/retransmission to reach the destination via pathi is given by

$$\begin{aligned} Path Cost = \sum _{j}\frac{1}{C_{ij}} \end{aligned}$$
(10)
Fig. 9
figure 9

Illustrative example to show alternative paths to reach BS

It is worth notice that to achieve successful transmission/retransmission to reach the destination using path2 will be greater than that of path1. However, path1 will be faster than path2 as the distance between adjacent nodes is less that leads to increase the link capacity between adjacent nodes. Unfortunately, the faster path (i.e., path1) will have higher delay due to nodes processing time. For instance, if the node takes \(\tau\) time to process the data then the processing times of path1 and path2 will be \(6\tau\) and \(3\tau\), respectively. Accordingly, in the illustrative example, our proposed strategy, Scenario B, tends to use path2 instead of path1 as it tends to use the nodes placed at a very further distance that is still hear the precedent node.

5.3 Results and discussions

Comparing both scenarios A and B, one can see that Scenario A shows medium level energy consumption. In this scenario, the sensors are deployed in a semi-random fashion because the nodes are not assigned an exact coordinate. In addition, The sensors still need to ensure connectivity in the network by deploying them in such a way that they cover all the sensed area. The randomized placement does not follow a regular grid, therefore; the nodes can be spread from a helicopter or implementing using other techniques. However, it is not easy to ensure this distribution of sensors (Tsai et al. 2008). Figure 10 shows the energy consumption per node in Scenario A. The total energy in the plot is 8.43 J. However, Scenario B shows competitive energy consumption compared to other approaches. It consumes less energy per node than Scenario A as depicted in Fig. 11. The total energy in the plot is 9.88 J.

Fig. 10
figure 10

Energy consumption per node in Scenario A

Fig. 11
figure 11

Energy consumption per node in Scenario B

According to the mathematical analysis, the sensor placement using Scenario B offers several advantages over Scenario A. First, the connectivity is ensured in the network and the final optimal sensor placement formula can be easily manageable to change the number of sensors. Second, Scenario B distributes the energy more evenly than Scenario A. This means that the nodes in the network are draining their energy accordingly, thus, there are fewer nodes consuming a lot of energy and fewer nodes consuming a little energy, as seen in Scenario A.

Table 1 lists the energy consumption per node for both Scenarios A and B. Comparing the energy consumptions one can easily observe that Scenario B is more uniformly distributed than Scenario A. Moreover the energy distribution in Scenario B is the most uniformly distributed. In Scenario A, the energy consumption goes from 0.018 to 0.137 J and in Scenario B the energy consumption goes from 0.044 to 0.124 J.

Table 1 Standard deviation (J)

6 Conclusion

A new strategy to optimize the placement of sensors over a field in order to overcome the energy constraint problem presents in the WSNs was proposed. The placement of the sensors directly affects the performance of the network and the routing efficiency, and we have to ensure connectivity and maximize the network lifetime. Two optimization problems corresponds to short-term monitoring and long-term monitoring applications were presented. A mathematical analysis is carried out and tried under several cases, until finally; a solution is given to each one of them. In addition, two scenarios differ in the distribution of the sensors and the coverage areas were illustrated. The results show that the proposed approach, Scenario B, counts with advantages over the other and is the most suitable for different applications. One of the advantages of employing the proposed optimal placement of sensors, Scenario B, over Scenario A is that the energy consumption in our approach tends to be evenly distributed over the whole network. Ultimately, the propose optimal placement of sensors extends the network lifetime. As a future work, the WSNs performance can be enhanced if some scheduling algorithms and routing techniques are incorporated with the proposed strategy. Furthermore, this incorporation will certainly incense the proposed strategy capability to overcome the energy hole problem.