Keywords

1 Introduction

As one of the most significant applications of wireless communication technologies, a wireless ad hoc network consists of autonomous or mobile nodes which communicate with each other without a centralized control or assistance. All the nodes in the network can transmit, receive and forward messages, and thus require no support of backbone networks. Therefore, ad hoc networks provide more robustness and flexibility in the presence of node failures than those requiring infrastructure supports and are quite useful in environmental monitoring, infrastructure surveillance, disaster relief, battlefield, and scientific exploration.

To reduce the computational complexity and cost of studies and designs in wireless ad hoc networks, in addition to using powerful computer-aided design and analysis tools, the past decades have seen an increasing amount of attempts focusing on the analytical description of system characteristics and performance metrics through the modeling and analysis of wireless networks. As one of the promising tools, stochastic geometry [1] has been widely adopted, where the node distribution is assumed to follow a Poisson Point Process (PPP) [2,3,4] or a Binomial Point Process (BPP) [5,6,7]. However, the PPP model is inadequate/inaccurate in many practical wireless networks where a finite number of nodes are randomly distributed in a finite area, because it assumes an unbounded number of nodes and does not take into account the effects of the network boundaries; and the BPP model can analyze neither the exterior interference at a reference receiver nor the average performance metrics at any node but only at a specific reference node. Both models provide us with average results over time and space, but cannot present performance metrics for a specific network deployment and/or a time instance [8].

Since most of the performance metrics in finite wireless ad hoc networks are nonlinear functions of the distances among communicating, relaying, and interfering nodes, probabilistic distance-based model has been extensively studied and applied as a significant complementary tool to the PPP and BPP models to quantify these metrics, such as interference [9], Signal-to-Interference-plus-Noise Ratio (SINR) [10], path loss [11], node degree [10], link/hop distance [12, 13], outage probability [11], link capacity [10], localization [14], energy consumption [15], stochastic properties of a random mobility model [16, 17], etc. As a result, nodal distance distributions (NDDs) are eventually involved in such quantifications, which intrinsically depend on the network coverage and nodal spatial distribution. Depending on whether one of the communicating nodes in a node pair is fixed or random, there are generally two types of NDD, i.e., Ref2Ran NDD from a given reference node to a random node and Ran2Ran NDD between two random nodes. In addition to the utilization in wireless ad hoc networks, NDD-based models have also been widely adopted for modeling and analyzing cellular networks [8, 18,19,20,21,22,23]. Therefore, how to effectively obtain the relevant NDDs is definitely significant to accurately quantify the distance-dependent performance metrics when modeling and analyzing finite wireless networks, which has attracted plenty of attention in the current literature [11, 18,19,20,21, 24,25,26,27,28,29,30,31,32,33,34,35,36,37].

This paper focuses on the survey of the state-of-the-art approaches to the two types of NDD as shown in Sect. 2 and their applications in wireless ad hoc networks as shown in Sect. 3. The relevant open issues, challenges, and directions are also discussed in Sect. 4. Finally, Sect. 5 concludes the paper.

2 NDDs Associated with Finite Regions

In this section, we briefly survey the state-of-the-art approaches to both Ref2Ran and Ran2Ran NDDs associated with arbitrary polygons.

2.1 Ref2Ran NDD

In our technical report [35], we have proposed a systematic recursive approach to obtain the Ref2Ran NDD from an arbitrary reference node (i.e., anywhere in the plane) to a random node inside an arbitrary polygon, which eliminates the inappropriate assumptions and limitations in the previous work that the network area has to be in certain specific shapes (including squares [18], disks/circles [11], hexagons [19, 20], regular polygons [31], and convex n-gons [30]), and the reference node has to be inside or on the boundary of the network. Specifically, we first obtain the Ref2Ran NDD from a vertex of an arbitrary triangle to the triangle using the area-ratio approach, and then based on which the Ref2Ran NDD from an exterior or interior reference point to the triangle can be obtained through decomposition and recursion (D&R) methods. Such NDDs from an arbitrary reference point to an arbitrary triangle are called Ref2Ran triangle-NDDs.

Fig. 1.
figure 1

An arbitrary reference point R and arbitrary polygons (unit: m).

Based on the obtained Ref2Ran triangle-NDDs, the distribution of the distance from an arbitrary reference point to an arbitrary polygon can also be obtained through a D&R method, since any polygon can be triangulated. If a reference point R is inside a polygon, the polygon can be triangulated from R as shown in Fig. 1(a). On the other hand, if R is outside, we can just triangulate the polygon as shown in Fig. 1(b). Then with a weighted probabilistic sum, the CDF of the distance distribution from R to the polygon is

$$\begin{aligned} F(d)=\sum \limits _{i}^{K}\frac{S_i}{S}F_i(d), \end{aligned}$$
(1)

where K is the number of triangles generated after the triangulation (\(K=6\) and 4 for the examples shown in Fig. 1(a) and (b), respectively), \(S_i\) is the area of triangle \(\mathcal {T}_i\), S is the area of the polygon, and \(F_i(d)\) is the CDF of the distance distribution from R to \(\mathcal {T}_i\), which has been obtained already.

It is possible that a subarea of a network area has a higher node density than other subareas, which is referred to as tiered network structure in this paper, due to the reasons such as the bottleneck nodes near the hotspot under heavy loads run out of energy, some nodes are physically damaged in a hostile environment, or there are no nodes being deployed in a specific area which needs not to be monitored. In this case, the above weighted probabilistic sum is still applicable, but with the weights modified correspondingly due to the node density difference. Take the tiered structure shown in Fig. 1(c) for example. Assuming the node density ratio between \(\mathcal {P}_1\) (the white area) and \(\mathcal {P}_2\) (the grey area) is \(\lambda _1:\lambda _2\) (\(\lambda _1\) and \(\lambda _2\) are not 0 at the same time), then the distance distribution from an arbitrary reference point R to the whole area \(\mathcal {P}_3\) (\(\mathcal {P}_1\) plus \(\mathcal {P}_2\)) is

$$\begin{aligned} F_3(d)=\sum \limits _i^{2}\frac{S_i\lambda _i}{\sum _j^2S_j\lambda _j}F_i(d), \end{aligned}$$
(2)

where \(S_i\) is the area of \(\mathcal {P}_i\), and \(F_i(d)\) is the distance distribution from R to \(\mathcal {P}_i\), which can be obtained by (1). The obtained results have been applied in our recent work [8] to analyze the outage probability of the macro and femto BSs in arbitrarily-shaped cells for tiered cellular networks.

The authors in [36] made an algorithmic implementation based on our proposed approach. Instead of using D&R methods, they modified the shoelace formula to calculate the area of the intersection between the polygon and the circle centered at an arbitrary reference point R with a radius of d. Then the probability that the distance from R to a point uniformly distributed at random within the polygon is no longer than d is the area of the intersection divided by the area of the polygon. In the modified shoelace formula, the area of the intersection between the circle and each triangle generated by triangulating the polygon from R is obtained based on our approach.

2.2 Ran2Ran NDD

For obtaining Ran2Ran NDDs, previous work had to assume that the networks are in certain specific shapes, including disks [9, 26, 28], triangles [29, 32], rectangles [10, 12, 13, 28], rhombuses [24], trapezoids [27], and regular polygons [10, 15, 21, 25, 33], which considerably limits the applicability of these approaches in modeling and analyzing wireless networks. Our recently proposed approach in [37] can handle the networks in the shape of arbitrary polygons as well as the polygons with different node densities in different subareas. The D&R method is also applied to this end by triangulating polygons. Specifically, we first obtain the Ran2Ran NDDs associated with arbitrary triangles (i.e., Ran2Ran triangle-NDDs), including the Ran2Ran NDD within an arbitrary triangle, and that between two arbitrary triangles which can be disjoint or share a common vertex or side. Then the Ran2Ran NDDs associated with arbitrary polygons can be obtained through D&R methods, since any polygon can be triangulated. Therefore, the Ran2Ran NDD-based performance metrics of wireless ad hoc networks associated with arbitrary polygons can be quantified properly.

Nonuniform node distribution can also be considered. Take the irregular polygon with the triangulation shown in Fig. 1(b) for example. Assuming the node density ratio among \(\mathcal {T}_1\), \(\mathcal {T}_2\), \(\mathcal {T}_3\), and \(\mathcal {T}_4\) is \(\lambda _1{:}\lambda _2{:}\lambda _3{:}\lambda _4\) (\(\lambda _1\), \(\lambda _2\), \(\lambda _3\), and \(\lambda _4\) are not 0 at the same time), through D&R, the CDF of the Ran2Ran NDD within the polygon is given by a weighted probabilistic sum,

$$\begin{aligned} F(d)=\sum \limits _{i=1}^4\sum \limits _{j=1}^4\frac{S_i\lambda _iS_j\lambda _j}{\left( \sum _{k=1}^4S_k\lambda _k\right) ^2}F_{ij}(d), \end{aligned}$$
(3)

where \(S_x\) is the area of triangle \(\mathcal {T}_x\), and \(F_{ij}(d)\) or \(F_{ji}(d)\) is the CDF of the Ran2Ran NDD within triangle \(\mathcal {T}_i\) if \(i=j\), or between two triangles \(\mathcal {T}_i\) and \(\mathcal {T}_j\) if \(i\ne {}j\), which have been obtained above as Ran2Ran triangle-NDDs.

For tiered polygons shown as in Fig. 1(c), with the Ran2Ran triangle-NDDs obtained based on the above approach, the CDFs of the Ran2Ran NDDs within \(\mathcal {P}_1\) and \(\mathcal {P}_2\), i.e., \(F_{11}(d)\) and \(F_{22}(d)\), can be obtained. Assuming the node density ratio among \(\mathcal {P}_1\) and \(\mathcal {P}_2\) is \(\lambda _1{:}\lambda _2\) (\(\lambda _1\) and \(\lambda _2\) are not 0 at the same time), and with a weighted probabilistic sum, we have

$$\begin{aligned} F_\mathrm{{33}}(d)&=\sum \limits _{i=1}^{2}{\sum \limits _{j=1}^{2}\frac{S_id_iS_jd_j}{\left( \sum _{k=1}^{2} S_{{k}}\lambda _k\right) ^2}F_{ij}(d)},\nonumber \\ F_{\mathrm{{3}}i}(d)&= \sum \limits _{j=1}^{2}{\frac{S_{{j}}\lambda _j}{\sum _{k=1}^{2} S_{{k}}\lambda _k}}F_{{i} {j}}(d), \qquad (i\in \{1, 2\}) \end{aligned}$$
(4)

where \(S_x\) is the area of \(\mathcal {P}_x\), and \(F_{ij}(d)\) or \(F_{ji}(d)\) is the CDF of the Ran2Ran NDD within \(\mathcal {P}_i\) if \(i=j\), or between \(\mathcal {P}_i\) and \(\mathcal {P}_j\) if \(i\ne {}j\). With \(F_{11}(d)\), \(F_{22}(d)\), and \(F_{12}(d)\) obtained by (3), \(F_{33}(d)\), \(F_{13}(d)\), and \(F_{23}(d)\) can be obtained by (4).

3 Applications in Finite Wireless Ad Hoc Networks

As mentioned before, NDD can be utilized to characterize most of the performance metrics in finite wireless ad hoc networks due to their nonlinear relationships with the distances among nodes. In this section, we categorize the existing applications in the current literature into different levels, including graph, transceiver, link, path, and network levels. We also show the efficacy of the new approaches highlighted in Sect. 2 on some selected performance metrics with arbitrary shapes/densities, in comparison with previous approaches/approximations such as using average density, ignoring border effect (e.g., in PPP), and so on.

3.1 Graph Level

There are several representative performance metrics at the graph level, such as kth nearest neighbor (k–NN) distance, node degree (just k–NN below a certain threshold), etc. Especially, 1–NN and \((n-1)\)–NN (n is the total number of nodes in the network) correspond to the nearest and farthest neighbor distances, respectively, which are useful for routing protocol design in ad hoc networks [21, 30]. For example, in a sparse network where the network size is much larger than the communication range of the nodes, a nearest-neighbor routing is beneficial to reducing energy consumption and increasing network throughput. On the other hand, in a small dense network, choosing the farthest node for packet relay, the routing overhead can be alleviated by reducing the number of transmissions. In addition, the nearest neighbor distance distribution was also utilized in [16, 17] to evaluate the nearest-job-next service discipline for mobile collectors or chargers (known as mobile elements).

Suppose there are n nodes uniformly distributed at random in a network area. For a node i (either a random or reference node), the distances from the other \(n-1\) nodes to node i are ordered as \(d_1\le {}d_2\le {}\cdots \le {}d_{n-1}\). Let \(\varDelta _k\) denote the random variable which represents the k–NN distance to node i. The PDF of \(\varDelta _k\) according to order statistic is

$$\begin{aligned} f_{\varDelta _k}(d)=\frac{(n-1)!}{(k-1)!(n-1-k)!}[F(d)]^{k-1}[1-F(d)]^{n-1-k}f(d), \end{aligned}$$
(5)

where F(d) and f(d) are the CDF and PDF of any NDD introduced in Sect. 2, respectively.

Fig. 2.
figure 2

Nearest neighbor distance distribution.

Fig. 3.
figure 3

Nearest neighbor energy consumption (K vs. \(\alpha \)).

Suppose there are \(n=10\) nodes randomly distributed in \(\mathcal {P}_3\) shown in Fig. 1(c). For \(\lambda _1{:}\lambda _2=1{:}1\) (i.e., uniform distribution) and 10:1 (nonuniform distribution), Fig. 2 shows the corresponding Ran2Ran nearest neighbor distance distributions, compared with the result obtained based on the PPP model. Due to the ignoring of the network border effect in PPP and the different node density ratios, there exist gaps among the comparisons. Also in the nonuniform case where \(\mathcal {P}_1\) has a higher density, surrounded by \(\mathcal {P}_2\) with a lower density, nodes are more likely closer to each other in \(\mathcal {P}_1\), with nearer nearest-neighbors than the uniform case, as shown in Fig. 2.

3.2 Transceiver Level

The performance metrics at the transceiver level include path loss [11], received signal strength for a given transmission power, transmission energy consumption to ensure a certain received power [15, 30, 38], etc.

Path Loss and Signal Strength. Let us assume a general path-loss model, where the path loss of the transmission power at distance d is

$$\begin{aligned} L(d)=\beta {}d_0^{\alpha }d^{-\alpha }, \end{aligned}$$
(6)

where \(\beta \) is a path-loss constant determined by the hardware features of transceivers, \(d_0\) is a given reference distance, and \(\alpha \) is the path-loss exponent. As a result, the received signal strength at distance d is just

$$\begin{aligned} P_r(d)=L(d)P_t, \end{aligned}$$
(7)

where \(P_t\) is the transmission power. Therefore, given a distance distribution, the distributions of path loss and received signal strength can also be obtained by using the change-of-variable technique. The model can be readily extended to include the shadowing and fading effects of wireless channels. For example, log-normal shadowing and Rayleigh fading can be considered. For the Rayleigh fading channel, we have the PDF of the channel power gain as

$$\begin{aligned} f_X(x|d)=\frac{1}{P_r(d)}\text {e}^{\frac{-x}{P_r(d)}}. \end{aligned}$$
(8)

Then the PDF of the signal strength at the receiver is

$$\begin{aligned} f_X(x)=\int _{d_{\text {min}}}^{d_{\text {max}}}f_X(x|d)f(d)\text {d}d, \end{aligned}$$
(9)

where f(d) is the PDF of any NDD, and \(d_{\text {min}}\) and \(d_{\text {max}}\) are the minimum and maximum distances between the transmitter and receiver, respectively. In addition, log-normal shadowing and Rayleigh fading can also be modeled as independent random variables that are not related to inter-node distances. As shown in [39], the shadowing effect follows a log-normal distribution with standard deviation \(\sigma \) (typically between 0 and 8 dB), and Rayleigh fading follows an exponential distribution of mean 1. Therefore, the extension of the path-loss model along with shadowing and fading is the multiplication of independent random variables and can still be analyzed based on the NDD-based model.

Transmission Energy Consumption. The energy consumed by a radio transmitter is proportional to the \(\alpha \)th power of the distance to the receiver. In a simplistic model with wide applicability [15, 38], the average one-hop energy consumption of the radio transmitter can be formulated as

$$\begin{aligned} E_{\mathrm {Tx}}=\epsilon \int {}d^{\alpha }f(d)\mathrm {d}d=\epsilon {}K, \end{aligned}$$
(10)

where \(\epsilon \) is a constant related to the environment, f(d) is the PDF of any relevant NDD, and K can be viewed as the normalized average bit-energy consumption. Based on the obtained NDDs shown in Fig. 2, Fig. 3 shows the variations of K as \(\alpha \) increases. Due to the nonlinear effect of the path loss exponent, even a small difference in distance distributions can lead to a big difference in energy consumption. Again, PPP-based model differs from the reality due to the ignored border effect, and nonuniform node distribution also has a great effect.

3.3 Link Level

The interference [9, 10], SINR [10], outage (just SINR below a certain threshold) [11], link capacity [10], etc., achieved at either a random or fixed receiver are link-level performance metrics. Assuming that all the transmitters in the network have the same transmission power \(P_t\), the cumulative interference at the receiver from all its interfering nodes is

$$\begin{aligned} I=P_t\sum \limits _iL(d_i), \end{aligned}$$
(11)

where \(L(d_i)\) is given in (6), and \(d_i\) is the distance from the receiver to the ith interfering node. So the SINR achieved at the receiver is

$$\begin{aligned} SINR =\frac{P_tL(d)}{N_oW+I}, \end{aligned}$$
(12)

where d is the distance from the receiver to its transmitter, W is the communication bandwidth, and \(N_0\) is the one-sided spectral density of additive white Gaussian noise. Given a modulation and coding scheme, outage probability represents the chance that the SINR achieved at a receiver is no larger than a specified threshold so that the reception is considered unsuccessful. Therefore, the CDF of the received SINR is significant to determine the link outage probability. Meanwhile, according to Shannon’s theory, the capacity of the link between the transmitter and receiver is

$$\begin{aligned} C=W\log _2(1+ SINR ). \end{aligned}$$
(13)

Since I, \( SINR \), and C are all functions of distance, given the corresponding NDD, their distributions can also be obtained, which are significant for statistically analyzing the performance of ad hoc networks. For the network shown in Fig. 1(c) with \(n=10\) nodes randomly distributed in \(\mathcal {P}_3\), Figs. 4 and 5 show the cumulative interference from the other 9 interferers and SINR at R, respectively, for both \(\lambda _1{:}\lambda _2=1{:}1\) and 10 : 1, in comparison with the result obtained based on the PPP model (\(P_t=2\) mWatt, \(L(d)=-38-20\text {lg}(d)\) (dB), \(W=5\) MHz, and \(N_0=-174\) dBm/Hz). Higher density in \(\mathcal {P}_1\), surrounding R, thus causes a higher interference and yields a lower SINR at R, as shown in the figures.

Fig. 4.
figure 4

Cumulative interference at R shown in Fig. 1(c).

Fig. 5.
figure 5

SINR achieved at R shown in Fig. 1(c).

3.4 Path Level

The metrics at the link level shown above are utilized to investigate the performance of single-hop communications (i.e., via a direct link). For analyzing multi-hop transmissions at the path level, NDD can still be utilized. For example, hop distance is crucial to the route discovery delay, the reliability of message delivery, and the minimization of multi-hop energy consumption. The authors in [13] investigated the distribution of the minimum hop distance H between a random source and destination pair based on NDD. The closed-form expressions for the probability that two nodes can communicate within \(H=1\) hop or \(H=2\) hops were derived. Analytical bounds were provided for the paths with \(H>2\) hops. In [40], the NDD between a fixed source and destination pair with a single relay uniformly distributed at random in between is utilized to obtain the distribution of the capacity of the two-hop relay communication.

3.5 Network Level

The analysis on the network capacity belongs to this level. Network capacity can be investigated from the perspective of either concurrent links or flows. For example, in a clustered ad hoc network, there are concurrent single-hop communications between cluster members and their heads in several clusters, where the network capacity can be obtained based on the link capacity. On the other hand, in an ad hoc mesh network, there might be several multi-hop communications (referred to as flows) happening concurrently. The network transport capacity in this case can be investigated based on the capacity studied at the path level.

As shown above, ignoring border effect or using average density often in PPP and existing work skews results greatly, which highlights the need for NDDs to analyze performance metrics accurately in finite ad hoc networks.

4 Issues, Challenges, and Directions

Although the relevant research on both types of NDDs has achieved remarkable breakthroughs, there are still open issues which are challenging to be solved.

Nonuniform Distance Distributions. Most of the existing probabilistic distance-based models and the tools from stochastic geometry assume uniform node distribution. However, in many realistic ad hoc networks, nodes are not always uniformly distributed, either initially or due to node mobility. For both types of NDDs, the approaches based on D&R methods can handle the case where nodes are uniformly distributed with different node densities in different subareas of a network, which leads to a discrete nonuniform node distribution. It is necessary to consider a more general, continuous nonuniform node distribution.

3D Distance Distributions. The Ref2Ran NDDs can be easily extended to consider a reference point with height by the Pythagorean theorem, while for a more general 3D scenario and the Ran2Ran NDDs, there is little work on the approaches in the current literature.

Non-Euclidean Distance Distributions. Most of our existing work, and the existing literature, focused on Euclidean (2-norm) distances, which indeed have wide applications in wireless communication systems as radio signals propagate in the same, uniform medium. However, other distance norms, e.g., Manhattan (1-norm), Chebyshev (\(\infty \)-norm) and generic l-norms, also have wide application in natural sciences and engineering disciplines, including computer networking. For example, when vehicles travel in a downtown urban scenario, Manhattan distance is more appropriate for travel distance calculation (or carry-and-forward delay in VANETs), which is in fact called taxicab geometry. There are only a few isolated distance results on taxicab geometry and other non-Euclidean geometries, including non-planar geometries such as Lobachevskian (hyperbolic) and Riemannian (spherical) geometries, which are increasingly used for modeling logical and physical networks. We plan to extend our planar geometry results to non-planar ones, with new results and new approaches.

High-Order and Multi-hop Distance Distributions. So far, our work focused on the distance distributions between two random points, or between a random point and a reference point. A high-order distance distribution involves more than two points. For example, in relay communications, the relay can choose different forwarding schemes (amplify-and-forward, decode-and-forward, and so on) and the destination can select or combine the signal from the source or relay (selective combination, maximum rate combination, and so on). For amplify-and-forward and selective combination, the signal strength received at the destination depends on the path loss of both the source-to-relay and relay-to-destination channels, eventually the product of the distances among the source, relays and destination. Similarly there is a need for the sum, difference and ratio of two distances. Furthermore, more than two distances can be involved, e.g., in the distribution of hop distances between source and destination. This is a very hard problem and there are only a few results for two or three hops [40].

Joint and Conditional Distance Distributions. The ultimate goal is to address the joint and conditional distance distributions, with potentially correlated distances due to the triangular inequality. For engineering problems, although closed-form explicit expressions with elementary functions are our target, as what we have achieved for rhombuses, hexagons and triangles, we also have the freedom to develop algorithmic approaches that produce symbolic results in a parametrized way. For example, for an arbitrary geometry without symbolic expressions, it is unlikely to derive its properties symbolically, but given an arbitrary geometry with parameters, the algorithm and thus computer program can output parametrized expressions. This can further guide and speed up numerical calculation and error or bound analysis, determining expression truncation and numerical precision, for practical usage purposes.

5 Conclusion

As a significant complementary tool to the PPP/BPP models from stochastic geometry, probabilistic distance-based model has been extensively studied and applied in finite wireless networks. In this paper, we surveyed the state-of-the-art approaches to both the Ref2Ran and Ran2Ran NDDs with arbitrary shapes and nonuniform densities, and their applications in ad hoc networks. The still-open issues were also discussed for the future research directions in this field.