3.1 Information Dissemination in Loosely-Coupled VANETs

The goal of this book is to explore the possibility of enabling ITS services by relying exclusively on existing technologies, such as WiFi and cellular networks. In this chapter, we evaluate the feasibility of using WiFi networks to disseminate or collect information over a small geographical area with a pure V2V approach. As it does not make sense to build a novel WiFi infrastructure for vehicular communications, we investigate the possibility of realizing V2V-based services with commodity hardware and software, currently found on smartphones, tablets, and OISs. The main advantage of this approach is a dramatic reduction of the economic cost, while the main drawback is that all the characteristics of the WiFi technology are inherited, without any chance of adapting them to the vehicular communication scenario. In fact, relying on already deployed devices prevents any technical optimization in crucial areas such as the lower layers of the protocol stack (physical and MAC) and antenna design.

As a consequence, our scenario is affected by a series of issues. First of all, the transmission range of a device is at most 100 m, and it is often far smaller. The physical layer of the WiFi technology was not conceived for communication in the harsh conditions that characterize a vehicular wireless channel, and this reduces the reliability of the communications. Furthermore, the infrastructure mode of the WiFi protocol has a high setup time and does not scale well, which makes it difficult to adapt to highly dynamic vehicular network topologies. For these scenarios, the WiFi ad-hoc mode can be a better choice, at least in terms of scalability.

Because of these issues, the IEEE 802.11 technology cannot be used to build and maintain stable VANETs. Instead, it can only be used to build loosely-coupled VANETs, whose members have weak bonds and in which the very concept of network topology is feeble. In such VANETs, it is infeasible to use unicast V2V protocols to disseminate information, as the network topology is not sufficiently stable. Therefore, in loosely-coupled VANETs, information can be disseminated exclusively by means of broadcast or geocast protocols, either single- or multi-hop. Single-hop broadcast protocols have an intrinsically higher transmission efficiency, but their transmission range is limited and they can therefore deliver information only in a small region centered on the source. Conversely, multi-hop broadcast protocols have a lower transmission efficiency, due to the half-duplex nature of the employed radios, but they can reach vehicles that are outside the transmission range of the original source. Furthermore, multi-hop broadcast protocols make it possible to provide some geocast functionalities also in loosely-coupled networks.

In this chapter, we focus on both information dissemination applications and data collection applications. In particular, we present a novel theoretical framework for the analytical performance evaluation of a family of multihop broadcast dissemination protocols that can actually be installed on already deployed WiFi-ready devices, and that therefore do not require modifications to the lower layers of the protocol stack. Such a low-complexity theoretical framework is useful to characterize the main performance metrics of a family of probabilistic multihop broadcast protocols with applications to VANET scenarios. First, we show that the average positions of a given number of points of a PPP falling in a segment of finite length are equally spaced. Then, assuming a silencing mechanism at each hop, we derive a recursive (hop-wise) theoretical performance evaluation framework which exploits the assumption of fixed and equally spaced vehicle positions in each retransmission hop. Such a performance analysis is likely to be representative of the average (with respect to the nodes’ spatial distribution) performance of the broadcast protocols at hand, as confirmed by simulations carried out with the Network Simulator 2 (ns-2) [1]. Moreover, the proposed analytical model also applies to other vehicle spatial distributions, provided that the average inter-vehicle distance is fixed. The impact of node mobility is also evaluated. Although we consider two novel illustrative broadcast protocols, we underline that our approach is general.

As mentioned above, we also analyze a data collection scenario, where the VANET acts as a vehicular sensor network. In the presented approach, a clustered VANET topology is created during a downlink phase, i.e., during fast broadcast data dissemination from the Access Point (AP), through a novel clustering protocol, denoted as Cluster-Head Election IF (CHE-IF). This clustered topology is then exploited, during an uplink phase, to collect information from the vehicles and perform distributed detection. Our results highlight the existing trade-off between decision delay and energy efficiency. Unlike classical sensor networks for distributed detection, the proposed vehicular distributed detection schemes exploit the natural clustering of the vehicles and have to cope with the “ephemeral” nature of the clusters. More precisely, vehicle mobility has a direct impact on the maximum amount of data which can be collected, thus leading to the concept of decentralized detection on the move. In particular, we analyze the performance of vehicular decentralized detection schemes, based on the observation, by all vehicles of a VANET, of a spatially constant phenomenon of interest.

The chapter is structured as follows. In Sect. 3.2, multihop broadcast protocols for linear networks are introduced. Section 3.3 is devoted to the derivation of the average distribution of a given number of points of a PPP in a segment of finite length. In Sect. 3.4, a succinct overview of the IEEE 802.11b standard is provided. In Sect. 3.5, the family of probabilistic broadcast protocols with silencing is described in detail. In Sect. 3.6, the analytical framework for the performance evaluation of the probabilistic broadcast protocols of interest is presented. In Sect. 3.7, after the validation of the analytical framework by means of numerical simulations, the performance of the novel probabilistic broadcast protocols is investigated and compared with that of other (known) protocols. Finally, in Sect. 3.8 we analyze the data collection scenario, where the VANET acts as a vehicular sensor network.

3.2 Multihop Broadcast Protocols

Reducing the number of redundant packets, while still ensuring good coverage and low latency, is one of the main objectives in multi-hop broadcasting. In fact, an excessive number of transmissions unavoidably leads to unsustainable levels of latency, retransmissions, and collisions: the overall phenomenon is typically referred to as the broadcast storm problem [2], and it mainly affects dense networks. The problem of minimizing the number of transmissions has been deeply investigated by the Mobile Ad-hoc NETworks (MANETs) research community: the theoretically optimal solution consists in designating, as relays, the nodes belonging to the Minimum Connected Dominating Set (MCDS) of the network [3]. The nodes within the MCDS have the following properties: (i) they form a connected graph; (ii) every other node of the network is one-hop connected with a node in the MCDS; (iii) the MCDS has the lowest cardinality among all the possible collections of nodes that satisfy the previous two requirements.
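On a line topology, the idealized relay selection can be illustrated with a simple greedy rule: each transmitter designates, as next relay, the farthest node within its range. The following is only a minimal sketch under the assumption of global knowledge of the node positions; the function name `greedy_relays` is hypothetical, the source (placed in zero) is treated as the first transmitter and is not listed among the relays, and a relay is designated only if it can actually reach a further node.

```python
def greedy_relays(positions, z):
    """Greedy farthest-neighbor relay selection on a line (illustrative sketch).

    positions: sorted distances [m] of the nodes from the source (placed in 0).
    z: transmission range [m].
    Returns the indices (into positions) of the designated relays.
    """
    relays = []
    current = 0.0  # position of the current transmitter (the source at first)
    i = 0          # index of the first node not yet covered
    n = len(positions)
    while i < n:
        far = None
        # Cover every node within range of the current transmitter and
        # remember the farthest one.
        while i < n and positions[i] - current <= z:
            far = i
            i += 1
        if far is None or i == n:
            break  # nobody in range, or every node is already covered
        if positions[i] - positions[far] > z:
            break  # the next node is disconnected: no relay can reach it
        relays.append(far)
        current = positions[far]
    return relays


if __name__ == "__main__":
    # Illustrative positions only: source at 0, range 100 m.
    print(greedy_relays([40.0, 90.0, 150.0, 180.0, 260.0], 100.0))
```

In this example the source covers the first two nodes, the node at 90 m relays to the nodes at 150 m and 180 m, and the node at 180 m relays to the last one.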

The general goal of a multihop broadcast protocol is to attain the widest network coverage in the shortest possible time. This can be obtained by pursuing three intermediate goals: (i) minimizing the number of communication hops; (ii) minimizing the number of effective retransmissions in every hop; (iii) minimizing the latency associated with a single hop. The number of transmission hops can be minimized by designating, as relays, the nodes forming the MCDS. However, the number of retransmissions and the latency are directly affected by the protocol characteristics, and there is no general rule for minimizing them.

Following the “idealized” MCDS-based design approach, a plethora of multihop broadcast protocols have recently been proposed in the VANET literature. Some of them, such as the Emergency Message Dissemination for Vehicular environments (EMDV) protocol [4], achieve remarkable performance by exploiting partial or complete knowledge of the network topology [5]. However, since collecting this type of information may be very expensive in terms of overhead, other techniques (requiring a reduced information exchange) have been proposed. An efficient IEEE 802.11-based protocol, denoted as Urban Multihop Broadcast (UMB), was proposed by Korkmaz et al. [6, 7]. UMB suppresses the broadcast redundancy by means of a black-burst contention approach [8], followed by a Request-To-Send/Clear-To-Send (RTS/CTS)-like mechanism. According to this protocol, a node can broadcast a packet only after having secured control of the channel. A different approach is adopted by another IEEE 802.11-based protocol, denoted as Smart Broadcast (SB) [9]. Similarly to UMB, SB partitions the transmission range of the source, associating non-overlapping contention windows with different regions. The Binary Partition Assisted Broadcast (BPAB) protocol [10] uses concepts from both UMB and SB and thus has similar performance, with an improvement over the SB protocol in VANETs with low vehicle spatial density and irregular topologies. Finally, a different approach is taken by the class of probabilistic broadcast protocols, designed around the idea that each node forwards a received packet according to a characteristic Probability Assignment Function (PAF), computed by each node in a distributed manner [11, 12]. An entire class of probabilistic broadcast protocols is proposed and analyzed by Wisitpongphan et al. [13].

Fig. 3.1 A typical linear network topology of a VANET

3.2.1 Reference Scenario

Figure 3.1 shows the reference linear network topology for a generic multihop broadcast protocol: a static one-dimensional wireless network with a source and \(N\) (receiving) nodes. The assumption of static nodes is not restrictive. In fact, from the perspective of a single transmitted packet, the network appears static because of the very short transmission time (with typical IEEE 802.11 transmission rates) [14]. At the same time, a one-dimensional network is suitable for analyzing highway-like VANETs, where the width of the road (lying in the interval \([10, 40]\) m) is significantly smaller than the transmission range of an IEEE 802.11 network interface. These motivations are supported by simulation results, illustrated in Sect. 3.7.

We consider a deterministic free-space propagation model (i.e., without fading) and a fixed transmit power: therefore, each vehicle has a fixed transmission range, denoted as \(z\) (dimension: [m]). The network size (the line length) is set to \(L\) (dimension: [m]). For generality, we denote as normalized network size the positive real number \(\ell _\mathrm{norm} \triangleq L/z\). Generally, \(\ell _\mathrm{norm} > 1\) and this motivates the need for multihop communication protocols.

On the basis of empirical traffic data [15], the nodes’ positions are generated according to a Poisson Point Process (PPP) of parameter \(\rho _\mathrm{s}\), where \(\rho _\mathrm{s}\) is the vehicle (linear) spatial density (dimension: [veh/m]); “veh” is not a proper unit of measure, but it will be used for the sake of clarity. Consequently, \(N\) is a random variable with a Poisson distribution with parameter \(\rho _\mathrm{s} L\). Similarly, the random variable \(N_z\), denoting the number of nodes lying in the transmission range of the source (i.e., within the interval \((0,z)\)), has a Poisson distribution with parameter \(\rho _\mathrm{s} z\). Thanks to the properties of the PPP, the inter-vehicle distance is exponentially distributed with parameter \(\rho _\mathrm{s}\) and the (constant) average distance between two consecutive vehicles is \(1/\rho _\mathrm{s}\).
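The relation between exponential inter-vehicle distances and Poisson-distributed node counts can be checked numerically. The sketch below is not the simulator used in this chapter: it is a minimal illustration in which the function names and the numerical values of \(\rho _\mathrm{s}\), \(L\), and \(z\) are assumptions chosen only for the example.

```python
import random


def sample_ppp(rho_s, length, rng):
    """Return the sorted node positions in (0, length) of a one-dimensional
    PPP with density rho_s [veh/m], built from exponential gaps."""
    positions = []
    x = rng.expovariate(rho_s)  # distance between the source and node 1
    while x < length:
        positions.append(x)
        x += rng.expovariate(rho_s)  # inter-vehicle distance, mean 1/rho_s
    return positions


def count_in_range(positions, z):
    """Number of nodes within the transmission range (0, z) of the source."""
    return sum(1 for x in positions if x < z)


if __name__ == "__main__":
    rng = random.Random(1)
    rho_s, L, z = 0.02, 1000.0, 100.0  # illustrative values only
    runs = 5000
    total_n = total_nz = 0
    for _ in range(runs):
        pos = sample_ppp(rho_s, L, rng)
        total_n += len(pos)
        total_nz += count_in_range(pos, z)
    # The empirical means should be close to rho_s * L and rho_s * z.
    print("mean N  :", total_n / runs, "expected:", rho_s * L)
    print("mean N_z:", total_nz / runs, "expected:", rho_s * z)
```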

As shown in Fig. 3.1, the source node, denoted as node \(0\), is placed at the west end of the network, and we assume a single propagation direction (eastbound). Each of the remaining \(N\) nodes is uniquely identified by an index \(i \in \{1,2,\dots ,N\}\). The distance between the \(i\)-th and \(j\)-th nodes (\(i,j \in \{1,2,\dots ,N\}, \, i \ne j\)) is denoted as \(d_{i,j}\). Each vehicle can exactly estimate the value of \(d_{i,j}\), thanks to the following assumptions: (i) the position of the source is known a priori by every node; (ii) each vehicle knows its own position, under the assumption that a Global Positioning System (GPS) receiver is present on board; (iii) each rebroadcaster inserts its own geographical coordinates within the packet. In the one-dimensional scenario with a single propagation direction described in Fig. 3.1, the operational principle of a multihop broadcast protocol is quite simple. The initial transmission of a new packet from the source is denoted as the \(0\)-th hop transmission, while the source itself identifies the so-called \(0\)-th Transmission Domain (TD). After the source transmission, the packet is received by the \(N_z\) neighbors of the source, which are the potential rebroadcasters at the \(1\)-st hop. Hence, their ensemble constitutes the \(1\)-st TD. Each vehicle in the \(1\)-st TD decides whether to forward the packet according to a PAF specified by the broadcast protocol. The use of silencing corresponds to the fact that the “fastest” retransmitter, among those which have decided to retransmit, silences the others. Note that a collision may happen if at least two nodes of a TD retransmit simultaneously. The propagation process therefore consists of multiple packet retransmissions, which continue at most until the east end of the network; as will be clear in the following, with a probabilistic broadcasting protocol the retransmission process might terminate before reaching the end of the network.

3.2.2 Performance Metrics of Interest

In this chapter, the performance of probabilistic multihop broadcast protocols is investigated using the following average metrics: (i) the REachability (RE), (ii) the Transmission Efficiency (TE), and (iii) the end-to-end delay (D). The RE (dimensionless), originally introduced by Ni et al. [2], is the fraction of nodes that receive the source packet among the set of all reachable nodes. The cardinality of the set of the reachable nodes is denoted as \(n_\mathrm{reach}\), and can be expressed as \(n_\mathrm{reach} = \min (N,n^*)\), where \(n^*\) is the minimum index such that the condition \(d_{n^*,n^*+1} > z\) is verified. This definition is necessary since in PPP scenarios, such as those considered in this section, there can exist a pair of disconnected consecutive nodes (\(n^*,n^*+1\)). The TE (dimensionless) is defined as the ratio between the RE of a packet and the overall number of rebroadcast acts experienced during its transmission to the last reachable node. Finally, D (dimension: [ms]) is defined as the duration of the packet trip between the source and the last reachable node. We remark that only the packets received correctly at the \(n_\mathrm{reach}\)-th node of the network are considered for the evaluation of D. Therefore, this definition of D corresponds to a worst-case scenario.
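The definitions of \(n_\mathrm{reach}\), RE, and TE can be made concrete with a short sketch. The helper names below are hypothetical; the code assumes that the node positions are sorted, that the source is placed in zero, and that the gap between the source and the first node is also checked against \(z\) (so that \(n^*=0\) when node \(1\) is out of range). Whether the initial source transmission counts as a rebroadcast act is left to the caller.

```python
def n_reachable(positions, z):
    """Cardinality of the set of reachable nodes: the nodes preceding the
    first gap larger than z (all N nodes if there is no such gap).

    positions: sorted distances [m] of nodes 1..N from the source (in 0).
    """
    previous = 0.0
    count = 0
    for x in positions:
        if x - previous > z:
            break  # disconnected pair found: the following nodes are unreachable
        count += 1
        previous = x
    return count


def reachability(n_received, n_reach):
    """RE: fraction of reachable nodes that received the packet."""
    if n_reach == 0:
        raise ValueError("RE is undefined when no node is reachable")
    return n_received / n_reach


def transmission_efficiency(re, n_rebroadcasts):
    """TE: ratio between RE and the number of rebroadcast acts."""
    if n_rebroadcasts < 1:
        raise ValueError("TE requires at least one rebroadcast act")
    return re / n_rebroadcasts


if __name__ == "__main__":
    positions = [40.0, 90.0, 150.0, 300.0, 340.0]  # illustrative values only
    n_reach = n_reachable(positions, 100.0)
    re = reachability(3, n_reach)
    print(n_reach, re, transmission_efficiency(re, 2))
```

In the example, the gap between the nodes at 150 m and 300 m exceeds the range of 100 m, so only the first three nodes are reachable.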

Owing to the symmetry of the forwarding process, the entire network can be modeled on the basis of the (local) analysis of a single TD. Therefore, in Sect. 3.3 we focus on a single TD—the reasons behind this assumption will be better clarified in Sect. 3.5.

3.3 Average Distribution of Poisson Points in a Segment with Finite Length

In one-dimensional networks, like those considered in this chapter, knowledge of the inter-node distances is necessary to implement the MCDS solution. For this reason, most of the proposed multihop broadcast protocols assume such knowledge, at least to some extent. Therefore, the first step in deriving an analytical model consists in statistically characterizing the spatial distribution of the vehicles. In the literature, node positions are frequently modeled with a PPP. Despite the apparent simplicity of this model, the derivation of an analytical performance evaluation framework based on the assumption of a Poisson spatial distribution of the vehicles is not straightforward.

Fig. 3.2 Illustrative realization of a PPP

We now present a constructive definition of a PPP with parameter \(\rho _\mathrm{s} \in \mathbb {R}^+\), directly inspired by the one presented in Papoulis’ book [16]. Given a finite interval \((-T/2, T/2) \subset \mathbb {R}\), place \(n \in \mathbb {N}\) points at random in \((-T/2, T/2)\), under the constraint that \(n/T = \rho _\mathrm{s}\). A PPP is obtained by letting \(n \rightarrow \infty \) and \(T \rightarrow \infty \), under the constraint that \(n/T\) remains equal to \(\rho _\mathrm{s}\). A PPP has the following properties: (i) the distance between two consecutive points is a random variable with an exponential distribution with parameter \(\rho _\mathrm{s}\); (ii) given \(z \in \mathbb {R}^+\), the number of points falling in the finite interval \(\mathcal {I} \triangleq (0,z) \subset \mathbb {R}\) is a random variable with a Poisson distribution with parameter \(\rho _\mathrm{s}z\). In Fig. 3.2, an illustrative realization of a PPP with parameter \(\rho _\mathrm{s}\) is shown. With reference to Fig. 3.2, denoting by \(n\) the number of Poisson points falling in \(\mathcal {I}\), it is possible to define the \(n\)-dimensional positions vector

$$\begin{aligned} \mathbf{R} ^{(n)} = [R_1 R_2 \dots R_n] \end{aligned}$$
(3.1)

where \(R_i\) (\(i \in \{1,2,\dots ,n\}\)) is the distance of the \(i\)-th point from the source (placed in zero)—in the illustrative case in Fig. 3.2, \(n=2\).

In Appendix B.1, it is shown that the marginal Probability Density Function (PDF) of \(R_j\) is:

$$\begin{aligned} f_{R_j}^{(n)}(r)={\left\{ \begin{array}{ll} \frac{n!}{z^{n}}\frac{(z-r)^{n-j}\,r^{j-1}}{(n-j)!\,(j-1)!} &{} r\in (0,z)\quad \quad \quad \quad j=1,\dots ,n\\ 0 &{} \mathrm{otherwise}. \end{array}\right. } \end{aligned}$$
(3.2)

In Fig. 3.3, the PDFs of the positions of consecutive nodes are shown for various values of \(n\): (a) \(1\), (b) \(2\), and (c) \(4\). In Appendix B.1, it is also shown that the average position of the \(j\)-th node can be expressed as follows:

$$\begin{aligned} \overline{R}_j^{(n)}=\int \limits _{0}^{z}r\frac{n!}{z^{n}}\frac{(z-r)^{n-j}\,r^{j-1}}{(n-j)!\,(j-1)!}\,\mathrm{d}r=j\frac{z}{n+1}\quad \quad \quad j=1,\dots ,n. \end{aligned}$$
(3.3)

From Eq. (3.3), it clearly emerges that, for a given number of nodes falling in a finite segment \(\mathcal {I} = (0,z)\), their average positions are equally spaced. The average node positions, for various values of the number \(n\) of nodes in \(\mathcal {I}\), are also shown in Fig. 3.3.
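Equation (3.3) can be verified with a quick Monte Carlo experiment. The sketch below relies on a standard property of the PPP, namely that, conditionally on \(n\) points falling in \((0,z)\), the points are independent and uniformly distributed in \((0,z)\); the function names and the numerical values are assumptions made only for illustration.

```python
import random


def theoretical_positions(n, z):
    """Average positions predicted by Eq. (3.3): j * z / (n + 1)."""
    return [j * z / (n + 1) for j in range(1, n + 1)]


def average_positions(n, z, runs, rng):
    """Monte Carlo estimate of the average position of the j-th closest
    point, given that exactly n points fall in (0, z)."""
    sums = [0.0] * n
    for _ in range(runs):
        # Given n, the points of a PPP are i.i.d. uniform in (0, z).
        points = sorted(rng.uniform(0.0, z) for _ in range(n))
        for j, r in enumerate(points):
            sums[j] += r
    return [s / runs for s in sums]


if __name__ == "__main__":
    rng = random.Random(3)
    n, z = 4, 100.0  # illustrative values only
    print("simulated :", average_positions(n, z, 20000, rng))
    print("Eq. (3.3) :", theoretical_positions(n, z))
```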

Fig. 3.3 PDFs \(\{f_{R_i}^{(n)}(r)\}_{i=1}^{n}\) for different values of \(n\): (a) \(n=1\), (b) \(n=2\), and (c) \(n=4\)

Thanks to these results, the average performance analysis of a broadcast protocol in a network with Poisson node distribution can be carried out by simply studying a deterministic scenario, where the nodes are placed in correspondence to the average positions of the corresponding Poisson-based scenario. Moreover, this average analysis applies to other vehicle spatial distributions (e.g., taking into account the constraint on the vehicle lengths) with equally spaced average positions.

3.4 A Quick Overview of the IEEE 802.11b Standard

3.4.1 The IEEE 802.11 Standard

In this chapter, we assume that the physical and the Medium Access Control (MAC) layers of every node adhere to the IEEE 802.11b standard [17]. The IEEE 802.11 standard has been introduced in Sect. 2.5.2, hence we limit our discussion to the aspects that are of interest for this section. In particular, the PHY layer aspects are discussed in Sect. 3.4.2, while the relevant MAC layer characteristics are analyzed in Sect. 3.4.3.

3.4.2 Physical Layer

The IEEE 802.11 standard defines several PHY layers, which differ in terms of modulation format and carrier frequency [17]. Because of their obsolescence, we ignore some of them, namely the legacy Frequency Hopping Spread Spectrum (FHSS), Direct Sequence Spread Spectrum (DSSS), and InfraRed (IR) PHYs, defined in Chaps. 14, 15, and 16 of the standard [17], respectively. We also ignore the IEEE 802.11n amendment [18], which defines a high-rate Multiple Input Multiple Output (MIMO) modulation format, because it is not yet sufficiently widespread (especially in handheld devices), and because its MIMO capabilities are not yet supported by the IEEE 802.11p amendment. We therefore focus on the remaining physical layers, introduced in the amendments a, b, and g.

The IEEE 802.11b amendment (now in Chap. 18 of the IEEE 802.11 standard [17]) introduced the so-called High Rate Direct Sequence Spread Spectrum (HR/DSSS) modulation, which combines the original DSSS modulation of the legacy standard with an 8-chip Complementary Code Keying (CCK) modulation, providing a maximum data rate of 11 Mbit/s. The IEEE 802.11b standard defines 14 overlapping channels, each 22 MHz wide, in the 2.4 GHz band. Because of this overlap, there is strong interference between neighboring channels, and therefore the channels cannot all be used together. Thanks to its adaptive rate selection capabilities, an IEEE 802.11b network interface can select the desired data rate in the set \(\{1,2,5.5,11 \}\) Mbit/s. Obviously, a lower data rate leads to a better receiver sensitivity, thus allowing operation in harsher channel conditions, with a lower Signal to Noise Ratio (SNR). As a rule of thumb, scaling the data rate down from 11 Mbit/s to 1 Mbit/s improves the sensitivity by approximately 8 dB.

On the other hand, the IEEE 802.11a amendment (now in Chap. 17 of the IEEE 802.11 standard [17]) is based on a more robust Orthogonal Frequency Division Multiplexing (OFDM) modulation, which offers a higher maximum data rate of 54 Mbit/s. Also in this case, the radio interface can adaptively select lower data rates, down to 6 Mbit/s. Unlike IEEE 802.11b, IEEE 802.11a works in the \([5.2, 5.8]\) GHz frequency band. The number of channels is not fixed, as it is possible to use channels of three different widths, namely 5, 10, and 20 MHz. Each channel is divided into 52 orthogonal sub-carriers. Depending on the modulation scheme, each sub-carrier encodes a specific number of bits in each symbol; for example, with the relatively simple Binary Phase Shift Keying (BPSK) modulation scheme, each sub-carrier encodes 1 bit. The signals of all sub-carriers are transformed into the time domain as symbols of fixed length. Consecutive symbols are separated by a guard interval in order to avoid interference between distinct symbols. The duration of the guard interval is a function of the channel bandwidth: with a channel width of 5, 10, and 20 MHz, the guard interval is equal to 3.2, 1.6, and 0.8 \(\upmu \mathrm{s}\), respectively.
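The link between the OFDM parameters and the nominal data rates can be sketched as follows. The code relies on details that are not stated in the text above and should be read as assumptions: of the 52 sub-carriers, 48 carry data and 4 are pilots; the useful part of a symbol lasts 3.2 \(\upmu \mathrm{s}\) at 20 MHz; all the durations scale inversely with the channel width; and the lowest and highest rates use BPSK with coding rate 1/2 and 64-QAM with coding rate 3/4, respectively. The function names are hypothetical.

```python
def guard_interval_us(bandwidth_mhz):
    """Guard interval [us]: 0.8 us at 20 MHz, scaled with the channel width."""
    return 0.8 * (20.0 / bandwidth_mhz)


def ofdm_data_rate_mbps(bits_per_subcarrier, coding_rate,
                        bandwidth_mhz=20.0, data_subcarriers=48):
    """Nominal data rate [Mbit/s] of an IEEE 802.11a-like OFDM PHY."""
    scale = 20.0 / bandwidth_mhz
    # Symbol duration = useful part (3.2 us at 20 MHz) + guard interval.
    symbol_us = 3.2 * scale + guard_interval_us(bandwidth_mhz)
    info_bits_per_symbol = data_subcarriers * bits_per_subcarrier * coding_rate
    return info_bits_per_symbol / symbol_us


if __name__ == "__main__":
    print("BPSK 1/2, 20 MHz  :", ofdm_data_rate_mbps(1, 0.5))
    print("64-QAM 3/4, 20 MHz:", ofdm_data_rate_mbps(6, 0.75))
    print("BPSK 1/2, 10 MHz  :", ofdm_data_rate_mbps(1, 0.5, bandwidth_mhz=10.0))
```

Under these assumptions, the two extreme configurations at 20 MHz reproduce the 6 Mbit/s and 54 Mbit/s rates quoted above, and halving the channel width halves the data rate.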

Finally, the IEEE 802.11g amendment (now in Chap. 19 of the IEEE 802.11 standard [17]) defines the Extended Rate PHY (ERP), a collection of different PHYs that are partially backward compatible with the pre-existing modulation formats (especially HR/DSSS). However, only the ERP-OFDM mode is implemented by almost all chipsets, while the other modulations are not widespread in the market [19]. The ERP-OFDM modulation format is basically a transposition of the IEEE 802.11a OFDM modulation to the 2.4 GHz band, with a few minor changes to provide backward compatibility. It supports the channel bandwidths, data rates, and guard intervals of the IEEE 802.11a modulation.

3.4.3 MAC Layer

The basic building block of an IEEE 802.11 network is the Basic Service Set (BSS), a group of STAtions (STAs) that can communicate with each other. IEEE 802.11 offers different ways to build a BSS. For instance, nodes can form an Independent BSS (IBSS) with no central coordination authority or, in environments with infrastructure (i.e., with an Access Point, AP), be part of an infrastructure BSS, which is identified by an individual identification number.

The IEEE 802.11 standard defines three types of frames (management, control, and data) that share a set of common characteristics. In particular, all the frames include a bit field for frame control, a duration field, several addresses, the frame body, and a Frame Check Sequence (FCS) for error detection. Each subtype is derived and adapted from the generic format (i.e., specific fields and data elements are added or left out). IEEE 802.11 provides several approaches for medium access control: (i) the Point Coordination Function (PCF), which is only applicable if an AP is available; (ii) the Distributed Coordination Function (DCF), which can also be used in fully distributed networks; (iii) the Hybrid Coordination Function (HCF), defined in the IEEE 802.11e amendment [20]. Within the HCF, there are two channel access methods, similar to those defined in the legacy 802.11 MAC: HCF Controlled Channel Access (HCCA) and Enhanced Distributed Channel Access (EDCA), already introduced in Sect. 2.13. In both EDCA and HCCA, every packet has to be assigned to a particular Access Category (AC). In turn, every AC establishes different channel access settings, which makes it possible to assign different priority levels to the packets [21]. Since the PCF and HCCA mechanisms are not of interest in VANETs, in the rest of the section we focus on the DCF mechanism. The DCF defines two different channel access mechanisms, both based on a Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) strategy, which differ in the number of employed control packets: the Basic Access (BA) and the Request To Send/Clear To Send (RTS/CTS) access. Due to the broadcast nature of the communications, channel contention is managed through the BA mechanism, whose operational principles are described in the following paragraph.

Fig. 3.4 Distributed Coordination Function mechanism defined by the IEEE 802.11 standard

3.4.3.1 Basic Access

The operation of the Basic Access mechanism is represented in Fig. 3.4. When a node has a frame ready to be transmitted, it checks whether the channel remains idle for a period of time at least as long as a Distributed InterFrame Space (DIFS): if this is the case, the node is free to transmit immediately. Conversely, if the wireless medium is busy, the node defers its transmission until the medium remains idle for a whole DIFS without interruption. In the latter case, once the DIFS has elapsed, the node generates a random backoff period, which corresponds to an additional waiting time before transmitting (pre-backoff). The node transmits when the backoff time has elapsed. At each transmission act, the backoff time is uniformly chosen in the range \([0, cw-1]\), where \(cw\) is the current backoff window size, which, in our broadcast setting, is constant and equal to the minimum value defined by the standard, denoted as \(\mathrm{CW}_\mathrm{min}\) and equal to 32. The backoff period is slotted, and the duration of the backoff, expressed in terms of number of backoff slots, is denoted as Backoff Counter (BC). This number is decremented as long as the medium is sensed idle, and it is frozen when a transmission is detected on the channel (this is an instance of a collision avoidance mechanism). Decrementing restarts when the medium is sensed idle again for more than a DIFS. At the end of every packet transmission, the node is forced to enter a post-backoff phase, which coincides with the subsequent pre-backoff if the node has another packet in the transmission queue. It is important to observe that when a relay finds the channel idle, it can immediately transmit, but this is not mandatory. In order to reduce the number of collisions within a TD, we have interpreted the standard in a non-persistent manner, imposing that every relay enters the pre-backoff phase, regardless of the channel status.

We also remark that the extension of our approach to scenarios with IEEE 802.11p [22] communications, as envisioned in VANETs, is straightforward. Our approach (based on the IEEE 802.11b standard) is meaningful under the assumption of smartphone-based vehicular communications [23, 24].
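The effect of the backoff window size on collisions can be quantified with a simple calculation. The sketch below is an idealization, not part of the analytical framework of this chapter: it assumes that \(m\) contending nodes start their backoff at the same instant, that each draws its BC independently and uniformly in \([0, cw-1]\), and that a collision occurs whenever the smallest BC is drawn by more than one node. The function name is hypothetical.

```python
from fractions import Fraction


def collision_probability(m, cw=32):
    """Probability that the smallest backoff counter among m contenders
    is not unique, with counters i.i.d. uniform in {0, ..., cw - 1}."""
    if m < 2:
        return Fraction(0)
    # P(unique minimum) = sum over k of P(one given node draws k and the
    # other m - 1 nodes draw a strictly larger value), times m choices.
    unique_minimum = sum(
        Fraction(m, cw) * Fraction(cw - 1 - k, cw) ** (m - 1)
        for k in range(cw)
    )
    return 1 - unique_minimum


if __name__ == "__main__":
    for m in (2, 5, 10):
        print(m, float(collision_probability(m)))
```

For two contenders the result reduces to \(1/cw\), i.e., the probability that both nodes draw the same counter.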

In case of unicast transmissions, after a period of time equal to a Short InterFrame Space (SIFS), the destination STA has to send an acknowledgment (ACK) in order to confirm the successful reception of the packet. The SIFS is shorter than the DIFS, thus giving a higher priority to the ACK transmission. After the reception of the ACK, the sender STA is forced to begin a backoff period, denoted as post-backoff. Otherwise, the sender would “capture” the channel, precluding access to the other STAs. Therefore, if another frame is ready for transmission before this post-backoff period ends, the STA has to complete the post-backoff before transmitting the frame. If the original sender does not correctly receive the ACK, and if it has not yet exceeded the maximum number of retries, it can re-attempt the packet transmission from the beginning, after having doubled its CW value. We note that the CW is initialized to \(\mathrm{CW}_\mathrm{min}\), and it cannot exceed the value \(\mathrm{CW}_\mathrm{max}\).

In case of a broadcast transmission, the use of the ACK is forbidden, which leads to a slightly different Basic Access behavior. Without ACKs, the sender cannot know the status of the packet reception at the destination and, therefore, there are never retransmissions (at least at the MAC layer). This has several consequences:

  • CW is never increased and it is always equal to \(\mathrm{CW}_\mathrm{min}\);

  • the transmissions are intrinsically less reliable;

  • the transmission overhead is smaller.

Because of the less reliable transmissions, it is necessary to adopt suitable countermeasures at the upper layers (network and transport).

We finally observe that, when the last packet received by a certain node was corrupted, the DIFS period shall be replaced by a longer Extended IFS (EIFS) period. If we denote the durations of a SIFS, a DIFS, and an EIFS as \(T_\mathrm{SIFS}\), \(T_\mathrm{DIFS}\), and \(T_\mathrm{EIFS}\), respectively (dimension: [\(\upmu \mathrm{s}\)]), the following relations hold:

$$\begin{aligned} T_\mathrm{DIFS}&= T_\mathrm{SIFS} + 2 T_\mathrm{SLOT} \\ T_\mathrm{EIFS}&= T_\mathrm{SIFS} + T_\mathrm{DIFS} + T_\mathrm{ACK}, \end{aligned}$$

where \(T_\mathrm{SLOT}\) (dimension: [\(\upmu \mathrm{s}\)]) is the duration of a backoff slot and \(T_\mathrm{ACK}\) (dimension: [\(\upmu \mathrm{s}\)]) is the time required to transmit an ACK frame.
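As a worked example, the two relations can be evaluated numerically. The values used below are assumptions, not taken from the text above: they are the figures commonly quoted for IEEE 802.11b (SIFS of 10 \(\upmu \mathrm{s}\), slot of 20 \(\upmu \mathrm{s}\), long PLCP preamble and header of 192 \(\upmu \mathrm{s}\), 14-byte ACK sent at 1 Mbit/s) and for IEEE 802.11a at 20 MHz (SIFS of 16 \(\upmu \mathrm{s}\), slot of 9 \(\upmu \mathrm{s}\)). The function names are hypothetical.

```python
def difs_us(sifs_us, slot_us):
    """T_DIFS = T_SIFS + 2 * T_SLOT."""
    return sifs_us + 2 * slot_us


def eifs_us(sifs_us, slot_us, ack_us):
    """T_EIFS = T_SIFS + T_DIFS + T_ACK."""
    return sifs_us + difs_us(sifs_us, slot_us) + ack_us


def ack_airtime_11b_us(rate_mbps=1.0, plcp_us=192.0, ack_bytes=14):
    """Airtime of an ACK frame: PLCP preamble and header plus the payload
    (assumed values for IEEE 802.11b with long preamble)."""
    return plcp_us + ack_bytes * 8 / rate_mbps


if __name__ == "__main__":
    t_ack = ack_airtime_11b_us()
    print("802.11b DIFS [us]:", difs_us(10, 20))
    print("802.11b ACK  [us]:", t_ack)
    print("802.11b EIFS [us]:", eifs_us(10, 20, t_ack))
    print("802.11a DIFS [us]:", difs_us(16, 9))
```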

3.4.4 Main IEEE 802.11 Parameters

Table 3.1 summarizes the main default parameters defined for the IEEE 802.11a and IEEE 802.11b standards [17].

Table 3.1 Main parameters of IEEE 802.11a and IEEE 802.11b standards

3.5 Probabilistic Broadcast Protocols with Silencing

3.5.1 Preliminary Considerations

The general goal of a multihop broadcast protocol is to attain the widest network coverage in the shortest possible time. This can be obtained by pursuing three intermediate goals: (i) minimizing the number of communication hops; (ii) minimizing the number of effective retransmissions in every hop; (iii) minimizing the latency associated with a single hop. The number of transmission hops can be minimized by designating, as relays, the nodes forming the MCDS. However, the number of retransmissions and the latency are directly affected by the protocol characteristics, and there is no general rule for minimizing them—this motivates the presence, in the literature, of a large number of heuristic broadcast protocols.

A probabilistic broadcast protocol tries to achieve the goals outlined in the previous paragraph in a probabilistic and completely distributed manner: (i) probabilistic, in the sense that every intermediate node decides to retransmit a packet according to a certain PAF, computed on a per-packet basis (in general, one could introduce a per-flow PAF, but in this section we focus on single packet transmissions); (ii) distributed, in the sense that every node autonomously makes a retransmission decision without any coordination with its neighbors.

In “classical” probabilistic broadcast protocols (without silencing), if no suitable counter-measures are adopted, it is possible that more than one node in a TD decides to rebroadcast the packet (even without collisions). This leads to inefficiencies—besides complicating the mathematical analysis. A more efficient probabilistic broadcast protocol, regardless of the expression of the PAF, is obtained in the presence of a single retransmitting node in every TD. This can be obtained by imposing that the reception of a packet sent by a node of a TD silences the preceding nodes of the same TD. As a consequence, the next TD starts from the node which follows the “silencer.” Note that the last TD partially overlaps with the previous one if the “silencer” is not a member of the MCDS.

In this chapter, we consider two novel probabilistic broadcast protocols with silencing, whose operations can be described as follows, with respect to the first TD.

  1. The source sends a new packet (directly mapped on an IEEE 802.11 frame).

  2. The nodes within a distance \(z\) from the source receive the packet and form the \(1\)-st TD. Their number is denoted as \(N_{z}\).

  3. Every node in the \(1\)-st TD probabilistically decides, according to the given PAF and taking into account its distance from the source, to retransmit (or not) the packet.

  4. The potential forwarders (i.e., the nodes of the \(1\)-st TD which have decided to retransmit) compete for channel access by using the BA mechanism of the IEEE 802.11b standard (described in Sect. 3.4), first entering the pre-backoff phase and then generating a random waiting time (denoted as BC in Sect. 3.4). For the purpose of analytical simplicity, we assume that the BCs of the losing contenders are set to \(\infty \).

  5. The BCs are continuously decreased by all nodes until (in the case of a successful forwarding) only one of them, say the \(k\)-th BC, reaches 0; the corresponding \(k\)-th node then retransmits the packet. During a transmission of a node the other BCs freeze. Should the BCs of at least two nodes reach zero simultaneously, both nodes would transmit and, thus, collide. We assume that the packets involved in a collision are undetectable and ignored by the other nodes.

  6. The remaining \(N_z-1\) nodes decode the packet, reset their timers, and discard the potentially queued packet. The nodes (spatially) preceding the \(k\)-th node will refrain from retransmitting from then on.

  7. The whole process (from step 1) is restarted at the \(2\)-nd TD, for which the \(k\)-th node acts as the source. The \(2\)-nd TD is composed of all nodes lying in the interval \((d_{0,k}, d_{0,k}+z) \subset \mathbb {R}\), and it can also include some former nodes of the \(1\)-st TD (those following the \(k\)-th node).
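Steps 3 to 5 can be illustrated with a short Monte Carlo sketch for a single TD. This is only an illustration under simplifying assumptions, not the simulator used in this chapter: the function name `simulate_td` and its arguments are hypothetical, the PAF is passed in as a generic callable, channel errors are ignored, and contenders that collide are simply dropped while the remaining ones keep counting down.

```python
import random


def simulate_td(distances, z, paf, cw=31, rng=random):
    """Simulate one TD; return the index of the relay, or None on failure.

    distances: distances of the TD nodes from the current source (all <= z).
    paf:       callable paf(d, z) -> retransmission probability (assumption).
    cw:        backoff counters are drawn uniformly in {0, ..., cw - 1}.
    """
    # Step 3: every node decides independently whether to retransmit.
    contenders = [i for i, d in enumerate(distances) if rng.random() < paf(d, z)]
    if not contenders:
        return None  # nobody decides to retransmit

    # Step 4: every potential forwarder draws its backoff counter (BC).
    bc = {i: rng.randrange(cw) for i in contenders}

    # Step 5: counters expire in increasing order. Two or more counters
    # expiring in the same slot collide; a single expiring counter wins.
    for value in sorted(set(bc.values())):
        expiring = [i for i in contenders if bc[i] == value]
        if len(expiring) == 1:
            return expiring[0]
    return None  # every transmission collided


# Example with a quadratic PAF and four equally spaced nodes.
relay = simulate_td([25.0, 50.0, 75.0, 100.0], 100.0, lambda d, z: (d / z) ** 2)
```

The returned index identifies the node that would act as the source of the next TD (step 7); `None` covers both failure outcomes, i.e., no contender or only collisions.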

The two novel probabilistic broadcast protocols, polynomial and SIF, are described in the following two subsections.

3.5.2 Polynomial Broadcast Protocol

This protocol is characterized by a polynomial PAF, with the following form:

$$\begin{aligned} p(d,z,g) \triangleq \left( \frac{d}{z}\right) ^{g} \end{aligned}$$
(3.4)

where \(d\) denotes the distance (dimension: [m]) between the node of interest and the previous relay (or source, in the case of the first TD); \(z\) is the already introduced transmission range; \(g \in \mathbb {N}\) is the polynomial order. According to the assumptions in Sect. 3.2, both \(z\) and \(d\) are known, without the need of exchanging additional messages. In fact, \(z\) can be estimated by knowing the transmit power and the channel propagation model, while \(d\) can be estimated by simply inserting the position of the source vehicle in every transmitted packet (under the assumption of having an accurate GPS receiver).

The shape of \(p\), as a function of \(d\), is shown in Fig. 3.5, for different values of \(g\). It can be observed that the function \(p\) is monotonically non-decreasing in \(d\) and, for \(g \ge 1\), convex. For high values of \(g\), it becomes quite “selective,” since it is approximately zero everywhere except in the proximity of \(z\). Note that the case with \(g=0\) (\(p=1, \, \forall d\)) corresponds to the flooding protocol, i.e., each node retransmits. In this case, the BC value is randomly selected in \(\{0,1,\dots ,cw-1\}\) as mandated by the IEEE 802.11 standard (Sect. 3.4).
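The selectivity of (3.4) can be checked numerically with a minimal sketch. The function name `polynomial_paf` is hypothetical, and the argument check is an added assumption (the PAF is only meaningful for \(0 \le d \le z\)).

```python
def polynomial_paf(d, z, g):
    """Polynomial PAF of Eq. (3.4): p = (d / z) ** g, for 0 <= d <= z."""
    if not 0 <= d <= z:
        raise ValueError("d must lie in [0, z]")
    return (d / z) ** g


# A node halfway through the TD (d = z / 2) for increasing orders g.
for g in (0, 1, 2, 7):
    print(g, polynomial_paf(50.0, 100.0, g))
```

With \(g=0\) every node retransmits (flooding), whereas with \(g=7\) the node at \(d = z/2\) retransmits with a probability below 1 %, while the node at \(d=z\) still retransmits with probability 1.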

Fig. 3.5 Probability of retransmission (denoted as \(p\)) of the polynomial probabilistic protocol as a function of the distance \(d\) for several values of \(g\)

3.5.3 Silencing Irresponsible Forwarding

This broadcast protocol directly derives from the Irresponsible Forwarding (IF) protocol, originally presented in [25], with the introduction of the silencing mechanism outlined in Sect. 3.5.1. Apart from this difference, IF and SIF share the same PAF, namely:

$$\begin{aligned} p(d,z,c) \triangleq \exp \left\{ -\rho _\mathrm{s} \frac{(z-d)}{c}\right\} \end{aligned}$$
(3.5)

where \(c\) is a dimensionless shaping coefficient and \(\rho _\mathrm{s}\) is the vehicle spatial density. The latter can be estimated in a straightforward manner. In fact, under the assumption of knowing its transmission range with sufficient accuracy, a node can estimate its local vehicular spatial density by simply counting the number of nodes lying within its transmission range and dividing this number by the transmission range. The design of an efficient method for accurate estimation of the vehicular spatial density goes beyond the scope of this manuscript. However, intuitively it is sufficient to periodically send (and receive) Hello messages to (and from) the surrounding nodes. Alternatively, it is possible to rely on already existing beaconing mechanisms, such as the exchange of Cooperative Awareness Messages (CAMs) foreseen by the European Car-to-car consortium (broadcast by default every 500 ms) [26].

Fig. 3.6 Probability of retransmission (denoted as \(p\)) of the SIF protocol as a function of the distance \(d\) for several values of \(c\) and \(\rho _\mathrm{s}z\)

Similarly to the PAF of the polynomial broadcast protocol, the PAF of the SIF protocol also “rewards” the farthest nodes (with respect to the transmitter). However, unlike the polynomial PAF, the SIF’s PAF also takes into account the (linear) vehicular spatial density, thus adapting better to different traffic conditions; this is the very idea of IF. The shape of \(p\), as a function of \(d\), is shown in Fig. 3.6, for different values of \(c\) and \(\rho _\mathrm{s}\). It can be observed that the SIF’s PAF is monotonically increasing and convex in \(d\) for all values of \(c\). Moreover, it becomes selective for small values of \(c\) (e.g., 1), while it tends to flatten for high values of \(c\) and for low values of \(\rho _\mathrm{s}\). Also in this case, the BC value is randomly selected in \(\{0,1,\dots ,cw-1\}\) as mandated by the IEEE 802.11 standard (Sect. 3.4).
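The behavior just described can be reproduced with a minimal sketch. The exponent is taken as \(-\rho _\mathrm{s}(z-d)/c\), so that \(p\) increases with \(d\) and equals 1 at \(d=z\), consistently with the description above. The function names are hypothetical, and the density estimator is only the naive neighbor count divided by the range mentioned in the text.

```python
import math


def sif_paf(d, z, rho_s, c):
    """SIF/IF PAF: exp(-rho_s * (z - d) / c), for 0 <= d <= z."""
    return math.exp(-rho_s * (z - d) / c)


def estimate_density(neighbor_count, z):
    """Naive local density estimate [veh/m]: neighbors within range / range."""
    return neighbor_count / z


rho_s = estimate_density(10, 100.0)  # 10 neighbors within 100 m
for c in (1, 5, 20):
    # PAF at 50 %, 90 %, and 100 % of the transmission range.
    print(c, [round(sif_paf(d, 100.0, rho_s, c), 4) for d in (50.0, 90.0, 100.0)])
```

Small values of \(c\) concentrate the retransmission probability near \(d=z\), while large values of \(c\) (or small values of \(\rho _\mathrm{s}\)) flatten the curve towards 1.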

3.6 A Recursive Analytical Performance Evaluation Framework

In Sect. 3.2, it has been stated that, since all TDs are statistically identical, the global behavior of the network can be modeled by analyzing a single TD. By exploiting the properties of probabilistic broadcast protocols with silencing (described in Sect. 3.5), the following assumptions hold: (i) the inter-node distance is characterized by a (memoryless) exponential distribution, so that the topology of every TD is (statistically) identical; (ii) the PAF only depends on the distance and is, therefore, memoryless; (iii) the IEEE 802.11b contention mechanism is memoryless, in the sense that it is restarted at every retransmission. Under these assumptions, every retransmission act can be interpreted as a renewal that resets the statistics of the forwarding process. Moreover, since all TDs are statistically identical, without loss of generality we can focus on the first TD.

Therefore, a complete analytical performance evaluation framework can be derived in the following manner: (i) characterizing the first TD with local performance metrics (e.g., the successful transmission probability and the delay); (ii) deriving global performance metrics (e.g., D, RE, TE), by means of a recursive approach.

In Sect. 3.6.1, the local performance (i.e., single TD) is investigated under the assumption of a given number of equally spaced nodes, by considering, without loss of generality, the first TD. In Sect. 3.6.2, we derive the global metrics for an overall deterministic network scenario, where the nodes are equally spaced in the interval \((0,L)\). Then, in Sect. 3.6.3 the results obtained in the deterministic scenario are extended to the original PPP-based scenario.

3.6.1 Local (Single Transmission Domain) Performance Analysis with a Given Number of Nodes

Without loss of generality, we focus on the first TD, corresponding to the interval \(\fancyscript{I}\) introduced in Sect. 3.3. We consider a deterministic scenario with a fixed number \(n\) of nodes equally spaced in the interval \(\fancyscript{I} = (0,z) \subset \mathbb {R}\). Every node in a TD is identified by an index \(i \in \{1,2,\dots ,n\}\). The nodes are thus positioned as in Fig. 3.3 and the positions vector \(\mathbf{R} ^{(n)}\) is defined as in (3.1).

According to the operational principles of the considered protocol, after the reception of a packet in a given TD, each node decides to (or not to) retransmit according to the protocol’s PAF. The nodes that lose the contention set their BCs to \(\infty \), while the winners set their BCs according to the policy of the specific broadcast protocol. The protocol execution could lead to three different outcomes: (i) nobody decides to retransmit; (ii) some nodes decide to retransmit, but all their transmitted packets collide; (iii) some nodes decide to retransmit, and a single node transmits successfully (when its BC becomes zero, no other BC is zero). It is useful to define the following events, associated with the forwarding process in a TD:

$$\begin{aligned} \fancyscript{F}_1&\triangleq \{\text {nobody decides to retransmit}\} \\&= \{ \mathrm{BC}_i = \infty , \forall i \in \{0,1,\dots ,n\}\} \\ \fancyscript{F}_2&\triangleq \{\text {all the transmitted packets collide}\} \\&= \{ \forall i \in \{0,1,\dots ,n\}: \mathrm{BC}_i < \infty , \, \exists j \in \{0,1,\dots ,n\} , \, j \ne i, \mathrm{BC}_j < \infty \\&\quad \text {such that} \, \mathrm{BC}_i=\mathrm{BC}_j \} \\ \fancyscript{F}&\triangleq \{\text {nobody wins the contention}\} = \fancyscript{F}_1 \cup \fancyscript{F}_2 \\ \fancyscript{S}_i&\triangleq \left\{ \text {the node} \, i \, \text {successfully retransmits} \right\} \qquad i \in \{1,\dots ,n\} \\&= \{\mathrm{BC}_i < \infty , \mathrm{BC}_i = \min (\{\mathrm{BC}_m \}_{m=1} ^{n}) \} \\&\quad \cup \{ \text {if} \, \, \exists j \in \{1,\dots ,n\}, i \ne j: \mathrm{BC}_j < \mathrm{BC}_i, \text {then } \exists \, m \in \{1,\dots ,n\}, \\&\quad m \ne j, m \ne i: \mathrm{BC}_j = \mathrm{BC}_m \}\qquad i \in \{1,\dots ,n\} \\ \fancyscript{S}&\triangleq \{\text {a node successfully retransmits}\} = \bigcup _{i=1} ^{n} {\fancyscript{S}_{i}}. \end{aligned}$$

The probabilities of the above defined events are the following:

$$\begin{aligned} p_\mathrm{rtx}^{(n)}(i)&\triangleq \mathrm{P} \{ \fancyscript{S}_i \} \qquad i=1,2,\dots ,n \\ p_\mathrm{succ}^{(n)}&\triangleq \mathrm{P} \{\fancyscript{S} \} = \sum _{i=1}^{n}p_\mathrm{rtx}^{(n)}(i)\\ p_\mathrm{fail}^{(n)}&\triangleq 1-\mathrm{P} \{\fancyscript{S} \} = 1 - \sum _{i=1}^{n}p_\mathrm{rtx}^{(n)}(i). \end{aligned}$$

Let us now introduce the random variable \(Y\in \left\{ 0,1,2,\dots ,n\right\} \) with the following PMF:

$$\begin{aligned} P_{Y}(y)=P\left\{ Y=y\right\} = \left\{ \begin{array}{cc} p_\mathrm{fail}^{(n)} &{} y=0\\ p_\mathrm{rtx}^{(n)}(y) &{} y \in \{1,2,\dots ,n\}. \end{array} \right. \end{aligned}$$

Since the event \(\{Y=0\}\) identifies the failure event, the random variable \(Y\) indicates either which node has effectively retransmitted or a failure. Moreover, it can be observed that:

$$ \bigcup _{y=0} ^{n} \{Y = y\} =\fancyscript{F} \cup \fancyscript{S}.$$

Obviously,

$$\begin{aligned} P_{Y}(y|\fancyscript{S})=P_{Y}(Y=y|\fancyscript{S}) = \left\{ \begin{array}{cc} 0 &{} y=0\\ \frac{p_\mathrm{rtx}^{(n)}(y)}{\sum _{i=1}^{n}{p_\mathrm{rtx}^{(n)}(i)}} &{} y \in \{1,2,\dots ,n\}. \end{array} \right. \end{aligned}$$

In other words, if there is a retransmission (\(\fancyscript{S}\)), then \(P_{Y}(y | \fancyscript{S})\) (\(y \in \{0,1,2,\dots ,n\}\)) is the probability that the \(y\)-th node has retransmitted.
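The two PMFs of \(Y\) follow directly from the per-node probabilities \(\{p_\mathrm{rtx}^{(n)}(i)\}\). The sketch below builds both of them from a list of such probabilities; the function names are hypothetical and the numerical values are purely illustrative, not results from this chapter.

```python
def pmf_y(p_rtx):
    """Unconditional PMF of Y: index 0 is the failure event, 1..n the nodes."""
    p_succ = sum(p_rtx)
    return [1.0 - p_succ] + list(p_rtx)


def pmf_y_given_success(p_rtx):
    """PMF of Y conditioned on a successful retransmission (event S)."""
    p_succ = sum(p_rtx)
    if p_succ <= 0.0:
        raise ValueError("conditioning on an event with zero probability")
    return [0.0] + [p / p_succ for p in p_rtx]


# Illustrative values for a TD with n = 3 nodes (made up for the example).
p_rtx = [0.1, 0.2, 0.3]
print(pmf_y(p_rtx))
print(pmf_y_given_success(p_rtx))
```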

As shown in Appendix B.2, the transmission probabilities \(\{p_\mathrm{rtx}^{(n)}(i)\}\) can be expressed as follows:

$$\begin{aligned} p_\mathrm{rtx}^{(n)}(i)=p_{i}\sum _{m=1}^{n}q^{(m)}p_{V_{i}^{(n)}}(m-1) \end{aligned}$$
(3.6)

where \(p_{i}\) denotes the value of the PAF (3.4) for the \(i\)-th node and depends on the considered protocol; \(q^{(m)}\) is the probability that the \(i\)-th node wins the contention among a set of \(m\) competing nodes (the same for a given value of \(n\)); \(V_{i}^{(n)} \in \left\{ 0,\dots ,n-1\right\} \) is the following discrete random variable:

$$\begin{aligned} V_{i}^{(n)} \triangleq \left\{ \text {number of nodes, among the n nodes, competing with the i-th node}\right\} . \end{aligned}$$

The derivation of \(q^{(m)}\) and of the PMF of \(V_{i}^{(n)}\) can also be found in Appendix B.2.

After deriving \(p_\mathrm{rtx}^{(n)}(i)\), it is possible to compute the per-hop delay, denoted as \(D_i\), of a retransmission from the \(i\)-th node. Since the per-hop delay is meaningful only if the \(i\)-th node decides to retransmit, it is of interest to study the statistical distribution of \(D_i\) conditioned on \(S_i\). For this reason, we introduce the random variable \(D_{i|i}\), which can be defined as follows:

$$\begin{aligned} D_{i|i} \triangleq T_\mathrm{slot} (\textit{DIFS}+N_{i|i}^\mathrm{bo})+T^\mathrm{tx}\qquad i=1,\dots ,n \end{aligned}$$

where \(T^\mathrm{tx}\) (dimension: [s]) is the transmission time; \(T_\mathrm{slot}\) (dimension: [s/slot]) is the deterministic duration of the backoff slot; \(\textit{DIFS}\) (dimension: [slot]) is the duration of the DIFS; and \(N_{i|i}^\mathrm{bo}\) (dimension: [slots]) is the number of slots spent by the \(i\)-th node during the backoff (conditionally on the event \(S_i\)). We assume that both the packet size, denoted as \(P\) (dimension: [bits]), and the transmission rate, denoted as \(R\) (dimension: [bits/s]), are constant, thus leading to a deterministic packet transmission time \(T^\mathrm{tx} = P/R\). Taking into account that \(\textit{DIFS}\), \(T_\mathrm{slot}\), and \(T^\mathrm{tx}\) are deterministic, the average value of \(D_{i|i}\) becomes:

$$\begin{aligned} \overline{D}_{i|i}=T_\mathrm{slot} (\textit{DIFS}+\overline{N}_{i|i}^\mathrm{bo})+T^\mathrm{tx}\qquad i=1,\dots ,n \end{aligned}$$
(3.7)

where, according to the derivation in Appendix B.3,

$$\begin{aligned} \overline{N}_{i|i}^\mathrm{bo} = \frac{p_{i}}{cw p_\mathrm{rtx}^{(N)}(i)} \sum _{v=0}^{N-1}p_{V_{i}^{(N)}}(v)\sum _{k=1}^{cw-1} \left[ k \sum _{j=0}^{J_{k,v}} P'_v(k,j) + T^\mathrm{tx} \sum _{j=1}^{J_{k,v}} j P'_v(k,j) \right] \end{aligned}$$
(3.8)

where \(J_{k,v} \triangleq \min (k,\lfloor v/2\rfloor )\) denotes the maximum number of collisions that can happen in slots \(0,1,\dots ,k-1\), while the matrix \(\mathbf{P}_v' = \{P'_v(k,j) \}\) is defined in Appendix B.3.

Proceeding in a similar manner, it is also possible to obtain the average number of retransmissions per-hop of the node \(i\), denoted as \(\overline{N}_\mathrm{rtx}^\mathrm{hop}(i)\):

$$\begin{aligned} \overline{N}_\mathrm{rtx}^\mathrm{hop} (i) = \frac{p_{i}}{cw p_\mathrm{rtx}^{(N)}(i)} \left( 1 + \sum _{v=0}^{N-1} p_{V_{i}^{(N)}}(v) \sum _{k=1}^{cw-1}\sum _{h=2} ^{v} h N_{k,v} (0,h) \sum _{j=0} ^{J_{k,v}} M_{k,v} (j,h) \right) \end{aligned}$$
(3.9)

where the matrices \(\mathbf{M}_ {k,v} = M_{k,v} (j,h)\) and \(\mathbf{N}_{k,v} = N_{k,v} (j,h)\) are defined in Appendix B.3.

3.6.2 Global Performance Analysis with Fixed Number of Nodes

Once the per-TD performance has been analyzed (as described in Sect. 3.6.1), the global performance metrics introduced in Sect. 3.2.2 (namely, RE, TE, and D) can be computed by following a recursive approach, based on the inductive principle. This recursive approach is extensively described, for the evaluation of D, in Appendix B.4, but can be directly re-adapted for the evaluation of RE and TE. In the remainder of this subsection, we outline the final results, trying to provide the reader with the intuition behind them.

Recall that we consider a deterministic scenario with a fixed number \(N\) of nodes equally spaced in the interval \((0,L) \subset \mathbb {R}\), where \(L = z\ell _\mathrm{norm}\). For simplicity, we assume that a generic TD contains \(n=N/\ell _\mathrm{norm}\) nodes. This corresponds to a best-case scenario, where the farthest node of each TD is the domain forwarder (the “silencer,” as denoted in Sect. 3.5).

3.6.2.1 Delay

The computation of the average D is carried out by taking into account only the packets that successfully arrive at the end of the network (i.e., at the last reachable node) and ignoring the (remaining) packets that stop earlier. On the basis of the approach described in detail in Appendix B.4, the average end-to-end delay can be given the following recursive formulation:

$$\begin{aligned} D \triangleq \overline{D}^{(N)}=\overline{T}^\mathrm{tx}_\mathrm{src} + \sum _{i=1}^{n}\Bigl (\overline{D}^{(N-i)}+\overline{D}_{i|i}\Bigr ) p_{Y}(i | \fancyscript{S}) \end{aligned}$$
(3.10)

where \(\overline{D}^{(N-i)}\) is the average delay in a network with \(N-i\) nodes and \(\overline{T}^\mathrm{tx}_\mathrm{src}\) is the average transmission time of the source, which differs from that of the following nodes, since the source does not contend with any other node and its transmission is not affected by collisions. Since the average time spent in the backoff is \((cw-1)/2\) slots, \(\overline{T}^\mathrm{tx}_\mathrm{src}\) can be expressed as

$$\begin{aligned} \overline{T}^\mathrm{tx}_\mathrm{src} \triangleq T^\mathrm{tx} + T_\mathrm{slot} \left( \textit{DIFS} + \frac{cw-1}{2}\right) . \end{aligned}$$
(3.11)
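As a numerical illustration of (3.7) and (3.11), the following sketch evaluates the packet transmission time and the average source transmission time for \(P=1{,}000\) bytes, \(R=1\) Mbps, and \(cw=31\). The function names are hypothetical, and the slot duration (20 µs) and DIFS (50 µs, i.e., 2.5 slots) are assumed IEEE 802.11b values rather than values taken from the tables of this chapter.

```python
def tx_time_s(packet_bytes, rate_bps):
    """Deterministic packet transmission time T_tx = P / R (P in bits)."""
    return 8 * packet_bytes / rate_bps


def mean_hop_delay_s(t_slot_s, difs_slots, mean_backoff_slots, t_tx_s):
    """Eq. (3.7): T_slot * (DIFS + mean backoff slots) + T_tx."""
    return t_slot_s * (difs_slots + mean_backoff_slots) + t_tx_s


def mean_source_tx_time_s(t_slot_s, difs_slots, cw, t_tx_s):
    """Eq. (3.11): the source spends (cw - 1) / 2 slots in backoff on average."""
    return mean_hop_delay_s(t_slot_s, difs_slots, (cw - 1) / 2, t_tx_s)


T_SLOT = 20e-6      # s (assumed)
DIFS_SLOTS = 2.5    # 50 us expressed in slots (assumed)

t_tx = tx_time_s(1000, 1e6)
t_src = mean_source_tx_time_s(T_SLOT, DIFS_SLOTS, 31, t_tx)
print(f"T_tx = {t_tx * 1e3:.2f} ms, source time = {t_src * 1e3:.2f} ms")
```

Under these assumptions the payload dominates: about 8 ms of transmission time against roughly 0.35 ms of DIFS and average backoff.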

3.6.2.2 RE

The average RE can be defined as follows:

$$\begin{aligned} \mathrm{RE}\triangleq \frac{\overline{N}_\mathrm{reach}}{N} \end{aligned}$$
(3.12)

where \({N}_\mathrm{reach}\) is a random variable denoting the number of nodes reached by a packet. As a consequence of our assumptions, \({N}_\mathrm{reach}\) is lower bounded by \(n\), since the transmission from the source reaches \(n\) nodes (those of the first TD) with probability 1. The average value \(\overline{N}_\mathrm{reach}\) can be obtained by following the approach described in Appendix B.4, but for the replacement of \(p_{Y}(i | \fancyscript{S})\) with \(p_{Y}(i)\) and of \(\overline{D}_{i|i}\) with the number of additional nodes covered by a new transmission. For example: a transmission from the 1-st node of the first TD will reach only one additional node (namely, the \((n+1)\)-th); a transmission from the 3-rd node will reach three additional nodes (namely, the \((n+1)\)-th, \((n+2)\)-th, and \((n+3)\)-th); and so on. The reader should note that, unlike the delay, in the computation of the RE we are not conditioning on the fact of reaching the \(N\)-th node of the network, i.e., the last reachable node of the network. Therefore, also the packets which stop being retransmitted are taken into account.

After the execution of the recursive approach outlined in Appendix B.4, it is sufficient to add a constant equal to \(n\), corresponding to the number of nodes directly reached by the source at the first hop. The final expression of \(\overline{N}_\mathrm{reach}\) becomes (using the notation of Appendix B.4):

$$\begin{aligned} \overline{N}_\mathrm{reach} = \overline{N}_\mathrm{reach}^{(N)}&= n + \sum _{i=1}^{n}\Bigl (\overline{N}_\mathrm{reach}^{(N-i)}+i\Bigr )p_{Y}(i) \nonumber \\&= n + \sum _{i=1}^{n}\Bigl (\overline{N}_\mathrm{reach}^{(N-i)}+i\Bigr )p_\mathrm{rtx}^{(n)}(i) \end{aligned}$$
(3.13)

where \(\overline{N}_\mathrm{reach}^{(N-i)}\) corresponds to the average number of nodes reached in a network with \(N-i\) nodes and can be recursively computed in the same way.
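The recursion behind (3.13) can be sketched with a memoized function. Since Appendix B.4 is not reproduced here, the base case and the handling of the network edge are assumptions of this sketch: the recursive term counts only the nodes reached beyond the current TD, it is zero when no node lies beyond the current TD, the gain of a relay is capped by the number of remaining nodes, and the constant \(n\) is added once at the end. The name `mean_reached_nodes` is hypothetical.

```python
from functools import lru_cache


def mean_reached_nodes(N, n, p_rtx):
    """Average number of reached nodes in a network of N equally spaced nodes.

    n:     number of nodes per TD.
    p_rtx: list with p_rtx[i - 1] = probability that node i of a TD
           successfully retransmits (i = 1, ..., n).
    """

    @lru_cache(maxsize=None)
    def extra(m):
        # m = nodes ahead of the current sender, including its own TD.
        beyond = m - n
        if beyond <= 0:
            return 0.0  # assumed base case: nothing left beyond this TD
        total = 0.0
        for i in range(1, n + 1):
            gained = min(i, beyond)  # relay i covers i new nodes (capped)
            total += (extra(m - i) + gained) * p_rtx[i - 1]
        return total

    return n + extra(N)  # the source reaches the first n nodes with probability 1


# Best case: the farthest node of every TD always relays, hence RE = 1.
print(mean_reached_nodes(16, 4, [0.0, 0.0, 0.0, 1.0]) / 16)
```

Two sanity checks support this reading of the recursion: with all retransmission probabilities equal to zero the result is \(n\) (the lower bound mentioned above), and with the farthest node always relaying the result is \(N\).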

3.6.2.3 TE

In order to reduce the computational burden, we adopt the following approximated formulation of TE:

$$\begin{aligned} \text {TE} \triangleq {\frac{\text {RE}}{\overline{N}_{\text {rtx}}}} \end{aligned}$$
(3.14)

where \(\overline{N}_\mathrm{rtx}\) denotes the average overall number of retransmissions over all hops. From a computational viewpoint, \(\overline{N}_\mathrm{rtx}\) is approximated by \(\overline{N}_\mathrm{rtx}^{(m^{*})}\), where \(m^{*}\) corresponds to the average number of reached nodes and serves as an approximate indicator of the “depth” of the propagation process. Since the RE can be interpreted as the ratio between the average number of reached nodes and the total number (\(N\)) of nodes, \(m^{*}\) can be approximated as follows:

$$\begin{aligned} m^{*} \simeq N \cdot \mathrm{RE}. \end{aligned}$$

At this point, \(\overline{N}_\mathrm{rtx}^{(m^{*})}\) can be computed by applying the recursive approach presented in Appendix B.4, by replacing (i) \(p_{Y}(i|\fancyscript{S})\) with \(p_{Y}(i)\) and (ii) \(\overline{D}_{i|i}\) with the average number of transmissions per hop, denoted by \(\overline{N}_\mathrm{rtx}^\mathrm{hop}\) and given in (3.9).

3.6.3 Generalization to a PPP-Based Scenario

According to the original PPP-based model, described in Sect. 3.2, the number of nodes within \(\fancyscript{I}\), denoted as \(N_z\), has the following Poisson distribution:

$$ p_{N_z} (n,\rho _\mathrm{s}z) = \frac{{e}^{-\rho _\mathrm{s}z}(\rho _\mathrm{s}z)^{n}}{n!} \qquad n \in \{0,1,2,\dots \}.$$

However, since a real vehicle has a finite length, it is not possible to have an infinite number of vehicles within \(\fancyscript{I}\). Therefore, it makes sense to impose an arbitrary limit to the maximum number of nodes within \(\fancyscript{I}\), denoted as \(N_\mathrm{c}\). The new truncated Poisson random variable, denoted as \(N'_z\), has the following distribution:

$$\begin{aligned} p_{N'_z} (n,\rho _\mathrm{s}z) =\frac{\frac{e^{-\rho _\mathrm{s}z}(\rho _\mathrm{s}z)^{n}}{n!}}{\sum _{i=1}^{N_\mathrm{c}}\frac{e^{-\rho _\mathrm{s}z}(\rho _\mathrm{s}z)^{i}}{i!}} \qquad n \in \{1,2,\dots ,N_\mathrm{c}\} \end{aligned}$$

where we have also removed the event \(n=0\)—this would correspond to an empty TD.
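The truncated distribution is straightforward to evaluate numerically. The sketch below computes it for a given product \(\rho _\mathrm{s}z\) and cap \(N_\mathrm{c}\); the function name and the numerical values in the example are hypothetical.

```python
import math


def truncated_poisson_pmf(mean, n_max):
    """PMF of a Poisson variable restricted to {1, ..., n_max} and renormalized.

    mean:  rho_s * z, the average number of nodes in the interval.
    n_max: N_c, the imposed maximum number of nodes in the interval.
    """
    support = range(1, n_max + 1)
    weights = [math.exp(-mean) * mean ** n / math.factorial(n) for n in support]
    total = sum(weights)
    return {n: w / total for n, w in zip(support, weights)}


pmf = truncated_poisson_pmf(10.0, 30)   # rho_s * z = 10 veh, N_c = 30
print(sum(pmf.values()))                # equals 1 up to rounding
```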

In order to exploit the results of Sect. 3.6.1, the stochastic network topology of the PPP needs to be mapped into a deterministic one with equally spaced nodes. To this end, the interval \(\fancyscript{I}\) is partitioned into \(N^\mathrm{int}\) sub-intervals of length \(z / N^\mathrm{int}\), where \(N^\mathrm{int} \in \{N_\mathrm{c}, N_\mathrm{c}+1, N_\mathrm{c}+2,\dots \}\) is a design parameter. The computational burden and the accuracy are directly related to the value of \(N^\mathrm{int}\). After some numerical tests, we observed that the value \(N^\mathrm{int}=100\) is a good tradeoff between precision and computational time. Thus, the \(i\)-th sub-interval is:

$$\fancyscript{I}_i = \left[ \frac{(i-1) z}{N^\mathrm{int}}, \frac{i z}{N^\mathrm{int}}\right] \qquad i=1,2,\dots ,N^\mathrm{int}.$$

Every sub-interval can contain at most one node: in general, we assume that in each sub-interval there is a “virtual” node. Consequently, it is possible to associate with the generic sub-interval \(\fancyscript{I}_i\) a transmission probability, denoted as \(p_\mathrm{rtx}^\mathrm{eq}(i)\), and a corresponding per-node delay, denoted as \( D(i)^\mathrm{eq}\) (\(i=1,\dots ,N^\mathrm{int}\)).

We define as \(p_\mathrm{rtx}^{(n)}(j)\) the probability of retransmission of the \(j\)-th node, given that there are exactly \(n\) nodes in the interval \(\fancyscript{I}\). Using the total probability theorem, \(p_\mathrm{rtx}^\mathrm{eq}(i)\) can be expressed as follows:

$$\begin{aligned} p_\mathrm{rtx}^\mathrm{eq}(i)&=\sum _{n=1}^{N_\mathrm{c}}\left( p_\mathrm{rtx}^\mathrm{eq}(i)|N'_\mathrm{z}=n\right) P(N'_\mathrm{z}=n) \nonumber \\&=\sum _{n=1}^{N_\mathrm{c}}\sum _{j=1}^{n}p_\mathrm{rtx}^{(n)}(j)\;f(i,j,n)\;p_{N'_\mathrm{z}}(n,\rho _\mathrm{s}z)\qquad i \in \{1,\dots ,N^\mathrm{int}\} \end{aligned}$$
(3.15)

where \(f(i,j,n)\) is an indicator function defined as follows:

$$\begin{aligned} f(i,j,n) \triangleq {\left\{ \begin{array}{ll} 1 &{} \overline{\mathrm{R}}_j^{(n)}\in \fancyscript{I}_i \\ 0 &{} \overline{\mathrm{R}}_j^{(n)}\notin \fancyscript{I}_i. \end{array}\right. } \end{aligned}$$
(3.16)

The probability \(p_\mathrm{rtx}^\mathrm{eq}(i)\) is now a function of \(p_\mathrm{rtx}^{(n)}(i)\) (\(n \in \{1,2,\dots ,N_c\}\), \(i \in \{1,2,\dots ,n\}\)), which can be computed with combinatorics, since it is associated with a deterministic scenario with \(n\) static nodes equally spaced in \([0,z]\).
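The double sum in (3.15) amounts to accumulating each \(p_\mathrm{rtx}^{(n)}(j)\), weighted by the probability of having \(n\) nodes, into the sub-interval that contains the \(j\)-th node. The sketch below illustrates this mapping. Since (3.1) is not reproduced here, the node positions are assumed to be \(jz/n\) (\(j=1,\dots ,n\)), and the sub-intervals are treated as half-open, \(((i-1)z/N^\mathrm{int}, iz/N^\mathrm{int}]\), so that boundary points are assigned only once. The function name and the input values are hypothetical.

```python
def equivalent_p_rtx(p_rtx_by_n, pmf_n, n_int):
    """Map per-node probabilities onto n_int equal sub-intervals, as in (3.15).

    p_rtx_by_n: dict n -> list of p_rtx^(n)(j), j = 1..n.
    pmf_n:      dict n -> P(N'_z = n) (truncated Poisson PMF).
    n_int:      number of sub-intervals (N^int >= largest n).
    """
    p_eq = [0.0] * n_int
    for n, probs in p_rtx_by_n.items():
        for j, p in enumerate(probs, start=1):
            # Node j is assumed at j*z/n; it falls in sub-interval
            # i = ceil(j * n_int / n), computed with integer arithmetic.
            i = -(-j * n_int // n)
            p_eq[i - 1] += p * pmf_n[n]
    return p_eq


# Toy example: only TDs with 1 or 2 nodes, four sub-intervals.
p_eq = equivalent_p_rtx({1: [0.9], 2: [0.2, 0.6]}, {1: 0.5, 2: 0.5}, 4)
print(p_eq)
```

Because \(z\) cancels out in the comparison between node positions and sub-interval boundaries, the indicator function \(f(i,j,n)\) reduces here to an integer index computation.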

At this point, by using (3.6) in Eq. (3.15), it is possible to obtain a closed-form expression for \(p_\mathrm{rtx}^\mathrm{eq}(i)\). Leveraging the knowledge of \(p_\mathrm{rtx}^\mathrm{eq}(i)\), by using Eq. (3.15) in Eqs. (3.7) and (3.9), it is possible to obtain, respectively, \(D(i)^\mathrm{eq}\) (\(i=1,\dots ,N^\mathrm{int}\)) and \({n_\mathrm{rtx}^\mathrm{hop}}^\mathrm{eq}\). Then, it is possible to use the framework presented in Sect. 3.6.2 to derive RE, TE, and D for a deterministic network composed of \(N_\mathrm{c} \ell _\mathrm{norm}\) nodes, since \(N_\mathrm{c}\) is the (imposed) number of nodes in the interval \(\fancyscript{I}\) (and, thus, in each TD).

As anticipated at the end of Sect. 3.1, we remark that the presented analytical framework can be employed to study other types of broadcast protocols, not necessarily probabilistic, by simply re-adapting the definition of \(p_\mathrm{rtx}^{(n)}(i)\) and \(D_{i|i}\). This is the subject of our current research activities.

3.7 Performance Analysis in Realistic Scenarios

3.7.1 Polynomial Protocol

In this section, we compare the results obtained with the analytical framework presented in Sect. 3.6 with those obtained through numerical simulations carried out with the ns-2 simulator [1]. In particular, the polynomial protocol has been “inserted” on top of the IEEE 802.11b model, after fixing the bugs reported by Chen et al. [27]. We observe that, provided that the packet size and the packet generation rate are suitably scaled, from the perspective of our framework the IEEE 802.11a/p standards will offer the same performance as the IEEE 802.11b standard. All the results presented are accurate within \(\pm 5~\%\) of the values shown with \(95~\%\) confidence. The relevant parameters of the simulation are listed in Table 3.2. The results are obtained for a fixed node spatial density \(\rho _\mathrm{s} = 0.1\) veh/m, while the possible values of the transmission range \(z\) are listed in Table 3.2. In particular, the values of \(z\) are selected so that the corresponding values of \(\rho _\mathrm{s} z\) are between 10 veh and 40 veh. In the numerical simulations, we do not consider any case with \(\rho _\mathrm{s} z < 10\) veh, since this corresponds to topologies that are disconnected with a high probability, as shown in [11].

Table 3.2 Main IEEE 802.11b network simulation parameters

Fig. 3.7 D (a), RE (b), and TE (c), as a function of \(\rho _{s}z\) obtained using the polynomial protocol and different values of \(g\), by considering \(cw=31\), \(l_\mathrm{norm}=8\), \(P=\text {1,000}\) bytes, and \(R=1\) Mbps. Both simulation (Sim) and analytical results (Ana) are shown

In Fig. 3.7, (a) D, (b) RE, and (c) TE are shown as functions of \(\rho _{s}z\), for different values of \(g\), by taking into account both the results of the analytical framework and numerical simulations, which makes it possible to assess the validity of the analytical model. As shown by Busanelli et al. [11], using the considered values of \(\rho _{s}z\) (between 10 and 40 veh), the network is fully connected (i.e., \(n_\mathrm{reach}=N\)) with a high probability. From Fig. 3.7b it emerges that, in terms of RE, there is an excellent match between the results of the theoretical framework and those of the simulator. As shown by Fig. 3.7c, the agreement between analysis and simulations is also good in terms of TE. On the other hand, the delay predicted by the analytical framework overestimates the true delay for small values of \(g\) (e.g., \(g=0\)), whereas it becomes very accurate for large values of \(g\) (e.g., \(g=7\)). The comparative investigation of analytical and simulation results indicates the validity of the proposed framework (especially for large values of \(g\)).

According to the results in Figs. 3.7a, c, it emerges that a higher polynomial degree leads to a better performance, regardless of the value of \(\rho _\mathrm{s}z\), in terms of both D and TE. Conversely, since the PAF is highly selective for large values of \(g\) (as shown in Fig. 3.5), this leads to poor performance in terms of RE, as shown in Fig. 3.7b. By considering small values of \(g\) (e.g., \(g=0\) corresponds to flooding), one observes the opposite phenomenon: a drastic improvement in terms of RE, at the price of a slightly higher D and a smaller TE.

Fig. 3.8 D as a function of RE, parametrized with respect to \(\rho _\mathrm{s}z\) (for various values of \(g\)) (a) and \(g\) (for various values of \(\rho _\mathrm{s}z\)) (b). The results are obtained by using the polynomial protocol and considering \(cw=31\), \(l_\mathrm{norm}=8\), \(P=\text {1,000}\) bytes, and \(R=1\) Mbps

Fig. 3.9 a \(g^{*}\) and b D, as a function of \(\rho _\mathrm{s}z\)

In order to better understand the impact of \(g\) and \(\rho _\mathrm{s} z\) on the protocol performance, in Fig. 3.8a D is shown, parametrized with respect to \(g\), as a function of RE for different values of \(\rho _\mathrm{s} z\), while in Fig. 3.8b D is shown, parametrized with respect to \(\rho _\mathrm{s} z\), as a function of RE for different values of \(g\). From the results in Fig. 3.8a, it emerges that even small variations of \(g\) lead to radically different protocol behaviors. On the contrary, \(\rho _\mathrm{s}z\) has an impact on the performance only for small values of \(\rho _\mathrm{s}z\), while for increasing values of \(\rho _\mathrm{s}z\) (e.g., larger than 20 veh) its impact vanishes.

From the results in Figs. 3.7 and 3.8, it clearly emerges that there is no optimal value of \(g\). However, the proposed framework allows one to optimize a single performance metric, after having imposed some constraints on the other metrics, on the basis of proper quality of service criteria. A possible choice consists in ignoring TE and minimizing D under the constraint of attaining a target value of RE. Since D is a decreasing function of \(g\), it is possible to define the following quasi-optimal \(g^{*}\):

$$\begin{aligned} g^{*}(\rho _\mathrm{s}z)=\left\{ \mathrm{max}(g) | \mathrm{RE}(\rho _\mathrm{s}z)>0.95\right\} . \end{aligned}$$

Selecting \(g=g^{*}\) makes it possible to achieve the minimum delay under a constraint on the RE. The obtained \(g^{*}\) is shown, as a function of \(\rho _\mathrm{s}z\), in Fig. 3.9a, and the following considerations can be drawn: (i) \(g^{*}\) is a monotonically increasing function of \(\rho _\mathrm{s}z\); (ii) with the exception of the region in proximity to \(\rho _\mathrm{s}z=0\), where \(g^{*}\) tends to \(0\), \(g^{*}\) has a quasi-linear dependence on \(\rho _\mathrm{s}z\). It can be shown that if \(g=g^{*}\), \(\mathrm{RE} \simeq 1\) for each value of \(\rho _\mathrm{s}z\). Note that the selection of \(g^{*}\) allows RE to be maximized. However, as shown in Fig. 3.9, D is always higher than 0.08 s, a delay which is instead guaranteed by the use of \(g=7\), as shown in the same figure.
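The selection rule for \(g^{*}\) reduces to a simple search once RE is available for each candidate \(g\) at a given \(\rho _\mathrm{s}z\). The sketch below assumes that these RE values have already been computed (analytically or by simulation); the function name and the numbers in the example are made up for illustration and are not results from this chapter.

```python
def quasi_optimal_g(re_by_g, target_re=0.95):
    """Return the largest g whose RE exceeds target_re, or None if none does.

    re_by_g: dict g -> RE obtained at a fixed value of rho_s * z.
    """
    feasible = [g for g, re in re_by_g.items() if re > target_re]
    return max(feasible) if feasible else None


# Illustrative (made-up) RE values for one value of rho_s * z.
re_by_g = {0: 1.00, 1: 0.99, 2: 0.97, 3: 0.93, 5: 0.60, 7: 0.35}
print(quasi_optimal_g(re_by_g))  # 2
```

Since D decreases with \(g\), picking the largest feasible \(g\) yields the smallest delay among the values that satisfy the RE constraint.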

Fig. 3.10
D (a), RE (b), and TE (c), as a function of \(\rho _{s}z\) obtained using the SIF protocol and different values of \(c\), by considering \(cw=31\), \(l_\mathrm{norm}=8\), \(P=\text {1,000}\) bytes, and \(R=1\) Mbps. Both simulation (Sim) and analytical results (Ana) are shown

3.7.2 Silencing Irresponsible Forwarding

As pointed out in Sect. 3.6, the proposed framework can be applied to a large family of broadcast protocols. In this section, the framework is applied to SIF. The validity of the proposed analytical framework is shown in Fig. 3.10, where (a) D, (b) RE, and (c) TE are shown, as functions of \(\rho _{s}z\), for different values of \(c\), by directly comparing analytical and simulation results. As with the polynomial broadcast protocol, there is good agreement between the results obtained with the analytical model and the simulations. In particular, it can be observed that the accuracy of the model depends on the value of the shape parameter \(c\) (the highest average accuracy, over all metrics, is observed with \(c=7\)). By comparing Figs. 3.8 and 3.10, one can observe that the polynomial and SIF protocols depend differently on \(\rho _{s}z\). In particular, in the case of SIF, as the product \(\rho _{s}z\) increases, RE remains roughly the same, while D decreases and TE increases. In other words, SIF performs better in dense networks. On the other hand, in the case of the polynomial protocol (Fig. 3.8), D and TE have the opposite behavior (namely, D slightly increases and TE slightly decreases for increasing values of \(\rho _{s}z\)), and RE strongly depends on \(\rho _{s}z\), especially in sparse networks. In general, SIF outperforms the polynomial broadcast protocol.

Fig. 3.11
D as a function of RE, parametrized with respect to \(\rho _\mathrm{s}z\) (for various values of \(c\)) (a) and \(c\) (for various values of \(\rho _\mathrm{s}z\)) (b). The results are obtained by using the SIF protocol and considering \(cw=31\), \(l_\mathrm{norm}=8\), \(P=\text {1,000}\) bytes, and \(R=1\) Mbps

Furthermore, from Fig. 3.10 it is clear that, also for SIF, there is no value of the parameter \(c\) which simultaneously optimizes the performance according to all considered metrics. This fact can be better understood from Fig. 3.11, where D is shown as a function of RE, parametrized, respectively, with respect to (a) \(\rho _\mathrm{s}z\) and (b) \(c\). In particular, from Fig. 3.11b it emerges that if one wants to guarantee a minimum value of RE (say 0.95), it is necessary to use a sufficiently high value of \(c\). This, in turn, does not minimize \(D\), which, as shown in Fig. 3.10a, is directly proportional to \(c\). Moreover, the results in Fig. 3.11a strengthen the observations made on the results in Fig. 3.10. In fact, they clearly show two important characteristics of SIF: (i) RE is not affected by the value of \(\rho _{s}z\), as SIF automatically adapts; (ii) counterintuitively, D is a decreasing function of \(\rho _{s}z\) (i.e., SIF performs better in dense networks).

3.7.3 Comparison with Benchmark Protocols

As mentioned above, the theoretical framework presented in this manuscript can be used for evaluating a large number of broadcast protocols. In this subsection, it is applied to two benchmark broadcast protocols: (i) the flooding protocol (denoted with “FLOOD”), where each node forwards a received message; (ii) the optimal MCDS-based protocol (denoted with “MCDS”), where a hypothetical network genie selects as relays only the nodes belonging to the MCDS set (as described in Sect. 3.1). In both cases, the silencing mechanism is employed.

Fig. 3.12
a D, b RE, and c TE, obtained using the SIF, polynomial, flooding, and MCDS protocols, with \(\rho _{s}z=16\) veh, \(c^{*}=4.8\), and \(g^{*}=2.7\). The results are obtained through simulations by considering different topologies, namely, a single-lane static network, a multi-lane static network, and a multi-lane mobile network (highway-style)

Fig. 3.13
a D, b RE, and c TE, obtained using the SIF, polynomial, flooding, and MCDS protocols, with \(\rho _{s}z=16\) veh, \(c^{*}=4.8\), and \(g^{*}=2.7\). Both simulation and analytical results are shown

These benchmark protocols are compared with the SIF and polynomial protocols, considering a vehicle spatial distribution characterized by a Poisson distribution with parameter \(\rho _\mathrm{s}z=16\) veh. In order to have a meaningful comparison, the optimal values of \(c\) and \(g\) (\(c^{*}=4.8\) and \(g^{*}=2.7\)) are considered. These values, obtained through the analytical framework, minimize D under the constraint of having an RE higher than 0.95, in a scenario with \(\rho _\mathrm{s}z=16\) veh. The results, attained through both simulations and theoretical analysis, are shown in Fig. 3.12. From the results in Fig. 3.12, a few considerations can be drawn. First, for all considered metrics, the optimized SIF/polynomial protocols exhibit a performance loss with respect to the MCDS-based protocol. At the same time, the SIF/polynomial protocols exhibit a similar performance gain with respect to flooding (with the exception of the RE metric). It is also possible to observe that, counterintuitively, the SIF and the polynomial protocols offer a similar performance level. This result can be explained by considering that their PAFs tend to converge to a common shape when using, respectively, the optimal values \(c^{*}\) and \(g^{*}\) as their key parameters. Finally, an excellent match between simulation and theoretical results can be observed.

3.7.4 Highway-Style Scenarios

The goal of this subsection is to assess (a-posteriori) the validity of the assumption, made in Sect. 3.2, of considering a uni-dimensional static network. The validation is performed through simulations, by taking into account the protocols considered in Sect. 3.7.3 (namely, flooding, MCDS-based, SIF, and polynomial protocols). According to our assumption, we expect that the performance offered by these protocols will not be significantly affected by the network topology. To this end, we consider three different scenarios: (i) the uni-dimensional (single-lane) static network presented in Sect. 3.2; (ii) a multi-lane static network; (iii) a multi-lane mobile network. The multi-lane static scenario is composed of \(N_\mathrm{lane}=6\) adjacent lanes, each with width equal to \(w_\mathrm{lane} = 4\) m. Such a network is obtained by simply replicating the single-lane topology. In particular, in each lane the positions of the vehicles are generated according to a PPP of parameter \(\rho _\mathrm{s}/N_\mathrm{lane}\). Similarly, the multi-lane mobile scenario is composed of \(N_\mathrm{lane}=6\) adjacent lanes (3 per direction of movement), each with width equal to \(w_\mathrm{lane} = 4\) m. In this case, the vehicles move according to the Intelligent Driver Model with Lane Changes (IDM-LC) mobility model [28] and, therefore, their positions do not have a Poisson distribution. The mobility traces have been obtained using VanetMobiSim [29] and plugged into the ns-2 network simulator. The vehicles’ speeds are independent and uniformly distributed in the interval \((20, 40)\) m/s. Further details on the mobility models and the trace generation process are provided in [30]. It should be noted that the value of the per-lane vehicular density (\(\rho _\mathrm{s}\)) is time-averaged, since it is computed directly from the mobility trace and thus is time-varying.

In Fig. 3.13, we show the results obtained by considering \(\rho _\mathrm{s}z=16\) veh and the optimal values of \(c\) and \(g\) (\(c^{*}=4.8\) and \(g^{*}=2.7\)). It can be easily noticed that the performance obtained in the considered scenarios is quite similar. Hence, this confirms (a-posteriori) that the assumptions made in Sect. 3.2 are substantially correct. More specifically, it can be observed that increasing the width of the network leads to very similar values of RE and D, and to a slightly higher TE (this can be justified by considering that there is a higher number of nodes in the neighborhood of a vehicle). Instead, if we consider the same scenario but with mobile vehicles, one can observe that the RE becomes slightly lower, while D and TE become higher. This behavior is explained by the tendency of mobile VANETs to form ephemeral clusters of vehicles [31], leading to a reduced RE and increased D but to a higher TE.
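The construction of the multi-lane static topology can be illustrated with a short Python sketch. It assumes that the PPP in each lane is sampled through independent exponential inter-vehicle gaps (a standard way of generating a homogeneous one-dimensional PPP) and that vehicles travel at the center of their lane. The function names, the road length, and the seed are hypothetical choices made for the example, not details of the simulator setup.

```python
import random
from typing import List, Tuple


def ppp_positions(density: float, length: float, rng: random.Random) -> List[float]:
    """Sample a 1-D homogeneous PPP on [0, length] using exponential gaps."""
    positions = []
    x = rng.expovariate(density)
    while x <= length:
        positions.append(x)
        x += rng.expovariate(density)
    return positions


def multi_lane_topology(
    rho_s: float,
    length: float,
    n_lanes: int = 6,
    lane_width: float = 4.0,
    seed: int = 0,
) -> List[Tuple[float, float]]:
    """Return (x, y) vehicle positions; each lane is a PPP of density rho_s / n_lanes."""
    rng = random.Random(seed)
    vehicles = []
    for lane in range(n_lanes):
        y = (lane + 0.5) * lane_width  # assumption: vehicles sit at the lane center
        for x in ppp_positions(rho_s / n_lanes, length, rng):
            vehicles.append((x, y))
    return vehicles


# rho_s * z = 16 veh with z = 160 m corresponds to rho_s = 0.1 veh/m.
vehicles = multi_lane_topology(rho_s=0.1, length=10_000.0, seed=1)
print(len(vehicles))
```

Since the superposition of independent PPPs is again a PPP, the projection of all lanes on the road axis has overall density \(\rho _\mathrm{s}\), which keeps the multi-lane scenario comparable with the single-lane one.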

Finally, the limited impact of the vehicle mobility on the protocols’ performance could have been expected by considering the values of the worst-case transmission time (about 0.2 s) and of the maximum allowed speed (roughly equal to 40 m/s, corresponding to 144 km/h). In these conditions, two vehicles proceeding in opposite directions on a highway have a relative speed of 80 m/s, and this leads, in turn, to a distance variation of 16 m during a packet transmission time. A distance of 16 m (the worst-case variation) corresponds to a small fraction of the transmission range of a typical IEEE 802.11 network interface (in Fig. 3.13, we have considered \(z=160\) m).
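The worst-case figure can be checked with a one-line computation. The symbols \(\Delta d\), \(v_\mathrm{max}\), and \(T_\mathrm{tx}\) are introduced here only for illustration and denote, respectively, the distance variation, the maximum speed, and the worst-case transmission time:

$$\begin{aligned} \Delta d = 2\, v_\mathrm{max}\, T_\mathrm{tx} = 2 \times 40\ \mathrm{m/s} \times 0.2\ \mathrm{s} = 16\ \mathrm{m}, \qquad \frac{\Delta d}{z} = \frac{16\ \mathrm{m}}{160\ \mathrm{m}} = 0.1 . \end{aligned}$$

In other words, even in the worst case the relative displacement during one packet transmission is about one tenth of the transmission range.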

3.8 VANETs as Distributed Wireless Sensor Networks

In this section, we consider a data collection application in a V2I scenario, where the vehicles act as a distributed wireless sensor network. In particular, we present a vehicular decentralized detection scheme, based on the observation, by all vehicles of a VANET, of a spatially constant phenomenon of interest (e.g., the average smog level or traffic situation on a given road). Our approach consists of creating, during a downlink phase, a clustered VANET topology while data are quickly broadcast from the Access Point (AP), by means of a clustering protocol denoted as Cluster-Head Election IF (CHE-IF). Such a clustered topology is then exploited, during an uplink phase, to collect information from the vehicles and perform distributed detection. Our results highlight the existing trade-off between decision delay and energy efficiency. Unlike classical sensor networks for distributed detection, the proposed vehicular distributed detection scheme exploits the natural vehicle clusters and has to cope with their “ephemeral” nature. More precisely, vehicle mobility has a direct impact on the maximum amount of data that can be collected, thus leading to the concept of decentralized detection on the move.

3.8.1 System Model

We consider a static one-dimensional wireless network with \(N\) (receiving) nodes. The system model is the same used throughout the rest of the book and presented in Sect. 3.2.1; in particular, the reference scenario is the one shown in Fig. 3.1. All vehicles observe a spatially constant phenomenon, i.e., a phenomenon whose status does not change from vehicle to vehicle along the road. For example, vehicles could monitor whether the average smog (or fog) level exceeds a critical threshold: the VANET would declare that it does if this happens along most of the road. The observed phenomenon can be generically defined as

$$ H = \left\{ \begin{array}{ll} H_0 &{} \text {with probability } p_0\\ H_1 &{} \text {with probability } 1-p_0\end{array} \right. $$

where \(p_0 \triangleq \mathbb {P}\{H=H_0\}\), and \(\mathbb {P}\{\mathcal {A}\}\) denotes the probability that the event \(\mathcal {A}\) happens. The value \(H_0\) means that the underlying physical phenomenon is, on average (along the road), below a given threshold, whereas the value \(H_1\) means that the underlying physical phenomenon is, on average (along the road), above that threshold.

3.8.2 Clustered VANET Creation and IVCs

In this section, we derive the communication model for the vehicular distributed detection scenario. First, a downlink phase is envisioned, where the AP broadcasts a query to all vehicles in the network, in order to obtain information about the phenomenon of interest. During this phase, the CHE-IF protocol, besides guaranteeing fast information dissemination, automatically creates a clustered architecture, by opportunistically exploiting the ephemeral vehicular clusters. After a clustered network topology has been generated, during the uplink phase the decentralized detection task is performed by transferring the sensed data from the vehicles to the AP, through multi-hop communications and considering local fusion in each vehicular cluster.

3.8.2.1 Downlink

The philosophy of the CIF protocol [31] is to establish a weak artificial packet flow, with the purpose of discovering the presence of naturally formed clusters. This information is then exploited to optimize the forwarding procedure, increasing the reliability and the transmission efficiency, but without building up a true clustered infrastructure.

We propose a variant of CIF, denoted as CHE-IF, that introduces some additional mechanisms to make CIF capable of efficiently constructing a stable clustered infrastructure. The new CHE-IF protocol is totally decentralized, since each node designates its own CH without pursuing a common global consensus. The purpose is to obtain an operative clustered topology in the shortest time, in order to start the data collection process as soon as possible. This behavior fits well with the intrinsically dynamic nature of a VANET, characterized by continuous topology changes that nullify the hypothetical advantages of a centralized clustering protocol. Moreover, the cluster structure can be refined once the collection process has started, by making small adjustments to the network topology.

The CHE-IF protocol is designed to choose a single CH among the retransmitting nodes of a transmission domain. This ideally leads to the creation of a unique set of connected CHs, able to cover the entire area of interest. After the CHs have been chosen, the clusters form naturally. In fact, the nodes not designated as CHs become children of the nearest CH, leading to the formation of clusters of similar dimension.

The CHE-IF protocol defines 3 types of packets: (i) Cluster Initialization Packet (CIP); (ii) Probe Packet (PP); (iii) Cluster Confirmation Packet (CCP). The CHE-IF protocol is characterized by three phases. In the first one, through the exchange of some dedicated packets (CIPs and PPs), every node fills a temporary routing table containing the list of the potential CHs in its transmission range. During the second phase, which starts after a time \(T_\mathrm{w} ^\mathrm{CIP}\) (set proportionally to the length of the network), each node elects its CH based on the information contained in its routing table (see Footnote 3). Due to the lack of global consensus, it is not guaranteed that the decisions of different nodes are consistent. For instance, some nodes could designate a CH that does not consider itself a CH. For this reason, there is also a third phase, called confirmation phase, during which the AP sends a special packet (the CCP) that is retransmitted only by the CHs (with probability 1). By listening to the CCP, the network nodes become aware of the identity of the true CHs.

While the second and the third phases are relatively simple, the first phase is more complicated, as it requires, at every hop, 4 steps that are graphically represented in Fig. 3.14 and described in the following.

Fig. 3.14
CH election of the CHE-IF protocol

Fig. 3.15
Network topologies (upper part) and their logical representations (lower part): direct communications between CHs

Fig. 3.16
Network topologies (upper part) and their logical representations (lower part): multi-hop communications between CHs and AP

The first step consists of the transmission of the CIP by a node of the \((i-1)\)-th transmission domain (the AP in the case of the first transmission domain), which leads to the identification of the \(i\)-th transmission domain. The CIP is sent with a transmit power \(P_\mathrm{t} ^\mathrm{CIP}\) and contains a unique identification (ID) and the source address of the AP.

The second step derives directly from the IF protocol and is a sort of “virtual contention.” In particular, every node in the \(i\)-th transmission domain decides whether or not to become a potential forwarder by performing the same probabilistic election mechanism of the IF protocol. The winners of this contention proceed to the third step, while the others simply discard the packet.

The third step derives from the concept of “ephemeral cluster.” Once a node wins the virtual contention, it schedules the transmission of a very short packet, denoted as Probe Packet (PP). A PP carries just two pieces of information: (1) the unique identification (ID) of the CIP; (2) the distance from the node in the previous transmission domain from which the CIP has been received. The PPs are intrinsically single-hop, i.e., they are not forwarded. A PP is transmitted with a reduced power, defined as \(P_\mathrm{t} ^\mathrm{PP} = 0.25 P_\mathrm{t} ^\mathrm{CIP}\), since a node is interested only in signaling its presence to its neighbors: this reduces network congestion and channel interference. A PP is also transmitted with a high priority, in order to reduce the overall latency. The specific power and priority settings of a PP have to be tuned according to the MAC protocol in use, as shown in [31]. After winning the virtual contention, every potential forwarder sends a PP. It then waits for a short interval, denoted as \(T_\mathrm{w} ^\mathrm{PP} \frac{d}{z}\), where \(T_\mathrm{w} ^\mathrm{PP}\) is a proper constant. If, within this interval, it receives at least one PP containing a distance larger than its own, it stops and discards the packet (in fact, there is another, better-placed forwarder); otherwise, it retransmits the CIP. In the worst case, when a collision between two or more PPs happens, this selection mechanism fails and no node of the cluster is elected. In this case, the retransmitter in the previous transmission domain retransmits the CIP to restart the CH designation procedure at the \(i\)-th hop. This can happen up to a maximum of 3 times, after which the whole designation procedure is considered failed.
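The PP-based suppression rule of the third step can be sketched as follows. This is a simplified illustration under explicit assumptions: PP collisions and the waiting timers are ignored, vehicles lie on a line, and a PP is received by every node within a hypothetical range `pp_range` (standing in for the reduced PP transmit power). The function and parameter names are introduced here for the example and are not part of the protocol specification.

```python
from typing import Dict, List


def select_retransmitters(
    candidates: Dict[str, float],
    positions: Dict[str, float],
    pp_range: float,
) -> List[str]:
    """Return the potential forwarders that retransmit the CIP after the PP exchange.

    candidates: node -> distance d from the previous-hop sender (carried in its PP).
    positions:  node -> coordinate along the road, in meters.
    pp_range:   assumed PP reception range, in meters (hypothetical parameter).
    """
    winners = []
    for node, d in candidates.items():
        # Distances advertised by the PPs that this node is able to hear.
        heard = [
            other_d
            for other, other_d in candidates.items()
            if other != node and abs(positions[other] - positions[node]) <= pp_range
        ]
        # A node gives up as soon as it hears a PP advertising a larger distance.
        if not any(other_d > d for other_d in heard):
            winners.append(node)
    return winners


# Previous-hop sender at x = 0, so each candidate's distance equals its coordinate.
positions = {"A": 60.0, "B": 100.0, "C": 140.0}
candidates = dict(positions)
print(select_retransmitters(candidates, positions, pp_range=50.0))
```

With a PP range of 50 m, A hears B and B hears C, so only the farthest node C retransmits. If the PP range is too short for the candidates to hear each other, all of them retransmit, which suggests why the PP power setting has to be tuned.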

The fourth step corresponds to the transmission of the CIP from the designated forwarding nodes at the \(i\)-th transmission domain.

3.8.2.2 Uplink

The uplink phase exploits the clustered structure created during the downlink phase. More precisely, during the uplink phase, the data acquired by the \(N\) vehicles of the VANET are transmitted to the final AP. Note that, unlike a regular sensor network, the created VANET can be used only as long as its structure is not broken by vehicle mobility. In other words, there is a maximum amount of data which can be collected [32].

The observed signal at the \(i\)-th vehicle can be expressed as

$$\begin{aligned} r_i = \left\{ \begin{array}{ll} 0 + w_{i} &{} \text {if }H=H_{0}\\ s+w_{i} &{} \text {if }H=H_{1}\end{array} \right. \quad \quad i=1,\ldots ,N \end{aligned} \tag{3.17}$$

where \(\left\{ w_i\right\} \) are additive noise samples. Note that \(s\) is considered as a deterministic parameter. Assuming that the noise samples \(\{w_i\}\) are independent random variables with the same Gaussian distribution \(\mathcal {N}(0,\sigma ^2)\), the common observation signal-to-noise ratio (SNR) at the vehicles, denoted as \(\mathrm{SNR}_\mathrm{vehicle}\), can be defined as \(\mathrm{SNR}_\mathrm{vehicle} \triangleq s^2/\sigma ^2\) [33]. Each vehicle compares its observation \(r_i\) with a threshold value \(\tau =s/2\) and computes a local decision \(u_i=U(r_i-\tau )\), where \(U(\cdot )\) is the unit step function. Note that a vehicle could transmit one single decision per packet or, by collecting consecutive phenomenon observations, it could transmit packets with more decisions. The choice of strategy depends on the desired trade-off between data and overhead per transmitted packet. However, investigating this aspect goes beyond the scope of this section.
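The local decision rule can be illustrated with a minimal Python sketch. With the threshold \(\tau =s/2\) and Gaussian noise, the probability that a single vehicle decides wrongly is the same under both hypotheses and equals \(Q(s/(2\sigma ))=Q(\sqrt{\mathrm{SNR}_\mathrm{vehicle}}/2)\), where \(Q(\cdot )\) is the Gaussian tail function and the SNR is expressed in linear scale. The function names below are assumptions made for the example, and the value of the unit step at zero (here taken as 1) is an arbitrary convention.

```python
import math
import random
from typing import List


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = P{N(0, 1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def local_decisions(
    h1: bool, n: int, s: float, sigma: float, rng: random.Random
) -> List[int]:
    """Simulate u_i = U(r_i - s/2) for n vehicles under H1 (h1=True) or H0."""
    tau = s / 2.0
    mean = s if h1 else 0.0
    return [1 if rng.gauss(mean, sigma) - tau >= 0 else 0 for _ in range(n)]


def local_error_probability(snr_vehicle: float) -> float:
    """Per-vehicle error probability for threshold s/2 (SNR in linear scale)."""
    return q_function(math.sqrt(snr_vehicle) / 2.0)


# Example with s = 2 and sigma = 1, i.e., SNR_vehicle = 4 in linear scale.
rng = random.Random(1)
u = local_decisions(h1=False, n=20_000, s=2.0, sigma=1.0, rng=rng)
print(sum(u) / len(u), local_error_probability(4.0))
```

The empirical fraction of wrong decisions under \(H_0\) should be close to the analytical value, which gives a quick sanity check of the observation model before any fusion rule is applied.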

Suppose that during the downlink phase the CHE-IF protocol has led to the creation of \(n_\mathrm{c}<N\) clusters. Each vehicle can communicate only with its local CH. Possible clustered topologies are represented in Figs. 3.15 and 3.16, according to the particular strategy for communications towards the AP.

In particular, when a sufficiently high transmit power is available, all CHs can communicate directly with the AP, as shown in Fig. 3.15. On the other hand, when the transmit power is not sufficiently high, multi-hop communications are required to transfer the information from the CHs towards the AP, as shown in Fig. 3.16.

The performance of the decentralized detection techniques presented here has been analyzed and discussed in [32], considering realistic VANET clustered topologies.